topicnews · October 16, 2024

The only reason I will never use AI text to video generators

Key insights

  • AI video generators are showing impressive development, but the uncanny valley effect remains.
  • Advances in AI technology promise more realistic videos, but the results still seem unnatural and unsettling.
  • Newer AI generators strive for highly polished perfection, but run the risk of looking lifeless and sterile.



I’ve tried several AI text-to-video generators, and while the technology is undeniably impressive, there’s always something a little off about the end results. It took a while to pinpoint the problem, but I finally realized that it boils down to one thing: the uncanny valley effect.

While I use some AI-powered visual effects tools in my video projects, I can’t bring myself to use AI to create video footage because it just looks too… scary.


The biggest problem with AI text-to-video generators

Thanks to advances in deep learning, AI video generation has made great strides in a short period of time. If you were online in 2023 when AI video generation exploded, you may remember the clip of Will Smith eating spaghetti that made the rounds. As groundbreaking as this type of technology was at the time, there’s no denying how unnatural and unsettling it looks.


In 2024, these generative AI video tools have become more sophisticated, providing smoother images and more realistic movements. Take a look at the difference between the videos created with Runway Gen-2 in 2023 and the ones OpenAI unveiled in 2024 when it introduced Sora. Sora is not yet available for public use, but this is the quality we are promised:

Despite the improvement, I’m still not sold. For one thing, Sora isn’t available yet, so we still have to use less sophisticated generators that produce the same unsettling results as the Will Smith spaghetti video.

Just watch this video I made using PixVerse with the prompt “A person walks through a park on a sunny day, smiling and waving at the camera. Birds fly overhead and trees sway gently in the wind.”


The first two seconds look decent, until the person’s fingers, hair and face start to blend into the air! Even as more advanced generators like Sora come along and give us more accurate and beautiful videos, there is still something unsettling about the people and landscapes generated by AI.

While older models typically produce videos with clear AI hallmarks, such as claymation-style visuals, the output of the newer generators is almost too perfect. When I look at these Sora clips, it feels like the effort to refine the results is veering into hyper-polished territory, where everything looks so pristine that it ends up feeling sterile and lifeless.

Unnatural, disturbing, sterile and lifeless. That’s exactly what the uncanny valley effect is: human-like, but not quite human.


No matter how good these generators get, the uncanny valley effect will always persist. Unless I’m going for an abstract aesthetic so surreal that it could only be seen in dreams, I will not rely on an AI text-to-video generator for any of my video projects.