In case you didn’t know, generative AI is taking over everywhere. There’s a new ChatGPT story appearing daily, and every major player is jumping into the space. Now, NVIDIA is showing off its impressive text-to-video AI generator.
NVIDIA’s Toronto AI Lab division recently launched a website and released a research paper with some neat results. The “High-Resolution Video Synthesis with Latent Diffusion Models” project can turn text into video, or a GIF, in a matter of seconds. Better yet, it can do so while being computationally efficient and still output high-resolution files.
However, in nearly every example, it’s relatively easy to see that the source material comes directly from Shutterstock. Of course, it’s important to remember that this is still an emerging technology and only a research project for now, but it again raises the question of AI and copyright.
Adobe’s Firefly AI creates some pretty stunning artwork, which the company says happens without copyright concerns. Or, that’s the idea, at least. Even Shutterstock itself recently launched its own AI tool to try to combat the issue.
Either way, it’s interesting to see nearly every sample from NVIDIA carrying a Shutterstock watermark or a blurred patch where it would be. That all aside, the tool still spits out some pretty impressive short videos that run for roughly four seconds.
According to NVIDIA, the tool is built on top of current text-to-image generator technologies like Stable Diffusion. By adding a time dimension to the image model, so that it generates a sequence of consistent frames rather than a single picture, we get life-like results. For example, NVIDIA asked the AI tool to “make a video of a Panda standing on a surfboard in the ocean at sunset, in 4K high-definition.” What you see below is the result it produced.
Again, eagle-eyed viewers will see a blurry outline of the Shutterstock logo. The video is wildly realistic, and the tool created it at 1280×2048 resolution with a 4.7-second runtime, all from a single line of text. Still, it’s not perfect.
Some samples from the website show artifacts around hands, especially when the AI is trying to make a video with a lot of motion. The picture is also consistently blurry right where the Shutterstock logo would sit. For now, the artifacts and watermarks make the tool best suited for small GIFs and thumbnails, but in the future, anything is possible.
Either way, typing a few lines of text into an AI tool and getting usable video in HD is certainly impressive. Plus, keep in mind that this is the worst AI photos and videos will ever look. From here on out, everything will continue to improve and become even more convincing.