Generative AI is coming for video. A new website, QuickVid, combines multiple generative AI systems into a single tool for automatically creating YouTube, Instagram, TikTok, and Snapchat short videos. With just a single word, QuickVid selects a background video from a library, writes a script and keywords, overlays images generated by DALL-E 2, and adds synthetic voiceover and background music from YouTube’s royalty-free music library.
QuickVid creator Daniel Habib says he’s building the service to help creators meet the “ever-growing” demand of their fans.
“By providing creators with tools to quickly and easily create quality content, QuickVid helps them increase their content output and reduce the risk of burnout,” Habib said in an email interview with TechCrunch. “Our goal is to empower your favorite creators to keep up with the demands of their audience by leveraging advances in AI.”
But depending on how they’re used, tools like QuickVid threaten to flood already-crowded channels with spam and duplicate content. They also face potential backlash from creators who choose not to use the tools, whether for reasons of cost ($10 a month) or out of principle, but may have to compete with a slew of new AI-generated videos.
Going after video
QuickVid launched on December 27th. Habib, a self-taught developer who previously worked on Facebook Live and video infrastructure at Meta, built it in just a few weeks. It’s relatively bare-bones right now (Habib says more personalization options will arrive in January), but QuickVid can cobble together the components that make up a typical informational YouTube Short or TikTok video, including captions and even avatars.
It’s easy to use. First, a user enters a prompt describing the subject of the video they want to create. QuickVid uses the prompt to generate a script, taking advantage of GPT-3’s generative text capabilities. Using keywords either automatically extracted from the script or entered manually, QuickVid selects a background video from Pexels’ royalty-free stock media library and uses DALL-E 2 to generate overlay images. It then produces a voiceover via Google Cloud’s text-to-speech API (Habib says users will soon be able to clone their voice) before combining all of those elements into one video.
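For readers who think in code, the flow described above can be sketched roughly as follows. This is an illustrative outline only, not QuickVid’s actual implementation: every name here (`Stages`, `VideoPlan`, `build_plan`, `stub_stages`) is hypothetical, and the stand-in functions return placeholder strings rather than calling GPT-3, Pexels, DALL-E 2, or Google Cloud.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Stages:
    """Hypothetical hooks; in a real tool each would wrap an external service."""
    write_script: Callable[[str], str]               # text generation step
    extract_keywords: Callable[[str], List[str]]     # keyword extraction step
    pick_background: Callable[[List[str]], str]      # stock-footage lookup
    make_overlays: Callable[[List[str]], List[str]]  # image generation step
    synthesize_voice: Callable[[str], str]           # text-to-speech step


@dataclass
class VideoPlan:
    """Everything needed before the final compositing step."""
    script: str
    keywords: List[str]
    background: str
    overlays: List[str]
    voiceover: str


def build_plan(prompt: str, stages: Stages,
               manual_keywords: Optional[List[str]] = None) -> VideoPlan:
    """Run the stages in the order the article describes."""
    script = stages.write_script(prompt)
    # Keywords are either supplied by the user or extracted from the script.
    keywords = manual_keywords or stages.extract_keywords(script)
    return VideoPlan(
        script=script,
        keywords=keywords,
        background=stages.pick_background(keywords),
        overlays=stages.make_overlays(keywords),
        voiceover=stages.synthesize_voice(script),
    )


def stub_stages() -> Stages:
    """Placeholder stages that make no network calls (assumption for the demo)."""
    return Stages(
        write_script=lambda p: f"Three facts about {p}.",
        extract_keywords=lambda s: [s.rstrip(".").split()[-1].lower()],
        pick_background=lambda kws: f"stock/{kws[0]}.mp4",
        make_overlays=lambda kws: [f"overlay/{k}.png" for k in kws],
        synthesize_voice=lambda s: "voiceover.wav",
    )


if __name__ == "__main__":
    print(build_plan("cats", stub_stages()))
```

Passing the stages in as plain functions keeps the sketch independent of any particular vendor, which mirrors Habib’s point later in the piece that the underlying models could be swapped out.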
Check out this video made with the Cats prompt:
QuickVid certainly doesn’t push the limits of what’s possible with generative AI. Both Meta and Google have introduced AI systems that can generate completely original clips given a text prompt. But QuickVid combines existing AI systems to exploit the repetitive, templated format of b-roll-heavy short-form videos, sidestepping the problem of having to generate the footage itself.
“Successful creators have an extremely high quality bar and aren’t interested in releasing content that they don’t feel is in their own voice,” Habib said. “That’s the use case we’re focusing on.”
Be that as it may, QuickVid’s videos are generally a mixed bag when it comes to quality. The background videos tend to be somewhat random or only tangentially related to the topic, which is not surprising given that QuickVid is currently limited to the Pexels catalog. The images generated by DALL-E 2, meanwhile, exhibit the limitations of today’s text-to-image technology, such as garbled text and incorrect proportions.
In response to my feedback, Habib said that QuickVid is “being tested and tinkered with daily.”
Copyright issues
According to Habib, QuickVid users retain the right to commercially exploit the content they create and have permission to monetize it on platforms like YouTube. But the copyright status around AI-generated content is…nebulous, at least for now. The US Patent and Trademark Office (USPTO) recently revoked copyright protection for an AI-generated comic, for example, saying that copyrightable works require human authorship.
When asked how the USPTO decision could affect QuickVid, Habib said he believes it concerns only the “patentability” of AI-generated products and not the rights of creators to use and monetize their content. Creators, he pointed out, don’t often file patents for videos and typically lean into the creator economy, letting other creators reuse their clips to increase their own reach.
“Creators make a point of posting quality content with their voice that will help grow their channel,” Habib said.
Another legal challenge could affect QuickVid’s DALL-E 2 integration and, by extension, the site’s ability to generate image overlays. Microsoft, GitHub and OpenAI are being sued in a class action lawsuit alleging they violated copyright law by allowing Copilot, a code generation system, to regurgitate portions of licensed code without providing credit. (Copilot was developed jointly by OpenAI and GitHub, which Microsoft owns.) The case has implications for generative art AI like DALL-E 2, which has also been found to copy and paste from the datasets it was trained on (i.e., images).
Habib isn’t concerned, arguing that the generative AI genie is out of the bottle. “If another lawsuit comes up tomorrow and OpenAI goes away, there are several alternatives that could power QuickVid,” he said, referring to Stable Diffusion, an open-source system similar to DALL-E 2. QuickVid is already testing Stable Diffusion to generate avatar images.
Moderation and spam
Legal dilemmas aside, QuickVid could soon face a moderation issue. Generative AI has well-known toxicity and factual accuracy problems, even though OpenAI has implemented filters and techniques to mitigate them. GPT-3 spouts misinformation, especially about current events, which lie beyond the boundaries of its knowledge base. And ChatGPT, a fine-tuned descendant of GPT-3, has been shown to use sexist and racist language.
This is of particular concern to people who would use QuickVid to create informational videos. In a quick test, I had my partner, who is far more creative than I am in this particular area, enter a few offensive prompts to see what QuickVid would generate. To QuickVid’s credit, obviously problematic prompts like “Jewish New World Order” and “9/11 conspiracy theory” didn’t yield toxic scripts. But for “critical race theory indoctrinating students,” QuickVid created a video implying that critical race theory could be used to brainwash schoolchildren.
Habib says he relies on OpenAI’s filters to do most of the moderation work, and claims that it’s up to users to manually review every video created by QuickVid to ensure “everything is within the bounds of the law.”
“As a general rule, I believe people should be able to express themselves and create content that they want,” Habib said.
That apparently includes spammy content. Habib argues that the video platforms’ algorithms, not QuickVid, are best placed to determine a video’s quality, and that people who produce low-quality content “only tarnish their own reputations.” The reputational damage will naturally discourage people from creating bulk spam campaigns with QuickVid, he says.
“If people don’t want to see your video, you won’t be shared on platforms like YouTube,” he added. “Producing low-quality content will also make people see your channel in a negative light.”
But it’s instructive to look at advertising agencies like Fractl, which used an AI system called Grover to create an entire website of marketing materials in 2019, reputation be damned. In an interview with The Verge, Fractl partner Kristin Tynski said she foresaw generative AI enabling “a massive tsunami of computer-generated content in every imaginable niche.”
In any case, video platforms like TikTok and YouTube have not yet had to contend with moderating AI-generated content on a large scale. Deepfakes, synthetic videos that replace an existing person with someone else’s likeness, began populating platforms like YouTube a few years ago, fueled by tools that simplified the production of deepfake footage. But unlike even the most convincing deepfakes today, the types of videos QuickVid creates aren’t obviously AI-generated in any way.
Google Search’s AI-generated text policy could be a preview of what’s to come in the video space. Google doesn’t treat synthetic text any differently than human-written text when it comes to search rankings, but takes action on content “intended to manipulate search rankings and not help users.” This includes merged or combined content from different websites that “[doesn’t] create sufficient added value” and content generated by purely automated processes, both of which could apply to QuickVid.
In other words, AI-generated videos may not be immediately banned from platforms when they take off at scale, but may simply become a cost of doing business. That may not allay the fears of experts who believe platforms like TikTok will become a new home for deceptive videos, but, as Habib said during the interview, “the generative AI revolution is unstoppable.”