Write text, get video: Meta announces AI video generator

Seeing video is not believing —

Using a text description or an existing image, Make-A-Video can render video on demand.

Sep 29, 2022 3:39 p.m. UTC

Still image from an AI-generated video of a teddy bear painting a portrait.

Today, Meta announced Make-A-Video, an AI-powered video generator that can create novel video content from text or image prompts, similar to existing image synthesis tools like DALL-E and Stable Diffusion. It can also make variations of existing videos, though it's not yet available for public use.

On Make-A-Video's announcement page, Meta shows example videos generated from text, including "a young couple walking in heavy rain" and "a teddy bear painting a portrait." It also showcases Make-A-Video's ability to take a static source image and animate it. For example, a still photo of a sea turtle, once processed through the AI model, can appear to be swimming.

The key technology behind Make-A-Video, and the reason it has arrived sooner than some experts anticipated, is that it builds off existing work with text-to-image synthesis used in image generators like OpenAI's DALL-E. In July, Meta announced its own text-to-image AI model called Make-A-Scene.

Rather than training the Make-A-Video model on labeled video data (for example, captioned descriptions of the actions depicted), Meta took image synthesis data (still images paired with captions) and applied unlabeled video training data so that the model learns a sense of where a text or image prompt might exist in time and space. It can then predict what comes after the image and display the scene in motion for a short period.

  • A video of a teddy bear painting a portrait, created with Meta's Make-A-Video AI model (converted to GIF for display here).

  • A video of "a young couple walking in heavy rain" created with Make-A-Video.

  • Video of a sea turtle, animated from a still image with Make-A-Video.

"Using function-preserving transformations, we widen the spatial layers astatine the exemplary initialization signifier to see temporal information," Meta wrote successful a white paper. "The extended spatial-temporal web includes caller attraction modules that larn temporal satellite dynamics from a postulation of videos."

Meta has not made an announcement about how or when Make-A-Video might become available to the public or who would have access to it. Meta provides a sign-up form that people can fill out if they are interested in trying it in the future.

Meta acknowledges that the ability to generate photorealistic videos on demand presents certain social hazards. At the bottom of the announcement page, Meta says that all AI-generated video content from Make-A-Video contains a watermark to "help ensure viewers know the video was generated with AI and is not a captured video."

If history is any guide, competitive open source text-to-video models may follow (some, like CogVideo, already exist), which could make Meta's watermark safeguard irrelevant.
