Runway Gen-3: A Leap Forward in AI Video Generation

  • 6 min read

Runway's New Gen-3 Video Generation Model Draws Praise, With Some Saying It Surpasses Sora

AI company Runway, known for its popular video generation tools, has unveiled Gen-3 Alpha, the latest version of its video model. Gen-3 Alpha is the first in a family of models trained on new infrastructure Runway built specifically for large-scale multimodal training. Compared with Gen-2, it delivers significant improvements in fidelity, consistency, and motion, a solid step toward building a general world model.

The new model is still in alpha testing and has not been publicly released. Judging from a series of demonstration videos, however, it appears to take a significant leap in continuity, realism, and prompt adherence over the currently available Gen-2.

Fine-grained temporal control

Gen-3 Alpha was trained with highly descriptive, temporally dense captions, enabling imaginative transitions and precise key-framing of elements within a scene.

Realistic human figures

Gen-3 Alpha excels at generating expressive human characters with a wide range of actions, gestures, and emotions, opening up new possibilities for storytelling.

Made for artists, by artists

Gen-3 Alpha was trained by a cross-disciplinary team of research scientists, engineers, and artists, and is designed to interpret a wide range of visual styles and cinematic vocabulary.

Videos generated by the Gen-3 model, especially close-ups of human faces, look strikingly realistic. This has led members of the AI art community to compare it with OpenAI's highly anticipated but still unreleased Sora.

User reactions

In a highly upvoted post in a Runway Gen-3 discussion thread, one Reddit user commented, "Even though what's being shown are carefully selected showcase pieces, they look much better than Sora. Sora's output still has a trace of stylization, while these videos are more realistic; they're the best AI-generated videos I've seen so far."

Another user in the 66,000-member AI Video subreddit wrote, "If you hadn't told me, I would have sworn these scenes were filmed for real."

AI filmmaker and self-described Runway creative partner PZF tweeted, "These Runway Gen-3 clips really appeal to me; they look cinematic. The footage is smooth and understated (I mean very natural-looking) and quite believable."

In addition to the Gen-3 video generator itself, Runway is also launching a set of fine-grained control tools offering more flexible image and camera options. The company tweeted, "Gen-3 Alpha will power Runway's text-to-video, image-to-video, and text-to-image tools, existing control modes (such as Motion Brush, Advanced Camera Controls, and Director Mode), and upcoming tools, providing unprecedented fine-grained control over structure, style, and motion."

Gen-3 Alpha was trained jointly on videos and images, which is what allows it to power this full range of generation tools and control modes.

Runway says Gen-3 is an important step toward its ambitious goal of building a "general world model." Such models let an AI system build an internal representation of an environment and simulate future events within it, which sets Runway's approach apart from conventional video prediction that only forecasts the next likely frame along a fixed timeline.
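To make that distinction concrete, here is a minimal, purely illustrative Python sketch; nothing below comes from Runway, and every class, function, and parameter name is hypothetical. A next-frame predictor maps recent frames directly to the single frame that follows, while a world model encodes observations into an internal latent state and rolls that state forward for arbitrarily many simulated steps before decoding back to pixels.

```python
import numpy as np

# Hypothetical illustration only; this is NOT Runway's architecture.
rng = np.random.default_rng(0)

class NextFramePredictor:
    """Conventional approach: predict just the next frame from recent frames."""
    def predict_next(self, frames: list) -> np.ndarray:
        # Stand-in for a learned model: naively copy the last observed frame.
        return frames[-1].copy()

class WorldModel:
    """World-model approach: maintain an internal latent state of the scene
    and simulate its dynamics forward, decoding each step back to pixels."""
    def __init__(self, latent_dim: int = 8, frame_size: int = 16):
        self.latent_dim = latent_dim
        self.frame_size = frame_size
        # Stand-ins for learned transition and decoder parameters.
        self.transition = rng.normal(size=(latent_dim, latent_dim)) * 0.5
        self.decoder = rng.normal(size=(latent_dim, frame_size * frame_size))

    def encode(self, frame: np.ndarray) -> np.ndarray:
        # Stand-in encoder: compress the frame into a small latent vector.
        return frame.flatten()[: self.latent_dim]

    def rollout(self, frame: np.ndarray, steps: int) -> list:
        z = self.encode(frame)
        future = []
        for _ in range(steps):
            z = np.tanh(self.transition @ z)  # advance the internal "world" state
            future.append((self.decoder.T @ z).reshape(self.frame_size, self.frame_size))
        return future

frame = rng.random((16, 16))
print(NextFramePredictor().predict_next([frame]).shape)  # (16, 16): one frame ahead
print(len(WorldModel().rollout(frame, steps=48)))        # 48 simulated future frames
```

The design point the sketch illustrates is that a world model carries a persistent internal state across simulated steps rather than re-predicting from pixels each time, which is what makes simulating an environment forward in time possible at all.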

Although Runway has not yet announced a specific release date for Gen-3, co-founder and CTO Anastasis Germanidis said that Gen-3 Alpha "will soon appear within Runway's products" and will support all the familiar existing modalities (text-to-video, image-to-video, video-to-video) as well as "some new modalities that can only be achieved with more powerful foundational models."

Competitor comparison

Runway's journey in AI began in 2021, when it collaborated with researchers at Ludwig Maximilian University of Munich on the first version of Stable Diffusion. Stability AI later stepped in to cover the project's growing compute costs, and the model went on to ignite the global craze for AI image generation.

Since then, Runway has been a major player in AI video generation, competing with rivals such as Pika Labs. But when OpenAI announced Sora, whose demonstrated capabilities went beyond existing models, the market landscape shifted. Hollywood actor Ashton Kutcher recently caused a stir by saying that tools like Sora could completely upend how film and television are made.

As the world eagerly awaits Sora's release, new competitors have emerged, including Kling, built by Chinese company Kuaishou, and Luma AI's Dream Machine.

Kling can generate 1080p video up to two minutes long at 30 frames per second (up to 3,600 frames per clip), a significant step beyond existing models. It has already been released, but signing up requires a Chinese phone number; Kuaishou says a global version will launch later.

Another rising star, Dream Machine, is a free platform that turns written text into video, with results that surpass Runway Gen-2 in quality, continuity, and prompt adherence. Signing in requires only a Google account, but the platform is so popular that generation is often slow and sometimes fails to complete.

Title: Runway Gen-3: A Leap Forward in AI Video Generation

On the open-source side, Stable Video Diffusion does not stand out for output quality, but its openness provides a solid foundation for further improvement and development. Vidu, another AI video generator, comes from Beijing-based Shengshu Technology and Tsinghua University; built on a proprietary architecture called Universal Vision Transformer (U-ViT), it can generate 16-second 1080p videos with a single click.

As for the aforementioned Pika Labs, it has not shipped any major updates, and its output remains roughly on par with Runway Gen-2.
