OpenAI's Anticipated Sora Release Scheduled for This Year
OpenAI's Chief Technology Officer, Mira Murati, said in a recent interview with The Wall Street Journal that the company's new AI video generator, Sora, will launch "this year," though it could take "a few months" to finalize. Murati joined OpenAI in 2018 as Vice President of Applied AI and Partnerships and has since seen the company's research efforts surge and its expenses grow sharply. In 2019, OpenAI adopted a for-profit structure with a cap on profits.
Rapid Ascension and Notable AI Releases under Murati's Leadership
Murati rose quickly within OpenAI, serving as Senior Vice President of Product and Partnerships before becoming CTO. During her tenure as CTO, OpenAI has released high-profile AI products such as DALL-E 2 and ChatGPT, both of which drew significant public interest.
AI Engine Capabilities and Safety Measures Discussed
The Wall Street Journal interview ranged across the types of content AI engines can generate and the safety measures currently in place. Combating misinformation has become a focal point for OpenAI. Murati stated that Sora will include multiple safeguards to prevent misuse of the technology, and that the development team aims to avoid releasing features that could "affect global elections." Sora will follow the same prompt policy as DALL-E and refuse to generate images of public figures such as the President of the United States.
Watermarking and Metadata for AI-Generated Content
The official version of Sora will also watermark its outputs, displaying a translucent OpenAI logo in the bottom right corner to mark the content as AI-generated. Murati added that the development team regards content provenance as another crucial factor and is looking at metadata that records the origins of digital media. Concerns remain despite these efforts, because researchers have previously defeated existing image watermark protections, including OpenAI's.
Enhanced Generation Capabilities
Contrary to rumors that generating a video takes hours, a live demonstration showed Sora creating a 20-second, 720p video in just minutes. Sora's operating costs are currently significantly higher than DALL-E's, and OpenAI is working to bring them close to those of DALL-E, its text-to-image model, by the time of public release.
Future Developments and Editing Tools
When discussing Sora's future, Murati shared several intriguing updates. The development team plans to "eventually" add sound to videos for a more realistic experience and is preparing editing tools to provide online creators with a method to correct AI mistakes.
Despite its advanced capabilities, Sora is not infallible and often makes errors. In a prominent example from the interview, a prompt requesting a video of a robot stealing a camera from a woman produced a clip in which part of the woman's body turned into a mechanical structure. Murati acknowledged the need for improvement, stating that while Sora is "quite good in terms of continuity, it's not perfect."
Addressing Nudity and Artistic Expression
The topic of displaying nudity was also broached, with Murati noting that OpenAI is "exploring with artists… what kind of nudity can be shown." She emphasized that artists may desire more control during the creative process. OpenAI is collaborating with artists and creators from various fields to explore the most practical features and the appropriate level of flexibility the tool should offer.
The development team sees no insurmountable contradiction between allowing artistic nudity and prohibiting non-consensual deepfakes. OpenAI's stated aim is to establish its product as a platform for expanding creativity.
Ongoing Testing and Data Source Concerns
When questioned about the training data used for Sora, Murati was evasive. OpenAI has recently faced copyright infringement lawsuits that accuse the company of scraping content without permission to train ChatGPT.
She initially said that, to her knowledge, only "publicly available and licensed data" was used to train the model. However, Murati admitted she was unsure whether videos from YouTube, Facebook, or Instagram were used during training. She later confirmed that Shutterstock media content was used, noting the partnership between Shutterstock and OpenAI.
Tim Brooks, a member of the Sora project team, also avoided questions about training data specifics, stating, "It's not convenient to go into too much detail, but broadly, it includes public data and licensed data from OpenAI."
Brooks shared a detail about training Sora with a vast amount of video data: "Previously, whether for image or video models, everyone usually trained on a fixed size. We trained Sora with videos of varying lengths, ratios, and resolutions. As for the approach, we took a variety of images and videos, whether widescreen, long-format, small clips, high-definition, or low-definition, and divided them into small pieces. Then, based on the size of the input video, we trained the model to recognize different numbers of pieces. This way, our model can learn from a variety of data more flexibly and generate content in different resolutions and sizes."
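Brooks's description can be made concrete with a little arithmetic. The sketch below is only an illustration of the idea that inputs of different sizes break into different numbers of pieces. The patch dimensions, the function names, and the padding rule are assumptions made for this example; the article gives no such details, and this is not OpenAI's implementation.

```python
import math

# Hypothetical patch size: 4 frames x 16 x 16 pixels.
# The article does not state the real values.
PATCH_T, PATCH_H, PATCH_W = 4, 16, 16


def patch_grid(frames, height, width):
    """Return the number of patches along time, height, and width.

    Assumes an input is padded up to a whole number of patches,
    which is why ceiling division is used.
    """
    return (
        math.ceil(frames / PATCH_T),
        math.ceil(height / PATCH_H),
        math.ceil(width / PATCH_W),
    )


def patch_count(frames, height, width):
    """Total number of pieces one input is divided into."""
    t, h, w = patch_grid(frames, height, width)
    return t * h * w


# Inputs of different lengths, ratios, and resolutions (frames, height, width).
clips = {
    "widescreen clip": (16, 720, 1280),
    "vertical clip": (16, 1280, 720),
    "single image": (1, 256, 256),
}

for name, shape in clips.items():
    print(name, patch_grid(*shape), patch_count(*shape))
```

Under these assumed sizes, a still image is simply a one-frame input that yields far fewer pieces than a video clip, which matches Brooks's point that the same model can learn from images and videos of any shape.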
Murati promised that Sora will "definitely" be released by the end of the year but gave no exact date, indicating only that it would arrive within the next few months. The development team continues to run safety tests on the engine, aiming to identify any "vulnerabilities, biases, and other harmful outcomes."
For those eager to try Sora firsthand, it is worth becoming familiar with editing software, because Sora will make many mistakes even after the official release. In the meantime, let's all look forward to the debut of this promising newcomer in the AI landscape!