Generative AI videos advanced from comical to photorealistic within a single year. This is uncharted, dangerous territory.
By Andrew Paul | Published Feb 16, 2024 1:15 PM EST
A screenshot from one of the many hyperrealistic videos generated by OpenAI’s Sora program. OpenAI
It’s hard to write about Sora without feeling like your mind is melting. But after OpenAI’s surprise artificial intelligence announcement yesterday afternoon, we have our best evidence yet of what a still-unregulated, consequence-free tech industry wants to sell you: a suite of energy-hungry black box AI products capable of producing photorealistic media that pushes the boundaries of legality, privacy, and objective reality.
Barring decisive, thoughtful, and comprehensive regulation, the online landscape could very well become virtually unrecognizable and somehow more untrustworthy than ever before. Once the understandable “wow” factor of hyperreal woolly mammoths and paper art ocean scapes wears off, CEO Sam Altman’s newest distortion project remains concerning.
Introducing Sora, our text-to-video model.
Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions. https://t.co/7j2JN27M3W
Prompt: “Beautiful, snowy… pic.twitter.com/ruTEWn87vf
— OpenAI (@OpenAI) February 15, 2024
The concept behind Sora (Japanese for “sky”) is nothing particularly new: It apparently is an AI program capable of generating high-definition video based solely on a user’s descriptive text inputs. To put it simply: Sora reportedly combines the text-to-image diffusion model powering DALL-E with a neural network system known as a transformer. While transformers are generally used to parse massive data sequences such as text, OpenAI allegedly adapted the tech to handle video frames in a similar fashion.
“Apparently,” “reportedly,” “allegedly.” All these caveats are required when describing Sora, because as MIT Technology Review explains, OpenAI only granted access to yesterday’s example clips after media outlets agreed to wait until after the company’s official announcement to “seek the opinion of outside experts.” And even when OpenAI did preview their newest experiment, they did so without releasing a technical report or a backend demonstration of the model “actually working.”
This means that, for the foreseeable future, not a single outside regulatory body, elected official, industry watchdog, or lowly tech reporter will know how Sora is rendering the most uncanny media ever produced by AI, what data Altman’s company scraped to train its new program, or how much energy is required to fuel these one-minute video renderings. You are at the mercy of what OpenAI chooses to share with the public, and this is a company whose CEO has repeatedly warned that the extinction risk from AI is on par with nuclear war, but that only men like him can be trusted with the funds and resources to prevent this from happening.
Prompt: “A movie trailer featuring the adventures of the 30 year old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors.” pic.twitter.com/0JzpwPUGPB
— OpenAI (@OpenAI) February 15, 2024
The speed at which we got here is as dizzying as the videos themselves. New Atlas offered a solid encapsulation of the situation yesterday: OpenAI’s sample clips are by no means perfect, but in just nine months, we’ve gone from the “comedic horror” of AI Will Smith eating spaghetti to near-photorealistic, high-definition videos depicting crowded city streets, extinct animals, and imaginary children’s fantasy characters. What will similar technology look like nine months from now, on the eve of potentially one of the most consequential US presidential elections in modern history?
Once you get over Sora’s parlor trick impressions, it’s hard to ignore the troubling implications. Sure, the videos are technological marvels. Sure, Sora could yield innovative, fun, even useful results. But what if someone used it to yield, well, anything other than “innovative,” “fun,” or “useful”? Humans are far more ingenious than any generative AI program. So far, jailbreaking these things has only required some dedication, patience, and a desire to bend the technology for bad-faith gains.
Prompt: “Animated scene features a close-up of a short fluffy monster kneeling beside a melting red candle. the art style is 3d and realistic, with a focus on lighting and texture. the mood of the painting is one of wonder and curiosity, as the monster gazes at the flame with… pic.twitter.com/aLMgJPI0y6
— OpenAI (@OpenAI) February 15, 2024
Companies like OpenAI promise they are currently developing security protocols and industry standards to prevent bad actors from exploiting our new technological world—an uncharted territory they continue to blaze recklessly into with projects like Sora. And yet they have failed miserably in implementing even the most basic safeguards: Deepfakes abuse human bodies, school districts harness ChatGPT to acquiesce to fascist book bans, and the lines between fact and fiction continue to smear.
[Related: Generative AI could face its biggest legal tests in 2024.]
OpenAI says there are no immediate plans for Sora’s public release, and that they are conducting red team tests to “assess critical areas for harms or risks.” But barring any kind of regulatory pushback, OpenAI may well unleash Sora as soon as it can.
“Sora serves as a foundation for models that can understand and simulate the real world, a capability we believe will be an important milestone for achieving [Artificial General Intelligence],” OpenAI said in yesterday’s announcement, once again explicitly referring to the company’s goal of creating AI that is all-but-indistinguishable from humans.
Sora, a model to understand and simulate the real world—what’s left of it, at least.
Copyright for syndicated content belongs to the linked source: Popular Science, https://www.popsci.com/technology/openai-sora-generative-video/