This is all very impressive. I can't help but wonder, though: how is text-to-video going to benefit humanity? That's what OpenAI is supposedly about, right?
We'll get some groundbreaking film content out of this in the hands of a few talented creatives, and a vast ocean of mediocre content from the hands of talentless people who know how to type. What's the benefit to humanity, concretely?
> Sora serves as a foundation for models that can understand and simulate the real world, a capability we believe will be an important milestone for achieving AGI.
For models to interact with real-world objects, they first need to understand those objects. These videos demonstrate just how advanced that awareness is. The goal is not to generate videos. Of course, they could and likely will build products on this capability, but the long-term goal is bigger.
Sure, if that's not just marketing. I haven't seen enough evidence to conclude this will go towards that kind of thing yet, but I'm open to the possibility.
They can probably reverse engineer this to build a multi-modal GPT that is fed video and understands what is going on. That's how you get "smart" robots. Active scene understanding via the video modality + conversational capabilities via the text/audio modality.
I'm not quite sure what you mean, so I'll ask for clarification. Are you saying this technology can be channeled into fighting disease and death, or that the man-hours and computational resources freed up by this technology can be channeled?
Yeah, this is a very real issue with a lot of Silicon Valley tech, unfortunately. I feel like they're perfecting the art of pretending everything is fine.
Biologists, chemists, and researchers can all be automated and trained on a very big LLM that OpenAI eventually creates. Then more cures for diseases and more technological advances can be invented. This technology could soon run entire countries and emulate humanity / society.