How is it not faked? This is the equivalent of that Nikola demo of their electric car, where they set it up to roll down a hill so that it looked like it was working.
The Google video demo is hugely faked, to the point of being smoke & mirrors.
The inputs to the AI are still frames, not video.
The input images are different from the video frames, often substantially so.
The prompts were provided as text, not audio as shown.
Both the actual prompts and responses are significantly longer than shown.
The actual latency is much higher than in the video, where responses appear near real-time.
Etc...
The demo they showed would be a mind-blowing improvement in LLM technology and its applicability to use cases such as controlling a home robot.
Instead, we were shown what is essentially a short science fiction movie "inspired" by possible future capabilities, not current capabilities.
Google literally said so: "The video illustrates what the multimodal user experiences built with Gemini could look like. We made it to inspire developers."
PS: Their other benchmark results are also highly suspect, because there is a high chance that Gemini inadvertently trained on the exam question-answer pairs. They even admit this in several places, such as in their technical paper. So... they knew the results were bogus and published them anyway!
They overstated the capabilities. But even with some additional prompting, the results were amazing! If this had been shown 15 months ago, we would all have been going gaga over it.
> This is the equivalent of that Nikola demo of their electric car, where they set it up to roll down a hill so that it looked like it was working.
Idk how the Gemini demo (a thing that actually works and does the things displayed, just not in the exact way shown) is "equivalent" to a literally non-functioning car...
> The inputs to the AI are still frames, not video.
A video is just a sequence of frames. The input is always going to be frames when it actually goes into the model, and you don't need every single frame to understand what's happening in the video.
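To make that concrete, here is a minimal sketch (in Python, with OpenCV; the filename and the one-frame-per-second rate are illustrative assumptions, not Google's actual pipeline) of how a clip reduces to still frames:

    # Sample roughly one frame per second from a clip using OpenCV
    # (pip install opencv-python). "demo.mp4" is a hypothetical file.
    import cv2

    cap = cv2.VideoCapture("demo.mp4")
    fps = int(cap.get(cv2.CAP_PROP_FPS)) or 30  # fall back if metadata is missing
    frames = []
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % fps == 0:  # keep ~1 frame per second
            frames.append(frame)
        index += 1
    cap.release()
    # Each entry in `frames` is a numpy array that can be handed to a
    # multimodal model as an ordinary still image.

Sampling at 1 fps is an arbitrary choice here; the point is that any "video understanding" ultimately reduces to a set of stills plus text.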
> The prompts were provided as text, not audio as shown.
That's trivial to do now. Using Whisper, you can just turn voice into text and do the exact same thing. They don't really need to demonstrate that.
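For example, a minimal sketch with the open-source whisper package (pip install openai-whisper); the audio filename is hypothetical:

    # Turn a spoken prompt into text with OpenAI's open-source Whisper,
    # then feed the text to the model exactly as in the demo.
    import whisper

    model = whisper.load_model("base")       # small general-purpose checkpoint
    result = model.transcribe("prompt.wav")  # hypothetical recording of the prompt
    print(result["text"])                    # the prompt as plain text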
So sure, they definitely embellished, made it seem real-time and as if it didn't need more target-specific prompting per task. But saying it is completely fake is foolish.
Precisely! I don’t understand how people don’t see this as outright fraud.
The demonstration was a machine intelligence picking out meaning from a video.
The reality was a HUMAN using their meat brain to pick out the meaningful still frames and feeding them into an AI that couldn’t have completed the demonstrated task on its own!
This is like making a demo of a robot cleaning a house, without acknowledging the janitorial staff doing the actual cleaning off-camera.
It’s absurdly fraudulent and should never have been made public.
Videos like this for such an existential product ought to have been reviewed by the CEO. After all, Google’s future relevance as a corporation depends on it.
It was requested, made, reviewed, approved, and then published.
A faked video of science-fiction wishful thinking for a major product launch.