Both OnLive and Gaikai tried this, but I don't think either the lag or the economics really works out.
Consider the following: let's say the server runs at 60fps, i.e. a frame is rendered in 16ms. If the compression takes another 16ms (charitable!), the ping adds another 16ms, plus 16ms to decode on the client (plus 16ms to display!), then from the moment you press the 'fire' button until you see the output, 5-6 frames have gone by, or about 80-96ms, and that's assuming a pretty optimal best case.
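To make the back-of-the-envelope sum explicit, here's a minimal sketch (the stage durations are the same charitable 16ms guesses as above, not measurements):

```python
# Rough click-to-photon budget for remote rendering.
# Every stage uses the same charitable 16ms assumption from above.
pipeline_ms = {
    "render on server": 16,   # one frame at 60fps
    "encode/compress":  16,   # charitable
    "network (ping)":   16,
    "decode on client": 16,
    "display/scan-out": 16,   # wait for the next refresh
}

total_ms = sum(pipeline_ms.values())
print(f"~{total_ms}ms, i.e. ~{total_ms / 16:.0f} frame times at 60fps")
# -> ~80ms (~5 frames), before input sampling adds roughly one more frame (~96ms)
```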
OnLive was measured in the range of 150ms to 250ms input lag. This is just not acceptable for twitch gaming IMHO.
Now that's some data Carmack's not going to like ;-)
> The debug stats in the top corner showed that he was using 60-80KB/sec of bandwidth while playing. He explained that each frame was being sent down over the internet from a co-lo center.
I'm wondering what kind of miracle makes a video game display streamable at that speed with any form of acceptable quality (not even talking about 1080p@60Hz), especially given the following:
> The codec in the prototype was nothing more than a series of JPEG images sent at 15-60+ frames per second.
A single acceptably lossy, typical JPEG game shot at 1080p is north of 600~800kB, while even one at 720p (typical of a '08 laptop) comes in at 150~300kB. Below that you start to get noticeable artifacts.
Even if the game engine/platform knows what has been updated and only sends deltas of a sort (again taking the series-of-JPEG-images case), you still have to account for the worst case, which is basically the whole screen changing at once, sixty times per second.
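For a sense of scale, here's a rough worst-case estimate under the JPEG-per-frame assumption (the per-frame sizes are the ballpark figures above, not measured values):

```python
# Worst case: every frame ships as a full JPEG, no deltas survive.
# Per-frame sizes are the rough figures quoted above.
fps = 60
for label, frame_kb in [("720p", 200), ("1080p", 700)]:
    mbps = frame_kb * 8 * fps / 1000.0   # kB/frame -> Mbit/s
    print(f"{label}: ~{mbps:.0f} Mbit/s")
# -> 720p: ~96 Mbit/s, 1080p: ~336 Mbit/s
# versus the quoted 60-80KB/sec, i.e. roughly 0.5-0.6 Mbit/s.
```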
And indeed, in the Steam video, you can see throughput data at the top: when nothing's moving the value is, as expected, very low, but you immediately notice the peak at 9Mbps when the L4D video ends[0], with typical values during the video at 4~6Mbps, which is much more in line with a 1080p stream.
Even the fading Source logo (which produces very small diff areas) results in a 2.5Mbps stream at 85% quality. There also seems to be some cap at 48Mbps.
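Converting units to put the two figures side by side (the bitrates are the ones read off the overlay, the 60-80KB/sec is the quote above):

```python
# Quoted debug-stat bandwidth vs. the peaks visible in the Steam video.
for kBps in (60, 80):                    # the quoted figure, KB/sec
    print(f"{kBps} KB/s = {kBps * 8 / 1000:.2f} Mbit/s")
for mbps in (4, 6, 9):                   # values read off the overlay
    print(f"{mbps} Mbit/s = {mbps * 1000 / 8:.0f} KB/s")
# The observed peaks are roughly an order of magnitude above the quoted figure.
```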
You can also see various timings, and the adaptive compression kicking in on the 9Mbps section, where you can read:
There seems to be another timing value at the beginning, with an "encryption" label, hovering at 9ms.
BTW, what's with the stream = ${p}% value that increases monotonically with time? That makes it look like a streamed video, not an interactive one. Not trying to be suspicious, but it felt weird.
Uhm - I'd be inclined to hope that a cluster of GPUs (or just the high-end ones being used for something like this) would be capable of rendering at rather higher than 60fps; whilst screen refresh rates may not match that, I would expect each frame to be rendered in rather less than 16ms.
Similarly, by using a decently parallelisable image compression codec (JPEG2000?) it should be possible to compress with the help of those many GPUs, too.
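As a toy illustration of the kind of tiled, parallel encoding I have in mind (Pillow and CPU threads purely as a stand-in for a GPU codec; the tile size and quality are arbitrary):

```python
# Toy sketch: split a frame into tiles and JPEG-encode them in parallel.
from concurrent.futures import ThreadPoolExecutor
from io import BytesIO
from PIL import Image

TILE = 256  # arbitrary tile size in pixels

def encode_tile(tile: Image.Image) -> bytes:
    buf = BytesIO()
    tile.save(buf, format="JPEG", quality=85)
    return buf.getvalue()

def encode_frame(frame: Image.Image) -> list[bytes]:
    w, h = frame.size
    tiles = [frame.crop((x, y, min(x + TILE, w), min(y + TILE, h)))
             for y in range(0, h, TILE)
             for x in range(0, w, TILE)]
    with ThreadPoolExecutor() as pool:
        return list(pool.map(encode_tile, tiles))

if __name__ == "__main__":
    frame = Image.new("RGB", (1280, 720), "gray")  # dummy 720p frame
    chunks = encode_frame(frame)
    print(len(chunks), "tiles,", sum(map(len, chunks)), "bytes total")
```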
The 16ms transfer time is certainly kept generously low, but beyond that: 16ms to decode is something I won't comment on due to the variability of implementation speeds (though it would be done in native code, of course), but a further 16ms to display... 32ms to decode and display a JPEG? I'm pretty sure I've seen MJPEG streams at higher than 30fps before!
Back of a napkin they might be, but I think the figures in this post might require further thought.
Edit: again, to be clear, I'm not suggesting that they are using all of these advantages right now, but the idea that this can't reasonably be done for twitch gaming, even today, strikes me as bizarre, when they are trying to set up a system with whatever custom technology is required to make it work.
A GPU (or a cluster of GPUs) might be able to process, say, 10,000 frames in one second. This does not mean that the same GPUs can process one frame in (1/10,000) of a second.
Even with an infinite number of parallel GPUs, there will be an amount of latency required in copying memory to the GPU, running a job, and copying it back. After the frame is compressed, sent over the network, and picked up by the client, further delay (possibly tens of milliseconds) is added on before pixels appear on the screen.
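To put the distinction concretely, here's an illustrative sketch (the stage times and worker count are made up, only to show that parallelism raises throughput without shrinking the per-frame journey):

```python
# Illustrative only: throughput vs. latency in a pipelined renderer.
stage_ms = [16, 16, 16, 16]   # render, encode, network, decode+display
workers_per_stage = 4         # parallel GPUs/encoders at each stage

# Throughput scales with parallelism: in steady state each stage
# finishes a frame every (stage time / workers) milliseconds.
throughput_fps = 1000.0 / (max(stage_ms) / workers_per_stage)

# Latency does not: any single frame still passes through every stage.
latency_ms = sum(stage_ms)

print(f"~{throughput_fps:.0f} fps throughput, {latency_ms}ms latency per frame")
# -> ~250 fps throughput, 64ms latency per frame
```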