Unfortunately, webp claims to offer the same image quality at a smaller file size, but the 'real world analysis' compares image quality at the same file size.
Further, image quality is measured in terms of mean RGB or value difference, which is a purely technical metric that matters little to the human eye. For example, in the portrait (the last picture), the JPEG artifacts in the person's face hurt far more than webp's blurring, yet both have about the same statistical mean values.
Last, without the original lossless image at hand, it is hard to tell which lossy encoder is better. Taking the portrait as an example again, you have to download the provided lossless image to see that webp blurred the face too much.
Still, it's better than other quick-and-dirty 'analyses' I've seen so far.
1. How else would you construct an objective test? Creating images of the same quality and then comparing their file sizes seems prohibitively difficult if not impossible. This test, otoh, implies the same information (minus the exact difference in file size), but is pretty easy to set up.
2. Those were the reported numbers, but it's not like that was the only data point discussed in the article. Every single photo had deeper analysis and included subjective evaluation. He even directly made the point that you did about the last photo.
3. Loading megabytes of photo data into folks' browsers by default doesn't seem worth the benefit (the page is already sluggish). If you're interested, the originals are available for download.
There's more than one "objective" test for image quality, and some of them are closer to human perception than PSNR. The trouble with comparisons like this is that the JPG encoder might be tuned for PSNR (i.e. to maximize it), but if the WebP encoder is tuned to optimize for something else, of course it won't perform as well when you compare the PSNR!
More info on the issues with this type of benchmark from Jason Garrett-Glaser (a.k.a. Dark Shikari), an x264 developer: http://x264dev.multimedia.cx/?p=458
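For anyone who wants to poke at this themselves, here's a rough sketch in Python (assuming Pillow built with WebP support and scikit-image >= 0.19; the file names are made up) that scores a JPEG and a WebP decode against the lossless source with two different metrics, which makes it easy to see how they can disagree:

    # Rough sketch: compare decodes against the lossless original
    # using both PSNR and SSIM. File names are placeholders.
    import numpy as np
    from PIL import Image
    from skimage.metrics import peak_signal_noise_ratio, structural_similarity

    original = np.asarray(Image.open("portrait_lossless.png").convert("RGB"))
    jpeg     = np.asarray(Image.open("portrait.jpg").convert("RGB"))
    webp     = np.asarray(Image.open("portrait.webp").convert("RGB"))

    for name, decoded in (("jpeg", jpeg), ("webp", webp)):
        psnr = peak_signal_noise_ratio(original, decoded)
        ssim = structural_similarity(original, decoded, channel_axis=-1)
        print(f"{name}: PSNR = {psnr:.2f} dB, SSIM = {ssim:.4f}")

An encoder tuned for one of those numbers can look great on it and mediocre on the other, which is exactly the trap this kind of benchmark falls into.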
That's fine, I agree. My point was not "His metric is the most awesome, shut up", it was "Despite only giving one kind of number, the author still included other analysis and subjective opinion in the writeup for each photo, so it's not as bad as you're trying to imply".
You asked, "How else would you construct an objective test?" My answer is: just make sure you're being fair by measuring the same thing the encoders are optimising for.
That was in response to the complaint that the post looked at images of the same size and compared their quality, rather than looking at images of the same quality and comparing their size. If you know of a practical way to objectively do the latter, please share.
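Not perfectly objective, but one practical approximation: pick a perceptual score, re-encode the JPEG at decreasing quality settings until it matches the score the WebP encode achieved, then compare file sizes. A crude sketch in Python with Pillow and scikit-image (the file name, the target SSIM and the WebP size are placeholders, not real measurements):

    import io
    import numpy as np
    from PIL import Image
    from skimage.metrics import structural_similarity

    original = Image.open("portrait_lossless.png").convert("RGB")
    ref = np.asarray(original)

    webp_ssim  = 0.93      # SSIM of the WebP encode vs. the original (placeholder)
    webp_bytes = 34_000    # size of that WebP file in bytes (placeholder)

    # Walk JPEG quality downwards until SSIM drops to the WebP's level,
    # then compare how many bytes each format needed for "the same" quality.
    for quality in range(95, 5, -5):
        buf = io.BytesIO()
        original.save(buf, format="JPEG", quality=quality)
        buf.seek(0)
        decoded = np.asarray(Image.open(buf).convert("RGB"))
        ssim = structural_similarity(ref, decoded, channel_axis=-1)
        if ssim <= webp_ssim:
            print(f"JPEG q={quality}: SSIM={ssim:.4f}, "
                  f"{buf.getbuffer().nbytes} bytes vs {webp_bytes} bytes for WebP")
            break

It's still only as good as the metric you pick, of course, but it answers the "same quality, compare size" question directly.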
On two occasions I have been asked,—"Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" .... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
--Babbage (1864), Passages from the Life of a Philosopher, ch. 5 "Difference Engine No. 1"
You can't reasonably use an encoder optimized for PSNR and then "ding" it for producing output with bad SSIM, or vice-versa.
I'm not familiar enough with the image encoding algorithms (and the tradeoffs they are faced with) to comment on that, specifically. But I've done enough research with systems in general to disagree with your statement. No matter what I optimize a system for, it's always fair for someone to test it under adverse conditions. If we're going to adopt something, we need to know its limitations.
Further, my understanding of Google's intent with webp is that it is being offered as a replacement for JPEG. In that case, even if it's optimized for one thing, it's not just fair but necessary to see how it performs on everything else.
About 1: This test looks at the dual claim ('better quality at the same file size' instead of 'same quality at a smaller file size'), and that's more or less justified. (Look up duality in linear optimization for the definition. It's useful.)
Well, if you test an algorithm that claims to offer the same image quality at a smaller file size, you should compare images of the same quality and look at the file sizes, not the other way round.
Besides, I don't see why you need megabytes of photo data; regular internet photos would have sufficed (which is what the authors say they developed webp for anyway).
I agree with your point that he included subjective evaluation, though.
I'm the author of the article. You make an excellent point here. It's very hard to quantify image quality. It's true that you need the lossless source files (http://jjcm.org:8081/webp) if you want to do a true comparison against the original, but in the real world I'd venture to guess you won't have those; that's why I felt it was unnecessary for most readers.
To my eye, webp reduces compression artifacts at the expense of losing detail (this is especially visible in the portrait example).
Another way to accomplish a similar effect without requiring a new file format would be to apply a selective gaussian blur (or another de-artifacting filter) to highly compressed jpegs before displaying them.
Of course, no one does that because the trade-off is generally not considered to be a good one.
Compression artifacts are loss of detail. Applying any kind of blur filter to an image that had compression artifacts would result in more loss of detail (and applying it strongly enough to remove the artifacts would pretty much obliterate any recognizable details).
Actually, while applying a blur does obviously eliminate detail, you can retain most of the important detail with a selective blur. Give it a try in GIMP and see for yourself. The default Selective Gaussian Blur settings work pretty well for de-artifacting, and the loss of detail arguably isn't much worse than that in webp compression. Some of the commercial de-artifacting tools use different techniques (I don't know what they are) that result in even less loss of detail.
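If you'd rather script it than click through GIMP, OpenCV's bilateral filter is a reasonable stand-in for Selective Gaussian Blur: it only averages neighbouring pixels that are similar in value, so block and ringing artifacts get smoothed while strong edges mostly survive. A quick sketch (the file name and parameters are just starting points to tweak, not recommended settings):

    import cv2

    img = cv2.imread("highly_compressed.jpg")
    # d = neighbourhood diameter, sigmaColor = how different two pixels may be
    # and still get averaged together, sigmaSpace = spatial falloff.
    deartifacted = cv2.bilateralFilter(img, d=7, sigmaColor=40, sigmaSpace=7)
    cv2.imwrite("deartifacted.png", deartifacted)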
Compression artifacts add noise via coefficient rounding, just like audio codecs add noise instead of "removing sound you can't hear". However, encoders with poor psy-optimization (i.e. not x264) usually choose blurry quantizations that don't add much noise but also remove the original detail.
Also, the deblocking filter in VP8/H264 is nothing but a selectively applied blur that hides previous compression steps. It works pretty well.
Those who don't learn from history are doomed to repeat it. This is a futile effort by Google and will go nowhere.
If they somehow managed to chop file sizes by 90% or more, it would have a small chance (and even that wouldn't be guaranteed: in the era of CDNs, caching, and large pipes, static images just aren't a big concern). Instead they've marginally chopped file sizes in only certain scenarios, while adding numerous new downsides and actually reducing the feature set.
Are they insane? I'm surprised they actually announced this.
Like MP3, JPEG is good enough unless the improvement of a replacement format is overwhelming. It doesn't have the political baggage of something like h264, so that argument doesn't apply.
More interesting than this silliness from Google are formats that actually bring new and impressive features. I recall that JPEG2000 could do incremental, stallable loading (or maybe I'm thinking of something else), such that as you scaled an image it wasn't loading an entirely new image, but instead was loading incrementally more data to provide the detail for that level. IP stopped it from taking off, but that was actually interesting. This isn't.
Give it 15 years. If it's free and better, then at some point all browsers and tools will have it as an option. If CPU keeps improving faster than bandwidth, and free codebases keep growing, then an option to create WebP on the fly (or downconvert to JPEG for older clients, roughly as sketched below) will become effortless.
In the meantime, if early adopters get a slightly better web experience -- that's a win for Google. They want more marginal pressure to upgrade.
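For what it's worth, the fallback side of that isn't much code even today. A minimal sketch of the idea in Python (assuming Pillow built with WebP support; the function name, paths and the Accept-header check are simplified assumptions, not a real server):

    import io
    from PIL import Image

    def image_bytes_for(path_webp, accept_header):
        # Serve the WebP as-is if the client advertises support...
        if "image/webp" in accept_header:
            with open(path_webp, "rb") as f:
                return f.read(), "image/webp"
        # ...otherwise transcode to JPEG on the fly for older clients.
        buf = io.BytesIO()
        Image.open(path_webp).convert("RGB").save(buf, format="JPEG", quality=85)
        return buf.getvalue(), "image/jpeg"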