That animated gif is ridiculously misleading. To someone who doesn't know what's going on, it would look like this can hallucinate Emma Watson's face out of a 5x5 px image.
Instead, this is essentially a fancy 2x upsampling, and the gif shows every other frame as a super-resolution result. In fact, if you look at details like her eye, it's not even getting that great a result (unsure if this is supposed to be impressive given the current state of the art).
It is impressive given the current state of the art. Your summary doesn't seem fair. Take a look at the first set (Set5) [1], set the upsampling scale to 4x, and flip between Bicubic and LapSRN. What seems most exciting to me is that LapSRN is more resource-efficient than bicubic.
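If you want to reproduce the bicubic side of that comparison yourself, here's a rough sketch (filenames are made up and this just simulates the 4x setting with plain Pillow; it's not the authors' evaluation code):

    from PIL import Image

    # Simulate the 4x low-res input, then the bicubic baseline to flip against.
    img = Image.open("ground_truth.png").convert("RGB")   # hypothetical filename
    w, h = img.size
    lr = img.resize((w // 4, h // 4), Image.BICUBIC)       # 4x-downsampled input
    bicubic_up = lr.resize((w, h), Image.BICUBIC)          # naive 4x upsampling
    bicubic_up.save("bicubic_x4.png")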
Whether or not their upsampling is impressive given the current state of the art (though in that same set the other, non-bicubic methods seem to do pretty well compared to LapSRN), that gif definitely seems misleading. It's not obvious at all what is happening in it, and it doesn't really show off their method very well (separate gifs for each downsampling rate and/or clearly labeled frames would). The way it is now, it really looks like each progressive enhancement is another stage in the image reconstruction.
I remember a proprietary image format (complete with a proprietary converter and viewer for Windows) in the 1990s that dubbed itself a "fractal compression algorithm" -- and did an impressive job with things like water ripples in rivers or grassy terrain.
You could "zoom in" 8-10X in an original nature photograph and be semi-convinced.
When I was still in computer vision, "super-resolution" meant taking multiple images and combining them to really be able to see what was in the scene.
Now it often means: take a single image and have the computer guess/make up image content. Not the same at all, IMO. There are worse examples than this, though.
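For anyone who hasn't seen the classical version, here's a crude sketch of the idea, just shift-align-and-average with off-the-shelf scikit-image/scipy calls (real pipelines register to sub-pixel accuracy onto a finer grid and add deconvolution; the function and variable names are made up):

    import numpy as np
    from scipy.ndimage import shift as nd_shift
    from skimage.registration import phase_cross_correlation

    def combine_frames(frames):
        # frames: list of 2-D float arrays of the same scene, slightly shifted
        ref = frames[0]
        aligned = [ref]
        for f in frames[1:]:
            offset, _, _ = phase_cross_correlation(ref, f, upsample_factor=10)
            aligned.append(nd_shift(f, offset))   # move each frame onto the reference grid
        return np.mean(aligned, axis=0)           # noise averages out, shared detail stays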
I would also call it a perverted use of the term super-resolution. NNs perform image manipulations beyond resolution enhancement. They add features based on trained sources.
And? Do a search on Scholar for "Example-based Super Resolution" or "Markov Networks for Superresolution". The term has been used this way in the literature for maybe 20 years now.
OK. In my (limited) view of the field these kind of things started with "Super-Resolution From a Single Image" in 2009 but apparently it is older than that.
To me personally, it feels weird to call this super-resolution, but I realize I may be in the minority.
One is attempting to ascertain an underlying commonality given multiple measurements, the other is attempting to infer more information than is present in any measurement.
The results look blurrier than necessary, and in some instances they look too sharp.
I think deep learning is a more promising approach, since for super-resolution you really need to "invent" missing pieces of the image, and this can be highly context-dependent.
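To make the "invent missing pieces" point concrete, the learning-based approaches are roughly shaped like this: a toy SRCNN-style residual net in PyTorch, with made-up layer sizes, not LapSRN or anything from the paper:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinySRNet(nn.Module):
        def __init__(self, scale=2):
            super().__init__()
            self.scale = scale
            self.body = nn.Sequential(
                nn.Conv2d(3, 64, 9, padding=4), nn.ReLU(inplace=True),
                nn.Conv2d(64, 32, 5, padding=2), nn.ReLU(inplace=True),
                nn.Conv2d(32, 3, 5, padding=2),
            )

        def forward(self, lr):
            # Upsample naively, then predict only the residual detail the
            # network learned to "hallucinate" from its training set.
            up = F.interpolate(lr, scale_factor=self.scale, mode="bicubic",
                               align_corners=False)
            return up + self.body(up)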
Reminds me of the Bourne-style "enhance" feature. Too bad the GIF on the main page appears to be marketing (it seems impossible from a single frame like that) and takes away from the achievements of the paper.
Yes, that image is deliberately misleading to the lay person... it is showing how it up-samples from various inputs, each having greater resolution. It is impressive, but it seems to imply that it is doing something magical to produce a full-resolution picture from a few mere pixels. What it is actually doing is impressive enough, but it seems like an odd choice for promoting the work given the obvious visual implication, which can't be an accident.
Am I missing something, or where are the originals?
(I'm assuming "super-resolution" in this context is like a function B = super_resolution(A). I can see lots of what I think are Bs, but where are the As? Aren't they super-relevant?)
As far as I can see there are only other kinds of Bs there (B = cubic(A), B = some_other_super_resolution_method(A), etc, etc). Can't find the real As... :P
There are signs in the data that the downsampling technique that is being used is not gamma-correct. That would somewhat undermine the results (and also the NNs, if they were trained on similarly broken inputs). Can one of the authors clarify that gamma-correct downsampling/blurring/convolution was used?
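(For reference, by "gamma-correct" I mean roughly the following: decode to linear light before averaging pixels, then re-encode. Filenames and the plain 2.2 exponent are illustrative; real sRGB uses a piecewise transfer curve.)

    import numpy as np
    from PIL import Image

    img = np.asarray(Image.open("input.png").convert("RGB"), dtype=np.float64) / 255.0
    linear = img ** 2.2                              # approx. sRGB -> linear light
    h, w = linear.shape[:2]
    linear = linear[:h - h % 4, :w - w % 4]          # crop to a multiple of 4
    small = linear.reshape(h // 4, 4, w // 4, 4, 3).mean(axis=(1, 3))  # 4x4 box average in linear space
    out = (np.clip(small, 0, 1) ** (1 / 2.2) * 255).round().astype(np.uint8)
    Image.fromarray(out).save("downsampled_x4.png")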
Be sure to leave time for the various filtered versions to load: if you hover too fast, the alternate image silently fails to load and you'll be left with the last successfully loaded one.
Almost every site like this gets this wrong - if the switching only happens once the image has finished loading, you need to otherwise hide the now-wrong image. Having to second-guess whether an image has loaded is asinine.
Some have a slightly different UX that makes it more obvious. IIRC Fabrice Bellard's site for the BPG format uses mouse-down events to show the filtered version; it's way less random than hovering over a link, so you intuitively wait longer the first time you hold the button down.
In the trivial sense, yes of course it could be applied. The real question is "would super-resolution be useful for images from large telescopes?", and I believe the answer is "mostly no".
I guess (one of) the exception(s) would be if you have many smaller telescopes that could scan the sky much faster than the few big ones, the small ones could use super-resolution techniques to look for objects that the big ones might find interesting. But I think "astronomer time" rather than "telescope time" is usually the limiting factor.
Mind you, telescopes already do a host of physical tricks to improve resolution, like sensor cooling, stacking images, adaptive optics with laser guide stars, advanced noise filtering etc. Whether ML super-resolution stuff would actually be useful on top of all those tricks is a question for the astronomers in the crowd.
Before the inevitable "enhance" comments come in, please note that these nets are making up the information to insert based on information from the training set.
How do you mean? AFAIK, since those networks "hallucinate" the additional data based on similar but actually unrelated data, it's not of much use in science where accurate data is important.