That animated gif is ridiculously misleading. To someone who doesn't know what's going on, it would look like this can hallucinate Emma Watson's face out of a 5x5 px image.
Instead, this is essentially a fancy 2x upsampling, and the gif shows every other frame as a super-resolution result. In fact, if you look at details like her eye, it's not even getting that great a result (unsure if this is supposed to be impressive given the current state of the art).
It is impressive given the current state of the art. Your summary doesn't seem fair. Take a look at the first set (Set5) [1], set the upsampling scale to 4x, and flip between Bicubic and LapSRN. What seems most exciting to me is that LapSRN is more resource-efficient than bicubic.
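If you want to reproduce the bicubic side of that comparison yourself, here's a rough sketch (filenames are made up and this just simulates the 4x setting with plain Pillow; it's not the authors' evaluation code):

    from PIL import Image

    # Simulate the 4x low-res input, then the bicubic baseline to flip against.
    img = Image.open("ground_truth.png").convert("RGB")   # hypothetical filename
    w, h = img.size
    lr = img.resize((w // 4, h // 4), Image.BICUBIC)       # 4x-downsampled input
    bicubic_up = lr.resize((w, h), Image.BICUBIC)          # naive 4x upsampling
    bicubic_up.save("bicubic_x4.png")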
Whether or not their upsampling is impressive given the current state of the art (though in that same set the other, non-bicubic methods seem to do pretty well compared to LapSRN), that gif definitely seems misleading. It's not obvious at all what is happening in it, and it doesn't really show off their method very well (separate gifs for each downsampling rate and/or clearly labeled frames would). The way it is now, it really looks like each progressive enhancement is another stage in the image reconstruction.
I remember a proprietary image format (complete with a proprietary converter and viewer for Windows) in the 1990s that dubbed itself a "fractal compression algorithm" -- and did an impressive job with things like water ripples in rivers or grassy terrain.
You could "zoom in" 8-10X in an original nature photograph and be semi-convinced.
When I was still in computer vision, "super-resolution" meant taking multiple images and combining them to really be able to see what was in the scene.
Now it often means: take a single image and have the computer guess/make up image content. Not the same at all, IMO. There are worse examples than this, though.
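For anyone who hasn't seen the classical version, here's a crude sketch of the idea, just shift-align-and-average with off-the-shelf scikit-image/scipy calls (real pipelines register to sub-pixel accuracy onto a finer grid and add deconvolution; the function and variable names are made up):

    import numpy as np
    from scipy.ndimage import shift as nd_shift
    from skimage.registration import phase_cross_correlation

    def combine_frames(frames):
        # frames: list of 2-D float arrays of the same scene, slightly shifted
        ref = frames[0]
        aligned = [ref]
        for f in frames[1:]:
            offset, _, _ = phase_cross_correlation(ref, f, upsample_factor=10)
            aligned.append(nd_shift(f, offset))   # move each frame onto the reference grid
        return np.mean(aligned, axis=0)           # noise averages out, shared detail stays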
I would also call it a perverted use of the term super-resolution. NNs perform image manipulations beyond resolution enhancement. They add features based on trained sources.
And? Do a search on Scholar for "Example-based Super Resolution" or "Markov Networks for Superresolution". The term has been used this way in the literature for maybe 20 years now.
OK. In my (limited) view of the field these kind of things started with "Super-Resolution From a Single Image" in 2009 but apparently it is older than that.
To me personally, it feels weird to call this super-resolution, but I realize I may be in the minority.
One is attempting to ascertain an underlying commonality given multiple measurements, the other is attempting to infer more information than is present in any measurement.
The results look blurrier than necessary, and in some instances they look too sharp.
I think deep learning is a more promising approach, since for super-resolution you really need to "invent" missing pieces of the image, and this can be highly context-dependent.
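To make the "invent missing pieces" point concrete, the learning-based approaches are roughly shaped like this: a toy SRCNN-style residual net in PyTorch, with made-up layer sizes, not LapSRN or anything from the paper:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinySRNet(nn.Module):
        def __init__(self, scale=2):
            super().__init__()
            self.scale = scale
            self.body = nn.Sequential(
                nn.Conv2d(3, 64, 9, padding=4), nn.ReLU(inplace=True),
                nn.Conv2d(64, 32, 5, padding=2), nn.ReLU(inplace=True),
                nn.Conv2d(32, 3, 5, padding=2),
            )

        def forward(self, lr):
            # Upsample naively, then predict only the residual detail the
            # network learned to "hallucinate" from its training set.
            up = F.interpolate(lr, scale_factor=self.scale, mode="bicubic",
                               align_corners=False)
            return up + self.body(up)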
Reminds me of the Bourne-style "enhance" feature. Too bad the GIF on the main page appears to be marketing (it seems impossible from a single frame like that) and takes away from the achievements of the paper.
Yes, that image is deliberately misleading to the lay person... it is showing how it up-samples from various inputs, each having greater resolution. It is impressive, but it seems to imply that it is doing something magical to produce a full-resolution picture from a few mere pixels. What it is actually doing is impressive enough, but it seems like an odd choice for promoting the work given the obvious visual implication, which can't be an accident.
Am I missing something, or where are the originals?
(I'm assuming "super-resolution" in this context is like a function B = super_resolution(A). I can see lots of what I think are Bs, but where are the As? Aren't they super-relevant?)
As far as I can see there are only other kinds of Bs there (B = cubic(A), B = some_other_super_resolution_method(A), etc, etc). Can't find the real As... :P
There are signs in the data that the downsampling technique that is being used is not gamma-correct. That would somewhat undermine the results (and also the NNs, if they were trained on similarly broken inputs). Can one of the authors clarify that gamma-correct downsampling/blurring/convolution was used?
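(For reference, by "gamma-correct" I mean roughly the following: decode to linear light before averaging pixels, then re-encode. Filenames and the plain 2.2 exponent are illustrative; real sRGB uses a piecewise transfer curve.)

    import numpy as np
    from PIL import Image

    img = np.asarray(Image.open("input.png").convert("RGB"), dtype=np.float64) / 255.0
    linear = img ** 2.2                              # approx. sRGB -> linear light
    h, w = linear.shape[:2]
    linear = linear[:h - h % 4, :w - w % 4]          # crop to a multiple of 4
    small = linear.reshape(h // 4, 4, w // 4, 4, 3).mean(axis=(1, 3))  # 4x4 box average in linear space
    out = (np.clip(small, 0, 1) ** (1 / 2.2) * 255).round().astype(np.uint8)
    Image.fromarray(out).save("downsampled_x4.png")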
Be sure to leave time for the various filtered versions to load: if you hover too fast, the alternate image silently fails to load and you'll be left with the last successfully loaded one.
Almost every site like this gets this wrong - if the switching only happens once the image has finished loading, you need to otherwise hide the now-wrong image. Having to second-guess whether an image has loaded is asinine.
Some have a slightly different UX that makes it more obvious. IIRC Fabrice Bellard's site for the BPG format uses mouse-down events to show the filtered version; it's way less random than hovering over a link, so you intuitively wait longer the first time you hold the button down.
In the trivial sense, yes of course it could be applied. The real question is "would super-resolution be useful for images from large telescopes?", and I believe the answer is "mostly no".
I guess (one of) the exception(s) would be if you have many smaller telescopes that could scan the sky much faster than the few big ones, the small ones could use super-resolution techniques to look for objects that the big ones might find interesting. But I think "astronomer time" rather than "telescope time" is usually the limiting factor.
Mind you, telescopes already do a host of physical tricks to improve resolution, like sensor cooling, stacking images, adaptive optics with laser guide stars, advanced noise filtering etc. Whether ML super-resolution stuff would actually be useful on top of all those tricks is a question for the astronomers in the crowd.
Before the inevitable "enhance" comments come in, please note that these nets are making up the information to insert based on information from the training set.
How do you mean? AFAIK, since those networks "hallucinate" the additional data based on similar but actually unrelated data, it's not of much use in science where accurate data is important.