Image Kernels Explained Visually (setosa.io)
188 points by apetresc on Jan 29, 2015 | 29 comments



I first learnt about kernels and convolution a few months ago during a computer vision module at university - it was really insightful. The exact methods used to perform Gaussian blurring, edge detection, etc. were something I hadn't given much thought to before.

A cool fact about the Gaussian filter is that it's separable - you can convolve in the X direction (using a 1 * n kernel), and then convolve the result again along the Y direction (using an m * 1 kernel) - the final result will be the same as convolving with a single m * n kernel, but the cost drops from m * n multiplications per pixel to just m + n.
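If you want to see that numerically, here's a minimal sketch in Python/SciPy (the kernel size, sigma, and image here are arbitrary choices):

    import numpy as np
    from scipy.ndimage import convolve, convolve1d

    # A small 1-D Gaussian kernel (size and sigma are arbitrary)
    x = np.arange(-2, 3)
    g = np.exp(-x**2 / 2.0)
    g /= g.sum()

    img = np.random.rand(256, 256)

    # Separable: two 1-D passes, m + n multiplications per pixel
    sep = convolve1d(convolve1d(img, g, axis=0), g, axis=1)

    # Direct: one 2-D pass with the outer-product kernel, m * n per pixel
    full = convolve(img, np.outer(g, g))

    assert np.allclose(sep, full)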

Not every filter is separable - it's only possible when the n * m filter can be expressed as the outer product of an n * 1 and a 1 * m matrix.
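You can check separability numerically: a kernel matrix is separable exactly when it has rank 1, which an SVD exposes. A rough sketch (not from the article):

    import numpy as np

    def separate(kernel, tol=1e-10):
        """Return (col, row) with np.outer(col, row) ~= kernel, or None."""
        u, s, vt = np.linalg.svd(kernel)
        if np.sum(s > tol) != 1:
            return None  # more than one significant singular value: not rank 1
        return u[:, 0] * np.sqrt(s[0]), vt[0] * np.sqrt(s[0])

    assert separate(np.ones((3, 3)) / 9.0) is not None  # box blur: separable
    assert separate(np.eye(3) / 3.0) is None            # diagonal motion blur: not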

Another cool fact is that you can perform convolution as a point-wise multiplication in Fourier space (see http://en.wikipedia.org/wiki/Convolution_theorem).
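A quick illustration of the theorem with NumPy/SciPy (a sketch; the sizes are arbitrary):

    import numpy as np
    from scipy.signal import convolve2d

    img = np.random.rand(64, 64)
    kernel = np.random.rand(5, 5)

    # Zero-pad both to the full output size, multiply point-wise in
    # Fourier space, then transform back
    shape = (img.shape[0] + kernel.shape[0] - 1,
             img.shape[1] + kernel.shape[1] - 1)
    fast = np.fft.irfft2(np.fft.rfft2(img, shape) * np.fft.rfft2(kernel, shape), shape)

    direct = convolve2d(img, kernel, mode='full')
    assert np.allclose(direct, fast)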


Two really cool facts about the Gaussian filter are: 1) it's the only separable isotropic (i.e. 'round') kernel, and 2) there's a Deriche approximation where the number of operations per pixel doesn't depend on the filter size: https://espace.library.uq.edu.au/view/UQ:10982/IIR.pdf


A few other useful resources to play with convolutional kernels:

* The linked demo really focuses only on layer 1. See Layers >=2 plotted here: http://arxiv.org/pdf/1311.2901v3.pdf

* DeepViz is a nice tool: https://github.com/bruckner/deepViz

* Another tool one can use to play with convolutional kernels is ShaderToy ( https://www.shadertoy.com/ ). Here's a Gaussian blur: https://www.shadertoy.com/view/XdfGDH

* If you like playing with kernels in shaders, see also Brad Larson's GPUImage: ( https://github.com/BradLarson/GPUImage ) -- the demo app has a bunch of standard kernels.

Just for fun, convolutional shader porn: https://www.shadertoy.com/view/4d2Xzc


I've always been a fan of the non-linear kernel Kuwahara filter

https://www.shadertoy.com/view/lls3WM

http://en.wikipedia.org/wiki/Kuwahara_filter


It's the same as the bilateral filter, isn't it?


I made a custom one that I like:

   -.5 | 1.5 | -.5
   1.5 | -3  | 1.5
   -.5 | 1.5 | -.5
It produces a cool digital edge blurring effect. What's that called?
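If anyone wants to try it outside the demo, here's a sketch in SciPy (note the entries sum to 1, so overall brightness is preserved):

    import numpy as np
    from scipy.ndimage import convolve

    k = np.array([[-0.5,  1.5, -0.5],
                  [ 1.5, -3.0,  1.5],
                  [-0.5,  1.5, -0.5]])

    img = np.random.rand(128, 128)   # stand-in for a grayscale image
    out = convolve(img, k, mode='reflect')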


There are a few examples here: http://docs.gimp.org/en/plug-in-convmatrix.html

Try the "edge detection" example with -3 instead of -4, and it looks very similar to yours.


For the custom kernel, it would be great if there were an option to keep the matrix normalized.
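Normalizing just means dividing by the sum of the entries, so the filter neither brightens nor darkens the image overall - something like:

    import numpy as np

    def normalize(kernel):
        """Scale entries to sum to 1; leave zero-sum kernels
        (e.g. edge detectors) untouched."""
        s = kernel.sum()
        return kernel if np.isclose(s, 0.0) else kernel / s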


and if interested, error-diffusion dithering kernels:

http://www.tannerhelland.com/4660/dithering-eleven-algorithm...


> They're also used in machine learning for 'feature extraction'... In this context the process is referred to more generally as "convolution"

It's referred to as "convolution" in the image processing community too.


Not surprising, because the kernel is convoluted on the pixels

http://en.wikipedia.org/wiki/Convolution

Now, as to what's happening: you're basically applying a FIR filter to each pixel, so that each one depends also on the frequency information of adjacent pixels (in two dimensions).

If someone wants to know more: http://en.wikipedia.org/wiki/Finite_impulse_response http://en.wikipedia.org/wiki/Z-transform


Minor nitpicks - "Convolved" with the pixels. And the FIR filter doesn't depend on the frequency information in the adjacent pixels, but rather on the intensity of the pixels. A short FIR filter must have large frequency support, so the filtering depends on the frequency information given by all pixels.


That's a much more profound explanation than the one given. You can actually come up with values yourself, and it reveals why cases like "edge detection" and "blur" work so nicely (edge detection approximates differentiation; blur acts as a low-pass filter; ...).
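For example, the standard edge kernels fall straight out of finite differences (a sketch; these particular matrices are common textbook choices):

    import numpy as np

    # Central difference f'(x) ~ (f(x+1) - f(x-1)) / 2 as a horizontal kernel
    dx = np.array([[-0.5, 0.0, 0.5]])

    # Second difference f''(x) ~ f(x-1) - 2 f(x) + f(x+1), applied along
    # both axes and summed, gives the Laplacian edge-detection kernel
    laplacian = np.array([[0,  1, 0],
                          [1, -4, 1],
                          [0,  1, 0]])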


SAR Image Processor was my first exposure to convolution. http://www.general-cathexis.com/


For interest's sake, note that the blur kernel used here is an approximation of the Gaussian [1]. Also, the vImage documentation includes a brief discussion of where the values in these kernels came from [2].

[1] http://en.wikipedia.org/wiki/Gaussian_blur

[2] https://developer.apple.com/library/ios/documentation/Perfor...


The world needs more visual hands-on explanations :) Very nice work


Slightly off-topic, but I really like the effect with the side-by-side images. The HTML tags are <image-as-matrix></image-as-matrix> and <kernel-matrix></kernel-matrix>.

Does anyone know which lib (if any) he's using or how it's done? Read the source, couldn't figure it out.


Angular as the overall app framework, and D3.js for the canvas fun. Take a look at his script.js, it's pretty straightforward.


Can it be done without Angular.js?


AngularJS is just being used to make those custom HTML tags (called "directives" in Angular-speak). It likely has nothing to do with the actual functionality.


Yes


It's interesting to try a directionally-biased blur filter, which produces an out-of-focus effect in one dimension only. For instance, put 0.5 in the left and right cells, or the top and bottom cells, with 0 everywhere else.
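In kernel form (a sketch of the horizontal case; the entries sum to 1, so brightness is preserved):

    import numpy as np
    from scipy.ndimage import convolve

    # Horizontal-only blur: average the left and right neighbours
    h_blur = np.array([[0.0, 0.0, 0.0],
                       [0.5, 0.0, 0.5],
                       [0.0, 0.0, 0.0]])

    out = convolve(np.random.rand(128, 128), h_blur)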


This is my favorite Explained Visually. Don't forget to try the live video!


Thanks! I didn't even notice the live video option. How is that done I wonder? Is it all happening client side in Javascript?


Yep! It was tricky to make it performant, but I mostly just used Chrome's performance profiling tools to pinpoint optimizing-compiler bailouts. A big one was just casting to ints with |0 inside the kernel function.


When you calculate the next pixel, do you use the updated value of the previous pixel or do you use the original?


Aside from making conceptual sense, using the original value has the nice property that there's no dependence between the calculations of separate pixels, so the filtered image can be computed fully in parallel. GPUs are very good at this, which makes them useful not just for graphics rendering but also for other tasks that involve the same mathematical operation, such as deep learning (the first few stages of a convolutional deep neural net are just applications of specially tuned convolution kernels).


The original value. You're computing a new image as a function of the input, not modifying the input in place.
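A naive implementation makes this explicit - the output goes to a separate array, and every output pixel reads only the original input (a sketch):

    import numpy as np

    def convolve_naive(img, k):
        kf = k[::-1, ::-1]                     # flip for true convolution
        kh, kw = kf.shape
        ph, pw = kh // 2, kw // 2
        padded = np.pad(img, ((ph, ph), (pw, pw)), mode='edge')
        out = np.empty_like(img)               # separate output buffer
        for y in range(img.shape[0]):
            for x in range(img.shape[1]):
                # reads the padded *input* only, so pixels are independent
                out[y, x] = np.sum(padded[y:y+kh, x:x+kw] * kf)
        return out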


Really awesome stuff! I think there's a very minor bug in which missing pixels are treated as black, which is what adds a black border around the output image in the second example.
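For comparison, most convolution libraries let you choose how out-of-bounds pixels are handled; e.g. in SciPy (a sketch):

    import numpy as np
    from scipy.ndimage import uniform_filter

    img = np.ones((8, 8))
    # mode='constant', cval=0 treats missing pixels as black: dark border
    dark = uniform_filter(img, size=3, mode='constant', cval=0.0)
    # mode='reflect' mirrors the edge pixels instead: no artifact
    clean = uniform_filter(img, size=3, mode='reflect')
    assert dark[0, 0] < clean[0, 0]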



