I thought it was interesting the same technique (convolution) was used in two different applications: photoshop/gimp for image filters and in convolutional neural networks (CNNs). What is the purpose of convolution in CNNs?
the setosa interactive says convolution is "a technique for determining the most important portions of an image"
the hubel and wiesel experiment showed "there is a topographical map in the visual cortex that represents the visual field, where nearby cells process information from nearby visual fields. Moreover, their work determined that neurons in the visual cortex are arranged in a precise architecture. Cells with similar functions are organized into columns, tiny computational machines that relay information to a higher region of the brain, where a visual image is formed"
I'm interested in David Marr's work, about to order Vision: A Computational Investigation into the Human Representation and Processing of Visual Information; his levels-of-analysis approach is interesting. I think he did work on edge detection, which must have involved some kind of convolution/filter
This setosa interactive is great, and actually references the gimp documentation
http://setosa.io/ev/image-kernels/
An image kernel is a small matrix used to apply effects like the ones you might find in Photoshop or Gimp, such as blurring, sharpening, outlining or embossing. They're also used in machine learning for 'feature extraction', a technique for determining the most important portions of an image. In this context the process is referred to more generally as "convolution"
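To make the kernel idea concrete, here is a minimal sketch in plain numpy of what "sweeping a small matrix over an image" means; the 3x3 sharpen kernel is the standard one you see in the GIMP docs and on the setosa page, and the random image is just a stand-in:

```python
import numpy as np

def apply_kernel(image, kernel):
    """Slide the kernel over the image; at each position, multiply the
    overlapping pixels elementwise and sum them into one output pixel."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

# The usual 3x3 sharpen kernel from convolution-matrix filter docs.
sharpen = np.array([[ 0, -1,  0],
                    [-1,  5, -1],
                    [ 0, -1,  0]])

gray = np.random.rand(8, 8)               # stand-in for a grayscale image
print(apply_kernel(gray, sharpen).shape)  # (6, 6): shrinks without padding
```

Swap in a different kernel (blur, outline, emboss) and the same loop produces a different effect; a CNN layer runs exactly this operation but learns the kernel values instead of hand-picking them.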
https://knowingneurons.com/2014/10/29/hubel-and-wiesel-the-n...
The classic experiments by Hubel and Wiesel are fundamental to our understanding of how neurons along the visual pathway extract increasingly complex information from the pattern of light cast on the retina to construct an image. For one, they showed that there is a topographical map in the visual cortex that represents the visual field, where nearby cells process information from nearby visual fields. Moreover, their work determined that neurons in the visual cortex are arranged in a precise architecture. Cells with similar functions are organized into columns, tiny computational machines that relay information to a higher region of the brain, where a visual image is formed.
https://ps.is.tuebingen.mpg.de/research_fields/inverse-graph...
Computer vision as analysis by synthesis has a long tradition and remains central to a wide class of generative methods. In this top-down approach, vision is formulated as the search for parameters of a model that is rendered to produce an image (or features of an image), which is then compared with image pixels (or features). The model can take many forms of varying realism but, when the model and rendering process are designed to produce realistic images, this process is often called inverse graphics. In a sense, the approach tries to reverse-engineer the physical process that produced an image of the world.
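The linked page is a research overview rather than code, but the loop it describes is simple enough to sketch: invent a tiny "renderer" with a few scene parameters, then search for the parameters whose rendered image best matches the observed pixels (a toy illustration, not anything from the MPI page):

```python
import numpy as np

def render(row, col, side, size=32):
    """Toy renderer: a white square of the given side at (row, col) on a black canvas."""
    img = np.zeros((size, size))
    img[row:row + side, col:col + side] = 1.0
    return img

target = render(10, 14, 6)   # "observed" image; pretend the parameters are unknown

# Analysis by synthesis: search model parameters, render, compare with the pixels.
best, best_err = None, np.inf
for row in range(32):
    for col in range(32):
        for side in range(2, 10):
            err = np.sum((render(row, col, side) - target) ** 2)
            if err < best_err:
                best, best_err = (row, col, side), err

print(best)  # (10, 14, 6): the scene parameters recovered from the image
```

Real inverse-graphics systems replace the brute-force search with gradient-based optimization or a learned network, but the render-compare-update loop is the same idea.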
> I thought it was interesting the same technique (convolution) was used in two different applications: photoshop/gimp for image filters and in convolutional neural networks (CNNs). What is the purpose of convolution in CNNs?
IMO, convolution in CNNs would be better described as correlation. In a CNN the filters are correlated with the image, i.e. swept over it and multiplied elementwise at each spot with the results accumulated, in order to find where in the image each filter fits. The output is a map of how strongly each filter (there are usually several per convolution layer; that is the third dimension of the layer) responds to the image. These feature maps become the input to the next layers.
As an electrical engineer by training who used convolution all the time, I didn't understand how the two were related, especially because of the third dimension.
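Here is a small numpy/scipy sketch of that distinction: the sweep-multiply-accumulate that CNN layers do is cross-correlation, true convolution is the same sweep with the kernel flipped, and the stack of filters is where the third dimension comes from (random data, just to show the shapes):

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

image  = np.random.rand(6, 6)   # stand-in for an input image / feature map
kernel = np.random.rand(3, 3)   # one learned filter

# What a CNN "convolution" layer actually computes (sweep, multiply, accumulate):
corr = correlate2d(image, kernel, mode='valid')

# Textbook convolution is the same sweep with the kernel flipped in both axes.
conv = convolve2d(image, kernel, mode='valid')
print(np.allclose(corr, convolve2d(image, kernel[::-1, ::-1], mode='valid')))  # True
print(np.allclose(corr, conv))  # False in general (equal only for symmetric kernels)

# The third dimension: a layer holds a stack of filters and outputs one
# response map per filter.
filters = np.random.rand(4, 3, 3)
feature_maps = np.stack([correlate2d(image, f, mode='valid') for f in filters])
print(feature_maps.shape)       # (4, 4, 4): 4 maps, each 4x4
```

Since the filter values are learned anyway, the flip makes no practical difference, which is why the field just calls the operation convolution.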