First, I'll make clear that my use of "convolution" above is only to follow the ANN story. Brains were a thing before we knew of mathematics...
So, what we know for sure is that there exist cells in V1 which react to some well-identified situations. I'll describe some below. Between neurons and ANNs, the ones trying to copy the others are of course our modern notions of "convolution" and "kernel".
Those well-identified situations date back to Hubel & Wiesel's studies. In particular, they identified cells which reacted quite characteristically to edges seen in the visual field. Each such cell fired vigorously at the perception of edges of a particular orientation, and was mostly silent at other orientations. They were named "simple cells".
One of the earliest large-scale, scientific visualizations of their organization is, to my knowledge, this plate by Gary Blasdel:
(see http://hubel.med.harvard.edu/book/b30.htm)
This colors a large patch of primary visual cortex, each color representing sensitivity to a particular orientation for those edge detectors. This was from a monkey's brain, but rest assured I have very similar stuff in mine.
Now, this is almost textbook data. You may find more recent imagery (and possibly papers) on the subject by googling for "functional maps of orientation preference".
What one needs to consider when trying to interpret any of the images above is that, besides this orientation thing, the V1 layout is largely retinotopic: two close regions on the cortical patch react to events occurring in two regions that are also close in the visual field.
You can thus infer from the beautiful patchworks above that each local area in the visual field is associated with a patch of nicely arranged cells, covering the whole set of possible orientations for an edge that happens to appear in that area.
So when a CNN uses a fixed edge-detection function over a local area, with a different output for different orientations, and applies this function as a convolution kernel over the entire input as its first filtering step, it is in essence trying to simulate the output of those "simple cells".
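To make the analogy concrete, here is a minimal NumPy sketch of that first filtering step. The odd-symmetric Gabor-like kernel, its parameters (`size`, `sigma`, `wavelength`) and all function names are my own assumptions for illustration. It is neither a model of actual simple cells nor what any particular CNN ends up learning.

```python
import numpy as np

def oriented_kernel(theta, size=7, sigma=2.0, wavelength=4.0):
    """Hypothetical odd-symmetric, Gabor-like edge detector tuned to angle theta (radians)."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    # Coordinate along the direction in which the kernel expects intensity to change
    u = xs * np.cos(theta) + ys * np.sin(theta)
    envelope = np.exp(-(xs**2 + ys**2) / (2 * sigma**2))
    kernel = envelope * np.sin(2 * np.pi * u / wavelength)
    # Zero mean, so that uniform regions give no response
    return kernel - kernel.mean()

def convolve_valid(image, kernel):
    """Apply the same kernel to every local area (cross-correlation, no padding)."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

def orientation_map(image, n_orientations=4):
    """One response map per orientation, stacked along the first axis."""
    thetas = np.arange(n_orientations) * np.pi / n_orientations
    responses = np.stack(
        [np.abs(convolve_valid(image, oriented_kernel(t))) for t in thetas]
    )
    return thetas, responses

# Toy input: a single vertical edge in the middle of a 16x16 image
image = np.zeros((16, 16))
image[:, 8:] = 1.0
thetas, responses = orientation_map(image)
# Index of the preferred orientation at each location, loosely like one pixel of the colored map
preferred = responses.argmax(axis=0)
```

Taking the `argmax` over orientations at each position gives something loosely comparable to one color of the orientation-preference plate, except that here every position uses the exact same set of kernels.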
V1's simple cells do that in a massively parallel way across each local area of its surface (and hence, of the visual field). Note that the concept of a "local area" of the visual field in brains is a lot fuzzier than what is used in CNNs, and it is arguably more continuous than discrete. Nevertheless, what's alike is that both are local, and both perform similar local edge detection across the whole input.
What can be linked to @Bitking's remark about the scanning nature of our visual perception is this: in modern CNNs, which may allow their convolution kernels to learn, what differentiates a "convolution" layer from a classical one with respect to learning is that the same set of convolution cells in the model is repeatedly fed the input from each and every "local area" composing the full input picture. Presumably, the structure of our visual world, and the fact that we constantly move our eyes over it, would have exposed each of our "natural kernels" to statistically similar data. Once again, this part of a CNN model and what V1 does would match.
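The learning difference described above can be sketched in a few lines of NumPy. This assumes a plain squared-error loss and hypothetical function names of my own. The point is only that a shared kernel receives the summed learning signal from all local areas, whereas a "classical" layer with one kernel per position updates each kernel from its own area alone.

```python
import numpy as np

def extract_patches(image, size):
    """Every size x size local area of the image, shape (H', W', size, size)."""
    return np.lib.stride_tricks.sliding_window_view(image, (size, size))

def shared_kernel_gradient(image, kernel, target):
    """Gradient of 0.5 * sum((output - target)**2) for ONE kernel reused everywhere."""
    patches = extract_patches(image, kernel.shape[0])
    output = np.einsum("ijkl,kl->ij", patches, kernel)
    error = output - target
    # Each local area contributes error * patch; weight sharing sums them all
    return np.einsum("ij,ijkl->kl", error, patches)

def unshared_gradients(image, kernels, target):
    """Same loss, but with a separate kernel per position (no weight sharing)."""
    patches = extract_patches(image, kernels.shape[-1])
    output = np.einsum("ijkl,ijkl->ij", patches, kernels)
    error = output - target
    # One independent gradient per position, each seeing only its own local area
    return error[..., None, None] * patches

rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))
kernel = rng.normal(size=(3, 3))
target = np.zeros((6, 6))

shared = shared_kernel_gradient(image, kernel, target)
per_position = unshared_gradients(
    image, np.broadcast_to(kernel, (6, 6, 3, 3)), target
)
# The shared kernel's gradient is the sum of what each position would have learned alone
print(np.allclose(shared, per_position.sum(axis=(0, 1))))
```

In the eye-movement analogy, the sum over positions is replaced by a sum over time: each "natural kernel" stays where it is, and the scanning brings statistically similar patches to it.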
Now, there's more to V1 than "simple cells", for which we already have a few proposed models. There are complex cells. Color blobs. Cells concerned with stereoscopy, and all kinds of stuff. There is still a lot to be found. The very layout of this organization in cortical topology (in contrast to, say, simply studying CNN kernels with an ability to learn) is arguably interesting in itself.
Anyway. There are large holes in our knowledge even about V1, but it's also one of the best-specified patches of cortex we have, straight from the lab.
As I expressed in the very first post... I'm far from the first person to be interested in V1, for precisely this reason. And... maybe I'm a train late here. Could be. I'm just willing to try and inject our new understandings about NMDA spikes, also JH-style prediction, also possibly wave-interference ideas... lots of stuff, really, into those kinds of studies, and see what comes out of it.