I’ve been increasingly inclined to believe that how the eyes move over an image has a lot to do with abstracting and recognising what is in an image. Convolutional ANNs have made progress with visual invariance by looking for overlap in small chunks of an image, rather than the whole image. I think another key to visual invariance is the information sent to the cortex about eye movement.
Saccades tend to follow contrast (edges?), and because eye movements are relative to the image, the movement of the eyes along those edges can give a lot of invariant information about an object. The image below shows lines representing edges and red dots representing saccadic points. The arrows represent the common eye direction over the edges. If you were to use convolution to find the overlap between A and B, there would be a fairly low score at every level of features. But to us they look similar because they both share the characteristics of a corner. This could be due to the very similar eye movements that occur on both. The motor commands to the eyes go up, up, up, … then right, right, right, … These are very similar, and when represented as an SDR over time there would be a massive overlap.
The same kind of invariance can be used on C, even though the direction of eye movement is different. The similarity comes from the sequence of saccades. Instead of up, up, up, … right, right, right, … the sequence is down, down, down, … left, left, left, … and the abstract representation of both is a, a, a, … b, b, b, … The key point is the absolute delta of the eye movements in the [x,y] directions between each saccade: [0,1], [0,1], [0,1], … [1,0], [1,0], [1,0].
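To make the absolute-delta idea concrete, here is a minimal Python sketch. The fixation coordinates are made up for illustration (unit steps, y increasing upward), and `abs_deltas` is a hypothetical helper name, not something from an existing library. It only shows that a corner traced up-then-right and one traced down-then-left produce the same sequence of absolute deltas.

```python
def abs_deltas(fixations):
    """Return the absolute (dx, dy) between each pair of consecutive fixation points."""
    return [
        (abs(x1 - x0), abs(y1 - y0))
        for (x0, y0), (x1, y1) in zip(fixations, fixations[1:])
    ]

# Hypothetical fixation points, not measured data.
# Corner A: up, up, up, then right, right, right.
corner_a = [(0, 0), (0, 1), (0, 2), (0, 3), (1, 3), (2, 3), (3, 3)]
# Corner C: down, down, down, then left, left, left.
corner_c = [(3, 3), (3, 2), (3, 1), (3, 0), (2, 0), (1, 0), (0, 0)]

seq_a = abs_deltas(corner_a)
seq_c = abs_deltas(corner_c)

print(seq_a)           # three (0, 1) steps followed by three (1, 0) steps
print(seq_a == seq_c)  # the two corners give the same delta sequence
```

Note that absolute deltas on their own only cancel the sign of a movement. A 90 degree rotation swaps the x and y components, so that case would need the more abstract a, a, a, … b, b, b, … labelling.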
Take another shape like an oblong: although there is very little overlap between its two orientations, we see them as very similar. When your eyes move over the edges the movements are very similar, especially when you compare it to a rectangle.
Using saccading information could solve the problem of position, scale and rotation invariance. I’d like to read other people’s thoughts on this, and I’m curious if there’s any relevant research already out there on how saccades feed into the cortex.