This may be a non-biological way to do image processing; the programming community has been coming up with all sorts of hacks to do this or that part of image processing for many years.
Keep in mind that we mammals look at things by moving our eyes. The acute-vision part of your eye can see a spot about the size of your thumbnail at arm’s length. (OK, a little more, but not much more.)
See this post to get a better idea of what the brain has to process as fed by your roving eye.
The thumbnails are all projected onto the same area of cortex in rapid succession. There is no tracing of an object’s outlines. I have a great deal of trouble matching this known fact of human vision with your proposal.
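For a rough sense of scale, here is a back-of-the-envelope sketch of the visual angle covered by a thumbnail at arm’s length. The 2 cm thumbnail width and 65 cm arm length are my own assumed round numbers, not measurements from this thread.

```python
import math


def visual_angle_degrees(object_width_m, distance_m):
    """Visual angle subtended by an object of the given width at the given distance."""
    return math.degrees(2.0 * math.atan(object_width_m / (2.0 * distance_m)))


# Assumed values, for illustration only: ~2 cm thumbnail, ~65 cm arm length.
THUMBNAIL_WIDTH_M = 0.02
ARM_LENGTH_M = 0.65

thumbnail_angle = visual_angle_degrees(THUMBNAIL_WIDTH_M, ARM_LENGTH_M)
print(f"Thumbnail at arm's length spans about {thumbnail_angle:.1f} degrees")
```

With those assumptions the patch comes out at a bit under 2 degrees of the visual field, which is why the eye has to keep jumping around to cover a scene.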
How about in v1 we have lines, in v2 we combine lines into an angel.
Then it gradually integrates to shapes.
We don’t need to trace the outline of everything.
In the walking-among-rocks example, maybe the person doesn’t need to recognize the shapes of all the stones, but only a surface to land their feet on. Thus, the eye is tracing a fitting surface. In this case, I think light is one of the most important things to take into consideration.
Is there an eye-tracing character gif like “A, B, C”? That should help analysis.
Sorry, I don’t know much neuroscience, since I am only a CS undergrad.
But thanks for replying to me
Not necessary. Muscle control all the way from the tension and joint angle control through to the cortical control is a series of simple nested servo loops.
A large part of this is handled in the spinal column, notably, pair control in opposing muscles in joints and pair control in opposing limbs.
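To make “nested servo loops” concrete, here is a toy sketch of two loops wrapped around one joint: an outer loop turns the joint-angle error into a desired tension, and a faster inner loop drives the actual tension toward it. The gains, the damping, and the simple joint model are all made-up illustrative values, not physiological ones.

```python
def simulate_nested_servo(target_angle, steps=3000, dt=0.001):
    """Simulate a two-level cascade: angle loop (outer) around a tension loop (inner)."""
    # Hypothetical constants chosen only so the toy system settles smoothly.
    KP_ANGLE = 25.0     # outer loop gain: angle error -> desired tension
    K_TENSION = 200.0   # inner loop gain: how fast tension tracks its setpoint
    DAMPING = 10.0      # viscous damping at the joint
    INERTIA = 1.0       # rotational inertia of the limb segment

    angle = 0.0
    velocity = 0.0
    tension = 0.0
    for _ in range(steps):
        # Outer (slower) loop: compare the joint angle with its target.
        desired_tension = KP_ANGLE * (target_angle - angle)
        # Inner (faster) loop: servo the tension toward the requested value.
        tension += K_TENSION * (desired_tension - tension) * dt
        # Simple joint dynamics driven by the net tension.
        acceleration = (tension - DAMPING * velocity) / INERTIA
        velocity += acceleration * dt
        angle += velocity * dt
    return angle


final_angle = simulate_nested_servo(target_angle=1.0)
print(f"Joint angle after 3 simulated seconds: {final_angle:.4f}")
```

Neither loop knows anything about trajectories or shapes; each one just reduces its own error, and a higher level only has to supply a target.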
Perhaps it’s time to check out some physiology texts? There are good ones that describe the neurological system in great detail.
I don’t think so.
When you are standing at the foot of the Giza pyramid, you may be able to deduce, from large movements of the eye or head, that those moves were triangular… and thus that the shape looks like a triangle (at least in 2D). But (unless specifically trained for this?), this seems a rather slow “understanding”.
Now, when presented with a long-range picture of Giza, in which you see the whole in one shot, my bet is that the realization “oh it’s a triangle” is blinding-fast.
What’s happening is (imho) closer to your first line of thought:
Although it’s not v1->v2 per se which goes from lines to angel… (more like, say v2->v4->somewhere-angel-reactive ?) but, yeah, this kind of hierarchy makes more sense (to me). Also note that such a model leaves out a lot of the real complexities found in those areas.
However, I wouldn’t bet on that whole sequential nature of visual recognition you propose here. I’m not sure you could draw a strong parallel with, say, music.
To be clear: recognizable, temporally sequential patterns of vision surely exist, right? … But they’d most certainly involve recognition of motion.
Edit : woopsie, not angel… you were certainly speaking of “angle”. Nevermind, it works with angels too ! So I’ll leave it there
But seriously, I’m not sure anyone at Numenta thinks something like vector calculus is happening in the brain. We think more along the lines of movements being represented by some form of displacement cells, where displacements are the differences between locations represented by grid cells.
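As a toy illustration of “displacements are the differences between locations”, here is a sketch in which each grid cell module represents a location only as a phase within its own period, and a displacement is the per-module phase difference. The module periods and function names are hypothetical, and this is not a claim about how neurons compute it.

```python
# Hypothetical periods for three grid cell modules (arbitrary small integers).
MODULE_PERIODS = (5, 7, 11)


def encode_location(x, y):
    """Represent a 2D location as one (phase_x, phase_y) pair per module."""
    return tuple((x % p, y % p) for p in MODULE_PERIODS)


def displacement(loc_a, loc_b):
    """Per-module phase difference that leads from loc_a to loc_b."""
    return tuple(
        ((bx - ax) % p, (by - ay) % p)
        for (ax, ay), (bx, by), p in zip(loc_a, loc_b, MODULE_PERIODS)
    )


def apply_displacement(loc, disp):
    """Shift a location code by a displacement code, module by module."""
    return tuple(
        ((x + dx) % p, (y + dy) % p)
        for (x, y), (dx, dy), p in zip(loc, disp, MODULE_PERIODS)
    )


start = encode_location(3, 4)
end = encode_location(10, 1)
move = displacement(start, end)
print(apply_displacement(start, move) == end)
```

The same displacement code comes out wherever the pair of locations sits in space, which is the property that makes it useful for describing how the parts of an object relate to each other.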
Cortex is inherently 2D. It takes multiple maps working together to extract the bag of features that clue us into 3D relationships.
We keep switching 2D reference planes as we internally process 3D space. Many optical illusions flow from this.
You are trying to map the way things should work onto the way they actually do work in the brain. And it doesn’t fit: the brain is not a well-designed system, to put it very mildly. If you want to explore functional approaches, check Capsule Networks. Or, if you are feeling insanely overconfident, mine: https://github.com/boris-kz/CogAlg/
Sorry to rain on Numenta’s parade, their work is very interesting in terms of neuroscience.
Thanks for introducing me to this network.
I am more on the side of bringing cognitive science into AI. (Although I don’t know too much.)
The thing that stops humans from making progress in visual and language processing,
at least for me, is the fact that we don’t understand what it is to “understand.”
We cannot even see the problem, and that is why we cannot even start answering it.
On this problem, I am with Jeff that the brain is the shortest path.
If I ask which sun is hotter?
A human will need to see the emotions in lines and colors to answer.
It is not merely a classification problem.
It has a lot to do with experience and the other senses.
Thus, I am only going to give a theoretical arrangement of memory in a hypothetical brain.
I know it smells fishy, but please bear with me.