Visual Processing and Vector Calculus

Another thing I have been thinking about during my vector calculus class is representing visual information as a sequence of vectors.

First, we see a line and gradually integrate it into a closed area.

Then a sequence of closed areas makes an object.

Multiple objects then make a space.

According to Green’s theorem, it is also possible to compute the enclosed area given only the trace of the perimeter.
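To make the Green’s theorem point concrete, here is a minimal sketch, assuming the perimeter trace is a closed polygon given as a list of (x, y) points. For a polygon, the line integral in Green’s theorem reduces to the shoelace formula. The function name `area_from_trace` and the sample points are only for illustration.

```python
def area_from_trace(points):
    """Signed area enclosed by a closed polygonal trace (shoelace formula).

    Positive for a counter-clockwise trace, negative for a clockwise one.
    """
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the trace
        total += x0 * y1 - x1 * y0
    return total / 2.0


unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(area_from_trace(unit_square))  # 1.0
```

Only the boundary points are used; nothing inside the shape is ever sampled.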

Then if we could represent the visual information as a sequence, we could process it the same way we process language ;D

This may be a non-biological way to do image processing; the programming community has been coming up with all sorts of hacks to do this or that part of image processing for many years.

Keep in mind that we mammals look at things by moving our eyes. The acute vision part of your eye can see a spot about the size of your thumbnail at arm’s length. (OK, a little more, but not much more.)
See this post to get a better idea of what the brain has to process as fed by your roving eye.

The thumbnails are all projected onto the same area of cortex in rapid succession. There is no tracing of an object’s outlines. I have a great deal of trouble matching this known fact of human vision with your proposal.


How about in v1 we have lines, in v2 we combine lines into an angel.
Then it gradually integrates to shapes.

We don’t need to trace the outline of everything.
In the walking-among-rocks example, maybe the person doesn’t need to recognize the shapes of all the stones, but only a surface to land their feet on. Thus, the eye is tracing a fitting surface. In this case, I think light is one of the most important things to take into consideration.

Is there an eye-tracking GIF for characters like “A, B, C”? That should help the analysis.

Sorry, I don’t know much neuroscience, since I am only a CS undergrad.
But thanks for replying to me :smiley:

Still does not match up to the known properties. How do you form “lines” when you have little postage stamp fragments of a scene projected one atop another?

And yes, I have been studying this since the 1980’s. Keep at it - there is a lot to learn but this is the area where the cool stuff will be happening for the next 30 years.


How about expressing a vector as a linear combination of those four directions?
Like the intensity of the muscles?
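A rough sketch of what that could look like, assuming “those four directions” are right, left, up, and down (that naming is my assumption) and that each gets a non-negative weight, loosely like a pull strength. This is only an illustration of the linear combination, not a model of real eye-muscle control.

```python
def to_four_directions(dx, dy):
    """Split a 2D displacement into non-negative weights on four directions."""
    return {
        "right": max(dx, 0.0),
        "left": max(-dx, 0.0),
        "up": max(dy, 0.0),
        "down": max(-dy, 0.0),
    }


def from_four_directions(weights):
    """Recombine the four weights into a single (dx, dy) vector."""
    return (weights["right"] - weights["left"], weights["up"] - weights["down"])


weights = to_four_directions(3.0, -2.0)
print(weights["right"], weights["down"])  # 3.0 2.0
print(from_four_directions(weights))      # (3.0, -2.0)
```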

Not necessary. Muscle control all the way from the tension and joint angle control through to the cortical control is a series of simple nested servo loops.
A large part of this is handled in the spinal column, notably, pair control in opposing muscles in joints and pair control in opposing limbs.

Perhaps it’s time to check out some physiology texts? There are good ones that describe the neurological system in great detail.


Yeah, maybe not the muscle.
The system I am thinking about is more like a vector field.

We could express the position of the eyeball with two variables (written as p(x, y), maybe with a third one for depth . . .)

Then we could calculate the line (vector) from two points: the beginning and the end.
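As a small sketch of that calculation, assuming the two eye positions are plain (x, y) pairs: the vector is just the end point minus the start point, and its length and direction follow from it. The function names are made up for this example.

```python
import math


def gaze_vector(start, end):
    """Vector from the starting eye position to the ending one."""
    return (end[0] - start[0], end[1] - start[1])


def length_and_angle(vector):
    """Length of the vector and its direction in degrees from the x-axis."""
    length = math.hypot(vector[0], vector[1])
    angle = math.degrees(math.atan2(vector[1], vector[0]))
    return length, angle


v = gaze_vector((1, 2), (4, 6))
print(v)                       # (3, 4)
print(length_and_angle(v)[0])  # 5.0
```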

However, I don’t know whether it works in humans or not :frowning:

I don’t think so.
When you are standing at the foot of the Giza pyramid, you may be able to deduce, from large movements of the eye or head, that those moves were triangular… and thus that the shape looks like a triangle (at least in 2D). But (unless specifically trained for this?) this seems a rather slow “understanding”.

Now, when presented with a long-range picture of Giza, in which you see the whole in one shot, my bet is that the realization “oh it’s a triangle” is blinding-fast.

What’s happening is (imho) closer to your first line of thought:

Although it’s not v1->v2 per se which goes from lines to angel… (more like, say v2->v4->somewhere-angel-reactive ?) but, yeah, this kind of hierarchy makes more sense (to me). Also note that such a model leaves out a lot of the real complexities found in those areas.

However, I wouldn’t bet on that whole sequential nature of visual recognition you propose here. I’m not sure you could draw a strong parallel with, say, music.

To be clear: recognizable, temporally sequential patterns of vision surely exist, right? … But they’d most certainly involve recognition of motion.

Edit : woopsie, not angel… you were certainly speaking of “angle”. Nevermind, it works with angels too ! So I’ll leave it there :smiley:



I see what you did there. :wink:


But seriously, I’m not sure anyone at Numenta thinks something like vector calculus is happening in the brain. We think more along the lines of movements being represented by some form of displacement cells, where displacements are the differences between locations represented by grid cells.

We first talk about this in the Columns Paper, and more in the Columns+ Paper. And while not necessarily directly defining movement vectors, a lot of interesting capabilities of grid cell structures are investigated in Flexible representation and memory of higher-dimensional cognitive variables with grid cells.

We think this type of representation of location allows us to represent movement without vector calculus.
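A toy illustration of “displacement as a difference between locations”, and emphatically not the model from the papers: it assumes a 1D position and a few hypothetical grid modules, each representing location only as a phase modulo its own period. The periods (3, 5, 7) and all function names are invented for the sketch.

```python
def encode_location(position, periods):
    """Represent a 1D position as one phase per hypothetical grid module."""
    return tuple(position % p for p in periods)


def displacement(loc_a, loc_b, periods):
    """Per-module phase difference that moves loc_a onto loc_b."""
    return tuple((b - a) % p for a, b, p in zip(loc_a, loc_b, periods))


def apply_displacement(loc, disp, periods):
    """Shift a location code by a displacement code, module by module."""
    return tuple((x + d) % p for x, d, p in zip(loc, disp, periods))


periods = (3, 5, 7)
a = encode_location(4, periods)
b = encode_location(10, periods)
move = displacement(a, b, periods)
print(move)                                    # (0, 1, 6)
print(apply_displacement(a, move, periods) == b)  # True
```

In this toy version the same displacement code can be reused from any other starting location, which is the property that makes it behave like a movement rather than a place.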

Btw, nice drawings!


Oh, it is not vector calculus as such; it just borrows the idea of integration.
The main idea is to translate 3D to 1D:
represent an object as a sequence of vectors,
then map the vectors to cells.
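A minimal sketch of the “object as a sequence of vectors” idea, simplified to a 2D outline rather than a 3D object (that simplification is an assumption of the example, as are the function names). The outline is stored as a 1D sequence of step vectors, and the shape can be rebuilt from any starting point.

```python
def outline_to_vectors(points):
    """Turn a closed outline into the sequence of steps between its points."""
    n = len(points)
    vectors = []
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # last step returns to the first point
        vectors.append((x1 - x0, y1 - y0))
    return vectors


def vectors_to_outline(start, vectors):
    """Rebuild the outline points by accumulating the steps from a start point."""
    points = [start]
    x, y = start
    for dx, dy in vectors[:-1]:  # the final step only closes the loop
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
steps = outline_to_vectors(square)
print(steps)                              # [(2, 0), (0, 2), (-2, 0), (0, -2)]
print(vectors_to_outline((5, 5), steps))  # [(5, 5), (7, 5), (7, 7), (5, 7)]
```

The step sequence does not change when the whole shape is moved, so the same sequence describes the square wherever it appears.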

It could fit well in a sequential hierarchy structure.
I think I could elaborate more once I have finished my group project for my “Scrum” class :face_with_raised_eyebrow:


Cortex is inherently 2D. It takes multiple maps working together to extract the bag of features that clue us into 3D relationships.
We keep switching 2D reference planes as we internally process 3D space. Many optical illusions flow from this.


You are trying to map the way things should work onto the way they actually do work in the brain. And it doesn’t fit: the brain is not a well-designed system, to put it very mildly. If you want to explore functional approaches, check Capsule Networks. Or, if you are feeling insanely overconfident, mine:
Sorry to rain on Numenta’s parade, their work is very interesting in terms of neuroscience.


Thanks for introducing me to this network :blush:
I am more on the side of bringing cognitive science into AI. (Although I don’t know too much :joy:)
The thing that stops us from making progress in visual and language processing,
at least for me, is the fact that we don’t understand what it is to “understand.”

We cannot even see the problem, and that is why we cannot even start answering it.
On this problem, I am with Jeff that the brain is the shortest path.


Problem is, there is very little science in “Cognitive Science”. Neuroscience is the safest path; whether it is the shortest is an open question.


Okay, it is time to explain the bear brain. (It is an imagined brain, because I don’t know how the real brain works. :joy:)

I won’t discuss visual recognition because it is way beyond my capability.
Here is an example.

Suppose I ask: which sun is hotter?
A human will need to see the emotions in the lines and colors to answer.
It is not merely a classification problem.
It has a lot to do with experience and the other senses.

Thus, I am only going to give a theoretical arrangement of memory in a hypothetical brain.
I know it smells fishy, but please bear with me. :bear:


The first relationship is “contains.”
There are three suns in this picture.
Thus, we could partition it into three blocks to create a containing hierarchy.

“Contains” can only express a set.
However, different positions within an image can convey different meanings.

Then let’s make a “Memory Interface.”

It uses fewer neurons to represent a memory.
This interface also records the location of the memory.

Finally, we address the spatial relationships among interfaces.
(I would say grid cells could do it, but I am not sure.)

Then we could arrange the memory of a picture like this.
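In code, the arrangement might be sketched roughly as follows. Everything here is hypothetical: the class names, the labels, and the coordinates are invented to illustrate the three pieces (a “contains” hierarchy, a compact memory interface with a location, and spatial relationships among interfaces), and none of it is a claim about real brains.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryInterface:
    """Compact handle for a memory: a label plus where it sits in the picture."""
    label: str
    location: tuple  # (x, y) centre of the block, made-up coordinates


@dataclass
class Picture:
    name: str
    contains: list = field(default_factory=list)  # the "contains" hierarchy

    def relation(self, a, b):
        """Spatial relationship: offset from interface a to interface b."""
        ax, ay = self.contains[a].location
        bx, by = self.contains[b].location
        return (bx - ax, by - ay)


picture = Picture("three suns", [
    MemoryInterface("sun 1", (1, 1)),
    MemoryInterface("sun 2", (3, 1)),
    MemoryInterface("sun 3", (5, 1)),
])
print(len(picture.contains))    # 3
print(picture.relation(0, 2))   # (4, 0)
```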

Have you read Numenta’s Frameworks Paper? We talk a lot about how we think object representation and recognition work with respect to grid cells. It might inform your thinking on this.
