Visual Processing and Vector Calculus

Bear · July 14, 2019, 3:41pm

Another thing I have been thinking about during my vector calculus class is representing visual information as a sequence of vectors.

First, we see a line and gradually integrate it into a closed area.

Then a sequence of closed areas makes an object.

Multiple objects then make a space.

According to Green’s theorem, it is also possible to compute the enclosed area from a trace of the perimeter.
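
As a rough illustration of the Green’s theorem remark, here is a minimal sketch. It assumes the traced perimeter is available as an ordered list of (x, y) points forming a closed polygon; the name `area_from_trace` and the sample square are made up for this example. Green’s theorem gives the area as half the closed line integral of (x dy - y dx), which for a polygon reduces to the shoelace formula.

```python
def area_from_trace(points):
    """Signed area enclosed by a closed polygonal trace.

    Discretizes A = 1/2 * (closed integral of x dy - y dx), which follows
    from Green's theorem. The result is positive for a counter-clockwise
    trace and negative for a clockwise one.
    """
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the loop
        total += x0 * y1 - x1 * y0
    return total / 2.0


# Hypothetical trace: a unit square walked counter-clockwise.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(area_from_trace(square))  # 1.0
```

Only the boundary points are used, never the interior, which is the property the post is pointing at.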

Then if we could represent the visual information as a sequence, we could process it the same way we process language ;D

Bitking · July 15, 2019, 1:53am

This may be a non-biological way to do image processing; the programming community has been coming up with all sorts of hacks to do this or that part of image processing for many years.

Keep in mind that for mammals, we look at things by moving our eyes. The acute vision part of your eye can see a spot about the size of your thumbnail at arm’s length. (OK, a little more - but not much more)
See this post to get a better idea of what the brain has to process as fed by your roving eye.

The thumbnails are all projected onto the same area of cortex in rapid succession. There is no tracing of an object’s outline. I have a great deal of trouble matching this known fact of human vision with your proposal.

Bear · July 15, 2019, 2:28am

How about in v1 we have lines, in v2 we combine lines into an angel.
Then it gradually integrates to shapes.

We don’t need to trace the outline of everything.
In the walking-among-rocks example, maybe the person doesn’t need to recognize the shapes of all the stones, but only a surface to land a foot on. Thus, the eye is tracing a fitting surface. In this case, I think light is one of the most important things to take into consideration.

Is there an eye-tracing character gif like “A, B, C”? That should help analysis.

Sorry, I don’t know much neuroscience, since I am only a CS undergrad.
But thanks for replying to me.

Bitking · July 15, 2019, 2:34am

Still does not match up to the known properties. How do you form “lines” when you have little postage stamp fragments of a scene projected one atop another?

And yes, I have been studying this since the 1980’s. Keep at it - there is a lot to learn but this is the area where the cool stuff will be happening for the next 30 years.

Bear · July 15, 2019, 2:44am

How about expressing a vector as a linear combination of those four directions?
Like the intensity on muscles?

Bitking · July 15, 2019, 4:37am

Not necessary. Muscle control all the way from the tension and joint angle control through to the cortical control is a series of simple nested servo loops.
A large part of this is handled in the spinal column, notably, pair control in opposing muscles in joints and pair control in opposing limbs.

Perhaps it’s time to check out some physiology texts? There are good ones that describe the neurological system in great detail.

Bear · July 15, 2019, 3:10pm

Yeah, maybe not the muscle.
The system I am thinking about is more like a vector field.

We could express the position of the eyeball with two variables (written as p(x, y), maybe with a third one for depth...)

Then we could calculate the line (vector) from two points: the beginning and the end.
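
A minimal sketch of that idea, assuming gaze positions are plain (x, y) tuples. The helper names `displacement` and `trace_to_vectors` and the sample positions are hypothetical, and nothing here claims to describe how the eye or cortex actually works.

```python
def displacement(start, end):
    """Vector pointing from the start position to the end position."""
    return (end[0] - start[0], end[1] - start[1])


def trace_to_vectors(positions):
    """Turn a list of successive positions into a sequence of vectors."""
    return [displacement(a, b) for a, b in zip(positions, positions[1:])]


# Hypothetical gaze positions p(x, y) visited in order.
gaze = [(0, 0), (3, 4), (3, 0)]
print(trace_to_vectors(gaze))  # [(3, 4), (0, -4)]
```

The output is a 1D sequence of vectors, so in principle it could be fed to the same kind of sequence model that is used for language.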

However, I don’t know whether it works in humans or not.

gmirey · July 15, 2019, 3:36pm

I don’t think so.
When you have the feet close to the Giza pyramid, you may be able to deduce, from large movements of the eye or head, that those moves were triangular… and thus the shape looks like a triangle (at least in 2D). But (unless specifically trained for this ?), this seems a rather slow “understanding”

Now, when presented with a long-range picture of Giza, in which you see the whole in one shot, my bet is that the realization “oh it’s a triangle” is blinding-fast.

What’s happening is (imho) closer to your first lines of thoughts :

Although it’s not v1->v2 per se which goes from lines to angel… (more like, say v2->v4->somewhere-angel-reactive ?) but, yeah, this kind of hierarchy makes more sense (to me). Also note that such a model leaves out a lot of the real complexities found in those areas.

However, I wouldn’t bet on that whole sequential nature of visual recognition you propose here. I’m not sure you could draw a strong parallel with, say, music.

To be clear : recognizable, temporally sequential patterns of vision surely exist, right ? … But they’d most certainly involve recognition of motion

Edit : woopsie, not angel… you were certainly speaking of “angle”. Nevermind, it works with angels too ! So I’ll leave it there

rhyolight · July 15, 2019, 3:49pm

I see what you did there.

But seriously, I’m not sure anyone at Numenta thinks something like vector calculus is happening in the brain. We think more along the lines of movements being represented by some form of displacement cells, where displacements are the differences between locations represented by grid cells.

We talk first about this in the Columns Paper, more in the Columns+ Paper. And while not necessarily directly defining movement vectors, a lot of interesting capabilities of grid cell structures are investigated in Flexible representation and memory of higher-dimensional cognitive variables with grid cells.

We think this type of representation of location allows us to represent movement without vector calculus.
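
To give a feel for "displacements as differences between locations", here is a toy sketch only. It assumes a single grid-cell-like module that encodes a 2D location purely by its phase within a repeating square tile. The functions `grid_phase` and `displacement`, the integer coordinates, and the period of 5 are all assumptions made for illustration; this is not Numenta's code or model.

```python
def grid_phase(position, period):
    """Toy module: keep only the phase of a 2D location within the tile."""
    return (position[0] % period, position[1] % period)


def displacement(phase_a, phase_b, period):
    """Toy displacement: phase difference from a to b, wrapped into the tile."""
    return ((phase_b[0] - phase_a[0]) % period,
            (phase_b[1] - phase_a[1]) % period)


PERIOD = 5  # assumed tile size for this toy module

# Two pairs of locations related by the same movement (+2, +1).
d1 = displacement(grid_phase((2, 3), PERIOD), grid_phase((4, 4), PERIOD), PERIOD)
d2 = displacement(grid_phase((12, 8), PERIOD), grid_phase((14, 9), PERIOD), PERIOD)
print(d1, d2)  # (2, 1) (2, 1)
```

In this toy, the same movement yields the same displacement code wherever it starts, and the only operations involved are a subtraction and a wrap-around.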

Btw, nice drawings!

Bear · July 15, 2019, 8:38pm

Oh, it is not vector calculus itself; it just borrows the idea of integration.
The main idea is to translate 3D into 1D:
represent an object as a sequence of vectors,
then map the vectors to cells.

It could fit well in a sequential hierarchy structure.
I think I could elaborate more once I finish my group project for my “Scrum” class.

Bitking · July 15, 2019, 8:41pm

Cortex is inherently 2D. It takes multiple maps working together to extract the bag of features that clue us into 3D relationships.
We keep switching 2D reference planes as we internally process 3D space. Many optical illusions flow from this.

bkaz · July 16, 2019, 1:40pm

You are trying to map the way things should work onto the way they actually do work in the brain. And it doesn’t fit: the brain is not a well-designed system, to put it very mildly. If you want to explore functional approaches, check Capsule Networks. Or, if you are feeling insanely overconfident, mine: https://github.com/boris-kz/CogAlg/
Sorry to rain on Numenta’s parade, their work is very interesting in terms of neuroscience.

Bear · July 16, 2019, 10:01pm

Thanks for introducing me to this network.
I am more on the side of bringing cognitive science into AI. (Although I don’t know too much.)
The thing that stops humans from making progress in visual and language processing,
at least for me, is the fact that we don’t understand what it is to “understand.”

We cannot even see the problem, and that is why we cannot even start answering it.
On this problem, I am with Jeff that the brain is the shortest path.

bkaz · July 16, 2019, 10:04pm

Problem is, there is very little science in “Cognitive Science”. Neuroscience is the safest path; whether it is the shortest is an open question.

rhyolight · July 16, 2019, 11:52pm

15 posts were split to a new topic: Action origination, old brain vs cortex

rhyolight · July 16, 2019, 11:54pm

A post was merged into an existing topic: Action origination, old brain vs cortex

Bear · July 17, 2019, 1:37pm

Okay, it is time to explain the bear brain. (It is an imagined brain, because I don’t know how the real brain works.)

I won’t discuss visual recognition because it is way beyond my capability.
Here is an example.

Suppose I ask: which sun is hotter?
A human will need to see the emotions in lines and colors to answer.
It is not merely a classification problem.
It has a lot to do with experience and the other senses.

Thus, I am only going to give a theoretical arrangement of memory in a hypothetical brain.
I know it smells fishy, but please bear with me.

Bear · July 17, 2019, 1:40pm

The first relationship is “contains.”
There are three suns in this picture.
Thus, we could partition it into three blocks to create a containing hierarchy.
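
One way to picture such a containing hierarchy is as a small tree. This is a sketch only: the `Block` class and the labels `sun_1`, `sun_2`, `sun_3` are made up for this example and are not part of the bear brain proposal itself.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """A region of the picture that may contain smaller regions."""
    label: str
    children: list = field(default_factory=list)

    def contains(self, label):
        """True if any descendant block carries the given label."""
        return any(c.label == label or c.contains(label)
                   for c in self.children)


# Hypothetical partition of the picture into three blocks, one per sun.
picture = Block("picture", [Block("sun_1"), Block("sun_2"), Block("sun_3")])
print(picture.contains("sun_2"))  # True
```

Note that this tree records only membership, not where each block sits in the picture.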

Bear · July 17, 2019, 1:50pm

“Contains” can only express a set.
However, different positions in an image could convey different meanings.

Then let’s make a “Memory Interface.”

It uses fewer neurons to represent a memory.
This interface also tells the location of the memory.

Finally, we address the spatial relationship among interfaces.
(I would say grid cells could do it, but I am not sure.)

Then we could arrange the memory of a picture like this.

rhyolight · July 17, 2019, 2:57pm

Have you read Numenta’s Frameworks Paper? We talk a lot about how we think object representation and recognition work with respect to grid cells. It might inform your thinking on this.
