You mean the “same object” should cause the two groups of neurons to activate together? I think I understand that’s Hebbian learning at work afterwards, but before that, either one and the same object triggers their spiking, or two different objects trigger them separately (yet at the same time). How are these cases distinguished?
By the frequency with which these two cases occur.
If there is one object moving around, then in each learning session the object will very reliably trigger the same activity at each location it moves to, which is a strong signal for Hebbian learning to pick up on.
If there are two separate objects moving around, then typically they will have different motion trajectories, which will lead to different activity on each trial. In this case Hebbian learning will try to learn a different relationship on each trial and to unlearn the relationships from past trials.
Hebbian learning is a statistical method for learning correlations that exist in the world (or rather, in the input data).
It’s not foolproof but unless someone is concocting illusions to trick you, it works well enough to distinguish between your two cases.
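To make the statistics concrete, here is a minimal sketch: a plain Hebbian rule with weight decay, run on both cases. All sizes, rates, and patterns are invented for illustration; this is not Numenta’s actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
n, eta, decay = 50, 0.1, 0.02   # neurons per group, learning rate, weight decay

def sparse(p=0.1):
    """A random sparse binary activity pattern."""
    return (rng.random(n) < p).astype(float)

def train(pattern_pairs):
    """Plain Hebbian rule with decay: co-activation potentiates, the rest fades."""
    w = np.zeros((n, n))
    for a, b in pattern_pairs:
        w += eta * np.outer(a, b) - decay * w
    return w

# Case 1: one object moving around -- the same pairing recurs on every trial.
a, b = sparse(), sparse()
w_one = train([(a, b)] * 100)

# Case 2: two independent objects -- a fresh, unrelated pairing each trial.
w_two = train([(sparse(), sparse()) for _ in range(100)])

print("consistent pairing, max weight:", w_one.max())  # strong, stable synapses
print("random pairings,   max weight:", w_two.max())   # stays weak and diffuse
```

The consistent pairing drives a few synapses toward saturation, while the ever-changing pairings keep learning and unlearning each other, so no connection ever gets strong.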
So you need a combination of same-time and same-location, which is an addition to plain Hebbian learning. That discrimination by location lives in a separate “where” pathway, mapping inputs to locations, so you need coincident activation in the what and where pathways? Or are both represented in the same column but in different layers, like those hypothetical grid cells in layer 5?
I think there are two parts to this. Learning separate objects works with Hebbian learning, but sensing two objects at the same time is another problem.
It might be sensing parts of a pencil and a coffee mug, for example. Until it senses enough features, it can’t be sure whether this is one object or two. Or it could even be feeling two identical coffee mugs.
I don’t remember the exact mechanisms of voting well enough. It might work with multiple objects. Voting is explained thoroughly in this video: https://www.youtube.com/watch?v=XUpmN_CLOZc
To deal with multiple instances of an identical object, it must involve reference frames (coordinate systems of locations). Each kind of object has its own reference frame (I recall), but that wouldn’t distinguish identical objects. For those, it’d have to use other reference frames, like position relative to the body. There are many kinds of reference frames in the brain. Injuries lead to various strange problems, like hemineglect.
Incidentally, I’m skimming through Tutorial and Discussion on Cortical Column Voting Mechanisms Developed by Numenta - 5 April, 2022 right now, which also addresses the voting mechanism and appears more recent. I think it clears up my confusion in the OP: there is no such thing as an “SDR schema”, at least in Numenta’s conception of SDRs. Subutai Ahmad explicitly states at 10:27 that “Object IDs” would be problematic, and around that moment he explains at length that there are no identical SDRs across cortical columns, not even one-to-one mappings.
The HTM learning algorithm as discussed in that video seems to do heavy skyhooking, by randomly allocating a new SDR for each new object to learn and keeping its activation stable for learning purposes. So it’s very true that learning must be quite different from sensing.
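A toy sketch of what I understand the allocation step to be (the sizes and the `allocate_object_sdr` helper are my own invention, not Numenta’s code):

```python
import numpy as np

rng = np.random.default_rng(42)
N, W = 2048, 40   # illustrative SDR width and number of active bits

def allocate_object_sdr():
    """Pick W random bits out of N as the stable code for a brand-new object.

    During learning this SDR is held active, so Hebbian learning can associate
    every sensed feature-at-location with it.
    """
    sdr = np.zeros(N, dtype=bool)
    sdr[rng.choice(N, size=W, replace=False)] = True
    return sdr

mug, pencil = allocate_object_sdr(), allocate_object_sdr()
print("overlap bits:", int((mug & pencil).sum()))  # typically 0 or 1 out of 40
```

Random sparse codes almost never collide, which is why the allocation can be blind; but it also means nothing about the code itself reflects the object, which is the “skyhook” part.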
This leads me to another interesting question: what about the single per-CC SDR for an object, across the (possibly long) course of learning it? Does it change as the object’s features are gradually learned? How will it converge or diverge once the input turns out to be the same object, or a different one?
It’s held stable in Numenta’s current simulation algorithm, but as someone asked in that video, in reality one can learn an object halfway, switch to learning other things, then resume learning the previous object. It doesn’t seem that exactly the same SDR could be resumed perfectly in that case.
I later realized that “reference frames” is one of the key points of A Thousand Brains; my imaginary circles do occupy a certain space (at a certain location) on the canvas in my mind, so yes, I think they use the reference frame of my imaginary canvas.
Yet there’s still one thing I’m not clear about: “roundness” is not an object by itself but some attribute/property/feature. An imaginary circle is an object; I still wonder what SDRs have to do with associating roundness with a circle. And further, for two circles, it’s very unintuitive for me to conceive that they possess two separate “instances” of roundness.
@dmac This is very convincing. We are adapting creatures in the first place; I should put my software-programming thinking (where data types have perfectly distinct semantics) aside while wandering through neuroscience.
If I’m understanding Bitking’s earlier comment correctly, there are serial processes involving the subcortex when processing 2 separate instances of a similar object, so they could indeed share some of the same SDRs that represent the same features of an object, e.g. the feature of roundness. Our eyes have dedicated hardware/cells and processing to detect visual features such as edges. With a CNN such feature detection is also trivial. I think Numenta has so far left untouched the serial-processing parts involving the subcortex and the attention mechanisms necessary to answer your question.
I’m in the process of grokking https://discourse.numenta.org/t/htm-mini-columns-into-hexagonal-grids per Bitking’s pointer, hoping to update my mental model and clear things up.
Am I understanding you correctly that each cortical column contains “an” SDR for “roundness”? While each such SDR is shared by other neurons, they are “local” to their CC, in that no two of them are the “same”?
I’m still skeptical about a feature having any identity (whether in the form of an SDR or not). Unlike an object, features are attributive, and far too many features can come along with an object in reality.
Or do features have SDRs but usually appear as parts of a “union” SDR perceived along with an object? In that case, if a circle and a square (i.e. 2 objects) are perceived by a CC at the same time, how is one not confused into thinking that the circle possesses “squareness” and the square possesses “roundness”?
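To make my worry concrete (purely illustrative SDRs): a flat union really does discard the pairing, so the correct scene and the crossed binding come out identical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, W = 2048, 40   # illustrative SDR width and active-bit count

def sdr():
    """A random sparse code standing in for one object or feature."""
    s = np.zeros(N, dtype=bool)
    s[rng.choice(N, size=W, replace=False)] = True
    return s

circle, square, roundness, squareness = sdr(), sdr(), sdr(), sdr()

scene_correct = circle | roundness | square | squareness
scene_swapped = circle | squareness | square | roundness

print(np.array_equal(scene_correct, scene_swapped))  # True -- same union either way
```

This is exactly the classic binding problem, in SDR clothing.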
Or could there be a bold assumption that, at any instant of time, a single CC can only perceive a single object? So that my feeling of two objects being recognized “at the same time” is actually implemented as some CCs in my brain perceiving one object while some other CCs perceive the other, the two receiving roughly equal amounts of my attention?
Roundness, or any other shape that spans many CCs, will have to be resolved at a hierarchy level higher than a single CC. If you think about visual acuity in relation to the receptive field of a single CC, you should notice that a single CC can’t see very much. The work of Hubel and Wiesel that I have read talked about detecting edges (the transition between luminance levels), and that was about it.
https://www.cns.nyu.edu/~david/courses/perception/lecturenotes/V1/lgn-V1.html
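As a crude sketch of what “detecting edges” amounts to (toy numbers, not a model of real V1): a simple cell behaves roughly like a small oriented filter dotted against its tiny receptive field.

```python
import numpy as np

# A tiny image patch: left half dark, right half bright (a vertical edge).
patch = np.zeros((5, 5))
patch[:, 3:] = 1.0

# An oriented receptive field: an inhibitory column beside an excitatory one,
# so it responds to vertical luminance transitions.
rf = np.array([[-1.0, 1.0]] * 5)   # 5x2 filter

def respond(col):
    """Dot the receptive field against the patch at one horizontal offset."""
    window = patch[:, col:col + rf.shape[1]]
    return float((window * rf).sum())

print([respond(c) for c in range(patch.shape[1] - 1)])
# [0.0, 0.0, 5.0, 0.0] -- the cell fires only where the edge sits
```

Each such cell sees only its own few “pixels”; anything bigger than an edge has to be assembled above it.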
A quick question about this: What’s the unit of such higher levels?
I roughly understand that “1000 brains” refers to each CC as “a brain”? Or not so?
And a quick answer: Again, hierarchy.
My software-engineering mindset would most likely read “hierarchy” through the idea of hierarchical storage or the like; could I get some pointers for quickly loading the neural hierarchy concept into my mind?
Just seeing this phrase in Grids into Maps!, I realize that what I really wanted to ask with this OP is the Binding Problem, as it relates to SDR/HTM.
That’s a lot to digest but I feel enlightened
Here is some fuel to feed to that software mind:
There are roughly 100 postage-stamp-sized processing areas in the cortex. Each area is composed of interconnected cortical columns. The cortical columns’ lateral connections are all local to these areas/maps:
These processing areas are stitched together by bundles of fibers, arranged in general pathways, which establish the topology that is the hierarchy. These connections are frequently bi-directional; axons project from each area’s layer 2/3 to the reciprocal map:
and
This general scheme works in tandem with another broad connection scheme involving projections that loop through the thalamus but I will leave that for a later post.
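Since this is fuel for a software mind, here is the same picture as a data structure. The area names and links are just illustrative examples, not an anatomical claim; the point is the shape of the graph.

```python
# A toy adjacency structure for "areas stitched together into a hierarchy".
cortex = {
    "V1":  ["V2", "MT"],            # V1 -> MT is a "level skip"
    "V2":  ["V1", "V4", "MT"],
    "V4":  ["V2", "IT"],
    "MT":  ["V1", "V2", "MST"],
    "IT":  ["V4"],
    "MST": ["MT"],
}

# The links are reciprocal (feedforward axons from layer 2/3, feedback in
# return), so the "hierarchy" is a graph with cycles, skips, and lateral
# edges -- not a tree like a storage hierarchy.
for src, dsts in cortex.items():
    for dst in dsts:
        assert src in cortex[dst], f"{src}->{dst} is not reciprocal"
print("every listed connection is reciprocal")
```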
@Bitking Thanks very much!
And a little quick question:
I used to sense “hierarchy” as always having “acyclic” levels (e.g. a social hierarchy or a computer storage hierarchy), but if cortical areas are so heavily inter-connected, I can barely accept that some “lower” level may actually turn out to be “indirectly higher” than its “directly higher” level. Is this sort of thing the norm that I should just live with, or is there some settling idea for my mind?
Yes, there are level skips and lateral map connections.
Sorting this all out is the work of the connectome project.
https://www.humanconnectome.org/
The location layer could probably learn partial objects. Like the object layer, it votes, gradually making the location less ambiguous. However, while the sensor is moving, the location changes over time, so the location layer doesn’t hold a random SDR active while learning an object.
The brain is bad at path integration. With no sensory info, it gets lost quickly. So it anchors locations to sensory info, like landmarks in a town or parts of an object. As a result, the locations are almost like sensory information. They’re unique to the object. If the object isn’t known, the location is ambiguous. If the location is unambiguous, it knows the object.
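A one-dimensional cartoon of “bad path integration, anchored by landmarks” (all numbers invented):

```python
import numpy as np

rng = np.random.default_rng(7)
true_pos, est_pos = 0.0, 0.0
landmark_at = 10.0   # a known sensory anchor along the path

for step in range(20):
    true_pos += 1.0                         # the actual movement
    est_pos += 1.0 + rng.normal(0.0, 0.1)   # pure integration drifts a little
    if true_pos == landmark_at:             # recognizing the landmark...
        est_pos = landmark_at               # ...snaps the estimate back

print(abs(est_pos - true_pos))  # only the drift since the last anchor remains
```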
You can think of an object as a set of features, with a displacement between each pair of features. Depending on how exactly the location layer works, it might be able to recognize partial objects. It might be able to learn shapes and apply those to other objects. That is getting towards hierarchical object composition.
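Here is what I mean by displacement, as a toy sketch: each feature sits at a location in the object’s own reference frame, and a displacement is just the vector from one feature’s location to another’s (the names and coordinates are made up).

```python
import numpy as np

# An object as features at locations in its own reference frame.
mug = {
    "handle": np.array([4.0, 2.0]),
    "rim":    np.array([0.0, 5.0]),
    "base":   np.array([0.0, 0.0]),
}

# The displacement from feature a to feature b is simply b's location minus a's.
disp = {(a, b): mug[b] - mug[a] for a in mug for b in mug if a != b}
print(disp[("base", "handle")])   # [4. 2.]

# Displacements don't change when the whole object moves, which is what makes
# them useful for identity: shift every feature and the pairwise displacements
# are untouched.
shifted = {k: v + np.array([10.0, -3.0]) for k, v in mug.items()}
assert np.allclose(shifted["handle"] - shifted["base"], disp[("base", "handle")])
```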
We need to be able to recognize the same object at different scales. Another thing is concepts. The neocortex does a bunch of stuff for object recognition, but it’s basically the same circuit for all of intelligence. So the idea is, there might be reference frames for abstract concepts.
Yes, this is the norm.
In the paper below, look at figure 2 for some ideas about level skips.
Look at figure 1 for a starting point on how the brain represents higher-level concepts.
https://www.cell.com/trends/cognitive-sciences/fulltext/S1364-6613(13)00122-8?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS1364661313001228%3Fshowall%3Dtrue
I’m still wrapping my head around the idea of reference frames as I read that chapter of A Thousand Brains; I think I need to finish a first pass of reading before asking more meaningful questions.
Please help me understand the “displacement” here. I feel I’ve encountered the word here and there in neuroscience papers and books, but I’m unclear about what exact operation/process/procedure it refers to.
Before I have a correct understanding of “displacement”, I would think that if an object is represented by a set of features, then it’s meaningless for the object to have an SDR of its own? Only each feature would have an SDR, and the “object” SDR would just be the “union” of those “feature” SDRs?
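To spell out my doubt in code: a plain union discards where each feature is, so something location-dependent would have to be folded in before any union happens. The `bind` trick below is invented purely for illustration; I’m not claiming it is Numenta’s mechanism.

```python
import numpy as np

rng = np.random.default_rng(3)
N, W = 2048, 40   # illustrative SDR width and active-bit count

def sdr():
    s = np.zeros(N, dtype=bool)
    s[rng.choice(N, size=W, replace=False)] = True
    return s

round_f, flat_f = sdr(), sdr()        # two feature SDRs
loc_top, loc_bottom = sdr(), sdr()    # two location SDRs

def bind(feature, location):
    """Shift the feature's bits by a location-dependent offset, so that
    'round at the top' gets a different code from 'round at the bottom'."""
    offset = int(location.argmax())   # any deterministic function of location
    return np.roll(feature, offset)

# Same two features, opposite arrangements. A plain union (round_f | flat_f)
# would be identical for both objects; the bound unions are not.
obj_a = bind(round_f, loc_top) | bind(flat_f, loc_bottom)
obj_b = bind(round_f, loc_bottom) | bind(flat_f, loc_top)
print(np.array_equal(obj_a, obj_b))   # False -- the arrangement now matters
```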