This will take the place of our usual HTM Hackers’ Hangout. We’ll have @jhawkins and @subutai join us for some Q&A.
Watch here if you miss the live event:
I will add a link to join the Hangout right here on this thread, immediately before the event starts.
Add questions below
If you can’t join the hangout, you have a chance to ask Jeff and Subutai questions on this thread ahead of time. We are specifically interested in talking about the Thousand Brains Model we’ve been promoting with our latest few papers.
One question though: is there any code out there yet where we can play with the concepts we’ve read in the 1kbt paper?
And after HTM School is done, will there be a subsequent lecture series called HTM Dev School, where we teach people who are interested in building projects with HTM?
EDIT: Scratch that question regarding HTM Dev School. There are examples like “One hot gym” and the “Sine experiment” out there, which have helped a lot of people get into all of that.
Minicolumns seem to run through multiple layers within a column. How might minicolumns be related to grid cells or grid-like behavior?
Oh, that question brings up another question about vision that I was wondering about but couldn’t articulate:
How do grid cells translate into the active or inactive stripes we see in V1 and V2, and what does that say about the communication between the neocortex and the thalamic system?
The question I haven’t found an answer to in the previous papers is about invariance of the pattern representation. It’s clear that this approach can cover quite a broad range of different shapes simply by saving many variations (which is basically what DNNs do), and rotation and scaling invariance can be implemented easily enough in this case. But what about true invariance, which is tolerant to any transformation after one-shot learning?
I just finished reading the Grid Cells paper. (Great stuff, thanks). Maybe this has been asked before elsewhere, but…
If the neocortex consists of modular columns (the big ones, not the mini-columns), and they are presumably functionally interchangeable, how would grid cells then be used in those regions that a priori don’t require them, like for language, music, abstract thought, etc.? Would the location layer be suppressed there? Or could there be a use for location input on a more abstract level? Like, for instance, moving up and down in a tune for pitch or for volume.
It seems strange, but having all this hardware in every column and then not having a use for it seems even stranger. If it is missing or less important in some areas, perhaps the layer would be less developed and the neurons there less dense.
Furthermore, I think the Numenta website should have a merchandise section offering the Numenta cup.
To be fair, even humans can sometimes struggle with this when we see a rarely observed “transformation” of an object, such as looking at a scene from behind, in the shadows, or even upside down. I’d be tickled pink by a system that does even as poorly as I do while being “tolerant enough”. In my mind, that means beating the brittleness of Deep Learning based systems, where random pixel flipping can be enough to mess up an entire classification system.
I am beginning to be suspicious of the invariance and even covariance requirements pursued in Deep Learning. The major problem with invariance is that it’s difficult to have many kinds at the same time. For example, ConvNets have translational invariance but don’t simultaneously have rotational invariance. This can be relaxed by using covariance instead, but there is still the complexity of satisfying many constraints simultaneously. So the more promising approach is what’s found in Transformer networks, which trade off invariance for attention instead.
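To make that concrete, here is a toy numpy/scipy sketch of the point (my own illustration, not from any paper): convolution followed by global max pooling responds identically to a translated pattern but differently to a rotated one.

```python
import numpy as np
from scipy.signal import convolve2d

def conv_maxpool(img, kernel):
    """Toy stand-in for a ConvNet feature: convolve, then globally max-pool."""
    return convolve2d(img, kernel, mode="valid").max()

def place(pattern, top, left, size=12):
    """Put a small pattern into an otherwise blank image."""
    img = np.zeros((size, size))
    img[top:top + pattern.shape[0], left:left + pattern.shape[1]] = pattern
    return img

L_shape = np.array([[1, 0],
                    [1, 0],
                    [1, 1]], dtype=float)   # an asymmetric "L" pattern

kernel = np.array([[1, -1],
                   [1, -1]], dtype=float)

a = conv_maxpool(place(L_shape, 2, 2), kernel)              # original position
b = conv_maxpool(place(L_shape, 6, 7), kernel)              # translated
c = conv_maxpool(place(np.rot90(L_shape), 2, 2), kernel)    # rotated 90 degrees

print(a == b)   # True: translation leaves the pooled response unchanged
print(a == c)   # False: the rotated pattern produces a different response
```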
SDRs in HTM model the synchronized activity of a few cells in a layer of a cortical column, where the activity could be a single output spike or a brief burst†. It seems that spike transmission time in long-range connections must result in some desynchronization?
I’m interested in whether this is seen as an issue in TBT, where long-range cross-region connections seem to have a greater significance than in the traditional hierarchical model. Could temporal coding (spike bursts, produced by a single neuron or a cell assembly, in which the pattern of inter-spike intervals is significant) play a part in TBT models?
Also, the frameworks paper proposes that L5 displacement cells multiplex representations for movement vectors and object composition: I would welcome more explanation/discussion of this.
† (is this a good enough summary of the neuroscience of SDRs?)
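For what it’s worth, this is roughly how I picture an SDR in code: a large binary vector with only a few active cells, compared by overlap. A toy mental model with made-up sizes, not Numenta’s actual implementation.

```python
import numpy as np

N_CELLS = 2048      # cells in a layer (made-up size)
N_ACTIVE = 40       # ~2% sparsity, typical of HTM examples

rng = np.random.default_rng(0)

def random_sdr():
    """An SDR as a binary vector with a small set of active cells."""
    sdr = np.zeros(N_CELLS, dtype=bool)
    sdr[rng.choice(N_CELLS, N_ACTIVE, replace=False)] = True
    return sdr

def overlap(a, b):
    """Number of cells active in both SDRs: the basic similarity measure."""
    return int(np.count_nonzero(a & b))

a = random_sdr()
b = random_sdr()
print(overlap(a, a))  # 40: identical representations
print(overlap(a, b))  # usually 0-3: distinct representations barely collide
```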
Hi, I also have 5 questions. I’ll try to be there live as well. @rhyolight thanks for organizing this session!
Learning new objects / environments
What triggers the re-anchoring of grid cells? What precisely makes an object/environment “new”? Sustained bursting of minicolumns somewhere, and where?
Can the re-anchoring fail, thus signaling a new object?
In the 2017 columns paper, the model was explicitly told that it was seeing new objects (a reset to learn a new object). Could a re-anchoring failure instead inform the object layer (L2/3), and if so, how?
Also, the model description says that “During learning, the location layer doesn’t update in response to sensory input”. Is this separation of learning and inference neurologically plausible?
Displacement cells
Me too. Furthermore:
Displacement cells reference 2 locations (or unions of locations), yet within a single cortical column only one union of locations is represented at a time. How do displacement cells reference the 2nd location? Where do they find it: do they have long-range input and get it from another column?
In the 2017 columns paper, L5 is proposed to have long-range connections. Within the displacement cells model, what is their purpose? Or are they not considered anymore?
Composite objects in the hierarchy
In these models, objects are formed by (feature x location) pairs, where the features are implied to be low-level sensory features.
In the case of a composite object that is made from already-learned pieces, can the exact same mechanism represent the object in a cortical column further up the hierarchy? Where (rough sketch after this list):
- the “feature” inputs are the activations of the object layers (L2/3) of the lower columns, and
- the movement vectors to update the “location” inputs are derived from the lower columns’ displacement cells.
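Here is a rough sketch of what I’m imagining; all the names and numbers are mine, purely hypothetical, not from the papers:

```python
# Purely hypothetical sketch of the question above, not Numenta code.
# A "column" learns an object as a set of (feature, location) pairs and
# recognizes it by counting how many observed pairs match.

from typing import Dict, FrozenSet, Tuple

Location = Tuple[int, int]      # toy 2D location standing in for a grid-cell code
Pair = Tuple[str, Location]     # (feature, location)

class ToyColumn:
    def __init__(self) -> None:
        self.objects: Dict[str, FrozenSet[Pair]] = {}

    def learn(self, name: str, pairs) -> None:
        self.objects[name] = frozenset(pairs)

    def infer(self, observed) -> str:
        observed = set(observed)
        # Pick the learned object sharing the most (feature, location) pairs.
        return max(self.objects, key=lambda n: len(self.objects[n] & observed))

# A low-level column learns "logo" and "cup body" from sensory features...
low = ToyColumn()
low.learn("logo", [("edge", (0, 0)), ("curve", (1, 0))])
low.learn("cup_body", [("rim", (0, 0)), ("handle", (2, 1))])

# ...and a higher-level column reuses the same mechanism, but its "features"
# are the lower column's recognized objects, placed by displacement-derived offsets.
high = ToyColumn()
high.learn("numenta_cup", [("cup_body", (0, 0)), ("logo", (1, 2))])
print(high.infer([("cup_body", (0, 0)), ("logo", (1, 2))]))  # numenta_cup
```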
I’m in the middle of the Framework paper, and I’m afraid I won’t be able to finish it in time. (I’m a slow reader.) However, I have trouble understanding the displacement cells.
I understand how the displacement module represents a link between two objects or between two features. But how is this a displacement? Does that represent a vector (orientation + distance)? And if so, how big of a vector?
Imagine I know the Numenta logo, but I have never seen it on a cup. And I know a typical coffee cup, but I have never seen one with the Numenta logo on it. How would the displacement be determined to make this link? How would it come to be, say, one down and two to the right? Would that displacement be related to one particular module of grid cells? Would it be defined the first time I anchor my grid cells to one of the objects?
Don’t get me wrong, the idea of having a logical link between features is mind-boggling. But I feel I’m missing something essential to understand the full picture.
Edit: I think I just figured it out. Every feature of the cup has a displacement vector with every other feature of the cup, including every part of the logo on it. The more features my brain learns to identify, the more displacement modules get defined.
This is profound… How on earth do you guys come up with this?
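To check my own understanding, here is a little sketch of that idea with made-up 2D coordinates (my own toy illustration; real grid-cell codes would differ): every ordered pair of features on the cup, including the logo’s features, gets its own displacement vector.

```python
# Toy illustration of my "every feature pair gets a displacement" reading.
# Locations are made-up 2D coordinates, not a real grid-cell representation.
from itertools import permutations

features = {
    "cup_rim":    (0, 0),
    "cup_handle": (3, 1),
    "logo_top":   (1, 2),
    "logo_base":  (1, 3),
}

# Displacement between every ordered pair of features: "one down, two right" etc.
displacements = {
    (a, b): (xb - xa, yb - ya)
    for (a, (xa, ya)), (b, (xb, yb)) in permutations(features.items(), 2)
}

print(displacements[("cup_rim", "logo_top")])   # (1, 2)
print(len(displacements))                       # n*(n-1) pairs grow quickly
```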
A quick clarification on the point I was making about continuous learning. In a system like this with voting between cortical columns, one way to decide whether the thing being encountered is a new object or just a new feature of the current object is by looking at the level of certainty across all of the cortical columns. If the majority of them are encountering anomalies, then it is more likely to be a new object. If only some of them are encountering anomalies, then it is less likely to be a new object.
For example, imagine you are looking at a hologram of a coffee cup. You reach out to grab it, and suddenly the cortical columns processing input from your fingers begin indicating anomalies. Even though these cortical columns are flagging virtually every input as anomalous, when taken in conjunction with the visual input, you know that you haven’t switched to a new object. Activity representing the room you are in, orientation, and position relative to the room are also voting that you are still encountering the same object, and haven’t switched to a new one.
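Here’s a minimal sketch of that voting rule, just as my own illustration (not how any HTM code actually implements it): count the columns reporting anomalies and only call it a new object when most of them are surprised.

```python
# Toy sketch of the majority-anomaly voting rule described above.
# "Anomaly" per column is just a boolean here; real columns would report
# a score derived from how much of their input was unpredicted.

def is_new_object(column_anomalies, threshold=0.5):
    """Vote across columns: new object only if most columns are surprised."""
    surprised = sum(column_anomalies)
    return surprised / len(column_anomalies) > threshold

# Hologram example: the few touch columns are all anomalous, but the many
# visual columns still recognize the cup, so the vote says "same object".
touch = [True] * 5
vision = [False] * 20
print(is_new_object(touch + vision))   # False: still the same (visual) object
print(is_new_object([True] * 22))      # True: nearly every column is surprised
```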
This was a great hangout, folks. Thanks to everyone for participating. We had over 50 people watching live at one point, which is my best live stream by far.