Right way to get output from an HTM system

Hello,

My question today is motivated by a question yesterday asked by @iyerlr here: https://discourse.numenta.org/t/my-own-implementation-of-temporal-memory

So, say I have an HTM system that I want to set up as a classifier, to do everyone’s favorite artificial-neural-network teaching example of classifying hand-written MNIST digits… Now, I understand utilizing HTM in this way is an abuse of HTM and shoving it into a feed-forward-NN-shaped box is highly suboptimal. But I’m asking the question this way to elucidate something I don’t understand. Sorry about that.

So with that preemptive apology out of the way, imagine the system has one input-space bit for each pixel, and it’s just running a plain vanilla spatial pooler as described in HTM School Ep. 7. Eventually, it would succeed in modeling the structure of the variations in the input digits. However, that isn’t the same thing as associating the digits with the meanings we want it to associate.

So, how do I get the meaningful output from the HTM system?

In some of the HTM School videos, @rhyolight compares the similarity between SDRs representing the column states at various times. So I could imagine creating a composite SDR for each digit (maybe a union, or perhaps an intersection, of the column activations produced by many versions of the same digit), and then the output answer would be whichever digit’s composite SDR has the most overlap with the current column state after observing a given input.
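
To make that concrete, here is roughly what I am imagining, as a toy sketch (plain Python sets standing in for SDRs; none of these names are real HTM APIs):

    # Build a union-based composite SDR (a set of active-column indices) per digit,
    # then answer with the digit whose composite overlaps the current state most.
    from collections import defaultdict

    def build_prototypes(labeled_activations):
        """labeled_activations: iterable of (digit, set_of_active_columns)."""
        prototypes = defaultdict(set)
        for digit, active_columns in labeled_activations:
            prototypes[digit] |= active_columns      # union over many examples
        return prototypes

    def classify(active_columns, prototypes):
        """Return the digit whose composite SDR has the most overlap."""
        return max(prototypes, key=lambda d: len(prototypes[d] & active_columns))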

But this approach seems fragile - if the HTM system continued to learn, it may reorganize which columns responded to which features in the input data, and the fixed SDRs that represent the “essential” activation for each digit may drift.

So another approach might be to augment the input with some additional bits encoding the correct value of the digit, and over time the HTM system would likely learn the associations between the features in the drawn bitmap and the digit’s value.

But then, how can you get the HTM system to not rely on that additional input data? In other words, what happens when you don’t have the “training value” as part of the input?

It seems like you’d want to space the input from the training answer temporally so the system can observe the input and predict the meaning, which will arrive one time-step later. This way we can be sure the answer doesn’t contaminate the prediction.
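
The loop I am imagining looks roughly like this (the “temporal memory” below is a toy one-step stand-in I wrote just to show the shape of the scheme, not a real TM, and the encoders are placeholders):

    # Toy first-order sequence memory: learns which pattern follows which.
    prev, transitions = None, {}

    def tm_step(pattern, learn=True):
        """Present a pattern, learn the transition from the previous one,
        and return what is predicted to come next."""
        global prev
        if learn and prev is not None:
            transitions[prev] = pattern
        prev = pattern
        return transitions.get(pattern)          # prediction for t+1

    encode_image = lambda img: ("image", img)    # placeholder encoders
    encode_label = lambda lab: ("label", lab)

    for image, label in [("|", 1), ("O", 0), ("|", 1)]:   # toy "digits"
        tm_step(encode_image(image))   # t:   present the drawn digit
        tm_step(encode_label(label))   # t+1: present the answer

    # Deployment: no answer follows, so read the prediction instead.
    print(tm_step(encode_image("|")))  # -> ('label', 1)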

But if the predicted answer doesn’t arrive because the system is now deployed (assuming continuous learning is still enabled, because we want it to keep learning other things), how do we prevent the fact that the correct answer never shows up from degrading the system’s ability to recognize digits?

My 13-month-old needs to be told every time she correctly identifies that something is red, but I, as an adult, can be confident in my ability to recognize primary colors.

It seems like HTM must have some kind of solution for this general class of problem, even if it’s not theoretically pure - basically a simple placeholder for the old-brain structures that allow a more complex animal to function.

Thank you in advance for any insight.

1 Like

Classification seems to be a pretty complex process for HTM.
I’m wondering the same thing at the moment (when I read the papers below in the past, I didn’t pay much attention to the classification part).

So here is my take on it: you first need to explore the object/label to get a representation to compare against. Then the task is RECOGNITION rather than CLASSIFICATION.
i.e. unsupervised learning on all the objects, then trying to recognize old objects and/or learn new ones.

It is a constant learn-predict cycle … it is hard to decouple in a batch manner.

Papers

Locations in the Neocortex: A Theory of Sensorimotor Object Recognition Using Cortical Grid Cells

Grid cell path integration for movement based visual object recognition

1 Like

The image that comes to mind is driving a nail with a Rolex watch.
I am sure that you can do it, but is it really the best tool?

There are many ways to get output from an HTM, but the simplest is to use a statistical classifier.

Both NuPIC and htm.core contain a “classifier”.
The classifier takes SDRs as input and outputs a label.
Inside, the classifier is a simple 1-layer NN that learns with backpropagation. There is a weight from every bit of the input SDR to every category of label.

See htm.core’s MNIST example: htm.core/mnist.py at master · htm-community/htm.core · GitHub
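
A rough sketch of that flow, written from memory of the linked example (so double-check the exact signatures against the repo):

    import numpy as np
    from htm.bindings.sdr import SDR
    from htm.bindings.algorithms import Classifier

    clsr = Classifier()

    # Training: pair each column-activity SDR with its known digit label.
    columns = SDR(2048)
    columns.randomize(0.02)        # placeholder for real spatial pooler output
    clsr.learn(columns, 7)         # "these active columns mean the digit 7"

    # Inference: the classifier returns a probability for every seen category.
    pdf = clsr.infer(columns)
    print("predicted digit:", np.argmax(pdf))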

1 Like

Thank you for the replies and insights.

After I posted I also found this presentation where Subutai describes the CLA classifier: https://www.youtube.com/watch?v=QZBtaP_gcn0&t=857s

An adjunct classifier module definitely accomplishes the goal I outlined in the original post. But it wasn’t the answer I was driving at. It feels a bit like “inserting probes into a neocortex” as opposed to creating a simplified whole brain.

Here is a different formulation of the question: What is the architecture of the simplest HTM-based system capable of taking an action?

That action could be as simple as setting a one-bit output based on what it recognizes, but, as far as I understand, the neocortex itself isn’t capable of movement (i.e., output) on its own. 1000 Brains likens the neocortex to a map, and therefore there are some missing old-brain components that interpret and act based on that map.

So is there an information-flow schematic for a hyper-simplified system that uses HTM in a complete biologically-inspired sense-process-act loop?

Also, Thanks @mraptor for the paper suggestions! I can definitely see how displacements, path integrations, and reference frames become critical for any reasonably complex set of actions.

However, if we stub out the system’s ability to move down to just a tiny handful of actions, like “move left” and “move right” as Jeff suggested in the book, it seems like I can postpone the implementation of grid cells and reference frames for a little while.

I’m currently grokking “A Framework for Intelligence and Cortical Function Based on Grid Cells in the Neocortex” and will begin on the papers you recommended shortly.

Thanks again!

1 Like

Yeah, I had a post on this, and my current thinking is to not implement the Numenta location layer, but instead hardcode a Cartesian location layer that receives a Cmd/Action and emits an SDR-based location (and with no Sense --> Location feedback).

This saves a lot of time…

Then I have to invent a mechanism to combine Sense + Location ==> TPooling, and use this SDR as a label later for classification.
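
Something roughly like this is what I mean by hardcoding it (toy code of my own, not an htm.core module; the grid size, SDR size, etc. are arbitrary):

    import numpy as np

    class CartesianLocationLayer:
        """Hardcoded location layer: an action moves a point on a grid, and each
        grid cell has a fixed random sparse code that serves as the location SDR."""
        def __init__(self, grid=(10, 10), sdr_size=1024, active_bits=20, seed=42):
            rng = np.random.default_rng(seed)
            self.codes = {
                (x, y): frozenset(rng.choice(sdr_size, active_bits, replace=False))
                for x in range(grid[0]) for y in range(grid[1])
            }
            self.grid = grid
            self.pos = (0, 0)

        def step(self, action):
            """action is a delta like (1, 0) for 'move right'; returns the location SDR."""
            self.pos = ((self.pos[0] + action[0]) % self.grid[0],
                        (self.pos[1] + action[1]) % self.grid[1])
            return self.codes[self.pos]

    loc = CartesianLocationLayer()
    sensor_bits = frozenset(range(0, 64, 3))              # stand-in sensory SDR
    sense_plus_location = sensor_bits | loc.step((1, 0))  # crude Sense + Location union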

1 Like

TBT does not have RL yet.

My current idea is:
A TM-like structure.
Dendrite input: State
FF : ??
Prediction: use the Action SDR (the predicted SDR == actual Action SDR bits) and a REWARD to figure out which dendrites to boost from t-1.

The update rule has to keep the last step’s active neurons and dendrites (in a buffer), so that you can do a Q-value calculation like you do in RL.

Sort of like ensemble RL that predicts the active bits of an SDR, which should match the Action SDR.

Then pass this Action to the CC. Sense is State.
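
Rough sketch of the update I have in mind (my own toy formulation of the buffered, Q-value-style credit assignment; the names and constants are made up):

    ALPHA, GAMMA = 0.1, 0.9
    q = {}                    # (cell, dendrite) -> value, standing in for permanence
    prev_active = set()       # buffer: (cell, dendrite) pairs active at t-1

    def step(active_now, reward):
        """active_now: set of (cell, dendrite) pairs active this timestep."""
        global prev_active
        best_next = max((q.get(k, 0.0) for k in active_now), default=0.0)
        for k in prev_active:                           # credit what led here
            old = q.get(k, 0.0)
            q[k] = old + ALPHA * (reward + GAMMA * best_next - old)  # TD(0) update
        prev_active = active_now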

So what is left is how you represent and apply a GOAL.

1 Like

A relevant thread from 2017:

1 Like

Thanks for the responses.

I’ve been pondering reinforcement learning in the context of HTM and I’m not convinced RL is inextricably linked to motor control. Don’t get me wrong, RL is definitely a necessary feature of a complete and competent intelligent system with agency. But my conjecture is that reinforcement learning and motor control are “peers” in the information-flow diagram. i.e. one doesn’t require the other. (How RL makes any sense at all without action probably needs its own post because I’ll need to be very clear about terminology. But the essence is what Jeff wrote in 1000 Brains: “Thinking is a form of movement”)

But back to motor control, Some relevant things I read / heard:

“invariant representation in motor cortex is, in some ways, the mirror image of invariant representation in sensory cortex”
-Jeff Hawkins, On Intelligence, Pg 54

“We don’t move to a position, we move to where we feel a sensation.”
-Joshua Brown

“layer 5 activity represents both the state and the action…”
“…the state maps itself to the action that results in itself.”
@sunguralikaan’s Master’s Thesis, pp. 40-41

So back to my previous question about a minimal HTM system with output - I think I see part of the answer now:

  • There needs to be a looping connection, connecting some columns’ activity state back around and effectively muxing it together with some components of the input vector (the components associated with the output / motor commands); see the rough wiring sketch after this list.

  • I think Temporal Memory / Pooling is needed, but I’m not sure. Something may be possible with just spatial pooling, but I just think it makes more sense to include temporal memory.

  • Even without reinforcement learning, the HTM system would still learn associations between input patterns and the states of output neurons.
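
Here is a rough wiring sketch of that loop (purely illustrative; the sizes and the choice of which columns count as “motor” are made up):

    import numpy as np

    N_SENSOR, N_MOTOR, N_COLUMNS = 784, 16, 2048
    motor_columns = np.arange(N_COLUMNS - N_MOTOR, N_COLUMNS)  # columns read as output

    def build_input(sensor_bits, prev_column_activity):
        """Concatenate sensory bits with motor bits fed back from the previous step."""
        motor_bits = prev_column_activity[motor_columns]        # the looping connection
        return np.concatenate([sensor_bits, motor_bits])

    # each timestep, conceptually:
    #   x = build_input(current_sensor_sdr, last_active_columns)
    #   last_active_columns = sp_and_tm_step(x)   # hypothetical HTM step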

But now to part B of my question in the original post: How do you get the HTM system not to rely on (or learn not to bother setting) the output, once you take away the training support? I think, for that, we would need some additional mechanism to be present.

In REM sleep, my brain blocks (most) motor impulses. But if my dreaming self were aware of that fact, the subject of every dream would be: “Oh no! I’m paralyzed!!!” So somehow we have some brain structures capable of modeling - if not our entire body kinematics - at least integrating our motor impulses and adjusting the perception of our body position / muscle extension. Of course this could simply be the spatio-temporal HTM circuit itself, which learns the way my body moves when I’m awake and simulates it when I’m asleep. That means my brain would need to tell that specific circuit not to learn (that I’d been paralyzed) while I am asleep.

So the missing mechanism is the ability to selectively disable learning on a fine-grained, region-by-region (if not column-by-column) basis.

I’m not totally sure about that last part… Does this resonate with you guys or have I gone into the weeds?

Thanks!

update to my diagram:

                     Action         Reward
                      ^                V       
      -- State --> [  TM  -  State:Action  ]
                               ^
                       Union of Actions 

Prediction happens via State + lateral TM connections.
TM/S:A acts like Temporal Memory: a predictor/selector/filter for the incoming union of actions.
The neuron permanence is substituted with a Q-value, and the update uses a TD algorithm.

The difference from standard RL is that we have context-based S:A, i.e. multiple Sx:A pairs with the same State X … which differ by how you got to this Sx state.
I don’t know the implications of that, except that it will learn slower but will be more context-specific, i.e. multiple policies at the same time.
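
And a rough sketch of the selector/filter role (again my own toy formulation): score each candidate action in the incoming union by how strongly its bits are predicted, weighted by the learned Q-values, and pick the best.

    def select_action(candidate_actions, predicted_bits, q_values):
        """candidate_actions: dict name -> set of SDR bits; predicted_bits: bits
        predicted by the State:Action memory; q_values: bit -> learned value."""
        def score(bits):
            return sum(q_values.get(b, 0.0) for b in (bits & predicted_bits))
        return max(candidate_actions, key=lambda name: score(candidate_actions[name]))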

2 Likes

The cortex probably has the full sensory-motor simulation outside of the old brain … so a simple switch-off will be enough.

I don’t follow. So I agree that there is part of the cortex modeling the body and capable of integrating motor impulses into an updated sensory state.

Where I get lost is - wouldn’t that part of the cortex predict changes to the actual sensory inputs in response to the observed motor impulses? And wouldn’t it then start freaking out (bursting) when those predictions weren’t met? i.e. when the actual sensory input didn’t change in response to the movement impulses?

“But…” begins an imaginary voice in my head, “Couldn’t that cortex learn that the body behaves differently when I am dreaming? Learn two modes when predicting the body’s movement, based on the is_sleeping context?”

If that were true, however, wouldn’t the is_sleeping context bit also cause that cortical system to learn to predict no change in sensory input in response to movement? So now everything makes sense to that part of the cortex. But what then for another higher-level part of the cortex?

We’ve just kicked the problem up another level. Now another piece of cortex encounters a paralyzed body that doesn’t move in line with predictions. Following this to its logical conclusion, dreaming (and perhaps also constructing hypotheticals and other imaginative processes) doesn’t work.

Turning off learning on a fine-grained level feels like it addresses this, but I really want to understand your thinking because there may be a simpler / better way.

Thanks!

Dreaming is the process where the EC/HC contents are pushed back onto the cortex to solidify long-term memory.

Indeed. But it seems to me that the mechanism by which that happens involves stimulating different cortical regions in a manner very similar to the way they’d be stimulated when awake. So how do some cortical regions learn what they are meant to learn, while others don’t learn that the body no longer responds to movement commands?

See spindle waves. It is not the same as awake and attending.

I’m aware of the existence of sleep spindles / sigma waves generally, but I don’t understand how differences in brain-wide activity patterns necessarily imply what’s going on within a small region of cortex.

Can you elaborate a bit more on which research you are referring to and how it fits into HTM / TB theory? Apologies if this is a big ask - I’m just trying to develop an internally consistent understanding of TBT that explains the phenomena that I observe my own brain doing, and this (motor control, but more generally the interfaces between macrocolumns) is an area I can’t quite reconcile.

Forget dreaming, because that’s taken the conversation into the weeds. Sensory gating generally is an important function for brains, both asleep and awake. So how does sensory gating fit into TBT? That’s my (sub)question in a nutshell. It seems like sensory gating would require a concomitant “learning gate”.

1 Like

You got there before I did.

Sorry - I was just responding to post #9 where you brought up REM sleep.

This is where the push-back from the EC/HC is happening.

The connections to the EC/HC are bidirectional and the learned events of the day are being played back to the cortex to consolidate the learning from recent experience.

I don’t pretend to explain how that fits with the sensor/motor theory, just that many experiments seem to indicate that this is what is happening.

To the best of my knowledge, all mammals have a requirement to sleep, and this seems to be necessary to form long-term memories.

While you may say that this is “taking the discussion into the weeds,” any successful theory will have to include and explain this consolidation of long-term memories.

1 Like

Tangentially related to needing some equivalent of sleep is the potential need to define and assign a reward after the fact… As I’ve been getting deeper into spiking neural networks and reading a lot of the work of Eugene M. Izhikevich (who seems to be an expert in neurodynamic systems), I found a paper of his that puts forward a decent attempt to show how the brain might assign rewards to different actions that took place in the recent past (the last few seconds). Anything that makes it into a dream might first need to clear this reward threshold as well (thus my suggestion that this is tangentially related to the topic of dreams and getting output from HTM).
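
The mechanism, as I understand it, is an eligibility trace per synapse that a late-arriving reward converts into an actual weight change. A rough sketch (the constants and array sizes here are illustrative, not taken from the paper):

    import numpy as np

    TAU_C, DT = 1.0, 0.001       # traces fade over ~a second; 1 ms time steps
    trace = np.zeros(1000)       # one eligibility trace per synapse
    weights = np.zeros(1000)

    def tick(stdp_events, reward):
        """stdp_events: per-synapse STDP amount this step; reward: scalar (often 0).
        Coincident spiking tags synapses; a reward seconds later strengthens
        whatever is still tagged."""
        global trace, weights
        trace += stdp_events                 # tag recently involved synapses
        weights += reward * trace * DT       # reward converts tags into weight change
        trace *= np.exp(-DT / TAU_C)         # tags decay over a few seconds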