Probabilistic Predictions and Reinforcement Learning Concept in HTM Theory

Usama_Furqan · January 11, 2019, 2:18pm

Dear All,

I have studied the literature on HTM, but I may not have understood it deeply, which is why I have two questions.

1- An HTM network can be used to predict upcoming events or instances in a sequence, and that is how anomalies can be detected. There may be multiple predictions at a given time step, but is it possible for HTM to report each prediction with a specific probability value?
In short, does HTM return probabilistic or non-probabilistic predictions?

2- HTM learns sequences with its dendrite segments by making synapses stronger or weaker, that is, by incrementing or decrementing the permanence values of synapses. Having understood this learning mechanism, it seems to me that it resembles reinforcement learning, in which reward and punishment occur.
So please confirm whether the concept of learning in HTM is inspired by reinforcement learning, or whether I am wrong and mapping the wrong concept onto HTM learning.

Looking forward to your kind response.
Regards,
@Usama_Furqan

rhyolight · January 11, 2019, 3:12pm

Yes, it gives probabilistic predictions.

No, HTM is not inspired by RL. They are very different things.

Remember almost all these concepts involve neural networks and Hebbian learning, but that does not make them the same.

Usama_Furqan · January 11, 2019, 3:47pm

Thanks for your explanation @rhyolight

Is it possible to get all predictions of HTM with specific probabilities?

Can you please point me to any literature so that I can understand how the HTM learning approach differs from neural networks, RL and Hebbian learning?

I understand that HTM is built on neuroscience (neocortex, CLA) working principles. But I am going to present HTM next week at my university, and I want to be prepared to answer questions from professors who belong to different domains of computer science.

rhyolight · January 11, 2019, 4:17pm

Yes, instructions are in the quick start. Extracted predictions contain both classified prediction values and probabilities.
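
As a rough illustration of what "values and probabilities" can look like once extracted, the sketch below uses a plain dictionary mapping each candidate value to its likelihood. The numbers and the helper name `rank_predictions` are made up for illustration. In NuPIC's OPF the comparable data is, as far as I know, found under `inferences["multiStepPredictions"]`, but check the quick start for the exact structure.

```python
def rank_predictions(likelihoods):
    """Return (value, probability) pairs sorted from most to least likely."""
    return sorted(likelihoods.items(), key=lambda item: item[1], reverse=True)


# Hypothetical one-step-ahead distribution: predicted value -> probability.
example = {21.5: 0.62, 34.0: 0.30, 5.2: 0.08}

ranked = rank_predictions(example)
best_value, best_probability = ranked[0]
print("most likely value:", best_value, "with probability", best_probability)
for value, probability in ranked[1:]:
    print("alternative:", value, probability)
```

The point is only that several predictions can coexist, each with its own probability, and the "best" prediction is simply the most likely entry.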

Let me know if you have any other questions!

sheiser1 · January 11, 2019, 6:22pm

The TM’s predictions are composed of the set of cells which become predictive. There can be any number of cells in the predictive state, so the system can make none, one or many predictions at each time step.

Each predictive cell is made predictive because it has active dendrite segment(s).
Each active dendrite segment is made active because it has enough active synapses (formed connections to active cells from the prior time step).
Each active synapse is made active because the ‘presynaptic’ cell it connects to is currently active, and the synapse’s permanence value (scalar from 0-1.0) is above a threshold (TM parameter connectedPermanence).
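
The chain above (synapse, then segment, then cell) can be sketched in a few lines of Python. This is a toy model, not NuPIC code: the `Synapse` and `Segment` classes, the function name, and the `activation_threshold` value are assumptions made for illustration, and the sketch treats a synapse as connected when its permanence is at or above `connected_permanence`.

```python
from dataclasses import dataclass, field


@dataclass
class Synapse:
    presynaptic_cell: int  # index of the cell this synapse listens to
    permanence: float      # scalar in [0.0, 1.0]


@dataclass
class Segment:
    cell: int                                    # cell that owns this dendrite segment
    synapses: list = field(default_factory=list)


def predictive_cells(segments, active_cells,
                     connected_permanence=0.5, activation_threshold=2):
    """Return the set of cells that have at least one active segment."""
    predictive = set()
    for segment in segments:
        # A synapse is active only if it is connected AND its
        # presynaptic cell is currently active.
        active_synapses = sum(
            1 for s in segment.synapses
            if s.permanence >= connected_permanence
            and s.presynaptic_cell in active_cells
        )
        # A segment is active when enough of its synapses are active.
        if active_synapses >= activation_threshold:
            predictive.add(segment.cell)
    return predictive


segments = [
    Segment(cell=10, synapses=[Synapse(1, 0.8), Synapse(2, 0.6), Synapse(3, 0.2)]),
    Segment(cell=11, synapses=[Synapse(1, 0.9), Synapse(4, 0.7)]),
    Segment(cell=12, synapses=[Synapse(1, 0.3), Synapse(2, 0.4)]),
]
result = predictive_cells(segments, active_cells={1, 2, 3})
print(result)
```

Here only cell 10 becomes predictive: cell 11 has just one active synapse, and cell 12's synapses are below the connection threshold.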

So each cell’s predictive state is ultimately binary, in that it is either predictive or not. It is not probabilistic, though the permanence values making them predictive can be checked to see how close each sequence is to being forgotten. For instance if the permanence values from the currently predictive cells to the prior active cells are ~1.0 then those memories would take longer to forget than if the permanence values were ~0.3.

Yep, each cell’s segment’s synapse’s permanence is learned locally. This gives each TM model very high potential for variance, and thus very high capacity for uniqueness.
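
To make "learned locally" concrete, here is a simplified, Hebbian-style sketch of how one segment's permanences might be adjusted after it correctly predicted its cell. The function name and the increment/decrement values are assumptions for illustration (loosely modelled on the TM's permanence increment and decrement parameters), and the real algorithm has more cases than this. Note that there is no reward signal anywhere: the update depends only on which presynaptic cells were active.

```python
def reinforce_segment(permanences, prev_active_cells,
                      increment=0.10, decrement=0.05):
    """Return updated permanences for one segment.

    permanences: dict mapping presynaptic cell index -> permanence.
    Synapses to previously active cells are strengthened, the rest are
    weakened, and every value is clipped to the range [0.0, 1.0].
    """
    updated = {}
    for cell, permanence in permanences.items():
        if cell in prev_active_cells:
            permanence += increment
        else:
            permanence -= decrement
        updated[cell] = min(1.0, max(0.0, permanence))
    return updated


segment = {1: 0.45, 2: 0.95, 3: 0.02}
updated = reinforce_segment(segment, prev_active_cells={1, 2})
print(updated)
```

In this example the synapse to cell 1 crosses a 0.5 connection threshold, the synapse to cell 2 saturates at 1.0, and the unused synapse to cell 3 decays to 0.0.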

cogmission · January 11, 2019, 8:16pm

…and in no small part there is the plasticity itself! It creates capacity to distinguish values by adding dendritic connections to parts of the knowledge space (and culling those which no longer express meaning along a certain vector).

Bitking · January 11, 2019, 8:34pm

There is research that indicates that the brain prunes heavily every single night. If I can find it tonight I will edit into this post.

Tresor · June 13, 2019, 8:22am

Does it mean that it has the same function as the Q function in RL?

rhyolight · June 13, 2019, 2:38pm

I wonder if @lucasosouza might say something about this? HTM does not keep a queue at all, it performs high-order memory without this non-biological trick.

lucasosouza · June 16, 2019, 4:36pm

I wouldn’t say so. A Q function is a function that maps state and action to a scalar value. The scalar value represents the expected reward at the next timestep plus the expected reward at all future timesteps discounted by a discount factor.
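
For a single observed trajectory, that discounted sum can be computed as in the sketch below; a Q value is the expectation of this quantity over all trajectories that start from a given state and action. The function name and the sample rewards are made up for illustration.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards where the k-th future reward is weighted by gamma**k."""
    total = 0.0
    # Work backwards so each step is: reward now + gamma * (return from next step).
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Rewards received at the next three timesteps of one hypothetical trajectory.
rewards = [1.0, 0.0, 2.0]
value = discounted_return(rewards, gamma=0.5)
print(value)  # 1.0 + 0.5 * 0.0 + 0.25 * 2.0 = 1.5
```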

Like Matt said, HTM is not at all inspired by RL. But if we want to make this comparison, the TM prediction would be more related to a transition function, which outputs the next state given the current state and current action. The transition function in RL is one of the two functions that define the environment (the other is the reward function), and these are not usually learned by the agent. The most common strategy in RL is to search directly in the space of policies, or to search in the space of value functions (or Q functions) and use the value function to derive an optimal policy.

There is a group of RL algorithms, commonly referred to as model-based, that also seek to model the transition function to accelerate the learning process (if you model the environment, you can generate possible transitions and learn offline, as in planning). Model-based algorithms range all the way from Dyna-Q, in 1991, to recent SOTA algorithms like Google’s SimPLe.
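
To show where a learned transition model fits in, here is a minimal tabular sketch of the Dyna-Q idea on a made-up five-state chain. The environment, function names and hyperparameters are all illustrative assumptions, not taken from any particular library. The agent learns Q values from real steps, records each observed transition in `model`, and then replays transitions from that model as extra "planning" updates.

```python
import random

N_STATES, GOAL = 5, 4
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy deterministic chain: the environment's true transition and reward."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def dyna_q(episodes=30, planning_steps=10, alpha=0.5, gamma=0.9,
           epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    model = {}  # learned model: (state, action) -> (next_state, reward)

    def update(s, a, r, s2):
        # One-step Q-learning update; the goal state is terminal.
        best_next = 0.0 if s2 == GOAL else max(q[(s2, b)] for b in ACTIONS)
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])

    for _ in range(episodes):
        state = 0
        while state != GOAL:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                # Greedy action, breaking ties at random.
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice([a for a in ACTIONS if q[(state, a)] == best])
            next_state, reward = step(state, action)
            update(state, action, reward, next_state)      # learn from real experience
            model[(state, action)] = (next_state, reward)  # learn the transition model
            for _ in range(planning_steps):                # plan with simulated experience
                s, a = rng.choice(list(model))
                s2, r = model[(s, a)]
                update(s, a, r, s2)
            state = next_state
    return q, model


q, model = dyna_q()
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
print(policy)
```

The `model` dictionary plays the role of the transition (and reward) function discussed above, which is the part most comparable to what the TM predicts; the Q table is a separate thing that the TM has no counterpart for.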
