Has anyone tried creating a Bayesian inference model around an HTM model?
If you were going to do something like that, I guess you’d need to know the size of the input/output spaces, and include every possible input/output as part of the definition of both A and B in P(A|B). Is that a reasonable thing to do with Bayes?
By the way, I am only asking because I have been researching for an article. I have not lost the faith in HTMs.
I don’t have any experience in this area myself, but I’m curious about the goal of the exercise. Is the idea to depict HTM predictions after it has been trained on a particular sequence? For example:
Learned sequence: ABACADABA
Input   Predicted next input
        A     B     C     D
A       0.00  0.50  0.25  0.25
B       1.00  0.00  0.00  0.00
C       1.00  0.00  0.00  0.00
D       1.00  0.00  0.00  0.00
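For what it’s worth, the table above is what you get by counting adjacent pairs in the learned sequence and normalizing each row. Here is a minimal sketch of that counting, assuming a plain first-order model (this is not how HTM computes predictions, and `transition_table` is just a name made up for illustration):

```python
from collections import Counter, defaultdict


def transition_table(sequence):
    """Estimate P(next | current) by counting adjacent pairs in the sequence."""
    counts = defaultdict(Counter)
    for current, nxt in zip(sequence, sequence[1:]):
        counts[current][nxt] += 1

    symbols = sorted(set(sequence))
    table = {}
    for current in symbols:
        total = sum(counts[current].values())
        # Normalize each row; a symbol with no successors gets all zeros.
        table[current] = {
            s: (counts[current][s] / total if total else 0.0) for s in symbols
        }
    return table


table = transition_table("ABACADABA")
for current, row in table.items():
    print(current, " ".join(f"{p:.2f}" for p in row.values()))
```

Each row is a conditional distribution P(next | current), which is the quantity a Bayesian treatment would put a prior over.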
EDIT: I suppose another way to look at it would be to have Activate and Predict, where Activate connects to Predict, which connects back to Activate. Ultimately this is just modeling the sequence (so it could be simplified to each input paired with each possible next input). I’m probably missing the point, though.
Perhaps it would be useful to look at some real-world examples to get a feel for Bayes in action.
Considering current topics of discussion you may want to look hard at the example on page 36 here:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.54.4220&rep=rep1&type=pdf
and
You have a position and heading. As time progresses you guess where you might be and combine that with feedback that hints at where you might be. The relative faith in the two estimates combines to form a new opinion on where you may be.
In a grid system, the centers of grid activation in the various spatial modules are driven by the same general system: an estimation of where you might be combined with hints of where you might be moving to. The size of the movement estimation pushes on the activation bundle (the estimation of position). Note the correction when you get a high-confidence observation, such as a visual fix after pure self-motion due to traveling in a poorly lit area.
If we allow that HTM is part of grid-forming systems then this does apply to the original question.
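To make the position example concrete, here is a minimal one-dimensional sketch of combining a self-motion guess with an observation. It assumes both estimates are Gaussian, so the Bayesian update reduces to the scalar Kalman filter equations. The function names and all the numbers are made up for illustration and are not taken from the linked paper or from any grid cell model.

```python
def predict(mean, var, motion, motion_var):
    """Self-motion step: shift the estimate and grow its uncertainty."""
    return mean + motion, var + motion_var


def update(mean, var, obs, obs_var):
    """Observation step: weight prior and observation by their relative confidence."""
    gain = var / (var + obs_var)  # close to 1 when the observation is trusted more
    new_mean = mean + gain * (obs - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var


# Walk "in the dark": five steps of pure self-motion, uncertainty accumulates.
mean, var = 0.0, 1.0
for _ in range(5):
    mean, var = predict(mean, var, motion=1.0, motion_var=0.5)
print("before visual fix:", mean, var)

# A high-confidence visual fix pulls the estimate toward the observation
# and shrinks the uncertainty sharply.
mean, var = update(mean, var, obs=4.0, obs_var=0.1)
print("after visual fix:", mean, var)
```

The gain term is the "relative faith in the two estimates": the noisier the accumulated self-motion estimate is compared with the observation, the harder the correction.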
But is there a way to formulate a Bayesian learning model that uses the actual bit space of the input as part of the prior distribution? Or is that just too much to compute?
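As a rough feel for the "too much to compute" worry, here is a small sketch that counts how many distinct SDRs exist for a given size. The values n = 2048 bits with w = 40 active are an assumption chosen for illustration, not something stated in this thread, and `sdr_count` is a made-up helper name.

```python
import math


def sdr_count(n, w):
    """Number of distinct binary vectors of length n with exactly w active bits."""
    return math.comb(n, w)


n, w = 2048, 40  # assumed sizes, for illustration only
count = sdr_count(n, w)

# Report the size of the space in bits rather than printing the full integer.
print(f"about 2^{count.bit_length() - 1} possible SDRs for n={n}, w={w}")
```

A prior that assigns a separate probability to every point in a space that large cannot be tabulated directly, so any practical formulation would need to factorize over bits or otherwise restrict the model.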
How about starting with a single SDR.
The set bits are the beliefs. The sampled vector is the observation.
What is the relationship via Bayes analysis?
It should be possible. The SDR Classifier does something similar. It computes P( e(t+1) | a(t) ), where e(t+1) represents the predicted bits in the encoder, and a(t) is the SDR representing the TM activity at time t.
It uses maximum likelihood, so it’s not full-on Bayesian.
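To illustrate the distinction between maximum likelihood and a more fully Bayesian estimate, here is a toy sketch using simple counts. It is not the SDR Classifier’s implementation: it assumes a single categorical distribution over three hypothetical output buckets and a symmetric Dirichlet prior with a made-up concentration `alpha`.

```python
def ml_estimate(counts):
    """Maximum likelihood: relative frequencies; unseen outcomes get probability 0."""
    total = sum(counts)
    return [c / total for c in counts]


def posterior_mean(counts, alpha=1.0):
    """Posterior mean under a symmetric Dirichlet(alpha) prior over the outcomes."""
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]


# Hypothetical counts of how often each bucket followed a given activity pattern.
counts = [3, 1, 0]

print("maximum likelihood:", ml_estimate(counts))
print("posterior mean:    ", posterior_mean(counts))
```

The maximum likelihood estimate rules out the bucket that has never been seen, while the posterior mean keeps some probability on it; with more data the two estimates converge.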
Talking about hot gym always makes me feel better, so let’s start there. When taking a stream of scalar values over time to predict future values, we first encode them into a binary stream and feed it into HTM. HTM creates an output binary stream representing predictions, which we can decode into a scalar prediction stream.
What would the Bayesian solution to this problem look like?