I found this absolutely phenomenal paper called “Towards a Mathematical Theory of Cortical Micro-circuits” which proposes a method of Bayesian belief updating and modelling for the HTM architecture. I desperately want to try my hand at implementing or testing this, but I can’t find much information about it.
Does anyone know if there are any Numenta repositories implementing this code? Or, alternatively, if anyone could help me understand some of the key algorithmic design decisions, that would be immensely appreciated. Is there a more in-depth discussion or pseudocode for this algorithm anywhere? Is there a more up-to-date version of this theory which deals with belief encoding?
Why is nobody talking about this? It seems like a HUGELY pivotal piece of research which links together a number of interesting concepts related to building hierarchical generative models of sensory data, involving the whole set of cells in the cortical microcolumn, basically a full hierarchical end-to-end abstraction prediction machine, like a transformer. Forgive my excitement, but why, as a community, are we NOT working on this? Is Numenta still working on this?
Just realised that this paper is relatively old (2009) and uses actual Markov chains instead of the usual HTM circuitry. Can we make use of our improved understanding of HTM theory to implement this biologically with HTM and temporal memory?
Sorry for the nitpick, but I’m not sure what you meant by “actual Markov chains”, as HTM (SP+TM) is an honest-to-god Markov chain itself, meaning its immediate future states depend only on the current states and the next-step observations. I assume you meant that the units work with explicit probabilities instead of a proxy like the “active” or “predictive” state?
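To make the distinction concrete, here is a minimal sketch of the two readings: explicit transition probabilities versus a binary "predictive" proxy. It is a toy first-order chain over symbols, not HTM's temporal memory (which keeps higher-order context in its cell states), and the names `learn_transitions` and `predictive_set` are made up for illustration.

```python
from collections import defaultdict

def learn_transitions(sequence):
    """Count first-order transitions and normalise each row into probabilities."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(sequence, sequence[1:]):
        counts[prev][nxt] += 1
    return {
        prev: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for prev, row in counts.items()
    }

def predictive_set(transitions, state, threshold=0.0):
    """Binary proxy: successors of `state` whose probability exceeds `threshold`.

    This throws away the probabilities and keeps only a yes/no 'predicted' flag,
    loosely analogous to a predictive state (an assumption of this sketch).
    """
    return {nxt for nxt, p in transitions.get(state, {}).items() if p > threshold}

sequence = list("ABCABDABC")
transitions = learn_transitions(sequence)

# Explicit probabilities: the distribution over the next symbol given only the current one.
print(transitions["B"])
# Proxy: just the set of symbols that are predicted at all.
print(predictive_set(transitions, "B"))
```

In both cases the next step depends only on the current state, so both are Markov; the difference is whether the probabilities are stored explicitly or collapsed into a set of predicted successors.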
It’s interesting to see that Numenta used to work with explicit Bayesian modelling though.
In 2009 everything worked with Bayesian modelling! There were t-shirts. They were probably building refrigerators with Bayesian modeling.
As I recall, the first AI MOOC from Stanford, in 2011, with Google’s Peter Norvig and Sebastian Thrun, was all Bayes. It didn’t even mention neural nets. This was in 2011, right on the cusp of the deep learning revolution!
The reason it’s not used now is probably that Bayesian statistics is inferior to distributed representation as a model for cognition. It was an improvement at the time, as a first guess at the puzzling indeterminacy of cognitive representation when transitioning from symbolic AI. But the subsequent success of networks supports the idea that the underlying reality is distributed, not statistical.
You’re right, but specifically, if you want to identify a Markov chain, you can read the dendritic segments directly and their probabilities to identify exact sequences of neurons, but with a true neural system you would only be able to read proxies of those Markov chains as you do not have dendritic information. I have a theory that the union pooler/temporal pooler is essentially doing this identification of Markov chains. It works by being activated by a set of predictive neurons and working backwards through its history to associate past states with the current one, forming a sort of chain, except it’s a very messy heuristic for the actual chains.
I mean more so: instead of dense mathematical notation, a biologically realistic mechanism for this belief updating and message passing, including something like the temporal pooler and all the knowledge we’ve gained about lateral inhibition, Hebbian learning, dendritic/apical predictive states, etc.
What do you make of the belief that the brain is quite literally doing a form of Bayesian modelling, physically trying to predict likelihoods given a prior distribution and new evidence? It seems like the cortical microcolumn has circuitry to actually implement that type of computation explicitly, and the predictive coding framework/energy minimisation seems to fit very well with the idea that the brain is forming hierarchical generative models to predict upcoming sensory information using belief propagation.
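For anyone following along, the "prior plus new evidence" step is just Bayes' rule: the posterior is proportional to likelihood times prior. A minimal sketch over a discrete set of hypotheses is below; the hypothesis names and numbers are invented for illustration and are not taken from the paper, and `bayes_update` is a hypothetical helper.

```python
def bayes_update(prior, likelihood):
    """Return the posterior P(h | evidence) for each hypothesis h.

    prior:      dict mapping hypothesis -> P(h)
    likelihood: dict mapping hypothesis -> P(evidence | h)
    """
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalised.values())  # P(evidence), the normalising constant
    if evidence == 0:
        raise ValueError("evidence has zero probability under every hypothesis")
    return {h: v / evidence for h, v in unnormalised.items()}

# Illustrative numbers only: two equally likely causes, one explains the input better.
prior = {"cat": 0.5, "dog": 0.5}
likelihood = {"cat": 0.8, "dog": 0.2}

posterior = bayes_update(prior, likelihood)
print(posterior)

# Belief updating over time: yesterday's posterior becomes today's prior.
posterior_2 = bayes_update(posterior, likelihood)
print(posterior_2)
```

Belief propagation in a hierarchy repeats this kind of update at every node, with messages from parents acting as priors and messages from children acting as likelihoods; this sketch only shows the single-node step.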
Short story, I believe indeed that “the brain is forming hierarchical generative models to predict upcoming sensory information”. But specifically I believe those models are not statistical, but dynamical, even chaotic. And in detail I think they form as constantly changing resonance clusters in networks of sequences, clustering sub-sequences which share predictions, and which because they share predictions have dense sequence network connections.
In nature it’s easier to have the network than any statistical summary over that network.
Statistics resembles chaos in many ways. It’s a way to deal with the indeterminacy which stumped the earlier symbolic ideas, and which I think is the ignored quality of deep learning. You can form reasonably good statistical approximations of chaos. But personally to me, chaos forming in an actual network, is a much more plausible model of indeterminacy in the brain than any kind of explicit statistical processing.
I think Bayes does have an advantage over current network tech by being generatively focused. That point of view might be most actively expressed at this time by Karl Friston’s use of Bayes in Active Inference.
That generative focus is what I think is missing from current neural network tech. And why HTM might be a good place to move forward with network tech. Because HTM is generally receptive to a generative paradigm, like Bayes, as you’ve observed with this paper.
But if you’re interested in Bayes as a specific line of enquiry, and want to know where Bayes and HTM went, possibly exploring some of what you found in that paper, you might write to original Numenta co-founder Dileep George. As I understand, he left Numenta in 2010, so one year after your 2009 paper, to found separate AI startup Vicarious. And as I understand it his motivation to leave was exactly because he became convinced that actual networks were not necessary, and it would be better to replace them with mathematical abstractions. So he may have done some of exactly the work you found. Vicarious got quite good funding initially, from Facebook(? or Thiel…), when they wowed people by solving some early captchas.
Oh, update from Wikipedia: “Alphabet-owned company Intrinsic acquired Vicarious in 2022. The AI and robotics divisions merged with Intrinsic, while the research division (including George) joined DeepMind. As of 2022, George is a Research Scientist at DeepMind.”
Markov models are only a small part of what the brain does. They are interesting and easy to analyze, so they’re a good starting point for trying to understand how the cortex works, but the cortex is also doing a lot of other things, such as grid cells and object recognition. In my experience from trying to extend the HTM model to cover more of the cortex’s functions, HTM is lacking important biological details, which makes it difficult to model phenomena beyond Markov models. IMO more biological accuracy is called for, not less.