Has any one attempted to formulate HTM using the notation of probabilistic graphical models?
I’m interested because doing so could put HTM on firmer mathematical ground. Or at least, it could be fruitful, because probability theory has a pretty big toolbox for representing, poking, and probing models. So I’m curious whether such attempts have been made.
The discussion in the HTM paper “Why Neurons Have Thousands of Synapses, a Theory of Sequence Memory in Neocortex” (Hawkins & Ahmad, 2016) compares HTM to other models such as Markov chains and estimates an HTM’s properties in terms of a sequence of events. The whole discussion is worth reading, but here is a pertinent excerpt:
Network Capacity and Generalization
[…]
The capacity of various forms of sequence memory has been studied before (Sompolinsky and Kanter, 1986; Riedel et al., 1988; Leibold and Kempter, 2006). In our model it is straightforward to obtain an estimate of sequence capacity. Although we refer to the network model as a “sequence memory,” it is actually a memory of transitions. There is no representation or concept of the length of sequences or of the number of stored sequences. The network only learns transitions between inputs. Therefore, the capacity of a network is measured by how many transitions a given network can store. This can be calculated as the product of the expected duty cycle of an individual neuron (cells per column/column sparsity) times the number of patterns each neuron can recognize on its basal dendrites. For example, a network where 2% of the columns are active, each column has 32 cells, and each cell recognizes 200 patterns on its basal dendrites, can store approximately 320,000 transitions ((32/0.02) ∗ 200). The capacity scales linearly with the number of cells per column and the number of patterns recognized by the basal synapses of each neuron.
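The capacity estimate in the excerpt is just a product of two factors, which is easy to check in a few lines. A minimal sketch, using the paper’s example numbers (the function name is mine, not from the paper):

```python
# Sequence (transition) capacity estimate from Hawkins & Ahmad (2016):
#   capacity ≈ (cells per column / column sparsity) * patterns each neuron
#              recognizes on its basal dendrites
def sequence_capacity(cells_per_column, column_sparsity, patterns_per_neuron):
    return (cells_per_column / column_sparsity) * patterns_per_neuron

# The paper's example: 2% column sparsity, 32 cells per column,
# 200 patterns per neuron -> roughly 320,000 transitions.
print(sequence_capacity(32, 0.02, 200))  # ≈ 320000
```

Note the linear scaling the authors mention: doubling either `cells_per_column` or `patterns_per_neuron` doubles the estimate.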
@Bitking That’s a great suggestion. A quick google search and I found some relevant papers. I imagine doing it for HTM would be a more difficult task.
@dmac Thanks, I did take a look at the paper. They mention HMMs briefly and make comparisons, but there is no in-depth discussion. HMMs are a good example of a model that is captured very well by a graphical model. I wonder whether the same can be done for HTM somehow, partly because PGMs make a model’s implicit assumptions about the data explicit, so they are helpful in analyzing models and their biases.
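To make the HMM point concrete: an HMM is fully specified by a prior, a transition matrix, and an emission matrix, which is exactly why it has such a clean graphical-model picture. A toy sketch of the standard forward algorithm (all the numbers here are made up for illustration):

```python
# Toy two-state HMM: the whole model is three small tables, which is what
# makes its graphical-model structure (chain of hidden states, one emission
# per state) so explicit.
states = ["A", "B"]
prior = {"A": 0.5, "B": 0.5}
trans = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.2, "B": 0.8}}
emit  = {"A": {"x": 0.7, "y": 0.3}, "B": {"x": 0.1, "y": 0.9}}

def forward(obs):
    """Probability of an observation sequence under the HMM (forward algorithm)."""
    alpha = {s: prior[s] * emit[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: sum(alpha[t] * trans[t][s] for t in states) * emit[s][o]
                 for s in states}
    return sum(alpha.values())

print(forward(["x", "y", "x"]))
```

Every conditional independence assumption the model makes is readable off those tables; that transparency is what I’d like to have for HTM.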
In general are there any works out there that try to analyze HTM theory from a probabilistic perspective?
PGMs have states, and transitions between states. If you look at a picture of a Bayes net, the states are shown as units (circles), with arrows starting at one circle and going to another. If you fed an initial input into HTM and then watched the columns evolve as a stored sequence played back from there, you would be viewing states that are made of distributed units, and transitions that are made of distributed connections.
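The contrast can be caricatured in a few lines. This is my own toy sketch, not HTM proper: a Markov chain maps one atomic state to another, while an HTM-like memory maps one *set* of active cells to another, so each "state" is a distributed pattern:

```python
# Atomic states: a Markov-chain-style transition table, one unit -> one unit.
markov = {"s1": "s2", "s2": "s3"}

# Distributed states: each state is a set of active cell ids, and a
# transition is learned between whole patterns (a deliberately crude stand-in
# for HTM's distributed connections).
transitions = {}

def learn(prev_cells, next_cells):
    transitions[frozenset(prev_cells)] = frozenset(next_cells)

def predict(cells):
    # Return the stored successor pattern, or the empty set if unseen.
    return transitions.get(frozenset(cells), frozenset())

learn({1, 5, 9}, {2, 6, 10})
print(sorted(predict({1, 5, 9})))  # -> [2, 6, 10]
```

Real HTM does much more (partial matching on dendritic segments, multiple simultaneous predictions), but even this caricature shows why mapping HTM onto a node-per-state graphical model is not straightforward.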