I am studying temporal memory and I understand its function, but I still have doubts on some points. My main question concerns the receptive field of the distal segments. I understand how it works in the SP, between the proximal segment and the input space, but I still cannot see how it works in the lateral connections between the minicolumns. Does HTM theory make a simplification here, where the receptive field would be all the winning or active cells from t-1? I believe my doubt stems from my lack of knowledge about the axonal projections of neurons in the TM layer. Could anyone clarify this for me?
Another point is the synaptic learning of the distal and apical segments: is it pure Hebbian learning? This confuses me because of the modulatory character of these segments. For Hebbian learning to work, isn't a stronger relationship between dendrite activation and the neuron's action potential necessary?
The TM receptive field is the set of winnerCells from t-1, yes. Even bursting columns still have exactly one winnerCell each, and since the number of columns never changes, the receptive field should always be the same size.
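A toy sketch of that point (the names and data structures are mine, not NuPIC's API): every active column contributes a winner, and a bursting column contributes exactly one, so the size of winnerCells tracks the number of active columns, not the amount of bursting.

```python
# Illustrative sketch: one winner cell per active column, whether or
# not the column burst.

CELLS_PER_COLUMN = 4

def winner_cells(active_columns, predictive_cells):
    """active_columns: iterable of column ids.
    predictive_cells: set of (column, cell_index) pairs predicted at t-1."""
    winners = set()
    for col in active_columns:
        predicted = [(col, i) for i in range(CELLS_PER_COLUMN)
                     if (col, i) in predictive_cells]
        if predicted:
            winners.update(predicted)   # correctly predicted cells win
        else:
            # Bursting column: all cells fire, but only one (here simply
            # cell 0; real implementations pick the best-matching or
            # least-used cell) becomes the winner that learns.
            winners.add((col, 0))
    return winners
```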
I’d recommend the BAMI chapter on TM, which has good pseudocode a few pages in:
Not exactly sure what you mean here, but I bet understanding that pseudocode will help answer it.
Basically, each distal segment is composed of synapses, which are either formed or unformed. Only the formed ones actually generate predictions.
Whether a synapse is formed or unformed depends on its permanence value (0 to 1). When a segment becomes correctly predictive, the synapses that activated it (made it predictive) are rewarded by having their permanences incremented. Conversely, when a segment becomes incorrectly predictive, those activating synapses have their permanences decremented.
So each column has its cells, each cell has its segments, and each segment has its synapses with their permanence values. Learning happens through the modulation of these permanence values, enabling the TM to learn and, potentially, to forget sequences.
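A minimal numeric sketch of that permanence mechanism (the threshold and increments below are illustrative values, not canonical ones):

```python
# Toy sketch of permanence-based learning on one distal segment.
CONNECTED_PERM = 0.5           # a synapse counts as "formed" at or above this
PERM_INC, PERM_DEC = 0.10, 0.05

def adapt_segment(segment, prev_active_cells, correctly_predictive):
    """segment: dict mapping presynaptic cell id -> permanence in [0, 1].
    Reward the synapses whose presynaptic cell was active at t-1 when
    the prediction was correct, punish them when it was not."""
    for cell in segment:
        if cell in prev_active_cells:
            delta = PERM_INC if correctly_predictive else -PERM_DEC
            segment[cell] = min(1.0, max(0.0, segment[cell] + delta))

def formed_synapses(segment):
    """Only these synapses contribute to putting a cell into the
    predictive state."""
    return {c for c, p in segment.items() if p >= CONNECTED_PERM}
```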
Thanks for the clarifications! I read the BAMI pseudocode, but as I am interested in biological plausibility I always try to see how this is happening in the brain. The winning cells are those that contain the segments that best correspond to the previous activity in a mini column, if not have that segment, a new segment is formed in the cell with fewer segments, right? But it is still not clear to me why winning cells are differentiated from active cells to determine the potential connections of the distal segment. Is this an optimization or does it happen in some way in the brain?
“But it is still not clear to me why winning cells are differentiated from active cells to determine the potential connections of the distal segment.”
Because we need to confine the learning to one cell only, given that the cell is encapsulated within an active minicolumn. If we treated every active cell as a winning cell, then during bursting (massive neuron firing, i.e. all cells in the minicolumn become active) all cells belonging to a bursting minicolumn would learn the same cellular activity, and you do not want that to happen.
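A small sketch of why that matters, under the assumption that a burst grows one new segment (the data structures are illustrative): if every cell grew a segment to the same t-1 winners, all cells in the column would end up with identical receptive fields and could no longer represent different sequence contexts.

```python
def learn_on_burst(column_segments, prev_winners, single_winner=True):
    """column_segments: one list of segments per cell in the column;
    each segment is a frozenset of presynaptic cells."""
    if single_winner:
        # Pick the least-used cell, as the pseudocode does, and grow
        # one new segment there.
        cell = min(range(len(column_segments)),
                   key=lambda i: len(column_segments[i]))
        column_segments[cell].append(frozenset(prev_winners))
    else:
        # What we want to avoid: every cell learns the same pattern.
        for segments in column_segments:
            segments.append(frozenset(prev_winners))
```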
Yes, I just miss an explanation of how the brain does this; I understand that it is fundamental for the TM to work. Perhaps it is a workaround for the lack of a more detailed receptive field? An optimization?
Why do you have to choose only one winner from a bursting mini-column? I know that in some implementations this is needed because the synapses are created along with whole distal segments in response to bursting columns. However, in the original documentation, the dendrites and potential synapses are supposed to already exist. The fact that the column burst means that the previous state was not connected strongly enough to any of the cells in the column to put them in a predictive state. In that case, all of the permanences associated with synapses between pre- and post-firing neurons should be strengthened, with the expectation that the next time a similar sequence arrives, at least one of the cells in the column will have sufficient activations on its distal dendrites to put it into a predictive state.
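A rough sketch of that variant, assuming the potential synapses already exist (the structures and increment are illustrative, not from any particular implementation):

```python
# On a burst, strengthen every existing potential synapse from a t-1
# active cell onto every cell of the bursting column, rather than
# picking a single winner.

PERM_INC = 0.1   # illustrative increment

def reinforce_bursting_column(column_cells, prev_active):
    """column_cells: per-cell list of segments; each segment is a dict
    mapping presynaptic cell id -> permanence."""
    for segments in column_cells:
        for segment in segments:
            for pre in segment:
                if pre in prev_active:
                    segment[pre] = min(1.0, segment[pre] + PERM_INC)
```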
I think it gives a specific context to the unexpected input that caused the burst.
The winnerCells from next time step (t+1) will also be forming synapses to the current winners (t).
So if all activeCells were treated as winners (eligible to appear in distal receptive fields), then the distal segments of the winnerCells(t+1) would mostly connect to non-overlapping sets of winnerCells(t), which would seem to confuse the TM.
Rereading the pseudocode I realized that I missed this paragraph:
“The Temporal Memory algorithm allows a single distal dendrite segment to form connections to multiple patterns of previously active cells, but the pseudocode above generally does not do this. It learns one pattern per segment. This approach grows more dendrite segments than is strictly necessary, but this doesn’t have a fundamental impact on the algorithm.”
Now it makes more sense to me, thanks for the help.
I was recalling, from a dusty old memory, that in one of the early descriptions of TM (BAMI? CLA?) it seemed like all neurons in a bursting column would strengthen the permanence values of the synapses in their respective potential pools that were active in the previous time step.
Thereafter, if the same pattern appeared again, the chances would be greater that one or more of the neurons in the column would have sufficient distal activations to enter a predictive state. If more than one neuron becomes predictive in a column, then the k neurons (where k can be 1) with the greatest distal activations are allowed to become active. All other predictive cells would then have their distal permanence values decremented, thereby avoiding the overlapping condition you describe.
That’s how I have it set up in my version, and it runs pretty much the same. Theoretically, events with more than one predictive cell should be rare even when many are allowed, because when a column bursts, one cell is selected to grow a new segment. Thereafter, multiple winners are unlikely even if allowed, because first you’d have to have another burst event with similar cell activity that somehow didn’t just activate the same cell with its new segment.
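The top-k selection described above could be sketched like this (assuming each predictive cell's distal activation count has already been computed; the names are mine):

```python
# Keep the k predictive cells with the greatest distal activations and
# return the rest so their permanences can be decremented.

def select_predictive_winners(predictive_cells, activations, k=1):
    """predictive_cells: cell ids predicted in one column.
    activations: dict mapping cell id -> number of active distal synapses."""
    ranked = sorted(predictive_cells,
                    key=lambda c: activations[c], reverse=True)
    winners, losers = ranked[:k], ranked[k:]
    return winners, losers   # losers get their permanences decremented
```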
“The Temporal Memory algorithm allows a single distal dendrite segment to form connections to multiple patterns of previously active cells, but the pseudocode above generally does not do this. It learns one pattern per segment. This approach grows more dendrite segments than is strictly necessary, but this doesn’t have a fundamental impact on the algorithm.”
As I understand it, this paragraph says that in the pseudocode the receptive field of each segment is just the winnerCells(t-1), which makes it possible to learn only one pattern per segment. This differs from the original Temporal Memory algorithm, which allows each segment to learn multiple patterns, so there the receptive field would be the activeCells(t-1), or at least more than the winnerCells(t-1). However, both methods work…
“the receptive field of each segment is just the winnerCells (t-1)”
Yes, but rather than forming connections with all the winning cells at t-1, we do subsampling, i.e. we form connections with ~20% (this number can change) of the winning cells at t-1. This enables one segment to respond to multiple cellular patterns.
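A minimal sketch of that subsampling step (the 20% figure is the one mentioned above; the function name is mine):

```python
import random

# When growing a new segment, connect to only a random ~20% sample of
# the t-1 winner cells instead of all of them, so different segments
# end up sampling different subsets of the same activity.

def grow_segment(prev_winners, sample_fraction=0.2, rng=None):
    rng = rng or random.Random()
    n = max(1, int(len(prev_winners) * sample_fraction))
    return set(rng.sample(sorted(prev_winners), n))
```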