Countering multiple cells representing same context?

Since I changed my TM implementation to allow activation of multiple cells in a single column when more than one is in the predictive state, I have been noticing a phenomenon that affects sparsity. For certain input sequences, multiple cells in a single column can get connected to the same context.

The easiest way to understand the problem is to consider a single input that is repeated, with the columns for that input later reactivated by bursting (there are many other ways to produce the phenomenon, but this one is easy to describe and understand).

For example, let’s input the sequence A, A, A, A:

  1. First input A is unexpected, so the columns burst. Cells representing A’ are selected randomly for learning.
  2. Second input A is also unexpected, so the columns for A burst. Cells representing A’’ are selected randomly for learning. They form distal connections with cells representing A’.
  3. Third input A is also unexpected, so the columns burst. Cells representing A’’’ are selected randomly for learning. They form distal connections with cells representing A’’. Cells for A’’ also become predictive, because a bursting column means the cells for A’ are also now active.
  4. Fourth input A is expected, and cells for A’’ become active. They grow distal connections with cells representing A’’’.

At this point we have a circular connection between cells representing A’’ and cells representing A’’’. In the future, any time A bursts, it will put two cells in the column into the predictive state. Both cells will then be activated and reinforced in the case of a correct prediction, and degraded equally in the case of a wrong prediction. Thus there does not appear to be any way for one of the cells to ever win out over the other in representing the context.
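The deadlock described above can be illustrated with a toy sketch. It is not a TM implementation: each cell is reduced to a single permanence value, and the function names, the starting permanence, and the increment and decrement values are all assumptions made for illustration. It only shows that two cells receiving identical updates can never diverge.

```python
def update(permanence, correct, inc=0.10, dec=0.05):
    """Reinforce on a correct prediction, degrade on a wrong one.

    inc and dec are illustrative values, not taken from any real config.
    """
    if correct:
        return min(1.0, permanence + inc)
    return max(0.0, permanence - dec)


def run(outcomes, start=0.3):
    """Apply the same sequence of outcomes to two cells in one column."""
    cell_a = cell_b = start
    for correct in outcomes:
        # Both cells were predictive, so both receive the same update.
        cell_a = update(cell_a, correct)
        cell_b = update(cell_b, correct)
    return cell_a, cell_b


a, b = run([True, False, True, True, False])
print(a == b)  # the two cells stay tied whatever the outcomes are
```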

Here are a couple of screenshots of this happening in practice. In this case, there was a more complex sequence, but you can see that the phenomenon has occurred in four of the “C” columns:

Looking at this, I can see that one of the predictive cells is better connected to the input than the other one. The possible solution that comes to my mind would be to only select the better matching one for learning, and degrade the less matching one. Does that seem like a good solution, or could it lead to problems in other scenarios?

Anyway, thought I would get some feedback. Maybe there is a process that I have overlooked, or I might be interpreting something incorrectly.

"The possible solution that comes to my mind would be to only select the better matching one for learning, and degrade the less matching one."

If I recall correctly, in an algorithm I read, you select the best matching one for learning (based on actual excitation value if any is over threshold, or otherwise based on overlap, and breaking ties however you want). I think it didn’t degrade the less matching ones, because that could interfere with other known sequences.
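A minimal sketch of that selection rule, as I understand it, might look like the following. The `Candidate` fields, the function name, and the threshold parameter are hypothetical names chosen for illustration and are not taken from any particular TM codebase. Ties are broken randomly here, but any rule would do.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    cell: int        # hypothetical cell index within the column
    excitation: int  # active connected synapses on the cell's best segment
    overlap: int     # active synapses including those below the connected threshold


def select_learning_cell(candidates, activation_threshold, rng=None):
    """Pick one cell in a column for learning.

    Rank by excitation if any candidate reaches the threshold,
    otherwise rank all candidates by overlap.
    """
    if not candidates:
        raise ValueError("need at least one candidate cell")
    rng = rng or random.Random()
    above = [c for c in candidates if c.excitation >= activation_threshold]
    if above:
        pool, key = above, (lambda c: c.excitation)
    else:
        pool, key = candidates, (lambda c: c.overlap)
    best = max(key(c) for c in pool)
    tied = [c for c in pool if key(c) == best]
    return rng.choice(tied).cell  # arbitrary tie-break


cells = [Candidate(cell=0, excitation=12, overlap=15),
         Candidate(cell=1, excitation=9, overlap=20)]
print(select_learning_cell(cells, activation_threshold=10))
```

Note that the losing cells are simply not selected for learning. Nothing is degraded in this sketch.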

As long as it doesn’t get out of control, it’s okay for multiple cells to be active in the same column. The sparsity changes anyway, because columns will burst at most steps of the sequence as a result of noise.


It definitely seems reasonable, but it doesn’t appear to be in the documentation for the BAMI TM algorithm. Relevant section:

For each active column, if any cell was predicted, those predicted cells become active (lines 11-12). Each of these cells is marked as a “winner” (line 13), making them presynaptic candidates for synapse growth in the next time step. For each of these correctly active segments, reinforce the synapses that activated the segment, and punish the synapses that didn’t contribute (lines 16-20). If the segment has fewer than SYNAPSE_SAMPLE_SIZE active synapses, grow new synapses to a subset of the winner cells from the previous time step to make up the difference (lines 22-24).
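To make the quoted step concrete, here is a rough sketch of the per-segment learning it describes. This is my reading of the passage, not the BAMI pseudocode itself: a segment is modelled as a dict from presynaptic cell to permanence, and the function name and the increment, decrement, and initial permanence values are assumptions.

```python
import random


def learn_on_active_segment(segment, prev_active, prev_winners,
                            sample_size=20, inc=0.10, dec=0.10,
                            initial=0.21, rng=None):
    """Reinforce, punish, then grow synapses on one correctly active segment.

    segment: dict mapping presynaptic cell id -> permanence (modified in place).
    sample_size plays the role of SYNAPSE_SAMPLE_SIZE in the quoted text.
    """
    rng = rng or random.Random(0)
    # Reinforce synapses that contributed, punish those that did not.
    for pre in list(segment):
        if pre in prev_active:
            segment[pre] = min(1.0, segment[pre] + inc)
        else:
            segment[pre] = max(0.0, segment[pre] - dec)
    # Grow new synapses to previous winner cells to make up the difference.
    n_active = sum(1 for pre in segment if pre in prev_active)
    missing = sample_size - n_active
    if missing > 0:
        candidates = sorted(prev_winners - set(segment))
        for pre in rng.sample(candidates, min(missing, len(candidates))):
            segment[pre] = initial
    return segment


seg = {1: 0.5, 2: 0.5}
learn_on_active_segment(seg, prev_active={1}, prev_winners={1, 3, 4}, sample_size=3)
print(sorted(seg))  # cells 3 and 4 have been added as new presynaptic partners
```

Nothing in this step compares the predicted cells within a column, so two cells predicted from the same context both go through it unchanged.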

The algorithm for bursting columns, however, does use best-matching logic, so it seems reasonable to do the same in other cases where more than one cell in the column is activated.

The interference with other known sequences could theoretically be avoided by only degrading in cases where there was exactly one cell per column active in the previous time step (i.e. we know we are comparing two cells representing the same context and not two different contexts). Of course, that type of logic is probably implausible in a biological system.
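As a sketch of that guard, assuming the previous time step's activity is available as a dict from column to the set of active cells in it (all names and the decrement value here are hypothetical, and each cell is simplified to one permanence value):

```python
def one_cell_per_column(prev_active):
    """True only if every previously active column had exactly one active cell."""
    return bool(prev_active) and all(len(cells) == 1
                                     for cells in prev_active.values())


def punish_losers(permanences, winner, losers, prev_active, dec=0.05):
    """Degrade the losing predicted cells, but only when the previous
    context was unambiguous. Returns a new dict and leaves the input as is.
    """
    if not one_cell_per_column(prev_active):
        return dict(permanences)  # ambiguous context: leave everything alone
    return {cell: max(0.0, p - dec) if cell in losers and cell != winner else p
            for cell, p in permanences.items()}


perms = {"c1": 0.5, "c2": 0.5}
unambiguous = {7: {3}}         # one active cell in column 7
bursting = {7: {0, 1, 2, 3}}   # column 7 burst, so the context is ambiguous
print(punish_losers(perms, "c1", {"c2"}, unambiguous))
print(punish_losers(perms, "c1", {"c2"}, bursting))
```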