In the TM algorithm, if a segment was active in the previous timestep (i.e. the post-synaptic cell was in a predictive state) but the post-synaptic cell did not become active in the current timestep, then the active synapses on that segment are punished.
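To make sure we're talking about the same step, here's a minimal, self-contained sketch of that punishment rule. All the names (`Synapse`, `Segment`, `punish_failed_predictions`, the decrement value) are illustrative, not the actual NuPIC/htm.core API:

```python
PUNISH_DECREMENT = 0.003  # small permanence decrement (assumed value)

class Synapse:
    def __init__(self, presynaptic_cell, permanence):
        self.presynaptic_cell = presynaptic_cell
        self.permanence = permanence

class Segment:
    def __init__(self, cell, synapses):
        self.cell = cell          # the post-synaptic cell this segment belongs to
        self.synapses = synapses

def punish_failed_predictions(active_segments_prev, active_cells_prev, active_cells_now):
    """Segments that were active at t-1 (so their cell was predictive) whose
    cell did NOT become active at t get their active synapses decremented."""
    for segment in active_segments_prev:
        if segment.cell in active_cells_now:
            continue  # prediction was correct; no punishment
        for syn in segment.synapses:
            # only synapses onto cells that were active at t-1 contributed
            if syn.presynaptic_cell in active_cells_prev:
                syn.permanence = max(0.0, syn.permanence - PUNISH_DECREMENT)
```

Note that the rule doesn't distinguish *why* the pre-synaptic cell was active at t-1 (correctly predicted vs. bursting), which is what the issue below hinges on.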
I’ve found that this strategy leads to slow learning rates and high forgetting rates in certain scenarios. The reason is that before a new sequence is learned, there is a lot of bursting. Each time a particular set of minicolumns bursts, it predicts all possible next values in all learned sequences that contain that input. Since only one of those possible next values will actually become active, all the other possibilities end up getting punished. This makes it more likely that transitions in previously learned sequences are forgotten whenever new sequences are encountered. The longer and more complex the new sequences are, the more likely transitions in other sequences will be forgotten.
My initial thought is that the permanence decrement logic should take into account whether the pre-synaptic cell was predicted active (versus just active). Any thoughts on possible negative consequences that this change could cause?
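To make the proposal concrete, here's one possible sketch of the modified decrement logic, again with hypothetical names and a dict-based segment representation rather than any real API. The idea is that a segment's synapses are only punished when the pre-synaptic cell was *predicted* active at t-1; activity that came from a bursting minicolumn is treated as unlearned context and left alone:

```python
def punish_segment(synapses, predicted_active_prev, cell_active_now, decrement=0.003):
    """Modified punishment for one segment that was active at t-1.

    synapses: dict mapping pre-synaptic cell id -> permanence.
    predicted_active_prev: cells that were active at t-1 because they were
        correctly predicted (i.e. NOT part of a bursting minicolumn).
    cell_active_now: whether the segment's post-synaptic cell became active at t.
    Returns a new synapse dict with decrements applied.
    """
    if cell_active_now:
        return dict(synapses)  # prediction was correct; no punishment
    return {
        pre: (max(0.0, perm - decrement) if pre in predicted_active_prev else perm)
        for pre, perm in synapses.items()
    }
```

Under this rule, the mass punishment triggered by bursting (every possible continuation being decremented at once) would no longer occur, since bursting cells are excluded from `predicted_active_prev`.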