We know how the TM stores a variable-order Markov chain (VOMC) using first-level transitions, i.e. by encoding higher-order context in the cells: A:B:C; X:B:Y becomes A:B1:C; X:B2:Y.
I’ve been trying to figure out a way to recognize and store sequence-to-sequence transitions, e.g. AB => CD => FGH … — known in other words as CHUNKING, i.e. figuring out the boundaries of a sequence. By comparison with the VOMC, the mechanism should be akin to BURSTING, which occurs on noticing a difference.
To notice a difference you need to have expectations.
To have expectations you need to have already memorized previous patterns.
So the question then is, FIRST, what REPRESENTATION should you use, and SECOND, where/how would you store it?
The VOMC has SDRs and the TM for this; what will be the equivalent for variable-order sequences (VOS)?
I’ve been playing with / thinking about using SDRs for VOS, but there are several problems:
- You cannot store a large sequence in a union SDR.
- A union SDR has no mechanism to preserve ORDER; it works like a SET, not a SEQUENCE. There are ways to do it (by permuting), but it is not natural.
- You can enlarge the SDR from 2000 to, say, 100000 bits, but isn’t the hierarchy, as we claim, supposed to be lowering complexity?
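The second problem is easy to see in a toy sketch (NumPy; the 2000-bit / 40-active-bit SDR parameters are assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
N, W = 2000, 40  # assumed SDR size and active-bit count

A = np.sort(rng.choice(N, W, replace=False))
B = np.sort(rng.choice(N, W, replace=False))

# a plain union SDR is a SET: "A then B" and "B then A"
# produce identical representations, so order is lost
AB = np.union1d(A, B)
BA = np.union1d(B, A)
print(np.array_equal(AB, BA))  # True
```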
The TM seems like an OK way to store the sequences, but how do you detect CHUNKS by BURSTING?
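One way to read this is as a surprise signal: after training, a transition the model has never seen fails the expectation, and that failure — the analogue of a burst — marks a chunk boundary. A minimal sketch of that idea, with a plain first-order transition table standing in for the TM (helper names are mine):

```python
from collections import defaultdict

def train(stream):
    # learn first-order transitions (a toy stand-in for the TM)
    table = defaultdict(set)
    for prev, nxt in zip(stream, stream[1:]):
        table[prev].add(nxt)
    return table

def burst_points(table, stream):
    # an unpredicted transition is a "burst": the expectation
    # fails, so we declare a chunk boundary at that position
    cuts = [0]
    for i, (prev, nxt) in enumerate(zip(stream, stream[1:]), start=1):
        if nxt not in table[prev]:
            cuts.append(i)
    return cuts

table = train("ABCABC")                # model knows A=>B, B=>C, C=>A
print(burst_points(table, "ABCXABC"))  # [0, 3, 4]: bursts around the novel X
```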
For more on the other structures HTM has to support (chunking, order, recursive structures, algebraic expressions), see:
https://www.cell.com/neuron/pdf/S0896-6273(15)00776-X.pdf
Just found this: Alternative sequence memory
reading it…
This looks like it will work, given the following rules:
1. Sequences are encoded by shift (>>), so that we can preserve the order.
2. [ ] is thinning: in this case 50% of each of the two patterns is randomly selected, so that sparsity stays the same.
3. ( ) are just grouping brackets.
AB = [ A >> 1 | B >> 2 ]
BA = [ B >> 1 | A >> 2 ]
ABC = [ AB >> 1 | BC >> 2 ] = [ ([ A >> 1 | B >> 2 ] >> 1) | ([ B >> 1 | C >> 2 ] >> 2) ]
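A toy NumPy sketch of these rules, representing SDRs as sorted index arrays (N, W and the helper names are my assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
N, W = 2000, 40  # assumed SDR size and active-bit count

def random_sdr():
    return np.sort(rng.choice(N, W, replace=False))

def shift(sdr, k):
    # ">> k": circular shift of the active-bit positions by k
    return np.sort((sdr + k) % N)

def thin(p, q):
    # "[ p | q ]": keep a random 50% of each pattern, so the
    # result has (roughly) the same sparsity as its inputs
    keep_p = rng.choice(p, len(p) // 2, replace=False)
    keep_q = rng.choice(q, len(q) - len(q) // 2, replace=False)
    return np.unique(np.concatenate([keep_p, keep_q]))

def pair(a, b):
    # AB = [ A >> 1 | B >> 2 ]
    return thin(shift(a, 1), shift(b, 2))

A, B = random_sdr(), random_sdr()
AB, BA = pair(A, B), pair(B, A)
# order survives: AB and BA share almost no bits
print(len(AB), len(np.intersect1d(AB, BA)))
```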

You stack TMs in layers, i.e.
TM1 stores transitions : A => B, B => C, C => D
TM2 : AB => BC, BC => CD, CD => DE
TM3 : ABC => BCD, BCD => CDE, CDE => DEF
…
The higher you go the slower it is … TM1 is activated every timestep, TM2 every 2 ts, TM3 every 3 ts, … which means that the capacity grows exponentially. If the lower levels forget, the higher levels will remember it for longer.
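A toy sketch of the stack, with a dict of transitions standing in for each TM layer. Here layer k sees the overlapping chunk of the last k symbols every step, which matches the TM2/TM3 transition lists above (the "every 2 ts" variant would instead advance each layer less often):

```python
from collections import defaultdict

class ToyTM:
    # stand-in for one TM layer: a first-order transition table
    def __init__(self):
        self.next = defaultdict(set)
        self.prev = None

    def learn(self, token):
        if self.prev is not None:
            self.next[self.prev].add(token)
        self.prev = token

def feed(stream, n_layers=3):
    # layer k receives the chunk of the last k symbols each step,
    # so TM1 learns A => B, TM2 learns AB => BC, TM3 learns ABC => BCD, ...
    layers = [ToyTM() for _ in range(n_layers)]
    for t in range(len(stream)):
        for k, tm in enumerate(layers, start=1):
            if t + 1 >= k:
                tm.learn(stream[t + 1 - k : t + 1])
    return layers

tm1, tm2, tm3 = feed("ABCDE")
print(tm1.next["A"], tm2.next["AB"], tm3.next["ABC"])
# {'B'} {'BC'} {'BCD'}
```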
If you forget a transition, a higher level can reinforce it. In fact you should be able to interrupt the flow by providing feedback at any level, TOP-DOWN…
The TP in this case will just pack the two SDRs for the upper layer and unpack them downstream.
The question is: how to do it with fewer TM layers?
And it has to be recursive, to support long-range patterns.
Here is how packing/unpacking will work:
abc = ((x.a >> 1) | (x.b >> 2) | (x.c >> 3))
In [31]: x.bm(abc >> 3)
Out[31]: 'c'
In [32]: x.bm(abc >> 2)
Out[32]: 'b'
In [33]: x.bm(abc >> 1)
Out[33]: 'a'
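A toy reconstruction of that session (NumPy; the lexicon, the shift sign convention, and the bm implementation are my assumptions — here bm returns the lexicon entry with the largest bit overlap, and shifting the packed SDR back by k re-aligns element k):

```python
import numpy as np

rng = np.random.default_rng(1)
N, W = 2000, 40  # assumed SDR size and active-bit count

# a small lexicon of random SDRs (my stand-in for x.a, x.b, x.c)
lexicon = {name: np.sort(rng.choice(N, W, replace=False)) for name in "abc"}

def shift(sdr, k):
    # circular shift of active-bit positions; ">> k" above
    return np.sort((sdr + k) % N)

def bm(sdr):
    # "best match": the lexicon entry with the largest bit overlap
    return max(lexicon, key=lambda n: len(np.intersect1d(sdr, lexicon[n])))

# pack: abc = (a >> 1) | (b >> 2) | (c >> 3)  (plain union, no thinning)
abc = np.union1d(np.union1d(shift(lexicon["a"], 1),
                            shift(lexicon["b"], 2)),
                 shift(lexicon["c"], 3))

# unpack: undoing shift k brings element k back into register
print(bm(shift(abc, -3)), bm(shift(abc, -2)), bm(shift(abc, -1)))
# c b a
```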