Are you asking about how high-order sequences are learned by the TM? It is indeed a little tricky to go through the learning process step by step. Here’s a rough textual description:
Suppose you’ve already learned XBCY (let’s call the TM representations X’B’C’Y’) and now you want to learn ABCD (A"B"C"D"). When you present A, nothing is predicted in the B columns. When you then present B, the B columns burst and a random set of cells is chosen to win. Those cells become part of B". So far so good.
However, because the B columns burst, C’ is predicted at this point. So the TM will learn A"B"C’D, which is incorrect but temporary.
Later when you present A, B" will be predicted. When you see B, B" will become active. At this point nothing is predicted in the C columns. When you see C, a random set of cells are chosen to win. These become C". So at this point the TM will learn A"B"C"D".
Because of this temporary incorrect state, the TM needs to see the high-order sequence several times before everything is learned correctly.
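If it helps, here’s a minimal Python sketch of the walkthrough above. To be clear, this is not NuPIC code and the names (ToyTM, step, present, CELLS_PER_COLUMN) are made up for illustration: it’s a stripped-down toy model with one column per letter, a few cells per column, boolean segments, and no permanences or reinforcement. It only keeps the parts needed to show bursting, winner-cell selection, and why the first pass of ABCD reuses C’.

```python
import random

CELLS_PER_COLUMN = 4  # toy value; real TMs use many more cells per column

class ToyTM:
    """Toy Temporal Memory: one column per letter, cells addressed as (column, index)."""

    def __init__(self):
        self.segments = {}          # cell -> list of frozensets of presynaptic cells
        self.prev_active = set()    # cells active on the previous time step
        self.prev_winners = set()   # winner cells from the previous time step

    def reset(self):
        self.prev_active, self.prev_winners = set(), set()

    def step(self, column, learn=True):
        cells = [(column, i) for i in range(CELLS_PER_COLUMN)]
        # A cell is predicted if any of its segments is contained in the
        # previously active cells.
        predicted = [c for c in cells
                     if any(seg <= self.prev_active
                            for seg in self.segments.get(c, []))]
        if predicted:
            active, winners, burst = set(predicted), set(predicted), False
        else:
            # Burst: every cell in the column becomes active; a least-used cell
            # wins and grows a new segment onto the previous winner cells.
            active, burst = set(cells), True
            fewest = min(len(self.segments.get(c, [])) for c in cells)
            winner = random.choice(
                [c for c in cells if len(self.segments.get(c, [])) == fewest])
            winners = {winner}
            if learn and self.prev_winners:
                self.segments.setdefault(winner, []).append(
                    frozenset(self.prev_winners))
        self.prev_active, self.prev_winners = active, winners
        return winners, burst

def present(tm, sequence):
    """Run one sequence through the toy TM and report burst vs. predicted per letter."""
    tm.reset()
    return [(col, "burst" if burst else "predicted", sorted(winners))
            for col in sequence
            for winners, burst in [tm.step(col)]]

random.seed(0)
tm = ToyTM()

# First learn XBCY so that X'B'C'Y' is stable.
for _ in range(3):
    present(tm, "XBCY")

# Pass 1 of ABCD: B bursts, which makes C' predicted, so C' is (incorrectly) reused.
# Pass 2: A" predicts B", B does not burst, so C now bursts and C" is chosen.
for n in range(1, 4):
    print("ABCD pass", n, present(tm, "ABCD"))
```

In this simplified model the output shows the same progression as the description: on the first ABCD pass the C columns report "predicted" (C’ being reused), on the second pass they burst and new C" cells are chosen, and it takes a further pass before the D transition settles, which is the "several times" effect mentioned above.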
Does that make sense? There are a few more details, but that should give the basic idea.