For an active column, a winner cell is chosen to learn the temporal transition. I assume that for that winner cell there is no point in looking for the other cells it is temporally connected to.
I believe a bursting minicolumn serves two purposes. One, as you pointed out, is to pick a winner cell to learn the new context. The other is to predict all possible next elements, so that if the sequence has changed, the system can transition to the new context within a couple of inputs.
For that, you definitely want the minicolumns bursting in the current time step to make predictions about the next time step. Bursting minicolumns will usually produce several predictions; the next input narrows those down, and the one after that narrows them further, until the ambiguity is resolved.
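
To make the narrowing concrete, here is a rough toy sketch in Python. It is not the actual temporal memory algorithm or any library's API: I'm assuming one minicolumn per symbol, one cell per learned context, and a single direct connection between consecutive cells instead of dendritic segments with thresholds. The names `learn_sequences` and `step` are made up for this illustration.

```python
from collections import defaultdict


def learn_sequences(sequences):
    """Toy learning (an assumption, not real HTM): every sequence gets its own
    cell in each column it visits, and consecutive cells are linked."""
    cells_in_column = defaultdict(list)  # symbol -> list of cells (symbol, index)
    forward = defaultdict(set)           # cell -> cells it predicts for the next step
    for seq in sequences:
        prev = None
        for symbol in seq:
            cell = (symbol, len(cells_in_column[symbol]))
            cells_in_column[symbol].append(cell)
            if prev is not None:
                forward[prev].add(cell)
            prev = cell
    return cells_in_column, forward


def step(symbol, predicted, cells_in_column, forward):
    """Activate the column for `symbol` and compute predictions for the next step."""
    column = cells_in_column[symbol]
    # Cells that were predicted become active on their own.
    active = [cell for cell in column if cell in predicted]
    bursting = not active
    if bursting:
        # No cell was predicted: the column bursts, so every cell becomes active.
        active = list(column)
    # Every active cell (including all cells of a bursting column) predicts.
    next_predicted = set()
    for cell in active:
        next_predicted |= forward[cell]
    return active, next_predicted, bursting


cells, forward = learn_sequences(["ABCD", "XBCY", "ZBQ"])
predicted = set()
for symbol in "BCD":  # start mid-sequence, with no prior context
    active, predicted, bursting = step(symbol, predicted, cells, forward)
    state = "bursting" if bursting else "predicted"
    print(symbol, state, sorted(active), "->", sorted(predicted))
```

With these three learned sequences, starting cold at B makes the B column burst, so all three B contexts predict at once (both C cells and the Q cell). The input C rules out the ZBQ context and leaves D and Y predicted, and the input D resolves the ambiguity to a single cell.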