How does Temporal Memory make t+n predictions if it operates only on the t & t-1 timesteps?

Hello! Last week I worked on the Temporal Memory whose pseudocode is explained in this paper: https://numenta.com/assets/pdf/temporal-memory-algorithm/Temporal-Memory-Algorithm-Details.pdf

At the moment I have a TM, but it is very buggy, because I don't understand the TM algorithm 100%.

One of my questions about TM: how does it make t+n (t+1, t+2, … t+n) predictions if it operates only on the t & t-1 timesteps? I think that if it operates only on t & t-1, it should only be able to make t+1 predictions.

Edit: “TP” & “Temporal Pooler” renamed to “TM” and “Temporal Memory”, per Paul_Lamb’s message.

3 Likes

One minor point worth making, because you may encounter earlier discussions and versions of HTM that could mislead you on the topic you are asking about (predictions further into the future than “t+1”). The name of the algorithm was changed from Temporal Pooling (TP) to “Temporal Memory” (TM) some time back. It was originally called TP because the algorithm performed both the memory function that it currently does of connecting active cells at “t” to active cells at “t-1”, as well as a pooling function whereby the predictions would “smear”, looking further and further into the future, by connecting cells to earlier and earlier timesteps the more frequently a learned sequence was encountered.

These two functions were actually always separated from a theoretical perspective (you can find slides in YouTube videos for some of Jeff’s earliest HTM talks back in like 2009, where the memory and pooling functions were called out separately). But for whatever reason, even though they were always understood to be separate functions in the theory, they were combined in the original implementation of the algorithm and it was called Temporal Pooling.

That latter pooling behavior was later removed from the algorithm, leaving only the connections from “t” to “t-1”, and the name of the algorithm was later changed to TM. I only point this out because the term TP still has a place in HTM theory, but it now refers only to the pooling behavior. This unfortunately causes a lot of confusion for folks who are new to HTM and don’t know its history.

Anyway, back to your question: the TM algorithm alone is only able to make predictions one time step into the future. This is to match how the connections work in biology. But to make HTM useful in practical ML applications, a lot of non-biological components are thrown into the mix. One of these is the ability to predict further into the future than t+1.

There are a few different ways to implement this. One way is to hold a table of running averages on each cell for activity at “t+n” after that cell was active (where “n” is determined ahead of time). Since each cell is part of a high-order context, this works quite well (you are NOT asking “what is most likely to happen 25 timesteps after note G”, but something much more specific like “what is most likely to happen 25 timesteps after the 357th note of Beethoven’s 5th”).
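
A rough sketch of one way such a table could look (plain Python, not from NuPIC or htm.core; the class name, the pairwise bookkeeping, and the decay rate are just illustrative choices):

from collections import defaultdict, deque

class FutureTable:
    """Running averages of which cells tend to be active `horizon` steps after
    a given cell was active. Illustrative only; not part of any HTM library."""

    def __init__(self, horizon, decay=0.01):
        self.horizon = horizon                   # the fixed "n" chosen ahead of time
        self.decay = decay                       # exponential moving-average rate
        self.history = deque(maxlen=horizon)     # active-cell sets from the last `horizon` steps
        self.avg = defaultdict(float)            # (cell active then, cell active now) -> average

    def update(self, active_cells):
        active_cells = set(active_cells)
        if len(self.history) == self.horizon:
            past = self.history[0]               # cells active exactly `horizon` steps ago
            for (p, c) in list(self.avg):        # decay every tracked pair for those cells...
                if p in past:
                    self.avg[(p, c)] *= 1.0 - self.decay
            for p in past:                       # ...then push the pairs seen together toward 1
                for c in active_cells:
                    self.avg[(p, c)] += self.decay
        self.history.append(active_cells)

    def predict(self, active_cells, threshold=0.3):
        """Cells whose average activity `horizon` steps after any currently
        active cell exceeds the threshold."""
        active = set(active_cells)
        return {c for (p, c), v in self.avg.items() if p in active and v >= threshold}

# usage each timestep, assuming htm.core where tm.getActiveCells().sparse gives cell indices:
# table = FutureTable(horizon=25)
# table.update(tm.getActiveCells().sparse)
# likely_in_25_steps = table.predict(tm.getActiveCells().sparse)

Because the averages are keyed by individual cells (not columns), each entry already carries the high-order context mentioned above.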

5 Likes

One possible way to forecast beyond t+1 is to assume that the t+1 prediction is correct and feed it back into the system, just to see what it will predict at t+2. Theoretically, you could repeat the process indefinitely, although errors will grow the farther you go.
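
Purely as an illustration, something like this (deep-copying the TM and the two caller-supplied helper functions are assumptions, not an existing API):

import copy

def forecast(tm, predictor, decode, encode_and_pool, n_steps):
    """Pretend each prediction is correct and feed it back in, to peek n steps ahead.
    `decode` (predictor output -> value) and `encode_and_pool` (value -> column SDR)
    are supplied by the caller; copying the TM is an assumption, used so the real
    TM's sequence state is left untouched by the imagined rollout."""
    rollout = copy.deepcopy(tm)
    future_values = []
    for _ in range(n_steps):
        value = decode(predictor.infer(rollout.getActiveCells()))   # t+1, then t+2, ...
        rollout.compute(encode_and_pool(value), learn=False)        # never learn from imagined input
        future_values.append(value)
    return future_values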

2 Likes

Incidentally, this is something we usually refer to as “Temporal Unfolding”. It will most likely become a big part of HTM when we get around to performing actions (for example, a combination of distal and apical input could be a mechanism for activating predictions). Dealing with uncertainty and error correction will also be a big part of temporal unfolding.

2 Likes

Forecasting beyond ‘t+1’ is made possible through the SDR classifier.
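
For example, a toy standalone sketch of just that multi-step readout (in a real pipeline the pattern fed to it would be tm.getActiveCells(); the sequence, SDR size, alpha, and resolution here are only illustrative):

from htm.bindings.sdr import SDR
from htm.bindings.algorithms import Predictor
import numpy as np

predictor = Predictor(steps=[1, 5], alpha=0.1)   # learn readouts for t+1 and t+5
resolution = 1.0                                 # bucket width for the scalar target

sequence = [0, 1, 2, 3, 4, 5, 6, 7]              # values we want to predict 1 and 5 steps ahead
rng = np.random.default_rng(42)
patterns = []                                    # one fixed sparse pattern per sequence position,
for _ in sequence:                               # standing in for the TM's active cells
    sdr = SDR(1000)
    sdr.sparse = sorted(rng.choice(1000, size=20, replace=False).tolist())
    patterns.append(sdr)

count = 0
for epoch in range(10):                          # learn: pattern at step i -> value at i+1 and i+5
    for i, value in enumerate(sequence):
        predictor.learn(count, patterns[i], int(value / resolution))
        count += 1

pdf = predictor.infer(patterns[0])               # dict keyed by step: {1: [...], 5: [...]}
print("t+1:", np.argmax(pdf[1]) * resolution)    # ~1
print("t+5:", np.argmax(pdf[5]) * resolution)    # ~5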

2 Likes

This is really fascinating: for the running-average table solution, are there any libraries in htm.core (or base NuPIC; I should really get around to setting up a good enough Anaconda environment) that can be used for this? I know you mentioned that multi-step prediction is non-biological, but I can’t quite wrap my head around iterating through the TM and assigning/accessing a list of tables (lists?).

Temporal unfolding sounds interesting as well. I’ve sketched out pseudocode for how I imagine it would go:

data = [...]                                 # list of input records
unfolded = {record: [] for record in data}   # input record -> list of t+1 .. t+n outputs
t1_outputs = []

for record in data:                          # training loop
    encoding = encode(record)                # pseudocode: encoder step
    active_columns = spatial_pool(encoding)  # pseudocode: SP step
    tm.compute(active_columns, learn=True)
    output = predictor.infer(tm.getActiveCells())

    futures = []
    next_input = predicted_columns(tm)       # pseudocode: columns the TM predicts for t+1
    for i in range(n):                       # n steps into the future
        tm.compute(next_input, learn=False)  # don't learn from imagined input
        future_output = predictor.infer(tm.getActiveCells())
        futures.append(future_output)
        next_input = predicted_columns(tm)   # feed the prediction back in as the next input
    # (a real implementation would also need to restore the TM state saved before this rollout)

    unfolded[record] = futures               # assign the list of imagined outputs to this input record
    predictor.learn(output)
    t1_outputs.append(output)                # store the real t+1 output, matching index to input

So you end up with a dictionary where the keys are the input data records, and the values are lists of outputs from t+1 to t+n for the ‘imaginary chain’ that results from extending that input out to n. I would assume we wouldn’t want to learn off this imaginary chain (since it’s not real data). I also figure you’d want to do the t+n chaining ‘live’ during each iteration of training, rather than ‘batching’ it later for whatever reason.
It’ll be a bit complex and a tad tricky to figure out the proper unfolding loop, I suppose. Does this seem somewhat on the right track?

1 Like

Right, I sort of answered my own question after browsing the bindings for SDRClassifier, where you can give the classifier a list of ints to serve as ‘steps’ into the future. I completely forgot that htm.core’s hotgym.py example covers this:
predictor = Predictor(steps=[1,5], alpha=[...])
It predicts gym power consumption 1 and 5 steps into the future, which I think I mistook for ‘looking 1 or 5 steps into the past’ - which is actually determined by ‘how many cells per column in TM’. Embarrassing, time to go edit my blog post.

Is this Temporal Unfolding?

1 Like

Thank you, I changed the first post.

2 Likes

To me - one of the key differences between HTM and other predictive methods is that HTM signals novelty.

This seems to be the prime feature - that this is a new thing and it needs attention.

The whole “1 step vs 2 or more steps” seems to be a secondary detail.

3 Likes