Recoding htm.core's hotgym.py (I'm making many errors)

After my seemingly silly troubles installing htm.core (I ran pip install cmake, then pip install htm.core worked fine) and watching all of HTM School, I figured a good place to start would be re-coding the Python examples cell by cell in Jupyter, so I can understand each step of the process: what each object does and why, the parameters, etc.

The MNIST example works beautifully, and I gained a good understanding of how the shapes, spatial pooler, and other pieces fit together.

Since HTM's real strength is temporal data, I next tried recreating the hotgym example cell by cell. It works great up until the training loop, where it fails with:


RuntimeError                              Traceback (most recent call last)
<ipython-input> in <module>
     28
     29     # make prediction, then train the predictor accordingly
---> 30     pdf = predictor.infer( tm.getActiveCells() )
     31     for n in (1,5):
     32         if pdf[n]:

RuntimeError: CHECK FAILED: "dimensions_ != 0" Classifier: must call learn before infer.

This same error occurs both when I run the original hotgym.py directly from the cloned repo and in my retyped Jupyter version. Since the error points at predictor.infer(), I called help(predictor), as I couldn't find a matching page in the docs; I've attached most of the help output as an image.

It lists some example usage of Predictor objects:

Example Usage:
| # Predict 1 and 2 time steps into the future.
|
| # Make a sequence of 4 random SDRs, each SDR has 1000 bits and 2% sparsity.
| sequence = [ SDR( 1000 ).randomize( 0.02 ) for i in range(4) ]
|
| # Make category labels for the sequence.
| labels = [ 4, 5, 6, 7 ]
|
| # Make a Predictor and train it.
| pred = Predictor([ 1, 2 ])
| pred.learn( 0, sequence[0], labels[0] )
| pred.learn( 1, sequence[1], labels[1] )
| pred.learn( 2, sequence[2], labels[2] )
| pred.learn( 3, sequence[3], labels[3] )
|
| # Give the predictor partial information, and make predictions
| # about the future.
| pred.reset()
| A = pred.infer( sequence[0] )
| numpy.argmax( A[1] )  ->  labels[1]
| numpy.argmax( A[2] )  ->  labels[2]
|
| B = pred.infer( sequence[1] )
| numpy.argmax( B[1] )  ->  labels[2]
| numpy.argmax( B[2] )  ->  labels[3]

I notice here that it does indeed call Predictor.learn() before Predictor.infer(), so I see why the error gripes at me so; however, I'm not sure where to fit a predictor.learn() call into hotgym.py's training loop.
Considering the example code, I wonder whether predictor.learn()'s job is already being handled somewhere within the SpatialPooler.compute() and subsequent TemporalMemory.compute() calls in the loop, both of which take the parameter learn=True.
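That said, rereading the docstring suggests the check is purely about initialization: a single learn() call sets the classifier's internal dimensions, after which infer() works. A minimal sketch (toy values of my own, not from hotgym.py):

from htm.bindings.sdr import SDR
from htm.bindings.algorithms import Predictor

p = Predictor( steps=[1], alpha=0.1 )
sdr = SDR( 1000 ).randomize( 0.02 )

# p.infer( sdr )      # RuntimeError: must call learn before infer
p.learn( 0, sdr, 0 )  # one learn() call initializes the classifier
pdf = p.infer( sdr )  # now infer() succeeds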

I'm tantalizingly close to recreating this interesting HTM time-series analysis, but this is a strange sticking point that I haven't seen mentioned on the forums. Is there something wrong with my install, perhaps? For reference, I'm on macOS Mojave 10.14 with Python 3.7.4, installed with pip according to the community instructions.

Any insight on where I’ve gone astray (or where to go from here!) would be greatly welcomed.


I have adapted the Jupyter notebook found at
https://github.com/psdyer/NuPIC-Algorithm-API-Example
to htm.core. Here is a simple example of using Predictor():

import numpy as np
from htm.bindings.sdr import SDR
from htm.bindings.algorithms import Predictor

# Try varying the number of steps and alpha
num_of_steps = 1
alpha = 0.01

predictor = Predictor( steps=[num_of_steps], alpha=alpha )
#classifier = SDRClassifierFactory.create(steps=[num_of_steps], alpha=alpha)   # old NuPIC equivalent

# Make a sequence of three SDRs
sequence = []
for active in ([602, 1449, 1810, 2009],
               [377, 570, 901, 2005],
               [87, 232, 516, 1270]):
    sdr = SDR((2048, 1))
    sdr.sparse = active
    sequence.append(sdr)

labels = [1, 5, 9]      # Category labels for the sequence

print(sequence[0])
print(sequence[1])
print(sequence[2])

N = 1000
predictor.reset()
for i in range(3*N):    # loop over the 3-element sequence N times
    predictor.learn( i, sequence[i%3], labels[i%3] )

# Give the predictor partial information, and make predictions
predictor.reset()
a = predictor.infer( sequence[0] )
print("Prediction for sequence 0")
print( a[1] )
print(np.argmax( a[1] ) )

a = predictor.infer( sequence[1] )
print("Prediction for sequence 1")
print( a[1] )
print(np.argmax( a[1] ) )

a = predictor.infer( sequence[2] )
print("Prediction for sequence 2")
print( a[1] )
print(np.argmax( a[1] ) )
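For what it's worth, my reading of the docstring example is that the returned PDF is indexed by category label, so here:

a = predictor.infer( sequence[0] )
print( len(a[1]) )        # max(labels) + 1 == 10 here, I believe
print( np.argmax(a[1]) )  # 5: the label one step ahead in the repeating sequence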

Hey Johan, thanks for the example, lovely stuff. It makes sense that we teach the predictor (which is a neural net used to convert the TM's final-state SDR into a prediction, I think?) using .learn() with some data plus a label at each iteration, and that .infer() can then be used to predict outputs from a given input state.

Everything I mention from here out is up on my github as ‘alternate_hotgym.ipynb’:

The hotgym.py code is a little different, it seems; it calls predictor.infer() at each training iteration to store the prediction for use later in anomaly detection (power consumption spikes, etc.). This was giving me the aforementioned "must call learn before infer" error.

I was able to sidestep this error by adding an altered copy of the training loop immediately before the original loop; in the copy I added a predictor.learn() call right after tm.compute() and ended the loop there:

predictor = Predictor(steps=[1,5], alpha=parameters['predictor']['sdrc_alpha'])
predictor_resolution = 1
inputs = []
anomaly = []
anomalyProb = []
predictions = {1: [], 5:[]}
for count, record in enumerate(records): # iterate through listified CSV
    dateString = datetime.datetime.strptime(record[0], '%m/%d/%y %H:%M')
    consumption = float(record[1])
    inputs.append(consumption)
    # encoders at work
    dateBits = dateEncoder.encode(dateString)
    consumptionBits = scalarEncoder.encode(consumption)
    # concatenate these bad boys into one composite date + consumption input
    encoding = SDR(encodingWidth).concatenate([consumptionBits, dateBits])
    enc_info.addData(encoding) # keep track in metrics
    # create an SDR for active columns, same dimensions as SP
    activeColumns = SDR(sp.getColumnDimensions())
    # hurl input into the pool
    sp.compute(encoding, True, activeColumns)
    tm_info.addData(tm.getActiveCells().flatten())
    # NB: since tm.compute() hasn't run yet this iteration, this records the
    # *previous* step's active cells (upstream hotgym.py adds this after compute)
    tm.compute(activeColumns, learn=True)
    # let's change things up and try to just let the predictor learn before making any predictions
    predictor.learn(count, tm.getActiveCells(), int(round(consumption/predictor_resolution)))

The second (original) training loop is the same up through tm.compute(); after that, instead of the immediate predictor.learn(), it calls predictor.infer(), with the original .learn() right at the end:

for [...]:
    pdf = predictor.infer( tm.getActiveCells() )
    for n in (1,5):
        if pdf[n]:
            predictions[n].append( np.argmax( pdf[n] ) * predictor_resolution )
        else:
            predictions[n].append(float('nan'))
    
    anomalyLikelihood = anomaly_history.anomalyProbability( consumption, tm.anomaly )
    anomaly.append(tm.anomaly)
    anomalyProb.append(anomalyLikelihood)
    # original position of learn()
    predictor.learn(count, tm.getActiveCells(), int(consumption/predictor_resolution))

This works… well, maybe. I’ve attached the output graph.


Had a bothersome error where it was predicting all 0's for anomalies, and then I realized I'd somehow deleted the tm.compute() from the second (original) training loop.
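An alternative occurred to me that might avoid the duplicated loop entirely: guard the first infer() so that learn() has always run at least once. A sketch (untested, my own guard, inside the same single loop body as above):

    pdf = predictor.infer( tm.getActiveCells() ) if count > 0 else {}
    for n in (1,5):
        if n in pdf and len(pdf[n]) > 0:
            predictions[n].append( np.argmax( pdf[n] ) * predictor_resolution )
        else:
            predictions[n].append(float('nan'))
    predictor.learn(count, tm.getActiveCells(), int(consumption/predictor_resolution))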

There’s also some oddity with the prediction-shift loop:

# Shift the predictions so that they are aligned with the input they predict.
for n_steps, pred_list in predictions.items():
    for x in range(n_steps):
        pred_list.insert(0, float('nan'))
        pred_list.pop()
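Tracing it by hand with toy numbers (my own, not from the data), the shift just realigns each prediction with the input it was predicting:

# Toy trace for n_steps == 1:
pred_list = [10.0, 20.0, 30.0]     # pred_list[t] was made at time t, about t+1
pred_list.insert(0, float('nan'))  # nothing predicted the very first input
pred_list.pop()                    # the last prediction points past the data
# pred_list == [nan, 10.0, 20.0]: pred_list[t] now lines up with inputs[t]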

This inserts some 0's and NaNs into the predictions, and later the accuracy calculation crashes with a divide-by-zero:

acc_lists = {1: [], 5: []} # i added this one for graphing
# calculate predictive accuracy, RMS
accuracy         = {1: 0, 5: 0}
accuracy_samples = {1: 0, 5: 0}
for idx, inp in enumerate(inputs):
    for n in predictions: # for each [N]umber of timesteps ahead which was predicted
        val = predictions[n][idx]
        if not math.isnan(val):
            accuracy[n] += (inp - val) ** 2 # RMSE
            accuracy_samples[n] += 1
    for n in sorted(predictions):
#         print("accuracy[n]: ",accuracy[n], "    accuracy_samples[n]:  ",accuracy_samples[n])
        accuracy[n] = (accuracy[n] / accuracy_samples[n]) ** .5
#         print("Predictive Error (root-mean-squared): ", n, "steps ahead:", accuracy[n])
#         print()
        acc_lists[n].append(accuracy[n]) # add to lists
        
    print('Anomaly Mean: ', np.mean(anomaly))
    print('Anomaly Std: ', np.std(anomaly))

The div0 seems to happen near the end, with accuracy[n] = (accuracy[n] / accuracy_samples[n]) ** .5.
I tried skipping this loop entirely in the run, and it… kinda looks like it works. Dunno.
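If I restructure it myself, I suspect two problems: accuracy_samples[n] can still be 0 when the nested finalization runs (everything is NaN early on after the shift), and reassigning accuracy[n] to its square root inside the loop corrupts the running sum for later iterations. An untested sketch using the same variables, reinitialized:

accuracy         = {1: 0, 5: 0}
accuracy_samples = {1: 0, 5: 0}
acc_lists        = {1: [], 5: []}
for idx, inp in enumerate(inputs):
    for n in predictions:
        val = predictions[n][idx]
        if not math.isnan(val):
            accuracy[n] += (inp - val) ** 2
            accuracy_samples[n] += 1
    for n in sorted(predictions):
        if accuracy_samples[n] > 0:
            rmse = (accuracy[n] / accuracy_samples[n]) ** 0.5  # RMSE into its own variable
        else:
            rmse = float('nan')  # no valid predictions yet
        acc_lists[n].append(rmse)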

I’m not sure of the ramifications of my two workarounds; perhaps the added pre-training loop induces overfitting, perhaps the lack of prediction-shift means the predictions are misaligned and useless? Not sure.

If anyone's got feedback or ideas on how to improve this, please throw them my way. Looking through NuPIC's docs, with ModelFactory and such, it feels like the Keras to htm.core's PyTorch; though I certainly can't complain about learning the more granular workings of HTM code.


That figure is a nice benchmark comparison for my own project! Could I trouble you to post a version with the anomaly likelihoods as well?


It’s perhaps easiest if you look at the Jupyter notebook that has been adapted to htm.core.
Here is a link:
https://drive.google.com/file/d/10zGB_1gx8lTa4I3MnPeZ0n_Y1qx9xcCa/view?usp=sharing

To run:

  • Put the notebook in the root directory of htm.core
  • Activate your Python environment for htm.core
  • From the command line, execute "jupyter notebook"

I'm adding this to my standing TODOs; an "official" Jupyter notebook for htm.core would definitely be nice to play with and explore.


Hello,

I am attempting to understand the results I am seeing in my own example. I have an integer sequence, 20 elements long, and I create an RDSE encoding (an SDR of size 10 with 3 active bits) for each element. The TM has 50 columns with 8 cells each.

I am wondering why predictor.infer returns predictions that are not the same size as the input data or the dimensions of the TM architecture (I attempt to predict the next two steps and get lists 113 elements long). I saw it mentioned in this thread that a small network might be used to make the predictions, but how is the output size determined? And why must we look at the index of the maximum value of the predictor.infer output?

Thank you for your consideration and time!