TM predictions coming out as just 0


I have a sequence s = [0, 0, 4, 7, 8, 4, 6, 8, 8, 9, 9, 5, 2, 1, 5, 3, 1, 1, 0, 0, 4, 7, 8, 4, 6, 8, 8, 9, 9, 5, 2, 1, 5, 3, 1, 1, 0, 0, 4, 7, 8, 4, 6, 8, 8, 9, 9, 5, 2, 1, 5, 3, 1, 1, 0, 0, 4, 7, 8, 4, 6, 8, 8, 9, 9, 5, 2, 1, 5, 3, 1, 1, 0, 0, 4, 7, 8, 4, 6, 8, 8, 9, 9, 5, 2, 1, 5, 3, 1, 1, 0, 0, 4, 7, 8, 4, 6, 8, 8]. I encoded it using the htm.core rdse encoder

from htm.bindings.encoders import RDSE, RDSE_Parameters

params = RDSE_Parameters()
params.activeBits = 3
params.radius = 5
params.size = 10
rdseEncoder = RDSE(params)

I am using the example to train SP/TM architecture and predict next step (I’m actually only using about 80% of the elements in s so I can check how accurate predictions are to the truth). Here is what I have for SP and TM set-up

sp = SP(inputDimensions  = (rdseEncoder.size,),
    columnDimensions = (50,),
    localAreaDensity = 0.02,
    globalInhibition = True,
    synPermActiveInc   = 0.01,
    synPermInactiveDec = 0.008)
tm = TM(columnDimensions = (50,),
    maxSegmentsPerCell = 1,
    maxSynapsesPerSegment = 1)

and to train the model I have

predictor = Predictor([1,2])
for epoch in range(10):

    # if run without the reset, eventually get the following error
    # RuntimeError: Exception: SDRClassifier.cpp(228)
    #   message: CHECK FAILED: "recordNum >= lastRecordNum"
    #   The record number must increase monotonically.
    for count, elem in enumerate(train_set):

        rdseSDR = rdseEncoder.encode(elem)
        activeColumns = SDR( dimensions = tm.getColumnDimensions() )

        sp.compute(rdseSDR, True, activeColumns)
        tm.compute(activeColumns, learn=True)

        tm_actCells = tm.getActiveCells()
        predictor.learn(count, tm_actCells, int(elem) )

However, when I print active cells, winner cells, and predictive cells for each iteration, predictive cells always come out as all 0’s. I have attempted to modify the parameters for the spatial pooling and temporal memory architecture, but it doesn’t seem to have much of an effect on the predictions. While I expect some wrong predictions at first, I do not expect just 0’s every time.

I would appreciate it greatly if someone points out the issue in my code that’s causing this. Or, maybe explain what could be causing the issue and why? I am new to implementing the HTM to data not already in example code.


Hey @aklandi,

First thing I’d try is a simple and perfectly predictable pattern, like maybe:
s = [0,1,2,3,0,5,8,3,0,1,2,3,0,5,8,3,…]
With this pattern you should start to see more predicted activity from 0 than from 3 – since 0 is followed by 2 different values.

Also, that RDSE radius seems high given what your raw input values are.
A radius of 5 means that any 2 inputs within 5 of each other will have mostly identical encodings.
So I’d drop that significantly, maybe to 1 or so.
You could validate this by comparing the encoding vectors from each input – checking that there’s more overlap between 0 and 1 than between 0 and 8.
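To make that check concrete, here’s a minimal sketch of the overlap comparison in plain Python, using sets of active-bit indices as stand-ins for the encoder output (the specific indices below are made up for illustration; with htm.core you’d compare the actual encodings from rdseEncoder.encode(x) instead):

```python
def overlap(a, b):
    """Overlap = number of active bits two encodings share."""
    return len(set(a) & set(b))

# Hypothetical active-bit index sets, standing in for real RDSE encodings.
enc_0 = {0, 1, 2}   # pretend encoding of 0
enc_1 = {1, 2, 3}   # pretend encoding of 1 (nearby value, large overlap)
enc_8 = {7, 8, 9}   # pretend encoding of 8 (distant value, little or no overlap)

print(overlap(enc_0, enc_1))  # 2 -- nearby values share most of their bits
print(overlap(enc_0, enc_8))  # 0 -- distant values share none
```

If the radius is set sensibly, the overlap between the encodings of 0 and 1 should be clearly larger than the overlap between 0 and 8.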

And to check the predicted activity I’d just use:
tm_predCells = tm.getPredictiveCells()
This returns an SDR covering all cells in the TM; its dense form has a 1 for every predictive cell and a 0 for the rest.

I’m not sure what Predictor is, but if it’s a Classifier, it has its own learning process, which adds complexity. For debugging I’d strip away all unnecessary complexity to start.

This looks like a problem. You’ve got 50 columns and 2% of them are active at any given moment, which yields just 1 active column! That is not enough for an HTM system to work. HTM systems rely on having very many columns; it’s part of how they work internally.

Try increasing the number of mini-columns to over 1,000.
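To put numbers on that, here’s the sparsity arithmetic (the 2048-column figure is a common NuPIC-era default, not something from the code above):

```python
def active_columns(num_columns, sparsity):
    """Columns the SP activates per timestep under global inhibition (truncated)."""
    return int(num_columns * sparsity)

print(active_columns(50, 0.02))    # 1 active column -- far too few for the TM to learn from
print(active_columns(2048, 0.02))  # 40 active columns -- a properly distributed SDR
```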

PS: I also recommend increasing the size of the encoder by a lot. RDSE stands for “random distributed scalar encoder”, and if it has too few bits it stops working correctly, because two inputs can get randomly mapped to the same outputs. If there are many bits in the RDSE this is astronomically unlikely, but with only 10 bits it’s quite possible. Also, as sheiser pointed out, a radius of 5 is probably too large for this task.


@sheiser1, thanks! What you are saying about the RDSE radius makes sense. I think I was working with a sequence with large deviations between elements, and I forgot to modify for the example I posted. Sorry!

I am printing tm.getActiveCells() and tm.getWinnerCells() as well as tm.getPredictiveCells() during each iteration (as suggested by the htm.core example). This is actually how I am able to determine that my predictions are 0’s.

I will test more parameter values with respect to @dmac’s recommendations, but an increase to size 50 for the SDR and to 1000 columns seems to only return 0’s in the predictive cells. The original sequence s is oscillating (and restarts its cycle every 19th element), so I just shortened it for testing the suggestions @sheiser1 and @dmac made, but I’m wondering if there is a better way to debug/test since the dimensions are increasing significantly at this point?

Thanks again for your suggestions! And explanations. I really appreciate it.


I’d recommend printing the SP active column indices. I think that’d be more helpful than the active & winner cells.
If the SP is behaving right, similar inputs (like 0 & 1) should have significant overlap, while dissimilar inputs should have much less or no overlap.
This is the same with the encoders, though it’s the SP active column indices that actually get passed into the TM.

If these overlaps are found to make sense, I’d ensure the noiseless pattern is repeated enough times.
For a synapse to become connected it must reach the threshold of 0.2.
Each synapse starts at 0.1, so given an increment of 0.15, the transition represented by a given synapse should only need to repeat once to cross the threshold (0.1 + 0.15 = 0.25).
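As a sanity check on that arithmetic, here’s a tiny sketch counting how many reinforced repeats a synapse needs before its permanence crosses the connected threshold, using the values above (start 0.1, increment 0.15, threshold 0.2):

```python
def repeats_to_connect(initial, increment, threshold):
    """Count reinforced repeats needed for a synapse's permanence to reach threshold."""
    permanence, repeats = initial, 0
    while permanence < threshold:
        permanence += increment
        repeats += 1
    return repeats

print(repeats_to_connect(0.1, 0.15, 0.2))  # 1 -- a single repeat (0.1 + 0.15 = 0.25) crosses 0.2
```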

To test this, I’d use a noiseless sequence of maybe 5 elements, with different amounts of overlap expected between them, say:
s = [0,1,3,8,5,0,1,3,8,5,…]

If the SP is behaving right and the TM has these params, predictions should start popping out after just a couple repeats.

Also I think @dmac is right about columnDimensions = (50,) being too small. I’m used to 2048 from NuPIC, so I’d probably put it there, which at 2% sparsity yields 40 active SP columns per timestep. This ensures that the SDR vector is big enough to distribute its meaning across the vector, making it more robust to noise.

Hello @sheiser1, the tip to look into the SP active columns helped a lot! I’m not sure I actually printed them correctly (because I’m getting 0’s here), but it did force me to look more into other details I thought were not causing the issue. I finally figured out that the real issue was that my TM initialPermanence was too low. When I set this parameter to 0.5 and connectedPermanence to 0.6 (since that’s within one permanenceIncrement of initialPermanence), I received predictions that are nonzero and a tm.anomaly other than 1.0. But this only happens when activationThreshold = 1.

I’m not sure I completely understand why this would cause an issue if connectedPermanence is within 0.1 of initialPermanence. According to Cui, Ahmad, & Hawkins in The HTM Spatial Pooler – A Neocortical Algorithm for Online Sparse Distributed Coding, the permanence values are used to compute the overlap score. Would too low a permanence value cause the overlap score to be insufficient for a column to become active? But I’m not sure this explanation makes sense to me.

Here he’s talking about the SP, where the permanences link the columns to the encoding bits.

The params you’re talking about here (initialPermanence, permanenceIncrement, connectedPermanence, activationThreshold) are for the TM, which chooses which cell(s) within an activated column to activate.

It shouldn’t matter how low the initialPermanence is, only its relation to connectedPermanence and permanenceIncrement/permanenceDecrement. At the values you have in your first post, it should only take 1 instance of a transition for the permanences to form and a prediction to be made.

To get to the bottom of this, I’d recommend bypassing the SP entirely, and just feed the encoding vector into the TM – so:

tm = TM(columnDimensions = (rdseEncoder.size,))
tm.compute(activeEncodingBits, learn=True)

The SP has its own learning process, which adds complexity. I often skip SP and find it learns faster.
