A few questions about SP boosting, SP output stability, and TM

Hello! Over the last few weeks I have been diving into TM, and I think I now have a working TM that is slightly different from (more precisely, slightly simpler than) the original TM. I have run a lot of tests and I think my variant works reasonably well: it can distinguish different contexts and can learn in one shot.

But I can't solve a couple of problems:

  1. When I use boosting, my HTM's predictions are very bad.
  2. When I don't use boosting, of course, only a few columns ever become active.

A small note: in this post, when I say “boost” I mean the boosting that is used to select winning columns. I always use overlap boosting, for cases where columns have not overlapped with the input for a long time. Sorry for the confusing terminology.

One of my tests is a stream of random numbers: 500 numbers in the 80-90 range, then 500 numbers in the 20-30 range, repeated in a cycle.
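A minimal sketch of how such a test stream could be generated. The names `cyclicRangeStream` and `take` and the injectable `rng` parameter are hypothetical, chosen for illustration; they are not taken from the poster's code.

```javascript
// Hypothetical helper: yields `blockSize` random numbers from each range in
// turn and cycles forever. `rng` must return a float in [0, 1).
function* cyclicRangeStream(ranges, blockSize, rng = Math.random) {
  while (true) {
    for (const [low, high] of ranges) {
      for (let i = 0; i < blockSize; i++) {
        yield low + rng() * (high - low);
      }
    }
  }
}

// Pull the first `count` values out of a generator.
function take(generator, count) {
  const values = [];
  while (values.length < count) {
    const { value, done } = generator.next();
    if (done) break;
    values.push(value);
  }
  return values;
}

// 500 values in 80-90, then 500 values in 20-30, then the cycle repeats.
const stream = cyclicRangeStream([[80, 90], [20, 30]], 500);
const firstCycle = take(stream, 1000);
```

Passing a seeded `rng` instead of `Math.random` would make runs repeatable, which helps when comparing boosted and non-boosted results on the same input.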

For example, this graph is for non-boosted columns. It's not great (I think because my implementation of the Random Scalar Encoder is terrible), but it has two peaks at the moments when the number range changes, and that's good!

And here is the boosted version. It doesn't work at all.

For boosting I use a formula I found on this forum:
Math.log(activity) / Math.log(0.08)
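Read as a boost factor, that expression could be wrapped as below. This is only a sketch under assumptions the post does not confirm: `activity` is taken to be a column's active duty cycle in (0, 1], 0.08 is taken to be the target activity, and the boost is assumed to multiply the raw overlap. The names `boostFactor` and `boostedOverlap` and the `minActivity` floor are hypothetical.

```javascript
// Boost factor from the formula above: log(activity) / log(target).
// It equals 1 when activity === target, grows as activity falls below the
// target, and approaches 0 as activity approaches 1.
function boostFactor(activity, target = 0.08, minActivity = 1e-6) {
  // Assumption: floor the activity so a column that has never been active
  // gets a large but finite boost (Math.log(0) would give Infinity here).
  const clamped = Math.min(1, Math.max(activity, minActivity));
  return Math.log(clamped) / Math.log(target);
}

// Assumed usage: scale a column's raw overlap before choosing winners.
function boostedOverlap(rawOverlap, activity) {
  return rawOverlap * boostFactor(activity);
}
```

Under these assumptions a column with activity 0.01 gets a factor of roughly 1.8, while a column that is active half the time is scaled down to well below 1.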

My first question: how should I boost columns so that they are used more efficiently, while keeping the predictions stable?

Another problem with my TM is the “LEARNING_TRESHOLD” param. Like the original TM, my TM has this param too, and... how do I determine its value? After many experiments with my small HTM (12x12 region, 4 cells per column, at most 5 active columns) I set it to 3. If a segment contains >= 3 active cells, it can be connected to the new currently active cell when a column bursts.
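The threshold rule described above can be sketched as follows. It assumes a segment is stored as a list of presynaptic cell indices and the active cells as a `Set`; those data structures and the function names are assumptions for illustration, not the poster's actual code.

```javascript
// Value from the post; the identifier is spelled as in the post.
const LEARNING_TRESHOLD = 3;

// Count how many of a segment's presynaptic cells are in the active set.
function activeSynapseCount(segmentCells, activeCells) {
  let count = 0;
  for (const cell of segmentCells) {
    if (activeCells.has(cell)) count++;
  }
  return count;
}

// A segment qualifies for learning when at least `threshold` of its
// presynaptic cells are active.
function isLearningCandidate(segmentCells, activeCells, threshold = LEARNING_TRESHOLD) {
  return activeSynapseCount(segmentCells, activeCells) >= threshold;
}
```

Keeping `threshold` as a parameter makes it easy to experiment with other values, for example one derived from the number of active columns.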

My second question: is it possible to calculate the LEARNING_TRESHOLD param dynamically, for any size of HTM region and any number of cells per column?


Hey @bizunow13,

Congrats on your progress here!

Have you tested your system without boosting and with the generic NuPIC param values?

Boosting adds complexity, brought on by different sets of columns activating from the same raw inputs at different times. This is inherently de-stabilizing, since it creates more entropy for the TM to handle, which means it’ll take longer for the anomaly scores to settle down (assuming there are cyclical patterns in the data).

Boosting is a good optimization and I don’t discourage you from it, though I would definitely test on a known data set with common param values, to ensure your TM is behaving as it should.

I’d also suggest using a Scalar Encoder instead of the RDSE (Random Distributed Scalar Encoder), since it’s simpler and easier to implement.

The hot gym data set is well studied, so you could compare your TM’s anomaly scores with NuPIC:

There’s no golden value that’ll be ideal for all data sets; it depends on how many times a pattern should repeat before it starts to be predicted. It’s really about the relation between connectedPerm, permanenceInc, permanenceDec and initialPerm. It sounds like your system is set up for one-shot learning, which means that:

initialPerm + permanenceInc >= connectedPerm
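To make that inequality concrete, here is a small sketch that checks it and counts how many reinforcements a newly created synapse needs before it counts as connected. The function names are hypothetical, the numbers in the note below are illustrative only, and permanenceDec is deliberately ignored (the count assumes the synapse is reinforced every time).

```javascript
// True when a synapse created at initialPerm is connected after a single
// reinforcement, i.e. initialPerm + permanenceInc >= connectedPerm.
function isOneShot(initialPerm, permanenceInc, connectedPerm) {
  return initialPerm + permanenceInc >= connectedPerm;
}

// Number of consecutive reinforcements a new synapse needs before its
// permanence reaches connectedPerm (decrements are not modeled).
function reinforcementsUntilConnected(initialPerm, permanenceInc, connectedPerm) {
  if (initialPerm >= connectedPerm) return 0;
  return Math.ceil((connectedPerm - initialPerm) / permanenceInc);
}
```

For example, with initialPerm = 0.21, permanenceInc = 0.1 and connectedPerm = 0.5 a new synapse needs three reinforcements, so learning is not one-shot; raising initialPerm to 0.45 satisfies the inequality and brings the count down to one.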

I’ve never heard of this being done, but I don’t see why not. You just need a logic to enact it, which I’d be curious to see, ideally along with some measures of its effect on system behavior.

Thank you! Thanks also to everyone who helps investigate HTM =)

A small question: are the generic NuPIC params 256 input bits, 2048 columns and 4 cells per column with 2% max sparsity? I tested my HTM implementation with these params (after a little tuning). It works, but it is slow, which is why I use small regions for tests.

Yes, that's my next goal. I need to understand how to produce exactly the same SDR representation of the hot gym data as NuPIC/HTM Studio. Unfortunately, I don't know where to find this information; from a quick look at the NuPIC code, I didn't find anything to start from.

Sorry for the long answer.

Those can be found here. There’s a set of generic parameter values that have been found to work well across many data sets, though there’s no guarantee they’re the best possible for any given one.

So the SDR at each time step is the output of the Spatial Pooler (SP). The input to the SP is the encoding vector, which is a concatenation of the outputs of all encoders. These encoders are set in that params file here:

There are 3 encoders in total here, each producing a binary vector: 1 for the metric value itself (“kw_energy_consumption”) and 2 timestamp-related ones (“timestamp_timeOfDay” and “timestamp_weekend”). These 3 binary vectors are combined by the MultiEncoder into a single encoding vector. This vector is the input to the SP, which maps it to an activation of 40 columns out of 2048 (about 2% sparsity). The indices of these 40 columns are then input to TM.
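As a rough sketch of the concatenation step and the sparsity figure (this is not NuPIC's MultiEncoder code; the function names and the tiny example vectors are made up for illustration, and real encoder outputs are much wider):

```javascript
// Concatenate several binary encoder outputs into one encoding vector.
function concatEncodings(...encodings) {
  return encodings.flat();
}

// Fraction of active units, e.g. 40 active columns out of 2048.
function sparsity(activeCount, totalCount) {
  return activeCount / totalCount;
}

// Toy vectors standing in for the three encoder outputs.
const consumptionBits = [0, 1, 1, 0];
const timeOfDayBits = [1, 1, 0];
const weekendBits = [0, 1];

const encoding = concatEncodings(consumptionBits, timeOfDayBits, weekendBits);
console.log(encoding.length);    // 9
console.log(sparsity(40, 2048)); // 0.01953125
```

The order of the concatenation has to stay the same at every time step, otherwise the same raw input would land on different input bits.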

If you want to see how those parameter values govern the behavior of the SP and TM, here’s the source for each:

My advice would be to remove as much complexity as possible for testing, by bypassing the SP entirely. You can take those binary encoder vectors and feed them right into TM. I say this because TM learning is (IMO) the most special and powerful aspect of HTM. Once you verify that your TM implementation is working (which I’d do by comparing it to NuPIC’s), then try adding in an SP.
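One way to wire that up, as a sketch: treat each on bit of the encoder output as an active column and pass the resulting indices to the TM. The `tm.compute(...)` call in the comment is a placeholder for whatever interface your own TM exposes, not a reference to a specific library, and `activeIndices` is a hypothetical helper name.

```javascript
// Return the indices of the 1-bits in a binary vector, in ascending order.
function activeIndices(bits) {
  const indices = [];
  bits.forEach((bit, index) => {
    if (bit === 1) indices.push(index);
  });
  return indices;
}

// Toy encoder output; with the SP bypassed, each input bit acts as a column.
const encoderOutput = [0, 1, 1, 0, 0, 1];
const activeColumns = activeIndices(encoderOutput);

// Hypothetical call into your own TM implementation:
// tm.compute(activeColumns, true);
```

With this setup the TM needs as many columns as the encoding has bits, and the number of active columns per step is set by the encoders rather than by the SP.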

That’s just my view, hope it helps.