CLA Classifier

rhyolight · April 6, 2017, 9:58pm

NOTE: This is a historical document. We prefer to use the SDRClassifier (video).

The CLA Classifier

The CLA Classifier is not biologically inspired, but is a useful tool for interpreting the SDR output from the temporal memory and generating predictions. Essentially, it attempts to learn a function of an SDR at time $t$ ( $SDR_t$ ), such that it produces a probability distribution over the predicted field ( $PF$ ), $k$ steps into the future:

$f(SDR_t) \rightarrow P(PF_{t + k})$

The CLA Classifier takes the following parameters:

alpha: The value used to compute the moving average. Lower alpha values give a longer memory.
steps: The set of steps into the future that the classifier will learn and predict for, e.g. (1,3,7,12).
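To make the effect of alpha concrete, here is a minimal sketch of an exponential moving average using the same update form as the classifier's moving average. The function name `moving_average` and the choice to seed the average with the first sample are assumptions for illustration, not NuPIC code.

```python
def moving_average(values, alpha):
    """Exponential moving average: avg = (1 - alpha) * avg + alpha * v."""
    avg = values[0]  # assumption: seed the average with the first sample
    for v in values[1:]:
        avg = (1.0 - alpha) * avg + alpha * v
    return avg

# A signal that sits at 0 for 20 samples, then jumps to 10 for the last 5.
signal = [0.0] * 20 + [10.0] * 5

slow = moving_average(signal, alpha=0.1)  # low alpha: old zeros still weigh in
fast = moving_average(signal, alpha=0.9)  # high alpha: dominated by recent 10s
print(slow, fast)
```

With the low alpha the average is still well below 10 after the jump, because older samples keep contributing; with the high alpha it has almost fully caught up.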

To do this, for each predicted step ( $k$ ), the CLA Classifier maintains a mapping of:

$f(SDR_{t-k}) \rightarrow PF_t$
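One way to picture the mapping above is as a stream of training pairs, each made of the SDR seen $k$ steps ago and the bucket observed now. The sketch below is only an illustration under that reading; `pair_with_past` is a hypothetical helper and SDRs are represented as tuples of active-bit indices.

```python
from collections import deque

def pair_with_past(sdrs, buckets, k):
    """Return (SDR seen k steps ago, bucket observed now) training pairs."""
    history = deque(maxlen=k + 1)  # holds SDR_{t-k} .. SDR_t
    pairs = []
    for sdr, bucket in zip(sdrs, buckets):
        history.append(sdr)
        if len(history) == k + 1:
            # history[0] is the SDR from k steps before the current bucket
            pairs.append((history[0], bucket))
    return pairs

sdrs = [(0, 3), (1, 4), (2, 5), (0, 4)]  # active-bit indices per time step
buckets = [7, 8, 9, 7]                   # bucket of the predicted field per step
pairs = pair_with_past(sdrs, buckets, k=2)
print(pairs)
```

No pair is produced for the first $k$ steps, since there is no SDR that far back yet.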

This mapping essentially stores a history of the input SDRs it has seen, so that, given an input, it can refer to the history and determine the probability distribution over the PF. It does this by:

Storing two arrays H and A, with shape N x B, where N is the number of bits in the SDR and B is the number of buckets on the PF as defined by the input encoding:
H: A histogram that stores the relative frequency of each input bucket (b) when its corresponding SDR bit (n) is active. That is:
$H[n][b] = \frac{\text{times bucket}~b~\text{was seen when}~n~\text{was active}}{\text{times}~n~\text{was active}}$
A: A moving average of the input values, whose effective memory is set by alpha. When this array’s corresponding SDR bit n is active with a given predicted field value v that falls into bucket b, the array is updated by:

$A[n][b] =((1- \alpha) \times A[n][b]) + \alpha \times v$

This ensures that when a bucket covers a range of values (i.e. non-categorical values), we don’t get a prediction about a particular range, but rather the average value that fell into that bucket. If we are predicting a categorical value, the moving average array is omitted.
For a given input SDR of length $N$ with $N'$ active bits, predictions are generated for each bucket (b) of the predicted field, for each prediction step (k), by averaging the product of the associated histogram value and moving average value over the active bits:

$P(PF_{t+k}) = \left\{ \frac{1}{N'}\sum_{SDR[n]=1}A[n][b]\times H[n][b] : b \in [1,B] \right\}$
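The two tables and the formula above can be sketched for a single prediction step as follows. This is a toy illustration, not the NuPIC implementation: the class name `ToyClassifier`, its method names, the zero initialization of A, and the handling of an empty SDR are all assumptions. It follows the formula as written, so the per-bucket results are not normalized to sum to 1.

```python
class ToyClassifier:
    """Toy H/A tables for one prediction step (hypothetical, for illustration)."""

    def __init__(self, num_bits, num_buckets, alpha):
        self.alpha = alpha
        self.num_buckets = num_buckets
        # Raw counts from which the histogram H is derived on demand.
        self.bucket_counts = [[0] * num_buckets for _ in range(num_bits)]
        self.active_counts = [0] * num_bits
        # Moving average table A; starting at 0.0 is an assumption.
        self.A = [[0.0] * num_buckets for _ in range(num_bits)]

    def H(self, n, b):
        """Relative frequency of bucket b when bit n was active."""
        if self.active_counts[n] == 0:
            return 0.0
        return self.bucket_counts[n][b] / self.active_counts[n]

    def learn(self, active_bits, bucket, value):
        """Update both tables for an SDR paired with the value it should predict."""
        for n in active_bits:
            self.active_counts[n] += 1
            self.bucket_counts[n][bucket] += 1
            self.A[n][bucket] = ((1 - self.alpha) * self.A[n][bucket]
                                 + self.alpha * value)

    def predict(self, active_bits):
        """Average A[n][b] * H[n][b] over the active bits, for every bucket b."""
        if not active_bits:
            return [0.0] * self.num_buckets
        return [
            sum(self.A[n][b] * self.H(n, b) for n in active_bits) / len(active_bits)
            for b in range(self.num_buckets)
        ]


clf = ToyClassifier(num_bits=4, num_buckets=2, alpha=0.5)
clf.learn(active_bits=[0, 1], bucket=0, value=2.0)
clf.learn(active_bits=[1, 2], bucket=1, value=8.0)
scores = clf.predict(active_bits=[0, 1])
print(scores)
```

Bit 1 was active for both training samples, so it contributes to both buckets, while bit 0 only supports bucket 0.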

Thus we have a probability for each bucket of the predicted field, which may end up being very low for all buckets. We can use the bucket with the highest probability as our prediction, or not, depending on the context and the significance of the prediction. For example, the highest prediction may be for 100% engine load with a 0.1% probability. Such a low probability would not necessitate the same response that a 95% probability would with the same load.
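A minimal sketch of that decision, assuming the distribution is a plain list of per-bucket probabilities: `best_bucket` and its `min_probability` threshold are hypothetical names, and the right threshold depends entirely on the application.

```python
def best_bucket(distribution, min_probability):
    """Return (bucket index, probability) of the most likely bucket,
    or None when even the best bucket is below the threshold."""
    index = max(range(len(distribution)), key=lambda b: distribution[b])
    probability = distribution[index]
    if probability < min_probability:
        return None
    return index, probability

confident = [0.02, 0.95, 0.03]        # one bucket clearly dominates
uncertain = [0.001, 0.0005, 0.0008]   # every bucket is very unlikely

print(best_bucket(confident, min_probability=0.5))
print(best_bucket(uncertain, min_probability=0.5))
```

The second call still has a "highest" bucket, but it is rejected because its probability is too low to act on.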

Another useful property of these predictions is that they essentially form an ensemble, since each on-bit’s associated prediction makes a small contribution to the final probability distribution.

History

The CLA Classifier was introduced as an alternative to Reconstruction and has shown better results in general.
