Maximum (or recommended) number of input features

Moti · February 26, 2019, 8:55pm

Hello,
Is there any limitation on the number of input features in HTM?
From various examples (and my own impression), the input size of the SP is usually 2048.
That means the output size of the SDR encoders is 2048. If we want to keep the ratio of 2% active columns, we are limited to ~50 active columns in the output of the SDR, and if we want to keep a unique SDR encoder output for every input, we need to limit the number of inputs.
So does anyone have a rule of thumb for the number of input features (perhaps depending on the features' value ranges?) relative to the number of SP input columns?
Thanks,
M

sheiser1 · February 26, 2019, 9:09pm

It’s generally recommended not to include more than a small handful of fields in a single model, since doing so crams too much semantic information into the SDRs. The output sparsity will remain at 2% regardless of the number of input fields, but crowding the input space can easily make it harder to clearly recognize patterns.
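To make the point about fixed sparsity concrete, here is a minimal sketch of the arithmetic. The 2048 columns and 2% sparsity are the figures quoted in this thread; the per-field encoder size of 400 bits and the helper names are made up for illustration and are not part of NuPIC.

```python
# Illustrative numbers only: 2048 columns and 2% sparsity come from the thread.
NUM_COLUMNS = 2048
SPARSITY = 0.02
BITS_PER_FIELD = 400  # hypothetical encoder size 'n' for every field


def active_columns(num_columns, sparsity):
    """Number of SP columns active per step for a given target sparsity."""
    return int(round(num_columns * sparsity))


def total_encoding_size(num_fields, bits_per_field):
    """Width of the concatenated encoding that is fed into the SP."""
    return num_fields * bits_per_field


for num_fields in (1, 3, 10):
    # The input width grows with the number of fields,
    # but the number of active columns does not change.
    print(num_fields,
          total_encoding_size(num_fields, BITS_PER_FIELD),
          active_columns(NUM_COLUMNS, SPARSITY))
```

With these numbers the active-column count stays at about 41 while the input grows from 400 to 4000 bits, so every added field has to share the same small set of active columns.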

A commonly advocated approach when dealing with many input fields (features) seems to be to make a separate model for each. You can also try swarming, which searches many input-field combinations to tell which are most useful for predicting a specific ‘predictedField’.

Moti · February 26, 2019, 9:15pm

First of all, thanks.

the output sparsity will remain at 2% regardless of the number of input fields

Yes, that's what I mean: the overlap between different inputs will increase in the SDR encoders' output.
Are you familiar with any rule of thumb (like the 2%) for the ratio of the input value range to the output…
Thanks

sheiser1 · February 26, 2019, 9:27pm

By the ‘input range’ do you mean the size of the total encoding? (the sum of ‘n’ for all encoders)

Moti · February 26, 2019, 9:38pm

Hi,
I mean that, for example, if one input is categorical and has only 3 options, we can allocate a small number of output columns to it and still prevent overlap. So if we have features with small value ranges, we can increase the number of features.

sheiser1 · February 26, 2019, 9:59pm

This is true, in that with smaller encoding vectors you can fit more fields into the total encoding which is fed into the SP. The problem is that any fields that are given small encoding vectors will have small presences in the input space and thus small impacts on the SP column activations (and thus little effect on the predictions).

To illustrate the idea, say you had 10 fields, one of which was given an encoding vector of size 101 and the other 9 given vectors of size 11. This would mean that of a total encoding of 200, 101 of the bits would be dedicated to describing just one field. The total encoding of 200 is what determines which SP columns will activate, so each of the 9 smaller vectors would have little influence on the SP compared to the larger one. Of course you could give all the fields small vectors so none had a bigger say, and be left with many small voices competing for influence.
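The proportions in that example can be checked with a few lines of Python. This is only a sketch of the arithmetic: the field names are hypothetical, and the share of bits is a rough proxy for influence on the SP, not a measurement of it.

```python
def encoding_shares(field_sizes):
    """Fraction of the concatenated encoding occupied by each field."""
    total = sum(field_sizes.values())
    return {name: float(size) / total for name, size in field_sizes.items()}


# One field with a 101-bit encoding and nine fields with 11 bits each.
field_sizes = {"big_field": 101}
for i in range(9):
    field_sizes["small_field_%d" % i] = 11  # hypothetical field names

total_bits = sum(field_sizes.values())
shares = encoding_shares(field_sizes)

print(total_bits)  # 200
for name in sorted(shares):
    print(name, round(shares[name] * 100, 1), "% of input bits")
```

The large field takes just over half of the input bits, while each small field gets about 5.5%.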

So yes, you can increase the number of features without blowing up the compute cost on the SP by giving each feature small input vectors, though reducing the number of features is still preferable in general.

Moti · February 27, 2019, 3:48pm

Thanks,
I will try playing with that.
P.S.: if I think one feature carries important meaning (like system mode), but its values can be represented by only 4 bits (a categorical value composed of 8 different values), do I still need to represent it with a large SDR output?
Thanks

sheiser1 · February 27, 2019, 7:29pm

If it’s an important feature, then it should have a bigger presence in the input encoding. Although it’s possible to encode the categorical field with n=4;w=1, you’d make an equivalent but bigger encoding by using, say, n=100;w=25.
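As a rough sketch of what "equivalent but bigger" means, the toy encoder below gives each category its own non-overlapping block of w bits out of n. It is not NuPIC's category encoder; the function names and the assumption of 4 categories (implied by n=4;w=1) are for illustration only.

```python
def encode_category(index, num_categories, n, w):
    """Return n bits with a contiguous block of w ones for category `index`.

    Blocks never overlap, so this requires num_categories * w <= n.
    """
    if num_categories * w > n:
        raise ValueError("n is too small for non-overlapping blocks")
    if not 0 <= index < num_categories:
        raise ValueError("category index out of range")
    bits = [0] * n
    start = index * w
    for i in range(start, start + w):
        bits[i] = 1
    return bits


def overlap(a, b):
    """Number of positions where both encodings have a 1 bit."""
    return sum(x & y for x, y in zip(a, b))


NUM_CATEGORIES = 4  # assumed from the n=4;w=1 example
small = [encode_category(i, NUM_CATEGORIES, n=4, w=1)
         for i in range(NUM_CATEGORIES)]
large = [encode_category(i, NUM_CATEGORIES, n=100, w=25)
         for i in range(NUM_CATEGORIES)]

print(small[2])                     # [0, 0, 1, 0]
print(sum(large[2]), len(large[2]))  # 25 100
print(overlap(large[0], large[1]))  # 0
```

Both encodings keep the categories fully distinct, but in a concatenated input the second one occupies 100 bits instead of 4, which gives the field a larger share of the bits the SP sees.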
