Ratio between the inputDimensions and columnDimensions in SP

I am trying to understand what the ideal ratio is between inputDimensions and columnDimensions.
According to various sources, an SDR encoder should encode each feature with a sparsity (w/n) of about 2%.
In addition, the recommended SP columnDimensions is 2048. But what is the recommended ratio between the SP's input dimension and its number of columns? This ratio directly influences the number of bits n used by the SDR encoder.

inputDimensions describes the size and shape of the input space. columnDimensions describes the size and shape of the spatial pooler's minicolumn space. They must have the same dimensionality, but they do not have to have the same size. For example, if the input space is 28*28 (which is 2D), columnDimensions must have two dimensions as well, but you can have as many columns in each dimension as you want.
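
The distinction between dimensionality and size can be sketched in a few lines. This is plain Python, not the NuPIC API; the `same_dimensionality` helper and the 64*32 column shape are hypothetical and chosen only for illustration.

```python
from math import prod

def same_dimensionality(input_dims, column_dims):
    """Return True when both spaces have the same number of dimensions."""
    return len(input_dims) == len(column_dims)

input_dims = (28, 28)    # 2D input space, 784 bits in total
column_dims = (64, 32)   # 2D column space, 2048 minicolumns (illustrative shape)

# Same dimensionality (both 2D), different total sizes.
print(same_dimensionality(input_dims, column_dims))
print(prod(input_dims), prod(column_dims))

# A flat 1D input paired with a 2D column space would not match.
print(same_dimensionality((784,), column_dims))
```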

Thanks @rhyolight, I understand that they must have the same number of dimensions, but is there any recommendation about the ratio within the same dimension?
For example, I have 10 features, each encoded with n=100, which means the input to the SP has a ratio of 1000:2048 to the columns. Is this OK? And if I decide to encode each feature with n=200, the ratio changes to 2000:2048, or with n=300 it changes to 3000:2048. In short, my question is: how does the ratio between the SP input and the columns affect the accuracy of the HTM model, and is there a preferred ratio?
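
The arithmetic in this question can be written out as a short sketch. It assumes, as in the post, 10 features of equal width concatenated into one SP input and 2048 columns; the function name `input_to_column_ratio` is hypothetical.

```python
NUM_FEATURES = 10    # number of encoded features, from the question
NUM_COLUMNS = 2048   # commonly recommended SP column count

def input_to_column_ratio(bits_per_feature,
                          num_features=NUM_FEATURES,
                          num_columns=NUM_COLUMNS):
    """Return (total input bits, input bits per column)."""
    total_bits = bits_per_feature * num_features
    return total_bits, total_bits / num_columns

for n in (100, 200, 300):
    total_bits, ratio = input_to_column_ratio(n)
    print(f"n={n}: {total_bits} input bits, {ratio:.2f} input bits per column")
```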

A typical setup is an input space under 1000 bits and 2048 columns. If you use TM, I would start out with 16 cells per column.

Hello @rhyolight, thanks for your answer. Does this mean the ratio needs to be 1000:2048? So if I have 10 features of equal importance, each of them would be encoded with 100 bits in the SDR output.
In addition, what is the theoretical reason for limiting the SP input to a maximum of 1000 bits?
Also, why limit the number of cells per column to 16? (I read in a different post on the forum that the recommended number of cells per column is 32, since more cells allow more patterns.)

The input space is 1000. You can try larger, but the number of connections required increases quickly as you make the spaces larger. So the computations will take longer to run and will take more memory.
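
A rough way to see this scaling is to count potential connections. The sketch below is only an estimate under stated assumptions: every column may connect to a fixed fraction of the whole input (no topology), and that fraction, `potential_pct`, is set to an assumed value of 0.5. The function name is hypothetical and this is not a measurement of any real implementation.

```python
def potential_synapses(input_bits, num_columns, potential_pct=0.5):
    """Estimate potential synapses when each column samples a fixed
    fraction of the full input space (assumed: no topology)."""
    per_column = int(input_bits * potential_pct)
    return per_column * num_columns

configs = [(500, 2048), (1000, 2048), (2000, 2048), (2000, 4096)]
for input_bits, num_columns in configs:
    count = potential_synapses(input_bits, num_columns)
    print(f"{input_bits} input bits x {num_columns} columns: {count:,} potential synapses")
```

Under these assumptions the count grows with the product of the two sizes, so doubling both the input space and the number of columns roughly quadruples the connections to store and update.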

This means that the total n of all combined encoders shouldn’t exceed 1000, right?

I’m just choosing 1000 arbitrarily. We usually use less than 500 bits for an input space.

But I have a multi-feature input (12 features), some of them of ushort type (0-65k) and some categorical. Limiting the input to 500 bits will prevent a good representation of the input.
Again, what limits the input space to 1000? If I increase the number of columns to 4096, does that mean I can increase the input space?

Like I said, as you increase the size of these spaces, it will require more compute time and memory. You will need to try different settings and see what works for you.