When dealing with the float data type, how best to set up the parameters of the Random Distributed Scalar Encoder (RDSE)?
I don’t really know much about it, but I believe this documentation can help. Please do check it out.
What HTM implementation are you using?
HTM core with python 3.7
This might be helpful (I just searched the forum for “RDSE”): Deciding values for n and w values for RDSE encoder
Thanks mate, I got it from your original YouTube video on RDSE. However, in htm-core the resolution is expressed as a float value such as 0.88, which is different from the integer values of 1 and 5 shown in your example. Can you shed some light on this?
Let me try to explain what is needed.
The code in RandomDistributedScalarEncoder.cpp shows that the internal parameters used by the algorithm are computed from three parameters. These are size, one of activeBits or sparsity (but not both), and exactly one of radius, resolution, or category. The encoder will issue an error if you don’t follow this rule.
If you don’t use resolution as one of the parameters, the program computes the resolution from the parameters that were given. Likewise, activeBits can be computed from size and sparsity if sparsity is given. So the computed resolution could be a fraction.
This allows the encoder to be configured in several ways, depending on how you think of the data.
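The parameter rule above can be sketched in plain Python. This is not the htm.core code: resolve_rdse_params is a hypothetical helper, radius is listed as the third alternative next to resolution and category on the assumption that it matches the htm.core parameter set, and rounding size * sparsity to get activeBits is also an assumption made for illustration.

```python
def resolve_rdse_params(size, activeBits=None, sparsity=None,
                        radius=None, resolution=None, category=False):
    """Hypothetical helper mirroring the rule from the post, not the htm.core API."""
    # Exactly one of activeBits / sparsity must be given.
    if (activeBits is None) == (sparsity is None):
        raise ValueError("give activeBits or sparsity, but not both")
    # Exactly one of radius / resolution / category must be given.
    given = sum([radius is not None, resolution is not None, bool(category)])
    if given != 1:
        raise ValueError("give exactly one of radius, resolution, category")
    if activeBits is None:
        # Assumption: activeBits is size * sparsity rounded to the nearest integer.
        activeBits = int(round(size * sparsity))
    return {"size": size, "activeBits": activeBits,
            "radius": radius, "resolution": resolution, "category": category}


# A float resolution such as 0.88 is accepted like any other value.
params = resolve_rdse_params(size=1000, sparsity=0.02, resolution=0.88)
print(params["activeBits"], params["resolution"])
```

With size=1000 and sparsity=0.02 the sketch derives 20 active bits, and passing both activeBits and sparsity raises an error.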
I don’t think I explained resolution very well.
The resolution parameter is the smallest unit of data that you care about. Its units are those of the data set. A data point divided by the resolution and truncated to an integer is its bucket number. The encoder will generate a single unique pattern for each bucket.
If your data contains categories then the resolution is 1 and each category has its own bucket so each has its own unique pattern.
As an example, if you have data that ranges from -1.00 to +1.00 and you care about increments of 0.02 (the resolution), then all values between 0.32 and 0.34 would be in the same bucket and result in the same pattern.
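A minimal sketch of that bucket calculation, assuming plain division followed by truncation exactly as described (bucket_number is a hypothetical name, not a function from htm.core):

```python
def bucket_number(value, resolution):
    # Divide by the resolution and truncate toward zero.
    return int(value / resolution)


resolution = 0.02
for v in (0.325, 0.33, 0.335, 0.35):
    print(v, bucket_number(v, resolution))
```

The first three values land in bucket 16 and 0.35 lands in bucket 17. Values that sit exactly on a bucket boundary, such as 0.34, can fall on either side because of floating-point rounding.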
The entire range of numbers in your data (the maximum value minus the minimum value) divided by the resolution gives the total number of buckets. Each bucket occupies activeBits bits in the output array. If there are too many buckets they may not fit in the output array, so the resolution parameter should be the largest value that you can get by with.
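To see how quickly the bucket count grows as the resolution shrinks, here is a small sketch. bucket_count and min_output_size are hypothetical helpers, and the width formula (buckets + activeBits - 1) is an assumption based on the classic contiguous-run scalar encoder layout, not something taken from the htm.core source.

```python
import math


def bucket_count(min_value, max_value, resolution):
    # Range divided by resolution, rounded up so a partial bucket still counts.
    # The small tolerance absorbs floating-point noise in the division.
    return math.ceil((max_value - min_value) / resolution - 1e-9)


def min_output_size(buckets, active_bits):
    # Assumption: each bucket is a run of active_bits consecutive bits,
    # and neighbouring buckets are shifted by one bit.
    return buckets + active_bits - 1


coarse = bucket_count(-1.0, 1.0, 0.02)   # the example range with resolution 0.02
fine = bucket_count(-1.0, 1.0, 0.002)    # ten times finer resolution
print(coarse, min_output_size(coarse, 21))
print(fine, min_output_size(fine, 21))
```

Under these assumptions the example range needs 100 buckets and at least 120 output bits, while a ten times finer resolution needs 1000 buckets and at least 1020 bits.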
With the Scalar Encoder, each bucket is encoded directly into the output bits. There are times when you may have too many buckets, or you may not know the maximum data value. In this case, if the number of data values actually used is a reasonable quantity, there is another encoder that can be used. The RandomDistributedScalarEncoder (or RDSE) makes a hash of the bucket number and uses the hash to generate the pattern.
As with any hash, care must be taken to make sure the range of the resulting hash values is large compared to the total number of unique values actually used, to avoid excessive collisions.
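The hashing idea can be illustrated with a simplified sketch. It is not the htm.core implementation: SHA-256 stands in for whatever hash the real encoder uses, the names are hypothetical, and the scheme of hashing activeBits consecutive bucket numbers is an assumption chosen so that neighbouring buckets share bits.

```python
import hashlib


def hash_to_bit(key, size, seed=0):
    # SHA-256 is a stand-in hash; the result is folded into the output width.
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % size


def encode_bucket(bucket, size=1000, active_bits=20, seed=0):
    # Assumption: bucket b activates the hashes of b, b+1, ..., b+active_bits-1,
    # so adjacent buckets have most of their hash inputs in common.
    return {hash_to_bit(bucket + i, size, seed) for i in range(active_bits)}


a = encode_bucket(5)
b = encode_bucket(6)
far = encode_bucket(500)
# Number of active bits, overlap with a neighbour, overlap with a distant bucket.
print(len(a), len(a & b), len(a & far))
```

No range of values has to be declared in advance, because any bucket number can be hashed. The cost is that two hash inputs can land on the same bit, so a pattern may end up with fewer than active_bits bits, and this becomes more likely as size gets smaller.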
With either encoder, remember that the output is a bit array but it is not really an SDR. It is the job of the Spatial Pooler to apply the sparsity and create a real SDR.
Thank you. Now that you have explained it, it makes a lot of sense.