In the Random Distributed Scalar Encoder (RDSE), what does the resolution parameter do?

When dealing with float data, how should I best set up the parameters of the Random Distributed Scalar Encoder?

I don't know much more about it, but I believe this documentation can help. Please check it out:
https://nupic.docs.numenta.org/0.6.0/encoders.html

What HTM implementation are you using?

htm.core with Python 3.7.

This might be helpful (I just searched the forum for “RDSE”): Deciding values for n and w values for RDSE encoder

Thanks mate, I got it from your original YouTube video on RDSE. However, in htm.core the resolution is expressed as a float value of 0.88, which is different from the integer values of 1 and 5 shown in your example. Can you shed some light on this?

Let me try to explain what is needed.

The code in RandomDistributedScalarEncoder.cpp shows that the internal parameters used by the algorithm are computed from three parameters. These are the size, and one of activeBits or sparsity but not both, AND only one of: radius, resolution or category. It will issue an error if you don’t follow this rule.

If you don’t happen to use resolution as one of the parameters, the program will compute the resolution from the parameters that were given, i.e. radius / activeBits. And activeBits could be computed from size and sparsity if sparsity is given. So the computed resolution could be a fraction.
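As a rough sketch of the rule described above (not the actual htm.core code), the derivation can be written in a few lines of Python. The function name `resolve_rdse_parameters` is hypothetical, and rounding `size * sparsity` to the nearest integer is an assumption. The category case is only validated here, since the post above does not spell out how its resolution is derived.

```python
def resolve_rdse_parameters(size, active_bits=None, sparsity=None,
                            radius=None, resolution=None, category=False):
    """Illustrative only: mirrors the parameter rule described in the thread."""
    # Exactly one of active_bits / sparsity must be given.
    if (active_bits is None) == (sparsity is None):
        raise ValueError("give exactly one of active_bits or sparsity")
    # Exactly one of radius / resolution / category must be given.
    given = sum([radius is not None, resolution is not None, bool(category)])
    if given != 1:
        raise ValueError("give exactly one of radius, resolution or category")

    if active_bits is None:
        # Assumption: round to the nearest whole number of bits.
        active_bits = int(round(size * sparsity))
    if radius is not None:
        # A computed resolution can easily be a fraction.
        resolution = radius / active_bits
    return {"size": size, "active_bits": active_bits, "resolution": resolution}


params = resolve_rdse_parameters(size=700, sparsity=0.02, resolution=0.88)
print(params)
```

With `size=700` and `sparsity=0.02` this gives 14 active bits. Passing `radius=7` instead of a resolution would yield a computed resolution of 0.5.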

This allows the encoder to be configured in multiple ways, depending on how you think of the data.

I don’t think I explained resolution very well.

The resolution parameter is the smallest unit of data that you care about. Its units are those of the data set. A data point divided by the resolution and truncated to an integer is its bucket number. The encoder will generate a single unique pattern for each bucket.

If your data contains categories then the resolution is 1 and each category has its own bucket so each has its own unique pattern.

As an example, if you have data that ranges from -1.00 to +1.00 and you care about increments of 0.02 (the resolution), then all values between 0.32 and 0.34 would be in the same bucket and result in the same pattern.
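The bucket arithmetic from this example can be checked with a short Python sketch. The helper name `bucket_number` is made up for illustration, and it simply follows the "divide and truncate" description given above.

```python
def bucket_number(value, resolution):
    # Divide by the resolution and truncate to an integer, as described above.
    return int(value / resolution)


resolution = 0.02
print(bucket_number(0.325, resolution))  # 16
print(bucket_number(0.335, resolution))  # 16, same bucket, so same pattern
print(bucket_number(0.345, resolution))  # 17, the next bucket
```

Values sitting exactly on a bucket boundary can land on either side because of floating point rounding, so the sketch uses values from the middle of each bucket.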

The entire range of numbers in your data (the maximum value minus the minimum value) divided by the resolution gives the total number of buckets. Each bucket occupies activeBits bits in the output array. If there are too many buckets they may not fit in the output array, so the resolution parameter should be the largest value that you can get by with.
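To get a feel for how the resolution drives the bucket count, here is a small Python sketch. The helper `bucket_count` is hypothetical, and rounding a fractional result up to the next whole bucket is an assumption.

```python
import math


def bucket_count(min_value, max_value, resolution):
    # Range divided by resolution; the inner round() guards against
    # floating point noise before rounding up to a whole bucket.
    return math.ceil(round((max_value - min_value) / resolution, 9))


print(bucket_count(-1.0, 1.0, 0.02))  # 100 buckets for the example above
print(bucket_count(0.0, 9.9, 0.88))   # 12 buckets
print(bucket_count(0.0, 9.9, 0.1))    # 99 buckets
```

A coarser resolution means fewer buckets, which is why the largest acceptable resolution is the one to pick.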

With the ScalarEncoder, each bucket is encoded directly into the output bits. There are times when you may have too many buckets, or you may not know the maximum data value. In this case, if the number of distinct data values actually used is reasonable, there is another encoder that can be used. The RandomDistributedScalarEncoder (or RDSE) makes a hash of the bucket number and uses the hash to generate the pattern.

As with any hash, care must be taken to make sure the total number of unique values actually used is small compared to the width of the resulting hash value, to avoid excessive collisions.
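The idea of hashing a bucket number into a bit pattern can be illustrated with a toy Python sketch. This is not the htm.core implementation: the function `toy_rdse`, the use of SHA-256, and the scheme of hashing `bucket + offset` once per active bit are all assumptions chosen to keep the example short.

```python
import hashlib


def toy_rdse(value, resolution, size, active_bits):
    """Toy hash-based encoder; returns the set of active bit indices."""
    bucket = int(value / resolution)
    active = set()
    for offset in range(active_bits):
        # Hash (bucket + offset) and fold the result into the output array.
        digest = hashlib.sha256(str(bucket + offset).encode()).digest()
        active.add(int.from_bytes(digest[:8], "big") % size)
    return active


a = toy_rdse(3.0, resolution=0.88, size=700, active_bits=14)
b = toy_rdse(3.5, resolution=0.88, size=700, active_bits=14)  # same bucket as a
c = toy_rdse(4.0, resolution=0.88, size=700, active_bits=14)  # next bucket

print(a == b)      # True: same bucket, same pattern
print(len(a & c))  # neighbouring buckets share most of their bits
print(len(a))      # can be below 14 if two hashes collide
```

Because the set can end up with fewer than `active_bits` entries when two hashes land on the same index, a small `size` relative to the number of buckets in use degrades the encoding.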

With either encoder, remember that the output is a bit array but it is not really an SDR. It is the job of the Spatial Pooler to apply the sparsity and create a real SDR.

Thank you. Now that you have explained it, it makes a lot of sense.

Hi David, can you please help me. I am trying to encode SEAGen data which consists of mapping x,y float coordinates to an integer label of either 0 or 1. I’m using htm.core community with python3.7. I’m currently modifying the hotgym.py file to take my csv file…
c0,c1,c2
float,float,bool
S
6.5358958546461,1.15006943124406,0
8.72474535182035,2.1233268092272,1

I can’t find any example suitable for just primitive data types …

Inside hotgym.py …

'enc': {
    "value":
        {'resolution': 0.88, 'size': 700, 'sparsity': 0.02},
    "time":
        {'timeOfDay': (30, 1), 'weekend': 21}
},

Can you help me format the encoder definition for float, float, bool? The bool is the prediction field.

By the way, as far as resolution goes, I only care about the 0.0 - 9.9 range.

I can’t find any example suitable for just primitive data types …

The RDSE and ScalarEncoder are both encoders for a scalar number. That is a primitive data type. A category can be represented by a numeric index, so the RDSE or ScalarEncoder will work. A Boolean can be represented by the numbers 0 and 1, so it can also be encoded this way. Encoding an array of values would require an encoder for each element in the array.

You have two primitive values to encode, so you will need two encoders. In the hotgym example there are also two values, consumption and time. Each has its own encoder and the results are concatenated. The DateEncoder is actually a collection of several scalar encoders, and the result returned by the DateEncoder class is a concatenation of the outputs of each of those encoders. You can concatenate as many encoder outputs as you need. I recommend that you view the HTM School session on Encoders if you want to understand how encoders work.
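To make the "one encoder per value, then concatenate" idea concrete, here is a minimal pure-Python sketch that does not use htm.core. The function `encode_scalar` is a simplified stand-in for a scalar encoder, and the 0.0 to 9.9 range, the 0.1 resolution and the 5 active bits are illustrative assumptions rather than recommended settings.

```python
def encode_scalar(value, min_value, max_value, resolution, active_bits):
    """Simplified scalar encoder: one contiguous run of bits per bucket."""
    n_buckets = int(round((max_value - min_value) / resolution)) + 1
    size = n_buckets + active_bits - 1
    clipped = min(max(value, min_value), max_value)
    bucket = min(int((clipped - min_value) / resolution), n_buckets - 1)
    bits = [0] * size
    for i in range(bucket, bucket + active_bits):
        bits[i] = 1
    return bits


def encode_row(x, y):
    # One encoder call per field; the outputs are simply joined end to end.
    x_bits = encode_scalar(x, 0.0, 9.9, 0.1, 5)
    y_bits = encode_scalar(y, 0.0, 9.9, 0.1, 5)
    return x_bits + y_bits


row = encode_row(6.5358958546461, 1.15006943124406)
print(len(row), sum(row))  # 208 bits in total, 10 of them active
```

The combined array is what would be handed on to the Spatial Pooler, and each field keeps its own fixed region of the array.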

I hope that helps.

Since I have three pieces of data (x, y, label), I should use 3 encoders, right?