Deciding values for n and w for RDSE encoder

iankurgarg · June 6, 2017, 4:02pm

What criteria should I use to decide the optimal values of n and w for a given time series? Would the default values of n and w work in most cases?

rhyolight · June 6, 2017, 4:16pm

Have you seen this video with a description of how changing n, w, and resolution affect encodings?

iankurgarg · June 6, 2017, 4:18pm

Yes. But I am still not sure how to decide on specific values based on my data.

rhyolight · June 6, 2017, 4:23pm

So what does your data look like? floats, ints? min/max?

iankurgarg · June 6, 2017, 4:25pm

Two sets of data, both integers.
One ranges between 0 and 10,000
and the second ranges between 0 and 400,000.

rhyolight · June 6, 2017, 4:36pm

Here is the calculation we have used in the past:

So you choose how many buckets you want to represent your data between min/max. For example, if your data ranged from 0 to 9,999 and you chose 5,000 buckets, each bucket would represent two values, so your resolution would be 2. But that would mean that adjacent numbers might have the same encoding. Increasing the number of buckets decreases the resolution value, which makes the encoding finer-grained.
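A minimal sketch of that bucket arithmetic, in case it helps. The function name `resolution_from_buckets` and the `min_resolution` floor are assumptions made for illustration, not necessarily what NuPIC itself uses:

```python
def resolution_from_buckets(min_val, max_val, num_buckets, min_resolution=0.001):
    """Return the width of one bucket when [min_val, max_val] is split evenly.

    Hypothetical helper: the name and the min_resolution floor are
    illustrative assumptions, not part of any library API.
    """
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    if max_val <= min_val:
        raise ValueError("max_val must be greater than min_val")
    # Divide the data range by the bucket count; the floor guards against
    # a resolution so small that it is not useful.
    return max(min_resolution, (max_val - min_val) / float(num_buckets))


# A span of 10,000 split into 5,000 buckets gives two values per bucket.
print(resolution_from_buckets(0, 10000, 5000))  # 2.0
```

The result is the number you would pass as `resolution`; more buckets over the same range gives a smaller value.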

Looking at it this way, you don’t need to specify n and w.

iankurgarg · June 6, 2017, 4:41pm

Oh… that’s really useful. But not specifying n and w would mean n = 400 and w = 21. If I specify just the resolution, would the same n and w work for both sets of data?

rhyolight · June 6, 2017, 5:16pm

You might want different params for each RDSE encoder. See the constructor docs for RandomDistributedScalarEncoder.

iankurgarg · June 6, 2017, 6:38pm

What I meant was: would just changing the resolution be enough, without specifying values of n and w for the different sets of data?

rhyolight · June 6, 2017, 8:13pm

Yes, you’ll only need to find a different resolution value for each field, which might depend on your min/max and how granular you want the encoding for each field.
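As a rough sketch for the two fields described in this thread, one resolution per field could be derived as below. The field names `series_a` and `series_b` and the bucket count of 1,000 are made up for illustration, and the dict layout only approximates the shape of an encoder entry in model params, so check it against your own config:

```python
# Assumed for illustration: an arbitrary bucket count shared by both fields.
NUM_BUCKETS = 1000

# Hypothetical field names; the ranges are the ones given in this thread.
FIELD_RANGES = {
    "series_a": (0, 10000),
    "series_b": (0, 400000),
}


def encoder_params(field, min_val, max_val, num_buckets=NUM_BUCKETS):
    """Build an illustrative encoder entry with only resolution set.

    n and w are left out on purpose so the encoder defaults apply.
    """
    resolution = (max_val - min_val) / float(num_buckets)
    return {
        "fieldname": field,
        "name": field,
        "type": "RandomDistributedScalarEncoder",
        "resolution": resolution,
    }


params = {f: encoder_params(f, lo, hi) for f, (lo, hi) in FIELD_RANGES.items()}
for field in sorted(params):
    print("%s: resolution=%s" % (field, params[field]["resolution"]))
```

With the same bucket count, the wider field simply ends up with a proportionally larger resolution, while n and w stay the same for both.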

Look at it like this… the higher the resolution value, the more values will go into each bucket, so there will be fewer encodings overall, each representing more values. A high resolution smudges a bunch of values together, and a lower resolution ensures more numbers are encoded more uniquely.
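The smudging effect can be seen with a simplified bucketing rule. This is only a sketch: `bucket_index` uses plain floor division and is an assumption for illustration, so the real RDSE bucket mapping may differ in its details:

```python
import math


def bucket_index(value, resolution, offset=0.0):
    """Simplified, assumed bucketing: which bucket a value falls into."""
    return int(math.floor((value - offset) / float(resolution)))


def distinct_buckets(values, resolution):
    """Count how many different buckets a set of values occupies."""
    return len({bucket_index(v, resolution) for v in values})


values = range(0, 100)
for res in (1, 2, 10, 50):
    print("resolution=%s -> %d distinct buckets"
          % (res, distinct_buckets(values, res)))

# With resolution 2, the neighbours 4 and 5 land in the same bucket.
print(bucket_index(4, 2) == bucket_index(5, 2))  # True
```

For the integers 0 to 99, the number of distinct buckets drops from 100 at resolution 1 to 2 at resolution 50.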
