Why was the SDR Classifier's target distribution changed?


#1

I was looking through the Python source for the SDR classifier and was wondering why the following change was made.

Specifically, in this line the target distribution went from:

targetDist[classification["bucketIdx"]] = 1.0

to

for bucketIdx in bucketIdxList:
    targetDist[bucketIdx] = 1.0/numCategories

I would have thought the first version was correct: given the temporal memory’s output, you want to learn a mapping from it to its associated bucket.

I might be wrong, but doesn’t the new version make the assumption that inputs are uniform? What if a particular output is rarely seen?


#2

Interesting question. I must defer to @scott or @mrcslws to answer that.


#3

The change referenced allows for multiple classifications to be passed in during learning. The implementation was changed to enable classification to be either a single classification dict or a list of them.

The error calculation then has to handle either a single classification or several, and apparently this requires normalizing the targetDist array so that it sums to 1.0: each passed-in classification bucket is set to 1.0 divided by the number of categories.
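To make the normalization concrete, here is a minimal sketch in plain Python; it is not the actual classifier code. It assumes numCategories is the length of bucketIdxList (the snippet quoted above does not show where it is defined), and target_distribution and max_bucket_idx are hypothetical names used only for illustration.

```python
def target_distribution(bucket_idx_list, max_bucket_idx):
    """Hypothetical helper: build a target distribution over buckets.

    Assumption: num_categories is the number of bucket indices passed in,
    so the nonzero entries sum to 1.0 when the indices are distinct.
    """
    target_dist = [0.0] * (max_bucket_idx + 1)
    num_categories = len(bucket_idx_list)  # assumed definition
    for bucket_idx in bucket_idx_list:
        target_dist[bucket_idx] = 1.0 / num_categories
    return target_dist


# One classification: identical to the old one-hot target.
single = target_distribution([2], max_bucket_idx=4)
print(single)  # [0.0, 0.0, 1.0, 0.0, 0.0]

# Two classifications: the mass is split evenly between them.
multi = target_distribution([1, 3], max_bucket_idx=4)
print(multi)  # [0.0, 0.5, 0.0, 0.5, 0.0]
```

Under that assumption, a single classification still produces the original target of 1.0 for its bucket, so the first version is a special case of the second. The even split applies only across the buckets passed in for one record, not across all buckets the classifier has seen.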

I don’t recall the use case for training with multiple classifications.