SDR Classifier without TM

Hi!

I am using htm.core and want to classify without Temporal Memory. Can I feed the output of the SP directly into the Classifier? If so, how can I do it?

@breznak, @David_Keeney maybe you can help me?

Thanks a lot!

Hi, sure, that’s a good use case. See the MNIST demo in the examples for how it’s done. Note that the example does not use the Network API.

The short answer is: yes, you can.

The Classifier needs two things.

  1. The quantized value of the original value (the bucket)
  2. The output SDR (in this case the output of the SP)

See the comments at the top of ClassifierRegion:

 * An example of the bucket values are:
 *   Assume the radius of the encoder is 0.01
 *   The bucket 1.00 will contain all values >= 1.00 and < 1.01.
 *      value encoded         quantized value (or title) of bucket
 *         1.0                     1.00
 *         6.23463745              6.23
 *        -0.45679278             -0.45
 *         0.0045                  0.00
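
To make the table concrete, here is a minimal sketch of that quantization in Python. It assumes a bucket title is obtained by truncating the value toward zero at the precision of the radius, which is what the rows above show (note that -0.45679278 maps to -0.45). `bucket_title` is a hypothetical helper written for this post, not an htm.core function, and the library may compute its buckets differently.

```python
from decimal import Decimal, ROUND_DOWN

def bucket_title(value, radius="0.01"):
    """Return the bucket title for `value`, truncated toward zero.

    Hypothetical helper for illustration only; not part of htm.core.
    `radius` must be a power of ten given as a string, e.g. "0.01".
    """
    # Going through str() and Decimal avoids binary floating-point
    # artifacts such as 6.2299999999 when the title is printed.
    return Decimal(str(value)).quantize(Decimal(radius), rounding=ROUND_DOWN)

# The four values from the table above.
for v in (1.0, 6.23463745, -0.45679278, 0.0045):
    print(v, "->", bucket_title(v))
```

Every value that ends up with the same title belongs to the same bucket, so the classifier sees them as the same target.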

What you are doing is having the classifier learn, for each quantized input value, the pattern generated by the SP output. This is classical NN learning in the classifier, so it takes a lot of examples to do a good job of matching.

Note that it is also sensitive to the number of discrete quantized values (or buckets) in the range of the input values. For each bucket value you need to train with enough examples that it is able to recognize the pattern for that bucket.

If you want to use NetworkAPI to implement this, you can check out ClassifierRegionTest.cpp; the tests asCategoryDecoder and asRealDecoder are examples of how this is done.

Thanks a lot for your answers! I looked at the MNIST demo and it helped me a lot. I have some questions about the Classifier. First, there is something I can’t find information about: if I feed binary labels into the Classifier (I have only two classes), the output is a list with two probabilities, but I don’t know which probability belongs to which class.

Second question: I have a dataset where the labels are different sensors. Each sensor has two classes (on/off).

(Screenshot at 2020-02-27 23-43-14)

I split this label column into 2 columns, where each sensor has 2 classes (binary 1 or 0). Does the classifier accept only one label column? With my approach, must I create 2 Classifiers?

Can you advise me on these questions?

@David_Keeney, @breznak, I know that I ask too many questions, but you are helping me a lot.

Thanks a lot!

If I understand your situation correctly, you have a set of categories, each with a state of on or off. I assume they are mutually exclusive, so only one category is on at each cycle. So what you are looking for is a Category Classifier. You want to know the probability of a pattern generated by an SP or TM being the result of a particular category.

I assume you are using the NetworkAPI interface.
For this you will have two data flows into the ClassifierRegion.

  • bucket. This is the original data which in this case would be the category or some id that identifies it. This is the sample for a particular cycle. Categories by definition are already quantized. The point is that there are a fixed number of them. This value can be an array but in your case we would only provide one value.
  • pattern. This is the pattern created as a result of using that data. This would be the SDR output of the SP or TM.

There are three output flows from the ClassifierRegion.

  • pdf (probability distribution function). This is an array. Each element of the array gives the probability of the pattern being a specific category.
  • titles. This is an array of all of the categories (all values passed in bucket in the past) that have been presented as input. Both the pdf and titles arrays are in the same order.
  • predicted. This is an index into the pdf and titles arrays for the one that has the highest probability of being a match for the current sample.
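
As a rough illustration of how these three outputs relate, the sketch below reproduces the `predicted` index from a `pdf` array and looks up the matching entry in `titles`. The helper name `most_likely` and the numbers are made up for this post; they are not ClassifierRegion output.

```python
def most_likely(pdf, titles):
    """Return (predicted_index, title, probability) for the best match.

    `pdf` and `titles` are parallel sequences in the same order,
    as described for the region outputs above.
    """
    if len(pdf) != len(titles):
        raise ValueError("pdf and titles must have the same length")
    # Index of the largest probability; this plays the role of `predicted`.
    predicted = max(range(len(pdf)), key=lambda i: pdf[i])
    return predicted, titles[predicted], pdf[predicted]

# Made-up values, for illustration only.
pdf = [0.10, 0.70, 0.20]
titles = [1, 2, 3]  # category ids that were passed in as `bucket`
predicted, title, probability = most_likely(pdf, titles)
print(predicted, title, probability)
```

Because both arrays share the same order, the `predicted` index is all you need to recover the category id and its probability.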

Note that this must be given lots of samples for each category to learn from before the output starts to give good results.

Thanks a lot. I have some follow-up questions. You are right about the set of categories. In the picture you can see the column ‘label’. This is my input column, and I converted it into three other columns: ‘tap’, ‘toilet’, ‘shower’. For example, if the name ‘Tap’ appears in column ‘label’, we put 1 in column ‘tap’.

I used the Classifier from htm.bindings.algorithms; I didn’t use the NetworkAPI in htm.core. This classifier has one output flow:

  • pdf (probability distribution function). This is an array. Each element of the array gives the probability of the pattern being a specific category.

This is the information from the help text. Can I use the Classifier from htm.bindings.algorithms? If not, and I must use NetworkAPI, do you have some examples of NetworkAPI in htm.core?

Thanks a lot for your help!

@David_Keeney @breznak

Can I use Classifier from htm.bindings.algorithms?

Yes, you can. The ClassifierRegion in NetworkAPI is just a wrapper around the SDRClassifier C++ algorithm. This is the same algorithm that is called by the Python Classifier class from htm.bindings.algorithms.

OK, so using the Python Classifier class:

You have your category indexes ‘tap’ (1), ‘toilet’ (2), ‘shower’ (3).
The tricky part is that the classifier remembers the category indexes the first time you use them in the learn( ) method. The pdf output is an array that uses the same indexes. If you skip index numbers it will fill them with 0’s, so in this case pdf[0] will be 0, pdf[1] will be the probability of it being a ‘tap’, and so on.
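
A minimal sketch of that bookkeeping in plain Python follows. It assumes the three one-hot columns from your dataset and the index numbers used in this post; `row_to_category`, `probabilities_by_name` and the pdf values are hypothetical and only show how the indexes line up.

```python
# Category indexes as used in this thread; index 0 is deliberately unused.
CATEGORY_INDEX = {"tap": 1, "toilet": 2, "shower": 3}

def row_to_category(row):
    """Map a one-hot row, e.g. {'tap': 0, 'toilet': 1, 'shower': 0}, to its index.

    Assumes exactly one column is 1 per row (mutually exclusive categories).
    """
    active = [name for name in CATEGORY_INDEX if row.get(name) == 1]
    if len(active) != 1:
        raise ValueError(f"expected exactly one active category, got {active}")
    return CATEGORY_INDEX[active[0]]

def probabilities_by_name(pdf):
    """Pair each category name with its entry in a pdf indexed by category index."""
    return {name: pdf[idx] for name, idx in CATEGORY_INDEX.items()}

print(row_to_category({"tap": 0, "toilet": 1, "shower": 0}))

# Illustrative values, not real classifier output; pdf[0] is the unused index.
made_up_pdf = [0.0, 0.2, 0.7, 0.1]
print(probabilities_by_name(made_up_pdf))
```

Starting the indexes at 0 instead of 1 would avoid the unused pdf[0] entry, at the cost of renumbering the categories.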

When you are in learning mode, for each iteration, you call classifier.learn(pattern, category).
The pattern is the SDR output you obtained from your SP, TM, or wherever.
The category is the integer value you used for the category index that resulted in that pattern.

Then you can call classifier.infer(pattern) and it should return a numpy array that is the pdf. Each element of that array is a probability that the pattern matches that element’s index. So, to find the probability of that pattern matching ‘shower’ you look at pdf[3] because 3 is the index for ‘shower’.

To find the most likely match you can use numpy.argmax( classifier.infer( pattern) ) to get the index of the category with the largest probability.
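
Put together, the learn and infer steps could look roughly like the sketch below. It is written against any object that exposes learn(pattern, category) and infer(pattern) as described above, so it does not import htm.core itself; the helper names `train` and `predict` are made up for this post.

```python
def train(classifier, samples):
    """Call classifier.learn(pattern, category) once per training sample.

    `samples` is an iterable of (pattern, category) pairs, where the
    pattern is the SDR for that sample and the category is its integer index.
    """
    for pattern, category in samples:
        classifier.learn(pattern, category)

def predict(classifier, pattern):
    """Return (best_index, pdf) for one pattern.

    best_index is the position of the largest probability, which is
    what numpy.argmax would return for a one-dimensional pdf.
    """
    pdf = list(classifier.infer(pattern))
    best_index = max(range(len(pdf)), key=lambda i: pdf[i])
    return best_index, pdf
```

With htm.core the `classifier` argument would presumably be a Classifier instance from htm.bindings.algorithms and each pattern the SDR produced by the SP for that sample. The probability of a given category is then pdf[index], for example pdf[3] for ‘shower’.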

As I have mentioned before, this is a classic NN classifier so it takes quite a few training samples on each category before it can do a reasonable job of guessing the correct match.