Please help me with htm.core

Hi everyone,
I have a problem, please help me.
I am running the MNIST classification code in htm.core. In the training phase there is the command 'sp.compute( enc, True, columns )', so there are two 2-D SDR variables, 'enc' and 'columns'. I want each of the variables 'enc' and 'columns' in the form of a one-dimensional vector of integers, so I used the following commands:

sp.compute( enc, True, columns )
vector = enc.flatten()
columns2 = columns.flatten()
sp_input = np.array(vector, dtype='uint32')
sp_output = np.array(columns2, dtype='uint32')

But I get this error:

sp_input = np.array(vector, dtype='uint32')
TypeError: int() argument must be a string, a bytes-like object or a number, not 'htm.bindings.sdr.SDR'

Please tell me what I should do to get 'enc' and 'columns' in 1-D integer format.

Hi,

The SDR objects have an attribute, SDR.sparse, which holds the indices of the active bits in 1-D integer format. Try:

sp_input = np.array(vector.sparse, dtype='uint32')
sp_output = np.array(columns2.sparse, dtype='uint32')

Also try Python's built-in help:

from htm.bindings.sdr import SDR
help(SDR)
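
For a concrete picture, here is a minimal sketch of the two views an SDR gives you (the 4x4 shape and the bit indices below are made up for illustration):

from htm.bindings.sdr import SDR

# A toy 4x4 SDR with three active bits.
sdr = SDR((4, 4))
sdr.sparse = [1, 5, 9]     # flat indices of the active bits

print(sdr.sparse)          # -> [1 5 9], already a 1-D integer array
print(sdr.dense)           # -> the equivalent 4x4 array of 0s and 1s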

I hope this helps


Thanks a lot.
My problem was solved with SDR.sparse 🙂

I have some more questions

  1. I want to do some calculations in the training phase of the MNIST.py code, and I expect these calculations to improve over more iterations. But running the code on all 60,000 images takes almost four days. Can I still say the model is trained if I reduce the dataset and run it on only 10,000 images?

  2. Is there code that checks the learning of the HTM algorithm and shows that learning improves as the number of iterations increases? (If yes, please give me the code link.)

  3. Is it necessary to have a GPU to run the hotgym.py code (in the examples folder)?

Sincerely,
Katrin

The HTM.Core library should not take 4 days to run MNIST! Unless of course you’ve changed something in the source code. On my desktop workstation, it takes under a minute.

If you built the library from source code, please check that you’re using a release-mode build and not the debug mode.


  1. Running with less data will yield a lower score. Whether that’s okay is up to you and depends on what you are trying to do.

  2. Nope. The htm.core library does not measure how much learning is happening. You just sort of train it until it performs well enough.

  3. The htm.core library does not use GPUs (graphics cards).

Thanks for your answer.
It takes about a minute when I run MNIST.py on its own, but I am also trying to compute a formula that gives the divergence between the input and the output of the spatial pooler algorithm (divergence is a concept from information theory). In this formula, x is a 1-D vector of size 784 and y is a 1-D vector of size 784 (I reshape each 28x28 image array into a vector). I want to compute the divergence between the spatial pooler input and the spatial pooler output during training, and I expected the divergence to become smaller as learning occurs, meaning that the input and output images become more similar and the redundant information less.
But if I do these calculations for all 60,000 MNIST images, it takes 4 days (because there are three nested "for" loops). Am I wrong?
Is the result I expect to see wrong? I ran this test on 10,000 MNIST images, but the obtained divergences were sometimes small and sometimes large, and no specific conclusion could be drawn from them. Does this make sense?

In one of the videos, Matt Taylor said that music was given to the HTM algorithm: after 10 iterations, the output was not good at all and did not resemble the original song; after 20 iterations, it was relatively similar to the original music; and after 25 iterations, it was exactly the same. The point was that the algorithm learns much the way our brain does, rather than simply storing the data. I’m looking for such code, and I wanted to do the same thing by calculating divergences. Do you think this is right?

I don’t really understand how that equation works, so instead I’ll talk about what I do understand.

According to Wikipedia, a divergence is "a function which establishes the 'distance' of one probability distribution to the other on a statistical manifold."

Typically for HTMs, when you want to compute the distance between two SDRs you use the overlap: the number of bits which are active in both SDRs.
See the method SDR.getOverlap.
Usage: percent_overlap = x.getOverlap(y) / x.getSum(), where x and y are SDRs. Since getSum() returns the number of active bits, this gives the fraction of x’s active bits that are also active in y.
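
For example (the sizes and bit patterns below are made up, just to show the calls):

from htm.bindings.sdr import SDR

# Two SDRs with the same dimensions and three active bits in common.
x = SDR(100)
x.sparse = [1, 2, 3, 4, 5]
y = SDR(100)
y.sparse = [3, 4, 5, 6, 7]

overlap = x.getOverlap(y)               # bits active in both -> 3
percent_overlap = overlap / x.getSum()  # fraction of x's active bits -> 0.6
print(overlap, percent_overlap)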


Is there a code that checks the learning of the HTM algorithm

You could try computing the score during training. Just before you call classifier.learn(), run the classifier's inference (classifier.infer() in htm.core) and check whether the HTM gets the label correct. Put the results through a moving average and wait until that average reaches an acceptable score.
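
Here is a rough sketch of that idea. It assumes the SpatialPooler and SDRs from MNIST.py (sp, enc, columns) already exist; the names train_images, train_labels and clsr, and the 0.99/0.01 smoothing factors, are placeholders:

import numpy as np
from htm.bindings.algorithms import Classifier

clsr = Classifier()
accuracy = 0.0     # exponential moving average of correctness
seen_one = False   # infer() needs at least one learn() call first

for image, label in zip(train_images, train_labels):
    # ... encode `image` into the SDR `enc`, then run the spatial pooler ...
    sp.compute(enc, True, columns)

    # Score this example *before* learning from it.
    if seen_one:
        correct = np.argmax(clsr.infer(columns)) == label
        accuracy = 0.99 * accuracy + 0.01 * correct

    clsr.learn(columns, label)
    seen_one = True

print("final moving-average accuracy:", accuracy)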


Thanks for your explanation.
I needed the data distribution to calculate the divergence, but since I did not have it, I estimated the distribution of the data with a Gaussian. I can use this formula to calculate the divergence between the input SDR and the output SDR in both the htm.core code and the nupic-master code.

According to your explanation, is 'percent_overlap' the same as divergence? Is there such a concept in the nupic-master code?

I would not expect that to yield anything meaningful.

It’s not exactly the same thing, but I think it’s the closest equivalent.
If you’re not familiar with SDRs, then you should read this article:

https://arxiv.org/abs/1503.07469

For completeness, here is a list of all of the papers Numenta has published:

https://numenta.com/neuroscience-research/research-publications

Thank you, I will read them all.
