May I ask a question? Could you tell me how to create SDRs from an image, such as the digits in MNIST?
I think converting from images to SDRs isn’t a straightforward thing.
First, an important property of SDRs is:
"Semantically similar data should result in SDRs with overlapping active bits" (Scott Purdy, Encoding Data for HTM Systems).
However, semantic similarity is very ambiguous in 2D image data sets.
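To make the overlap property and its ambiguity for images concrete, here is a minimal sketch in plain Python. It assumes the most naive encoder possible, where each black-and-white pixel is one SDR bit. The tiny 5x5 "images" and the `overlap` and `flatten` helpers are made up for illustration and are not part of any HTM library.

```python
def overlap(a, b):
    """Count positions where both binary vectors have an active bit."""
    return sum(x & y for x, y in zip(a, b))


def flatten(rows):
    """Turn a list of strings such as '01000' into a flat list of 0/1 bits."""
    return [int(ch) for row in rows for ch in row]


# A vertical stroke (a crude "1") in column 1 of a 5x5 image.
stroke = flatten(["01000"] * 5)
# The same stroke shifted two pixels to the right.
shifted = flatten(["00010"] * 5)
# A horizontal bar through the middle row: a different shape.
bar = flatten(["00000", "00000", "11111", "00000", "00000"])

print(overlap(stroke, stroke))   # identical images share all 5 active bits
print(overlap(stroke, shifted))  # same shape, shifted: no shared bits
print(overlap(stroke, bar))      # different shape: 1 shared bit
```

With raw pixels as bits, a shifted copy of the same stroke overlaps less than a completely different shape does, which is one way to see why pixel overlap is a poor stand-in for semantic similarity.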
Second, the high-dimensional space of images is enormous, and most of it is occupied by random images that are meaningless to humans, which makes SDR generation less trivial than numbers->SDRs or dates->SDRs.
Some approaches can do what you want: the convolutional neural network family, and embedding methods such as t-SNE. Each of these can handle MNIST well on its own.
Based on my limited knowledge of neuroscience, the generation of SDRs from retinal input may not happen in the visual cortex or neocortex. There’s a region called the LGN (lateral geniculate nucleus) on the pathway from the retina to the lower layers of the visual cortex. The LGN has a hierarchical structure similar to deep neural networks, and may be responsible for generating some (sparse?) visual representations.
Thanks @hwangtamu, but there is one person who applied HTM to MNIST and got a best result of 95%, so I want to do it too.
I asked him, "Can you help me understand how an SDR encoder works on an image like a digit?"
and he answered: "Matt has a video on it. An encoder is basically a way you choose to represent the data. For images, one way to represent it is to represent all pixels as black and white.
For my implementation, I represented each pixel as 4 bit one-hot vectors concatenated together, where each bit represents an intensity."
I don’t understand that answer. Can you explain it to me?
I’m also a beginner with HTM.
[quote="tikky, post:4, topic:2151"]
An encoder is basically a way you choose to represent the data. For images, one way to represent it is to represent all pixels as black and white.
For my implementation, I represented each pixel as 4 bit one-hot vectors concatenated together, where each bit represents an intensity.
[/quote]
Represent pixels as black and white:
```python
# Send black and white images as binary (0, 1) instead of (0..255)
# https://github.com/numenta/nupic.vision/blob/master/src/nupic/vision/regions/ImageSensor.py#L2121
if self.mode == "bw":
    outputArray /= 255
    outputArray = outputArray.round()
```
For the binary values 0 and 1, one possible pair of corresponding 4-bit one-hot vectors is 0001 and 0010. Each pixel then contributes exactly one active bit out of four, so after concatenation the sparsity is fixed at 0.25.
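Here is a minimal sketch of that encoding in plain Python. It is only my reading of the description, not the quoted poster's actual code: the even split of 0..255 into four buckets, the right-to-left bit order, and the names `encode_pixel` and `encode_image` are all assumptions for illustration.

```python
def encode_pixel(value, n_levels=4, max_value=255):
    """One-hot encode an intensity in 0..max_value into n_levels bits."""
    # Map the intensity to a bucket index 0..n_levels-1.
    # An even split of the range is an assumption, not the poster's rule.
    level = min(value * n_levels // (max_value + 1), n_levels - 1)
    bits = [0] * n_levels
    # Count the active bit from the right, so level 0 -> 0001, level 1 -> 0010.
    bits[n_levels - 1 - level] = 1
    return bits


def encode_image(pixels):
    """Concatenate the one-hot vectors of all pixels into one flat SDR."""
    sdr = []
    for value in pixels:
        sdr.extend(encode_pixel(value))
    return sdr


# A made-up 2x2 grayscale image, flattened row by row.
pixels = [0, 70, 130, 255]
sdr = encode_image(pixels)

print(sdr)
print(sum(sdr) / len(sdr))  # fraction of active bits (the sparsity)
```

Because every pixel sets exactly one of its four bits, the sparsity does not depend on the image content. For a 28x28 MNIST digit this scheme would give 784 × 4 = 3136 bits, of which 784 are active.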
Thanks. So after the encoder, the next step is the spatial pooler.
I have this picture,
but I don’t understand what the columns are. Can you tell me?
@tikky It seems like you need to spend some more time researching HTM theory, especially Spatial Pooling, if you don’t understand what the columns in that image represent. There are many resources for you here:
@tikky Here is the experiment that applies HTM to MNIST digit classification. It uses the ImageSensor in nupic.vision to encode images. https://github.com/numenta/nupic.vision/tree/master/src/nupic/vision/mnist