Hi everyone,
I am looking for code with which I can feed a specific dataset to the SP algorithm and then reconstruct the original data from the SP's output. For example, feed one of the MNIST digits to the SP, take the sparse output of the SP algorithm, and reconstruct the original input from it.
Can anyone help me with this? Could you send me the code or an article, and explain how to do this reconstruction from the SP?
Thanks a lot
Hi everyone,
“reconstruct the output of the SP and get the original data”
Sounds like the SDR Classifier is doing what you describe here.
If I’m not mistaken, the SDR Classifier only does classification, not reconstruction. That is, it will only recover the digit category, not reconstruct the original image. What you are probably looking for is something like an autoencoder. I am currently unaware of any SDR autoencoders, but it sounds like a fun project.
Any reason not to keep both original digit images and corresponding generated SDRs in the same order?
And yes, the classifier tells you the category, not the original data.
For approximate search you can either compare your unknown SDR against all known SDRs, or use an indexing method or associative memory.
Bit-pair maps like dyadic memory or sdr-id-map could work to a certain extent; the results depend on the solidity of the SDR and how the bits are distributed.
A more reliable method would be to use an ANN (Approximate Nearest Neighbor) indexer that accepts an overlap-based distance function for SDRs, like pynndescent.
PS: How big are your SDRs (total bits, and what % of 1s)?
PS: A straight k-NN search would work too, but it’s quite sluggish.
No, I don’t think the SDR Classifier does reconstruction.
I am looking for code that can generate the original data from the SDR, as shown below.
I would be grateful if you could send me code, or explain a little more about how to do this.
I hacked together this demo some time ago to demonstrate something similar to what I think you are trying to do. Let me know if you find it interesting or want to discuss it further.
Atoms in the dictionary are initialized by sub-sampling from a set of random images in the training set. The samples are obtained by splitting the 28x28 pixel images into a 4x4 array of 7x7 pixel patches. Thereafter these atoms are used as an over-complete basis set to encode portions of subsequent images. The encoding selects the best atom by direct projection (dot product of image and basis atom) to obtain a correlation coefficient. The product of this coefficient and the basis atom is subtracted from the image leaving a residual. This residual is then subjected to the same procedure as before to select the next atom that best captures the image features that were not present in the first atom. This continues until the atom limit is reached or the magnitude of the residual falls below a minimum threshold. The reconstructed image is then displayed along with the residual.
NOTE: This demo is not currently learning or adapting the atoms after the initial sampling stage. This simple choice for the basis set yields some fairly impressive results, which can best be appreciated by comparing them to the reconstructions that result if you enable the “random atoms” checkbox in the menu. A slightly better implementation would be to add new atoms only if the residual cannot be sufficiently captured by a linear combination of the existing atoms.
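For anyone who wants to play with the idea before digging into the demo itself, here is a minimal sketch of the matching-pursuit style encode/reconstruct loop described above. This is not the demo's code: the dictionary atoms here are random unit vectors rather than 7x7 patches sub-sampled from training images, and all sizes are illustrative.

```python
import numpy as np

# Illustrative dictionary: 32 random unit-norm atoms over a flattened
# 7x7 patch. In the demo these are sub-sampled from training images.
rng = np.random.default_rng(0)
n_atoms, patch_size = 32, 49
atoms = rng.standard_normal((n_atoms, patch_size))
atoms /= np.linalg.norm(atoms, axis=1, keepdims=True)

def encode(patch, max_atoms=5, min_residual=1e-3):
    """Greedy matching pursuit: repeatedly pick the best atom by direct
    projection, subtract its contribution, and keep the residual."""
    residual = patch.astype(float).copy()
    recon = np.zeros_like(residual)
    for _ in range(max_atoms):
        coeffs = atoms @ residual             # correlation coefficients
        k = int(np.argmax(np.abs(coeffs)))    # best-matching atom
        recon += coeffs[k] * atoms[k]         # accumulate reconstruction
        residual -= coeffs[k] * atoms[k]      # what the atoms missed
        if np.linalg.norm(residual) < min_residual:
            break
    return recon, residual

patch = rng.standard_normal(patch_size)
recon, residual = encode(patch)
```

By construction the reconstruction plus the residual recovers the input patch, and each greedy step can only shrink the residual's norm, which mirrors the stopping rule described above.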
To “reconstruct the SP algorithm” (i.e., recover the input), you could use the information about connected synapses stored in the SP data structure. The first step would be to get the synapses with the function getConnectedSynapses.
Here is an example from the NuPIC GitHub repo:
# Spatial Pooler example from the NuPIC Walkthrough (NuPIC runs on Python 2)
import numpy
# Note: the SpatialPooler import path differs across NuPIC versions.
from nupic.research.spatial_pooler import SpatialPooler

sp = SpatialPooler(inputDimensions=(15,),
                   columnDimensions=(4,),
                   potentialRadius=15,
                   numActiveColumnsPerInhArea=1,
                   globalInhibition=True,
                   synPermActiveInc=0.03,
                   potentialPct=1.0)

for column in xrange(4):
    connected = numpy.zeros((15,), dtype="int")
    sp.getConnectedSynapses(column, connected)
    print connected
To refer to your MNIST example:
The column parameter of getConnectedSynapses should now be set, in turn, to each active “SDR bit” from the middle row of the image you posted. The connected bits from the original input then end up in “connected”; from these you can draw conclusions about which bits were active in the original.
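To make that last step concrete, here is a hedged sketch of combining the connected masks of the active columns to guess which input bits drove them. The connectivity matrix below is a random stand-in for what repeated calls to sp.getConnectedSynapses() would fill in, not output from a real SpatialPooler:

```python
import numpy as np

# Random stand-in for the SP's connectivity: connected[c, i] == 1 means
# column c has a connected synapse to input bit i (4 columns, 15 inputs,
# matching the walkthrough example's dimensions).
rng = np.random.default_rng(0)
n_columns, n_inputs = 4, 15
connected = (rng.random((n_columns, n_inputs)) < 0.3).astype(int)

active_columns = [0, 2]                        # the "on" SDR bits
votes = connected[active_columns].sum(axis=0)  # per-input-bit vote count
likely_active = np.flatnonzero(votes)          # bits wired to any active column
```

Input bits that receive votes from several active columns are better candidates than bits voted for by only one, so thresholding `votes` rather than taking any nonzero entry usually gives a cleaner guess.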
If you get stuck or are looking for another solution, you can also show us your code.
Hello @CollinsEM,
That was interesting. Is there code for this implementation? My goal is to have the input vector, the sparse vector, and the reconstructed vector, and to perform multiple operations on them. Is there any more explanation of this method in a white paper or on a website?
Thank you very much for your explanation, @Markus.
But I still don’t fully understand how to feed an image to the SP code and get it reconstructed. If possible, can you explain more, or send me the code related to the reconstruction?
I got good-enough results by simply transposing the SP and running it backwards.
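In case it helps, here is a rough sketch of that transpose trick. The spatial pooler here is a crude stand-in (a random binary connection matrix plus top-k winner selection), not NuPIC's SP, so treat all the numbers as placeholders:

```python
import numpy as np

# Stand-in SP: W[c, i] == 1 if column c is connected to input bit i.
rng = np.random.default_rng(0)
n_inputs, n_columns = 784, 1024      # e.g. a 28x28 image -> 1024 columns
W = (rng.random((n_columns, n_inputs)) < 0.05).astype(float)

def sp_forward(x, k=20):
    """Crude SP: overlap scores, then keep the top-k columns."""
    overlaps = W @ x
    active = np.zeros(n_columns)
    active[np.argsort(overlaps)[-k:]] = 1.0
    return active

def sp_backward(active):
    """Transposed SP: each active column votes for the input bits it is
    connected to; threshold the votes to get a binary reconstruction."""
    votes = W.T @ active
    return (votes >= votes.max() * 0.5).astype(float)

x = (rng.random(n_inputs) < 0.1).astype(float)   # random binary "image"
recon = sp_backward(sp_forward(x))
```

The 50%-of-max vote threshold is arbitrary; with a trained SP you would tune it (or keep the raw vote counts as a grayscale reconstruction) to trade off false positives against missed bits.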
What I was proposing was to search for the training sample that best matches your test (or query) sample:
- Considering MNIST’s 60,000 training digit images, you use the spatial pooler to generate 60,000 corresponding SDRs.
- You preserve the one-to-one correspondence between each digit image and its SDR in a, let’s call it, training table.
- When you come across an unknown SDR, instead of trying to compute a matching image, you search for the best-matching SDR in your training table and return the corresponding image.
Results might not be perfect, but considering that “SDR machinery” in general is based on approximate matching, and that a “reverse spatial pooler” would be approximate too, this might be an acceptable replacement for an end-to-end trained autoencoder.
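A minimal sketch of that training-table lookup, with random arrays standing in for the MNIST images and the SDRs a spatial pooler would produce for them:

```python
import numpy as np

# Training table: 1000 random "images" paired with random 2048-bit,
# ~2%-dense "SDRs" (placeholders for MNIST digits + pooler output).
rng = np.random.default_rng(1)
n, sdr_bits = 1000, 2048
train_sdrs = (rng.random((n, sdr_bits)) < 0.02).astype(int)
train_images = rng.random((n, 28, 28))

def best_match(query_sdr):
    """Return the stored image whose SDR shares the most 1-bits
    (highest overlap) with the query SDR."""
    overlaps = train_sdrs @ query_sdr
    return train_images[int(np.argmax(overlaps))]
```

For 60,000 SDRs this brute-force overlap scan is still fast as a single NumPy matrix product; an ANN index like the pynndescent suggestion above only becomes important at much larger scales or tighter latency budgets.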
That being said, if you want to test an actual “inverter”, you can try using a neural network like a single- or multi-layer perceptron (available in the sklearn Python module) and train it to generate the original MNIST digits from their corresponding SDRs.
I don’t know how computationally expensive that would be, but it should be an interesting experiment.
PS: That way you would get an end-to-end autoencoder, the encoding part being the spatial pooler and the decoding part being the trained MLP.
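A sketch of that experiment using sklearn's MLPRegressor (assumed available). The SDRs and "images" below are random placeholders, so no meaningful reconstruction is expected here; this only shows the wiring of the decoder half:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Placeholders: 200 "SDRs" (64 bits, ~10% on) paired with 200 flattened
# 7x7 "images". In the real experiment these would be the spatial
# pooler's outputs and the corresponding MNIST digits.
rng = np.random.default_rng(0)
n, sdr_bits, pixels = 200, 64, 49
sdrs = (rng.random((n, sdr_bits)) < 0.1).astype(float)
images = rng.random((n, pixels))

# Decoder MLP: learns the SDR -> image mapping (multi-output regression).
decoder = MLPRegressor(hidden_layer_sizes=(128,), max_iter=200,
                       random_state=0)
decoder.fit(sdrs, images)
recon = decoder.predict(sdrs[:5])   # reconstruct five "images"
```

With real data you would train on the 60,000 (SDR, digit) pairs from the training table above and evaluate reconstructions on held-out digits, e.g. by pixel-wise mean squared error.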