I’m trying to fill in some gaps and confirm my understanding of HTM’s details, so I have a few very basic questions about the spatial pooler:
The purpose of the spatial pooler is to take in a binary vector with a sparse set of ON bits, and then determine which of the columns of cells “win” (will there always be a set percentage of winning columns, or is the sparsity of the winning columns just a result of the input vector’s sparsity?). The winning columns are selected by computing each column’s overlap score with the input vector: look at the bits the column has connected synapses with and take the sum of those that are ON; this sum is the column’s overlap score. The reason the spatial pooler does this is to make the same (or similar) columns win when similar inputs are fed in. It filters out noise, in a sense, and forms an invariant representation of the variations of a single pattern. Is this a fair description of the spatial pooler’s purpose?
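To make sure I have the mechanics right, here is a toy sketch of how I picture the overlap computation (plain NumPy; all the names and numbers are mine, not from any real implementation):

```python
import numpy as np

rng = np.random.default_rng(42)

input_bits = 64    # length of the binary input vector
num_columns = 16   # columns in this toy spatial pooler

# Each column has connected synapses to a random subset of input bits
# (a boolean mask standing in for the connected synapses).
connected = rng.random((num_columns, input_bits)) < 0.3

# A sparse binary input vector (~10% ON bits).
input_vec = (rng.random(input_bits) < 0.1).astype(np.int64)

# Overlap score: for each column, count the ON input bits it is connected to.
overlaps = connected.astype(np.int64) @ input_vec
print(overlaps)  # one overlap score per column
```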
Does the spatial pooler have any role in finding patterns between multiple input fields? For example, if I have field-1 and field-2, encode them each into binary vectors, then concatenate these vectors and feed them into a spatial pooler, does it find spatial patterns between them? E.g. does it learn that when field-1’s value spikes, field-2’s value dips? My current understanding says yes, it does; but I’m having some difficulty putting into words exactly why that is the case…
The columns in an HTM layer form synapses with the input vector, but how many, and where? Is the number of synapses a column can form configured by a parameter, or is there some accepted standard? Can a column form synapses with any bit of the input vector, or is this range restricted? Will there ever be a bit in the input vector that has not formed a synapse with any of the SP’s columns? Also, can two columns form a synapse with the same bit of the input vector?
Once a column determines which bits it will set up synapses with, is it possible for that to change? I know that the permanence of a synapse is variable and changes constantly, but I’m talking about a column moving a synapse to an entirely different bit of the input vector. My current understanding says no, you can’t change the position of a synapse during an HTM’s lifetime: the positions of the synapses are determined upon the network’s creation and remain the same forever thereafter.
How does one know how many columns a spatial pooler needs to have? Is there a minimum number based on the length of the input vector? Suppose I have an input vector 5 bits long and a single column in my spatial pooler (I know, completely unrealistic, but it’s just for illustrative purposes…). If I let this single column form 5 synapses, then each bit of the input vector would be covered by a synapse. Or consider a less extreme example: say I have an input vector of 2048 bits, and I let each column in my SP form 2 synapses; would 1024 columns then suffice? Would it even serve a purpose to have any more columns?
This might be a more subjective question, but what is the most logical way to visualize a spatial pooler? Wouldn’t a 2D grid of cells make the most sense? I know that HTMs are all currently simulated in software (there aren’t any hardware implementations of them right now), so there is no actual structure like this in the computer’s memory, but is this the way one ought to form a mental image of them? That appears to make the most sense to me, because the input is a 1D binary vector, and the SP columns are positioned on top of the vector and extend synapses toward bits of the vector within a certain range of themselves (or is there no such concept, in current HTM implementations, of the distance between a column and a particular index of the input vector?). I know nobody is going to make me think of it in a particular fashion - I’m mostly wondering if there is some fundamental concept I might be missing by picturing things this way?
Of course, if this were the best way to picture it, then how do images like these make any sense?:
I understand that in the brain, columns are arranged in three dimensions, and SDRs in the brain aren’t 1D either. Hopefully this question makes some sense… I’m having a bit of difficulty pinpointing precisely what I’m trying to ask.
I do believe that is it for now. I think I have a reasonable, high-level understanding of the theory behind the spatial pooler, but there are just several details I am not quite sure about.
Hi Hopding, you have a good understanding of the SP. I’ve addressed a few of your questions below. For the rest I recommend taking a look at the BaMI section for the SP algorithm, an updated version of the original CLA whitepaper.
This is an apt description of the SP. To answer your question, the sparsity of inputs to the SP can vary greatly, but the SP will reliably activate an approximately fixed number of columns. This is accomplished through columnar inhibition.
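For illustration, here is a minimal sketch of that selection step (plain NumPy, not the actual SP code): given the overlap scores, global inhibition simply keeps a fixed number of top-scoring columns, which is what holds the output sparsity roughly constant regardless of input sparsity.

```python
import numpy as np

def global_inhibition(overlaps, num_active):
    """Return the indices of the num_active columns with the highest overlap."""
    # argpartition is an O(n) top-k selection; ties are broken arbitrarily.
    winners = np.argpartition(overlaps, -num_active)[-num_active:]
    return np.sort(winners)

overlaps = np.array([3, 0, 7, 2, 5, 1, 4, 6])
print(global_inhibition(overlaps, num_active=2))  # -> [2 7]
```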
This depends on your method of concatenation. Using the Multiencoder, as Matt suggested, is best; see the example here. A naive concatenation method, like summing the vectors, may wash out spikes between field-1 and field-2, and other (more subtle) semantics in the data.
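To see the difference concretely, here is a toy comparison (the field encodings below are made up):

```python
import numpy as np

field1 = np.array([1, 0, 1, 0])  # toy encoding of field-1
field2 = np.array([0, 1, 0, 1])  # toy encoding of field-2

# Concatenation keeps each field's bits in their own region, so the SP
# can learn correlations between the two regions.
concatenated = np.concatenate([field1, field2])  # [1 0 1 0 0 1 0 1]

# Element-wise summing collapses both fields onto the same bits, so the
# SP can no longer tell which field an ON bit came from.
summed = field1 + field2                         # [1 1 1 1]
```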
No two columns connect to the exact same set of input bits, but there is usually overlap. The SP initializes a set of potential synapses for each column. Which input bits these correspond to is a random selection of the input space. The permanence values are initialized somewhat randomly – random values within a range around the connectedPerm param, and a bias towards centering the column over the input space.
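Roughly, the initialization for one column looks like this (a simplified sketch, not the actual SP code; the parameter names follow the whitepaper):

```python
import numpy as np

rng = np.random.default_rng(0)

input_bits = 64
potential_pct = 0.5     # fraction of the input a column may ever connect to
connected_perm = 0.2    # permanence threshold for a connected synapse

# Random potential pool: the input bits this column may form synapses with.
potential = rng.choice(input_bits, size=int(input_bits * potential_pct),
                       replace=False)

# Permanences start in a narrow band around connected_perm, so some synapses
# begin connected and others sit just below the threshold.
permanences = connected_perm + rng.uniform(-0.1, 0.1, size=potential.size)

connected_bits = potential[permanences >= connected_perm]
```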
Brains are plastic and it follows that the SP is as well. An HTM region tries to best represent the input with its given resources. The SP does this through the inhibition rule, boosting, synapse permanence increments/decrements, and also adding/forgetting synapses based on their contributions.
Generally, with more columns in an HTM region, each column will represent larger and more detailed patterns in the input. I would also consider the granularity of your input. That is, if you have 2048 input bits but only 4 possible values, those dimensions are overkill; but if you expect real-valued numbers anywhere from 0-100, they may be appropriate.
I agree that a 2D array of columns is the best logical organization for visualizing the SP (and HTMs in general), but this arrangement is not required in HTMs. I’ve always liked Chetan’s visualization from our 2014 hackathon.
@alavin Thank you very much for taking the time to write such a detailed reply and address all of my questions! You’ve cleared up much of my confusion.
I did find the time to watch that video of Chetan’s presentation - it was fantastic! I hadn’t come across that video before, but it certainly does a great job of going through many of the SP and TM details, very helpful.
I’ve read the HTM whitepaper, but I’ve only started on the BaMI. I wasn’t aware there was a Spatial Pooling chapter already for it. I’ll be sure to read through that as soon as I can and I’m sure it will enhance my understanding. Thanks again @alavin!
I’m happy to help @hopding
The current SP chapter of BaMI covers the algorithm and implementation, and I’m writing the next version now to add discussion on SP concepts, applications, and examples. Spatial Pooling questions from you and others in the community help guide me on material to include, so thank you.
I’m coding a Spatial Pooler. I have a 400x200 SDR-type matrix. This matrix actually contains an encoded, binary-scored word embedding.
I assign my input vectors as dense values.
Do I feed the data to the spatial pooler row by row in htm.core? Or should the whole matrix be used in bulk (2-D) in spatial pooling before passing to the temporal stage? I keep reading articles.
Can you help with the coding steps?
My results are as follows.
Note: the small coding example here uses a 44x16 matrix, and I assigned all rows to the input in 2D.
HTM is at its core a temporal memory system, so yes, the data is processed one row at a time. Synaptic weights and connections may be mutated at each time step.
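For example, a minimal sketch of that loop with htm.core (the parameter values are placeholders to adapt; I’m assuming your 400x200 matrix is a NumPy array of 0s and 1s):

```python
import numpy as np
from htm.bindings.sdr import SDR
from htm.bindings.algorithms import SpatialPooler, TemporalMemory

rows = np.random.randint(2, size=(400, 200), dtype=np.uint8)  # stand-in data

sp = SpatialPooler(inputDimensions=(200,),
                   columnDimensions=(1024,),
                   globalInhibition=True,
                   localAreaDensity=0.02,  # ~2% of columns active
                   seed=1)
tm = TemporalMemory(columnDimensions=(1024,),
                    cellsPerColumn=16,
                    seed=1)

inputSDR = SDR((200,))
activeColumns = SDR((1024,))
for row in rows:                                # one time step per row
    inputSDR.dense = row                        # load this row into the input SDR
    sp.compute(inputSDR, True, activeColumns)   # SP learns as it runs
    tm.compute(activeColumns, learn=True)       # TM learns at the same time
    activeCells = tm.getActiveCells()           # cell activity for this step
```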
Are you saying that the input has a 2D topology? If so, you must create a 2D topology in the SP as well. Watch this video first: HTM School Episode 10: Topology
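For reference, a topological SP configuration would look roughly like this (a sketch; the column dimensions and radius are placeholders):

```python
from htm.bindings.algorithms import SpatialPooler

sp = SpatialPooler(inputDimensions=(400, 200),  # 2-D input topology
                   columnDimensions=(40, 20),   # 2-D column topology
                   potentialRadius=16,          # columns see a local input patch
                   globalInhibition=False,      # local inhibition uses the topology
                   localAreaDensity=0.02,
                   wrapAround=False,
                   seed=1)
```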
It is hard to tell what your graphs are saying without understanding more about the semantics of the data. 400x200 is a pretty large input vector. How many minicolumns are you using?
Thanks for your reply.
Actually, there is a 1-D SDR for each word. Each SDR contains 200 bits. That makes 80,000 bits for 400 words.
We calculate each of them separately, word by word. Otherwise, the SP active output is calculated for the whole input matrix (2-D, 400x200). The graphs are given for a small example.
Ah, ok, then definitely use global inhibition in the SP (no topology). So input space to the SP is 200 bits.
If you are going to run the SP and TM, we usually run them both simultaneously and let them learn together. Are you finding anything unexpected in your graphs?
Oh, sorry. I don’t expect any particular results or graphs for now; I experimented on a small corpus. I have to work hard. Maybe I’m testing the wrong way, so I wanted to ask you.
So I’ll run the SP and TM for each word.
Finally, I complete the process by classifying.
Thank you so much.
There is a .py file named sptutorial in htm.core. At the end of that code there is a numexamples variable, and this variable is randomized. Can I generate subsample values from the original SDR instead of these randomized values?
According to the above code, is a minicolumn value calculated with each compute() call? And according to this code, how do I get the temporal memory input value?
My application and results are below.
Is “outputSDR” the input value for the TM? Is the minicolumn length equal to the length of “outputSDR”?
There’s an “sdr” for every word. These words actually come from a vocabulary. I created a subsample for each.
And I’ve run the compute function for each subsample.
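Here is roughly what I mean by subsampling (a toy sketch; the function name is mine):

```python
import numpy as np

rng = np.random.default_rng(0)

def subsample(sdr_bits, keep=0.8):
    """Randomly keep a fraction of an SDR's ON bits, zeroing the rest."""
    on = np.flatnonzero(sdr_bits)
    kept = rng.choice(on, size=int(len(on) * keep), replace=False)
    out = np.zeros_like(sdr_bits)
    out[kept] = 1
    return out
```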
You mean “concatenate all these” subsamples for each word? Then should I use the SP output of each word as the TM input?