Network of SpatialPoolers

Hi all,

has anybody tried to make a network based on SpatialPoolers? For example, SP1 maps all input bits and generates SDR1, which is used as the input of SP2, which generates SDR2, and so on.
Would such a network make sense? Are there any results regarding this?

Some papers indicate that such a network is able to generate some sort of more stable, invariant representation.
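For concreteness, the stacked arrangement described above can be sketched roughly like this. This is a toy winner-take-all pooler with random connections, not the actual NuPIC/htm.core SpatialPooler API — just enough to show the SP1 → SDR1 → SP2 → SDR2 wiring:

```python
# Toy sketch of stacking poolers: SP1 consumes the raw input, SP2 consumes
# SP1's output SDR. NOT a real HTM SpatialPooler -- no learning, no boosting,
# just random connections and top-k winner-take-all.
import numpy as np

class ToySP:
    def __init__(self, input_size, num_columns, sparsity=0.02, seed=0):
        rng = np.random.default_rng(seed)
        # each column connects to a random subset of the input bits
        self.connections = rng.random((num_columns, input_size)) < 0.5
        self.k = max(1, int(num_columns * sparsity))  # active columns per step

    def compute(self, input_sdr):
        overlaps = self.connections @ input_sdr       # overlap score per column
        winners = np.argsort(overlaps)[-self.k:]      # top-k winner-take-all
        out = np.zeros(self.connections.shape[0], dtype=int)
        out[winners] = 1
        return out

sp1 = ToySP(input_size=1000, num_columns=400, seed=1)
sp2 = ToySP(input_size=400, num_columns=100, seed=2)

x = (np.random.default_rng(3).random(1000) < 0.05).astype(int)  # 5%-sparse input
sdr1 = sp1.compute(x)      # SP1 output becomes...
sdr2 = sp2.compute(sdr1)   # ...SP2 input
```

Each stage produces a fixed-sparsity SDR, so the chain can in principle be extended to any depth; whether the deeper representations are more stable or invariant is exactly the open question.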

Thanks in advance.


Welcome to the community.

"has anybody tried to make a network based on SpatialPoolers? "
It has been done several times.
Here is an example.


Thanks for the quick response.
Unfortunately, I do not see anything related to an SP network in the provided example. It looks rather like an SP with a changed boosting implementation?


Maybe this is the topic you are looking for: Can you stack multiple spatial pooler regions on top of each other?

I’m curious about these papers. Could you please share a link?

Looking for them. Will come back on this…


Thanks. This is what I’m looking for.
But still, the question remains whether anyone (else) has tried to cascade SPs. Does this make sense from a theoretical point of view? For example, to build the following by using SPs only:



It does not make sense from an HTM theory point of view. At least I am not aware of a network like this in the cortical column we are investigating.



@ddobric on the other hand, I wouldn’t dismiss the idea as uninteresting!
Off the top of my head, two possible experiments:

  • Quantization:
    • task: going from a 1x100k input to, say, a 1x100 SDR.
    • is it better to go in one step (1x100k input, SP with 100 columns), or with several (how many?) intermediate layers (1x100k -> 1x10k SP -> 1x100 SP)?
  • Input merging:
    • task: merging several inputs of (different?) modalities (say sound, vision, language, …)
    • when is it better to use a single SP + multiencoder (concatenating the inputs), and when to use sub-networks?
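A scaled-down sketch of both experiments. Sizes are reduced (1000 → 50 instead of 100k → 100) so it runs quickly, and the `pool` helper is my own toy top-k random projection, not an HTM API:

```python
# Sketch of the two experiments: (1) quantize a large input either in one step
# or through an intermediate pooler; (2) merge two "modalities" by
# concatenation before pooling. The pooler is a toy, not a real SpatialPooler.
import numpy as np

def pool(input_sdr, num_columns, k, seed):
    """Toy pooler: random connections + top-k winner-take-all."""
    rng = np.random.default_rng(seed)
    connections = rng.random((num_columns, input_sdr.size)) < 0.5
    overlaps = connections @ input_sdr
    out = np.zeros(num_columns, dtype=int)
    out[np.argsort(overlaps)[-k:]] = 1
    return out

rng = np.random.default_rng(0)
x = (rng.random(1000) < 0.05).astype(int)

# experiment 1a -- in one go: 1000 -> 50
direct = pool(x, num_columns=50, k=5, seed=1)

# experiment 1b -- with an intermediate layer: 1000 -> 200 -> 50
mid = pool(x, num_columns=200, k=20, seed=2)
staged = pool(mid, num_columns=50, k=5, seed=3)

# experiment 2 -- input merging: concatenate two modalities, pool jointly
sound = (rng.random(500) < 0.05).astype(int)
vision = (rng.random(500) < 0.05).astype(int)
merged = pool(np.concatenate([sound, vision]), num_columns=100, k=5, seed=4)

print("direct:", direct.sum(), "staged:", staged.sum(), "merged:", merged.sum())
```

Comparing how well input similarity is preserved by `direct` vs. `staged` (e.g. overlap of outputs for overlapping inputs, with fixed connections) would be the actual experiment.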

TL;DR: this question concerns the hierarchical part of HTM; it (probably) does not have to include the temporal processing.

If you’re interested in writing a paper on any of the above topics, let’s get in touch.


HTM seeks to model and understand what is going on in your head. We can quibble about details, but it is clear that as data courses from area to area it stays in a mostly parallel form. The maps of the various sensory fields, while “crazy quilted”, are preserved in the hierarchical processing.

The model shown by @ddobric shows a concentration that is not found in any of the papers I have read regarding cortical processing.

So where does this all come together?
I tossed this drawing together from memory, so it is sure to be missing some connections. I skipped the thalamic connections altogether, as this was already a very complicated drawing.

The blue connections in the various sensory streams are essentially in parallel form.
The broad green hub connections connecting the various highest levels in each lobe are also parallel.
I coded the data transformation connections in orange.
The EC/HC complex has some scaling, where representations at multiple scales exist, but each maintains topography. The rest of the limbic system does not look anything like the cortex; the processing models look more like classic Boltzmann networks, or perhaps a Hopfield network.
What is presented to the lower brain “wraps around” the complexes, transforming the topographic representation into a “spherical input” form. The limbic clusters don’t work like the cortex, so it should not be surprising that the topological format is transformed as the data is conveyed. The connections between the inputs to the limbic system and the outputs to the prefrontal cortex all do some data rearrangement to make the data formats compatible.


The reason I was curious about the papers (I don’t see any link though) is that at one point I was asking myself the same question. The question came up while I was studying the encoders, and somehow it made me think of using the SP as an encoder for input that already has semantic meaning in it. Therefore, I thought of stacking two SPs: the first SP processes (encodes) the input, and the second SP uses the output of the first as its input. This does not answer your main question, but my thoughts on this point to a potential use case.

What is the data transformation you are looking for at the second level of processing?

It’s really part of the larger question: what is an SP doing? If you say that it is segmenting/grouping inputs at a certain scale, then applying the same thinking to the pattern that emerges is hoping that nothing at the local scale is lost in forming a larger-scale pattern. This is strongly dependent on the nature of what is being recognized. I can imagine cases where the local information is critical to correct global recognition.

The TBT or my hex-grid thing both work on the principle that as the local SP is processing input, the lateral connections are interacting to sort through the possible things that might fit at the local level. The recognition is then distributed over a wide area, where the local voting propagates far beyond the reach of the local SP. This is a relaxation process where the better the local fit, the faster the units settle on an answer. And it is a strictly local process.
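A toy illustration of that relaxation/voting idea (my own simplification, not the actual hex-grid implementation): each local unit holds a score distribution over candidate objects and repeatedly mixes in its neighbors’ votes until the units agree.

```python
# Toy relaxation / lateral-voting sketch: units on a ring repeatedly blend
# their own candidate scores with their neighbors' scores. Local evidence is
# noisy, but the voting settles the whole population on one candidate.
import numpy as np

num_units, num_candidates = 10, 4
rng = np.random.default_rng(0)

# noisy local evidence: every unit weakly prefers candidate 2, plus noise
votes = rng.random((num_units, num_candidates))
votes[:, 2] += 0.3
votes /= votes.sum(axis=1, keepdims=True)

for step in range(50):
    # each unit blends its own vote with those of its two ring neighbors
    neighbors = (np.roll(votes, 1, axis=0) + np.roll(votes, -1, axis=0)) / 2
    votes = 0.5 * votes + 0.5 * neighbors
    votes /= votes.sum(axis=1, keepdims=True)

winners = votes.argmax(axis=1)
print("settled on:", winners)  # all units converge on the same candidate
```

The strictly local averaging is the point: no unit sees the whole population, yet the agreement propagates far beyond any unit’s direct neighbors.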

Nothing very specific. The idea was to use the SP to naturally encode inputs unsupervised, but the training is supervised until it generates stable outputs. Basically, replacing encoders that require a priori knowledge. The whole business of measuring stability is another story, though.
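One way to operationalize “stable outputs” (an assumption on my part, not necessarily the metric meant here): feed the same input repeatedly while a toy pooler learns, and track the overlap between successive output SDRs. Training could be considered stable once the overlap stays at 100%.

```python
# Sketch of a stability measurement: a toy pooler with Hebbian-style
# permanence updates sees the same input over and over; stability is the
# fraction of active columns shared between consecutive outputs.
# This is an illustrative toy, not the htm.core SpatialPooler.
import numpy as np

rng = np.random.default_rng(0)
input_size, num_columns, k = 200, 100, 5
permanences = rng.random((num_columns, input_size))  # connection strengths

def compute(x, learn=True):
    global permanences
    connected = permanences > 0.5            # connected synapses
    overlaps = connected @ x
    active = np.argsort(overlaps)[-k:]       # top-k winning columns
    if learn:
        # winners strengthen synapses to active inputs, weaken the rest
        permanences[np.ix_(active, x == 1)] += 0.05
        permanences[np.ix_(active, x == 0)] -= 0.05
        permanences = permanences.clip(0.0, 1.0)
    out = np.zeros(num_columns, dtype=int)
    out[active] = 1
    return out

x = (rng.random(input_size) < 0.1).astype(int)  # fixed 10%-sparse input
prev = compute(x)
for step in range(20):
    cur = compute(x)
    overlap = (prev & cur).sum() / k  # fraction of shared active columns
    prev = cur
print("final overlap:", overlap)  # 1.0 once the winner set stops changing
```

Generalizing this to a whole input set (average successive overlap over all training inputs) would give a single stability score to monitor during training.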

Agreed, but then the encoders do not escape this dilemma either.

It would be nice to see an illustration.

This should do it:

Thanks a lot I will have a read.