Someone just linked me to your post when I asked about stacking layers of spatial poolers and temporal memories. I didn’t think to stack JUST spatial poolers, but from the results on random floats, it looks rather promising.
Now, I’m still very new to this field of research, but I have a basic understanding of the mechanics of HTM components, so I’ll offer my best guess as to what’s happening.
What I think is happening is this: while one layer is able to form a good, albeit “shallow,” SDR of the random values, a whopping EIGHT SPs all work together to encode a structure (could I call it a three-dimensional structure?) that’s much, much deeper than a single layer’s. In my opinion, this could yield more detailed and accurate representations, and thus better predictions and outputs. What the overlap graphs tell me (and again, still new here!) is that the SP stack was able to narrow down an allocation of columns specifically for these values, and, with that greater representational accuracy, can much more clearly “predict” the value 0.5. I don’t know if “predict” is the right term here, but it’s clear that the more SPs are stacked, the less overlap there is for a given value.
Of course, I do worry about the classic ML problem of overfitting the data, which may or may not be happening here. What happens when you feed it other values, like 0.1 or 0.7? Do you still get a narrow square, or curve, in the overlap graphs?
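If it helps, here’s a rough sketch of how I’d check that myself. I’m assuming the htm.core community fork here, and the encoder/SP parameters below are just illustrative guesses, not the values from your experiment:

```python
# Sketch: stack several SpatialPoolers in series, train on random floats,
# then sweep test values against 0.5 and report overlap at each depth.
# Assumes htm.core (pip install htm.core); parameters are illustrative.
import numpy as np
from htm.bindings.sdr import SDR
from htm.bindings.algorithms import SpatialPooler
from htm.encoders.rdse import RDSE, RDSE_Parameters

N_LAYERS = 8   # depth of the SP stack
WIDTH = 1024   # bits/columns per layer

# Scalar encoder for the random floats (parameters are guesses).
p = RDSE_Parameters()
p.size = WIDTH
p.sparsity = 0.02
p.resolution = 0.01
encoder = RDSE(p)

# Build the stack: each SP reads the active columns of the layer below it.
stack = [
    SpatialPooler(
        inputDimensions=[WIDTH],
        columnDimensions=[WIDTH],
        potentialRadius=WIDTH,   # let every column see the whole input
        potentialPct=0.85,
        globalInhibition=True,
        localAreaDensity=0.02,
        boostStrength=0.0,
        seed=layer + 1,
    )
    for layer in range(N_LAYERS)
]

def run_stack(value, learn):
    """Encode a scalar, push it through each SP layer in turn,
    and return the output SDR of every layer."""
    sdr = encoder.encode(value)
    outputs = []
    for sp in stack:
        out = SDR([WIDTH])
        sp.compute(sdr, learn, out)
        outputs.append(out)
        sdr = out  # this layer's output feeds the next layer
    return outputs

# Train on random floats, like the original experiment.
rng = np.random.default_rng(0)
for _ in range(2000):
    run_stack(float(rng.random()), learn=True)

# Sweep test values against 0.5 to see whether the narrow peak
# in the overlap graphs shows up at other inputs too.
ref = run_stack(0.5, learn=False)
for value in (0.1, 0.3, 0.45, 0.5, 0.55, 0.7, 0.9):
    test = run_stack(value, learn=False)
    overlaps = [len(set(r.sparse) & set(t.sparse))
                for r, t in zip(ref, test)]
    print(f"{value:4.2f}: overlap per layer = {overlaps}")
```

If the overlap falls off just as sharply around 0.1 or 0.7 as it does around 0.5, that would suggest the stack is narrowing its representations across the whole input range rather than overfitting to any one value.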