Inefficiency of HTM topology from a ML standpoint

marty1885 · January 13, 2020, 4:31pm

The claim is that the topology in HTM might be more about biological constraints and less about functionality. I’ve come to believe this more and more, especially after digging deep into SMI, and I’ll explain why.

Let’s first examine a case where topology works: convolutional neural networks. Without going into too much detail, CNNs work for 3 key reasons:

Local features and feature representations are consistent globally (a line is a line no matter where it is)
Local feature groups are meaningful (elements close by in the input vector have relations)
After the operation, the previous 2 statements are still true (you can stack convolutions)
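To make the first property concrete, here is a minimal sketch in plain Python. The `conv1d` helper, the `edge` kernel, and the toy signals are illustrative assumptions, not anything from a real CNN library. Because one kernel is shared across all positions, the same feature gives the same response wherever it sits, only shifted.

```python
def conv1d(signal, kernel):
    """Valid-mode 1D cross-correlation with one kernel shared by all positions."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

edge = [-1, 1]                   # hypothetical edge detector
a = [0, 0, 1, 1, 0, 0, 0, 0]     # a bump near the left
b = [0, 0, 0, 0, 0, 1, 1, 0]     # the same bump, moved 3 steps right

resp_a = conv1d(a, edge)         # [0, 1, 0, -1, 0, 0, 0]
resp_b = conv1d(b, edge)         # [0, 0, 0, 0, 1, 0, -1]

# The response pattern is identical, just shifted by the same 3 steps.
print(resp_b == [0, 0, 0] + resp_a[:-3])   # True
```

Since the output keeps the same spatial layout as the input, another convolution can be stacked on top of it, which is the third property.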

Like a CNN, topology in HTM tries to capture the localized information of an SDR, then stacks/connects cortical columns together to form representations of objects. However, in HTM these properties do not hold. Let’s assume a visual pathway with a signal from the retina encoding a picture of a horse. The first problem arises right after we get to V1. Since every neuron has its own synapses and permanences, any nicely encoded SDR will break into different representations: neuron A will have a different representation of a tail than neuron B. That breaks properties 1 and 2 for V2, so no information sharing can happen within V2, since every neuron is dealing with a different representation. This happens for every layer in every column, rendering topology useless: there’s no locality in the generated SDR.
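A rough sketch of that argument, under heavy simplifications: synapses are random binary masks rather than learned permanences, and `make_area` and `encode` are made-up helpers, not part of any HTM library. Two areas with independent synapses see the exact same local pattern, while a CNN-like setup reuses one set of weights at both locations.

```python
import random

def make_area(n_inputs, n_cells, rng):
    """One local area: every cell gets its own random binary synapse mask."""
    return [[rng.random() < 0.5 for _ in range(n_inputs)] for _ in range(n_cells)]

def encode(area, patch, k):
    """k-winners-take-all: indices of the k cells with the highest overlap."""
    overlaps = [sum(1 for s, x in zip(cell, patch) if s and x) for cell in area]
    ranked = sorted(range(len(area)), key=lambda c: (-overlaps[c], c))
    return sorted(ranked[:k])

rng = random.Random(0)
patch = [1, 1, 0, 0, 1, 0, 1, 1]           # the same local feature ("a tail")
area_a = make_area(len(patch), 32, rng)    # synapses at location A
area_b = make_area(len(patch), 32, rng)    # independent synapses at location B

code_a = encode(area_a, patch, k=4)
code_b = encode(area_b, patch, k=4)        # HTM-like: B has its own synapses
code_shared = encode(area_a, patch, k=4)   # CNN-like: A's weights reused at B

print(code_a, code_b, code_shared)
```

With shared weights the two codes are identical by construction. With independent synapses they will generally differ, so a downstream layer cannot assume that a given active bit means the same thing at both locations.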

Note: the Thousand Brains Theory can solve this via movement. But topology still plays no functional role in processing.

hsgo · January 13, 2020, 4:53pm

But then how can you explain performance increase when topology is applied?

Bitking · January 13, 2020, 4:58pm

If you restrict all processing to one level then I would agree with you.
I don’t see that this is a necessary restriction.
If the goal is not object recognition but instead, feature extraction, then this can be distributed over many levels. The SDR that forms later in the hierarchy can be based on a rich mix of these extracted features.

marty1885 · January 13, 2020, 5:01pm

My best guess is that with topology, there’s less runaway Hebbian learning.

hsgo · January 13, 2020, 5:05pm

Yeah, my guess is the same. But why do you think that’s not a part of the functionality?

rhyolight · January 13, 2020, 5:32pm

The difference between “stacking” and “connecting” cortical columns is the difference between hierarchical connections and lateral voting, which informs object pooling. If CCs are all emitting a representation to neighbors, they can use each other’s reps as contextual input to internal computation of feed forward input. It helps to think of attractor dynamics. In TBT, this lateral voting is not necessary to build an object model, but it helps a lot in practice.

Stacking a hierarchy is another story altogether. I’m still trying to figure it out. I think it involves thalamus and object abstraction, but that’s just me spit-balling on a Monday.

In any case, ask where topology matters. It matters a lot in the lower levels of the hierarchy, but as you ascend, the topology gets mangled and mashed so that abstractions can be made (right @Bitking?)

Bitking · January 13, 2020, 5:34pm

There is some diffusion but it seems that topology is maintained through the levels of each sensory modality.

For the interested bystander, we hashed topology out in this thread:

rhyolight · January 13, 2020, 5:39pm

A post was split to a new topic: Merging topologies

thejbug · January 13, 2020, 6:15pm

Here’s my explanation for why topology works in HTM
I’ll consider topology to mean either of two things:

There are multiple local inhibition areas in a cortical column that are smaller than the entire column.
There are multiple columns (with or without multiple inhibition areas) that are arranged topologically (more relevant to the Thousand Brains Theory with vision).
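The first meaning can be sketched as follows. This is a simplification that assumes disjoint, fixed-size inhibition areas and made-up overlap scores; the function names are hypothetical and real spatial pooler implementations may define neighborhoods differently.

```python
def global_inhibition(overlaps, k):
    """Keep the k best-matching cells across the whole column."""
    ranked = sorted(range(len(overlaps)), key=lambda i: (-overlaps[i], i))
    return sorted(ranked[:k])

def local_inhibition(overlaps, area_size, k_per_area):
    """Keep the best cells inside each disjoint area (an assumed layout)."""
    active = []
    for start in range(0, len(overlaps), area_size):
        area = range(start, min(start + area_size, len(overlaps)))
        ranked = sorted(area, key=lambda i: (-overlaps[i], i))
        active.extend(ranked[:k_per_area])
    return sorted(active)

overlaps = [9, 8, 7, 6, 1, 2, 0, 1]   # strong input on the left half only

print(global_inhibition(overlaps, k=2))                        # [0, 1]
print(local_inhibition(overlaps, area_size=4, k_per_area=1))   # [0, 5]
```

With global inhibition the strongly driven region takes every active slot. With local areas each region keeps its own winner, so the output still says something about every part of the input.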

On property 1 (global consistency): since HTM does not have sliding filters, this property will never be relevant for an HTM system that only looks at static images. It becomes relevant again when considering saccades, which will indirectly implement something analogous to sliding filters in CNNs.

On property 2 (local feature groups are meaningful): this continues to be true.

On property 3 (stacking): since each area has a different set of synapses, global consistency is immediately broken on the first pass through the spatial pooler(s). This would be a problem if you intended to stack a conv layer on top of the HTM layer, because its sliding filters would no longer be meaningful. This is not a problem if you want to stack more HTM layers, however, because the next levels in the hierarchy deal with topology the same way (not with sliding filters, but with multiple independent areas).

CNNs are pretty clever and arguably take advantage of topology to a much greater extent than HTM (at least for static images) by sliding filters across the image (not really biological). That ensures that the features can be reused anywhere no matter where they are learned.

HTM systems should have an advantage over CNNs only when movement / saccades are enabled, and multiple topologically arranged columns are allowed to vote on each other’s representations.

With saccades, a single column’s synapses can be applied to any area of the image (like a sliding filter), and its neighboring columns play the role of parallel voters.
Neighboring columns will always have the same spatial relationship to each other, allowing them to learn the spatial relationship of features (like a successive CNN layer does with the previous layer).
Each column will have unique failure points, having learned independently on slightly different inputs. This means that together they will implement a robust model ensemble for more accurate voting.
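A toy illustration of the ensemble point, with loud caveats: the candidate sets are invented, and a plain count over columns stands in for whatever lateral voting a real HTM/TBT system performs. Each column proposes the objects consistent with what it sensed, and the object named by the most columns wins.

```python
from collections import Counter

def vote(candidates_per_column):
    """Return the object(s) proposed by the largest number of columns."""
    counts = Counter(obj for cands in candidates_per_column for obj in set(cands))
    best = max(counts.values())
    return sorted(obj for obj, c in counts.items() if c == best)

columns = [
    {"horse", "dog"},          # column 1 is ambiguous between two objects
    {"horse", "zebra"},        # column 2 is ambiguous in a different way
    {"horse", "dog", "cow"},   # column 3 is even less certain
    {"zebra"},                 # column 4 fails outright
]

print(vote(columns))   # ['horse']
```

No single column is certain, and one is wrong, but their errors are different, so the combined vote still settles on one object.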

Jose_Cueto · January 14, 2020, 6:57am

Hi @marty1885 interesting thoughts. Does this really matter in the HTM context as an incidence detector at the bit level (input space)? I think the C in CNN is doing more of an extraction/filtering of “features” which is different from for example an SP.

UPDATE - While I agree that consistent representations are important from an ML standpoint, the following is not always true.

Local features and feature representations are consistent globally (a line is a line no matter where it is)

Local feature groups are meaningful (elements close by in the input vector have relations)

A CNN may not always be translation invariant. For example, the C’s output may not always be consistent (e.g. when a line moves in space), but it can be “corrected” by the next operation (max pooling).
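A minimal sketch of that correction, assuming non-overlapping windows and made-up response vectors (`max_pool1d` is an illustrative helper, not a library call): a response that moves by one position inside a pooling window gives the same pooled output, while a move across a window boundary does not.

```python
def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

resp         = [0, 0, 1, 0, 0, 0, 0, 0]   # convolution response to a line
resp_shifted = [0, 0, 0, 1, 0, 0, 0, 0]   # line moved by one position
resp_far     = [0, 0, 0, 0, 1, 0, 0, 0]   # line moved into the next window

print(max_pool1d(resp, 4))           # [1, 0]
print(max_pool1d(resp_shifted, 4))   # [1, 0]  same as before
print(max_pool1d(resp_far, 4))       # [0, 1]  different
```

So pooling only absorbs shifts smaller than the window, which is why the invariance is approximate rather than guaranteed.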

I think that CNN vs. HTM is not a good comparison, as they are doing different operations, unless someone can show that the convolution equation is equivalent to an SP operation, of course. Otherwise it’s not an apples-to-apples comparison.
