I’m reading through the algorithms that HTM uses, and something seems off.
Since running HTM is basically performing a lot of AND and addition operations, there is no way for HTM to learn to perform an XOR operation.
Am I correct about this? If so, how limiting is this limitation (what can’t HTM learn because it cannot perform XOR)?
It depends on how you code the problem.
With HTM and grids, one spreads a value in space.
For a variable of 2 binary inputs, you have 4 possible states.
I can code matches to that in SDR bit fields.
One dendrite segment for each combination should do it.
All four values are available for further processing if that is what you learn.
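To make that concrete, here is a rough Python sketch of what I mean. The bit-field layout, the segment wiring, and the threshold below are just illustrative choices, not taken from any particular HTM implementation:

```python
import numpy as np

# Illustrative encoding: each of the two binary inputs gets its own pair of
# bit blocks ("on" and "off") inside a shared SDR bit field.
N = 32                       # total bits in the SDR
a_on  = np.arange(0, 8)      # bits active when input A is 1
a_off = np.arange(8, 16)     # bits active when input A is 0
b_on  = np.arange(16, 24)
b_off = np.arange(24, 32)

def encode(a, b):
    """Union of the per-variable encodings -> one SDR for the pair (a, b)."""
    sdr = np.zeros(N, dtype=bool)
    sdr[a_on if a else a_off] = True
    sdr[b_on if b else b_off] = True
    return sdr

# One dendrite segment per input combination: a segment is just a set of
# synapses, and it matches when enough of its synapses see active bits.
segments = {
    (0, 0): np.concatenate([a_off, b_off]),
    (0, 1): np.concatenate([a_off, b_on]),
    (1, 0): np.concatenate([a_on,  b_off]),
    (1, 1): np.concatenate([a_on,  b_on]),
}
THRESHOLD = 12               # out of 16 synapses per segment

def matching_combinations(sdr):
    return [combo for combo, syns in segments.items()
            if sdr[syns].sum() >= THRESHOLD]

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", matching_combinations(encode(a, b)),
              "XOR =", bool(a ^ b))
```

Each of the four combinations ends up with its own detector, so whatever mapping you want downstream (XOR, AND, OR) is just a question of which combinations you treat as TRUE.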
I used to teach computer science and was always impressed by fellows who “proved” you can’t do something because of XYZ, only to be proven wrong later. Marvin Minsky “proved” that single-layer perceptrons could not solve the XOR problem and sold this idea hard. He sold some books and lectures. I would prefer “I don’t know how to do XYZ” and let it go at that, but that does not sell any books. This killed real AI work when the money dried up and the world ignored neural nets. True believers knew this was crap and pressed on. Things like the PDP book put neural models back on the map; discovering that you needed to put these things together in a system blew past the “limit” to an almost unimaginable degree.
Saying that time sensing, pattern sensing, and grid-forming functions by themselves have some sort of limit misses the point of what they do when combined. HTM models may not be the entire solution (I am very certain of this), but they will play a very important part in creating some very cool things.
Systems. Not parts.
Don’t forget about inhibition, which has a reductive effect. The spatial pooler simulates inhibition via a sparsification step: minicolumns are scored on how well they connect to the active input cells, and only a fixed percentage of winners is chosen for activation (all others are inhibited). This is applicable to learning operations like XOR.
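As a rough sketch of that sparsification step (the variable names and defaults are mine, not the actual SP code):

```python
import numpy as np

def sp_activate(input_sdr, potential, permanences,
                connected_perm=0.5, sparsity=0.02):
    """Score each minicolumn by how well it connects to the active input
    cells, then inhibit all but the top `sparsity` fraction of winners."""
    # A synapse counts only if it exists (potential) and is connected (permanence).
    connected = potential & (permanences >= connected_perm)
    # Overlap = number of active input cells each minicolumn connects to.
    overlaps = (connected & input_sdr).sum(axis=1)
    k = max(1, int(sparsity * connected.shape[0]))   # e.g. 40 out of 2048
    winners = np.argsort(overlaps)[-k:]
    active = np.zeros(connected.shape[0], dtype=bool)
    active[winners] = True                           # everything else inhibited
    return active
```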
Say you have cells in the input space which represent X, other cells which represent Y, others which represent TRUE, and still others which represent FALSE. Then assume you set the activation threshold to 2% of the number of minicolumns in the TM layer (40 for a typical 2048-minicolumn layer).
When you feed X into the SP, you’ll get one sparse set of minicolumn activations. When you feed Y, you’ll get another. TRUE gives you a third set, and FALSE a fourth.
When you feed both X and Y, you’ll get a set which consists largely of minicolumns of X and of Y, but some minicolumns that are not part of the representation for X or Y will also be among the winners. This is because their potential synapses happen to align better with that particular combination of input cells.
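A quick way to see this numerically is the sketch below. The sizes and the input-cell layout are made up, and the potential synapses are random and untrained, so the exact counts will vary with the seed and with SP learning:

```python
import numpy as np

rng = np.random.default_rng(42)
N_INPUT, N_COLS, K = 400, 2048, 40        # 40 winners = 2% of 2048

# Assumed layout: disjoint blocks of input cells for X, Y, TRUE, FALSE.
X, Y = np.arange(0, 100), np.arange(100, 200)
TRUE, FALSE = np.arange(200, 300), np.arange(300, 400)

potential = rng.random((N_COLS, N_INPUT)) < 0.3   # random potential synapses

def winners(active_cells):
    sdr = np.zeros(N_INPUT, dtype=bool)
    sdr[active_cells] = True
    overlaps = (potential & sdr).sum(axis=1)      # score each minicolumn
    return set(np.argsort(overlaps)[-K:])         # keep the top 40, inhibit the rest

wx, wy = winners(X), winners(Y)
wxy = winners(np.concatenate([X, Y]))
print("shared with X:", len(wxy & wx),
      "shared with Y:", len(wxy & wy),
      "new minicolumns:", len(wxy - wx - wy))
```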
Now move to TM. Say the following simple sequences are input and trained in TM:
X → TRUE
Y → TRUE
X + Y → FALSE
This is the XOR operation. Later, inputting X or Y alone will predict TRUE, but X + Y will not, because fewer than 40 of the activated TM layer cells (the activation threshold) will belong to X or to Y.
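Here is a toy sketch of that last check. The winner sets are made up rather than produced by a real SP, and a real TM works at the level of cells within minicolumns, but the thresholding logic is the same idea:

```python
import numpy as np

rng = np.random.default_rng(1)
N_COLS, K = 2048, 40                  # 2% sparsity -> 40 active minicolumns
ACTIVATION_THRESHOLD = 40             # distal segment threshold from the example

# Stand-ins for SP winner sets (invented here; a real run would produce them).
x_cols = set(rng.choice(N_COLS, K, replace=False))
y_cols = set(rng.choice(N_COLS, K, replace=False))
# Feeding X + Y: winners drawn partly from X's set, partly from Y's, plus a few new ones.
xy_cols = (set(list(x_cols)[:18]) | set(list(y_cols)[:18])
           | set(rng.choice(N_COLS, 4, replace=False)))

# Training X -> TRUE and Y -> TRUE grows two distal segments on TRUE's cells:
# one connected to X's active cells, one connected to Y's.
true_segments = [x_cols, y_cols]

def predicts_true(active_cols):
    return any(len(seg & active_cols) >= ACTIVATION_THRESHOLD
               for seg in true_segments)

print(predicts_true(x_cols))    # True:  all 40 of X's cells hit the first segment
print(predicts_true(y_cols))    # True:  all 40 of Y's cells hit the second segment
print(predicts_true(xy_cols))   # False: fewer than 40 active cells belong to X or to Y
```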
This is of course an overly simplified example, but hopefully it demonstrates the point that learning this type of logical operation is not impossible for HTM.
I probably should point out that a large HTM layer isn’t required for this particular problem (I was just demonstrating how it could be done in a typical setup). Even a tiny system like this could also learn the XOR operation:
In this case, each box represents an HTM neuron. Settings for the system are three input cells, four minicolumns, one cell per minicolumn, sparsity 25%, and activation threshold 1. The red lines indicate potential synapses used in SP, and the blue lines indicate the distal connections made when training the following two sequences in TM:
X -> T
Y -> T
After this, inputting just X will predict T, inputting just Y will predict T, and inputting both X and Y will not predict T.
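Since the drawing isn’t reproduced here, below is my reconstruction of that tiny system in Python. The proximal (“red”) receptive fields and the boost values are assumptions I chose so the behavior matches the description above; they are not read off the original figure:

```python
import numpy as np

X, Y, T = 0, 1, 2                     # the three input cells

# proximal[c] = set of input cells minicolumn c is connected to ("red" lines)
proximal = [
    {X},        # minicolumn 0: receptive field aligned to X only
    {Y},        # minicolumn 1: aligned to Y only
    {X, Y},     # minicolumn 2: aligned to both X and Y
    {T},        # minicolumn 3: aligned to T
]
# Slightly lower boost on minicolumn 2 so the dedicated X / Y minicolumns win
# the single-input cases (a real SP would adapt boost values over time).
boost = np.array([1.0, 1.0, 0.9, 1.0])

def sp_winner(active_inputs):
    scores = boost * np.array([len(rf & active_inputs) for rf in proximal])
    return int(np.argmax(scores))     # 25% sparsity of 4 minicolumns -> one winner

# Distal ("blue") connections learned from the sequences X -> T and Y -> T:
# the cell in minicolumn 3 (T) grows one segment onto minicolumn 0's cell
# and another onto minicolumn 1's cell.
t_segments = [{0}, {1}]
ACTIVATION_THRESHOLD = 1

def predicts_T(active_inputs):
    winner = sp_winner(active_inputs)
    return any(len(seg & {winner}) >= ACTIVATION_THRESHOLD for seg in t_segments)

print(predicts_T({X}))      # True:  minicolumn 0 wins, T's first segment fires
print(predicts_T({Y}))      # True:  minicolumn 1 wins, T's second segment fires
print(predicts_T({X, Y}))   # False: minicolumn 2 wins, neither segment reaches threshold
```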
Also, if you are particular about the original “T” cell being the one that goes into predictive state (versus its counterpart up in the TM layer), the system can be set up so that the TM layer provides apical feedback to the input layer (this is how I decode predictions in htm.js without requiring a classifier).
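I won’t claim this is how htm.js actually does it, but as a hypothetical sketch of that kind of decoding, you can run the prediction back through each minicolumn’s receptive field to recover which input cells are expected next:

```python
# Hypothetical decoding step (not htm.js's actual code): once the TM layer has
# put some minicolumns into predictive state, feed that prediction back so the
# input cells in those minicolumns' receptive fields are marked as predicted.
def predicted_input_cells(predictive_minicolumns, proximal_fields):
    """proximal_fields[c] is the set of input cells minicolumn c connects to."""
    predicted = set()
    for col in predictive_minicolumns:
        predicted |= proximal_fields[col]
    return predicted

# With the tiny system above: minicolumn 3 predictive -> the 'T' input cell (index 2).
print(predicted_input_cells({3}, [{0}, {1}, {0, 1}, {2}]))   # {2}
```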
Here is another visualization of how HTM applies to this problem. Here you can see how the receptive fields for some minicolumns can align better to just X or just Y, while others can align better to both X and Y.