Binary permanence

I’m still reasonably new to HTM, both the theory and implementation sides, but I have to say I’m already hooked! I think I have a fairly good grasp of what the Spatial Pooler is and what it’s trying to achieve. There’s one question nagging me though, which I’m hoping someone will be able to answer.

Why is permanence modelled as a scalar (floating point) value rather than a binary value?

For reference, I found a bit of discussion on a separate topic: Project to compare mraptor's bbHTM to biology.

I understand that in the theory synapses can become stronger as they learn to recognise the same pattern, which hints at scalar values being a more accurate model, but this question is more about whether in practice the extra resolution of a scalar value matters.
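
For context, here is roughly what the standard scalar scheme looks like; a minimal sketch, with illustrative parameter values rather than any particular implementation’s defaults:

```python
import numpy as np

# Sketch of the usual scalar-permanence scheme, for contrast.
# Threshold and learning rates are illustrative assumptions.
CONNECTED_THRESHOLD = 0.2   # permanence above this => synapse is connected
PERM_INC, PERM_DEC = 0.05, 0.01

def overlap_scalar(perms, input_bits):
    # Threshold the scalar permanences to get the connected synapses,
    # then count how many line up with active input bits.
    connected = perms >= CONNECTED_THRESHOLD
    return np.count_nonzero(connected & input_bits)

def learn_scalar(perms, input_bits):
    # Reinforce synapses on active inputs, weaken the rest.
    perms += np.where(input_bits, PERM_INC, -PERM_DEC)
    np.clip(perms, 0.0, 1.0, out=perms)
```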

From an efficiency perspective a binary permanence array seems ideal, because each column’s overlap can then be computed by simply ANDing the input and permanence arrays and counting the set bits, rather than first applying a threshold to the permanences.
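
To make that concrete, a sketch of the binary variant; with binary permanences the threshold step disappears entirely:

```python
import numpy as np

# With binary permanences the permanence array *is* the connection mask,
# so a column's overlap is just AND + popcount, with no threshold step.
def overlap_binary(conn_bits, input_bits):
    return np.count_nonzero(conn_bits & input_bits)

# Packing both arrays into uint64 words and using a hardware popcount
# would make this cheaper still; the boolean version just shows the idea.
```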

From a learning perspective, when you use scalar values and increment/decrement permanence values you’re effectively connecting/disconnecting a small number of connections. If the permanences were binary, this could be modelled by flipping a certain number of bits from 0 to 1 or 1 to 0, yielding very similar behaviour (assuming random permanences, which in my understanding is how they are initialised). Rate of learning/forgetting is then controlled by how many new connections are formed to the input array and how many existing connections are lost.
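
A minimal sketch of that bit-flip learning rule, assuming boolean arrays; p_grow and p_prune are my own names for the rate parameters, not anything standard:

```python
import numpy as np

rng = np.random.default_rng()

def learn_binary(conn, potential, input_bits, p_grow=0.02, p_prune=0.02):
    """Bit-flip learning: connect/disconnect whole synapses instead of
    nudging scalar permanences. All arrays are boolean, same shape."""
    active = input_bits.astype(bool)
    # Grow: a few unconnected potential synapses on active inputs connect.
    grow = potential & ~conn & active & (rng.random(conn.shape) < p_grow)
    # Prune: a few connected synapses on inactive inputs disconnect.
    prune = conn & ~active & (rng.random(conn.shape) < p_prune)
    return (conn | grow) & ~prune
```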

A potential downside to binary permanences is that strong connections can be destroyed too easily: instead of being decremented when they don’t match the input (and most likely staying connected), they would have a chance to disconnect outright. However, to be in a situation where a strong connection could be lost, the column would have to be receiving an input very similar to the one it had learnt, in which case you could argue there’s not much harm in losing a connection or two, since this just constitutes some additional noise. This noise is fine if the input and permanence arrays are sparse, since we know SDRs have a high noise tolerance, and it is also fine if the arrays are large, since the number of lost connections is then likely just a small percentage of the total.
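
To put rough numbers on it: if a column has, say, 40 connected synapses and loses 2 of them to random flips, its overlap with the learnt pattern drops by only 5%, which seems comfortably inside the noise tolerance that SDR theory predicts for sparse, high-dimensional representations.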

Perhaps the best way to find out the answer is to test it, but as I’m relatively new to HTM I’d like to make sure I’m not missing some important concept before going down that potentially long road.

Thoughts much appreciated!



I already tested something like this; check it out: http://ifni.co/bbHTM.html

A potential problem I would expect is that rare inputs would carry the same weight in changing the memory as frequent inputs do. This could lead to inputs which should be well trained and predictable being forgotten in a single time-step due to a novel input.
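
A quick back-of-the-envelope sketch of the difference in forgetting speed; all parameter values here are assumptions:

```python
# Scalar: a well-trained synapse decays gradually on mismatching inputs.
perm, threshold, dec = 0.5, 0.2, 0.01
steps = 0
while perm >= threshold:
    perm -= dec
    steps += 1
print(steps)     # ~30 mismatching steps before it disconnects

# Binary: each mismatching step disconnects a connection with probability
# p_flip, so a single novel input already prunes ~5% of a column's learned
# connections, and k novel inputs in a row prune ~1 - (1 - p_flip)**k.
p_flip = 0.05
```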

My sense is that this is more likely to be a problem for synapses in temporal memory than for those in the spatial pooler (the reason being the “potential percent” concept, whereby all of a column’s potential connections already exist when the system starts up).

My very first test implementation of a spatial pooler actually operated this way (as a workaround for an unrelated bug), and it didn’t really cause a problem (though note that all of the inputs I was using were semantically dissimilar, which of course matters in this case). I would need to run tests on semantically similar inputs to really get a feel for the impact of implementing it this way.

Synapses actually grow in the brain. I think you can compare the permanence value to how far grown a potential synapse between two cells is. We model it this way because it is the best way to mimic the biology. An already-established connection can be continually reinforced, thus increasing the length of time that pattern stays “in memory”.

Thanks for the replies!

@mraptor Nice work! I’d be interested to know if you’ve done any direct performance and/or behaviour comparison between your ‘Spatial Mapper’ and a scalar variant.

@Paul_Lamb For rare inputs I think it’s unlikely to cause a large loss in memory, assuming the rate of learning/forgetting is suitably small, but if several anomalous inputs are received in quick succession the issue becomes more severe. It would be interesting to see how it handles this kind of scenario.

You make a good point about the importance of the input data similarity. For similar inputs it’s more likely the column connections will track the input as it’s changing, particularly if it’s changing slowly over time, in which case the pooler’s memory would be quite short-term.

@rhyolight Agreed, it’s definitely a simplification of the biology, which is what makes me slightly uneasy about suggesting it in the first place!


I haven’t; testing is the hardest part… I suspect mine will underperform. The reason is that I don’t think Hamming distance works well for 2%-sparse SDRs; it just copies the most common patterns over time, because there is barely any overlap. You can see this in the picture of the memory.
I haven’t had time recently to test a variant of SMapper that uses unions instead, i.e. Hamming distance over unions (the union will reduce the sparsity of the connection-matrix rows over time, whether slowly or quickly, and then Hamming distance will work better).
The current variant won’t manage this, because a single union SDR is used per output bit, which is not enough (TM uses one union SDR per row; e.g. a 5-row TM will use 5 elements to “capture” patterns, so if one union has a false-positive capacity of 100 “different patterns”, then 5 should in theory allow 500).
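
The “barely any overlap” point is easy to demonstrate with a quick sketch; the sizes here are assumed (2048 bits at ~2% sparsity):

```python
import numpy as np

rng = np.random.default_rng(0)
n, w = 2048, 40                      # ~2% sparse SDRs

def random_sdr():
    v = np.zeros(n, dtype=bool)
    v[rng.choice(n, size=w, replace=False)] = True
    return v

a, b = random_sdr(), random_sdr()
print(np.count_nonzero(a & b))       # expected overlap ~ w*w/n, i.e. under 1 bit
print(np.count_nonzero(a != b))      # Hamming distance ~ 2*w, i.e. ~80 bits
# Almost every pair of unrelated sparse SDRs sits at nearly the same (maximal)
# Hamming distance, so nearest-neighbour matching has little signal to use.
```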

Afterwards I added a usage counter for every row in the connection matrix to the SMapper and used it as a regularization parameter (when deciding which connection-matrix row to pick for adjustment), and it worked better.
Without it, a row that is very far away from the input data is never used; by adding the usage regularizer, the SMapper makes better use of the available capacity.
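
Roughly, the row selection works like the sketch below; the names and the linear usage penalty are simplifications for illustration, not the actual SMapper code:

```python
import numpy as np

def pick_row(memory, usage, x, alpha=0.1):
    # memory: boolean connection matrix (one union SDR per row)
    # usage:  how many times each row has been picked so far
    dist = np.count_nonzero(memory != x, axis=1)   # Hamming distance per row
    score = dist + alpha * usage                   # penalise overused rows
    row = int(np.argmin(score))
    usage[row] += 1
    return row
```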

I will soon be concentrating more on the algorithm for deciding what to turn on/off and when, rather than on pure OR-and-AND operations… I think that is where the solution for making it work will be.