To refresh your memory: HTM synapses learn with the following rules:
if dendritic_segment_is_active:
    if axon_is_active:
        permanence += increment  // Potentiate the synapse
    else:
        permanence -= decrement  // Depress the synapse

synapse_is_connected = (permanence >= 0.5)
The permanence variable is essentially an up/down counter.
New Parameters
First, replace the parameters increment and decrement with:
- The incidence_rate, which is decrement / increment
- The learning_period, which is 1 / increment
I feel that these new parameters are much more intuitive to work with and think about. The incidence rate measures how often a presynapse needs to correctly predict the postsynaptic activity in order to maintain the synapse. The learning period measures how many samples of data the synapse can learn from before its permanence saturates.
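To make the mapping between the two parameter sets explicit, here is a tiny stand-alone sketch (the variable names and example values are just for illustration):

#include <cstdio>

int main() {
    // Old parameterization (example values, not prescriptive):
    const double increment = 0.1;
    const double decrement = 0.01;

    // New parameterization:
    const double incidence_rate  = decrement / increment;   // = 0.1
    const double learning_period = 1.0 / increment;         // = 10 samples

    // Converting back:
    const double incr = 1.0 / learning_period;               // = 0.1
    const double decr = incidence_rate / learning_period;    // = 0.01

    std::printf("incidence_rate = %g, learning_period = %g\n", incidence_rate, learning_period);
    std::printf("incr = %g, decr = %g\n", incr, decr);
    return 0;
}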
The New Model
Redefine the permanence variable to measure the incidence rate of each synapse, using an exponential moving average. The new model is:
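A sketch of the update, in the same pseudocode style as above and assuming the exponential-moving-average form used in the implementation quoted further down (with the incidence rate serving as the connected threshold):

alpha = 1 - exp(-1 / learning_period)

if dendritic_segment_is_active:
    // Move the permanence toward 1 if the axon fired, toward 0 otherwise.
    permanence = (1 - alpha) * permanence + alpha * axon_is_active

synapse_is_connected = (permanence >= incidence_rate)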
I’ve tested both models and they both work. The new model is inexpensive to compute and theoretically sound, and an exponential moving average is a better estimator of a synapse’s incidence rate than a simple up/down counter.
I wonder if it works with lower-precision data types.
I have recently gotten used to working with 8-bit synapses and they seem to work just as well as 32-bit floats, and I’m still trying to find a good way to make 2-bit synapses work with no performance loss.
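For what it’s worth, the EMA update also has a natural fixed-point form. The sketch below shows one way an 8-bit permanence could carry it; the Q8 scaling and the names are assumptions for illustration, not how any existing implementation does it:

#include <cstdint>

// Sketch only: exponential-moving-average permanence update with an 8-bit
// permanence stored in [0, 255].  alpha is a Q8 fixed-point fraction
// (alpha_q8 / 256, in 0..256); none of these names come from htm.core.
inline std::uint8_t updatePermanence8(std::uint8_t permanence, bool axonActive,
                                      std::uint16_t alpha_q8) {
    const int target = axonActive ? 255 : 0;  // grow toward 1 if the axon fired, decay toward 0 otherwise
    const int num    = static_cast<int>(alpha_q8) * (target - permanence);
    const int step   = (num >= 0) ? (num + 128) / 256 : (num - 128) / 256;  // alpha * (target - p), rounded to nearest
    // Steps smaller than half an LSB quantize to zero, so permanences can stall
    // near the extremes; a prune threshold above zero (discussed later in this
    // thread) covers the low end.
    return static_cast<std::uint8_t>(permanence + step);
}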
I have just implemented your new learning rules in the Connections class of htm.core:
void adaptSegment(const Segment segment, const SDR &inputs,
                  const Permanence increment,
                  const Permanence unusedPerm = 0,
                  const bool pruneZeroSynapses = false,
                  const UInt segmentThreshold = 0)
{
  // use new learning mode here
  const auto learningPeriod = 1 / increment;
  const auto incidenceRate  = htm::minPermanence + htm::Epsilon;
  const Permanence alpha    = 1 - std::exp(-1 / learningPeriod);

  const auto &inputArray = inputs.getDense();

  if (timeseries_) {
    previousUpdates_.resize(synapses_.size(), minPermanence);
    currentUpdates_.resize(synapses_.size(), minPermanence);
  }

  vector<Synapse> destroyLater;
  destroyLater.reserve(numSynapses(segment));

  for (const auto synapse : synapsesForSegment(segment)) {
    const SynapseData &synapseData = dataForSynapse(synapse);
    const Permanence newPermanence =
        (1 - alpha) * synapseData.permanence + alpha * inputArray[synapseData.presynapticCell];

    // prune permanences that reached zero
    if (pruneZeroSynapses and newPermanence < incidenceRate) {
      // new value will disconnect the synapse
      destroyLater.push_back(synapse);
      prunedSyns_++; // for statistics
      continue;
    }

    // update synapse, but for TS only if changed
    if (timeseries_) {
      const auto update = newPermanence - synapseData.permanence;
      if (update != previousUpdates_[synapse]) updateSynapsePermanence(synapse, newPermanence);
      currentUpdates_[synapse] = update;
    } else {
      updateSynapsePermanence(synapse, newPermanence);
    }
  }

  // destroy synapses accumulated for pruning
  for (const auto pruneSyn : destroyLater) destroySynapse(pruneSyn);

  // destroy segment if it has too few synapses left -> will never be able to connect again
#ifdef NTA_ASSERTIONS_ON
  if (segmentThreshold > 0)
    NTA_ASSERT(pruneZeroSynapses) << "Setting segmentThreshold only makes sense when pruneZeroSynapses is allowed.";
#endif
  if (pruneZeroSynapses and synapsesForSegment(segment).size() < segmentThreshold) {
    destroySegment(segment);
    prunedSegs_++; // statistics
  }
}
and tested it on predicting a simple sine wave with the new parameters:
learning_period = 1 / incrementPerm, with incrementPerm = 0.1, 0.2, 0.5
incidence_rate = the standard minimum permanence
Unfortunately, the prediction results are worse than with the original rules.
Am I missing something in your idea/implementation?
Thanks
This is incorrect. The incidence_rate should be the ratio of the permanence decrement to the increment (“decr / incr”).
Typically it should be pretty small but greater than zero, like 0.1 or 0.05.
If you’ve set it to 0.5 then it’s way too high!
// prune permanences that reached zero
if (pruneZeroSynapses and newPermanence < incidenceRate)
This line is incorrect. The incidence_rate is the new permanence threshold (the connected threshold). The cut-off for deleting synapses should be much lower than the connected threshold, so that synapses can be disconnected but still exist in the system.
I don’t have any direct comparisons, but I expect that the added computation is very easy (it’s just an extra two multiplications).
However, the real issue is that now the synapse permanences will never actually reach zero; they exponentially approach zero instead. Synapses only get removed when their permanence reaches zero, so now synapses will never be removed. I recommend simply increasing the threshold for pruning synapses.
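Concretely, one way to apply both corrections to the snippet above would be to keep the incidence rate as the connected threshold and prune against a separate, much smaller cut-off. This is only a sketch; the factor of 0.1 is an arbitrary illustration, not a value from the thread:

// Sketch: prune against a small, non-zero cut-off well below the connected threshold.
const Permanence connectedThreshold = incidenceRate;              // i.e. decrement / increment
const Permanence pruneThreshold     = 0.1f * connectedThreshold;  // arbitrary small cut-off > 0

// ... inside the synapse loop ...
if (pruneZeroSynapses and newPermanence < pruneThreshold) {
  destroyLater.push_back(synapse);  // permanence has decayed far enough to remove the synapse
  prunedSyns_++;
  continue;
}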
In my test with ApicalTieBreakTM, pruneZeroSynapses is already false, so that makes no difference to the results.
Once again, my test results are worse than the original ones.
In my experiment with sine wave prediction, using the same permanence_increment = 0.1, I observed two things: how many learning steps it takes to reach the same “good feeling” prediction, and the anomaly score. The original version gives good predictions and a very low anomaly score after 4000 steps, while the new one never reaches a good prediction!
The new model can implement weighted synapses. Instead of applying a simple threshold to the synapse’s incidence rate, apply a sigmoid/logistic function.
Newly initialized synapses can have a small effect, while established synapses can have a much larger effect.
You can still make binary thresholds by setting the slope of the sigmoid to infinity.
The old model could not implement weighted synapses because, after running for long enough, the simple up/down counters almost always saturate at either 0 or 1, meaning that almost all of the synapses are destined to have binary weights.
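A rough sketch of what that weighting could look like (the function name and the steepness parameter are illustrative, not from any existing codebase):

#include <cmath>

// Weight a synapse by a logistic function of its estimated incidence rate,
// centered on the connection threshold.  As steepness -> infinity this recovers
// the old binary connected/disconnected behaviour.
inline float synapseWeight(float permanence, float threshold, float steepness) {
  return 1.0f / (1.0f + std::exp(-steepness * (permanence - threshold)));
}

// A dendrite's response would then be a weighted sum over its active synapses
// rather than a count of connected active synapses.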
Imagine that you have a dendrite that detects N input patterns using N synapses, one synapse per pattern. Now suppose that you see each pattern once and the dendrite learns each time: each synapse’s permanence will be incremented once and decremented N − 1 times, for a net change of “increment − (N − 1) × decrement”. If this value is negative, then the synapses will get weakened and disconnected until the dendrite stops responding to all N patterns.
Correction: the maximum number of patterns is N = incr / decr + 1 (see the short derivation below).
This assumes that every pattern is seen with equal frequency.
This assumes that every synapse is used for exactly one pattern.
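Spelling that bound out under the two assumptions above:

$$
\Delta p = \text{incr} - (N - 1)\,\text{decr} \;\ge\; 0
\quad\Longrightarrow\quad
N \;\le\; \frac{\text{incr}}{\text{decr}} + 1 \;=\; \frac{1}{\text{incidence\_rate}} + 1
$$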
Could I press you a little further? In biology (if I remember correctly from Jeff’s lectures), about half of the roughly 40 synapses within a short stretch of dendrite need to be triggered close together in time to initiate a dendritic spike strong enough to make it to the soma. Shouldn’t that number also be taken into account?
I’m using this as part of a spatial pooler.
SP’s dendrites have no limit on the number of synapses; instead they have a “potential pool”, which is a random subset of the inputs, and that limits the number of synapses on each dendrite.
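For readers who haven’t seen the term, here is a minimal sketch of what a potential pool is (generic C++, not the actual htm.core SpatialPooler code):

#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Each SP dendrite may only ever form synapses onto a fixed random subset
// ("potential pool") of the input bits, chosen once at initialization.
std::vector<unsigned> makePotentialPool(unsigned numInputs, double potentialPct, std::mt19937 &rng) {
  std::vector<unsigned> pool(numInputs);
  std::iota(pool.begin(), pool.end(), 0u);        // all input indices
  std::shuffle(pool.begin(), pool.end(), rng);    // random order
  const auto keep = static_cast<unsigned>(potentialPct * numInputs);  // e.g. 0.5 keeps half
  pool.resize(keep);
  return pool;  // this dendrite's synapses are restricted to these inputs
}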