New Learning Rule for HTM Synapses

The Old Model

To refresh your memory: HTM synapses learn with the following rules:

if dendritic_segment_is_active:
    if axon_is_active:
        permanence += increment    // Potentiate the synapse
    else:
        permanence -= decrement    // Depress the synapse

synapse_is_connected = (permanence >= 0.5)

The permanence variable is essentially an up/down counter.
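
For concreteness, here is a minimal runnable sketch of that counter in Python (the parameter values are illustrative, not the canonical HTM defaults); the clamp to [0, 1] is where the saturation discussed below comes from:

# Minimal sketch of the old rule (illustrative parameter values, not canonical defaults).
CONNECTED_THRESHOLD = 0.5

def update_old(permanence, segment_active, axon_active,
               increment=0.1, decrement=0.02):
    """Up/down counter, clamped to [0, 1]."""
    if segment_active:
        if axon_active:
            permanence += increment   # potentiate the synapse
        else:
            permanence -= decrement   # depress the synapse
    return min(max(permanence, 0.0), 1.0)

def is_connected_old(permanence):
    return permanence >= CONNECTED_THRESHOLD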

New Parameters

First, replace the parameters increment and decrement with:

  • The incidence_rate, which is decrement / increment
  • The learning_period, which is 1 / increment

I feel that these new parameters are much more intuitive to work with and think about. The incidence rate measures how often a presynapse needs to correctly predict the postsynaptic activity in order to maintain the synapse. The learning period measures how many samples of data the synapse can learn from before its permanence saturates.
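
To make the mapping concrete, here is a small sketch (the numeric values are just an example):

increment = 0.1
decrement = 0.02

incidence_rate  = decrement / increment   # 0.2: fraction of correct predictions needed to keep the synapse
learning_period = 1.0 / increment         # 10: samples of data before the permanence saturates

# The mapping is invertible:
assert abs(1.0 / learning_period - increment) < 1e-12
assert abs(incidence_rate * increment - decrement) < 1e-12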

The New Model

Redefine the permanence variable to measure the incidence rate of each synapse, using an exponential moving average. The new model is:

alpha = 1 - e ^ (-1/learning_period)
permanence += alpha * (bool(axon_is_active) - permanence)

synapse_is_connected = (permanence >= incidence_rate)

I’ve tested both models and they both work. The new model is inexpensive to compute and theoretically sound: an exponential moving average is a better-founded statistic than a simple up/down counter.
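
For reference, a minimal runnable sketch of the new rule (the parameter values below are my own example, not values prescribed by the post):

import math

learning_period = 10.0   # example: 1 / increment
incidence_rate  = 0.2    # example: decrement / increment

alpha = 1.0 - math.exp(-1.0 / learning_period)

def update_new(permanence, segment_active, axon_active):
    """EMA of presynaptic activity; only learns when the segment is active."""
    if segment_active:
        permanence += alpha * (float(axon_active) - permanence)
    return permanence

def is_connected_new(permanence):
    return permanence >= incidence_rate

# Example: an axon that fires on half of the active-segment samples settles
# near permanence 0.5, comfortably above the 0.2 incidence-rate threshold.
p = 0.0
for step in range(1000):
    p = update_new(p, segment_active=True, axon_active=(step % 2 == 0))
print(is_connected_new(p))   # True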

7 Likes

@dmac thanks for sharing! Do you have any performance comparisons?

1 Like

Another interesting bit of analysis:

The ratio “increment / decrement” is the maximum number of different patterns that the dendrite can recognize.

1 Like

I wonder if it works with lower precision datatypes.

I have recently gotten used to working with 8-bit synapses and they seem to work just as well as 32-bit floats, and I’m still trying to find a good way to make 2-bit synapses work with no performance loss.

1 Like

I have just implemented your new learning rule in the Connections() class of htm.core:

void adaptSegment(const Segment segment, const SDR &inputs, const Permanence increment, const Permanence unusedPerm = 0, const bool pruneZeroSynapses = false, const UInt segmentThreshold = 0)
  {
    // use new learning mode here
    const auto learningPeriod = 1 / increment;
    const auto incidenceRate  = htm::minPermanence + htm::Epsilon;
    const Permanence alpha    = 1 - std::exp(-1 / learningPeriod);
    const auto &inputArray    = inputs.getDense();

    if( timeseries_ )
    {
      previousUpdates_.resize( synapses_.size(), minPermanence );
      currentUpdates_.resize(  synapses_.size(), minPermanence );
    }

    vector<Synapse> destroyLater; destroyLater.reserve(numSynapses(segment));
    for (const auto synapse: synapsesForSegment(segment))
    {
      const SynapseData &synapseData = dataForSynapse(synapse);

      Permanence newPermanence = (1 - alpha) * synapseData.permanence + alpha * inputArray[synapseData.presynapticCell];

      //prune permanences that reached zero
      if (pruneZeroSynapses and newPermanence < incidenceRate)
      {
        // new value will disconnect the synapse
        destroyLater.push_back(synapse);
        prunedSyns_++; //for statistics
        continue;
      }

      //update synapse, but for TS only if changed
      if(timeseries_)
      {
        auto update = newPermanence - synapseData.permanence;
        if( update != previousUpdates_[synapse] ) updateSynapsePermanence(synapse, newPermanence);
        currentUpdates_[ synapse ] = update;
      } 
      else
        updateSynapsePermanence(synapse, newPermanence);
    }

    //destroy synapses accumulated for pruning
    for (const auto pruneSyn : destroyLater) destroySynapse(pruneSyn);

    //destroy segment if it has too few synapses left -> will never be able to connect again
#ifdef NTA_ASSERTIONS_ON
    if(segmentThreshold > 0) 
      NTA_ASSERT(pruneZeroSynapses) << "Setting segmentThreshold only makes sense when pruneZeroSynapses is allowed.";
    
#endif
    if(pruneZeroSynapses and synapsesForSegment(segment).size() < segmentThreshold)
    { 
      destroySegment(segment);
      prunedSegs_++; //statistics
    }
  };

and tested it on predicting a simple sine wave with the new parameters:
learning_period = 1 / incrementPerm, with incrementPerm = 0.1, 0.2, 0.5
incidence_rate = standard min permanence
Unfortunately, the prediction results are worse than with the original model.
Am I missing something from your idea/implementation?
Thanks

1 Like

This is incorrect. The incidence_rate should be the permanence ratio “decr / incr”.
Typically it should be pretty small but greater than zero, like 0.1 or 0.05.
If you’ve set it to 0.5 then it’s way too high!

HTH

1 Like
      //prune permanences that reached zero
      if (pruneZeroSynapses and newPermanence < incidenceRate)

This line is incorrect. The incidence_rate is the new connected-permanence threshold. The cut-off for deleting synapses should be much lower than the connected threshold, so that synapses can be disconnected but still exist in the system.


I don’t have any direct comparisons, but I expect that the added computation is very cheap (it’s just an extra two multiplications).

However, the real issue is that the synapse permanences will now never reach zero; they approach zero exponentially instead. Synapses only get removed when their permanence reaches zero, so synapses will now never be removed. I recommend simply increasing the threshold for pruning synapses.
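
A minimal sketch of that separation of thresholds (the names and values here are my own illustration, not htm.core’s API):

incidence_rate  = 0.2    # connected threshold: the synapse contributes to the segment
prune_threshold = 0.02   # destroy threshold: well below the connected threshold

def classify(permanence):
    """A synapse can be disconnected yet still kept around, so it can reconnect later."""
    if permanence < prune_threshold:
        return "destroy"
    elif permanence < incidence_rate:
        return "disconnected but kept"
    else:
        return "connected"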

1 Like

OK. In my test with ApicalTieBreakTM, decr = 0, so that does NOT improve my results.

1 Like

In my test with ApicalTieBreakTM, pruneZeroSynapses is already false, so there is no change in the results.
Once again, my test results are worse than the original ones.

Did you test it with ApicalTieBreakTM? Thanks

1 Like

I tried the new formulae in an experimental extended version of ApicalTiebreakTM in HTM-scheme.

HTM-scheme has always used 8-bit integer permanences (linearly mapped to 0.0-1.0).

To experiment (not final code), I changed the relevant code to convert integer permanences to and from floats at every use.
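
For reference, the kind of linear mapping involved looks roughly like this (my own sketch; HTM-scheme’s actual perm->fl / fl->perm helpers may differ in details such as rounding):

def perm_to_float(p8):                       # analogous to perm->fl
    return p8 / 255.0                        # 8-bit integer -> [0.0, 1.0]

def float_to_perm(f):                        # analogous to fl->perm
    return max(0, min(255, round(f * 255)))  # [0.0, 1.0] -> clamped 8-bit integer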

Previously the permanence update code was:

(if (member source reinforce-candidates)
  (make-syn source (clip-max (fx+ curr-perm increment)))
  (let ([new-perm (fx- curr-perm decrement)])
    (and (fx>? new-perm destroy) (make-syn source new-perm))))

and the test for connection was:

  (fx>=? (syn-perm synapse) connected)

To test the new learning rule, define constants:

(define incidence-rate   0.2)    ;; decrement [0.02] / increment [0.1]
(define learning-period 10.0)    ;; 1 / increment
(define alpha                    ;; 1 - e ^ (-1/learning_period)
  (fl- 1.0 (exp (fl/ -1.0 learning-period))))

… Permanence update [permanence += alpha * (bool(axon_is_active) - permanence)] becomes:

(if (member source reinforce-candidates)
  (let ([increment (fl->perm (fl* alpha (fl- 1.0 (perm->fl curr-perm))))])
    (make-syn source (clip-max (fx+ curr-perm increment))))
  (let* ( [decrement (fl->perm (fl* alpha (perm->fl curr-perm)))]
          [new-perm  (fx- curr-perm decrement)])
    (and (fx>? new-perm destroy) (make-syn source new-perm))))

… Connected test [synapse_is_connected = (permanence >= incidence_rate)] becomes:

  (fl>=? (perm->fl (syn-perm synapse)) incidence-rate)

Results seem, on brief inspection, to be very similar to the previous version; total run time increased by about 1%.

1 Like

In my experiment with sine-wave prediction, with the same permanence_increment = 0.1, I observed two things: how many learning steps it takes to get the same “good feeling” prediction, and the anomaly score. The original version provides good prediction results and a very low anomaly score after 4000 run steps, while the new one never reaches a good prediction!

1 Like

That’s not a valid value for that parameter …
Also, you will need to initialize synapses at or near the threshold incidence_rate.
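
For example, something along these lines (a sketch under my own assumptions, not the actual implementation):

import random

incidence_rate = 0.2   # example connected threshold

def initial_permanence(rng=random):
    # Start new synapses at or just around the connected threshold, so a few
    # correct predictions are enough to keep them connected.
    return incidence_rate + rng.uniform(-0.05, 0.05)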

1 Like

Internally I check:
if (decr == 0) decr = 0.01;

The new model can implement weighted synapses. Instead of applying a simple threshold to the synapse’s incidence rate, apply a sigmoid/logistic function.

  • Newly initialized synapses can have a small effect, while established synapses can have a much larger effect.
  • You can still make binary thresholds by setting the slope of the sigmoid to infinity.

The old model could not implement weighted synapses because, after running for long enough, the simple up/down counters almost always saturate at either 0 or 1, meaning that almost all of the synapses are destined to have binary weights.
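
A minimal sketch of the weighted-synapse idea (the slope and midpoint values are my own illustration):

import math

incidence_rate = 0.2   # midpoint of the sigmoid (the old connected threshold)
slope          = 30.0  # steepness; slope -> infinity recovers the binary threshold

def synapse_weight(permanence):
    """Logistic weight: small for newly initialized synapses, near 1 for established ones."""
    return 1.0 / (1.0 + math.exp(-slope * (permanence - incidence_rate)))

# The segment's response becomes a weighted sum instead of a simple count:
def segment_overlap(permanences, axons_active):
    return sum(synapse_weight(p) for p, active in zip(permanences, axons_active) if active)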

2 Likes

I… don’t understand this. The dendritic segment or the entire dendrite? Is there a biological equivalent or does this only work in HTM code?

1 Like

Sorry, that statement was a bit handwavy…

Imagine that you have a dendrite that detects “N” many input patterns using “N” many synapses, one synapse per input. Now suppose that you see each pattern once and the dendrite learns each time: each synapse’s permanence will be incremented once and decremented N-1 times, resulting in a net change of “increment - (N - 1) * decrement”. If this value is negative, then synapses will get weakened and disconnected until the dendrite stops responding to all N patterns.

  • Correction: the maximum number of patterns is
    N = incr / decr + 1  (see the quick check below)
  • This assumes that every pattern is seen with equal frequency.
  • This assumes that every synapse is used for exactly one pattern.
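
A quick numeric check of that formula, under the same assumptions (the parameter values are just an example):

increment, decrement = 0.1, 0.02
N_max = increment / decrement + 1            # 6 patterns
net   = increment - (N_max - 1) * decrement  # ~0: exactly break-even at N = N_max
print(N_max, net)                            # roughly 6 and 0, up to floating-point noise
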
2 Likes

Ok, yes, that makes sense. Thank you.

Could I press you a little bit further? In biology (if I remember correctly from Jeff’s lectures), about half of roughly 40 synapses within a short stretch of dendrite need to be triggered temporally close together to initiate a dendritic spike strong enough to make it to the soma. Shouldn’t that amount also be taken into account?

2 Likes

Perhaps?

I’m using this as part of a spatial pooler.
The SP’s dendrites have no limit on the number of synapses; instead they have a “potential pool”, which is a random subset of the inputs, and that limits the number of synapses on each dendrite.
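
Roughly, the potential pool amounts to something like this (a sketch, not the actual SP code):

import random

def make_potential_pool(num_inputs, potential_pct, seed=42):
    """Each SP dendrite may only ever form synapses onto this random subset of the inputs."""
    rng = random.Random(seed)
    pool_size = int(round(num_inputs * potential_pct))
    return rng.sample(range(num_inputs), pool_size)

pool = make_potential_pool(num_inputs=1000, potential_pct=0.5)   # at most 500 synapses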

2 Likes