To refresh your memory: HTM synapses learn with the following rules:
if dendritic_segment_is_active:
    if axon_is_active:
        permanence += increment  // Potentiate the synapse
    else:
        permanence -= decrement  // Depress the synapse

synapse_is_connected = (permanence >= 0.5)
The permanence variable is essentially an up/down counter.
New Parameters
First, replace the parameters increment and decrement with:
- The incidence_rate, which is decrement / increment
- The learning_period, which is 1 / increment
I feel that these new parameters are much more intuitive to work with and think about. The incidence rate measures how often a presynapse needs to correctly predict the postsynaptic activity in order to maintain the synapse. The learning period measures how many samples of data the synapse can learn from before its permanence saturates.
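To make the mapping between the two parameter sets explicit, here is a tiny stand-alone sketch (the variable names and example values are just for illustration):

#include <cstdio>

int main() {
    // Old parameterization (example values, not prescriptive):
    const double increment = 0.1;
    const double decrement = 0.01;

    // New parameterization:
    const double incidence_rate  = decrement / increment;   // = 0.1
    const double learning_period = 1.0 / increment;         // = 10 samples

    // Converting back:
    const double incr = 1.0 / learning_period;               // = 0.1
    const double decr = incidence_rate / learning_period;    // = 0.01

    std::printf("incidence_rate = %g, learning_period = %g\n", incidence_rate, learning_period);
    std::printf("incr = %g, decr = %g\n", incr, decr);
    return 0;
}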
The New Model
Redefine the permanence variable to measure the incidence rate of each synapse, using an exponential moving average. The new model is:
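A sketch of the update, in the same pseudocode style as above and assuming the exponential-moving-average form used in the implementation quoted further down (with the incidence rate serving as the connected threshold):

alpha = 1 - exp(-1 / learning_period)

if dendritic_segment_is_active:
    // Move the permanence toward 1 if the axon fired, toward 0 otherwise.
    permanence = (1 - alpha) * permanence + alpha * axon_is_active

synapse_is_connected = (permanence >= incidence_rate)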
I’ve tested both models and they both work. The new model is inexpensive to compute and theoretically sound, and an exponential moving average is a better estimator of a synapse’s incidence rate than a simple up/down counter.
I wonder if it works with lower-precision data types.
I have recently gotten used to working with 8-bit synapses and they seem to work just as well as 32-bit floats, and I’m still trying to find a good way to make 2-bit synapses work with no performance loss.
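For what it’s worth, the EMA update also has a natural fixed-point form. The sketch below shows one way an 8-bit permanence could carry it; the Q8 scaling and the names are assumptions for illustration, not how any existing implementation does it:

#include <cstdint>

// Sketch only: exponential-moving-average permanence update with an 8-bit
// permanence stored in [0, 255].  alpha is a Q8 fixed-point fraction
// (alpha_q8 / 256, in 0..256); none of these names come from htm.core.
inline std::uint8_t updatePermanence8(std::uint8_t permanence, bool axonActive,
                                      std::uint16_t alpha_q8) {
    const int target = axonActive ? 255 : 0;  // grow toward 1 if the axon fired, decay toward 0 otherwise
    const int num    = static_cast<int>(alpha_q8) * (target - permanence);
    const int step   = (num >= 0) ? (num + 128) / 256 : (num - 128) / 256;  // alpha * (target - p), rounded to nearest
    // Steps smaller than half an LSB quantize to zero, so permanences can stall
    // near the extremes; a prune threshold above zero (discussed later in this
    // thread) covers the low end.
    return static_cast<std::uint8_t>(permanence + step);
}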
I have just implemented your new learning rules in the Connections class of htm.core:
void adaptSegment(const Segment segment, const SDR &inputs,
                  const Permanence increment,
                  const Permanence unusedPerm = 0,
                  const bool pruneZeroSynapses = false,
                  const UInt segmentThreshold = 0)
{
  // use new learning mode here
  const auto learningPeriod = 1 / increment;
  const auto incidenceRate  = htm::minPermanence + htm::Epsilon;
  const Permanence alpha    = 1 - std::exp(-1 / learningPeriod);

  const auto &inputArray = inputs.getDense();

  if (timeseries_) {
    previousUpdates_.resize(synapses_.size(), minPermanence);
    currentUpdates_.resize(synapses_.size(), minPermanence);
  }

  vector<Synapse> destroyLater;
  destroyLater.reserve(numSynapses(segment));

  for (const auto synapse : synapsesForSegment(segment)) {
    const SynapseData &synapseData = dataForSynapse(synapse);
    const Permanence newPermanence =
        (1 - alpha) * synapseData.permanence + alpha * inputArray[synapseData.presynapticCell];

    // prune permanences that reached zero
    if (pruneZeroSynapses and newPermanence < incidenceRate) {
      // new value will disconnect the synapse
      destroyLater.push_back(synapse);
      prunedSyns_++; // for statistics
      continue;
    }

    // update synapse, but for TS only if changed
    if (timeseries_) {
      const auto update = newPermanence - synapseData.permanence;
      if (update != previousUpdates_[synapse]) updateSynapsePermanence(synapse, newPermanence);
      currentUpdates_[synapse] = update;
    } else {
      updateSynapsePermanence(synapse, newPermanence);
    }
  }

  // destroy synapses accumulated for pruning
  for (const auto pruneSyn : destroyLater) destroySynapse(pruneSyn);

  // destroy segment if it has too few synapses left -> will never be able to connect again
#ifdef NTA_ASSERTIONS_ON
  if (segmentThreshold > 0)
    NTA_ASSERT(pruneZeroSynapses) << "Setting segmentThreshold only makes sense when pruneZeroSynapses is allowed.";
#endif
  if (pruneZeroSynapses and synapsesForSegment(segment).size() < segmentThreshold) {
    destroySegment(segment);
    prunedSegs_++; // statistics
  }
}
and tested it on predicting a simple sine wave with the new parameters:
learning_period = 1 / incrementPerm, with incrementPerm = 0.1, 0.2, 0.5
incidence_rate = the standard minimum permanence
Unfortunately, the prediction results are worse than with the original rules.
Am I missing something in your idea/implementation?
Thanks
This is incorrect. The incidence_rate should be the ratio of the permanence decrement to the increment (“decr / incr”).
Typically it should be pretty small but greater than zero, like 0.1 or 0.05.
If you’ve set it to 0.5 then it’s way too high!
// prune permanences that reached zero
if (pruneZeroSynapses and newPermanence < incidenceRate)
This line is incorrect. The incidence_rate is the new permanence threshold (the connected threshold). The cut-off for deleting synapses should be much lower than the connected threshold, so that synapses can be disconnected but still exist in the system.
I don’t have any direct comparisons, but I expect that the added computation is very easy (it’s just an extra two multiplications).
However, the real issue is that now the synapse permanences will never actually reach zero; they exponentially approach zero instead. Synapses only get removed when their permanence reaches zero, so now synapses will never be removed. I recommend simply increasing the threshold for pruning synapses.
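Concretely, one way to apply both corrections to the snippet above would be to keep the incidence rate as the connected threshold and prune against a separate, much smaller cut-off. This is only a sketch; the factor of 0.1 is an arbitrary illustration, not a value from the thread:

// Sketch: prune against a small, non-zero cut-off well below the connected threshold.
const Permanence connectedThreshold = incidenceRate;              // i.e. decrement / increment
const Permanence pruneThreshold     = 0.1f * connectedThreshold;  // arbitrary small cut-off > 0

// ... inside the synapse loop ...
if (pruneZeroSynapses and newPermanence < pruneThreshold) {
  destroyLater.push_back(synapse);  // permanence has decayed far enough to remove the synapse
  prunedSyns_++;
  continue;
}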
In my test with ApicalTieBreakTM, pruneZeroSynapses is already false, so that makes no difference to the results.
Once again, my test results are worse than the original ones.
In my experiment with sine wave prediction, using the same permanence_increment = 0.1, I observed two things: how many learning steps it takes to reach the same “good feeling” prediction, and the anomaly score. The original version gives good predictions and a very low anomaly score after 4000 steps, while the new one never reaches a good prediction!
The new model can implement weighted synapses. Instead of applying a simple threshold to the synapse’s incidence rate, apply a sigmoid/logistic function.
Newly initialized synapses can have a small effect, while established synapses can have a much larger effect.
You can still make binary thresholds by setting the slope of the sigmoid to infinity.
The old model could not implement weighted synapses because, after running for long enough, the simple up/down counters almost always saturate at either 0 or 1, meaning that almost all of the synapses are destined to have binary weights.
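A rough sketch of what that weighting could look like (the function name and the steepness parameter are illustrative, not from any existing codebase):

#include <cmath>

// Weight a synapse by a logistic function of its estimated incidence rate,
// centered on the connection threshold.  As steepness -> infinity this recovers
// the old binary connected/disconnected behaviour.
inline float synapseWeight(float permanence, float threshold, float steepness) {
  return 1.0f / (1.0f + std::exp(-steepness * (permanence - threshold)));
}

// A dendrite's response would then be a weighted sum over its active synapses
// rather than a count of connected active synapses.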
Imagine that you have a dendrite that detects N input patterns using N synapses, one synapse per pattern. Now suppose that you see each pattern once and the dendrite learns each time: each synapse’s permanence will be incremented once and decremented N − 1 times, for a net change of “increment − (N − 1) × decrement”. If this value is negative, then the synapses will get weakened and disconnected until the dendrite stops responding to all N patterns.
Correction: the maximum number of patterns is N = incr / decr + 1 (see the short derivation below).
This assumes that every pattern is seen with equal frequency.
This assumes that every synapse is used for exactly one pattern.
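Spelling that bound out under the two assumptions above:

$$
\Delta p = \text{incr} - (N - 1)\,\text{decr} \;\ge\; 0
\quad\Longrightarrow\quad
N \;\le\; \frac{\text{incr}}{\text{decr}} + 1 \;=\; \frac{1}{\text{incidence\_rate}} + 1
$$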
Could I press you a little further? In biology (if I remember correctly from Jeff’s lectures), about half of the roughly 40 synapses within a short stretch of dendrite need to be triggered close together in time to initiate a dendritic spike strong enough to make it to the soma. Shouldn’t that number also be taken into account?
I’m using this as part of a spatial pooler.
SP’s dendrites have no limit on the number of synapses; instead they have a “potential pool”, which is a random subset of the inputs, and that limits the number of synapses on each dendrite.
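For readers who haven’t seen the term, here is a minimal sketch of what a potential pool is (generic C++, not the actual htm.core SpatialPooler code):

#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Each SP dendrite may only ever form synapses onto a fixed random subset
// ("potential pool") of the input bits, chosen once at initialization.
std::vector<unsigned> makePotentialPool(unsigned numInputs, double potentialPct, std::mt19937 &rng) {
  std::vector<unsigned> pool(numInputs);
  std::iota(pool.begin(), pool.end(), 0u);        // all input indices
  std::shuffle(pool.begin(), pool.end(), rng);    // random order
  const auto keep = static_cast<unsigned>(potentialPct * numInputs);  // e.g. 0.5 keeps half
  pool.resize(keep);
  return pool;  // this dendrite's synapses are restricted to these inputs
}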