As I have mentioned many times on this forum, similarity based on OVERLAP (or, for that matter, Hamming distance, dot product, cosine similarity, or any other operation on any type of vector: SDRs, Kanerva hypervectors, real-valued vectors …) is not adequate.
After much mental gymnastics and thinking I came up with an idea that I find ingenious but also convenient.
!!! Keep in mind I have not tested any of this. I’m still figuring it out !!!
Normally, SP cells sample the source SDR, and those inputs are required to have similarity encoded into them, which the SP ought to preserve.
But as I said, vector similarity is not granular enough.
What if we do not require that the SDR encode similarity (or we use it only as a lower-level similarity), and instead make a simple tweak to the SP algorithm?
The first tweak comes in two different-looking but equivalent flavors, which I use to externalize overlap/similarity/union.
- Sequence/time based
In this scenario we postpone the Winner-Take-All (WTA) part of the algorithm.
We sample several sequential inputs one after the other (instead of just one), then average the per-cell scores, i.e. for i = 1 … n, AVG(BinVec * Permanence).
Only then do we apply WTA to pick the winning cells.
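The postponed-WTA idea can be sketched in a few lines of NumPy. This is a toy SP, not any real HTM implementation: the sizes, the dense permanence matrix, and the top-k WTA are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

N_INPUT = 256    # input SDR size (illustrative)
N_CELLS = 128    # SP cell count (illustrative)
K = 3            # number of WTA winners (the "2% of N" sparsity)

# Toy permanence matrix: every cell samples the whole input for simplicity.
permanence = rng.random((N_CELLS, N_INPUT))

def overlap_scores(sdr):
    # BinVec * Permanence: per-cell score for one input SDR.
    return permanence @ sdr

def sp_sequence(sdrs, k=K):
    # The tweak: average the scores over the whole sequence of inputs,
    # i.e. AVG_i(BinVec_i * Permanence), and only THEN apply WTA.
    avg = np.mean([overlap_scores(s) for s in sdrs], axis=0)
    winners = np.argsort(avg)[-k:]          # top-k cells win
    out = np.zeros(N_CELLS, dtype=np.int8)
    out[winners] = 1
    return out

def random_sdr(on_bits=8):
    # Random (near-orthogonal) SDRs standing in for symbols.
    s = np.zeros(N_INPUT, dtype=np.int8)
    s[rng.choice(N_INPUT, on_bits, replace=False)] = 1
    return s

a, b, c = random_sdr(), random_sdr(), random_sdr()
compound = sp_sequence([a, b, c])   # a "Compound SDR" for the sequence a, b, c
print(compound.sum())               # K active bits
```

Because the averaging happens over any number of inputs, the same call computes a joint representation of 2, 3, or n SDRs, which is where "similarity between more than 2 SDRs" comes from below.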
What changes if we make this tweak:
- The output vector is a new kind of OVERLAP, not a normalized input SDR
- The output vector could also be a brand-new Compound SDR that represents a combination of terms/SDRs: a different kind of UNION
- Similarity can be encoded by the full vector, 0 to N bits, not just by part of the 2% of N active bits
- Similarity between SDRs is decided not by SDR overlap, but by being part of the same sequence
- You can still use the normal overlap as a low-level similarity
- The inputs can be randomly generated, orthogonal SDRs. Good for symbols where you don't know in advance how similar they are.
- You can calculate similarity between more than 2 SDRs
- Similarity is ‘trained’ and data-dependent, not intrinsic
- Concatenated
Instead of a sequence, just concatenate the inputs, but mirror the connections for every segment,
i.e. if inputA1 <-> cellX1, then inputB1 <-> cellX1, inputC1 <-> cellX1 …
- Benefit: no change to the algorithm, just mirrored connections
- Drawback: too much memory for all the additional permanences
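The equivalence of the two flavors is easy to check numerically. A minimal sketch, assuming the same kind of toy SP (dense permanence matrix, illustrative sizes): mirroring the connections is just tiling the permanence matrix across the segments, and one pass over the concatenated input yields the same scores as summing the per-segment scores.

```python
import numpy as np

rng = np.random.default_rng(0)
N_INPUT, N_CELLS, SEGMENTS = 64, 32, 3   # illustrative sizes

permanence = rng.random((N_CELLS, N_INPUT))

# Mirror the connections for every segment:
# inputA1 <-> cellX1, inputB1 <-> cellX1, inputC1 <-> cellX1 …
# np.tile materializes SEGMENTS copies, which illustrates the memory drawback.
mirrored = np.tile(permanence, (1, SEGMENTS))  # shape (N_CELLS, SEGMENTS * N_INPUT)

def random_sdr(on=4):
    s = np.zeros(N_INPUT, dtype=np.int8)
    s[rng.choice(N_INPUT, on, replace=False)] = 1
    return s

segs = [random_sdr() for _ in range(SEGMENTS)]
concat = np.concatenate(segs)

# One pass over the concatenation == sum of per-segment scores,
# i.e. the same ranking (and thus the same WTA winners) as the sequence flavor.
scores_concat = mirrored @ concat
scores_summed = sum(permanence @ s for s in segs)
print(np.allclose(scores_concat, scores_summed))  # True
```

If the segments literally share one permanence matrix instead of copies, the memory cost disappears, but then learning updates are tied across segments; storing independent copies per segment is what the drawback above refers to.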
The second tweak, which is not fully fleshed out in my mind, is to bolt a single-layer TM on top of the SP. To be more clear: the SP represents the proximal part of a group of HTM neurons; putting a single-layer TM on top simply means we add the distal part, thus implementing a group of FULL HTM neurons.
What this gives us is Context. This context is an SDR that filters/limits the possible overlap candidates at WTA.
How can you use this?
Without Context, say you want to express the following (every word is an SDR):
Cats have claws.
Chairs are furniture.
you have to do this:
means1 = SP([cats,claws])
means2 = SP([chairs,furniture])
with Context it becomes akin to:
means1 = SP([cats,claws] if have)
means2 = SP([chairs,furniture] if is)
This way we can encode:
Chairs are-not fish
Chairs have 4-legs
Cats are mammal
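One hypothetical way to realize "context filters the overlap candidates at WTA" is a simple mask over the SP's cells: only cells inside the context SDR are allowed to win. Everything here is an assumption for illustration (the toy SP, `sp_with_context`, and the idea that a verb like *have* arrives as a context SDR over the cells); it is a sketch, not the distal mechanism of a real TM.

```python
import numpy as np

rng = np.random.default_rng(7)
N_INPUT, N_CELLS, K = 128, 64, 3   # illustrative sizes

permanence = rng.random((N_CELLS, N_INPUT))

def random_sdr(n, on):
    s = np.zeros(n, dtype=np.int8)
    s[rng.choice(n, on, replace=False)] = 1
    return s

def sp_with_context(sdrs, context_mask, k=K):
    # Proximal part: average the scores over the inputs (the first tweak).
    avg = np.mean([permanence @ s for s in sdrs], axis=0)
    # Distal part (sketched): context limits which cells may compete at WTA.
    avg = np.where(context_mask > 0, avg, -np.inf)
    winners = np.argsort(avg)[-k:]
    out = np.zeros(N_CELLS, dtype=np.int8)
    out[winners] = 1
    return out

cats, claws = random_sdr(N_INPUT, 6), random_sdr(N_INPUT, 6)
have_ctx = random_sdr(N_CELLS, N_CELLS // 2)  # hypothetical context SDR for "have"

means1 = sp_with_context([cats, claws], have_ctx)  # SP([cats,claws] if have)
assert np.all(have_ctx[means1.astype(bool)] == 1)  # winners lie inside the context
```

The same input pair under a different context mask (e.g. one for *is*) can land on different winners, which is exactly what distinguishes "Cats have claws" from "Cats are mammal" above.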
The benefits of this variant of the SP are obvious, IF IT WORKS.
The idea is that by externalizing Similarity we become able to implement more complex relations.
My thinking is that to implement a Concept Grid we need a Concept metric system, which needs Distance, i.e. 1 - Similarity.
Having this, maybe I can implement the Thought process as a navigation problem through Concept Space.
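For completeness, a minimal sketch of that derived distance, assuming similarity between two (compound) SDRs is taken as their normalized overlap; the normalization choice is mine, not from the post:

```python
import numpy as np

def similarity(a, b):
    # Normalized overlap between two binary vectors (one possible choice).
    on = max(a.sum(), b.sum())
    return (a & b).sum() / on if on else 1.0

def distance(a, b):
    # The proposed Concept-metric distance: 1 - Similarity.
    return 1.0 - similarity(a, b)

x = np.array([1, 0, 1, 1, 0], dtype=np.int8)
y = np.array([1, 0, 0, 1, 0], dtype=np.int8)
print(distance(x, y))   # 1 - 2/3 ≈ 0.333
```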