This is beautifully animated, and it explains how they create semantic maps very well.
One thing that has puzzled me about the semantic folding process is the step where inputs with similar meanings are placed near each other. From my understanding of SDRs, semantics has nothing to do with the nearness of bits, but rather with the percentage of overlapping bits. Is there a functional reason for this step, or is it meant only for the visualization aspect of the resulting SDRs?
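For reference, the overlap measure I have in mind is just intersection size between the sets of active-bit indices, which is completely insensitive to where those bits sit. A toy example (indices made up):

```python
# Two SDRs as sets of active-bit indices (toy values).
a = {3, 17, 42, 250, 901}
b = {17, 42, 88, 901, 1500}

overlap = len(a & b)           # 3 shared bits
similarity = overlap / len(a)  # 0.6, independent of where the bits sit
print(overlap, similarity)
```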
The reason is topology! Cortical's semantic fingerprints are topological. I think the semantics are more valuable with this topology, because the overlap between fingerprints becomes visually meaningful.
Ah, of course. This is the one step in the process that I am having trouble replicating (I’m trying to create a variation of semantic folding for use as a “universal encoder”). I can replicate everything up to the step where I’ve generated a 1D word/context vector, but in the whitepaper it then states:
the contexts represent vectors that can be used to create a two-dimensional map in such a way that similar context-vectors are placed closer to each other, using topological (local) inhibition mechanisms and/or by using competitive Hebbian learning principles.
I’m wondering if anyone has a little more detail on how this is done. I suppose topology isn’t necessarily needed in my case, but it would help fill the remaining gap in my understanding of the process.
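In case it helps frame the question, my best guess is that "competitive Hebbian learning" here means something along the lines of a Kohonen self-organizing map: each input vector competes for a best-matching grid cell, and that cell plus its neighbors get pulled toward the input, so similar vectors end up in nearby places. This is only a sketch of my reading, not Cortical's actual method; the function and parameter names are my own, and it assumes the context vectors come as rows of a NumPy array:

```python
import numpy as np

def train_som(context_vectors, grid=(128, 128), epochs=10,
              lr0=0.5, sigma0=None, seed=0):
    """Toy self-organizing map: arranges similar context vectors
    near each other on a 2D grid via competitive Hebbian learning."""
    rng = np.random.default_rng(seed)
    h, w = grid
    dim = context_vectors.shape[1]
    if sigma0 is None:
        sigma0 = max(h, w) / 2.0
    weights = rng.random((h, w, dim))   # one codebook vector per grid cell
    ys, xs = np.mgrid[0:h, 0:w]         # cell coordinates, for neighborhoods
    steps = epochs * len(context_vectors)
    t = 0
    for _ in range(epochs):
        for v in context_vectors:
            # Competitive step: find the best-matching unit (BMU).
            d = np.linalg.norm(weights - v, axis=2)
            by, bx = np.unravel_index(np.argmin(d), d.shape)
            # Decay the learning rate and neighborhood radius over time.
            frac = t / steps
            lr = lr0 * (1.0 - frac)
            sigma = sigma0 * (1.0 - frac) + 1.0
            # Hebbian step: pull the BMU and its neighbors toward v.
            dist2 = (ys - by) ** 2 + (xs - bx) ** 2
            nb = np.exp(-dist2 / (2.0 * sigma ** 2))
            weights += (lr * nb)[..., None] * (v - weights)
            t += 1
    return weights
```

If that reading is right, the "topological (local) inhibition" in the quote would correspond to the shrinking Gaussian neighborhood around the winning cell: nearby cells are reinforced while distant ones are effectively suppressed.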
A thought occurs to me: if you had another map connected to the “retina” as the corpus was fed in, you could capture the fingerprints during the training phase.
The relationship between the two maps would be similar to the relationship between Broca and Wernicke’s areas.
Putting the H in HTM to good use …
That is in fact what I am doing in my implementation: the “fingerprints” evolve as more inputs come in. This allows the system to learn better encodings over time, without necessarily needing an exhaustive data set from the start.
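Roughly, the incremental part looks like this (a simplified sketch of my setup; the class and names are my own): each time a context containing a word lands on a grid cell, that cell's count for the word goes up, and the fingerprint at any moment is just the top few percent of cells by count:

```python
from collections import defaultdict
import numpy as np

class EvolvingFingerprints:
    """Toy incremental fingerprints: per-word counts accumulate as
    contexts arrive, so encodings keep improving without a fixed corpus."""
    def __init__(self, grid=(128, 128), sparsity=0.02):
        self.sparsity = sparsity
        self.counts = defaultdict(lambda: np.zeros(grid))

    def observe(self, word, bmu):
        """Record that a context containing `word` mapped to grid cell `bmu`."""
        self.counts[word][bmu] += 1

    def fingerprint(self, word):
        """The top-k most active cells form the word's sparse fingerprint."""
        c = self.counts[word].ravel()
        k = max(1, int(self.sparsity * c.size))
        active = np.argsort(c)[-k:]         # indices of the k largest counts
        return set(active[c[active] > 0])   # drop cells never actually seen
```

Accumulating counts rather than raw bits is what lets early, noisy placements get outvoted as more data arrives.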