Nice. But what kind of similarity are the neurons clustered by?
I’ll reference this reply to @cezar_t, which I thought was a clear explanation too. It also nests two or three references, in other threads, to the structure I’m looking for. These things get lost in the length of the thread(s):
So one way to think about this is to say I want to replicate the Brazilian paper, but do it for the clusters formed by language sequences.
For instance, take the nice clear sequence example @complyue posted for the sentence “The quick brown fox jumps over the old lazy dog” (pulled out of this simple network using the prompt “q”):
I started OscBrain/Verif-001.ipynb (in the complyue/OscBrain repo on GitHub) for the case of single-sentence learning.
The paths form well for so short a sequence (experimented with “the quick brown fox jumps over the old lazy dog”), and when there is no ambiguity, a single letter (“q”) prompts the exact (sub)sequence accurately:
(HTM-style SDRs do form well too; e.g. in the “o” row, each spike of the letter is a different SDR instance.)
The “lines” for the words are easy to find in this case:
If we sorted them vertically by time, the y-axis would spell out the sentence.
But for just a single sentence there is no time gap between the words. If you think of the whole language, though, there are going to be many more paths between letters within words than between letters across word boundaries. For instance, there will be far more paths over “t-h-e” and “q-u-i-c-k” than over “…e-q…”, which is the transition between the words “the” and “quick”. So if we can find a way to cluster more tightly according to the number of paths through letters, we should be able to cluster the spikes for the words more tightly together, and distinguish the words by synchrony of spikes rather than just sequence of spikes.
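Just to make that path-counting intuition concrete, here is a toy sketch in plain Python, nothing to do with the OscBrain code itself. The little corpus and the two counters are invented for illustration; it simply tallies letter-to-letter transitions, keeping word-internal pairs separate from cross-word pairs, to show the former pile up much faster than the latter:

from collections import Counter

# Invented toy corpus standing in for "the whole language".
corpus = ("the quick brown fox jumps over the old lazy dog "
          "the quick red fox jumps over the lazy old dog "
          "the old dog saw the red fox "
          "a quick brown dog jumps over the lazy fox ")

within, across = Counter(), Counter()   # word-internal vs cross-word letter pairs
prev, boundary = None, True
for ch in corpus:
    if not ch.isalpha():
        boundary = True
        continue
    if prev is not None:
        (across if boundary else within)[(prev, ch)] += 1
    prev, boundary = ch, False

print("t->h inside words:", within[("t", "h")])   # many paths: "the" recurs everywhere
print("e->q across words:", across[("e", "q")])   # few paths: "the quick" is just one habit among many

In a real corpus the imbalance would be much larger, which is the leverage I’m hoping tighter clustering can use.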
That might be enough to take in one go. But to sketch how I see this extending beyond just word identification, consider a slight extension: add a variation of the sentence:
lnet.learn_words_as_sequence(
    '''
the quick brown fox jumps over the old lazy dog the quick red fox jumps over the lazy old dog
    ''',
    sp_width=(3, 20),  # width of spike train: [n_columns, n_cells]
    sp_thick=60,       # thickness of spike train
)
So the “corpus” now is two sentences, which are the same except “brown” is replaced by “red” in the second. If we “prompt” this with “q” as before, we get two sequences being traced through the network. They both start with “quick”, but then we get “brown” and “red” being traced at the same time:
And then after that it becomes a mess, because “red” is shorter than “brown”, so its sequence continues on into “f-o-x” even as the “brown” sequence is still spiking “-w-n”…
The thing to do would be to use the tighter clustering of “brown” and “red” to separate them from “fox”, which comes after.
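Just to pin down what I mean by synchrony rather than sequence, here is another invented sketch. The spike windows below are numbers I made up to mimic the trace, not OscBrain output: “brown” and “red” occupy the same slot, so their windows largely coincide, while “fox”, triggered by the shorter “red” branch, only trails them:

# Invented spike windows (arbitrary ticks) for the three word traces after the prompt "q".
windows = {
    "brown": (5, 9),   # b-r-o-w-n
    "red":   (5, 7),   # r-e-d
    "fox":   (8, 10),  # f-o-x, started by the shorter "red" branch
}

def sync(a, b):
    """Overlap of two spike windows, as a fraction of the shorter window."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return max(0, hi - lo + 1) / min(a[1] - a[0] + 1, b[1] - b[0] + 1)

print(sync(windows["brown"], windows["red"]))   # 1.0  -> same cluster ("brown"/"red")
print(sync(windows["red"],   windows["fox"]))   # 0.0  -> "fox" is sequence, not synchrony
print(sync(windows["brown"], windows["fox"]))   # ~0.67, but "fox" fails the test against "red"

The point is that a mutual-synchrony test over all pairs pulls “brown” and “red” into one group and leaves “fox” out, even though “fox” partly overlaps the still-spiking “brown”.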
And then, if we could manage that, it introduces the interesting possibility of other sentences shared by “brown” and “red”, sentences which continue with words other than “fox”, but which might be used to generalize “brown” and “red”. We would have essentially projected out a “class” of words that includes “brown” and “red”, and that can be used to generalize the sequences of either by informing them with the other. Such clusters would be of the same type currently “learned” by large language models.
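To show the kind of “class” I mean, here is one last toy sketch, again plain Python with an invented corpus, and not a claim about how LLMs or OscBrain actually do it. Words that occur in the same left/right slot get grouped, and each member inherits the continuations seen for the others:

from collections import defaultdict

# Invented toy corpus; the third sentence gives "brown" a continuation "red" has never had.
corpus = ("the quick brown fox jumps over the old lazy dog "
          "the quick red fox jumps over the lazy old dog "
          "a big brown bear sleeps").split()

contexts = defaultdict(set)            # word -> set of (left neighbour, right neighbour)
for left, word, right in zip(corpus, corpus[1:], corpus[2:]):
    contexts[word].add((left, right))

print(contexts["brown"] & contexts["red"])     # {('quick', 'fox')} -> shared slot, same class

continuations = defaultdict(set)       # word -> words seen immediately after it
for left, word in zip(corpus, corpus[1:]):
    continuations[left].add(word)

cls = {"brown", "red"}                 # the projected class
pooled = set().union(*(continuations[w] for w in cls))
print(pooled)                          # "bear" is now available after "red" too, though never seen

That pooling step, where class members substitute for one another in contexts only one of them has actually been seen in, is the generalization I mean.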
I’ll stop there, to see if any of this makes sense to anyone.