You’re equating oscillators to SDRs. Which may be quite reasonable. And no doubt you could pre-generate many.
But why do so, if the selection is the same as the generation?
Interesting talk panel though. Also including Christoph von der Malsburg (who I've had some contact with through the Mindfire Swiss AI initiative). I'll listen to that (2 hours?)
But just to get some ideas out.
I was thinking about going down to a letter representation level as I discussed with @complyue . It might be something to try. The logic would be exactly the same. But thinking first in terms of letters may clarify what we are looking for.
Thinking out loud here, so there may be logical errors. But in the spirit of encouraging early feedback.
If we imagine the letter raster plot in time, then we might imagine coding the input of a sentence (a prompt) like “Place wood blocks” with node spikes for the constituent letters as (letter vs. time of spike):
a          x
b                                      x
c              x                                   x
d                                  x
e                  x
f
g
h
i
j
k                                                      x
l      x                                   x
m
n
o                          x   x               x
p  x
q
r
s                                                          x
t
u
v
w                      x
x
y
z
Where I have taken time gaps of three spaces between each letter as it occurs in the phrase in time.
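Just to make that concrete, here's a rough Python sketch of the encoding I have in mind (the names and the GAP constant are just mine, for illustration):

```python
# Toy sketch: encode a phrase as (letter, spike time) pairs,
# one spike per letter occurrence, equally spaced in time.

GAP = 4  # one letter slot plus a gap of three "spaces"

def encode_phrase(phrase):
    """Return a list of (letter, time) spikes for the letters of a phrase."""
    spikes = []
    t = 0
    for ch in phrase.lower():
        if ch.isalpha():
            spikes.append((ch, t))
            t += GAP
    return spikes

print(encode_phrase("Place wood blocks"))
# [('p', 0), ('l', 4), ('a', 8), ('c', 12), ('e', 16), ('w', 20), ...]
```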
Those are the input, or “prompt” spikes. The actual network would also be coded like this. So you can think of these spikes as having edges between them in sequence. And for the whole network, representing many phrases, each node would have many, many, such edges.
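Continuing the same toy sketch, the network could be nothing more than counts of these sequential edges, accumulated over every phrase it has seen (again, purely illustrative):

```python
from collections import defaultdict

def add_phrase_edges(edge_counts, phrase):
    """Add an edge between each pair of consecutive letters in a phrase.

    edge_counts maps (letter, next_letter) -> count, accumulated over
    all the phrases the network has seen."""
    letters = [ch for ch in phrase.lower() if ch.isalpha()]
    for a, b in zip(letters, letters[1:]):
        edge_counts[(a, b)] += 1

edge_counts = defaultdict(int)
for phrase in ["Place wood blocks", "Place the wood", "Blocks of wood"]:
    add_phrase_edges(edge_counts, phrase)
```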
As input, the spikes for the letters making up the words are equally spaced. A bit like this:
P  l  a  c  e  w  o  o  d  b  l  o  c  k  s
The hypothesis is that sub-sequences of letters like "place", "wood", and "blocks" would tend to have more edges at their beginnings and ends, and thus under oscillation would tend to synchronize their constituent letter spikes, so the spike times might be pushed closer together, like this:
P l a c e   w o o d   b l o c k s
Because there would be more edges to the rest of the network at the ends of words than in the middle of words.
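A crude way to check that part of the hypothesis, without any oscillator dynamics at all, is just to count how many distinct letters have ever followed each letter of the prompt in the accumulated edges. If the idea is right, the counts should jump at the word-final letters ("e", "d", "s"). This is only a static proxy for what synchronization under oscillation is supposed to find dynamically:

```python
def successor_diversity(edge_counts, phrase):
    """For each letter of the phrase, count how many distinct letters
    the network has ever seen follow it -- a crude stand-in for the
    diversity of onward connections at that point."""
    letters = [ch for ch in phrase.lower() if ch.isalpha()]
    return [(ch, len({b for (a, b) in edge_counts if a == ch}))
            for ch in letters]

# With edges from a large corpus, the counts for 'e', 'd' and 's'
# (the word-final letters) should stand out from the mid-word letters.
print(successor_diversity(edge_counts, "Place wood blocks"))
```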
Thinking about it, this might be seen as very much like the way pooling was already done in HTM (and still is? @Bitking , @DanML ?) Which is to say, based on the idea that predictions of the next letter are more diverse at word breaks. Assuming an HTM were coded to represent letters.
The difference might be that we would be identifying diversity of prediction at word breaks using spike synchronization under oscillation. Which would be a kind of "external" network representation for each letter (external in the sense of representing each letter by that letter's external connectivity to other letters). Whereas HTM currently does it based on… training? Training of an internal SDR representation for each letter to predict each next letter?
So HTM currently uses an internal SDR representation, and is trained. Whereas this would be an external SDR representation, and dynamic, by synchronization.
If that clarifies the thinking, it may identify some tweaks we need to make. For dynamic clustering to be useful, the clustering has to depend on context. For words, dynamic clustering and training will come to the same thing, because words are fixed: the context of the letters in a word is always the same. So HTM training which captures the preceding letters of a word (if this is done) will be adequate, and the two methods achieve the same results. But the dynamic method needs to capture that context of the other letters in a word too. It may mean that, even at this stage, we need more nodes for each letter, so that the path through the preceding letters can be captured, and so that the network can distinguish that the particular letters at the ends of words are the ones with the greatest diversity of onward network connections.
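In the toy code, "more nodes for each letter" could be as simple as keying each node by the letter plus some of its preceding letters, so the path into the letter is part of the node's identity (the window length k is an arbitrary choice here):

```python
def add_phrase_edges_with_context(edge_counts, phrase, k=4):
    """Like add_phrase_edges, but each source node is the last k letters
    ending at that position, so the same letter in different contexts
    gets different nodes (and different onward edges)."""
    letters = [ch for ch in phrase.lower() if ch.isalpha()]
    nodes = ["".join(letters[max(0, i - k + 1): i + 1])
             for i in range(len(letters))]
    for node, nxt in zip(nodes, letters[1:]):
        edge_counts[(node, nxt)] += 1
```

Then the diversity measured at the node "lace" (the "e" reached through p-l-a-c) is the diversity of what follows the word "place" specifically, not of what follows every "e" in the corpus.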
If we do that, it may equate to what HTM is doing already. But it would place us in the position of having a framework to move forward and pool "phrases", which is something HTM currently cannot do, because phrases are not fixed like the letters in a word, and so training, as a way to represent pooling breaks, is inadequate.