The topic we chose for the next HLC call is “How to generate behavior with HTM?” Given that HLC is a learning circle, I’ve posted the question here so you can teach us. Thanks!
I guess a classifier would be the simplest example of behavior. I assume that all behavior requires learning. In the case of humans, some of that comes through the evolution of our species, but much of it is constructed through social interactions. From this perspective, “emergent behavior” is a property of the observer’s perspective, not of the system’s behavior.
Rather than “classifier” would it be more appropriate to use “decoder” to have symmetry with encoder?
Is there a working example of behavior generation in HTM.core or another HTM framework?
My initial thoughts are that we can use something like a decoder, as you mentioned, to translate minicolumn activity to some output, such as movement on a 2D grid. The decoding can probably be somewhat random with local similarity matching, just like an encoder. This would actually make intuitive sense when you think about how babies learn to use their motor functions: by randomly flailing about until they learn which motor outputs cause what limb sense feedback.
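To make the "somewhat random decoder" idea concrete, here is a minimal sketch in plain Python. It is not taken from htm.core or any other HTM framework; `make_random_decoder`, `decode` and the four-move grid are hypothetical names and choices made up for illustration. Each output bit is randomly tied to one move, and the active bits vote, so two SDRs that share most of their active bits also share most of their votes.

```python
import random

# Assumed action set: unit moves on a 2D grid (up, right, down, left).
MOVES = [(0, 1), (1, 0), (0, -1), (-1, 0)]

def make_random_decoder(n_bits, seed=0):
    """Randomly assign one move to every bit position (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.choice(MOVES) for _ in range(n_bits)]

def decode(active_bits, bit_to_move):
    """Let every active bit vote for its move and return the winner.

    Ties are broken by the order of MOVES, so the result is deterministic.
    """
    votes = {}
    for bit in active_bits:
        move = bit_to_move[bit]
        votes[move] = votes.get(move, 0) + 1
    return max(MOVES, key=lambda m: votes.get(m, 0))

if __name__ == "__main__":
    table = make_random_decoder(n_bits=2048, seed=42)
    sdr = set(random.Random(1).sample(range(2048), 40))
    print(decode(sdr, table))
```

Nothing is learned here. The mapping is fixed and arbitrary, which is the "random flailing" starting point; any learning would have to come from feedback on what the moves actually do.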
Then, of course, there’s the question of goal feedback, or reinforcement, or whatever the subcortical structures are doing down there.
I would be tempted to stay with a concept of SDR as the input to the decoder. Which is indirectly minicolumn activity. But if it is an SDR then maybe this needs hierarchy to get meaningful commands. The first macrocolumn dealing with sensory input is unlikely to be driving actuator output (I guess).
Maybe the code of the output SDR does not matter in the beginning. The decoder just learns to align a repeating SDR with a regular behavior. Then the sensors detect that behavior and that is how the particular SDR gets associated with a particular behavior.
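A rough sketch of that association step, assuming the behavior has already been detected by the sensors and given a label. `AssociativeDecoder` is a hypothetical class, not part of any HTM library, and simple co-occurrence counting stands in for whatever learning rule would really be used.

```python
from collections import defaultdict

class AssociativeDecoder:
    """Counts how often each SDR bit was active while a behavior was observed."""

    def __init__(self):
        # counts[bit][behavior] -> number of co-occurrences seen so far
        self.counts = defaultdict(lambda: defaultdict(int))

    def learn(self, active_bits, behavior):
        """Associate the currently active bits with the observed behavior."""
        for bit in active_bits:
            self.counts[bit][behavior] += 1

    def decode(self, active_bits):
        """Return the behavior with the most votes, or None if nothing matches."""
        votes = defaultdict(int)
        for bit in active_bits:
            for behavior, count in self.counts.get(bit, {}).items():
                votes[behavior] += count
        return max(votes, key=votes.get) if votes else None

if __name__ == "__main__":
    decoder = AssociativeDecoder()
    decoder.learn({1, 2, 3}, "left")
    decoder.learn({7, 8, 9}, "right")
    print(decoder.decode({1, 2, 9}))  # mostly overlaps the "left" SDR
```

The SDR code itself is arbitrary here, as suggested above: only the repeated pairing of an SDR with a sensed behavior gives it meaning.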
It seems behavior assumes a goal. So that is part of the problem.
The NEAT researcher suggested a goalless approach: novelty seeking. Well, novelty is a “goal” as such, but you get my drift. This was demonstrated with the NEAT algorithm controlling virtual robots finding their way through a maze.
Perhaps the same could be done for HTM? Not sure how that could be made to work. Perhaps for decoding mapped to behavior? Novelty decoding. Just a thought.
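One way this might look, purely as a thought experiment: score each candidate action by how little the SDR it is expected to produce overlaps with anything seen before, and pick the most novel one. This is a simplification and not the novelty measure used in the NEAT work; the function names, the archive, and the idea of having a predicted SDR per action are all assumptions.

```python
def overlap(a, b):
    """Number of active bits two SDRs (sets of indices) have in common."""
    return len(a & b)

def novelty(sdr, archive):
    """Active bits NOT explained by the best-matching SDR in the archive."""
    if not archive:
        return len(sdr)
    return len(sdr) - max(overlap(sdr, seen) for seen in archive)

def pick_most_novel(candidates, archive):
    """candidates maps an action name to the SDR that action is expected to cause."""
    return max(candidates, key=lambda name: novelty(candidates[name], archive))

if __name__ == "__main__":
    archive = [frozenset({1, 2, 3})]
    candidates = {
        "stay": frozenset({1, 2, 3}),   # already seen
        "move": frozenset({4, 5, 6}),   # nothing like it in the archive
    }
    print(pick_most_novel(candidates, archive))
```

Where the predicted SDRs would come from is left open; that is exactly the part I am not sure how to make work in HTM.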
I think we have to look at this behavior a little more abstractly. The mC is really a pattern detector.
You can feed it any pattern, literally make up a weird pattern, and if you feed it often enough, at some point it will start emitting a signal (over one or maybe a few bits) to say ‘Hey, you know that pattern you’ve been sending me? Well it’s active now.’
Would that output produce muscle twitching? In most cases probably not. But it will send this signal to another mC that will be receiving loads of signals itself and might perceive another pattern.
And of course each mC will learn a number of patterns that are not necessarily related. They just use the same inputs.
Two extra points about the mC is that the pattern has a temporal characteristic, and that the pattern is somehow converted from one domain to another. What those two domains are is not clear to me yet, but one of them is possibly tied to grid cells. Whether the conversion happens in both directions is another interesting question.
You really REALLY should learn how HTM works. One of the essential features of HTM is single shot learning - one presentation is enough to learn something.
As far as how the cortical column learns different things, units hooked to eyes learn eye things, units hooked to ears learn ear things, units hooked to different places on the skin learn skin things, so on and so on.
It is starting to become clear that you seem to think that one cortical column is a computing unit that does all this “program stuff” by itself. Think bigger. It is a single logic element like one gate in a big computer. The wiring diagram that is given by DNA is the program; it configures these gates into the complex computer that is the brain. If it helps, you can think of it like the plug-board wiring in an old-style analog computer.
I could go into more detail in how this wiring diagram combines all these gate-level units into a computer but that answer would be book-length, vastly exceeding the effort that I want to put into this post. Do trust me when I say that there are people that are spending their entire careers working to untangle the puzzle that is the brain’s wiring diagram - it’s the program. The answer to how the brain sorts out lavender from Jennifer Aniston lies there.
I have the niggling feeling david.pfx is trolling us. It looks like he is spewing GPT-3 gibberish back at our posts. Some of it makes sense I guess, but… I don’t know.
I honestly have trouble situating the SDR. I suspect there are SDRs at several levels. Is it wrong to think of a dendritic segment as an SDR for instance?
Matt once told me that the output SDR of an encoder should always have the same bit size and the same active bit count (n and w). So I guess a dendritic segment does not comply with that. On the other hand we shouldn’t be too strict about the constraints for the biological examples. It’s biology after all.
Yes. Here again, I am confused. How much of the hierarchy principles from the early papers still apply? I thought the hierarchy concept was somewhat pushed back, at least for now.
And I agree the MC is very important. But I think it makes sense to start with the smallest functional unit, especially since it is so redundant.
Also, I might need to correct my earlier post a bit. This mC functional unit does more than pattern recognition. It communicates with at least three or four different systems. It’s very likely that each of these connections performs a different role. Yet another question then is whether each of these roles can be studied independently.
I find it simplest to imagine the SDR being a consistent size (the input/output of an MC, the output of an encoder). This seems reasonable for a bio-inspired algorithm and unrealistic for biomimicry.
I am not sure; there has been a shift from HTM to 1k brains. Should be clarified in Jeff’s next book.
My suspicion is absolutely not. Whatever open system you want to study requires understanding it as part of a broader system. For example, to understand the brain will require understanding the social context. That still does not turn the system into a closed system but it can allow for more reliable modelling of the sub-system’s behavior.
I find it hard to imagine that HTM can make sense without linking all three H & T & M. Obviously that is not enough to understand the brain. It’s an open question (for me) if it is possible to get machine intelligence “out of it”.
I know how HTM works. I was responding to @Falco who said “if you feed it often enough”, which is the way you train an ANN, not the HTM model.
So do you subscribe to the idea that every CC starts out with exactly the same functional capability, and that all the specialisation is a result of the data inputs it receives?
Your next para suggests you think the specialisation is by means of fixed wiring driven during embryogenesis by DNA. Is that right? The CC is just a ‘logic gate’ and all the smarts are in the wiring from one CC to another? Sorry, but I’m not finding your position at all clear.
[BTW is your CC the same as the mC and MC in other posts, and if not is the distinction important?]
(1) EVERY dendrite is forming an SDR. All of them. If you have a cell with 8 dendrites you are feeding 8 (or more) SDRs into that cell. It is entirely possible that there is more than one SDR being learned in a single dendrite. Look at the basic diagram of a cell in a minicolumn. Those blue and green things are individual SDRs.
(2) It’s not that the H of HTM is not important, it’s just so far ahead of the scope of where modeling is at right now that there is no real point in trying to model it.
I am going to offer an example that I hope will fix the situation in your mind. Let’s say you were setting out to understand the wonder that is Shakespeare. Your technology is a bit limited, far too limited to take on an entire play. So you try to model “to be or not to be” as your first project. Your researchers are unable to discern if that is a logical statement but they see hints that this is what is going on. They publish a paper on the logic of Shakespeare and the world is abuzz with this new insight. After much work, the researchers see that this is going nowhere so they pull it back to a single group of letters - one word. They study “to” and “not” as intense projects, release a few papers on it. The relation between the letters and word unit is starting to make some sense but after a while, the work stops bringing any new insights so the next step is to look at combinations of words, not too many, maybe a few at a time. A few new insights emerge but we still really don’t know that much about plays. It is clear that they are on the right track but very far from delivering the goods. After the initial buzz the world loses interest in the project as it does not seem to be bringing new insight. The work continues in the original research group, adding a little bit to the puzzle with each new project. It is clear that letters are the one true path but more work is needed.
I see minicolumns as letters, macrocolumns as words, regions/maps as the sentence level, and the static snapshot of the entire brain as paragraphs. You have to add in the evolution over time and the subcortical structures to get the whole play.
As far as the confusion of what an SDR is or is not - the tiny models that are usually used in current research are so small that it is necessary to use a “fully connected” version to get them to work at all. Since every cell samples every input bit the idea of individual SDRs is shorted out as every cell sees every input bit. This is an artifact of the modeling and not the theory.
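A small sketch of the difference, with made-up names (`build_potential_pools`, `visible_sdr`) and sizes, not code from any HTM implementation. With a potential fraction of 1.0 every cell sees every input bit, so all cells see the same SDR; with a smaller fraction each cell samples its own subset and therefore sees its own SDR.

```python
import random

def build_potential_pools(n_cells, n_inputs, potential_pct, seed=0):
    """Give each cell a random subset of input bits it is allowed to sample."""
    rng = random.Random(seed)
    pool_size = max(1, int(n_inputs * potential_pct))
    return [frozenset(rng.sample(range(n_inputs), pool_size))
            for _ in range(n_cells)]

def visible_sdr(input_active, pool):
    """The part of the input SDR that one cell can actually see."""
    return input_active & pool

if __name__ == "__main__":
    active = frozenset(random.Random(7).sample(range(100), 10))
    for pct in (1.0, 0.5):
        pools = build_potential_pools(n_cells=4, n_inputs=100, potential_pct=pct)
        views = [sorted(visible_sdr(active, p)) for p in pools]
        print(pct, views)
```

In the fully connected case all four views are identical, which is the "shorted out" situation described above.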
The addition of lateral connections is about at the stage of grouping letters together. Hierarchy will add grouping words into sentences.
To clarify this, there’s a difference between the SDR and a specific SDR, I guess. “The SDR” might refer to all the bits, regardless of whether they’re 0 or 1, e.g. 2048 bits of which 2% are 1s. “A specific SDR” might refer to a specific combo of 0s and 1s.
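In code the distinction could look like the sketch below: the format is fixed by the total bit count and the sparsity, while a specific SDR is one particular choice of active bits. The constant names and the helper functions are mine, chosen for illustration.

```python
import random

N = 2048                 # total number of bits in "the SDR"
SPARSITY = 0.02          # fraction of bits that are 1
W = int(N * SPARSITY)    # number of active bits, rounded down

def random_sdr(rng):
    """One specific SDR, stored as the set of indices of its 1 bits."""
    return frozenset(rng.sample(range(N), W))

def to_dense(sdr):
    """Expand a specific SDR into the full list of 0s and 1s."""
    return [1 if i in sdr else 0 for i in range(N)]

if __name__ == "__main__":
    rng = random.Random(0)
    a, b = random_sdr(rng), random_sdr(rng)
    print(len(a), sum(to_dense(a)), len(a & b))
```

Two random specific SDRs of this format usually share very few active bits, which is part of what makes them useful as codes.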
Dendritic segments recognize somewhat specific SDRs, with roughly 40 synapses. Usually most of the synapses will be off, but when the dendritic segment is above threshold, it needs a large fraction of those synapses on, so those 40 bits by themselves aren’t a whole SDR.
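As a sketch, a segment can be treated as a small set of synapses onto presynaptic cells plus a threshold. The 40 synapses follow the figure above; the threshold of 15 and the function name `segment_active` are assumptions for illustration only.

```python
def segment_active(synapses, active_cells, threshold=15):
    """A segment fires when enough of its synapses land on active cells.

    synapses:     set of presynaptic cell indices this segment connects to
    active_cells: set of currently active cell indices
    threshold:    assumed value; real models tune this
    """
    return len(synapses & active_cells) >= threshold

if __name__ == "__main__":
    segment = set(range(40))                 # roughly 40 synapses
    strong_match = set(range(20)) | {500}    # 20 of the 40 synapses are on
    weak_match = set(range(10)) | {500}      # only 10 are on
    print(segment_active(segment, strong_match))
    print(segment_active(segment, weak_match))
```

The segment only ever looks at its own 40 inputs, so what it recognizes is a subsample of a full SDR rather than the whole thing.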
This is a good analogy. To extend it: without having a much broader understanding of human culture the concept of a play could not be understood. Then without actually going to the theatre and experiencing Shakespeare the meaning would still largely be lost. The idea that the intelligence is “inside” the skull is probably not a good model for understanding human intelligence. The brain is an open system - so to understand the brain requires understanding it as a sub-system of a much larger system. Brain science will need the expertise of social-psychology.
Bringing this back to the topic. Behavior is always meaningful within a practice: the goals are not emergent, they are learnt. So we will have to teach the goal, rather than expecting behavior to magically pop out of a neural network.
Mammals always require some level of social interaction - it’s kind of implied with the whole feeding mom’s milk during external development thing.
That said, while evolution has factored that in with the configuration of the critter, the brain is capable of running its body without any social interaction. The social part of the program is usually badly broken if this social thing is not learned during some critical period but setting that aside, the rest of the development process does not depend on the social aspect to be able to move, seek food and shelter, explore, mimic behavior, and much of what we expect from a functioning brain.
Learning speech dramatically expands the repertoire of mental manipulations available but that should be considered separately from the organic capabilities of the brain.
The type of brain structure HTM is leveraging is based on the neocortex; all the activities that animals can do without a neocortex (largely those you described) are of far less relevance for that approach. The neocortex is very much associated with the ability to learn, and that does not happen “inside the skull”; it happens in interaction with a much larger system. But here we are off into philosophical territory so probably not much luck of reaching any agreement. The key point, I hope you’ll see, is to identify whether you are dealing with a closed or open system. The sort of intelligence that motivates the research in HTM is associated with open systems.