Emotional Logic in Behaviour Based Learning

Just some thoughts, I am not sure if this helps but it sure was fun to think about…

The emotional logic of behaviour based learning.

I have been observing complex human behaviours in a learning environment for many years now, and I believe I have a theory of how emotional logic gates are developed and how they function to initiate behaviour in an endless cycle to make meaning of the world. I would posit that this is something all living organisms do in order to make meaning of the world, and that what we are looking at here could be the meaning, or purpose, of life.

I would first like to thank Jeff Hawkins and his team at Numenta for starting me on this avenue of exploration some 9 years ago. Even if this theory is wildly inaccurate or wrong, it was enormously fun to pursue as a thought experiment, and perhaps it might help inform the thinking of others or provoke a new avenue of exploration which might bear fruit. It was Jeff’s idea that the neocortex could be processing disparate data sets of increasing complexity using the same algorithm in a hierarchical temporal framework which led me to start examining the problem of how we learn. I believe that my position as a teacher allowed me to see patterns in how people learn, or don’t learn, that gave rise to the theory outlined below.

Before I proceed, it should be understood that this is an evolving theory, that it is informed only by my limited understanding of logic circuits and gates, and that I have little to no programming experience. I hope someone who does understand these aspects can help me to understand how the circuitry is wired and whether or not there are holes in my logic. It should also be noted that I have not identified the mechanism by which the logic gates and circuits are updated to deal with increasingly complex behaviours and returns, but I am certain that these algorithms exist. As Jeff would say, these are merely plumbing issues, which hopefully someone with a far greater understanding of the mechanics of these things will be able to figure out.

Some background

Some time ago I was struck by a thought which stuck in my head: what if there was a biological imperative to learn, and if there was, what would it look like? At the same time I was being baffled by the complex behaviour of my students and how it seemed to either interfere with or accelerate their learning, despite my best attempts to control it, which usually met with little to no success. It inspired me to dig a little deeper into psychology and memory formation, and eventually led me back to Jeff and his team at Numenta. I found that much had changed in the field since my first exposure, and the ability of the Numenta team not only to explain the workings of HTM but also to unravel all the complexities involved in learning from patterning was invaluable in informing my thinking about the problem. However, one thing that Jeff said many years ago stuck with me and was bothering me. Jeff said that true intelligence is the ability to predict, that prediction in a repeated cycle can be used to make subsequent predictions of increasing complexity, and that all of this was happening in the neocortex. Furthermore (and please excuse me, Jeff, if I am misquoting you; I may just have got the wrong idea), the behaviour of animals was not particularly predictive and so was not all that interesting, a view that I was inclined to agree with given the kind of teenage animal behaviour I regularly experience. I then thought about the idea that behaviour is not predictive, and I wondered how, if that were true, an alligator ever actually catches prey. If an alligator is merely reacting to its environment, how is it moving its body in a fashion that gets it to the right place at the right time to catch a moving animal which presumably is also exhibiting complex and novel behaviour? In other words, does an alligator eat by accident?
This bothered me quite a bit, because there is a certain level of prediction involved in identifying prey or threat and in working out where in space and time that prey or threat will be in the future if I want to either eat it or run away. If all prediction happens in the neocortex, then how could an alligator do this without one? All of this led me to one inescapable conclusion, namely that behaviour must drive learning and that learning must result in increasingly complex behaviours intended to make more meaning out of the information that returns after behaviour has been initiated. Using a limited but, I think, fairly good understanding of how an SDR (Sparse Distributed Representation) can be used to make semantic meaning of any kind of information, I then started thinking about the elegance of this idea and how meaning can be very efficiently represented almost more by what isn’t there than by what is there, i.e. three on bits arranged in a novel pattern within a 256 bit array. It struck me that this kind of power saving and elegant simplicity is the kind of thing I see in natural biological systems all the time. I was also becoming acutely aware of the fact that other living organisms, including plants, have been shown to exhibit complex behaviours and that there is increasing evidence to suggest that even plants do some sort of learning on some level. The interesting idea that arises from this is that, much as Jeff suggested that the neocortex was doing the same kind of algorithmic operations on different data types, the same learning circuitry could be repeated over and over again in many different types of organisms. Coupling these ideas with another firmly held belief of mine, that nature usually finds the simplest and most elegant solution to a problem, I believed that I might be able to unravel this ball of yarn.
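As a rough illustration of the SDR idea mentioned above (three on bits in a 256 bit array), here is a minimal Python sketch. It stores an SDR as just the set of its on-bit positions and compares two SDRs by counting shared bits. The names `random_sdr`, `overlap` and `novelty`, and the particular novelty formula, are assumptions made for this example and are not taken from Numenta's code.

```python
import random

SDR_SIZE = 256  # bits in the array, as in the example above
ON_BITS = 3     # number of on bits, as in the example above

def random_sdr(rng, size=SDR_SIZE, on_bits=ON_BITS):
    """Return an SDR stored sparsely, as the set of its on-bit positions."""
    return frozenset(rng.sample(range(size), on_bits))

def overlap(a, b):
    """Count the on bits two SDRs share; more shared bits means more similar."""
    return len(a & b)

def novelty(current, previous):
    """Hypothetical novelty score: fraction of current on bits absent from previous."""
    if not current:
        return 0.0
    return 1.0 - overlap(current, previous) / len(current)

rng = random.Random(0)
a = random_sdr(rng)
# Build b by keeping two of a's on bits and adding one position a does not use.
unused = next(i for i in range(SDR_SIZE) if i not in a)
b = frozenset(list(a)[:2] + [unused])

print(sorted(a), sorted(b))
print("overlap:", overlap(a, b), "novelty of b given a:", novelty(b, a))
```

Only the three on positions are stored, so the 253 off bits cost nothing, which is one way to read "represented more by what isn't there than by what is".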
Since I am not a neurobiologist or programmer, and given that I have no laboratory, I decided that the only kind of experiment I would likely be able to conduct was a thought experiment, the results of which follow.

If behaviour is involved in learning and behaviours are controlled by emotions, then I think it is fairly obvious that these two things must be controlling the circuitry of learning. I also believe in the power of self-reinforcing feedback loops, and so, armed with these two pieces of understanding, I set out to try to map out how a learning circuit might work based on emotional control of behaviour. Since the neocortex seemed far too advanced for me to comprehend, I started from the other end and asked the question: how could a baby learn about the world around it by making increasingly complex behaviours and predictions, and what would that look like? Fortunately I have a couple of kids, and so I was present when much of this primitive learning was being done. At this point, if you are already ahead of me, please scroll through to the diagram embedded below for an admittedly poor attempt at algorithmic pseudocode which might or might not be accurate. I am sure that if it doesn’t make sense you will be able to poke all kinds of holes in it and need not waste any more time trying to unravel how I came up with it.

I decided to try to envision what it would be like to be a newborn baby with no concept of the world and no idea of self or environment. I tried to imagine the kinds of hard wired circuitry that already exist and whether or not any kind of memory might have formed already that could be used to control logic gates. I also assumed that there would necessarily be a need for an innate self preservation mechanism in order to protect the body and mind from harm, which led me to the idea of pain as a stimulus. Pain is a good starting point because it provides a simple binary output, and that was simple enough for me to wrap my head around. If my logic proves accurate, then a pain signal should initiate some action to stop this signal from being produced. I imagined that in a pleasant world the baby’s first experience of pain in a low noise environment would probably come in the form of a hunger pain. From there it follows that the pain signal initiates a behaviour, i.e. crying, that is intended to elicit a response from the environment. If you have kids you will know this is not an easy behaviour to ignore. Keep in mind that the baby has no concept of the environment and that, for all intents and purposes, the baby’s body and the wider environment at this point are basically the same thing. The environment provides a response to the behaviour, and a memory of the state of the senses and the pain signal, represented by an SDR or similar structure, is formed. In the next iteration of this cycle another memory is formed and compared to the last, particularly in relation to the pain distribution, and if no novel information is present then the cycle continues until mom feeds the baby.
As soon as mom feeds the baby, novel information is captured in the newly created SDR, and the comparison to the last SDR shows increasing novelty (all the smells, sounds, tastes and tactile sensory data from feeding) and, thankfully, a decrease in the state of pain, which triggers the shutdown of this pain loop and activates the pleasure loop. The pleasure loop is also designed to initiate a behaviour, but this time it is not a hard wired one (or maybe it is; this could be developed further later) but a behaviour other than crying. The purpose of this loop is not to avoid pain but rather just to return novel information from the environment in response to this new behaviour. An example of this sort of thing is the little noises and flexing of fingers and toes that very young babies do. All of these behaviours elicit novel returns from the environment, and the current SDR is compared with the last to return a novelty score, which also includes some information on the state of pleasure. Eventually patterns emerge, and the current SDR when compared with the last will indicate decreased novelty, which in turn triggers the formation of a new memory to be stored in short term memory as a pattern associated with a state of pleasure. The fact that the behaviour is no longer returning novel results should also signal an end to the loop, and this puts our baby back into a passive state in which no pain or pleasure signals are on. We could call this state neutral. The neutral loop, when triggered, is a passive state in which behaviour is not initiated but active listening to the environment is occurring. In this state no behaviour will be initiated unless the SDR comparison of the sensory data coming in moment to moment either returns a match to the previously formed pain SDR or shows that the new SDR is highly novel in comparison to the last.
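One way to read the pain, pleasure and neutral loops described above is as a small state machine. The Python sketch below is only that, a sketch: the threshold value, the behaviour labels (`"cry"`, `"explore"`, `"listen"`), and the choice that high novelty in the neutral state starts the pleasure loop are all assumptions, since the text does not pin them down. It also simplifies the exit from the pain loop to "the pain signal is off" and does not model storing the pleasure pattern in short term memory.

```python
from enum import Enum

class Loop(Enum):
    PAIN = "pain"
    PLEASURE = "pleasure"
    NEUTRAL = "neutral"

NOVELTY_THRESHOLD = 0.5  # assumed cut-off between "novel" and "familiar"

def novelty(current, previous):
    """Fraction of the current on bits that were not in the previous SDR."""
    if not current:
        return 0.0
    return 1.0 - len(current & previous) / len(current)

def step(loop, pain_on, current, previous, pain_memory):
    """Return (next_loop, behaviour) for one pass through the cycle."""
    novel = novelty(current, previous) >= NOVELTY_THRESHOLD
    # A sensory pattern close to the stored pain SDR counts as a match.
    pain_match = pain_memory is not None and bool(current) and (
        novelty(current, pain_memory) < NOVELTY_THRESHOLD)

    if loop is Loop.PAIN:
        # Keep crying until the pain signal switches off (the feeding).
        return (Loop.PAIN, "cry") if pain_on else (Loop.PLEASURE, "explore")
    if loop is Loop.PLEASURE:
        if pain_on:
            return Loop.PAIN, "cry"
        # Explore while returns are novel; once familiar, go passive.
        return (Loop.PLEASURE, "explore") if novel else (Loop.NEUTRAL, "listen")
    # Neutral: no behaviour unless pain, a pain-memory match, or high novelty.
    if pain_on or pain_match:
        return Loop.PAIN, "cry"
    return (Loop.PLEASURE, "explore") if novel else (Loop.NEUTRAL, "listen")

def run(timeline):
    """Feed (pain_on, sdr) pairs through the cycle; return the loops visited."""
    loop, previous, pain_memory, visited = Loop.NEUTRAL, frozenset(), None, []
    for pain_on, current in timeline:
        loop, _behaviour = step(loop, pain_on, current, previous, pain_memory)
        if pain_on:
            pain_memory = current  # remember what the senses looked like in pain
        visited.append(loop.value)
        previous = current
    return visited

# Made-up SDRs, each written as a set of on-bit positions.
hungry = frozenset({1, 2, 3})
feeding = frozenset({10, 11, 12})
wiggle_a = frozenset({20, 21, 22})
wiggle_b = frozenset({20, 21, 23})
timeline = [(True, hungry), (True, hungry), (False, feeding),
            (False, wiggle_a), (False, wiggle_b), (False, wiggle_b)]
print(run(timeline))
```

In this toy timeline the baby cries while hungry, switches to exploring once fed, and settles into the neutral state when wiggling stops returning anything new.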
Of course these loops can be nested within one another and call one another in a cyclical fashion to accelerate the learning of more and more complex patterns of sensory data and, of course, of primitive emotional state. I think what happens over time is that these primitive emotional state representations evolve into four basic representations of the primary emotions. Fear, Anger, Sadness and Happiness are then all used as logic gates to inhibit or initiate behaviours intended to elicit increasingly complex returns from the environment. I can also see the potential for these gates to put our bodies into various states of sleep to upgrade or update the logic circuits and move short term memories to long term storage. Essentially put, your behaviour is merely an involuntary expression of the routines and subroutines of your innate drive to make meaning out of the world. It occurs to me that perhaps the representation of pain evolves into a concept of displeasure over time. We need to have a pain circuit which remains in a kind of primitive state, where it is either on or off, in order to protect the organism, but at the same time we could form a semantic representation of displeasure from our exposure to pain signals over time, which would help us to initiate and continue behaviours which might have a component of pain in them. This would be very important in engaging in activities or learning which have some degree of unpleasantness. From three representations of pleasantness, unpleasantness and neutrality towards something, we could then make representations of differing densities of these three components, which could evolve into increasingly complex emotions. The emotional representation would then be used as a logic gate to initiate behaviours and analyze the returns. I think this is probably how a sense of empathy is formed as well.
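To make the "differing densities of three components" idea a little more concrete, here is a very small Python sketch. Everything in it is a hypothetical illustration: the `EmotionMix` class, the `gate` function, the `tolerance` parameter and the example numbers are invented for this example and are not part of the theory as stated above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EmotionMix:
    """Hypothetical emotional representation: relative weight of three components."""
    pleasant: float
    unpleasant: float
    neutral: float

    def normalised(self):
        """Scale the three components so that they sum to 1 (their 'densities')."""
        total = self.pleasant + self.unpleasant + self.neutral
        if total <= 0:
            raise ValueError("at least one component must be positive")
        return EmotionMix(self.pleasant / total,
                          self.unpleasant / total,
                          self.neutral / total)

def gate(mix, tolerance=0.0):
    """Assumed gating rule: initiate (True) unless unpleasantness outweighs
    pleasantness by more than `tolerance`; otherwise inhibit (False)."""
    m = mix.normalised()
    return m.pleasant + tolerance >= m.unpleasant

homework = EmotionMix(pleasant=2, unpleasant=1, neutral=1)   # mildly unpleasant, still worth it
hot_stove = EmotionMix(pleasant=0, unpleasant=3, neutral=1)  # mostly pain

print(gate(homework), gate(hot_stove))
```

A larger `tolerance` stands in for the learned willingness to continue a behaviour that has some unpleasantness in it, while the primitive on/off pain circuit would sit outside this gate entirely.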
I am still thinking about all of this and am sure I will come up with other ideas, but for now I will just put the idea out there and see if it resonates. If it does, please get back to me and let me know if I’m making any sense. Humbly, Jake Kurtz.

The alligator cortex has 3 layers. While that differs from the human cortex, it would imply that only 3 layers are needed for basic predictive (temporal) evaluation and learning.

The alligator (and other animals) may lack the ability of association (non temporal), and the additional layers may be the key factor that makes us who we are. We can associate in the learning and predictive process, and this is what the other layers are needed for.

Alligator (3 layers): if it moves, it’s food

Just so. The issue with AGI is that so much of animal behaviour is inherited in the DNA, which is a very slow process. AGI is about learning from sensory input, building a model of reality, setting goals, choosing actions. It would seem the cortex provides much of that capability.