Jeff Hawkins on Scale and Orientation in Cortical Columns - 30 November, 2020

Nothing new here. This is all familiar material: it’s called embryogenesis. It describes the biological equivalent of manufacturing and assembling chips and boards. I got that; now I’m looking for the software.

So describe to me the next bit: how does this group of cells recognise vertical lines, while that one does Jennifer Aniston, and over there is the smell of lavender and the colour hot pink?

WRT “Jennifer Aniston” cells (i.e. “grandmother cells”), here is my perspective on that.

I believe this is actually a similar problem to recognizing vertical lines. There is an intimate relationship between topology and semantics. Using hexagonal grids, one can, for example, take a series of randomly distributed SDRs with semantic overlap and transform them into a line with an orientation. I’ve been meaning to post some visuals on this, but I can describe this in a little more detail here if you are interested.
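To make the SDR side of this concrete, here is a toy sketch (my own illustration, not Numenta’s code, and the bit positions are invented): SDRs represented as sets of active bit indices, with semantic similarity measured by the overlap between them.

```python
# Toy illustration: SDRs as sets of active bit indices out of a large
# (e.g. 2048-bit) space; semantic similarity is the size of the overlap.
def overlap(sdr_a, sdr_b):
    """Number of active bits shared by two SDRs."""
    return len(set(sdr_a) & set(sdr_b))

# Three hypothetical SDRs, each with a few active bits (indices invented).
cat    = {3, 77, 410, 1593}
kitten = {3, 77, 410, 1800}   # shares three bits with "cat"
train  = {9, 512, 640, 1201}  # shares none

print(overlap(cat, kitten))  # 3 -> semantically close
print(overlap(cat, train))   # 0 -> unrelated
```

The transformation into a line with an orientation would then be a mapping that places high-overlap SDRs at nearby positions, but that part is only described in prose above.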

As for the qualia question (why the color pink doesn’t smell like lavender): while it is an interesting discussion, it is ultimately less important from my perspective than the fact that both end up as semantics encoded into SDRs by the time they are being processed in cortical columns.

How does “software” do all those things?

While I don’t think the hardware/software distinction is helpful for the brain, if you know any ML you’ll have the idea of a “learning algorithm” vs “trained model” distinction. In ML, the learning algorithm is things like SGD with BatchNorm. The trained model is a bunch of specific matrix multiplications etc., which can be analyzed using methods like those explored by the OpenAI Microscope project, for example. I think it’s not a bad start for the neocortex specifically to say that the genes set up the learning algorithm, and then the trained model is built over your lifetime by the learning algorithm, which involves Hebbian learning and other stuff.
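A minimal sketch of that split, using a perceptron rather than anything brain-like: the update rule below is fixed in advance (the “learning algorithm”), while the weights it produces are the “trained model”, just numbers with no rule in them.

```python
# Minimal sketch of the "learning algorithm vs trained model" split.
# The update rule is fixed up front (loosely analogous to what genes
# specify); the weights are the "trained model" built from experience.
def train_perceptron(data, epochs=20, lr=0.1):
    """Fixed learning algorithm: the classic perceptron update rule."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in data:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - pred
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b  # the "trained model": just numbers, no rule

# Learn logical OR from examples.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w, b = train_perceptron(data)
print([1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
       for (x1, x2), _ in data])  # [0, 1, 1, 1]
```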


No, that’s not quite it. Without trying to box anyone in, here are some choices. (I’m going to assume that ‘cell’ is an abstract reference to some place we can monitor, not necessarily just a single neurone. By training I mean adjustment of synaptic weights, or something similar.)

  • The cell was programmed in the DNA to respond to the smell of lavender and nothing else. There is a lines gene, lavender gene, hot pink gene and Jennifer Aniston gene…
  • The cell was programmed in the DNA to respond to smells, and was trained by data to respond only to lavender. There is a lines gene, smell gene, colour gene and faces gene.
  • The cell DNA is the same for each case, the cells start out generic. It was trained by data alone to recognise lines, lavender, hot pink or a particular face.

Do any of these reflect your views or known science?

Generic neuronal processing units and a ‘library’ of hundreds or thousands of little programs (form unknown) that control the generation of output SDRs in response to various inputs. The lavender cell is not running the same program as the hot pink cell.

I know ML. Remind me how much labelled training data it takes to make that work?

Close. The sensors in the sense organs make spike trains. The CCs all respond to spike trains - a universal data format for neurology. The semantic meaning is given by what is hooked to the sensors, or where it is in the processing chain.

First I wrote that it’s useful to think of a “learning algorithm” vs “trained model” distinction, at least in the context of the neocortex. Then you wrote this reply, which I guess is a snarky way to say “ML has low sample efficiency, therefore we shouldn’t use concepts or intuitions from ML when talking about the brain”. You can correct me if I’m misunderstanding you.

I agree with you that today’s popular ML models typically have low sample efficiency compared to brain-like algorithms. In fact I wrote a whole blog post exploring why that is:

But I don’t think that’s a strike against “ML” in general, because “ML” is a very, very broad category, way beyond “deep neural nets trained by SGD”. For example, Numenta’s HTM algorithm is ML, because it involves a machine that learns. If someone someday creates a brain-like Artificial General Intelligence, that will have to be ML too, by definition, because again, it will involve a machine that can learn new things. And being brain-like, it will have sample efficiency similar to humans.

I think a “learning algorithm vs trained model” distinction is a generally useful concept when thinking about a large swathe of ML systems (though not all of them), including ConvNets and Transformers, and also including the neocortex and certain other parts of the brain.

You’re welcome to disagree with that—I’m very open to the possibility that this concept is somehow misleading. But I don’t see how the sample efficiency of ConvNets and whatnot is evidence for that. It seems entirely unrelated to me. Sorry if I’m missing something.

1 Like

I understand that you see each cell having hundreds or thousands of software programs in it (presumably from birth) and choosing which ones to run (based on the inputs presumably?). Of course a given neuron really has only one output (plus some internal states), so presumably the software is all about deciding when and how often to fire. Which makes me wonder how many programs would be needed to make such a relatively limited decision…

Personally, I think most of us here believe neurons are much simpler than that (though clearly not as simple as ANN Point Neurons). You need large populations of different types of neurons, each running simple local rules, for sophistication to be manifested. I recommend looking up Cellular Automata if you are not already familiar with the concept of emergent complexity from simple local rules.
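For readers new to the idea, here is a minimal one-dimensional cellular automaton (Rule 110, later proved computationally universal) showing complex global behaviour emerging from a three-cell local rule. It is an illustration of the concept only, not a model of any neural circuit.

```python
# 1-D cellular automaton: each cell's next state depends only on itself
# and its two neighbours. Rule 110's bits encode the update table.
RULE = 110

def step(cells):
    """Apply the local rule to every cell (fixed boundary of zeros)."""
    padded = [0] + cells + [0]
    return [
        (RULE >> (padded[i - 1] * 4 + padded[i] * 2 + padded[i + 1])) & 1
        for i in range(1, len(padded) - 1)
    ]

row = [0] * 30 + [1]  # start from a single live cell on the right
for _ in range(15):
    print("".join("#" if c else "." for c in row))
    row = step(row)
```

Running it prints a growing, irregular triangular pattern: global structure that is nowhere stated in the three-cell rule itself.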

I recommend doing some research on unsupervised learning. And of course, for humans, we also do a lot of labelled training through the use of language.

1 Like

I’ll just mention again that this is all triggered by the existence of a regular repeating structure in the neocortex, and the discovery of a data representation in the SDR. If the regularity suggests a ‘neural processing unit’ of perhaps a few hundred neurones and SDR is the data then the missing piece is software. After that, it’s all speculation.

Neurons are complex (for cells) but simple compared to NPUs. It’s the NPUs that need to execute the program, not each individual neuron. The program is all about deciding which SDR to emit. That could be quite a complex task.

I’ve followed Cellular Automata since the days of John Horton Conway and Scientific American. I’ve programmed a few. And I recall Wolfram found one that is a universal computer. Not sure it helps.

Neurons do a helluva lot of work with mRNA, both transcribing and non-transcribing. That’s where I would be looking for the software, either the mRNA itself or the generated peptides, or both.

It seems you dismiss the possibility that the basic premise of CA might be applied to biological systems, and in particular neural structures in the brain. Anyway, I don’t mean to discourage you, but like I said, you’ll probably need to do some research to further refine your idea before you’ll have more luck convincing others to invest time in it.

1 Like

To give you some encouragement, I think we are in agreement that the cortical column (or NPU as you call it) is performing some functions on its inputs and producing outputs. Most of us here believe it is the structure of the CC which will reveal those functions, but ultimately once the functions are understood, we would simulate them in software and optimize them. I think your angle is looking at it from the software perspective first, but maybe you end up in the same place at the end.

1 Like

Not at all. I kind of like the idea of a CA machine, because different starting patterns produce different behaviour. If the NPU is a CA machine, then the software is simply a range of starting patterns. Evolution is well-suited for finding those.

I say NPU only to avoid being too precise about the anatomy, but yes, CCs are the best candidate.

No argument here. I just don’t think the structure will tell all, and in particular it won’t tell you why similar CCs do such different things. Time will tell, presumably.

Good lord, how can someone on a “HIERARCHICAL TEMPORAL Memory” forum think that geometry, the time delays created by geometry, the pathways of the neuromodulators, or any LOCAL rule’s effect anywhere has nothing to do with the geometric constraints of brains? If neuroanatomy is not important, you shouldn’t use the L2/3, L4, L5, L6 terms or the granular/agranular terms ANYWHERE. Those are the most basic pieces of cytoarchitectural information available, and still architecture. And I wonder how you can talk about anything in this forum if you restrain yourself with the idea that “people without my background think that neuroanatomy and connections will get us there, but they’re dead wrong”.

1 Like

Although your post is a bit incendiary, I have to agree (yeah, so what?).

For a justification of this, look to the recently cited paper by S. Murray Sherman and W. Martin Usrey, Cortical control of behavior and attention from an evolutionary perspective. What they do is summarize the current state of the art of our knowledge of how the cortex is involved in what is essentially a very complex, preprogrammed robot equipped with an array of central pattern generators (CPGs) that control it. Start with decerebrate cats, which walk and run without the benefit (or interference!) of their cortex. What we learn from that paper is that the cortex is not necessary for pretty much all of the behaviors of a mammal. The cortex ‘dips down into’ the control systems (somatosensory and motor) of the animal and serves as an executive controller that, in the words of Sherman and Usrey:

For animals to survive, they must interact with their environment, taking in sensory information and making appropriate motor responses. Early on during vertebrate evolution, this was accomplished with neural circuits located mostly within the spinal cord and brainstem. As the cerebral cortex evolved, it provided additional and powerful advantages for assessing environmental cues and guiding appropriate responses. Importantly, the cerebral cortex was added onto an already functional nervous system.

When we get to Homo sapiens and Homo troglodytes*, the cortex is not only large, but also has command of one of the most complex of the robot structures possible (arms and legs). All of this combines to produce a very sophisticated animal. The cortex, then, is used for learned behaviors, and that is where HTM comes in. But the most important behavior is reserved for humans, and that is language. What language does is allow humans to produce and repeat near-infinite sequences of action that are mediated by language. This is effected by a learning process that every human undergoes.

Something else happens, but not until basic language skills are acquired. That, and this is just my theory now, is a self-organizing operating system that we call consciousness (C). C allows us to introspect; it allows us to navigate a virtual 3D map in our minds, both in the first person and as an observer. It also establishes our sense of time and includes the ability to look ahead and back into the past. This is all done in the software of the human brain, not the hardware. The architecture of that hardware is still being determined, and how it allows for this OS to self-organize is yet to be determined.

Think of everything a chimp can do as firmware with the exception of tool use. I will proffer that if a chimp could be genetically modified to speak it would develop C, but only to the level of a five-year old human and that level is just speculation on my part.

*I am siding with Morris Goodman on this.


I agree with all of that, except some aspects of ‘C’. There is a clear progression in evolutionary terms of complexity and richness of behaviour associated with brain size. You don’t need ‘C’ to explain the adaptive and problem-solving and planning behaviour of a cat or a raven. Language is the supercharger, but the neocortex or the avian equivalent is the engine of intelligence. A firm understanding of cortical function might not help with understanding sensation or motor control, but it sure would help a self-driving car or work-horse robot at the higher levels.

But the ‘fire’ I was trying to light is: what about the software? SDR and HTM are novel and plausible, but at most they account for a few percent of what the cortex can do, even without language. I don’t buy the idea that the cortical column is created as a blank slate and learns everything from scratch in every organism. I would expect an evolutionary advantage in passing on a selection of successful algorithms to offspring, and if it’s not ‘wired in’, where is it?

To me the obvious links to explore would be junk DNA, RNA and neuronal synthesis etc but I don’t follow the field so that could be totally wrong.

Having given this a deal of thought over the past few days, I think this may be the single most important observation I have seen in a good while.

Assume a complete and functioning robotic ‘horse’ with the cortex sitting on top as a ‘jockey’. The rider has no idea what’s going on down below (which accords with our introspection) but gets full access to a (slightly delayed) sensory input and all the levers of output (within certain constraints).

You can breathe when you like, but you can’t stop breathing. You can choose what to listen to, but you can’t choose silence.

Where this leads in IT terms is a search not just for software but also for an API, a communications channel between jockey and horse. Stitch into that channel and the possibilities are endless.
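Purely as an illustration of that API idea (every name here is invented), a sketch of a control channel where the ‘jockey’ can bias behaviour but certain built-in loops cannot be overridden:

```python
# Illustrative sketch only: the "horse" runs its own control loops and
# exposes levers with hard constraints, matching the examples above
# (you can hold your breath, but you can't stop breathing).
class Horse:
    """The preprogrammed robot: its CPG loops run regardless."""

    def command(self, action):
        if action == "stop_breathing":
            return "denied"        # hard constraint; no lever exists
        if action == "hold_breath":
            return "ok (briefly)"  # a delay; the loop resumes on its own
        return "ok"                # ordinary motor levers are available

horse = Horse()
print(horse.command("turn_left"))       # ok
print(horse.command("hold_breath"))     # ok (briefly)
print(horse.command("stop_breathing"))  # denied
```

The design point is that the interface exposes levers, not the machinery behind them, which matches the introspective opacity described above.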

1 Like

Welcome to what I have been calling the dumb boss/smart advisor model.
It’s nice to see someone else that sees how that model could work.

1 Like

I’ve read a lot of these posts over the years and I now have a lot of difficulty seeing how any other model is possible!

1 Like

You might want to take a look at some of work Bill Calvin has done.