To Pay Attention, the Brain Uses Filters, Not a Spotlight

If you view a ReLU artificial neural network as a system of switched linear projections, it can do quite good filtering and selection. However, it is less than perfect. There are limits due to the central limit theorem (CLT). For example, if you average n simple independent noise terms, the noise standard deviation is only reduced by a factor of the square root of n.
How might you get better noise reduction/error correction/filtering?
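
To make the square-root limit concrete, here is a minimal sketch that averages n independent unit-variance Gaussian noise terms and measures how much noise is left. The function name, trial count, and seed are illustrative assumptions, not anything from a particular network.

```python
import random
import statistics


def noise_std_after_averaging(n, trials=2000, seed=0):
    """Empirical standard deviation of the mean of n unit-variance noise terms."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same numbers
    means = [
        sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        for _ in range(trials)
    ]
    return statistics.pstdev(means)


# Compare the measured residual noise with the 1/sqrt(n) prediction.
for n in (1, 4, 16, 64):
    measured = noise_std_after_averaging(n)
    print(n, round(measured, 3), "predicted about", round(n ** -0.5, 3))
```

Quadrupling n only halves the residual noise, which is why simply widening the average is an expensive way to filter.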

Is there any reason to think that the brain uses anything like a ReLU network?

HTM is loosely based on the cortical column, the basic building block of the brain.

I asked you in a different thread if what you are mentioning about ReLU was applicable to HTM and it sounded like you said that it did not apply. If that is the case then jumping past that and asserting that the entire brain is organized around ReLU principles sounds a bit off.


What the article describes looks more like a system of addressable gates, with a feedback means to “bid” which ones to open and a very strict bandwidth which limits the amount of sensory input passing towards the cortex.

I was being snarky about ReLU networks by pointing out that the best they can possibly do in terms of filtering would seem to be related to the square root of the width of the net. Presumably the biological brain does a lot better.
Anyway that was a very nice article, very readable.
I did code some things in the 1000s of synapses paper but don’t ask me too much about HTM. Or you can but you won’t get much of a coherent answer. However you know that already. If you are underscoring a point that you know more about HTM than I do, you are absolutely correct, by a ratio that approaches infinity. Is that satisfying?

Actually no.
I have many questions about extending the theoretical underpinnings of HTM and am always looking to related fields to see if there are some deep and non-obvious connections that could bring enlightenment.

I stopped actively exploring deep learning several years ago as it became clear that it is a sterile branch of neuroscience modeling. It emulates the layered nature of the brain and draws from the power that brings BUT it does not emulate the biology closely enough to gain the power of online learning.

Deep networks are powerful as this feature of the brain brings a degree of micro-parsing that is not readily available in any other programming paradigm. There are many ad hoc ways of expressing the basic layers of the underlying model, each with its idiosyncratic attempts to program knowledge into the model. There are various twists in how the model runs - but they are all expressions of the same basic underlying idea.

So I don’t pay much attention to these details as I see all of them as rearranging the deck chairs on the Titanic - it’s still a dead end.

If you do see some online learning method or coding scheme with possible connections, no matter how distant, I welcome your comments with open arms. This is why I ask - I am always hoping that you are offering something relevant to the HTM forum as you are posting here. Many forum users are actively working with deep networks so your non-HTM posts are welcome anyway.


Anyway, the primary activity of the scientific establishment is to establish and maintain a pecking order. Actual science takes a back-row seat. That is very human and also tied in with people’s salaries as well as status.
That can have negative outcomes, such as when the basics of a topic must be addressed with high school math but no one dares do so. Especially when the topic has been worked on with higher level math for a considerable time. :nauseated_face:

There are plenty of other potentialities, both known but not yet investigated and unknown. Something like a Raspberry Pi 4 might be used to construct a human level intelligence if you could discover the right algorithms.
One time I constructed an if-except if tree algorithm. If the current terminal letter is x then the next letter is y except if the prior letter is u then the next letter is v except if the prior prior letter … (x,y,u,v being variables!)
And that really worked quite well and could correctly finish the spelling of words and produce (fairly weird) sentences and paragraphs.
What escaped me at the time, and I just thought of, is that this should allow powerful text compression. You only need to store when and where a prediction error is made, then go and fix that as you sequence out the text. Then you have quite a strong compression scheme.
There are all these potentialities out there. They are like needles in a haystack though. So good luck and all that.
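
For anyone who wants to play with the if-except-if idea, here is a rough sketch of how it could look. It is not the original code, which is not shown in this thread; the class name `ExceptIfPredictor`, the `max_order` limit, and the rule table keyed by context suffix are all assumptions made for illustration. A wrong prediction adds an exception rule that looks one letter further back, and the compression step stores only the positions where the trained predictor gets the next letter wrong.

```python
class ExceptIfPredictor:
    """Next-letter predictor: a short-context rule, overridden by longer-context exceptions."""

    def __init__(self, max_order=6):
        self.max_order = max_order  # longest context an exception may look back over
        self.rules = {}             # context suffix -> predicted next letter

    def _match(self, prefix):
        """Return (order, prediction) for the longest context that has a rule."""
        for k in range(min(self.max_order, len(prefix)), 0, -1):
            context = prefix[-k:]
            if context in self.rules:
                return k, self.rules[context]
        return 0, None

    def predict(self, prefix):
        return self._match(prefix)[1]

    def train(self, text, passes=3):
        for _ in range(passes):
            for i in range(1, len(text)):
                prefix = text[max(0, i - self.max_order):i]
                k, guess = self._match(prefix)
                if guess != text[i] and k < len(prefix):
                    # Wrong guess: add an "except if" rule one letter deeper.
                    self.rules[prefix[-(k + 1):]] = text[i]


def encode(model, text):
    """Keep only the letters the model fails to predict, keyed by position."""
    corrections = {}
    for i, ch in enumerate(text):
        if model.predict(text[max(0, i - model.max_order):i]) != ch:
            corrections[i] = ch
    return len(text), corrections


def decode(model, length, corrections):
    """Replay the predictions, patching in the stored corrections."""
    out = []
    for i in range(length):
        if i in corrections:
            out.append(corrections[i])
        else:
            out.append(model.predict("".join(out[-model.max_order:])))
    return "".join(out)


sample = "the cat sat on the mat. the cat sat on the hat. " * 3
model = ExceptIfPredictor()
model.train(sample)
length, corrections = encode(model, sample)
print(len(corrections), "corrections for", length, "letters")
print(decode(model, length, corrections) == sample)
```

The decoder needs the same rule table as the encoder, so any real saving depends on the rules being shared or cheap to store. This sketch does not measure that.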

Other than a brief dalliance with Dr. Otto H. Schmitt at the UofM I did not have much to do with college, and any science I do is strictly a hobby and is in no way associated with any organized research organization. I have zero status and zero salary.

I am in this to work out the puzzle of intelligence; hopefully to the point where I can create it. Other applications are interesting but a sideshow to this personal pursuit.

Someone has to work on these things as they are useful in their own way - but I won’t be that person. If these things are your path to happiness - full speed ahead!

I think I am perhaps further down this path than most, but until someone works out the full end-2-end solution nobody really knows how all this stuff really works or what you have to do to make it happen; and anyone who claims that they do is deluded at best and selling snake-oil at worst.

That said - my best guess is that the final answer will involve hierarchy with the attendant fine grained parsing, distributed memory of both sequence and patterns over this hierarchy, distributed learning over the entire hierarchy, prediction and recall mixed with perception, mixing of relevance values and semantic representation, and a separation of drives, sequencing, and evaluation from the memory system. There are a lot of moving parts that have to fit together into a working system.

On the plus side there is a working system to examine and learn from; I would hate to have to work this out from a blank sheet and a list of specifications.


The last common ancestor of humans and chickens must have been quite something. I suppose there was an adaptive advantage that conserved a behavioral trait across the 2 lineages.
Anyway, it seems an in-depth understanding of the weighted sum/dot product and a few small tricks will get you very far with machine learning. Yes, if you just stare at the dot product equation, that isn’t helpful.
You need to take some perspectives on it, like how might you get it to work as a linear associative memory or how you calculate the effect of noise on the inputs etc.
If you just want to create an AGI as opposed to understanding the human brain that would seem quite easy. You link dot product controller neural networks to a large dot product single layer associative memory and let evolution get to work.
You maybe need to think about direct and indirect addressing into the associative memory in order to do the linkages correctly. In principle it is very easy, in practice who knows?
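
As one perspective on the dot product as a linear associative memory, here is a minimal sketch. It assumes random +1/-1 keys and scalar values, and the names `LinearAssociativeMemory`, `store`, and `recall` are made up for this example. Recall is a single dot product, and storing is a small error-correcting update to the same weight vector.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def random_key(dim, rng):
    """Random +1/-1 key; such keys are nearly orthogonal when dim is large."""
    return [rng.choice((-1.0, 1.0)) for _ in range(dim)]


class LinearAssociativeMemory:
    """One weight vector holding scalar values against high-dimensional keys."""

    def __init__(self, dim):
        self.weights = [0.0] * dim

    def recall(self, key):
        return dot(self.weights, key)

    def store(self, key, value):
        # Shift the weights along the key so that recall(key) == value.
        scale = (value - self.recall(key)) / dot(key, key)
        self.weights = [w + scale * k for w, k in zip(self.weights, key)]


rng = random.Random(1)
dim = 1024
memory = LinearAssociativeMemory(dim)
pairs = [(random_key(dim, rng), rng.uniform(-1.0, 1.0)) for _ in range(10)]
for key, value in pairs:
    memory.store(key, value)

# Earlier items are disturbed a little by later stores (crosstalk between keys).
for key, value in pairs:
    print(round(value, 3), round(memory.recall(key), 3))
```

The recalled values are close to the stored ones but not exact, because each later store leaks a little into the earlier keys. For random keys that crosstalk shrinks roughly with the square root of the dimension, the same limit as for averaging noise.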

Good start. Now how do you do online learning with this?

This is a critical and deeper problem - assume that you programmed this memory array with large comprehensive databases - say Wikipedia level on a variety of subjects: how does it pick actions that are good?
Sure - it “knows” everything a college grad does, but should it use a sledgehammer, front end loader, explosive, battering ram, locksmith, or jiggle the handle to open a locked door in a room it is exploring? Should it start a sub-mission to search for the key? Should it occur to the AI that a locked door means that whatever is behind it is not part of its exploration, even though it has decided to explore the house as it was commanded?
Most AI researchers learn that common sense is not so common early on.
If I am correct, judgement in these matters is programmed right alongside facts of the environment - part of the online learning. This emotional shading of the facts that make up the basic learning through the agent’s life, as it goes from basic programmed innate knowledge to “adulthood”, is critical for forming judgements. Critters learn this with various emotional shading as they encounter things - fire bad, warm from fire good, hot food from fire good, very hot food from fire bad …
Every object, every interaction - all are stored with some sort of personal judgement of personal relevance on multiple dimensions of emotional values.
So - add emotions and a hippocampus to store episodes of experience and a limbic system to form these personal judgements, an online learning system to record these experiences, and some method of transferring these judgement salted memories to the correct location in the multi-level associative memory that you described.

I can see, in broad brush detail, how the biology does this; I can’t see how to do this with any traditional deep learning memory.

The problem of common sense is one of the huge stumbling blocks for any AI effort, and the source of most dystopian views of any AI implementation.