An HTM compatible theory of attention

I’ve been working hard to grok HTM theory. If I’m understanding correctly, the mechanism behind “attention” would best be explained by SDRs inhibiting more parts of the brain as they are recruited to model something intensely, with the net effect of pushing other conscious thoughts out of awareness. Does this sound right (or at least indicate that I’m starting to think about HTM theory correctly)?

I tend to think of attention as a behavior designed to reduce uncertainty and/or surprise. That is, our attention is directed towards thoughts and/or sensory inputs that let us gather additional data, either to resolve existing uncertainties or to learn novel sequences.

HTM excels at generating predictive states that are consistent with previously observed patterns. When a novel pattern is observed and columns burst in response to being activated without having been in a predictive state, then the thalamus is gated to flood the cortex with additional signals and information to assist with learning a new sequence.
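The bursting rule described above can be sketched in a few lines. This is a minimal illustration, not Numenta’s implementation: the `activate` function, the `(column, cell)` tuples and `CELLS_PER_COLUMN = 4` are assumptions made for the example, and the thalamic gating step is not modelled at all.

```python
from typing import Set, Tuple

Cell = Tuple[int, int]  # (column index, cell index within the column)

CELLS_PER_COLUMN = 4  # hypothetical size, chosen only for illustration


def activate(active_columns: Set[int], predictive_cells: Set[Cell]):
    """Return (active_cells, bursting_columns) for one time step."""
    active_cells: Set[Cell] = set()
    bursting_columns: Set[int] = set()
    for col in active_columns:
        predicted = {cell for cell in predictive_cells if cell[0] == col}
        if predicted:
            # The input was anticipated: only the predicted cells fire.
            active_cells |= predicted
        else:
            # Nothing predicted this column: every cell fires (a burst).
            bursting_columns.add(col)
            active_cells |= {(col, i) for i in range(CELLS_PER_COLUMN)}
    return active_cells, bursting_columns


# Column 0 was predicted (cell 2), column 1 was not.
cells, bursts = activate({0, 1}, {(0, 2)})
print(sorted(cells))  # [(0, 2), (1, 0), (1, 1), (1, 2), (1, 3)]
print(bursts)         # {1}
```

In this sketch the set of bursting columns is the “surprise” signal; whatever downstream process reacts to it (the thalamus, in the paragraph above) would take that set as its input.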

While these ‘surprise’ events will certainly grab the attention, and focus the agent on learning the new sequence, novelty is not the same as uncertainty. Uncertainty can exist if the system is making multiple predictions that are all plausible given the current context. In that case, attention could be used to generate behaviors that would allow the agent to acquire additional information (input from senses or recall of stored memories) that might reduce the number of simultaneous predictions.
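To make the novelty/uncertainty distinction concrete, here is a rough sketch with two separate measures. Both functions and the A-B-C / A-B-D example are hypothetical; `novelty` is modelled loosely on the usual HTM anomaly score (the fraction of active columns that were not predicted), while `uncertainty` is simply a count of extra competing predictions, which is an assumption of this example rather than anything defined in HTM theory.

```python
from typing import Set


def novelty(active_columns: Set[int], predicted_columns: Set[int]) -> float:
    """Fraction of active columns that were NOT predicted (0.0 to 1.0)."""
    if not active_columns:
        return 0.0
    return len(active_columns - predicted_columns) / len(active_columns)


def uncertainty(candidates: Set[str]) -> int:
    """Number of competing predictions beyond the first one."""
    return max(0, len(candidates) - 1)


# Hypothetical case: the sequences A-B-C and A-B-D were both learned,
# so after seeing A, B the system predicts C and D at the same time.
candidates_after_ab = {"C", "D"}
print(uncertainty(candidates_after_ab))  # 1

# "C" then arrives on columns {10, 11}, which were among the predicted ones.
print(novelty({10, 11}, {10, 11, 20, 21}))  # 0.0
```

Here the system is uncertain (two live predictions) but never surprised (nothing bursts), which is the situation where attention-driven information gathering would have something to do.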

While performing this behavior, many columns are recruited to model observed patterns, right? As more parts of the brain are employed, doesn’t this have an inhibitory effect, essentially filtering other unrelated patterns out of conscious thought?

I’m imagining the enforced sparsity leaving no room for conscious thought other than what you are actively modeling (Attention).
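One way to picture “enforced sparsity leaving no room” is k-winners-take-all inhibition, where only a fixed number of columns may be active at once. The sketch below is an illustration under assumptions: `k_winners` and the overlap scores are made up for the example, and real spatial pooling adds local inhibition and boosting that are left out here.

```python
from typing import List, Set


def k_winners(overlaps: List[float], k: int) -> Set[int]:
    """Indices of the k columns with the highest overlap; ties go to the lower index."""
    ranked = sorted(range(len(overlaps)), key=lambda i: (-overlaps[i], i))
    return set(ranked[:k])


# Hypothetical overlap scores: columns 0-2 respond to the attended pattern,
# columns 3-5 respond weakly to something unrelated.
overlaps = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
winners = k_winners(overlaps, k=3)
print(sorted(winners))  # [0, 1, 2]
```

With the budget fixed at `k`, the strongly driven pattern takes every slot and the unrelated columns are silenced, which is the filtering effect being asked about.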

You might find Graziano’s work on Attention Schema Theory (AST) provocative.


Personally, I found AST quite groundbreaking - and a good step away from theories of consciousness towards real application. AST makes predictions you can test - and he has tested them.

(As an aside, I also enjoyed his starting point in the applied world of proprioception, which itself highlights many non-conscious aspects of how bodies really work, and which lets you predict what happens when it fails).

This is along the lines of how I’m thinking the process works:

Goal investigation → Unrecognised pattern (subcortical instigator)
Unrecognised pattern → Surprise (SDR minicolumn inhibition / winner takes all)
Surprise → Attention (thalamic focus / wider column inhibition)
Attention → Working memory (PFC minicolumn identification)
Working memory → Thought (PFC/other/no idea)
Thought → Goal investigation
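
The loop above can be written down as a simple transition table, just to show that it closes on itself. The stage names are taken from the list; the `NEXT_STAGE` dict and the `walk` helper are hypothetical, and nothing here models what happens inside each stage.

```python
from typing import List

# Each stage maps to the stage it hands off to, as listed above.
NEXT_STAGE = {
    "Goal investigation": "Unrecognised pattern",
    "Unrecognised pattern": "Surprise",
    "Surprise": "Attention",
    "Attention": "Working memory",
    "Working memory": "Thought",
    "Thought": "Goal investigation",
}


def walk(start: str, steps: int) -> List[str]:
    """Follow the transitions for a number of steps and return the path."""
    stage = start
    path = [stage]
    for _ in range(steps):
        stage = NEXT_STAGE[stage]
        path.append(stage)
    return path


path = walk("Goal investigation", 6)
print(" → ".join(path))
```

Six steps bring the walk back to “Goal investigation”, so the process as listed is a closed cycle with no exit branch.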

All surprise ends up in working memory; it just depends on how quickly it fades out. We can’t avoid or block this process.

Surprise is only created once in the SDR process (within a 200 ms theta wave cycle, borrowing from Max’s hypothesis); after that it is no longer a surprise but a known pattern. So this surprise is then held in working memory, to fade out or not…

