Paper referenced in A Thousand Brains: A New Theory of Intelligence

Hi @jhawkins, in the HLC call a group of us are discussing A Thousand Brains: A New Theory of Intelligence. The following section from the book mentions a published paper which describes how voting happens quickly and reliably. Can you please tell us the title of that paper? Thanks!

“Surprisingly, a column doesn’t need to send its vote to every other column. The voting mechanism works well even if the long-range axons connect to a small, randomly chosen subset of other columns. Voting also requires a learning phase. In our published papers, we described software simulations that show how learning occurs and how voting happens quickly and reliably.”


@markNZed: I think you addressed the wrong Jeff. Jeff Hawkins’ handle on the forum is @jhawkins.

Could I also ask in this same thread what we should understand by voting?

Is it in the sense that a majority of voting neurons produce an action potential in a ballot neuron which determines an outcome?

Or is it rather in the sense that clusters of neurons gradually change their states (i.e. opinions) as if they are convinced by alternative propositions from opposing clusters? (Which is how honeybees make decisions, I am told.)

Or is it more in the sense of a rugby scrum, where the winning voters push the whole pack across a line? (Which is the way a starfish walks, I believe.)


I believe that is a reference to the Columns Paper. There are further details on statistics, capacity analysis, etc., in the Columns Plus Paper.

The way they implemented voting in that paper is:

This is basically describing ambiguous input leading to a union of objects in the output layer, with bias coming from neighboring cortical columns (CCs) to reduce the ambiguity.

Note that this is only describing the inference stage (i.e. voting on objects that have already been learned), not how they were initially learned.
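To make the union-plus-bias idea concrete, here is a toy sketch (my own illustration, not code from the paper; the objects and the `vote` function are invented): each column starts with the union of objects consistent with its own ambiguous input, and lateral support from the other columns repeatedly narrows that union until the columns agree.

```python
# Toy illustration of union-plus-bias voting during inference.
# Each column's ambiguous input is consistent with several candidate objects.
columns = [
    {"mug", "bowl", "can"},   # column sensing a curved surface
    {"mug", "can"},           # column sensing a cylindrical side
    {"mug", "bowl"},          # column sensing a rim
]

def vote(columns):
    """Each column keeps only candidates that the other columns also
    support, repeating until no column's candidate set changes."""
    cols = [set(c) for c in columns]
    changed = True
    while changed:
        changed = False
        for i in range(len(cols)):
            # Lateral bias: candidates every other column still considers viable.
            support = set.intersection(*(cols[j] for j in range(len(cols)) if j != i))
            narrowed = cols[i] & support
            if narrowed and narrowed != cols[i]:
                cols[i] = narrowed
                changed = True
    return cols

print(vote(columns))  # every column settles on {'mug'}
```

Note that, as in the paper's description, there is no central tally here: each column does its own narrowing using the others' current candidate sets as bias.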

A diagram from the paper showing how they hooked up the basic circuit:

(blue and purple arrows depict biasing signals, and green lines generate active states)


Falco posted the article, which includes the code we used for the simulations. If I recall correctly, the idea that a column doesn’t need to have voting connections to every other column is not mentioned in the article. It comes from the properties of sparse representations. The number of synapses you need to form to recognize a pattern depends mostly on the sparsity of the pattern, not on the size of the vector. If 1% of the neurons are active, then forming 20 to 30 synapses to active neurons is sufficient to recognize a pattern regardless of whether the size of the vector is 10K or 1M. So increasing the number of columns that are voting does not increase the number of voting synapses required for any individual column. The number of voting synapses needed overall increases linearly with the number of columns.
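A quick numerical check of that claim (my own toy Monte Carlo, not Numenta code; the specific threshold of 13 out of 25 synapses is chosen for illustration): form 25 synapses onto the active cells of one stored 1%-sparse pattern, then count how often an unrelated random pattern of the same sparsity reaches the match threshold. The false-match rate stays essentially zero whether the vector has 10K or 1M cells.

```python
import random

def false_match_rate(n, sparsity=0.01, n_synapses=25, threshold=13, trials=500):
    """Estimate how often a detector with n_synapses synapses onto one
    stored sparse pattern falsely matches a random sparse pattern."""
    k = int(n * sparsity)                          # active cells per pattern
    stored = random.sample(range(n), k)            # the learned pattern
    synapses = random.sample(stored, n_synapses)   # synapses onto its active cells
    hits = 0
    for _ in range(trials):
        other = set(random.sample(range(n), k))    # unrelated random pattern
        overlap = sum(1 for s in synapses if s in other)
        hits += overlap >= threshold
    return hits / trials

for n in (10_000, 100_000, 1_000_000):
    print(n, false_match_rate(n))   # false-match rate is ~0.0 at every size
```

The reason the vector size drops out: for a random pattern, each synapse lands on an active cell with probability equal to the sparsity (1%), so the chance of 13 of 25 synapses overlapping by accident is astronomically small independent of n.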


I’m not sure if what follows is a reasonable way to understand the paper, but here goes: the voting is almost the opposite of how one might imagine voting. The output layer collects all the votes, which are fed back to the inputs of all the columns. Then each column takes the other columns’ votes and its own input and “votes” again. It is as if each voter (column) does its own counting and is then influenced by that count :slight_smile: There is no centralized vote “counting” as one might imagine.


Actually @Paul_Lamb did. :-). Thanks for the answer.

If you consider each (macro)column that has a model of what is being observed as a little democracy, then its inputs from other columns will vote to reach a consensus within the macrocolumn and drop the alternative opinion. We might call it a federal democracy. ;-).


You could think of it more like ranked-choice voting. Each column has a list of candidates that it sees as viable options that correspond to the input signal. It then sends its best guess to the other columns, but if that candidate is not well supported by evidence from other columns, it is thrown out and the next most viable candidate selected. This continues until consensus is reached, or the column gives up and bursts because all of its viable candidates have been rejected.
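A toy sketch of that ranked-choice analogy (my own illustration; the function, the quorum rule, and the `None` return standing in for “bursting” are all invented for the example): each column holds candidates ranked best-first, and a column’s current best guess survives a round only if enough other columns currently back it too.

```python
def ranked_choice_vote(columns, quorum=2, max_rounds=10):
    """columns: list of candidate lists, each ordered best guess first.
    Returns the agreed object, or None when a column runs out of
    candidates without consensus (the 'bursting' case)."""
    ranked = [list(c) for c in columns]
    for _ in range(max_rounds):
        guesses = [c[0] if c else None for c in ranked]
        # Consensus: every column's current best guess agrees.
        if guesses[0] is not None and all(g == guesses[0] for g in guesses):
            return guesses[0]
        for i, c in enumerate(ranked):
            if not c:
                continue
            # Support = how many other columns currently guess the same object.
            support = sum(1 for j, g in enumerate(guesses) if j != i and g == c[0])
            if support < quorum - 1:
                c.pop(0)   # best guess rejected; fall back to next candidate
    return None

print(ranked_choice_vote([
    ["can", "mug", "bowl"],
    ["mug", "can"],
    ["mug", "bowl", "can"],
]))  # -> 'mug': the first column's top pick is rejected, its second survives
```

With three columns that share no candidate at all, every guess is rejected in turn and the function returns `None`, mirroring a column bursting after exhausting its viable candidates.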


This is exactly my understanding of how the consensus building takes place. It is still very correct that there is no centralized vote counting. But ranked-choice voting still leads to the result that, in the end, for the entire “superset” (collective) of cortical columns that are involved in the voting, the most commonly selected choice emerges as the stable “winner”. It is almost like finding the greatest common divisor of a list of numbers (to use a mathematical metaphor). This is still a disambiguation process taking place at the collective level of the set of cortical columns. The collective has not decided which model is the best fit to represent the perceived reality (sensory input) until this consensus is reached. So final disambiguation cannot be attained by any singular cortical column. It requires the collective of the cortical columns to reach that consensus. This is the true, decisive, final disambiguation of the perception taking place.

Perhaps the problem in understanding this is that the representation of the final perception is not stored in any of the cortical columns. The knowledge representation of the perception taking place is only stored outside the cortical columns, in the active distal axon connections for the group (collective) of participating cortical columns. We are used to looking inside the CCs, in their SDRs, for the representations that we associate with consciousness. But in TBT the representation that corresponds to consciousness is in the network of distal axon connections of the CCs. Only the set of distal connections that wins remains active. And that set has the common denominator (the common model chosen as winner) in each active cortical column. The other sets of distal axon connections, those connecting CCs with “less common model choices”, become weaker as they receive fewer outputs from the CCs in the collective. So it is a sort of competition of distal networks, and the winner is the representation that wins the disambiguation process.

This vision of mine, for the TBT collective level of disambiguation, would imply that for each of the, say, 100 models that are stored as SDRs in each CC there is a pyramidal neuron at layer 1 that only becomes active for one of the 100 SDRs (that is, the reference models inside the CC, of which only one can win within the CC). So if the CC stores 100 models (one for each potential winning SDR), there are 100 different layer 1 pyramidal cells, each projecting to a distant set of other CCs with a corresponding model in their CCs. These 100 axons leaving layer 1 of the CC do overlap in the set of other CCs that they project to. But they will only reinforce the models (SDRs) in the afferent CCs that correspond to their source SDR in the efferent CC. In this way, you have feedback loops that modulate the competition of models across the collective (superset) of CCs.