This is a community-created and managed reference card about HTM theory and implementation, for easy lookup of common information. Please add to it if you have something interesting to contribute.
If you see “?” and know the answer, please fill it in.
Just the numbers, as rules of thumb in powers of 10:
item | count | capacity | misc |
---|---|---|---|
pattern | …10001010… | (2000 choose 40) | 2000 bits / 40 active |
synapse | equivalent to a bit | ~10 synapses to detect a pattern | |
dendrite segment | ~100 synapses | ~10 patterns | |
neuron | ~100 dendrite segments | ~1000 patterns | ~10,000 synapses |
mini-column | ~100 neurons | 100 n * 100 segs * 10 pat = 100,000 patterns | ~50 µm wide |
macro-column (?) | ~100 mini-columns | | ~500 µm wide |
cortical column | ~2000 mini-columns | ~5M transitions | |
cortex | ~150,000 CCs | ~750B transitions | ~20B neurons |
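As a quick sanity check, a couple of the numbers in the table can be computed directly. A minimal Python snippet (standard library only):

```python
from math import comb

# Number of distinct SDRs with w = 40 active bits out of n = 2000 bits:
n, w = 2000, 40
print(comb(n, w))  # "2000 choose 40", on the order of 10^84
print(n // w)      # 50, i.e. ~2% sparsity -- the "2000/40" in the misc column
```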
Brain
- Weighs ~3.3 lbs (1.5kg)
- About 2% of a human’s total body weight
- Uses ~20% of the body’s oxygen
- Total number of neurons in the brain: 86 billion
- Energy consumption: 20 W
- 1 mm² contains: ~170,000 neurons
Neocortex
- ~75% of the brain (by volume)
- Total number of neurons in the cerebral cortex: 21-26 billion
- Total number of neurons in the neocortex: ??
- Total number of connections: ??
- Total number of synapses: 100-180 trillion (depends on sex and age)
- The neocortical sheet is about 1000 cm² in area and 2.5 mm thick
- One mm² of cortex has about 100K neurons and 1B synapses
- The cortex is (horizontally) divided into 6 layers (actually there are more; counting sublayers, we could say there are nine: 1, 2, 3a, 3b, 4, 5a, 5b, 6a, 6b)
Neuron
- It’s also referred to as “cell” in the HTM theory
- Pyramidal neurons are the most common type of neuron in the neocortex
- The pyramidal neuron is the core information processing element of the neocortex
- A pyramidal neuron has dendritic segments
- Each dendritic segment can have multiple synapses (or, in layman’s terms, “connections”) to other neurons
- Synapses are the “substrate of memory”
- Number of synapses: 5,000-30,000
- 10% proximal, 90% distal
- 8-20 co-active synapses on a segment are enough to generate a dendritic spike (i.e. to detect a pattern)
- Number of detectable patterns: ~hundreds
- Average number of dendrite segments: ??
- Average number of synapses per dendrite segments: ??
The following is a diagram of an HTM neuron (modeling a biological pyramidal neuron of the neocortex):
Note: need to add apical dendrites and synapses to the cell model visualization.
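For readers who prefer code, here is a minimal, hypothetical Python sketch of the neuron model described above (class names are illustrative, not NuPIC’s actual API): a neuron owns dendritic segments, each segment owns synapses with permanence values, and a segment fires when enough of its connected synapses see active cells.

```python
# Illustrative sketch of an HTM neuron (not NuPIC's actual API).
CONNECTED_THRESHOLD = 0.5   # permanence at or above this => synapse is connected
ACTIVATION_THRESHOLD = 15   # ~8-20 co-active synapses trigger a dendritic spike

class Segment:
    def __init__(self, synapses):
        # synapses: {presynaptic_cell_id: permanence in [0.0, 1.0]}
        self.synapses = synapses

    def is_active(self, active_cells):
        """A segment spikes when enough connected synapses see active cells."""
        overlap = sum(1 for cell, perm in self.synapses.items()
                      if perm >= CONNECTED_THRESHOLD and cell in active_cells)
        return overlap >= ACTIVATION_THRESHOLD

class Neuron:
    def __init__(self, segments):
        self.segments = segments  # ~100 distal segments, per the table above

    def is_predicted(self, active_cells):
        """A neuron enters the predictive state if any distal segment is active."""
        return any(seg.is_active(active_cells) for seg in self.segments)
```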
Cortical Region
- Although different regions of the neocortex process different inputs (vision, hearing, touch, language, etc.), at a fundamental level, these are all variations of the same problem, and are solved by the same neural algorithms.
- The regions are defined by connectivity.
- Regions pass information to each other by sending bundles of nerve fibers into the white matter just below the neocortex. The nerve fibers reenter at another neocortical region.
- The connections between regions define a logical hierarchy.
- Neuroanatomy tells us that every region of the neocortex has both sensory and motor functions.
- Therefore, vision, hearing, and touch are integrated sensory-motor senses:
- we can’t build systems that see and hear like humans do without incorporating movement of the eyes, body, and limbs.
Cortical Column
- A cortical column is about 1.0-1.5 mm² in area and contains 2000+ mini-columns
- ~ 100k neurons
- ~ 500M synapses (1 mm^2)
- 10 cellular layers
- It is also called hypercolumn or macrocolumn
- In the current HTM model, 2048 mini-columns are associated with one spatial pooler (see the Spatial Pooler section below)
- All cortical columns learn a complete model of everything they are exposed to, and they all do it in parallel.
- Each cortical column essentially learns the same thing in parallel, and the columns vote (via layer 2 communication).
Mini-column
- About 30-50 microns wide with 100-120 neurons across all 6 layers
- Capacity: 100 neurons * 100 segments * 10 patterns = 100,000 patterns
- For a mini-column field (say 2000 mini-columns):
  - Capacity = (neurons * segments * patterns) / sparsity
  - C = 100,000 / 0.02 = 5 million transitions
- ~150M-200M mini-columns in the cortex
- In the current HTM model, there are 32 cells per mini-column in layer 3 implementation
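The capacity arithmetic above, spelled out in Python (the numbers are the rule-of-thumb values from this page, not exact biology):

```python
# Rule-of-thumb capacity arithmetic for a mini-column field.
neurons_per_minicolumn = 100
segments_per_neuron = 100
patterns_per_segment = 10
sparsity = 0.02  # ~2% of mini-columns active at any time

minicolumn_capacity = (neurons_per_minicolumn
                       * segments_per_neuron
                       * patterns_per_segment)    # 100,000 patterns
field_capacity = minicolumn_capacity / sparsity   # 5,000,000 transitions

print(f"{minicolumn_capacity:,} patterns per mini-column")
print(f"{field_capacity:,.0f} transitions for a 2000-mini-column field")
```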
Encoder
- Takes sensor inputs and converts them into SDRs
- HTM Examples: Scalar Encoder, Random Distributed Scalar Encoder (RDSE)
- Biological sense organs take stimuli from the environment and translate them into streams of SDRs, i.e. the neural activity going to the brain.
- An encoder takes some type of data (a number, time, temperature, image, or GPS location) and turns it into a sparse distributed representation that can be digested by the HTM learning algorithms.
- The HTM learning algorithms will work with any kind of sensory data as long as it is encoded into proper SDRs.
Biological Examples:
- Vision: Retina
- Hearing: Cochlea
- Touch: Nerves
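As an illustration of the idea, here is a minimal, hypothetical scalar encoder in Python (a heavily simplified cousin of NuPIC’s ScalarEncoder; the function name and parameter values are made up for this sketch). It maps a number in a fixed range to a contiguous run of w ON bits, so nearby values share bits, and semantic similarity becomes bit overlap:

```python
def encode_scalar(value, min_val=0.0, max_val=100.0, n=400, w=21):
    """Encode a scalar as an SDR: a contiguous run of w ON bits out of n."""
    value = max(min_val, min(max_val, value))  # clamp to the encoder's range
    buckets = n - w + 1                        # possible start positions
    start = int((value - min_val) / (max_val - min_val) * (buckets - 1))
    sdr = [0] * n
    sdr[start:start + w] = [1] * w
    return sdr

a, b = encode_scalar(50.0), encode_scalar(52.0)
print(sum(x & y for x, y in zip(a, b)))  # high overlap: 50 and 52 are close
```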
SDR
- A data structure which represents the activity in the neocortex, which is sparse and distributed
- It also represents the input (from the sensors, e.g. the eyes)
- Hence, it can be thought of as “the data structure of the brain”
- It can be implemented as a binary vector (or matrix), whose elements are either 1 (on) or 0 (off)
- An iSDR (indexed SDR) is an alternative representation, which can be implemented as a tuple (size, idx1, idx2, …) where each “idx” tells which bit is ON. E.g. 10100 <=> (5, 1, 3): size 5, with bits 1 and 3 ON (counting from 1)
- The bits in an SDR representing an input (from a sensor) are associated with “features” of the input (i.e. some real or abstract object)
- The 1 bits of an SDR represent the fact that the specific SDR (representing some specific object) contains the corresponding feature
- The bits of other SDRs (e.g. the ones that the spatial pooler outputs) represent minicolumns
- 1 (or 0) bits represent the fact that the corresponding minicolumn is active (or inactive)
- An SDR is sparse because the percentage of 1 bits is very low compared to the percentage of 0 bits
- SDRs have useful properties
- High capacity (i.e., a lot of stuff can be represented)
- Robustness to noise (because of their mathematical properties)
- Efficient storage (as it’s sufficient to only store the “on” (or 1) bits)
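A minimal Python sketch of these ideas (the function names are illustrative): SDRs as binary vectors, conversion to the indexed form, and the overlap score used throughout HTM.

```python
def to_indexed(sdr):
    """Dense binary vector -> indexed SDR (size, idx1, idx2, ...), 1-based."""
    return (len(sdr),) + tuple(i + 1 for i, bit in enumerate(sdr) if bit)

def overlap(a, b):
    """Number of positions where both SDRs are ON: the basis of SDR matching."""
    return sum(x & y for x, y in zip(a, b))

sdr = [1, 0, 1, 0, 0]
print(to_indexed(sdr))                # (5, 1, 3), as in the iSDR example above
print(overlap(sdr, [1, 1, 1, 0, 0]))  # 2
print(sum(sdr) / len(sdr))            # 0.4 here; real SDRs are ~0.02 (2%) sparse
```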
Spatial Pooler
The following descriptions are, to some extent, a simplification of what the SP actually does.
- It’s a learning algorithm
- Its function is to identify common spatial patterns in the input
- In the current HTM theory, it’s an algorithm that is “executed” in the L3 a/b layer
- It receives an SDR as input and produces another SDR as output
- It maintains a fixed sparsity:
- the number of 1 bits in the SP output (which is an SDR) is always constant, even if the sparsity of the input SDR continuously changes
- So it can be thought of as a “normalizer”
- The meaning of the input SDR is maintained in the output SDR:
- the overlapping properties of the input SDR are maintained in the output SDR
- The bits in the output SDR of the spatial pooler represent minicolumns
- The 1 (or 0) bits represent the active (or inactive) minicolumns
- The minicolumn (associated with a bit of the output SDR) has a “potential pool”, which is a term used to indicate the set of bits of the input SDR which that minicolumn may be “connected” to
- Initially, the potential pool of each minicolumn is usually randomly initialized:
- In other words, the subset of input bits that may be connected to a specific minicolumn is, initially, randomly initialized
- The potential pool can contain both 1 or 0 bits
- Each of the bits of the input SDR that belongs to that potential pool has a numerical value associated with it, called “permanence value”
- The permanence value determines if that input bit is going to be “connected” to that minicolumn or not
- If the permanence value is greater than a “connection” threshold, then it is connected, otherwise, it is not
- The number of connections between a minicolumn (associated with a bit of the output SDR) and the 1 (or “ON”) bits of the input SDR is called the “overlap score” (of that minicolumn for that specific input)
- Given that the potential pool is randomly initialized, the overlap score changes from column to column
- The overlap score also depends on the specific input SDR
- Hence, for a specific input (an SDR), the overlap score of each SP minicolumn induces a ranking of the SP minicolumns
- Minicolumns whose overlap score is above a certain threshold are called “active columns”; the others are called “inactive columns” (assuming a global inhibition area, i.e. every minicolumn is a neighbor of every other minicolumn of the SP)
- Active columns can now “learn”
- Inactive columns (i.e. the ones whose overlap score is not bigger than a certain threshold) do not learn
- Active columns learn by
- incrementing the “permanence value” of the connections to the 1 bits (of the input), and
- decrementing the permanence value of the connections to the 0 bits (of the input).
- This implies the formation and destruction of connections (during the learning phase).
- The connections that are formed or destroyed during the learning phase depend on several aspects:
- The specific inputs
- The random initialization of the SP
- “Boosting” and “inhibition” are regulatory mechanisms (the HTM counterpart of “homeostasis”) which balance the contribution of all minicolumns
See the HTM School videos for a more fluid (but still simplified) exposition of these concepts.
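The following is a heavily simplified Python sketch of a single SP step as described above (global inhibition; the names and learning rates are illustrative, not NuPIC’s actual implementation): compute overlap scores, pick the top columns as winners, and nudge the winners’ permanences toward the input.

```python
import random

N_INPUT, N_COLUMNS, POOL_SIZE = 400, 100, 50
CONNECTED, ACTIVE_COLUMNS = 0.5, 2   # top-2 of 100 columns => 2% sparsity
INC, DEC = 0.1, 0.02                 # illustrative learning rates

random.seed(42)
# Each column's potential pool: a random subset of input bits with permanences.
pools = [{i: random.random() for i in random.sample(range(N_INPUT), POOL_SIZE)}
         for _ in range(N_COLUMNS)]

def sp_step(input_sdr):
    # 1. Overlap score: connected synapses that see an ON input bit.
    scores = [sum(1 for i, p in pool.items() if p >= CONNECTED and input_sdr[i])
              for pool in pools]
    # 2. Global inhibition: only the top-k columns become active.
    winners = sorted(range(N_COLUMNS), key=lambda c: scores[c])[-ACTIVE_COLUMNS:]
    # 3. Learning: winners reinforce ON bits and weaken OFF bits in their pool.
    for c in winners:
        for i in pools[c]:
            delta = INC if input_sdr[i] else -DEC
            pools[c][i] = min(1.0, max(0.0, pools[c][i] + delta))
    return winners
```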
Temporal Memory
- In the current HTM theory, it’s an algorithm that is “executed” in the L3 a/b layer
- Implements a variable-order sequence memory (akin to a variable-order Markov chain)
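A conceptual illustration only (a hypothetical mini-example, not the actual TM algorithm): because each mini-column contains many cells, the same input can be represented by different cells depending on the preceding context, which is what makes the order variable.

```python
# The feature "B" always activates column "col_B", but a different cell within
# that column fires depending on what came before, so "A B" and "X B" become
# distinct states that can predict different successors.
b_after_a = ("col_B", 0)   # cell 0 of col_B: "B in the context of A"
b_after_x = ("col_B", 3)   # cell 3 of col_B: "B in the context of X"

transitions = {
    b_after_a: "col_C",    # learned sequence A, B, C
    b_after_x: "col_Y",    # learned sequence X, B, Y
}
print(transitions[b_after_a])  # col_C -- the context disambiguates the prediction
```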
HTM Implementation Parameters
- Num Columns (N): 2048
- Num Cells per Column (M): 32
- Num of Active Bits (w): 41
- Sparsity (w/N): 2%
- Dendritic Segment Activation Threshold (θ): 15
- Initial Synaptic Permanence: 0.21
- Connection Threshold for Synaptic Permanence: 0.5
- Synaptic Permanence Increment and Decrement: +/- 0.1
- Synaptic Permanence Decrement for Predicted Inactive Segments: 0.01
- Maximum Number of Segments per Cell: 128
- Maximum Number of Synapses per Segment: 128
- Maximum Number of New Synapses Added at each Step: 32
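For reference, the same parameters as a Python dictionary (the key names are illustrative; NuPIC’s actual parameter names differ slightly):

```python
HTM_PARAMS = {
    "num_columns": 2048,                  # N
    "cells_per_column": 32,               # M
    "num_active_bits": 41,                # w
    "sparsity": 41 / 2048,                # w/N, ~2%
    "segment_activation_threshold": 15,   # theta
    "initial_permanence": 0.21,
    "connected_permanence": 0.5,
    "permanence_increment": 0.1,
    "permanence_decrement": 0.1,
    "predicted_inactive_decrement": 0.01,
    "max_segments_per_cell": 128,
    "max_synapses_per_segment": 128,
    "max_new_synapses_per_step": 32,
}
```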
CC algorithm
1. Motor input arrives before the sensory input and is processed by the location layer, which consists of grid cell modules. If this layer has an active location representation, it uses the motor input to shift the activity in each module, computing the sensor’s new location.
2. This updated grid cell activity propagates to the sensory layer and causes a set of predictions in that layer.
3. The sensory layer receives the actual sensory input. The predictions are combined with the sensory input. The new activity is a union of highly sparse codes; each sparse code represents a single sensory feature at a specific location that is consistent with the input so far.
4. The sensory layer activity propagates to the location layer. Each module activates a union of grid cells based on the sensory representation. The location layer will then contain a union of sparse location representations that are consistent with the input so far.
After the fourth stage, the next motor action is initiated and the cycle repeats.
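As a structural sketch only (all class and method names here are hypothetical, not Numenta’s actual code), the four stages map onto a loop like this:

```python
def cortical_column_cycle(location_layer, sensory_layer,
                          motor_input, sensory_input):
    # Stage 1: motor input shifts the grid-cell activity (path integration).
    location_layer.shift(motor_input)
    # Stage 2: updated locations drive predictions in the sensory layer.
    predictions = sensory_layer.predict(location_layer.active_cells())
    # Stage 3: predictions combine with the actual sensory input, leaving a
    # union of sparse feature-at-location codes consistent with the input.
    sensory_layer.activate(sensory_input, predictions)
    # Stage 4: sensory activity narrows the union of candidate locations.
    location_layer.update(sensory_layer.active_cells())
    # The next motor action is then initiated and the cycle repeats.
```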