Encoder and Spatial Pooler Confusion


I’m just starting to program this system following the many materials available, but I’m confused about how the encoder -> Spatial Pooler -> neurons pipeline works. So we need an encoder to turn the data into an SDR, and then we run that through the Spatial Pooler to get a different SDR?

I might be confused as to what encoding is. If I were encoding hue, I could have a 100-bit array representing the hue, with, let’s say, a block of 12 active bits that moves across it and wraps around to the start. Would this be considered encoded data? Am I supposed to do something to this array before feeding it to the Spatial Pooler, or can I just let the Spatial Pooler use it directly and work its magic?

Why does the data passed into the Spatial Pooler need to be preemptively sparse if the Spatial Pooler is going to give you an SDR anyway?

Sorry if this is a stupid question, but I’m having a bit of trouble wrapping my head around this.


I took this from a scientific paper; I hope it helps you ^_^:

SDRs are the neocortically inspired data structure used in HTMs; they represent neurons and their state by means of a binary vector. For such a vector to be an SDR, only a small percentage, typically two percent, of its entries should be active (i.e., set to one). This is based on the fact that the relative number of neurons that are active in the neocortex at any given time is low. SDRs encode the semantic meaning of the data and have some valuable properties such as fault and noise tolerance, can be easily compressed, and the semantic similarity between two inputs can be quickly determined via a bit comparison, among others.

Transforming input data (It) such as scalar values, dates, and categorical values into binary arrays (xt) is achieved by using encoders. They are responsible for determining which entries in the array, or which bits, will be ones and which ones will be zero. This must be done in such a way that the semantic characteristics of the data are captured in its encoding. In this way, encodings for semantically similar data will have a larger number of overlapping bits than dissimilar ones.
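As a rough illustration (not from the paper), the hue example from the original question can be sketched as a tiny cyclic encoder. The names `encode_hue` and `overlap` and the choice to measure hue in degrees are assumptions made for this sketch; the 100-bit array with a 12-bit wrapping block comes from the question.

```python
def encode_hue(hue_degrees, n_bits=100, n_active=12):
    """Encode a hue angle as a binary list with one wrapping block of ones.

    Hypothetical helper: n_bits and n_active follow the sizes from the
    question (100 bits, block of 12); hue in degrees is an assumption.
    """
    # Map the angle onto a starting position in the array.
    start = int((hue_degrees % 360) / 360 * n_bits)
    bits = [0] * n_bits
    for offset in range(n_active):
        # The modulo makes the block wrap around to the start of the array.
        bits[(start + offset) % n_bits] = 1
    return bits


def overlap(a, b):
    """Count positions where both encodings have an active bit."""
    return sum(x & y for x, y in zip(a, b))


if __name__ == "__main__":
    red = encode_hue(0)
    almost_red = encode_hue(358)   # close to red on the color wheel
    cyan = encode_hue(180)         # opposite side of the color wheel
    print("red vs almost red:", overlap(red, almost_red))
    print("red vs cyan:", overlap(red, cyan))
```

Nearby hues share most of their active bits and distant hues share none, which is the "semantically similar inputs overlap" property the paragraph describes. An array like this can be handed to the Spatial Pooler as is.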

Each encoded pattern xt is transformed into an SDR representing a set of neurons arranged in columns by the spatial pooler. For the purpose of this component, all neurons in a column are treated as a unit since it is assumed that all neurons within a column detect identical feedforward input patterns. Similarly to the input binary array, each column can be either active or inactive. Furthermore, the resulting SDR maintains the semantic properties of the input array; that is, SDRs of input arrays that are semantically dissimilar will also be dissimilar and vice versa. This is achieved by assigning each neuron a set of potential connections (or synapses) to a random subset of the input array. Each potential synapse has a permanence value associated with it, and only when this permanence value is greater than a threshold is the synapse said to be connected. A neuron is then said to be active in the output SDR if the number of connected synapses to active entries in the input array is greater than a threshold. The learning process in the spatial pooler consists of adjusting these permanence values so that the model learns to represent spatial properties of the input data using SDRs in a way in which the semantic properties of the input vector are maintained.
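To make that description more concrete, here is a minimal sketch (not the paper's or any library's implementation) of the mechanism as described: random potential synapses, a permanence threshold for "connected", and an overlap threshold for "active". All function names, parameter values, and the simple increment/decrement learning rule are assumptions chosen for illustration.

```python
import random


def make_columns(n_columns, n_inputs, potential_fraction=0.5, seed=0):
    """Give each column potential synapses to a random subset of the input.

    Each column is a dict mapping input index -> permanence in [0, 1].
    """
    rng = random.Random(seed)
    n_potential = int(n_inputs * potential_fraction)
    columns = []
    for _ in range(n_columns):
        potential = rng.sample(range(n_inputs), n_potential)
        columns.append({i: rng.uniform(0.0, 1.0) for i in potential})
    return columns


def compute_active_columns(columns, input_bits, connected_perm=0.5,
                           overlap_threshold=3):
    """Return indices of columns whose overlap exceeds the threshold."""
    active = []
    for col_index, synapses in enumerate(columns):
        # Overlap = connected synapses that point at active input bits.
        overlap = sum(1 for i, perm in synapses.items()
                      if perm > connected_perm and input_bits[i])
        if overlap > overlap_threshold:
            active.append(col_index)
    return active


def learn(columns, active_columns, input_bits, increment=0.05, decrement=0.02):
    """Assumed Hebbian-style update, applied to active columns only:
    strengthen synapses to active inputs, weaken synapses to inactive ones."""
    for col_index in active_columns:
        synapses = columns[col_index]
        for i in synapses:
            if input_bits[i]:
                synapses[i] = min(1.0, synapses[i] + increment)
            else:
                synapses[i] = max(0.0, synapses[i] - decrement)


if __name__ == "__main__":
    columns = make_columns(n_columns=50, n_inputs=100)
    input_bits = [1] * 12 + [0] * 88   # e.g. an encoder output
    active = compute_active_columns(columns, input_bits, overlap_threshold=1)
    learn(columns, active, input_bits)
    print(len(active), "columns active")
```

Note that a plain threshold like this does not by itself guarantee a fixed output sparsity; the sketch only follows the description in the quoted paragraph.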

The SDR produced by the spatial pooler, a(xt), is fed as input to the temporal memory. In this case, individual cells or neurons within a column are differentiated upon as at any given point in time, a subset of neurons in the active columns will be used to represent the temporal context of the current spatial pattern. Neurons within a column can be in three different states: active, inactive, or predictive. Neurons in a predictive state become active in the next time step if they receive sufficient feedforward input, that is, if their column becomes active. In this way, neurons in such a state are anticipating that they will be active in the next time step, and hence are predicting what the next input may be. The output of the temporal memory, π(xt), can then be interpreted as a prediction for a(xt+1), that is, columns that could potentially be active in the next time step.
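The state transition described above can be sketched roughly as follows. This is an illustrative simplification, not the paper's algorithm: the names are hypothetical, cells are represented as `(column, cell)` tuples, and the fallback of activating every cell in a column that had no predictive cells (usually called bursting in HTM material) is not stated in the quoted text and is included here as an assumption.

```python
def activate_cells(active_columns, predictive_cells, cells_per_column=4):
    """Decide which cells become active given the currently active columns.

    predictive_cells is a set of (column, cell) tuples that were in the
    predictive state at the previous time step.
    """
    active_cells = set()
    for col in active_columns:
        predicted = {(col, c) for c in range(cells_per_column)
                     if (col, c) in predictive_cells}
        if predicted:
            # Predictive cells whose column became active turn active.
            active_cells |= predicted
        else:
            # Assumption: an unpredicted column activates all of its cells.
            active_cells |= {(col, c) for c in range(cells_per_column)}
    return active_cells


def predicted_columns(predictive_cells):
    """Columns containing at least one predictive cell, i.e. the columns
    that could potentially be active at the next time step."""
    return {col for col, _ in predictive_cells}


if __name__ == "__main__":
    predictive = {(1, 0), (3, 2)}
    print(sorted(activate_cells([1, 2], predictive)))
    print(sorted(predicted_columns(predictive)))
```

Column 1 was predicted, so only its predictive cell fires; column 2 was not, so under the stated assumption all of its cells fire; column 3 was predicted but never became active, so nothing in it fires.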


I had a similar question before. Some of the confusion comes from SDRs being described inconsistently in parts of this forum. Conceptually, an SDR is just what the name says, a Sparse Distributed Representation; you can Google the term and you’ll find consistent definitions of it.

In terms of implementation, the Spatial Pooler (SP) accepts an input SDR, which is the encoded value of the raw input. The SP then outputs an SDR, which is basically the set of columns activated at a given SP step, and this can be the input to another HTM component (e.g. Temporal Memory).

Yes, and I don’t think there are strict rules for encoding data, but there are guidelines about maintaining sparsity and semantic meaning. And of course the encoding has to fit the desired input dimensions.


Encoders do not need SDRs for input. Sensory input is also not sparse. Encoders must output SDRs though.


So you get your data, you “encode” it into a sparse representation, and then the Spatial Pooler essentially turns it into another SDR? I guess I’m having a hard time understanding why you need to encode the data into a sparse representation. It seems the Spatial Pooler does this just fine.


Because HTM can’t handle data types other than binary vectors, and those vectors would have vague meanings if they were too dense.


So with the Spatial Pooler a good density is 2%. Could the encoded vector be something like 20% dense, or does it need to be around the 2% mark as well to be useful? The HTM School videos (lesson 7) showed some encoded dates and times from the gym power usage example, and they looked around 40% dense. Is this the data that’s being passed to the Spatial Pooler, or is the encoding another step in between that isn’t shown in that video?


An SDR that an encoder outputs can indeed be quite dense, and its density can vary a little.
But the spatial pooler turns it into another SDR that has a fixed and very low density (typically about 2%) so the temporal memory can process it.
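One common way to get that fixed output density is to keep only the columns with the highest overlap scores instead of applying a plain threshold. The sketch below shows the idea under that assumption; `density` and `top_k_columns` are hypothetical helper names and the 2048-column size is only an example.

```python
def density(bits):
    """Fraction of bits that are active."""
    return sum(bits) / len(bits)


def top_k_columns(overlaps, target_density=0.02):
    """Activate only the best-scoring columns so the output density is fixed.

    overlaps holds one overlap score per column. The number of winners
    depends only on the column count, not on how dense the input was.
    """
    k = max(1, round(len(overlaps) * target_density))
    ranked = sorted(range(len(overlaps)), key=lambda c: overlaps[c],
                    reverse=True)
    winners = set(ranked[:k])
    return [1 if c in winners else 0 for c in range(len(overlaps))]


if __name__ == "__main__":
    encoder_output = [1] * 40 + [0] * 60       # a fairly dense encoding
    print("encoder density:", density(encoder_output))

    overlaps = [(c * 37) % 101 for c in range(2048)]  # made-up scores
    sp_output = top_k_columns(overlaps)
    print("SP output density:", density(sp_output))
```

Whether the encoder output is 5% or 40% dense, the same number of columns wins, so whatever consumes the Spatial Pooler output always sees roughly the same sparsity.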


Well, thanks very much everyone, that answers my questions. Confusion cured!


You do not have to do this. Encoder input does not have to be sparse.