See HTM School for an introduction to several types of encoders in these two videos:
Encoder Resources
- Encoding Data for HTM Systems
- Geospatial Coordinate Encoder
- Random Distributed Scalar Encoder
- Beginner’s Guide to NuPIC
When creating an encoder, the main focus is to make sure the semantics of the particular data type are captured. Specifically, this means that similar values should have a lot of overlapping 1’s in their encodings while values that are not similar should have very few or none.
Current Encoders (see code)
- Scalar
- Adaptive Scalar
- Category
- Date
- Coordinate
- Geospatial Coordinate
- Delta (derived encoder for scalars)
- Log
- Multi - Takes multiple values and creates a combined SDR from multiple other encoders.
- Non-uniform scalar
- PassThru (Identity)
Scalar Encoders
Our scalar encoders use a few parameters to determine the encoding for a given value. There is a minval and a maxval that determine the absolute value range. Then there is a number of bits, n, and a width, w. The encoding will have n total bits with w on bits (1’s). Values are put into buckets. There are (n-w)+1 buckets that each represent an equally-sized value range between minval and maxval. The smallest bucket is represented with the first w bits on and the rest off. The next larger bucket is represented by shifting the on bits to the right by one position. In this way, adjacent buckets have the most overlap, which helps to capture the semantics of scalar values.
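The bucket scheme described above can be sketched in a few lines of Python. This is a minimal illustration of the simplified description on this page, not NuPIC's actual ScalarEncoder; the names encode_scalar and to_string are hypothetical, and rejecting out-of-range values and placing maxval in the last bucket are assumptions.

```python
def encode_scalar(value, minval, maxval, n, w):
    """Return a list of n bits containing w contiguous on bits."""
    if not minval <= value <= maxval:
        # Assumption: out-of-range inputs are rejected rather than clipped.
        raise ValueError("value must lie within [minval, maxval]")
    num_buckets = (n - w) + 1
    # Map the value to an equally sized bucket between minval and maxval.
    bucket = int((value - minval) * num_buckets / (maxval - minval))
    # Assumption: maxval itself falls into the last bucket.
    bucket = min(bucket, num_buckets - 1)
    bits = [0] * n
    # Bucket 0 turns on the first w bits; each later bucket shifts right by one.
    bits[bucket:bucket + w] = [1] * w
    return bits


def to_string(bits):
    """Render a bit list as a string of 0s and 1s."""
    return "".join(str(b) for b in bits)


print(to_string(encode_scalar(50, 0, 100, n=12, w=3)))  # 000001110000
```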
Example: A scalar encoder with a range from 0 to 100 with n=12 and w=3 will produce the following encodings:
- 1 becomes 111000000000
- 7 becomes 111000000000
- 15 becomes 011100000000
- 36 becomes 000111000000
The first thing to note is that values that fall into the same bucket are represented identically, as you can see with 1 and 7. For values that fall into separate buckets, however, the closest buckets share the most overlapping bits. For instance, there are two overlapping bits for 7 and 15 but only one for 15 and 36. And there aren’t any for 7 and 36.
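The overlap counts above can be checked directly. The snippet below is a small sketch that counts the positions where two encodings both have an on bit; the helper name overlap is hypothetical, and the bit strings are copied from the example.

```python
def overlap(a, b):
    """Count positions where both bit strings have a 1."""
    return sum(x == "1" and y == "1" for x, y in zip(a, b))


# Encodings from the example (range 0 to 100, n=12, w=3).
enc_7 = "111000000000"
enc_15 = "011100000000"
enc_36 = "000111000000"

print(overlap(enc_7, enc_15))   # 2
print(overlap(enc_15, enc_36))  # 1
print(overlap(enc_7, enc_36))   # 0
```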
Adaptive Scalar Encoder
This encoder is identical to the scalar encoder except that it will increase maxval if it sees a larger value and decrease minval if it sees a smaller value. In our implementation, the spatial pooler's connections to the input bits were not updated when the range changed, so spatial patterns that had already been learned became out of date as the min and max values shifted. As such, we recommend using the regular scalar encoder with a fixed min/max range.
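To see why a moving range causes trouble, consider the following sketch. It assumes the simplest possible adaptation rule (widen the range whenever a value falls outside it) and the bucket scheme from the scalar encoder section; the class name AdaptiveScalarSketch is hypothetical, and this is not NuPIC's actual adaptive encoder.

```python
class AdaptiveScalarSketch:
    """Toy scalar encoder whose range widens to fit every value it sees."""

    def __init__(self, minval, maxval, n, w):
        if not minval < maxval:
            raise ValueError("minval must be smaller than maxval")
        self.minval, self.maxval = minval, maxval
        self.n, self.w = n, w

    def encode(self, value):
        # Assumption: the range simply grows to include any new extreme.
        self.minval = min(self.minval, value)
        self.maxval = max(self.maxval, value)
        num_buckets = (self.n - self.w) + 1
        span = self.maxval - self.minval
        bucket = int((value - self.minval) * num_buckets / span)
        bucket = min(bucket, num_buckets - 1)
        return "0" * bucket + "1" * self.w + "0" * (self.n - self.w - bucket)


enc = AdaptiveScalarSketch(0, 100, n=12, w=3)
before = enc.encode(36)  # range is still 0 to 100
enc.encode(200)          # range widens to 0 to 200
after = enc.encode(36)   # same value, now a different bucket

print(before)  # 000111000000
print(after)   # 011100000000
```

The same input value maps to different bits once the range has changed, so anything downstream that learned the earlier pattern no longer recognizes it.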
Coordinate Encoder and Geospatial Coordinate Encoder
The Geospatial Coordinate Encoder (GCE) converts a GPS position to an SDR. It has the following desired properties:
- Positions spatially close together have overlapping bits in the encoding.
- When moving at low speeds, the resolution of movement is finer, and when moving at high speeds, it is coarser. Thus, at higher speeds, bigger movements still retain overlapping bits.
- It works anywhere in the world; in fact, it works for an infinitely large space.
The Coordinate Encoder (CE) is a generalization of this. In fact, the GCE is implemented as a subclass of the CE.
While the GCE takes (latitude, longitude, speed), the CE takes (coordinates, radius). Here, coordinates determines the position to be encoded and can be in any number of dimensions.
See this video and the associated slides for more details on these encoders and how they work.
Identity / Pass through Encoder
This encoder takes an SDR input and outputs the SDR without changes. It is useful when your preprocessing already creates an SDR or when you are experimenting with a new encoder scheme.