Can the output of the Encoder be a dense representation?

In my mind, the encoder’s output can be a dense representation; however, the patterns that the SP has learned must be sparse, am I right? In other words, the input space can be dense, and always dense, but the representations learned by the SP always keep about 2% sparsity.
Must the width and n of the input space equal the width and n of the SP?


That is generally correct, but in actual practice, the ideal usage seems to be more like:

  • Input (Most Dense) =>
    • Encoder Output (Medium, ~40% density) =>
      • SpatialPooler Output (Sparse, ~2% density).

I’m pretty sure that space sizes for inputs, encoders, and the SP can all be different. The only sizes that must be equal are the SpatialPooler’s output and the TemporalMemory’s input.
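To make that size relationship concrete, here is a tiny sketch in plain Python. The parameter names and numbers are made up for illustration, not taken from any particular HTM library:

```python
# Hypothetical pipeline dimensions -- names and values are illustrative only.
encoder_output_size = 400          # an encoder can emit any width it likes
sp_column_count = 2048             # SP width is chosen independently of the encoder
tm_column_count = sp_column_count  # TM input width must match the SP's output width

# Only the SP -> TM boundary requires equal sizes:
assert encoder_output_size != sp_column_count  # allowed to differ
assert sp_column_count == tm_column_count      # must match
print(sp_column_count)  # 2048
```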


Is the “Input” referring to what the SP is learning, or to something encoded by an Encoder?
I think the output of an encoder is the input to be learned by the SP, am I right?

Sorry, confusing terms. :) Is this better?

  • Input (Dense) representation is Raw data, like [56.2, 3.456, 9.9, etc.]
  • This raw input is fed into an Encoder.
  • Encoding (Medium) representation is SDR with ~40% density, like [11011011001001].
  • This encoding is fed into SP.
  • Spatial Pooled (Sparse) representation is SDR with ~4% density, [001000010001].
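A minimal end-to-end sketch of those steps in plain Python. The encoder and pooler below are toy stand-ins (a contiguous-block scalar encoder and a random-projection top-k pooler), not a real HTM implementation, and all sizes are illustrative:

```python
import random

def toy_encoder(value, width=20, active=8):
    """Toy scalar encoder: a contiguous block of `active` bits (8/20 = 40% density)."""
    start = int(value) % (width - active + 1)
    bits = [0] * width
    for i in range(start, start + active):
        bits[i] = 1
    return bits

def toy_spatial_pooler(encoding, columns=50, winners=2, seed=42):
    """Toy pooler: random fixed weights per column, keep the top `winners` columns."""
    rng = random.Random(seed)
    weights = [[rng.random() for _ in encoding] for _ in range(columns)]
    overlaps = [sum(w * b for w, b in zip(row, encoding)) for row in weights]
    top = sorted(range(columns), key=lambda c: overlaps[c], reverse=True)[:winners]
    out = [0] * columns
    for c in top:
        out[c] = 1
    return out

encoding = toy_encoder(56.2)           # medium density: 8/20 bits active
pooled = toy_spatial_pooler(encoding)  # sparse: 2/50 bits active
print(sum(encoding) / len(encoding))   # 0.4
print(sum(pooled) / len(pooled))       # 0.04
```

Note that the pooled output has a fixed size and sparsity regardless of the encoding’s width or density, which is the property the real SP provides (the real SP also learns, which this sketch does not).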

Personally I wouldn’t refer to the raw data as a representation. While it is certainly technically correct, I think using this terminology in the context of HTM implies a bit array that is densely populated (possibly the source of @flash59’s confusion).

Sounds like a good starting place. This is an important property to tweak for optimizing your results, though. It really depends on the diversity and semantics of the data that you are encoding.

Some silly examples (that are easy to visualize) to explain what I mean: If you were to have only two possible values that were semantically dissimilar, then optimally you would use 2 bits and 50% density. If you had four possible values that were semantically dissimilar, then 4 bits and 25% density. If those four possible values required 25% semantic similarity with their neighbors, then optimally you would use 13 bits and ~30% density. And so on. So the optimal size and density really depend on the data that you are encoding.
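Those numbers can be checked directly: with 4 active bits out of 13 and a stride of 3, each value shares exactly 1 bit with its neighbor, giving 1/4 = 25% overlap at a density of 4/13 ≈ 31%. A quick sketch (the layout is just one way to arrange the bits):

```python
WIDTH, ACTIVE, STRIDE = 13, 4, 3  # 4 values: the last starts at bit 9, ends at bit 12

def encode(index):
    """Encode value `index` (0-3) as a contiguous block of ACTIVE bits."""
    return set(range(index * STRIDE, index * STRIDE + ACTIVE))

for i in range(3):
    a, b = encode(i), encode(i + 1)
    overlap = len(a & b) / ACTIVE
    print(f"values {i} and {i+1}: overlap = {overlap:.0%}")  # 25% each

print(f"density = {ACTIVE / WIDTH:.0%}")  # 31%
```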

Yep. The job of the SP algorithm is to take whatever size and density the encoded data is in, and output a representation of fixed size and sparsity, while preserving the semantics of the encoding.


I’ve just randomly noticed a similar theme to what we’ve described above but from another part of science:

Analog(continuous) >> sampling >> discrete >> quantization >> digital

Perhaps it’s a slightly helpful analogy, at the least.

Thanks a lot! Maybe this is a kind of abstraction of how the physiological model works.

I agree with you!