HTMs as Serial Associative Memories

Hi all!

First of all I’d like to say how excited I am to have found Numenta and HTMs! After digging into the theory and learning resources I think it is very likely to be exactly what I am looking for!

I have been working on a personal research project for some years now. I am investigating computational structures that act as exact and deterministic mapping functions between stimulus-product pairs. A mechanically motivated example would be a robot with one optical sensor and a speaker: when the robot sees the color red it might make a soft beep, and when it sees the color green it might make a loud beep. One might immediately think this is a very simple algorithm to write, which is true. However, the data I am interested in processing is high-dimensional. In this case the robot might see a 2D grid of colors and would need to output a high-dimensional point cloud by association. The key, though, is that the mapping between the lower-fidelity data and the higher-fidelity data needs to be exact, just as a hashmap would behave. But instead of a hashmap, I am interested in a flexible, trainable computational unit that is capable of producing an exact mapping.

Here is a very simplistic example:

Let’s say we have a rank 2 tensor of arbitrarily-valued scalars S where S = [[1],[2],[3]].
Let us then say that we have a rank 2 tensor of arbitrarily-valued binary vectors P where P = [[1,1,0],[0,0,1],[0,1,0]].

The goal is to generate a mapping function M such that each element at index I in S is perfectly mapped to the binary vector at the same index I in P:

S[0] M P[0] would be [1] -> [1,1,0]
S[1] M P[1] would be [2] -> [0,0,1]
S[2] M P[2] would be [3] -> [0,1,0]
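
To make the hashmap analogy concrete, here is a rough Python sketch (illustrative values only) of the exact behaviour I am after:

```python
# Toy sketch of the exact, hashmap-like mapping M described above.
# Keys are the stimulus rows of S (as tuples, so they are hashable),
# values are the associated binary rows of P.
S = [(1,), (2,), (3,)]
P = [(1, 1, 0), (0, 0, 1), (0, 1, 0)]

M = {s: p for s, p in zip(S, P)}

assert M[(1,)] == (1, 1, 0)
assert M[(2,)] == (0, 0, 1)
assert M[(3,)] == (0, 1, 0)
```

A plain dict obviously does this trivially; the question is whether a trainable structure can reproduce the same exactness.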

Up until this point I have been trying to use traditional feed-forward neural networks to act as this mapping function M, training them on floating-point scalars. I’m able to obtain good results when the tensors are very small. However, once I scale the data to a decent size the accuracy drops off. I believe this is because perfectly curve-fitting a high-dimensional function is inherently intractable. Unfortunately, my use case requires that I map large, high-dimensional datasets together, so I am looking for a new model of computation to fit my use case. That’s when I stumbled upon HTMs!
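
For reference, the kind of feed-forward setup I have been experimenting with looks roughly like this (a minimal PyTorch sketch with arbitrary layer sizes and hyperparameters, not my actual code):

```python
import torch
import torch.nn as nn

# Minimal feed-forward mapper from the scalar inputs S to the binary targets P.
# Layer sizes, learning rate, and epoch count are arbitrary illustration values.
S = torch.tensor([[1.0], [2.0], [3.0]])
P = torch.tensor([[1.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]])

model = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 3))
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.BCEWithLogitsLoss()

for _ in range(2000):
    optimizer.zero_grad()
    loss = loss_fn(model(S), P)
    loss.backward()
    optimizer.step()

# Thresholded predictions match P at this toy scale; the exactness is what
# degrades for me once S and P grow large and high-dimensional.
print((torch.sigmoid(model(S)) > 0.5).int())
```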

I have been slowly getting through the material, specifically watching all of the HTM School videos. The volume of information is a bit daunting, and I am struggling to determine whether HTMs will suit my use case without spending days and days going through the material. So I thought I would post to present my potential use case and have a conversation about it with the experts!

Based on my example use case, do you think HTMs would be a good model for me to pursue?

2 Likes

*Disclaimer: not an expert, just a hobbyist explorer of HTMs and cortical operational theory.

HTM excels at two things:

  1. Continuously learning patterns and producing sparse distributed representations (SDRs) for those patterns
  2. Assuming the data is sequential in nature (such as path data, streaming data, etc.), remembering sequences.

An added benefit, which is then exploited, is detecting when a pattern is NOT recognized, and to what degree, in order to flag anomalies.

It is composed of two parts:

  1. Spatial pooler
  2. Temporal memory

The spatial pooler works to learn patterns of input from the ‘input space’. As minicolumns within the pool learn patterns, they begin to uniquely identify semantically similar and dissimilar input.
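
Very roughly, the core step looks something like this toy numpy sketch (illustrative sizes, no boosting or permanence learning, so not the real SP):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy spatial-pooler-like step: each of 128 minicolumns samples a random
# subset of a 256-bit input space, overlaps with the current input are
# counted, and the top ~2% of columns become the active SDR.
input_bits, num_columns, sparsity = 256, 128, 0.02
potential = rng.random((num_columns, input_bits)) < 0.5   # bits each column can see

x = np.zeros(input_bits, dtype=bool)
x[rng.choice(input_bits, 40, replace=False)] = True       # an input pattern

overlaps = (potential & x).sum(axis=1)                     # overlap score per column
k = max(1, int(sparsity * num_columns))
active_columns = np.argsort(overlaps)[-k:]                 # winner-take-all: top-k columns

print(sorted(int(c) for c in active_columns))
```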

The temporal memory then sits on top of (or in parallel to) the spatial pooler, using cells within each of the minicolumns to create mappings (and strengths) between different timesteps, so that over time it can predict which spatial pattern will come next.
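
As a crude analogy only (the real temporal memory uses per-cell distal segments and is higher-order; this pure-Python sketch only captures first-order pattern-to-next-pattern transitions):

```python
# Crude first-order analogue of "predict which spatial pattern comes next":
# remember which pattern followed which, then look the transition up later.
transitions = {}

def learn_step(prev_sdr, next_sdr):
    transitions[frozenset(prev_sdr)] = frozenset(next_sdr)

def predict(prev_sdr):
    return transitions.get(frozenset(prev_sdr), frozenset())

learn_step({3, 17, 42}, {5, 19, 88})
print(predict({3, 17, 42}))   # the pattern learned to follow {3, 17, 42}
```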

It is completely possible to use the spatial pooler without the temporal memory component, and several companies have done this. Cortical.io, for example, encodes word embeddings into their input spaces so that they can even find overlap between word and concept meanings.

So to answer whether or not HTMs are meant for you, you have to know if your data fits this style of system, or can be encoded to fit. If you want the mapping to be perfect, you may have to overtrain at first, then disable learning so that SDRs stop changing over time, and instead the system just becomes a multi-dimensional linked lookup table that outputs SDRs.
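
Concretely, that "frozen" end state would behave something like this toy sketch (SDRs shown as sets of active column indices, all names and numbers illustrative):

```python
# Toy sketch of "disable learning, then use it as a lookup table": stored SDRs
# map to output SDRs, and recall picks the stored key with the largest overlap,
# so a few noisy bits still resolve to the right entry.
memory = {
    frozenset({2, 40, 77, 130}): frozenset({5, 13}),
    frozenset({9, 44, 91, 200}): frozenset({1, 60}),
}

def recall(input_sdr):
    # Winner-take-all over overlap with the stored keys.
    best_key = max(memory, key=lambda key: len(key & input_sdr))
    return memory[best_key]

print(recall({2, 40, 77, 999}))   # the closest stored key wins despite one wrong bit
```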

That’s my take, but I’ll let others opine as well.

3 Likes

Quick correction: cortical.io does not use the SP to create their word fingerprints, though you could say the fingerprints have already been spatially pooled. They do it in a different way. But you can take those word SDRs and use them as nodes in sequences, and that works really well with TM.

3 Likes

In my opinion, this doesn’t sound like the best fit for HTM. The HTM algorithms mainly apply to sparse, noisy, ambiguous data (though dense input data can be sparsified, as long as it has encoded semantics). Your use case, on the other hand, appears to be inherently dense, noise-intolerant, and deterministic, if I understand it correctly. Where the HTM algorithms shine is when they can rely on the properties of sparsity to sample a few random bits in potentially error-prone and noisy environments and still determine whether or not a given pattern (or one that is semantically similar) is present.
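
A quick back-of-the-envelope illustration of that sparsity property (toy numbers, nothing HTM-specific):

```python
import numpy as np

rng = np.random.default_rng(1)

# A 2048-bit SDR with 40 active bits still overlaps heavily with a corrupted
# copy of itself, while an unrelated pattern of the same sparsity barely
# overlaps at all -- which is what makes sampling a few bits so robust.
n, w = 2048, 40
pattern = set(int(i) for i in rng.choice(n, w, replace=False))

noisy = set(pattern)
for bit in list(pattern)[:10]:                # corrupt a quarter of the active bits
    noisy.discard(bit)
    noisy.add(int(rng.integers(n)))

unrelated = set(int(i) for i in rng.choice(n, w, replace=False))

print(len(pattern & noisy))       # ~30: still clearly "the same" pattern
print(len(pattern & unrelated))   # ~0-2: clearly a different pattern
```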

4 Likes

I’m not sure this would help… but what I do for mapping is build Numenta-style segmented neurons, where every segment reacts to a different SDR pattern.

Have, say, 2000 of them…
Memorize the mapping by filling the segments… for every output bit,
then predict by winner-take-all on overlap… based on the input SDR.

You can visualize it for yourself as a 3D array: X = input segments, Y = output bits, Z = SDR patterns.
I switched to indexed SDRs, rather than binary, to save memory and speed.
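
Something like this rough pure-Python sketch (sizes and names are just for illustration):

```python
# Each output bit owns a list of "segments"; each segment stores one input SDR
# (as a set of active indices) that should turn that output bit on. Prediction
# scores every output bit by its best overlap with the current input, then
# takes the top-k winners.
num_output_bits = 16
segments = [[] for _ in range(num_output_bits)]

def memorize(input_sdr, output_bits):
    for y in output_bits:
        segments[y].append(frozenset(input_sdr))   # fill one segment per stored pattern

def predict(input_sdr, k=3):
    scores = {y: max((len(seg & input_sdr) for seg in segs), default=0)
              for y, segs in enumerate(segments)}
    return set(sorted(scores, key=scores.get, reverse=True)[:k])

memorize({3, 17, 42, 60}, {0, 5, 9})
print(predict({3, 17, 42, 99}))   # recovers {0, 5, 9} despite one noisy bit
```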

BTW… NNs do not work for distributed patterns.

1 Like