Application of HTM algorithm

Has anyone tried to use the HTM algorithm to learn language (effectively like a language model)?
Just curious whether anyone has attempted anything in that direction (or in NLP more generally).

Thanks.

1 Like

I have been on the forum for years. I don’t recall seeing anyone who has done this.
One company (Cortical.io) has used some of the principles of HTM to develop a set of document clustering/search tools.

1 Like

Interesting! If you don’t mind me asking, do you see any potential challenges with pursuing research in that direction? I have been quite interested in HTM theory and am exploring whether it could be a viable research direction.

One issue that comes to mind is how to learn encodings or SDRs for words and tokens. Additionally, since HTM is fundamentally a sequential learning system, I wonder whether that might limit opportunities for parallelization (although this is a secondary issue in my opinion).
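Just to make the encoding concern concrete, here is a minimal Python sketch (not tied to any HTM library; the function names, SDR size, and sparsity are only illustrative) of the naive baseline where every token gets a fixed random SDR. The problem shows up immediately: related and unrelated tokens overlap equally little, so any semantic similarity would have to come from somewhere else.

```python
import hashlib
import numpy as np

def random_sdr(token, size=2048, active_bits=40):
    """Naive baseline: give each token a fixed, reproducible random SDR."""
    seed = int(hashlib.sha256(token.encode()).hexdigest(), 16) % (2**32)
    rng = np.random.default_rng(seed)
    sdr = np.zeros(size, dtype=np.uint8)
    sdr[rng.choice(size, active_bits, replace=False)] = 1
    return sdr

def overlap(a, b):
    """SDR similarity as the count of shared active bits."""
    return int(np.sum(a & b))

print(overlap(random_sdr("dog"), random_sdr("dog")))    # 40: same token, full overlap
print(overlap(random_sdr("dog"), random_sdr("puppy")))  # ~0-2: related words, no overlap
```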

Thanks for the response.

1 Like

The framework required to make an HTM-like thing do what an LLM does would be extensive. A single module? No; a large network of modules, maybe.

1 Like

Also, the existing LLMs leverage GPUs. Not sure if HTM works well on GPUs. So we would need lots of CPUs. Perhaps the EPYC CPUs with lots of cores?

I think the HTM algorithm itself would have to be altered to work nicely for language models.

Doing the SDR encodings will be a challenge. Similarities between words/tokens are not straightforward to define, and getting that right will be critical. The similarity between any two tokens is multidimensional, and somehow that will have to be captured.
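One way to picture capturing that multidimensional similarity in bits: assume you already have dense word vectors from some existing embedding model, project them into a large binary space, and keep only the top-k units active, so that nearby embeddings share active bits. This is only an illustration of the idea (the random projection and the parameters are my own choices, not Cortical.io’s method or an established HTM encoder):

```python
import numpy as np

def embedding_to_sdr(vec, size=2048, active_bits=40, seed=0):
    """Turn a dense embedding into an SDR via random projection + top-k thresholding.

    Embeddings that are close in the dense space tend to win an overlapping
    set of units, so bit overlap loosely tracks the original similarity.
    """
    rng = np.random.default_rng(seed)              # fixed seed -> same projection every call
    projection = rng.standard_normal((size, len(vec)))
    scores = projection @ np.asarray(vec)          # project into the SDR space
    winners = np.argsort(scores)[-active_bits:]    # keep only the top-k units active
    sdr = np.zeros(size, dtype=np.uint8)
    sdr[winners] = 1
    return sdr
```

With something along those lines, two tokens whose embeddings are close should share a noticeably larger fraction of their active bits than two unrelated tokens, which is the overlap property an HTM layer would need downstream.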

There is a possibility to make this more efficient than the current LLM approaches. But that will require a lot of work, I think.

1 Like

Cool, thanks @flajann2, that makes sense. Adding to your point on embeddings, I suppose a proper learning system should eventually be able to learn the embeddings as well (as opposed to a manually fixed SDR representation for each token, as in HTM)?

2 Likes

That’s a thought. Basically, the entire approach with SDRs is that similar things overlap. What constitutes a “similar token” in this regard? And can those similarities themselves be learned by a machine? It would be a daunting task to have to hand-craft that for every word in every language!

It just occurred to me that rather than dealing with tokenized words directly, we should instead have a meta-tokenization system.

Thus, the words “stop” in English, “arrête” in French, “halt” in German (also a cognate in English!), and “pare” in Spanish would all map to the same token, say ((stop)).

Now, a word like “dog”?
“Hund” in German
“perro / perra” in Spanish
“chien” in French

At first, one might want to map it to the ((dog)) token. However, that word is very nuanced in different languages. Calling someone a “perra” in Spanish is a very deep and derogatory insult.

So now we have to deal with context: perhaps the ((dog)) token in normal usage, and in the Spanish case, depending on context, ((prostitute))!

So we can have our set of meta-tokens, each token having exactly one meaning, and simply have the assignments made based on context and language.
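Just to pin the idea down, here is a toy sketch of that assignment step (the lexicon entries and the “insult cue” check are entirely made up; a real system would need an actual context model, not a word list):

```python
# Toy meta-token lexicon: each (language, surface form) maps to one meta-token.
LEXICON = {
    ("en", "stop"): "((stop))", ("fr", "arrête"): "((stop))",
    ("de", "halt"): "((stop))", ("es", "pare"):  "((stop))",
    ("en", "dog"):  "((dog))",  ("de", "hund"):  "((dog))",
    ("fr", "chien"): "((dog))",
}

def meta_token(language, word, context=""):
    """Map a word to its meta-token, using context only where a word is ambiguous."""
    key = (language, word.lower())
    if key == ("es", "perra"):
        # Stand-in for real context handling: a couple of made-up insult cues.
        if {"eres", "maldita"} & set(context.lower().split()):
            return "((prostitute))"
        return "((dog))"
    return LEXICON.get(key, f"(({word.lower()}))")

print(meta_token("de", "Hund"))                              # ((dog))
print(meta_token("es", "perra", context="mi perra corre"))   # ((dog))
print(meta_token("es", "perra", context="eres una perra"))   # ((prostitute))
```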

Obviously, going back from tokens to the human languages should not be hard at all, and to boot, you can – more or less – do it with any language. Of course, something has to deal with the differences in grammar, etc.

So the SDR representation of the meta-tokens becomes a lot more straightforward, since one does not have to encode the nuances related to context.

Just my two cents.

2 Likes