Esperanto NLP using HTM and my findings

Hi all.
I have been messing with NLP using Temporal Memory on the weekends. I coded up a simple program and made it learn the structure of Esperanto sentences. (I’m testing with Esperanto because the language itself is regular and has obvious features.)

And my results are… not surprising. HTM isn’t the most powerful ML model yet; it performs close to, or slightly better than, an RNN. Yet it is still powerful in the sense that there is no need to backprop through the entire sequence, which possibly makes it suitable for long sequences.

This is my blog post detailing the results.

BTW, I guess it is not really NLP since Esperanto is not a natural language. :stuck_out_tongue:

10 Likes

Well… I did some more experiments. And now HTM is learning way better.
The changes boil down to the following.

  1. Sample from the predicted distribution. Always taking the character with the highest predicted probability causes the same words to repeat (see the sketch after this list).
  2. Add some noise, if not using an SP with boosting, to break symmetry inside the TM.
  3. Better hyperparameters
  4. Better dataset
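
Here is a minimal sketch of what point 1 could look like in code, assuming you already have one non-negative score per character (for example, how strongly the TM predicts each character’s bucket). The names `scores`, `alphabet`, and `temperature` are placeholders, not from the actual project:

```python
import numpy as np

def sample_next_char(scores, alphabet, temperature=1.0, rng=None):
    """Pick the next character by sampling instead of taking the argmax.

    `scores` holds one non-negative value per entry of `alphabet`.
    Always taking the argmax tends to loop on the same few words;
    sampling keeps the generated text varied.
    """
    rng = rng or np.random.default_rng()
    scores = np.asarray(scores, dtype=float) + 1e-9   # avoid an all-zero vector
    probs = scores ** (1.0 / temperature)             # <1 sharpens, >1 flattens
    probs /= probs.sum()
    return alphabet[rng.choice(len(alphabet), p=probs)]
```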

One key observation is that setPredictedSegmentDecrement helps the HTM learn the structure and memorize common words. The TM performs much worse without it.

Now the algorithm is outputting something like this:

Tis al liajlej vo ka pronis altaŭ pli kstaŭ pro kortonis al laĉanto. Kun li ĵeokonis laĉanton si sukcesi ĉicestrigis al la turo korpofteno pren de alto. I nto en simirviro korpor blaj multo.Amirĥ.Sensonis en diren li s en sen sen linokorpoforto. Zohomalprektis laĉofacili bom .En simeno.Kiameno.Ake i aŭ pren lino ke iajŭen direktis lino.Kiamalpaiano pron larĝa dustrigaĥĥhetrige is en la arbekun si

It is very clear now that the HTM is learning the construction of Esperanto words. Most words end with a, e, o, i, as, is, or n. It has learned that there should be a space after a period (though I don’t know why it doesn’t predict that in some cases). And it has even memorized some common words like la, li, aŭ and ke.

Again, source code is available on GitHub.

I think the result is comparable to the first experiment in this post. Am I getting somewhere?

7 Likes

Fascinating work @marty1885! It’s interesting to see your findings there.

Do you mean that it does better with decrementing incorrectly predicted segments than without doing it at all? Did you try different increment/decrement ratios? Also, are you predicting one character at a time using a category encoder for each different character? Great work.

Thank you.


Yes, decrementing incorrectly predicted segments leads to a better result. Without it, the HTM generates really long words.

Ŝaj buŝondajne utis cer lcesterage onton senta korpoformomeĝon.Kio.Lcilmomes junidis arbaimilludis oko arigecer lmfŭtod.Sehomzor pripeĉo.Nitriis da vaĵvaspŝisonis ontigraĵo irontigaicaj le zormome.Niĥĉodkiunnbrue ĝustor pol eblkceskjo.Lbuŝocestagratis bruoe. Kor ie adriis bruo.Eovosoni detaŭdidis ĉi ta lhealisoj skfacilmomege ĉefe or policajn ĝis.Irtimilsekmirgisrorpoforipeŝoblisvestetorigeĥkunuli


Setting it too high leads to repetition and only generating short/common words.

Sen d sen dla l li dro kaj la . Ciloŭ noŝahon siro kajn peĝiris la brtupeĝone ajn pren dis la ŝovo dris la ce aze li ekto li plen dla sen du feta . En dulon pris li lovo li sta moboal li zeŝamen dren dŝiro kaj la o kie aranoĵola to li brj la aa sen dhone aon sen dblgatubono. Pris la ĉi brfacadiro kaj li ĵoĵohonoĝacaŝoĥen dĵoen dze kien dhone aa done anois la sta pidiris li ĥvesiro kajn sen dhon


Yes, I found that the basic rule is that the increment should be slightly lower than the decrement, e.g. increment=0.03, decrement=0.032. And yes, I encode each character using a category encoder, so the model is a character-level one.
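
For readers who want to picture the character-level category encoder, here is a minimal sketch of the idea, not the actual implementation: each distinct character gets its own non-overlapping block of active bits, so no two characters share any semantics. The class name and bit counts are illustrative only.

```python
import numpy as np

class CharCategoryEncoder:
    """Toy category encoder: one non-overlapping block of bits per character."""

    def __init__(self, alphabet, bits_per_char=16):
        self.alphabet = list(alphabet)
        self.bits_per_char = bits_per_char
        self.size = len(self.alphabet) * bits_per_char
        self.index = {c: i for i, c in enumerate(self.alphabet)}

    def encode(self, char):
        sdr = np.zeros(self.size, dtype=bool)
        start = self.index[char] * self.bits_per_char
        sdr[start:start + self.bits_per_char] = True   # contiguous active block
        return sdr

# enc = CharCategoryEncoder("abcdefghijklmnoprstuvzĉĝĥĵŝŭ. ,")
# sdr = enc.encode("ĉ")   # bits_per_char active bits out of enc.size
```

Here `bits_per_char` is an arbitrary choice; with a pure category encoder there is no semantic overlap between characters to tune.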

1 Like

Very interesting that the decrement should be even higher than the increment value; I wouldn’t have guessed that, but it does make sense now that you describe it. I wonder if this could be slightly adjusted to do next-word prediction, like when you text and it predicts a few possible next words (some of which usually make no sense). Thank you again for sharing!

1 Like

Next-word prediction is certainly doable. But as I pointed out in another post, the TM does not give me any sense of probability, which makes predicting the next character a lot less accurate.
https://discourse.numenta.org/t/possiblity-of-global-inhibition-in-temporal-memory/
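
For what it’s worth, here is one heuristic sketch of how a pseudo-distribution could be read out of the TM even though it does not produce probabilities: count how many predictive columns fall inside each character’s encoder block and normalize the counts. This assumes the column layout mirrors the category encoder above; the function and parameter names are hypothetical.

```python
import numpy as np

def pseudo_distribution(predictive_columns, num_chars, cols_per_char):
    """Heuristic next-character 'probabilities' from TM predictive columns.

    Each character owns a contiguous block of `cols_per_char` columns
    (matching the category encoder), so predictive columns falling in a
    block are counted as votes for that character. The TM itself does
    not output probabilities; this is only an approximation.
    """
    votes = np.zeros(num_chars)
    for col in predictive_columns:
        votes[col // cols_per_char] += 1
    if votes.sum() == 0:                      # nothing predicted: fall back to uniform
        return np.full(num_chars, 1.0 / num_chars)
    return votes / votes.sum()
```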

No need! I think sharing projects is one of, if not the, most important ways for the community to grow. :smile:

3 Likes

Mi parolis la esperantan, sed mi ne parolas pli. Mi forgesis, kar mi neniam praktikis, sed en 2003 mi gastigis francan virinon ĉe mia domo en Savanto, Brazilo. Mi ^sin renkontis en liston de Esperanto-parolantoj el grupoj de yahoo. Mi perdis tuŝon sed ĝi estis bonega sperto. Ŝi vojaĝis tra Sud-Ameriko. Mi perdis vian kontakton.
Tio estis bonega sperto.

***** Mi uzis google tradukiston por helpi min skribi ĉi tion.

2 Likes

I spoke Esperanto, but I don’t speak it anymore. I forgot it, since I never practiced, but in 2003 I hosted a French woman at my house in Salvador, Brazil. I met her on a list of Esperanto speakers from Yahoo! groups. I lost touch, but it was a great experience. She traveled through South America. I lost your contact.
That was a great experience.

***** I used a Google translator to help me write this.

********** I used a Google translator to help me translate this.

3 Likes

Ŝi vojaĝis mia domo en merikis sed ĝizilo. Mi pli. Sperto. Ŝi vojaĝis tra sudamerikis sed ĝiĵorgesisrto. Ŝi vojaĝis tra sudamerikis sed ĝituŝon gastigis francancan virinon ĉe mi ne parolas ĉe mia de ahoo. Mia de ahoo. Mi forgesiston.Tio esperantoparolaniam praktikis sed ĝimo ed ĝiniam pli. Forgesisi forgesisĵlistoj de esperantancan virinon.Tio el grupoj de esperantoj de estis virinon.Tio el grupoj de ahoo. Mia domo eniam pli. Jaĝis tra sudamerikis sed ĝiĉe mi pli. Can vian kar mi forgesismia de esperantancancancancan virinon.Tio estis pli. Gis frantoj el grupoj domo en listoj el grupoj

**** What HTM generates after learning from your paragraph.

Nice to meet another Esperanto speaker.
Bone rekonti alian Esperanto parorantan.

2 Likes

(plene mirinda!) Totally wonderful!
(bela laboro, HTM) nice job, HTM!

2 Likes

Whoaa hold on a sec, I just want to fathom what the HTM did here. So @marty1885, you passed @Matheus_Araujo’s paragraph into your HTM letter by letter:

Mi parolis la esperantan, sed mi ne parolas pli. Mi forgesis, kar mi neniam praktikis, sed en 2003 mi gastigis francan virinon ĉe mia domo en Savanto, Brazilo. Mi ^sin renkontis en liston de Esperanto-parolantoj el grupoj de yahoo. Mi perdis tuŝon sed ĝi estis bonega sperto. Ŝi vojaĝis tra Sud-Ameriko. Mi perdis vian kontakton.
Tio estis bonega sperto.
***** Mi uzis google tradukiston por helpi min skribi ĉi tion.

Then your HTM generated the paragraph:

Is that right?

If so I’m curious:

  • once the model was trained on his paragraph, what did you feed in to invoke the predictions of all those characters? or did it start predicting them online?

  • was his paragraph the only data the model had ever seen?

  • does the paragraph generated by the model make logical sense in Esperanto?

I’m no expert in this area at all, just intrigued by your work here. I know there are LSTM systems used to generate content and even stories one character at a time, and it seems a worthy pursuit to compare HTM on that task. I know there are many potential ways to encode natural language into HTM (like Cortical.io’s approach), but this way of treating characters as distinct categories is certainly simpler to understand (if I have it right). I can’t help but wonder how far it could go. Anyways, thanks again for sharing, and I hope I don’t burden you too much with these questions.

That’s the translation of what the HTM said:

Ŝi vojaĝis mia domo en merikis sed ĝizilo. Mi pli. Sperto. Ŝi vojaĝis tra sudamerikis sed ĝiĵorgesisrto. Ŝi vojaĝis tra sudamerikis sed ĝituŝon gastigis francancan virinon ĉe mi ne parolas ĉe mia de ahoo. Mia de ahoo. Mi forgesiston.Tio esperantoparolaniam praktikis sed ĝimo ed ĝiniam pli. Forgesisi forgesisĵlistoj de esperantancan virinon.Tio el grupoj de esperantoj de estis virinon.Tio el grupoj de ahoo. Mia domo eniam pli. Jaĝis tra sudamerikis sed ĝiĉe mi pli. Can vian kar mi forgesismia de esperantancancancancan virinon.Tio estis pli. Gis frantoj el grupoj domo en listoj el grupoj

She traveled my house in merchandise but hooked. Me the more. Experience. She traveled through South America but a song concert. She traveled through South America but a funeral hosted a French woman with me does not talk to me at Yahoo. Ahoo mine I’m a forgettler. An Esperanto speaking person has practiced but it’s just about it. Forgotten recorders of a Esperantist woman. This is a group of Esperantists of a woman. This is Yahoo group. My house never more. It wore through South America but it was more than me. “Can” you because a forgetfulness of a woman of “Esperantancecancancan” woman. That was more. Since the groups of groups are grouped in lists of groups

I used Google Translate to translate it and made some changes based on what I remember of the language I once spoke.

1 Like

:thinking:

3 Likes

I think it has a pejorative meaning in English, doesn’t it?

Thanks for the translation! It makes sense to me that even if the whole sentences don’t make sense, the letter-to-letter transitions almost all do, except for ‘Esperantancecancancan’ I suppose.

I have to marvel at this last line too.

This makes me wonder what it would sound like if the HTM were fed words one at a time, with each word as a different category (surely necessitating some vocab limit). That of course gets more complex because many words are similar or have similar meanings, which gets lost when they’re treated as distinct categories (enter Cortical.io’s Semantic Folding approach). If nothing else, it would further show the general-purpose ability of HTM to learn many transitions from a limited amount of data, regardless of the data type. Thanks again for sharing.
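
To make the word-level idea concrete, here is a rough sketch of what such an encoder might look like, with the vocabulary limit handled by mapping rare words to a shared "unknown" bucket. All names and sizes here are illustrative, not from the original project.

```python
import numpy as np
from collections import Counter

class WordCategoryEncoder:
    """Toy word-level category encoder with a capped vocabulary."""

    def __init__(self, corpus_tokens, vocab_size=2000, bits_per_word=16):
        common = [w for w, _ in Counter(corpus_tokens).most_common(vocab_size)]
        self.vocab = {w: i for i, w in enumerate(common)}
        self.unknown = len(self.vocab)            # shared bucket for out-of-vocab words
        self.bits_per_word = bits_per_word
        self.size = (len(self.vocab) + 1) * bits_per_word

    def encode(self, word):
        idx = self.vocab.get(word, self.unknown)
        sdr = np.zeros(self.size, dtype=bool)
        start = idx * self.bits_per_word
        sdr[start:start + self.bits_per_word] = True
        return sdr

# tokens = "la hundo vidis la katon".split()
# enc = WordCategoryEncoder(tokens, vocab_size=100)
# sdr = enc.encode("hundo")
```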

1 Like

I’m really amazed by it.

Yes and no on the vocabulary limit.
If you programmed the retina at the same time as the sequence generator, like the brain does, then the two units should always be perfectly synchronized.

1 Like

Which units are you referring to, @Bitking?

I assume that he is referring to training two units: a “word generator” (instead of letters) and a word sequence generator (grammar?). There is no reason that the word generator could not be trained with letters as is being done here and now.

1 Like

Oh, I see what you mean now. Thanks! I just don’t have enough knowledge to infer whether you’re right. I think that if we’re treating HTM as analogous to our brains, it should be able to learn from character sequences. But to make it easier for HTM to learn a language, we should use the same process a baby does: learning the words first by listening, and then expressing through the learnt words. We only learn how to read and write a language after we can understand it and speak it.

2 Likes