Htm.core -- code review proposition ($)

Hey all!

My goal: Get my htm.core wrapper reviewed by the community
My question: Anyone willing to do so (for a little $)?

So I’ve completed (I think) a valid 1st version of my wrapper for htm.core.

The purpose of the module is full-on streaming capacity: the scenario where data just starts arriving one value at a time, with zero prior knowledge of it, and the module initializes and learns models on its own.

The current examples within htm.core expect a whole file of data – and a pre-built model config valid for that particular data set.

I want to drop these requirements and handle the scenario where there’s:

  • no single file available containing all/multiple rows

  • no prior knowledge of the data before it arrives

To handle this, I have the module sample the first n rows before initializing models – just to set the resolution param for RDSE.
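
Roughly, the sampling step could look something like this (a minimal sketch; the range-based heuristic and helper below are just an illustration, not necessarily the exact logic in the wrapper):

```python
# Minimal sketch: derive an RDSE resolution from the first n sampled values.
# Class/parameter names follow htm.core's Python bindings; the heuristic itself
# (spread the observed value range over ~130 buckets) is only an example.
from htm.bindings.encoders import RDSE, RDSE_Parameters

def build_rdse_from_samples(samples, size=1000, sparsity=0.02, n_buckets=130):
    value_range = max(samples) - min(samples)
    params = RDSE_Parameters()
    params.size       = size
    params.sparsity   = sparsity
    params.resolution = max(value_range / n_buckets, 0.001)  # keep it > 0
    return RDSE(params)

# e.g. after collecting the first n rows of one feature from the stream:
first_rows = [21.3, 22.1, 20.8, 23.5, 22.9]
encoder    = build_rdse_from_samples(first_rows)
sdr        = encoder.encode(22.0)   # ready to feed into the SP
```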

The proposition: I’d like to have my code reviewed by as many as 3 people who have used or ideally helped develop htm.core – basically to sanity-check the logic. For anyone interested, I thought we could have a 30-40 min video chat where I show you a running example & walk you through the code – and for this I’d pay you $50 USD.

If anyone’s interested you can reply here or email me at: sheiser1@binghamton.edu

Thanks so much again!

I’d be interested to follow along, but I don’t know much about htm.core. Maybe I can help with the occasional noob question, or just remain silent for the session and send you some questions and remarks by email afterwards. That way you can judge for yourself if it is useful or not.

Of course, I wouldn’t want any payment for that :-D.

Thanks @Falco!

That’d be great!
You can tell me how clear the concept & code are – and make any suggestions that occur to you!

I actually just finished a draft of a slide, trying to capture the big picture:

I’m enthusiastic to discuss this, so if it’s convenient for you we could have our own quick meeting?

I forget what timezone you’re in, but for me it’s 5:30 PM now.
I’m game at your convenience :+1:

I’m on Central European Standard Time. 12 minutes ago (the time of your post) it was 13:03 over here.

But that said, I can stay up late or get up early without a problem. I think your time (and probably other interested parties’) is far more precious than mine, so I’m happy to work around your schedule.

As one of the maintainers of htm.core, I would be glad to take a look at your implementation. The htm.core library was intended for applications and experimentation such as this. Oh, and no need for payment; it is enough for me to see the library being used.

You do, man @Falco! I really appreciate the support.

Nonsense! You’re way too humble, man!

Thank you so much @David_Keeney!!

Can I ask what timezone you’re in so we can find a compatible time?

I’m so excited for this! :tada:

I am on Pacific Standard Time; it is 9:54 am as of this posting.

Hey @Falco!

Could I get your email? I’ll add you to a thread with David.
Thanks and looking forward to it!

Sounds like a general AI to me. Well, at least slightly more general :slight_smile:

One tricky part of the chart there is figuring out which of the 1, 2, 3 stages a new incoming frame belongs to.

If there’s any room left I’d like to see your ideas too, although I might not be useful if all you need is intricate knowledge of the HTM framework.

Sounds great @cezar_t, the more the merrier!!
Would you share your email address?
I’ve started a thread with David to plan the meeting and check out the repo – would love to add you!

Thanks for your interest!!

Well, it’s a more general way of building & running htm.core models at least – generating the model configs using only values sampled from the stream, instead of prior knowledge of the data sets – but the encoding + SP + TM + anomaly algorithms are standard!

Yeah, sounds interesting.
Answering the question “when have I seen this kind of data?” seems to me an important ability of a general AI.

Do you want to distinguish between similarly structured inputs too?

e.g. MNIST and Fashion-MNIST are both 28x28, 8-bit grayscale images; a meta-classifier able to distinguish between the two streams would be cool. And quite difficult, I guess.

Yes this would be great!

I think a form of this meta-classifier could be done with a meta script – using the current module to generate model sets for the 2 streams, then feeding the remaining data from either stream to both model sets.

Hopefully one set of models would stand out as more familiar with the data – reflected in lower anomaly scores.

Not sure if that would achieve your meta-classifier concept, but just one idea.
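
Just to sketch that comparison (purely illustrative; run_step here is a placeholder for whatever per-row call the module exposes, assumed to return an anomaly score in [0, 1]):

```python
# Illustrative sketch of the "which model set finds this stream familiar?" check.
import statistics

def more_familiar_model_set(model_set_a, model_set_b, rows, run_step):
    """Feed the same rows to both model sets; the lower mean anomaly wins."""
    scores_a = [run_step(model_set_a, row) for row in rows]
    scores_b = [run_step(model_set_b, row) for row in rows]
    return "A" if statistics.mean(scores_a) < statistics.mean(scores_b) else "B"
```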

I’ll wait for whatever details you have on this. Feeding both models seems expensive to me; I would attempt having a single model active at a time and using its surprise signal to check whether another model is more familiar with the data stream.

We (humans) are also mostly single-tasking, and when anything out-of-the-blue pops up we have a delay before either switching to another learned model or loading the meta “wtf-is-this” model.

This could be a step, if not towards generality, at least towards dynamically allocating new (or switching between) simple(r) models, without forgetting, on very complex pattern streams.

Sorry if I digress here.

@sheiser1 I’ve not yet used HTM.core, but I’d like to take a closer look at it. I would be happy to sit in on your code review and offer whatever insight and/or feedback I can. I’m in the US Central time zone. Please let me know when you plan on conducting the review.

Alright!! Thanks @CollinsEM!

We have an email discussion going to hash out when exactly it’ll be.
I’ll add you to it if you like! If so, just email me so I’ll have your address: sheiser1@binghamton.edu

Looking forward to having you there!

Not at all!

The approach I mentioned is pretty basic and wouldn’t scale realistically, due to the expense of running multiple models like you said, and surely a lack of sufficient complexity.

I’ll think about it more, and look forward to discussing it further!

I think the above ideas should be visible in some file in the GitHub repository. I mean, there’s info on how to install/test it and how it should work, but no clues on what kind of problem(s) it is supposed to solve.


Maybe it would be nice to be more specific about what you mean by full-on streaming capacity. Is there an HTM “daemon” which you can feed arbitrary tabular data, and it just instantiates the pool of models necessary to “learn” from the input data and then responds/replies with… what?

Because I don’t get how such a streaming capacity is related to running the HTM examples. Or are these just two unrelated future use cases of your code (streaming vs. running htm examples)?

Good point!
The purpose of the HTM models I refer to is to monitor & forecast numeric data streams, getting (at each time step and for each model) the:

  • anomaly score
  • anomaly likelihood
  • forecasted feature values from the predictor
  • number of predictions made by the TM

‘Full-on streaming’ is meant to make this process more flexible, as it allows a new & unknown data source to be modeled without any ad-hoc analysis or EDA done first. Usually some such step is required up front in order to set encoder params like resolution. Here, setting the encoder params is done automatically – which allows the HTM models to be initialized automatically.

The user just needs to know going in:

  1. the names of the numeric features contained in the new stream (including timestamp feature if any)

  2. whether they want to model each specified feature separately, or combine all into 1 model

  3. how many time steps they want to sample before initializing the model(s)

  4. what model hyperparams they want to use (sp, tm, anomaly, predictor)

So these HTM models monitor data streams like HTMs commonly do, but here the model configuration & initialization process is also automated – not just the model updating.

I hope that can expand (to some extent) the scalability of HTM modeling to many new data streams automatically.
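
To make those four inputs concrete, here’s a rough sketch of what the up-front config amounts to (all key names below are placeholders for illustration, not the module’s actual schema):

```python
# Illustrative only -- placeholder key names, not the wrapper's real config format.
stream_config = {
    "features": ["timestamp", "temperature", "pressure"],  # 1. feature names in the stream
    "model_per_feature": True,       # 2. separate model per feature, or one combined model
    "sample_steps": 100,             # 3. rows to sample before initializing the model(s)
    "model_hyperparams": {           # 4. sp / tm / anomaly / predictor params
        "sp": {"columnCount": 2048},
        "tm": {"cellsPerColumn": 32},
        "anomaly_likelihood": {"learning_period": 300},
        "predictor": {"steps": [1, 5]},
    },
}
# Once `sample_steps` rows have arrived, the encoders can be sized from the
# sampled values and the models initialized; from then on each new row yields
# the anomaly score, anomaly likelihood, predicted values, and TM prediction
# count for every model.
```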
