Generalization benchmark?

matan_tsuberi · June 1, 2018, 11:09pm

I think one of the potential strengths of HTM as opposed to other ML approaches is the promise to be a very general model/algorithm.
I think we should put this to the test and come up with a set of problems to teach a single model, watching for the following properties:

The model's performance on previous problems does not degrade very much.
The model is reasonably good at all tasks.
The model learns tasks that are closely related to a previously learned task more efficiently (i.e., it is able to transfer knowledge from one task to another).

Ideally this set of problems should be simple and concise enough that we could compare them to other ML models and use them as a benchmark to test the generality of any model.
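As a rough illustration of how the three properties above might be scored, here is a minimal Python sketch. It assumes a score matrix where acc[i][j] is the score on task j after the model has been trained on tasks 0..i in order, plus a list of scores from a fresh model trained on each task alone with the same budget. The function name, the matrix layout, and the use of "score under a fixed budget" as a stand-in for learning efficiency are all assumptions made for illustration, not an agreed benchmark definition.

```python
from statistics import mean


def generality_metrics(acc, scratch):
    """Summarize the three properties from a score matrix.

    acc[i][j]  : score on task j after training on tasks 0..i in order
    scratch[j] : score of a fresh model trained only on task j
                 (hypothetical baseline, same training budget)
    Needs at least two tasks.
    """
    n = len(acc)
    final = acc[n - 1]
    # Property 1: how much each earlier task dropped by the end of training.
    forgetting = [acc[j][j] - final[j] for j in range(n - 1)]
    # Property 3 (proxy): gain over a fresh model on every task after the first.
    transfer = [acc[j][j] - scratch[j] for j in range(1, n)]
    return {
        "mean_forgetting": mean(forgetting),
        # Property 2: average score over all tasks at the end.
        "mean_final_score": mean(final),
        "mean_transfer": mean(transfer),
    }


# Made-up numbers, only to show the call.
acc = [
    [0.90, 0.20, 0.10],
    [0.80, 0.85, 0.30],
    [0.70, 0.80, 0.90],
]
scratch = [0.90, 0.80, 0.80]
metrics = generality_metrics(acc, scratch)
```

Lower mean_forgetting and higher mean_transfer would be better under this scoring; any other model could be compared by filling in the same matrix.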

Paul_Lamb · June 2, 2018, 12:36am

Seems to me that the main requirement for such a benchmark will be the ability to “do” something (i.e. a working SMI implementation). This is of course where the research is focused right now. Unless, of course, you are just talking about a battery of classification, prediction, and anomaly detection tasks (perhaps across different modalities).

If anyone wants to work out the details of the benchmark, I’ll be happy to contribute ideas and help write up an applicable HTM-based system. We could fake some of the unknown parts (such as coordinate system translations, etc.).

keghn_feem · June 2, 2018, 2:28am

The process should be to find repeating data in a repeating pattern loop.
I like to use K-means with a bag-of-words or bag-of-features algorithm and weighted metrics.

To compare is a powerful thing.
Starting with the bag-of-features algorithm, sample two different locations of the data by shoveling them into bags with a big, wide shovel.

Then compare what is in each bag, starting with maximum generalization, which is to blur the data so everything looks the same. Do this by changing the weight paired with each measurement that describes what is in the bag.

The first weighted value is the number of things in each bag.
At a less generalized level there could be five letters “B”, six letters “G”, and two letters in each bag that do not match, which could be represented by the letter “k”, or two “k”s in each sample.
Less generalized still is the order of the letters in each bag.
Each letter has a value that tells its position in a sentence, and that value is weighted.
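One possible reading of the levels described above, as a small Python sketch: two samples are compared by size only, then by letter counts with order ignored, then by letter position, and the three results are combined with weights. The function names, the similarity formulas, and the weights are assumptions made for illustration; the K-means step is left out.

```python
from collections import Counter


def compare_samples(a, b):
    """Compare two samples (strings) at three levels of generalization."""
    longest = max(len(a), len(b))
    if longest == 0:
        return {"size": 1.0, "bag": 1.0, "order": 1.0}
    # Most general: only the number of things in each bag.
    size_sim = min(len(a), len(b)) / longest
    # Less general: how many of each letter, order ignored.
    ca, cb = Counter(a), Counter(b)
    shared = sum((ca & cb).values())  # per-letter minimum count
    total = sum((ca | cb).values())   # per-letter maximum count
    bag_sim = shared / total
    # Least general: the letter at each position must match.
    order_sim = sum(x == y for x, y in zip(a, b)) / longest
    return {"size": size_sim, "bag": bag_sim, "order": order_sim}


def weighted_similarity(levels, weights):
    """Combine the per-level similarities using one weight per level."""
    total_weight = sum(weights[name] for name in levels)
    return sum(weights[name] * levels[name] for name in levels) / total_weight


same_letters = compare_samples("BBGG", "GGBB")
one_mismatch = compare_samples("BBG", "BBK")
# Hypothetical weights that favour the bag-level comparison.
score = weighted_similarity(same_letters, {"size": 1, "bag": 2, "order": 1})
```

In this sketch “BBGG” and “GGBB” look identical at the two more general levels and completely different once order is taken into account, which is the effect of turning the resolution up.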

When the samples are looked at at the highest resolution, nothing will repeat. Intelligence only deals with repeating pattern loops. Also, a repeating distance between the two samples in a loop must be considered and weighted.

At a very highly generalized level, any model can be forced into a loop. So if these loops are out of touch with the energy management and damage management of the system or life form, then they are not real.

Paul_Lamb · June 2, 2018, 2:32am

@keghn_feem, I assume this is a description of a possible “other ML model” that would be compared to an HTM model, correct? It sounds like you are talking about processing letters/words/sentences specifically (or was that just an example use case?).

keghn_feem · June 2, 2018, 3:47am

For example. But it will work for anything.

Paul_Lamb · June 2, 2018, 12:43pm

Great, then I will accept the challenge and see how well I can apply HTM concepts and compare with the performance of your bag shoveling strategy. We just need to work out the details of the benchmark tests to use for the comparison…

BTW, I should point out that I am fully expecting to lose such a competition. I’m mainly curious to see how well the application of HTM concepts (in the current stage of its evolution) compares to other ML techniques, specifically in the area of generalization.

jimmyw · June 2, 2018, 11:07pm

I think this is a great idea. Pretty big challenge to define compared to the canonical machine learning benchmarks, but I guess we have to start somewhere and refine it.

I know there has been a lot of talk here of applying HTM to robotics to achieve goal-oriented behaviour. What about a simulated navigation scenario?

Paul_Lamb · June 2, 2018, 11:43pm

Besides coming up with the specific tasks, what should be the general ground rules? For example, from an HTM perspective I could see a requirement being that you must reuse the same region(s) of neurons for all the tasks (versus building a separate one for each task). That would allow some interesting additional tests, such as how subsequent tasks might impact performance when it is re-applied to the earlier tasks.
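To make that ground rule concrete, here is a minimal Python sketch of a harness that trains one shared model instance on each task in turn and re-tests it on every task after each step. The Learner interface, the dict-based task format, and the TinyMemoryLearner stand-in are all hypothetical; a real entry would wrap an HTM region (or any other model) behind the same two methods.

```python
from typing import Dict, List, Protocol, Sequence

Task = Dict[str, int]  # hypothetical task format: input -> expected label


class Learner(Protocol):
    def train(self, task: Task) -> None: ...
    def evaluate(self, task: Task) -> float: ...


class TinyMemoryLearner:
    """Toy stand-in: a lookup table that forgets its oldest entries when full."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.memory: Dict[str, int] = {}  # dicts keep insertion order

    def train(self, task: Task) -> None:
        for x, y in task.items():
            self.memory.pop(x, None)  # re-inserting moves x to the newest slot
            self.memory[x] = y
            while len(self.memory) > self.capacity:
                del self.memory[next(iter(self.memory))]  # evict the oldest

    def evaluate(self, task: Task) -> float:
        hits = sum(self.memory.get(x) == y for x, y in task.items())
        return hits / len(task)


def run_shared_model(model: Learner, tasks: Sequence[Task]) -> List[List[float]]:
    """Train the same model on each task in order; re-test on all tasks each time."""
    history = []
    for task in tasks:
        model.train(task)
        history.append([model.evaluate(t) for t in tasks])
    return history


task_a = {"a1": 0, "a2": 1}
task_b = {"b1": 1, "b2": 0}
scores = run_shared_model(TinyMemoryLearner(capacity=3), [task_a, task_b])
```

Row i of scores holds the results on every task after training on task i, so a drop down a column shows how a later task affected an earlier one.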

matan_tsuberi · June 3, 2018, 11:34pm

I think it may be useful to view HTM as a reinforcement learning algorithm. What’s missing in HTM is a clear way to apply rewards to optimize some goal (i.e., the reinforcement part…) and to transform the model’s understanding into actions to achieve this goal. I understand that this is a current area of research (i.e. goal-oriented behavior and sensory-motor inference)? It could be very interesting to make a hybrid of HTM with conventional RL algorithms, with HTM possibly supplying the foresight and data efficiency that current approaches are missing. I’d be very interested to see if we could surpass current performance with this approach.

Here’s a paper describing a generalization benchmark for RL algorithms: https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/retro-contest/gotta_learn_fast_report.pdf

Paul_Lamb · June 4, 2018, 1:08am

That’s what I figured you were talking about. If you haven’t seen it yet, check out this thread which demonstrates HTM + RL. This is another area I have been working on for a while (though no good results to show yet).
