Integrating aspects of the HTM Algorithm

jordan.kay · December 24, 2018, 8:30pm

I’m working on an AI project and I want to use HTM to make predictions on data streams within this project.

I have some continuous time series data that must be processed in real time. These data streams don’t fit any predefined data type such as ‘audio data’, ‘price data’, or ‘language data’. They’re just streams of binary patterns, and they’re sparse. So they’re basically SDRs; each bit even has some very loose semantic meaning. But the data stream could technically represent anything.

Anyway, so I’m looking for a generalizable time-series-pattern-prediction algorithm. I don’t need the hierarchy stuff. I just want better than chance predictions on successive patterns. That is to say, I want a basic spatiotemporal prediction algorithm that I can plug into the system.

When I’ve asked on other forums what the most generic solution would be, I get data scientists asserting “There is no general algorithm, you must tell us your exact use case!” Then they get all hung up on what type of data it is, yadda, yadda, yadda. It’s annoying. I just want to predict the patterns, guys, let’s not overcomplicate it.

So, anyway, my question is: since I basically know the HTM algorithm and I’m working in Python, how do I use the NuPIC library to take advantage of just the basic algorithm (what used to be called CLA) for time-series pattern prediction? Is the code base mature enough for me to do that? Is it modular enough? If so, where do I get started?

Bitking · December 24, 2018, 10:39pm

If you want to detect excursions outside of known patterns - HTMs. If you want to generalize and predict - RNNs.

I have not seen anyone using HTM to supervise RNNs, but it could be useful. Basically, the HTM would provide the error signal.

sgaseretto · December 24, 2018, 10:39pm

I’m far from being an expert on HTM Theory, but if I’m correct, the data you want to feed to the HTM algorithms has to be encoded using some kind of encoder, and this encoder generates the SDRs that are passed to the HTM algorithms in order to perform learning and predictions. Different types of data require different types of encoders (Scalar Encoders, Datetime Encoders; Cortical.io has some encoders too, if I’m correct). If I understand correctly, you want to assume that the stream of data you are receiving is somehow already encoded?

jordan.kay · December 25, 2018, 7:34am

Yes, it will already be encoded (put more accurately, it will continually be in the process of being encoded). So don’t worry about that. I’ll manage the SDRs; I just want plug-and-play algorithms to use them with.

abshej · December 25, 2018, 11:37am

The only thing you need to run HTM on your data is a converter that will produce consistent and semantic binary patterns from your data. It doesn’t matter that your data is already binary; it still has to come from an encoder that encodes semantically. Your cortex won’t help you function sanely if your eyes don’t use the same rules to encode all stimuli, but it would continue extracting semantics and patterns from the data. I would suggest making different encoders (converters) for different data sources and using them to create binary patterns which you can feed to your HTM.
If you don’t know the sources of the data streams, then you might as well try running HTM on the raw streams or using a single encoder.
I think you will find that the GitHub page for NuPIC is sufficient for coding the network.

HTMs are better suited to extract semantics from and generalise data. RNNs might work better for a single type of data where the variability in data is low, but you need HTM-level flexibility in order to deal with high variability and in cases like this one where you don’t know the source and encoding algorithm.

One more thing that might be interesting: Try differently initialized spatial poolers on your data to see how they perform.

SporkingIt · December 27, 2018, 1:39am

I’ve experimented with some simple prediction for data series and classes, and the only thing I’d claim is that you need to record the data that is sent through the HTM system so that you can map any predictions back into a format that is directly useful. If it’s enough to just compare predictions to the actual input, the HTM system is already doing exactly that, but I’m guessing that is not what you want.

There should be code around that, for example, translates a prediction into a numeral or a class. This is done by storing a mapping from the currently active columns to the input pattern that was just sent to the TM. Future predictions are then generated by comparing the currently predictive columns to the stored patterns and sorting by overlap, with some cut-off.
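The mapping described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `OverlapDecoder`, `learn`, `decode`, and `min_overlap` are hypothetical names, not part of NuPIC, and the column indices are made up. In a real setup the active and predictive columns would come from the TM.

```python
class OverlapDecoder(object):
    """Hypothetical helper: remembers which columns were active for each
    input pattern and decodes predictions by overlap with that memory."""

    def __init__(self, min_overlap=1):
        self.min_overlap = min_overlap  # cut-off below which matches are dropped
        self._memory = []               # list of (frozenset of columns, label)

    def learn(self, active_columns, label):
        # Store the mapping from currently active columns to the input label.
        self._memory.append((frozenset(active_columns), label))

    def decode(self, predictive_columns, top_k=3):
        # Score every stored pattern by its overlap with the predictive columns.
        predicted = set(predictive_columns)
        scored = [(len(predicted & cols), label) for cols, label in self._memory]
        # Apply the cut-off, then sort by overlap, best match first.
        scored = [pair for pair in scored if pair[0] >= self.min_overlap]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


# Made-up column indices standing in for real TM output.
decoder = OverlapDecoder(min_overlap=2)
decoder.learn([1, 5, 9, 12], "A")
decoder.learn([2, 5, 9, 30], "B")
decoder.learn([40, 41, 42, 43], "C")

result = decoder.decode([1, 5, 9, 12, 30])
```

Here `result` lists "A" first (overlap 4), then "B" (overlap 3), and drops "C" because it falls below the cut-off. Storing every pattern verbatim like this grows without bound, which is exactly the storage trade-off that makes the approach hard to scale.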

The real challenge, as I see it, is to come up with an effective way to store the mappings while at the same time offering as much leeway as possible for inference. I’ve spent many hours trying to create something worth pursuing when it comes to natural language word prediction, with some success, but I’m sure so much more can be done.

subutai · December 27, 2018, 5:26pm

You could take a look at the SDR Classifier - it might be doing what you want:

https://github.com/numenta/nupic-legacy/blob/master/src/nupic/algorithms/sdr_classifier.py
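To give a feel for what such a classifier does, here is a loose sketch of the general idea as I understand it: a single-layer softmax network that maps the active bits of an SDR to a distribution over classes (for example, the next pattern in a sequence). This is not NuPIC's code or API; `TinySDRClassifier` and its methods are hypothetical names, and the patterns are toy data.

```python
import math


class TinySDRClassifier(object):
    """Hypothetical single-layer softmax classifier over SDR bits."""

    def __init__(self, num_bits, num_classes, alpha=0.1):
        self.alpha = alpha              # learning rate
        self.num_classes = num_classes
        # One weight per (input bit, class) pair, all starting at zero.
        self.weights = [[0.0] * num_classes for _ in range(num_bits)]

    def infer(self, active_bits):
        # Sum the weights of the active bits per class, then apply softmax.
        activation = [sum(self.weights[b][k] for b in active_bits)
                      for k in range(self.num_classes)]
        peak = max(activation)          # subtract the max for numerical stability
        exps = [math.exp(a - peak) for a in activation]
        total = sum(exps)
        return [e / total for e in exps]

    def learn(self, active_bits, target_class):
        # Move the weights of the active bits toward the observed class.
        probs = self.infer(active_bits)
        for k in range(self.num_classes):
            error = (1.0 if k == target_class else 0.0) - probs[k]
            for b in active_bits:
                self.weights[b][k] += self.alpha * error


# Toy data: three non-overlapping SDRs repeating in the order 0 -> 1 -> 2.
patterns = {0: [0, 1, 2], 1: [3, 4, 5], 2: [6, 7, 8]}
sequence = [0, 1, 2] * 50

clf = TinySDRClassifier(num_bits=9, num_classes=3)
for current, following in zip(sequence, sequence[1:]):
    clf.learn(patterns[current], following)   # learn "current SDR -> next class"

probs = clf.infer(patterns[0])
predicted = probs.index(max(probs))
```

After training, `predicted` is class 1 for pattern 0, because pattern 1 always follows it in the toy sequence. In an HTM pipeline the input bits would typically be the TM's active cells rather than the raw SDR, so that the classifier sees sequence context.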


naive_bayesian · December 30, 2018, 6:46pm

The HTM algorithm is good at detecting deviations from a pattern. Assuming your time series is simple enough and that you don’t want to predict more than a few steps into the future, at any given point in the time series you can hand it a range of possible “next time steps” in parallel. Then just decide how to pick a winner. The most naive approach would be to predict the least surprising one (softmax style). Alternatively, you could do some sort of average weighted by likelihood. You could also use some kind of grid search or gradient descent to find better candidates to test.
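A minimal sketch of the "pick the least surprising candidate" step, assuming you can get a surprise score for each candidate. The names `rank_candidates` and `toy_surprise` are hypothetical; the toy score (fraction of a candidate's active bits that were not predicted) mimics a raw anomaly score and merely stands in for whatever the real HTM model reports.

```python
import math


def rank_candidates(candidates, surprise_fn, temperature=1.0):
    """Return (candidate, probability) pairs, least surprising first.

    Probabilities are a softmax over negative surprise, so a lower
    surprise score means a higher probability.
    """
    scores = [surprise_fn(c) for c in candidates]
    weights = [math.exp(-s / temperature) for s in scores]
    total = sum(weights)
    ranked = zip(candidates, [w / total for w in weights])
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


def toy_surprise(predicted_bits):
    """Stand-in scorer: fraction of a candidate's bits that were not predicted."""
    predicted = frozenset(predicted_bits)

    def surprise(candidate):
        active = frozenset(candidate)
        if not active:
            return 1.0
        return 1.0 - len(active & predicted) / float(len(active))

    return surprise


# Made-up example: bits the model currently predicts, and three candidates.
predicted_bits = [3, 7, 11, 20]
candidates = [(3, 7, 11, 20), (3, 7, 50, 60), (70, 80, 90, 100)]

ranked = rank_candidates(candidates, toy_surprise(predicted_bits))
best = ranked[0][0]
```

Here `best` is the candidate that matches the predicted bits exactly. The probabilities in `ranked` could also serve as the weights for the likelihood-weighted average mentioned above, and lowering `temperature` makes the choice more winner-take-all.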

MaxLee · January 1, 2019, 12:23am

That’s actually a great idea! It’d be a good fit.
