Advice For a New MS Student

anomaly-detection
research
projects

#1

Hello HTM community, this is my first ever post!

I’ve spent the past few months learning about HTM and I have a passion for brain-inspired AI computing, like the rest of you. I’ve read Jeff’s book, watched the instructional videos, installed NuPIC, etc. and I’m very humbled and impressed by the great work you guys are doing.

I want to get involved but I’m unsure where to begin. I’m planning to complete an MS thesis in the next academic year and I’d love to work on HTM, but what I lack most is a concrete problem to tackle and a path to follow, and I’m unsure where to look for inspiration. I have experience, knowledge, and interest in deep learning, data science, and computer vision, with a background in computational mathematics and computer science.

This past summer I worked as a data scientist at a computer security company in Silicon Valley, and I think HTM has great potential in security applications. An initial idea I had was to adapt HTM to build a brain that monitors temporal network traffic data and ultimately works to detect intrusions as a security measure. I imagine this would be an anomaly detection use case. Because of HTM’s temporal pattern detection, noise tolerance, and online learning properties, I think it might be able to outperform traditional DL methods. I’m unsure if this is even feasible or worth exploring, however.

Any advice would be greatly appreciated.

Best,
Brody


#2

If you have a DL background and you like a challenge, find a problem that fits both HTM and DL well (this alone is already hard), then do a comparative study, both functionally and in terms of results. It may sound simple, and there have been a couple of attempts, but as far as I am aware none is comprehensive or complete enough to satisfy the ML crowd, having read a couple myself. Even if you weren’t completely successful, we would gain a community member who has actually run these experiments, so that we are better informed about what exactly we are missing in practice, or what exactly we should underline when communicating HTM. We all have our own answers on this matter, but very little is backed by actual experiments.


#3

Thanks for your response!

That sounds like a good idea; strong comparative experimental evidence is always important in any scientific endeavor. Perhaps I could compare HTM with deep LSTMs and other DL methods on a variety of time-series anomaly detection datasets and problems.

The other thing I was considering is designing a parallelized implementation of the spatial pooler and/or temporal memory algorithm. As far as I’m aware there doesn’t seem to be much support for utilizing GPUs or multicore machines. Given the independent nature of each cell’s computation, you’d think running times could be reduced substantially with GPUs…


#4

I second the idea of solid comparisons.

GPU usage has significantly improved the runtime of the spatial pooler in my own implementations (using TensorFlow). The temporal memory is a trickier beast: a naive approach using dense matrices needs a prohibitively large amount of RAM and can’t fit into system memory, let alone VRAM. A more intelligent approach using sparse matrices is possible, but it gets less benefit from the GPU due to the scattered reads and/or writes. Still, I encourage you to see what you can do; there’s definite potential.
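To put a rough number on the dense-matrix point above, here is a back-of-envelope sketch. The dimensions are illustrative only (borrowed from commonly cited HTM defaults, not from the poster’s implementation):

```python
# Rough RAM estimate for a dense cell-to-cell connection matrix in
# temporal memory. All sizes are illustrative, not from the post.
columns = 2048
cells_per_column = 32
cells = columns * cells_per_column            # 65,536 cells total

# Dense float32 permanence matrix, every cell to every cell:
dense_bytes = cells * cells * 4
print(dense_bytes / 2**30, "GiB")             # 16 GiB for a single matrix

# A sparse layout stores only actual synapses, e.g. 128 distal
# synapses per cell, each as an (index, permanence) pair of 4 bytes:
synapses_per_cell = 128
sparse_bytes = cells * synapses_per_cell * (4 + 4)
print(sparse_bytes / 2**20, "MiB")            # 64 MiB
```

The ~250x gap between the two layouts is why the dense approach blows past VRAM while the sparse one fits comfortably, at the cost of the scattered memory accesses mentioned above.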


#5

I did a comparative study on HTM and LSTM (not deep LSTM, just 2 layers). The LSTM code (using keras) is here:

My HTM code is here:

I’ll PM you my paper if you like. It’s been submitted to IEEE Transactions on Intelligent Transportation Systems. The main result was that LSTM with online learning (periodically resetting the memory cells) outperformed HTM on most measures. HTM came closer when the distributions in the data changed over time, though; it was better than LSTM at adapting to that.


#6

Hey @Jonathan_Mackenzie, thanks for sharing your work! Oh, and I hope it hasn’t been published yet (sounds wrong, I know :slight_smile: ). I would be sad if I had somehow missed referencing your work. I looked hard for HTM vs. LSTM comparisons, and the closest I found last time I checked was done by Numenta through anomaly prediction.


#7

Can you provide a rough estimate for the performance increase if you are comfortable with it?


#8

Interesting! I’d love to read more about your experiments. It sounds like we’ll be doing similar things. I’m currently exploring potential time-series anomaly detection datasets and problems to use as benchmarks.

Periodically resetting the memory cells sounds like pseudo-online learning, or simulated online learning, if you get what I mean. If I understand you correctly, it effectively erases and creates a new network periodically to reflect new data, versus adapting an existing model to new data in real time the way HTM does. It seems like the problem itself is easier for LSTMs in that case. It’s still worth comparing, though, and I’m interested in your results.
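As a toy illustration of the periodic-reset scheme being discussed, here is a sketch in plain Python. The “model” is just an exponential moving average standing in for a stateful predictor; all names and parameters are hypothetical, not the actual LSTM setup from the paper:

```python
# Toy one-step-ahead predictor: an exponential moving average whose
# internal state is optionally discarded every `reset_every` steps,
# mimicking the "periodically erase and restart" style of online
# learning, versus adapting one state continuously. Illustrative only.

def run(series, reset_every=None, alpha=0.3):
    """Return the mean absolute one-step prediction error."""
    state = None
    errors = []
    for t, x in enumerate(series):
        if reset_every and t % reset_every == 0:
            state = None                       # erase state, start fresh
        pred = x if state is None else state   # cold start: no error
        errors.append(abs(pred - x))
        # Online update of the running state toward the new observation:
        state = x if state is None else (1 - alpha) * state + alpha * x
    return sum(errors) / len(errors)
```

Comparing `run(data)` against `run(data, reset_every=50)` on a drifting series shows the trade-off: continuous adaptation accumulates lag when the distribution shifts, while periodic resets throw away history but recover faster.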


#9

#10

Lots of data! Thank you!


#11

Sure! Depending on the size (larger is better), I get up to a 10x speedup on the forward pass that computes spatial pooler overlap. With 2000 columns and 7056 input bits the speedup is around 6x on my system, but in my experiments where each cell has its own proximal connections (so 20000x7056 matrices) it’s closer to 10x. And with a convolutional spatial pooler it’s greater still (12x+, depending on size).

With smaller input sizes (84x84 is reasonably large) it’s hard to get as much of a speedup since transfer overhead starts to dominate. But if you can justify running multiple datapoints through in a batch (parallel sequences), you can get up around 10x even with small inputs and column/cell counts.
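The batched forward pass described above boils down to a single matrix product, which is exactly the operation that maps well onto a GPU. A minimal NumPy sketch of the overlap step (column/input/batch sizes taken from the posts; the connection density, sparsity, and `k` are arbitrary choices, and boosting and learning are omitted):

```python
import numpy as np

# Sketch of the spatial pooler overlap step as one batched matmul.
rng = np.random.default_rng(0)

n_inputs, n_columns, batch = 7056, 2048, 32

# Binary proximal connection matrix: which input bits each column is
# connected to (i.e. permanence above threshold). Density is arbitrary.
connected = (rng.random((n_columns, n_inputs)) < 0.02).astype(np.float32)

# A batch of binary input SDRs, one per row.
inputs = (rng.random((batch, n_inputs)) < 0.05).astype(np.float32)

# Overlap of every column with every input in the batch: one matmul,
# the part that parallelizes well on a GPU.
overlaps = inputs @ connected.T               # shape (batch, n_columns)

# Global inhibition: keep the k columns with the highest overlap.
# (This selection step is the part that is awkward to vectorize.)
k = 40
winners = np.argsort(overlaps, axis=1)[:, -k:]
print(overlaps.shape, winners.shape)
```

Batching the inputs this way is what amortizes the host-to-device transfer overhead mentioned above: one large matmul instead of many small ones.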

Unfortunately the selection of winning columns, the learning step, and the temporal memory algorithm are much harder to speed up on the GPU (and all three are fairly awkward to vectorize with Tensorflow) so I haven’t had much speedup on those parts in my implementations. Sometimes they’re even slower, so I tend to restrict those steps to CPU only.


#12

We were initially rejected by ACM Transactions on Intelligent Systems and Technology, so this is the second attempt at publishing; I still haven’t heard back from IEEE TITS. I can PM you the paper if you like; the downside is that you can’t reference it.