Simple Cortex

ddigiorg · August 27, 2017, 2:02am

The architecture image has been updated. Your suggestions made it much more precise.

I’ve never attempted to benchmark algorithm performance before, but it’s a good idea and I should learn how to do it properly. Also, I’ve never used NuPIC either, surprisingly enough. I guess it’s finally time to put my big boy pants on if I’m serious about writing a decent paper sometime in the near future. I’m also curious how well it’d perform compared to other GPGPU algorithms like OgmaNeo or a TensorFlow LSTM.

jakebruce · August 27, 2017, 2:14am

Let me know if you want an LSTM comparison. I have some code that I could quickly adapt for your purposes. What would be your metric of comparison, the accuracy of the pixel-level predictions?

ddigiorg · August 27, 2017, 2:28am

That’d be awesome if you have the time, I’d really appreciate it. Off the top of my head I think the two useful metrics would be accuracy and speed. Your suggestion for accuracy would be interesting. Something like comparing the prediction X time steps into the future with the current scene state X time steps later. Perhaps it could be a running average, or would that be too much? As for speed I’m not really sure what’s the standard for timing code.

jacobeverist · August 27, 2017, 2:31am

With the new picture on the side, enlarge the “Forest” ellipse to include multiple dendrites, because right now it is indistinguishable from the “Dendrite” ellipse.

It’s also good that you added to the figure to show recurrence. Makes much more sense now.

jakebruce · August 27, 2017, 2:40am

No problem. You definitely want some sort of accuracy comparison, otherwise I’ll write you a blazing fast prediction engine that does nothing but output zeros. You could do the mean squared error in the pixel prediction at 1, 5, 10, and 25 timesteps into the future or something. However, since your network makes multiple predictions, mean squared error is likely to be bad. If you want to not penalize the network for false positives (predicting something that doesn’t end up happening, as HTMs often do) then I can game that metric with a prediction engine that predicts the ball in every position all the time. So I’m not sure.

How does your implementation do predictions further than one timestep ahead? Do you feed the current prediction into the network as a simulated state? I apologize if I missed that in the explanation.

Mean and standard deviation of execution time per frame, averaged over a large amount of data would be typical. You can report this quantity with learning on and off, since that’ll strongly affect computation time for pretty much any learning system. It may also change during the course of training, during the process of saturating your synapse pool.
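As a rough illustration of that timing procedure, here is a minimal sketch in plain Python. The `dummy_step` function and its `learn` flag are hypothetical stand-ins for a real network update, not anything from SC; for GPGPU code you would also need to wait for the device to finish before reading the clock.

```python
import statistics
import time


def time_per_frame(step, frames, repeats=1):
    """Return (mean, stdev) of wall-clock seconds per call to step(frame)."""
    durations = []
    for _ in range(repeats):
        for frame in frames:
            start = time.perf_counter()
            step(frame)
            durations.append(time.perf_counter() - start)
    return statistics.mean(durations), statistics.stdev(durations)


# Hypothetical stand-in for a network update; replace with the real call.
def dummy_step(frame, learn=True):
    total = sum(frame)
    if learn:
        # Pretend that learning costs some extra work per frame.
        total += sum(x * x for x in frame)
    return total


frames = [[float(i % 7)] * 256 for i in range(200)]
mean_on, std_on = time_per_frame(lambda f: dummy_step(f, learn=True), frames)
mean_off, std_off = time_per_frame(lambda f: dummy_step(f, learn=False), frames)
print(f"learning on:  {mean_on * 1e6:.1f} +/- {std_on * 1e6:.1f} us/frame")
print(f"learning off: {mean_off * 1e6:.1f} +/- {std_off * 1e6:.1f} us/frame")
```

Timing early and late chunks of the training data separately would show whether the per-frame cost drifts as the synapse pool fills up.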

jacobeverist · August 27, 2017, 2:46am

@ddigiorg

You might consider Ogma’s handwriting anomaly detection demo as your benchmark. It uses a standard dataset, the Ogma implementation is already set up (which suits your desire to compare against that system), and it has a good metric: detecting anomalies in handwriting images. It’s also image based, which is what you have tested your system on.

https://github.com/ogmacorp/OgmaNeoDemos#mnist-anomaly-detection

jakebruce · August 27, 2017, 3:33am

It must have been too early for my brain to work properly. The appropriate metric here is probably a precision-recall curve, where degenerate engines like the one I mentioned would get perfect recall but zero precision, and never making any predictions would get you perfect precision but zero recall.

jacobeverist · August 27, 2017, 3:41am

@jakebruce

Can you give a reference on how to implement this? Sounds interesting.

Edit: Never mind, it’s a widely used technique: http://scikit-learn.org/stable/auto_examples/model_selection/plot_precision_recall.html

jakebruce · August 27, 2017, 3:54am

That’s a good reference. For the benefit of our readers:

Precision P is defined as the number of true positives TP over the number of true positives plus the number of false positives FP.

P = TP / (TP + FP)

Recall R is defined as the number of true positives TP over the number of true positives plus the number of false negatives FN.

R = TP / (TP + FN)

These quantities are also related to the F1 score, which is defined as the harmonic mean of precision and recall.

F1 = 2PR / (P + R)

And max(F1) is a good scalar summary for when you want to quantitatively compare multiple entire curves (of different algorithms, for example, or the progress of an algorithm while it is undergoing training).
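To make the definitions concrete, here is a minimal sketch in plain Python that scores binary pixel predictions and sweeps a threshold to get max(F1). The function names and the toy data are made up for illustration. Precision is undefined when nothing is predicted; returning 0.0 in that case is a convention assumed here.

```python
def precision_recall_f1(predicted, actual):
    """Score two equal-length binary sequences (truthy = active pixel)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    # Assumed convention: undefined ratios are reported as 0.0.
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


def max_f1(scores, actual, thresholds):
    """Binarize real-valued scores at each threshold and keep the best F1."""
    return max(
        precision_recall_f1([s >= t for s in scores], actual)[2]
        for t in thresholds
    )


actual = [0, 1, 0, 0, 1, 0]
everything = [1] * len(actual)  # predicts the ball in every position
nothing = [0] * len(actual)     # never predicts anything
print(precision_recall_f1(everything, actual))  # low precision, full recall
print(precision_recall_f1(nothing, actual))     # zero recall
```

Each threshold gives one point on the precision-recall curve, so collecting the (P, R) pairs instead of only the best F1 would give the full curve.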

Side note: @rhyolight, is there a way we can get LaTeX parsing on this forum platform for decent mathematical typesetting?

rhyolight · August 28, 2017, 4:19pm

2 posts were split to a new topic: LaTeX Parsing on the Forum

thanh-binh.to · August 27, 2017, 7:28am

@jacobeverist I think comparing against Ogma on MNIST recognition makes sense, but the results are very sensitive to the encoder you use. I am not sure that SC’s current image encoder is good for MNIST, because the ball physics demo is not really a noisy environment!

sheiser1 · August 27, 2017, 1:09pm

Numenta did a paper comparing HTM to LSTM and other sequence learning algorithms, on both artificial and real data (NYC taxi demand). Here’s a link to a slideshow giving a short version of it:

Real-Time Streaming Data Analysis with HTM from Numenta

david-ragazzi · August 27, 2017, 1:40pm

Very good and funny format, Matt! Best teacher ever!

jacobeverist · August 27, 2017, 6:49pm

@thanh-binh.to

Yes, I’ve noticed in my work that choosing encoders is a whole separate problem that deserves its own discussion. The encoders basically choose which phenomena are visible to the algorithm and which phenomena are lost. Of course, if you create a super-high resolution encoder that captures everything, your algorithm takes forever to train or it may miss the needle in the haystack.

thanh-binh.to · August 27, 2017, 7:20pm

@jacobeverist totally agree with you, but I want to make sure that the results are comparable…

ddigiorg · August 27, 2017, 7:40pm

I’ll modify the picture tonight when I get a chance. Drawing circles on the biological neuron images made it too cluttered as well.

No problem, hopefully I can explain it well enough. At each time step the network learns the current input image and the previous neuron states. To make a prediction of the next time step, the current neuron states are fed back into the network, which gives the next time step’s predicted neuron states. This can loop as many timesteps as needed.
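A minimal sketch of that feedback loop in plain Python, for readers who prefer code. The transition table below is a hypothetical stand-in for the trained network, not SC’s actual data structure; only the looping pattern is the point.

```python
def rollout(step, state, horizon):
    """Feed each predicted state back in as the next input, `horizon` times."""
    predictions = []
    for _ in range(horizon):
        state = step(state)
        predictions.append(state)
    return predictions


# Hypothetical stand-in for the network: a learned table mapping a set of
# active neuron indices to the predicted next set.
transitions = {
    frozenset({0, 1}): frozenset({1, 2}),
    frozenset({1, 2}): frozenset({2, 3}),
    frozenset({2, 3}): frozenset({3, 4}),
}


def predict_next(active):
    # States that were never learned predict nothing, so the rollout goes quiet.
    return transitions.get(active, frozenset())


for t, predicted in enumerate(rollout(predict_next, frozenset({0, 1}), 4), start=1):
    print(t, sorted(predicted))
```

The prediction for step t can then be scored against the actual scene t steps later, once it arrives.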

That’s a good idea because MNIST is very well known and I want to make a demo with it anyway. As long as I use the same demo across SC, NuPIC, OgmaNeo, and an LSTM I’ll be happy with it.

True enough, I haven’t demonstrated that SC works in a noisy environment, but my first guess is it would work OK if I set the right dendrite threshold. Something like: as long as 75% of the previously observed stimuli are present, the pattern is recognized even if there is noise. Admittedly, I am not an expert so I may be missing something essential.
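As a toy illustration of that 75% idea (a sketch under assumed names, not SC’s actual code), a dendrite can be treated as a set of learned input indices that fires when enough of them are active:

```python
def dendrite_active(synapses, active_inputs, threshold=0.75):
    """True if at least `threshold` of the learned synapses see active input."""
    if not synapses:
        return False
    overlap = len(synapses & active_inputs)
    return overlap / len(synapses) >= threshold


learned = {3, 7, 12, 19, 24, 31, 40, 42}  # 8 learned input indices
noisy = (learned - {7, 40}) | {5, 50}     # 2 bits dropped, 2 spurious bits added
print(dendrite_active(learned, noisy))                 # overlap is 6 of 8
print(dendrite_active(learned, learned - {3, 7, 12}))  # overlap is 5 of 8
```

Spurious bits do not count against the dendrite here, only missing ones, so a very dense noisy input would still trigger it; whether that matters depends on the encoder.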

Also, Ogma algorithms can observe and learn from scalar buffers. Would I have to modify SC to have a fair comparison with OgmaNeo on the MNIST dataset? It wouldn’t be that hard, I’d just replace the overlap and threshold functionality with a Euclidean distance formula. However, NuPIC only operates on binary arrays. Is this why choosing the right encoder is so important?

ddigiorg · October 6, 2017, 9:15pm

Simple Cortex: A Model of Cells in the Sensory Nervous System

David Di Giorgio

(Submitted on 3 Oct 2017)

Neuroscience research has produced many theories and computational neural models of sensory nervous systems. Notwithstanding many different perspectives towards developing intelligent machines, artificial intelligence has ultimately been influenced by neuroscience. Therefore, this paper provides an introduction to biologically inspired machine intelligence by exploring the basic principles of sensation and perception as well as the structure and behavior of biological sensory nervous systems like the neocortex. Concepts like spike timing, synaptic plasticity, inhibition, neural structure, and neural behavior are applied to a new model, Simple Cortex (SC). A software implementation of SC has been built and demonstrates fast observation, learning, and prediction of spatio-temporal sensory-motor patterns and sequences. Finally, this paper suggests future areas of improvement and growth for Simple Cortex and other related machine intelligence models.

Subjects: Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)

thanh-binh.to · October 7, 2017, 8:48am

David: thanks for the very interesting paper and for explaining your implementation. I am looking forward to an extension of the current version that directly handles scalar values, images, etc., and to your benchmark results.

ddigiorg · October 8, 2017, 1:54am

Thanks for reading it! I really want to make those changes, but unfortunately I need to prioritize finding a job, haha. I will get to it as soon as I can.
