Introducing new project: image2sdr



I think that neither method is suitable for anomaly detection. Feature extraction strips away most of the information that is relevant for detecting an anomaly. For example, neither Clarifai's classifier nor an autoencoder will produce any relevant features for an image of people walking upside down or cars driving the wrong way.

In fact, thinking more about this, it seems that the self-supervised learning approach advocated by LeCun would be way more effective:


I see this as a huge unsolved problem for HTM systems.

HTM is great at saying "I have seen this thing before." It can even say "I have seen this sequence before."
I really don't see how HTM will be able to match up some cue, like a perceived printed or spoken word, to a paired sound, image, or stream of tokens that makes up an image.

The current excitement in Numenta about coding in tuples of grid nodes, (object) (displacement) (object), does not really solve this problem yet. That gets us to a representation of the relations between perceived features. The work to use hierarchy to resolve that into a higher-level representation is unfinished.

I have high hopes that I can bridge this gap using hex-grid coding with tuples and sequences of tokens, but at this point I do not have a functioning system. I do think that the enabling technology will be the dumb-boss/smart-adviser model, but I have not worked out how these systems interact to the point where they function together organically. My few toy attempts produced very limited, stereotyped behavior.


It depends on the use case. If you are looking for anomalies within a single image, no, this is not useful. For detecting weird physics like upside-down people, also not useful. If you are detecting an increase in the frequency of cars passing in front of a camera at 8:00 AM on a Saturday compared to typical Saturdays, then maybe useful.
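A minimal sketch of that last use case, assuming the per-frame car counts come from some upstream detector (the counts and the 3-sigma threshold below are made up purely for illustration):

```python
import statistics

def is_anomalous(history, new_count, threshold=3.0):
    """Return True if new_count deviates from the slot's historical
    counts by more than `threshold` standard deviations."""
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history)
    if stdev == 0:
        return new_count != mean
    return abs(new_count - mean) / stdev > threshold

# Car counts observed at 8:00 AM on previous Saturdays (hypothetical data).
saturday_8am = [12, 9, 11, 14, 10, 13, 12, 11]

print(is_anomalous(saturday_8am, 12))  # False: a typical count
print(is_anomalous(saturday_8am, 55))  # True: a sudden spike
```

A real deployment would keep a separate history per (weekday, time-of-day) slot; the point of feeding this into HTM instead would be to let temporal memory learn such regularities rather than hand-coding a baseline.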


inspired by visual cortex != biologically plausible
CNNs fundamentally don't work the way the brain does; that's easy to demonstrate.


Can you please explain how your image2sdr method would be useful for this task? I just don't see how the SDR for an image with 20 cars would differ from the SDR for a similar image with 50 cars, given that the tags from Clarifai would probably be identical in both cases.


Please do yourself a favor and read the article I linked to, unless you are a neuroscientist and actually know what you're talking about.


Classic HTM is designed for streaming data, not static data. So naturally you wouldn't pass it a single image with 50 cars… you would pass it a series of images over time, in which 50 cars passed by a camera.

Of course, it depends on the use case. This wouldn’t be useful for anomaly detection of vehicle frequency either if you were to try monitoring a busy highway where every frame always had cars in it.

And again, this was a demonstration of one easy way to link HTM with classic AI algorithms using word SDRs. One could imagine a slightly different system that uses classic image AI to locate things in video frames, yielding both subjects and positions to do some streaming HTM magic with. Or an audio AI for identifying animal calls, hooked up with HTM to detect population anomalies in a particular habitat, track migrations, etc.
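To make the linkage concrete, here is a rough sketch of the general idea, not the actual image2sdr implementation (which uses Clarifai tags and cortical.io word SDRs). A deterministic hash-based encoder stands in for the semantic retina, so unlike the real service it produces no overlap between related words; the constants are arbitrary. It also illustrates why two images with identical tags yield identical SDRs:

```python
import hashlib

SDR_SIZE = 2048       # total bit positions in the SDR
BITS_PER_WORD = 40    # active bits per word (~2% sparsity)

def word_sdr(word):
    """Hypothetical stand-in for a semantic word SDR: derive
    BITS_PER_WORD pseudo-random active bit positions from the word."""
    bits = set()
    counter = 0
    while len(bits) < BITS_PER_WORD:
        digest = hashlib.sha256(f"{word}:{counter}".encode()).digest()
        bits.add(int.from_bytes(digest[:4], "big") % SDR_SIZE)
        counter += 1
    return frozenset(bits)

def image_sdr(tags):
    """Union the word SDRs of a classifier's tags into one image SDR."""
    active = set()
    for tag in tags:
        active |= word_sdr(tag)
    return active

# Two images whose classifier tags are identical produce identical SDRs,
# regardless of how many cars are actually in the frame.
a = image_sdr(["car", "road", "traffic"])
b = image_sdr(["car", "road", "traffic"])
print(a == b)  # True
```

In practice the union step would also need to re-sparsify (keep only the top-weighted bits) when many tags are present, otherwise the SDR density grows with tag count.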

Also, just to be clear, image2sdr is not meant to be biologically plausible. It is just another tool that some people may find useful for a few AI / HTM related cases. There are probably many other tools that could be applied to many of the same cases.


I believe I know what I'm talking about, but I skimmed the blog post you linked to, and it's full of stretches. We can discuss specific claims from it if you want, but you could start with the opinions of people who know deep learning in depth, like Geoffrey Hinton and Andrej Karpathy.