Introducing new project: image2sdr

The encoding (feature extraction) done by convolutional layers is inspired by the visual cortex [1].

[1] https://neurdiness.wordpress.com/2018/05/17/deep-convolutional-neural-networks-as-models-of-the-visual-system-qa/

Convolution is inspired by the foveal scanning that extracts these streams of tokens. I understand that the biological mechanisms have been emulated to produce a process better suited to digital hardware, but it is an inexact copy of those mechanisms.

There are bunches of good ways to do image processing - I am just pointing out that reconstruction of an image does not seem to be biologically plausible.

The usual focus of HTM researchers is to make something that works in biologically inspired ways to help model and understand the wetware.

Sorry, forgot to answer your question. Typically the best way is to compile a good set of realistic test data and perform whatever prediction or anomaly detection task you are planning, to see how it does.

Perhaps. I don’t think we know enough about how a brain reconstructs an image while trying to visualize something. In this case we use reconstruction as a way to test the quality of an SDR.

In any case, using a convolutional network to encode an image seems to be way more biologically plausible than some of the methods used by cortical.io.

I think that neither method is suitable for anomaly detection. Feature extraction strips away most of the information needed to detect an anomaly. For example, neither Clarifai’s classifier nor an autoencoder will produce any relevant features for an image with people walking upside down or cars driving the wrong way.

In fact, thinking more about this, it seems that the self-supervised learning approach advocated by LeCun would be way more effective: https://youtu.be/7I0Qt7GALVk?t=2472

I see this as a huge unsolved problem for HTM systems.

HTM is great at saying “I saw this thing before.” It can even say “I saw this sequence before.”
I really don’t see how HTM will be able to match up some cue, like a perceived printed or spoken word, to a paired sound or image or stream of tokens that make an image.

The current excitement at Numenta regarding the coding in tuples of grid nodes (object) (displacement) (object) does not really solve this problem yet. That gets us to a representation of the relation of perceived features. The work to use hierarchy to resolve that to a higher-level representation is unfinished.

I have high hopes that I can bridge this gap using hex-grid coding, tuples of tokens, and sequences of tokens, but at this point I do not have a functioning system. I do think that the enabling technology will be the dumb boss/smart adviser model, but I have not worked out how these systems interact to the point where they function together organically. The few toy attempts produced very limited, stereotyped behavior.

Depends on the use case. If looking for anomalies within a single image, no, this is not useful. For detecting weird physics like upside-down people, also not useful. If detecting an increase in the frequency of cars passing in front of a camera at 8:00 AM on a Saturday compared to typical Saturdays, then maybe useful.

Inspired by the visual cortex != biologically plausible.
CNNs fundamentally don’t work the way the brain does, and that is easy to demonstrate.

Can you please explain how your image2sdr method would be useful for this task? I just don’t see how the SDR for an image with 20 cars would be different from the SDR for a similar image with 50 cars, given that the tags from Clarifai would probably be identical in both cases.

Please do yourself a favor and read the article I linked to. Unless you are a neuroscientist and actually know what you’re talking about.

Classic HTM is designed for streaming data, not static data. So naturally you wouldn’t pass it an image with 50 cars… you would pass it a series of images over time, in which 50 cars passed by a camera.

Of course, it depends on the use case. This wouldn’t be useful for anomaly detection of vehicle frequency either if you were to try monitoring a busy highway where every frame always had cars in it.

And again, this was a demonstration for one easy way of linking HTM with classic AI algorithms using word SDRs. One could imagine a slightly different system which uses classic image AI to locate things in video frames, and then have both subjects and positions to do some streaming HTM magic with. Or an audio AI for identifying animal calls hooked up with HTM to detect population anomalies in a particular habitat, or tracking migrations, etc.

Also, just to be clear, image2sdr is not meant to be biologically plausible. It is just another tool that some people may find useful for a few AI / HTM related cases. There are probably many other tools that could be applied to many of the same cases.

I believe I know what I’m talking about, but I scanned the blog post you linked to and it’s full of stretches. We can discuss specific claims from it if you want, but you can start with the opinions of people who know deep learning in depth, like Geoffrey Hinton and Andrej Karpathy.

Can we use this for images stored on my desktop, together with cortical.io and Clarifai? If yes, can you say how?

If you are comfortable with coding in JavaScript, then all of the logic is in api/image/index.js and could be fairly easily copied and pasted into another NodeJS application, or ported to another language.

If you are not comfortable with JavaScript, you will need to install NodeJS on your desktop to run the service locally. Once NodeJS is installed, clone the repository and make a copy of config_example.yml named “config.yml”. Edit this file to plug in your API keys, then create a folder called “uploads”. At this point you can run “npm start”. The service will then be available at http://localhost:80/. Post an image file to it, and it will return a response in JSON format.
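For anyone calling the local service from Python instead, here is a rough sketch of posting one image using only the standard library. The exact route and the form field name are not stated in this thread, so `url` and `field_name` below are assumptions; check api/image/index.js for the real values.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path
from typing import Tuple


def build_multipart(field_name: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def post_image(path: str, url: str, field_name: str = "file"):
    """POST an image file to the service and return the decoded JSON response.

    field_name="file" is a guess, not something confirmed by the project.
    """
    p = Path(path)
    body, content_type = build_multipart(field_name, p.name, p.read_bytes())
    req = urllib.request.Request(
        url, data=body, method="POST", headers={"Content-Type": content_type}
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (needs the service running; append the real route to the URL):
# sdr = post_image("portrait.jpg", "http://localhost:80/")
```

If the service rejects the request, the field name or route is the first thing to check.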

I am using Python and I have a problem. I converted a video to images, one image per second, and it is difficult to upload the images in bulk and get the concepts. I have around 3000 images. How can I do this?

Unfortunately, processing multiple images at once is still on the TODO list.
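Until then, a client-side loop is one workaround. Below is a minimal sketch that walks a folder of frames and calls a per-image function on each file. `encode_one` is a hypothetical callable you supply (for example, something that posts a single file to the service); the suffix list and the `delay_s` pause are also assumptions, not project settings.

```python
import time
from pathlib import Path
from typing import Callable, Dict

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def encode_folder(folder: str,
                  encode_one: Callable[[Path], object],
                  delay_s: float = 0.0) -> Dict[str, object]:
    """Call encode_one on every image in folder, in sorted filename order.

    A failed frame is recorded as None so one bad file does not stop the run.
    """
    results: Dict[str, object] = {}
    frames = sorted(p for p in Path(folder).iterdir()
                    if p.suffix.lower() in IMAGE_SUFFIXES)
    for frame in frames:
        try:
            results[frame.name] = encode_one(frame)
        except Exception as exc:  # network errors, bad files, API limits
            print(f"{frame.name}: {exc}")
            results[frame.name] = None
        time.sleep(delay_s)  # pause between calls to respect any rate limit
    return results
```

Sorting is by filename, so zero-pad the frame numbers (frame_0001.jpg, not frame_1.jpg) if the order of the results matters.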

portrait –

[
  {
    "positions": [
      11,
      98,
      125,
      131,
      269,
      271,
      277,
      309,
      319,
      385,
      386,
      399,
      401,
      402,
      403,
      406,
      436,
      513,
      530,
      569,
      570,
      620,
      638,
      656,
      663,
      698,
      730,
      791,
      900,
      905,
      1007,
      1089,
      1129,
      1169,
      1212,
      1237,
      1292,
      1298,
      1350,
      1360,
      1361,
      1478,
      1489,
      1541,
      1542,
      1549,
      1553,
      1615,
      1626,
      1681,
      1683,
      1740,
      1746,
      1802,
      1809,
      1881,
      1931,
      1945,
      2060,
      2065,
      2118,
      2132,
      2137,
      2321,
      2498,
      2522,
      2572,
      2573,
      2576,
      2701,
      2702,
      2829,
      2830,
      2958,
      2959,
      3088,
      3193,
      3209,
      3210,
      3216,
      3225,
      3272,
      3273,
      3338,
      3339,
      3342,
      3343,
      3344,
      3399,
      3401,
      3469,
      3470,
      3597,
      3656,
      3657,
      3658,
      3772,
      3805,
      4156,
      4206,
      4410,
      4457,
      4544,
      4578,
      4624,
      4687,
      4688,
      4695,
      4696,
      4717,
      4757,
      4796,
      4800,
      4801,
      4846,
      4860,
      4931,
      4956,
      4969,
      4970,
      4972,
      4978,
      5051,
      5075,
      5106,
      5190,
      5205,
      5206,
      5265,
      5393,
      5423,
      5442,
      5459,
      5461,
      5487,
      5571,
      5606,
      5607,
      5700,
      5810,
      5845,
      5978,
      6119,
      6226,
      6233,
      6336,
      6354,
      6508,
      6610,
      6731,
      6863,
      6980,
      6981,
      7002,
      7107,
      7233,
      7251,
      7369,
      7381,
      7387,
      7397,
      7427,
      7494,
      7515,
      7617,
      7666,
      7667,
      7792,
      7921,
      8023,
      8151,
      8162,
      8176,
      8303,
      8304,
      8305,
      8326,
      8401,
      8459,
      8555,
      8661,
      8784,
      9322,
      9447,
      9450,
      9470,
      9530,
      9575,
      9583,
      9706,
      9714,
      9830,
      9831,
      9837,
      9838,
      9839,
      9841,
      9842,
      9956,
      9957,
      9958,
      9964,
      9965,
      10083,
      10084,
      10085,
      10086,
      10087,
      10090,
      10091,
      10092,
      10093,
      10094,
      10095,
      10096,
      10212,
      10215,
      10216,
      10218,
      10219,
      10220,
      10221,
      10222,
      10223,
      10224,
      10225,
      10340,
      10341,
      10344,
      10345,
      10346,
      10347,
      10348,
      10349,
      10350,
      10351,
      10352,
      10470,
      10471,
      10472,
      10473,
      10474,
      10475,
      10476,
      10478,
      10479,
      10480,
      10599,
      10601,
      10602,
      10603,
      10604,
      10605,
      10606,
      10607,
      10729,
      10730,
      10731,
      11125,
      11211,
      11216,
      11217,
      11308,
      11343,
      11344,
      11346,
      11347,
      11474,
      11475,
      11601,
      11602,
      11681,
      11729,
      11847,
      11980,
      12198,
      12331,
      12332,
      12337,
      12496,
      12589,
      12590,
      12625,
      12795,
      12844,
      12850,
      12971,
      12972,
      12973,
      12975,
      12998,
      13310,
      13330,
      13487,
      13550,
      13612,
      13613,
      13614,
      13685,
      13760,
      13809,
      13835,
      13872,
      13877,
      13937,
      14001,
      14127,
      14252,
      14262,
      14265,
      14268,
      14403,
      14475,
      14487,
      14661,
      14858,
      14859,
      15179,
      15204,
      15682,
      15810,
      15938,
      15943,
      15946,
      15947,
      16077,
      16267,
      16268
    ]
  }
]

If this is the SDR, what do I have to do after this?

Whatever you’d typically use SDRs for. Anomaly detection in a stream of video frames, for example.
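As a starting point, here is a small sketch of reading a response shaped like the one above and comparing two SDRs by overlap. It assumes the response is a list with one object holding a "positions" array, and that the SDR is 128 x 128 = 16384 bits wide; that width is an assumption (all indices in the sample are below it).

```python
import json
from typing import Set

RETINA_BITS = 128 * 128  # assumed SDR width (16384 bits)


def positions_from_response(text: str) -> Set[int]:
    """Extract the active bit indices from a JSON response like the one above."""
    return set(json.loads(text)[0]["positions"])


def overlap(a: Set[int], b: Set[int]) -> int:
    """Number of active bits two SDRs share."""
    return len(a & b)


def sparsity(sdr: Set[int], size: int = RETINA_BITS) -> float:
    """Fraction of the SDR's bits that are active."""
    return len(sdr) / size
```

Two frames showing similar content should share many active bits, while unrelated frames should share few.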

I have converted the video into frames (images), one per second. I want to detect the advertisements in the video, i.e. anomaly detection. If the output above is the SDR, what should I do after this?

Before you expect magic - what feature(s) tell you that a frame is part of a commercial?
Is there anything about your video that reliably tells you that this is a commercial?
HTM will tell you that the scene has changed, but are there also scene changes in the rest of the video?
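To make that last point concrete, here is a rough sketch that flags frames whose SDR differs sharply from the previous one. This is not HTM, just a plain set comparison, and the 0.3 threshold is an arbitrary assumption. It will fire on every hard cut, whether the cut leads into a commercial or into the next scene of the program.

```python
from typing import List, Sequence, Set


def scene_changes(frames: Sequence[Set[int]], min_similarity: float = 0.3) -> List[int]:
    """Return indices of frames whose SDR differs sharply from the previous frame.

    Similarity is the Jaccard index of the two sets of active bits.
    """
    changes = []
    for i in range(1, len(frames)):
        prev, cur = frames[i - 1], frames[i]
        union = prev | cur
        # Two empty SDRs are treated as identical.
        similarity = len(prev & cur) / len(union) if union else 1.0
        if similarity < min_similarity:
            changes.append(i)
    return changes
```

Telling a commercial apart from an ordinary scene change needs some extra feature beyond "this frame looks different".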