Introducing new project: image2sdr

sdrs
image-recognition

#1

I have uploaded a new project on GitHub called image2sdr. It is a simple Node.js service to which you can post an image file, and it returns an SDR (sparse distributed representation).
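For example, a client call from Node might look roughly like this (the port, route, and form field name here are assumptions; check the repo for the actual endpoint):

const fs = require('fs');

// Hypothetical client call; the port, route, and form field name may
// differ from the repo. Assumes Node 18+ (built-in fetch, FormData, Blob).
async function postImage(path) {
  const form = new FormData();
  form.append('image', new Blob([fs.readFileSync(path)]), path);
  const res = await fetch('http://localhost:3000/image2sdr', {
    method: 'POST',
    body: form,
  });
  return res.json(); // e.g. {"sdr":[125,163,212,...]}
}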

Behind the scenes, what it is doing is:

  1. Classify the image via Clarifai and return the top 10 concepts and their confidence levels
  2. Parse the concepts found and retrieve word SDRs via cortical.io
  3. Stack the word SDRs, scored by their confidence levels
  4. Generate a standard bit-array SDR with the desired sparsity, containing the bits with the top scores (using a random tiebreaker)
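In rough pseudocode, that flow is (the function names here are hypothetical, not the repo's actual API):

// High-level sketch of the pipeline; function names are hypothetical
// and the real service's code will differ.
async function imageToSdr(imageBuffer, sdrSize, sparsity) {
  // 1. Top-10 concepts and confidence levels, via Clarifai.
  const concepts = await classifyImage(imageBuffer);

  // 2. Word SDR for each single-word concept, via cortical.io
  //    (multi-word concepts are skipped; see the notes below).
  const weightedSdrs = [];
  for (const { name, confidence } of concepts) {
    if (name.includes(' ')) continue;
    weightedSdrs.push({ sdr: await getWordSdr(name), certainty: confidence });
  }

  // 3 & 4. Stack the word SDRs by score and keep the top-scoring bits
  //        at the desired sparsity.
  return mergeWeightedSdrs(weightedSdrs, sdrSize, sparsity);
}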

This project is in response to a conversation we had during the most recent Hackers’ Hangout.

A couple of notes:

  1. You will need to copy config_example.yml to a file named config.yml and enter your own Cortical.io and Clarifai API keys (see the sketch after this list). You can get free API keys from the respective websites.
  2. The service does not currently support proxy settings. If you are running behind a proxy, you'll need to add the necessary code.
  3. The word SDRs from cortical.io do not have a fixed sparsity (common words are denser than less common ones), so you may need to tweak the sparsity and sdr_size parameters (or refactor if there is a better way).
  4. Multi-word concepts from Clarifai are skipped (such as “hard hat”). To enable them, special consideration will need to be given to negative concepts (such as “no people”).
  5. This is an open-source HTM community project, so feel free to make updates and evolve the idea.
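For reference, the config file would look something like this (the key names and example values here are a guess; treat config_example.yml in the repo as the authoritative template):

# config.yml (copied from config_example.yml); key names and values here
# are a guess — use the repo's config_example.yml as the real template.
clarifai:
  api_key: YOUR_CLARIFAI_API_KEY
cortical:
  api_key: YOUR_CORTICAL_IO_API_KEY
sdr_size: 16384   # cortical.io word SDRs are 128x128 = 16384 bits
sparsity: 0.02    # fraction of bits kept "on" in the merged SDR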

#3

Great job, Paul. I just ran it locally and it is working great. There are some fun things to do with this codebase. It was fun to print out all the Clarifai terms for each image I uploaded. It would also be interesting to send each SDR back to Cortical.io to see which single term best matches the bag of words.


#4

I am interested in the encoder algorithm.
Could you please:

  1. explain your algorithm, Paul?
  2. show us your input image and the output SDR, @rhyolight?

#5

@Paul_Lamb Awesome!!!

@thanh-binh.to
Here’s an example SDR output that I’ve generated, and I think I’ve done it correctly. :crossed_fingers:

Input:
(image of ET)

Output:

{"sdr":[125,163,212,309,377,380,392,406,428,436,619,663,768,811,812,900,905,939,940,1044,1089,1122,1148,1212,1297,1298,1317,1332,1361,1438,1515,1598,1768,1773,1788,1802,1831,1896,1931,1945,2020,2031,2060,2111,2118,2137,2154,2158,2159,2176,2281,2285,2289,2309,2397,2407,2408,2412,2413,2493,2497,2516,2522,2534,2535,2536,2537,2538,2539,2540,2651,2663,2664,2665,2666,2678,2750,2751,2782,2791,2792,2793,2794,2795,2806,2864,2912,2913,2916,2922,2923,3045,3046,3172,3175,3180,3189,3190,3191,3193,3225,3286,3300,3316,3328,3395,3420,3424,3443,3444,3456,3552,3560,3567,3569,3570,3571,3576,3684,3697,3702,3772,3825,3827,3828,4086,4186,4215,4319,4372,4378,4421,4452,4460,4461,4544,4572,4578,4672,4696,4704,4717,4721,4757,4769,4796,4800,4801,4831,4846,4860,4864,4920,4931,4956,4960,5088,5091,5114,5172,5174,5183,5214,5215,5216,5265,5304,5365,5432,5471,5472,5487,5567,5571,5599,5607,5685,5686,5810,5820,5821,5948,5978,5981,6010,6063,6176,6336,6508,6610,6777,6849,6981,7233,7427,7494,7515,7617,7917,8151,8304,8360,8385,8465,8486,8487,8516,8775,8776,8784,8795,8798,8834,8875,8894,8899,8904,8906,8965,9011,9034,9105,9159,9179,9225,9322,9350,9360,9447,9491,9530,9575,9706,9734,9737,9831,9935,10084,10090,10117,10212,10221,10222,10299,10337,10484,10490,10586,10605,10606,10728,10755,10833,10842,10844,10864,10988,11095,11112,11113,11240,11241,11256,11504,11505,11628,11629,11637,11638,11692,11826,11827,11879,11883,11886,11900,11941,11980,12278,12331,12529,12596,12664,12795,12916,12923,12997,13125,13310,13419,13550,13685,13691,13994,14022,14031,14333,14403,14454,14536,14546,14606,14659,14661,14786,14787,14855,14858,14914,14915,14916,15042,15049,15094,15243,15684,15696,15739,15907,15940,15943,15946,16009,16077,16100,16201,16259,16267]}

#6

@Jose_Cueto what about the image and the SDR size? Can you display the SDR output as an image?
Currently I am working on a different concept, based on extracting image features and converting those features into an SDR.


#7

Sure, the basic process is:

  1. Retrieve the concepts found in the image and their levels of certainty. For example, ET: .99, creepy: .88, ugly: .75.
  2. Retrieve the word SDRs for each of those concepts (ET, creepy, and ugly in this example).
  3. Loop through the word SDRs, and for each “on” bit, increase the score for that bit position by the level of certainty of the associated concept.
  4. After adding up the scores across all of the SDRs, select the top-scoring bits (with a random tiebreaker) until the desired sparsity is reached.

I called the core algorithm mergeWeightedSdrs. By “weighted SDR” I mean a word SDR and its level of certainty.
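A minimal sketch of that merging idea (the actual mergeWeightedSdrs in the repo may differ in details):

// Minimal sketch of the merging idea. Each "weighted SDR" pairs a word
// SDR (an array of on-bit indices) with its concept's level of certainty.
function mergeWeightedSdrs(weightedSdrs, sdrSize, sparsity) {
  const scores = new Array(sdrSize).fill(0);

  // Step 3: accumulate each concept's certainty onto its on-bit positions.
  for (const { sdr, certainty } of weightedSdrs) {
    for (const bit of sdr) {
      scores[bit] += certainty;
    }
  }

  // Step 4: rank bit positions by score (random tiebreaker) and keep
  // just enough of the top ones to hit the target sparsity.
  const numActive = Math.round(sdrSize * sparsity);
  return scores
    .map((score, bit) => ({ bit, score, tie: Math.random() }))
    .sort((a, b) => b.score - a.score || a.tie - b.tie)
    .slice(0, numActive)
    .map(({ bit }) => bit)
    .sort((a, b) => a - b);
}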

No, it is a one-direction encoding: you get SDRs from images, but not images back from SDRs.
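You can render the raw bit array itself for inspection, though. A quick sketch (the square layout is purely a display choice; with 16384-bit cortical.io word SDRs it comes out as a 128×128 grid):

// Print an SDR (array of on-bit indices) as an ASCII grid, just for
// eyeballing the bits. The square layout is only a display choice.
function printSdr(positions, sdrSize) {
  const width = Math.ceil(Math.sqrt(sdrSize));
  const on = new Set(positions);
  for (let start = 0; start < sdrSize; start += width) {
    let line = '';
    for (let i = start; i < Math.min(start + width, sdrSize); i++) {
      line += on.has(i) ? '#' : '.';
    }
    console.log(line);
  }
}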


#8

BTW, if you are referring to the algorithms for going from images to concepts and from words to SDRs, those are performed outside of the above merging algorithm (in this case, Clarifai is used for concept identification, and cortical.io is used for word SDRs).

Clarifai uses a convolutional neural network algorithm.

Cortical.io uses an algorithm called semantic folding.

This project merges the two to provide a way of generating SDRs from images for use in the HTM algorithms.


#9

@Paul_Lamb thanks for your explanation. I do not know much about Clarifai, but as far as I know CNNs do not support online learning or decoding. Am I right here?


#10

I don’t know the definition of online decoding, but yes, a CNN is a traditional supervised learning algorithm which requires a large labeled training set. It also has no relation to biological neural networks.


#11

Another way to think of the basic concept is to imagine one person using words to describe something they saw to someone else who wasn’t there. This scene from “Someone Like You” comes to mind.



#12

For anyone interested in doing this, you can use the /expressions/similar_terms API, with a body like:

{"positions":[]}
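In Node, the call might look roughly like this (the base URL, retina name, and api-key header are assumptions; check the Cortical.io REST docs for the real values):

// Rough sketch of querying the similar_terms endpoint with an SDR's
// on-bit positions. Base URL, retina name, and header are assumptions.
async function similarTerms(positions, apiKey) {
  const url =
    'http://api.cortical.io/rest/expressions/similar_terms?retina_name=en_associative';
  const res = await fetch(url, {
    method: 'POST',
    headers: { 'api-key': apiKey, 'Content-Type': 'application/json' },
    body: JSON.stringify({ positions }),
  });
  return res.json(); // candidate terms with similarity scores
}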

I did this for the ET image and SDR that @Jose_Cueto posted.

The original concepts found by Clarifai were:

wildlife: .972
portrait: .970
nature: .969
animal: .968
staring: .952
one: .947
eye: .946
face: .945
looking: .927
isolated: .918

Running the SDR through the above similar terms API suggests:

animal  *
looking  *
eye  *
pet
cats
wild  *
face  *
animals  *
eyes  *
cat

(I added a * by the ones which match the original classification.) An extraneous “cat” idea seems to have been introduced here, but given the rather non-specific original classification, it seems to have done an OK job in this case.


#13

I just want to make sure that @sheiser1 sees this, since he was the one asking about it at the hangout.


#14

A fantastic tool. Big ups @Paul_Lamb!