Toward An Unsupervised, Incremental, Streaming and One-Shot Visual Class-Based Learning and Recognition System with Hierarchical Temporal Memory Theory

bkutt · April 19, 2018, 10:12pm

Hello community,

I wanted to share some HTM experiments I did for a class project this semester. In all honesty, I put the whole thing together in less than 2 weeks, so it’s not super pretty or groundbreaking, but the results are interesting.

Those who are comfortably familiar with HTM already can skip the background section.

This is the beginning of ongoing personal research I plan to continue: specifically, the effort to design a biologically plausible encoder that captures the functional properties of the human visual pathway, i.e. everything that happens before any signal carrying visual information from the retina reaches cortex, including the functionality of the lateral geniculate nucleus in the thalamus and related structures.

~~You can find the PDF here: http://brodykutt.com/Brody_Kutt_CSCI-731_Project~~

EDIT: The report is no longer at the above link. You can now find the report along with the code here.

thanh-binh.to · April 20, 2018, 2:05pm

Very interesting.
I am interested in your results. Do you have a demo video? Thanks

Gary_Gaulin · April 21, 2018, 5:50pm

Hello Brody, I have been studying your well-written PDF. In it I noticed this:

We could go a step further as shown in the next possible encoding below.
  1 = 1 1 1     2 = 0 1 1     3 = 0 0 1     4 = 0 0 0
      0 0 0         1 0 0         1 1 0         1 1 1

For a hexagonally arranged cortical sheet why is this not instead true?

  0 = 1 1 0     1 = 1 1 1     2 = 0 1 1     3 = 0 0 1     4 = 0 0 0     5 = 1 0 0
      0 0 1         0 0 0         1 0 0         1 1 0         1 1 1         0 1 1

Including wrap-around bits maintains 6 unique states. It’s how I have been modeling possible network behavior from hexagonally connected neighbors.

The often square 2 axis arrays of HTM are still a mystery to me. It seems more biologically accurate to always maintain the 3 axis network geometry. In that case there are six possible states in your examples. The neighbor to neighbor signal bits are then circularly arranged, and will wrap-around as they would when a 6 bit binary counter is “rotated” left or right.
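To make the comparison concrete, here is a minimal sketch of the two schemes discussed above. It assumes the 2x3 grids are read as one 6-bit string (top row first, then bottom row) with a contiguous run of three active bits; the function names are made up for illustration and do not come from the report.

```python
def sliding_window_states(n_bits=6, n_active=3, wrap=False):
    """Return every placement of a contiguous run of active bits.

    Without wrap-around the run must fit inside the bit string, which
    gives n_bits - n_active + 1 states. With wrap-around the run may
    start at any position, which gives n_bits states.
    """
    n_starts = n_bits if wrap else n_bits - n_active + 1
    states = []
    for start in range(n_starts):
        bits = [0] * n_bits
        for offset in range(n_active):
            # The modulo only has an effect when wrap=True.
            bits[(start + offset) % n_bits] = 1
        states.append(tuple(bits))
    return states


def as_two_rows(bits):
    """Split a flat bit tuple into (top row, bottom row), as drawn above."""
    half = len(bits) // 2
    return bits[:half], bits[half:]


no_wrap = sliding_window_states(wrap=False)
with_wrap = sliding_window_states(wrap=True)

print(len(no_wrap), "states without wrap-around")
print(len(with_wrap), "states with wrap-around")
for state in with_wrap:
    top, bottom = as_two_rows(state)
    print(top, bottom)
```

Under these assumptions the linear scheme yields four states and the circular one yields six, and the two extra states are the ones where the run of active bits crosses the end of the string.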

Bitking · April 21, 2018, 6:04pm

I am curious to see how this deals with distributed representation?

Gary_Gaulin · April 21, 2018, 7:12pm

It sounds like that question might be for me. I thought about the kind of representation each column would have, and ironically it’s the ideal thing for extracting information based on angular motion. With a view as narrow as looking through a straw, a bit pattern can be seen “rotating” around itself, as in the example I gave where bits periodically wrap around:

When there is a small amount of rotational “jitter” coming to a rest:

There are still predictions being made. The difference is that each column is able to predict when it’s at the center of a rotating pattern, or going off to a side where the pattern widens out and then goes from 000000 to 111111 at the same frequency and bit pattern as before. From what is moving in the picture and its relative location in it, column cells gain an overall view indicating size, spin rate, and everything else there are motion-related words for, including “bounce”.

bkutt · April 22, 2018, 4:35pm

Certainly, there’s nothing stopping you from including wrap-around in this situation to bring the number of represented states to 6. That example was meant solely to illustrate a concept, not to suggest an encoding scheme. I should actually add a note that no wrap-around is assumed.

bkutt · April 22, 2018, 4:49pm

I found meaningful results only after setting up a topological organization of the columns’ potential synapses. When each column’s potential pool of connections spanned the entire input space, the results were junk. In the input to the spatial pooler (the binary pixel data), the bits have relative spatial semantics. Contrast that with something like a random distributed scalar encoder, where input states are randomly distributed across the whole space. The encoder of course does not have to be designed this way; one could imagine applying the same deterministic scrambling of bit locations to every state to distribute patterns across the whole space. In short, though, when the encoder output embeds this kind of relative spatial semantics in the bits and their locations in the input space, topologically organized columns appear to do a much better job of making sense of it.
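As a rough illustration of the distinction, the sketch below builds a column’s potential pool either locally (topological) or over the whole input, and shows a fixed scrambling of bit locations. It is not the code from the report: it assumes a 1-D input for brevity, and `potential_pool`, `scramble`, `radius` and `pct` are hypothetical names, loosely analogous to a potential radius and potential percentage.

```python
import random


def potential_pool(col, n_cols, n_inputs, radius=None, pct=0.5, seed=0):
    """Return the sorted input indices that column `col` may connect to.

    radius=None -> global pool: candidates span the entire input space.
    radius=r    -> topological pool: candidates lie within r bits of the
                   column's center, mapped proportionally onto the input.
    """
    rng = random.Random(seed * 1_000_003 + col)  # per-column, repeatable
    if radius is None:
        candidates = list(range(n_inputs))
    else:
        center = int((col + 0.5) * n_inputs / n_cols)
        lo = max(0, center - radius)
        hi = min(n_inputs - 1, center + radius)
        candidates = list(range(lo, hi + 1))
    k = max(1, int(round(pct * len(candidates))))
    return sorted(rng.sample(candidates, k))


def scramble(bits, seed=42):
    """Apply one fixed permutation of bit locations to an encoded state.

    The same seed gives the same permutation for every state, so the
    mapping is deterministic but spatial neighborhoods are destroyed.
    """
    order = list(range(len(bits)))
    random.Random(seed).shuffle(order)
    return [bits[i] for i in order]


local_pool = potential_pool(col=8, n_cols=16, n_inputs=64, radius=4)
global_pool = potential_pool(col=8, n_cols=16, n_inputs=64, radius=None)
print("local :", local_pool)
print("global:", global_pool)

row = [0] * 28 + [1] * 8 + [0] * 28  # a contiguous blob of "on" pixels
print("scrambled:", scramble(row))
```

With the local pool, neighboring columns sample overlapping neighborhoods of the image, so nearby pixels stay related in the column space. The global pool and the scrambled input both discard that relationship.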

bkutt · April 22, 2018, 4:51pm

No demo video (although it’d be cool to make one) but I can share the code with you if you like.

thanh-binh.to · April 22, 2018, 8:10pm

@bkutt Fine, please share your code with me. I want to test it with the virtual retina. Thx

Gary_Gaulin · April 23, 2018, 2:46am

Thanks for the detail. Your explanation is so close to what I use hexagonally and circularly that I was not sure whether your example extends to all six possible rotational states for a pattern of any bit width. In what I’m used to, the process of finding a center angle sparsifies the information from the original signal pattern of 2 or more bits down to, normally, one central bit or a position in between. The exception is that occasionally more than one direction is possible, depending on current heading. When I saw three bits in a row, the two sides were immediately redundant. It’s then in a form that takes it to six possible one-bit states anyway, which you started with.

You’ll maybe be thankful to know that my having to ask the question is related to a serious limitation of the models being simulated on the Emergent platform, where the 2-axis squareness and oversimplification show most. It came up in a Reddit thread where I presented a new paper that might be worth trying to model from, as a “put this into layman’s terms” challenge for all who are willing. This exactly:

https://www.sciencedirect.com/science/article/pii/S1074742717301971

To explain how the model I have works, there is a link back to my Numenta forum topic and wherever that has since led. It could otherwise seem like the only path to a detailed neurological model is through Emergent, when Numenta is so on a mission to put the whole cortical sheet together as one fully connected model:

https://www.reddit.com/r/neuro/comments/8dxdrf/would_it_be_possible_for_someone_to_put_this_into/

I’m not sure whether I should start a new topic for this or not, but all help making sense of the confusing-looking connection diagram below is appreciated. It seems to be an HTM-friendly next step towards a model for the entire cortical sheet, where everything may look puzzling right now but will make sense once we better understand how the system works all together. Or I hope so.

scott · April 23, 2018, 8:10pm

Very nice work @bkutt! Would be great to see larger scale experiments now that you have the framework in place. Also, I’m curious if you can keep one-shot learning and still get good performance on bigger datasets.

bkutt · April 27, 2018, 5:21pm

Definitely…once I develop a more sophisticated and biologically plausible visual encoding scheme, I want to bump up the breadth of testing. I’ll certainly keep the community updated as my research progresses.

iliabautista · July 2, 2018, 8:42pm

Hi Brody,

I was trying to download the PDF from the link: http://brodykutt.com/Brody_Kutt_CSCI-731_Project
Unfortunately, that link is not working properly. Is there any way I could get a copy of the document?

Thanks
Ilia

bkutt · July 4, 2018, 3:47am

Right! Sorry for not updating this. That old link is dead. You can now find the report along with the code here.

iliabautista · July 18, 2018, 8:38pm

Hi Brody,

I appreciate your help. I am looking forward to reading the paper(s) that come out of your thesis: http://scholarworks.rit.edu/theses/9797/

Best Regards,
Ilia
