I wanted to share some HTM experiments I did for a class project this semester. In all honesty, I whipped this entire thing together in less than 2 weeks, so it's not super pretty or groundbreaking, but the results are interesting.
Those who are already comfortably familiar with HTM can skip the background section.
This is the beginning of ongoing personal research I plan to continue: specifically, the effort to design a biologically plausible encoder that encapsulates the functional properties of the human visual pathway, i.e., everything that happens before any signal carrying visual information from the retina reaches the cortex, including the functionality of the lateral geniculate nucleus in the thalamus and related structures.
Including wrap-around bits maintains 6 unique states. It’s how I have been modeling possible network behavior from hexagonally connected neighbors.
The often-square two-axis arrays of HTM are still a mystery to me. It seems more biologically accurate to always maintain the three-axis network geometry. In that case there are six possible states in your examples. The neighbor-to-neighbor signal bits are then circularly arranged, and will wrap around just as they would when a 6-bit binary counter is “rotated” left or right.
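To make the wrap-around idea concrete, here is a minimal sketch (plain Python, purely for illustration; the function name and the single-bit starting pattern are my own, not taken from any actual model) of rotating a 6-bit neighbor pattern so that bits falling off one end reappear on the other:

```python
def rotate6(bits, steps=1):
    """Rotate a 6-bit neighbor pattern with wrap-around.

    `bits` is an integer using only its low 6 bits; positive `steps`
    rotates left, negative rotates right.
    """
    steps %= 6
    mask = 0b111111
    return ((bits << steps) | (bits >> (6 - steps))) & mask

# A single active neighbor bit cycles through all six positions:
state = 0b000001
for _ in range(6):
    print(format(state, "06b"))
    state = rotate6(state)
```

Rotating left or right just changes the sign of `steps`, and after six steps the pattern is back where it started, which is where the six unique states come from.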
It sounds like that question might be for me. I thought about the kind of representation each column would have, and ironically it's the ideal thing for extracting angular-motion-based information. With a through-a-straw-sized view, a bit pattern can be seen “rotating” around itself, as in the example I gave where bits will periodically wrap around:
There are still predictions being made. The difference is that each column is able to predict when it's at the center of a rotating pattern, or off to a side where the pattern widens out and then goes from 000000 to 111111 at the same frequency and bit pattern as before. From what is moving in the picture and where each column sits relative to it, the column cells gain an overall view indicating size, spin rate, and everything else there are motion-related words for, including “bounce”.
Certainly, there’s nothing stopping you from including wrap-around in this situation to bring the number of represented states to 5. That example was meant solely to illustrate a concept, not as a suggested encoding scheme. I should actually add a note that no wrap-around is assumed.
I found meaningful results only after setting up a topological organization of each column's potential synapses. When each column's potential pool of connections spanned the entire input space, the results were junk. In the input to the spatial pooler (the binary pixel data), the bits have relative spatial semantics. Contrast that with something like a random distributed scalar encoder, where input states are randomly distributed across the whole space. The encoder of course does not have to be designed this way; one could imagine even just applying a deterministic scrambling of bit locations to every state, equally distributing the patterns across the whole space. In short, though, when the encoder output embeds this kind of relative spatial semantics in the bits and their locations in the input space, it appears topologically organized columns do a much better job of making sense of it.
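To illustrate what I mean by a topologically organized potential pool, here is a minimal NumPy sketch (the dimensions, radius, and sampling fraction are made-up numbers, not the exact parameters or code from my project). Each column only draws its potential synapses from a local patch of the input, rather than from the entire space:

```python
import numpy as np

rng = np.random.default_rng(0)

input_shape = (32, 32)      # binary pixel input (illustrative size)
column_shape = (16, 16)     # spatial pooler columns
potential_radius = 4        # how far a column "sees" into the input
potential_pct = 0.5         # fraction of the local patch actually sampled

def potential_pool(col_rc):
    """Return the input coordinates a column may connect to (local patch only)."""
    # Map the column's coordinates onto the input space.
    cy = int(col_rc[0] * input_shape[0] / column_shape[0])
    cx = int(col_rc[1] * input_shape[1] / column_shape[1])
    ys = range(max(0, cy - potential_radius), min(input_shape[0], cy + potential_radius + 1))
    xs = range(max(0, cx - potential_radius), min(input_shape[1], cx + potential_radius + 1))
    patch = [(y, x) for y in ys for x in xs]
    # Randomly sample a subset of the patch as this column's potential pool.
    k = max(1, int(potential_pct * len(patch)))
    idx = rng.choice(len(patch), size=k, replace=False)
    return [patch[i] for i in idx]

# The potential pool of the column at (0, 0) stays in the top-left corner of the input.
print(potential_pool((0, 0))[:5])
```

With something like this, nearby columns end up competing over overlapping patches of pixels, which is what lets the relative spatial semantics of the input carry through to the columns.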
Thanks for the detail. Your explanation is so close to what I use hexagonally and circularly that I was not sure whether your example extends to all six possible rotational states for any bit-width pattern. In what I'm used to, the process of finding a center angle sparses the information down to (normally) one central bit, or a position in between, from the original signal pattern of two or more bits. The exception is that occasionally more than one direction is possible, depending on the current heading. When I saw three bits in a row, the two side bits were immediately redundant. It's then in a form that reduces to the six possible one-bit states anyway, which is what you started with.
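For what it's worth, here is one way to sketch that “center angle” sparsing in code. This is just a circular-mean illustration I put together, not necessarily how it is actually done in my setup: the active bits on the six-position ring collapse to a single direction, which usually lands on one position or halfway between two.

```python
import cmath

def center_angle(bits, width=6):
    """Collapse a circular bit pattern to the direction of its center.

    `bits` is an integer whose low `width` bits mark active neighbors
    arranged on a ring. Returns the circular-mean angle of the active
    positions, in radians.
    """
    active = [i for i in range(width) if bits >> i & 1]
    if not active:
        return None
    # Sum a unit vector for each active position and take the resulting angle.
    total = sum(cmath.exp(2j * cmath.pi * i / width) for i in active)
    return cmath.phase(total) % (2 * cmath.pi)

# Three bits in a row (positions 0, 1, 2) collapse to the direction of the
# middle bit, position 1 (an angle of 2*pi/6 radians).
print(center_angle(0b000111))
```

The degenerate case where active bits sit on opposite sides of the ring (so the vectors cancel) would correspond to the situation where more than one direction is possible.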
You’ll maybe be glad to know that my having to ask the question relates to a serious limitation of the models being simulated on the Emergent platform, where the two-axis squareness and oversimplification show most clearly. It came up in a Reddit thread where I presented a new paper that might be worth trying to model, as a “put this into layman’s terms” challenge for all who are willing. This exactly:
To explain how my model works, there is a link back to my Numenta forum topic and wherever that has since led. Otherwise it could seem like the only path to a detailed neurological model is through Emergent, when Numenta is so clearly on a mission to put the whole cortical sheet together as one fully connected model:
I’m not sure whether I should start a new topic for this or not, but any help making sense of the confusing-looking connection diagram below would be appreciated. It seems to be an HTM-friendly next step towards a model of the entire cortical sheet. It may all look puzzling right now, but it should make sense once we better understand how the whole system works together. Or so I hope.
Very nice work @bkutt! Would be great to see larger scale experiments now that you have the framework in place. Also, I’m curious if you can keep one-shot learning and still get good performance on bigger datasets.
Definitely…once I develop a more sophisticated and biologically plausible visual encoding scheme, I want to bump up the breadth of testing. I’ll certainly keep the community updated as my research progresses.
I was trying to download the PDF from this link: http://brodykutt.com/Brody_Kutt_CSCI-731_Project
Unfortunately, that link is not working properly. Is there any way I could get a copy of the document?