HTM Hackers' Hangout - Jul 6, 2018

Back at the normal time, 3PM Pacific.


  • Sneak peek at some upcoming HTM animations
  • Numenta podcast
  • Papers dropping this fall
  • Several speaking events this fall
  • (your topic here, comment below)

HTM Hackers’ Hangout is a live monthly Google Hangout held for our online community. Anyone is free to join in the discussion either by connecting directly to the hangout or commenting on the YouTube video during the live stream.

If you have something specific you’d like to discuss, or if you just want to learn more about the HTM Community, please join HTM Forum, where we have active discussions about HTM theory, research, implementations, and applications.

I will post a link to join the hangout immediately before it starts.


Join here (or watch live on YouTube above).


I recall in a previous video, Jeff mentioned that if a font were physically small enough, then the lowest level could recognize and understand a word. I think the idea you are trying to convey is that levels in a hierarchy are better tuned to recognize objects of different scales.

From that perspective, I wonder if it would be possible to show three cups at different scales, showing that levels higher in the hierarchy are able to see a large scale object in the same way that levels lower in the hierarchy are able to see the same object at a smaller scale.

Of course, the visualization would become rather busy at that point, so might not work.

BTW, I have to say that I’m not entirely convinced by this view of hierarchy. I look forward to further details in the future to help clarify the concept in my mind. The mechanism by which each hierarchical level connects to the next (which will involve working out how representations encoding the proper semantics form in the SMI output layer, for example) should help to explain things.


Does a column form connections everywhere on the input (1) or does it form connections to a specific space/location of the input (2)?

Example 1:

Example 2:
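In code, the distinction I’m asking about might look like this (a minimal Python sketch with a 1-D input; the function names, sizes, and radius are just assumptions for illustration):

```python
import random

INPUT_SIZE = 100  # hypothetical 1-D sensor input

def global_potential_pool(n_synapses=20, seed=0):
    """Example 1: the column may form connections anywhere on the input."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(INPUT_SIZE), n_synapses))

def local_potential_pool(center, radius=10, n_synapses=20, seed=0):
    """Example 2: the column connects only within a neighborhood of a
    specific location on the input."""
    rng = random.Random(seed)
    lo, hi = max(0, center - radius), min(INPUT_SIZE, center + radius + 1)
    return sorted(rng.sample(range(lo, hi), min(n_synapses, hi - lo)))

# A column "centered" over input index 50:
print(global_potential_pool())          # indices can fall anywhere in 0..99
print(local_potential_pool(center=50))  # indices stay within 40..60
```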


It is the second. The final animation will also show how information flows.

One possible confusion with the animation is it might seem like each cortical column corresponds to one section of the cup rather than one section of the space on the sensor.


The main problem I have with understanding this view of hierarchy is that it seems to contradict other theories in Numenta’s current SMI research. Recent diagrams show layer connections like this:


In particular, the output layer (L2/3 in above diagram) is designed to form representations of objects:


Now since the output layer becomes the input for the next level in the hierarchy (please correct me if this is no longer the current theory), then that means the higher hierarchical level is modeling at a higher level of abstraction (not simply at a larger physical scale). This is more in alignment with traditional views of hierarchy (where the higher you go in the hierarchy, the higher the level of abstraction).

I feel like in order to have a purely scale-focused hierarchy, the output layer should not be forming unions of feature/locations into objects, but should instead be performing a simpler scaling operation. For example, this might mean using a smaller number of neurons, each of which sample from more than one of the input neurons in the input layer, and activate when above a certain threshold. I would expect the representations formed in the output layer not to represent an object, but rather to represent a down-scaled copy of the input layer.
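To make the contrast concrete, here is a toy Python sketch of the kind of “scaling” output layer I mean — this is not Numenta’s output layer, and the pool size and threshold are made-up numbers:

```python
import numpy as np

def downscale_layer(input_bits, pool_size=4, threshold=2):
    """Each output neuron samples a contiguous group of input neurons and
    activates when enough of them are active -- producing a down-scaled
    copy of the input activity rather than an object representation."""
    x = np.asarray(input_bits).reshape(-1, pool_size)
    return (x.sum(axis=1) >= threshold).astype(int)

layer_in = np.array([1, 1, 0, 0,  0, 1, 0, 0,  1, 0, 1, 1,  0, 0, 0, 0])
print(downscale_layer(layer_in))  # -> [1 0 1 0], a coarser copy of the input
```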

Hopefully this doesn’t come off as argumentative. I am mainly just trying to understand more of the mechanics that enable a hierarchy of this nature. Reason being that this has some rather large implications on even the most basic implementation details.


Talking about this some more with @gmirey, I’m thinking that I have misinterpreted the visualization (which is of course the reason you showed it). In its current form, the point that is communicated appears to be all about scale. But I believe scale is not the only thing you are trying to communicate. Sure the lowest levels are looking at edges, etc, while higher levels are looking at something which covers a larger physical area, but the latter is more than just scale (it also encompasses abstractions).

Correct me if I am wrong, but I think the point you are trying to communicate is that even the lowest levels of a hierarchy are able to model abstract concepts. In a traditional view of hierarchy, you start with edges, then roll those up to shapes, and so-on, until you eventually reach a level that understands cups.

The difference with Numenta’s view on hierarchy (and I think the point you want to somehow capture in your visualization), is that although those lowest levels are looking at edges – they are representing them in the context of cups and other high levels of abstraction even without the benefit of a hierarchy.

Please correct me if I have misinterpreted the point that you are hoping to communicate. If not, I’ll do some thinking to see how your visualization might be updated to help capture this point. I suspect that I’m not the only one who will look at the current visualization and immediately be drawn to the physical scale aspect and miss the point about abstractions in lower levels.


Some day I hope to join the live hangout; it’s an awkward time in my current schedule with the TZ difference.

My feedback is that “classic” hierarchy made almost immediate intuitive sense to me, but this new model has taken quite some digesting and I’m still not quite there yet.

I can relate to @Paul_Lamb’s confusion at whether the hierarchy is giving you object/concept abstraction or scale abstraction, and very much this comment:

Some clarity on this would help me too. Having lower parts of the hierarchy, which I’m used to thinking of as representing reusable common attributes like edges, textures, smells, etc., suddenly containing full objects feels counterintuitive, like some of the engineering elegance is being taken away :smiley:

I’m sure you don’t want to make significant changes to the animation at this point, but what if it could use some hierarchical concepts like in the original grid cell examples (can of coke I think?), and maybe a different coffee cup, just to show where these diverge from each other in the hierarchy.

To clarify something @subutai mentioned to me offline, it is not really the 2nd picture. The red lines are misleading; that is not exactly how I was going to show this. So stand by and don’t assume anything yet about how a column is connected to feed-forward sensory input from these pics.

Bingo. That is the primary point of this animation. Your feedback is great, makes me realize we should describe this phenomenon without mentioning scale at all. That was quite a red herring. Good input!


Nothing is off the table. My main goal is to communicate the theory.

Hey, don’t forget that every few hangouts I schedule one 6 hours earlier so folks outside the US have a better chance to join. Is 3PM GMT a good time for you @jimmyw?

Main Suggestions

Here is my takeaway from this topic. Not sure I can do all these, but here is what I’ve gathered so far:

  1. don’t mention scale, focus explanation on abstract object recognition in all regions
  2. don’t focus on one object, explain how many objects are being matched simultaneously
  3. describe how regions get feedforward data from not just sensor array, but other regions below it in the hierarchy
  4. somehow show that lower-region columns never see a whole cup; they recognize a whole cup while seeing only a small detail of it
  5. make it clear that each column sees a portion of the sensor array, not a static portion of the object

Thank you all. I’m still accepting feedback, and I may post a later version of the video for more.


Yup, maybe mention that (if the described hierarchy starts at V1), the column in V1 still sees an edge at that location, but that you’re proposing that it also recognizes it as being the edge “of that whole cup” if the cup is a familiar object.

I see it as a phenomenon quite similar to the TM, where cells recognize ‘C’ but also know that it’s the ‘C’ in ‘C after B after A’.
Unless I’m totally off.
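As a toy Python sketch of that idea (the class and cell-assignment scheme are invented for illustration — nothing like real TM internals):

```python
class ContextCells:
    """Assigns a distinct cell (within a column) to each context in which
    an input has been seen, so the same input gets different
    representations in different contexts."""
    def __init__(self, cells_per_column=8):
        self.cells_per_column = cells_per_column
        self.assignments = {}  # (input, context) -> cell index

    def represent(self, letter, history):
        key = (letter, tuple(history[-2:]))  # last two steps of context
        if key not in self.assignments:
            n_used = len([k for k in self.assignments if k[0] == letter])
            self.assignments[key] = n_used % self.cells_per_column
        return (letter, self.assignments[key])

tm = ContextCells()
c1 = tm.represent("C", ["A", "B"])
c2 = tm.represent("C", ["X", "Y"])
print(c1[0] == c2[0])  # True: same column, same feedforward input 'C'
print(c1[1] != c2[1])  # True: different cells, different contexts
```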

Cheers for all the videos and animations and tutorial efforts.


It’s slightly easier, makes it Saturday lunchtime instead of Saturday morning, but Saturdays are generally tied up with parenting duties and sport. Evenings are the perfect time for me, which is the very early hours of the morning in PDT so not realistic. Both of the current timeslots will work a lot better for me in a couple of months when the soccer season ends.


Why is it so important to HTM theory that a cup is recognized everywhere in the brain at once?

What is the driving force to diverge from virtually every other model of perception and push object recognition all the way to columns in V1?
And to virtually every sensory column in the brain. How do the columns in the small of my back learn all the coffee cups and all the other things I can recognize?

If I am strongly right handed and use it for everything does that mean the columns that are connected to my left hand won’t recognize all the things my right hand has learned?


I think I know what he will say, but I will ask Jeff this tomorrow morning and get back to you.


High-speed data buses in modern computers, for accessing memory, work at gigahertz, or at many megahertz, and access only a few locations per read/write cycle.
The brain does not do multiplexing that well.
I would think a brain with such a slow data bus would have to be highly parallelized: copies of sensory data would be sent to many running programs in the brain.

Then there would have to be a program that selects which of those programs has the highest chance of generating the right output.

Like a caching / boosting algorithm.

On the plus side: every connection is permanent and direct. The transfer time for any given transfer is as fast as one neuron can signal another.
It is a well-known fact that the brain goes from perception to output in as few as one hundred cell-to-cell communication steps.


A couple of observations which might help with understanding this view of hierarchy (or it might help me learn where there are still gaps in my own understanding :wink:).

The first observation is that a single HTM region is by itself a 2-level hierarchy. The representations in the output layer are more abstract than the representations in the input layer. This is the first piece of architecture which allows the lowest level to form abstractions.

Imagine a hierarchy of three regions separated by Spatial Poolers. For the sake of simplicity, I’ll depict each region as two layers, input and output (obviously HTM theory currently has more layers involved in SMI, but these are the two that matter conceptually for communicating my point).


You can see that the transition from the input layer to the output layer within the same region is actually the logical boundary between hierarchical levels (not the transition from the output layer of one region to the input layer of another region). The SP algorithm is designed to fix sparsity while preserving semantics, not to increase abstraction.
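That SP property (fixed sparsity, preserved semantics) can be sketched in a few lines of Python — a toy model of the SP step, not the actual algorithm; the matrix sizes and sparsity levels are arbitrary:

```python
import numpy as np

def spatial_pool(input_bits, proj, n_active=4):
    """Toy SP step: compute each column's overlap with the input through a
    fixed binary connection matrix, then keep a fixed number of winners
    (fixed sparsity). Similar inputs drive overlapping winner sets, so
    semantics are preserved."""
    overlaps = proj @ np.asarray(input_bits)
    winners = np.argsort(overlaps)[-n_active:]
    out = np.zeros(proj.shape[0], dtype=int)
    out[winners] = 1
    return out

rng = np.random.default_rng(0)
proj = (rng.random((32, 64)) < 0.2).astype(int)  # 32 columns over 64 inputs
a = (rng.random(64) < 0.3).astype(int)
b = a.copy()
b[:4] ^= 1                                       # a slightly different input
sa, sb = spatial_pool(a, proj), spatial_pool(b, proj)
print(sa.sum(), sb.sum())   # both 4: output sparsity is fixed...
print((sa & sb).sum())      # ...and similar inputs share active columns
```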

The second observation is that the representations in each output layer (depending on the temporal pooling algorithm chosen) should be able to integrate more details into fewer consolidated representations the more frequently a particular pattern is encountered. This means, when a new complex concept is encountered, it might initially require hooking into more levels of hierarchy to reach a single stable representation for the overall concept. As the object is more frequently encountered, the abstraction should be able to push further down the hierarchy.

If you apply this to sequence memory and pooling (the same concept would apply to objects, but I’m lazy and it is easier to depict sequences), initially a particular sequence might require three hierarchical levels to form a single abstract representation. As it is encountered more frequently, some of the lower abstractions would start to merge, and may only require two hierarchical levels. Even more frequent encounters might push the abstraction down to the lowest level.
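The merging intuition can be illustrated with a toy Python sketch — this is not an HTM algorithm, just frequency-driven chunking of co-occurring lower-level symbols:

```python
from collections import Counter

def merge_frequent_pairs(sequence, min_count=2):
    """Once a pair of lower-level representations co-occurs often enough,
    replace it with a single consolidated symbol -- so one fewer level of
    pooling is needed to reach a stable representation."""
    pairs = Counter(zip(sequence, sequence[1:]))
    (a, b), count = pairs.most_common(1)[0]
    if count < min_count:
        return sequence
    merged, i = [], 0
    while i < len(sequence):
        if i + 1 < len(sequence) and (sequence[i], sequence[i + 1]) == (a, b):
            merged.append(a + b)  # consolidated representation
            i += 2
        else:
            merged.append(sequence[i])
            i += 1
    return merged

seq = ["A", "B", "C", "A", "B", "D", "A", "B", "C"]
print(merge_frequent_pairs(seq))  # -> ['AB', 'C', 'AB', 'D', 'AB', 'C']
```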


Combining these two points, one can theorize that the lowest levels should be capable of recognizing complex abstractions for familiar things which are encountered frequently.