Right, and as Jeff said in the book, the first layer deals with rapidly changing sensory data. I understand that HTM in its current implementation doesn't model some parts of the brain's architecture, but I still think this is a relevant question, even if we look at it purely from a neuroscience perspective.
Following the way examples are presented in "On Intelligence", let me try to explain:
When we look at an apple, the encoding for it in the brain is not just "apple", right? The first layer detects small features and propagates them up the hierarchy until the information reaches IT. That is where Jeff says the same pattern fires if the "apple" object appears anywhere in the visual field. So how does this initial encoding happen? How does the neocortex know, in the first place, that this set of neurons should fire when we see edges, that set when we see colors, and so on? I understand this may not even be known yet, but I'm still curious.
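For concreteness, here's a toy sketch of the standard computational stand-in for those first-layer edge detectors: Gabor filters, the textbook model of V1 simple-cell receptive fields. Everything below (the sizes, parameters, and synthetic image) is my own illustration, not anything from HTM or the book:

```python
import numpy as np

def gabor_kernel(size=9, theta=0.0, sigma=2.5, wavelength=4.0):
    """Oriented Gabor filter, a standard model of a V1 simple-cell
    receptive field (an edge/line detector at one orientation)."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates so the filter prefers orientation `theta`.
    x_r = xs * np.cos(theta) + ys * np.sin(theta)
    envelope = np.exp(-(xs**2 + ys**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * x_r / wavelength)
    return envelope * carrier

# A tiny synthetic image containing one vertical edge.
image = np.zeros((32, 32))
image[:, 16:] = 1.0

def filter_response(image, kernel):
    """Mean absolute response of the kernel swept over the image
    (a crude stand-in for a population of neurons tiling the retina)."""
    k = kernel.shape[0]
    responses = [
        np.abs(np.sum(image[i:i + k, j:j + k] * kernel))
        for i in range(image.shape[0] - k)
        for j in range(image.shape[1] - k)
    ]
    return np.mean(responses)

# Each orientation stands in for one "neuron type"; the vertical
# edge drives the vertically tuned filter (theta = 0) hardest.
for theta in (0, np.pi / 4, np.pi / 2, 3 * np.pi / 4):
    r = filter_response(image, gabor_kernel(theta=theta))
    print(f"orientation {theta:4.2f} rad -> response {r:.3f}")
```

The open question, of course, is not what these filters compute once they exist, but how the cortex ends up with them in the first place.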
If my logic is correct and we have some idea of how the brain designates neurons to process the initial sensory spikes, wouldn't we be on our way to building autoencoders that learn those encodings?
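To make that last thought concrete, here's a minimal tied-weight autoencoder in plain NumPy, trained on synthetic oriented-edge patches. The point is just that edge-like feature detectors can emerge from the statistics of the input itself, without anyone designating in advance which unit codes for what; all parameters here are my own toy choices, not part of HTM:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_edge_patch(size=8):
    """Synthetic 8x8 patch containing an edge at a random orientation,
    standing in for natural-image input to early visual cortex."""
    half = size // 2
    ys, xs = np.mgrid[-half:half, -half:half]
    theta = rng.uniform(0, np.pi)
    offset = rng.uniform(-2, 2)
    patch = (xs * np.cos(theta) + ys * np.sin(theta) > offset).astype(float)
    return patch.ravel()

# Tiny tied-weight autoencoder: x -> h = tanh(W x) -> x_hat = W.T h
n_in, n_hidden = 64, 16
W = rng.normal(0, 0.1, (n_hidden, n_in))
lr = 0.05

for step in range(5000):
    x = random_edge_patch() - 0.5        # zero-centre the input
    h = np.tanh(W @ x)                   # hidden code
    x_hat = W.T @ h                      # reconstruction
    err = x_hat - x
    # Gradient of 0.5*||err||^2 through both tied-weight paths:
    # the decoder term outer(h, err) and the encoder term via tanh.
    grad_W = np.outer((1 - h**2) * (W @ err), x) + np.outer(h, err)
    W -= lr * grad_W

# After training, each row of W is one learned feature; on edge data
# these tend toward oriented edge detectors, with no wiring diagram
# specified up front.
print("one learned feature (reshaped to 8x8, rounded):")
print(np.round(W[0].reshape(8, 8), 1))
```

This is roughly the classic sparse-coding/autoencoder story (Olshausen & Field style): expose a generic learner to natural-image statistics and V1-like oriented features fall out, which is at least one answer to how the "designation" could happen.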