I love the idea, but I also think we are ready for what Jeff described last week for a cortical column.
Bitking found evidence of the required Speed variable, and we now have place, body/head-direction, border/boundary, conjunctive, speed and grid cells to work with, which greatly simplifies the challenge of programming a self-exploring agent.
To show why the Objects=Places thinking is true, while at the same time simplifying the recognition problem, a 2D cup would be drawn into a flatland world as a solid sphere (a filled circle in 2D) with a short projection out the side, which the agent senses by bumping into it. Optionally, when the cup is tilted, the top slice at the rim creates an opening where the agent can get trapped inside for a while and then feel the shape of the inside too.
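Here is a rough sketch of what I mean by sensing the cup through bumping, not a finished design. It assumes a simple grid flatland where the cup is a set of solid cells (a filled disc plus a short handle on one side). The names `make_cup` and `bump`, the grid size, and the radius are all made up for illustration.

```python
def make_cup(size=15, radius=4, handle_len=2):
    """Return the set of solid (x, y) cells: a filled disc plus a short handle.

    All sizes are illustrative assumptions, not values from the discussion.
    """
    cx = cy = size // 2
    solid = {
        (x, y)
        for x in range(size)
        for y in range(size)
        if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
    }
    # Short projection out the right-hand side, on the centre row.
    solid |= {(cx + radius + i, cy) for i in range(1, handle_len + 1)}
    return solid


def bump(solid, start, step, max_steps=100):
    """Walk from `start` in direction `step`.

    Returns the first solid cell the agent bumps into, or None if nothing
    is touched within `max_steps` moves.
    """
    x, y = start
    dx, dy = step
    for _ in range(max_steps):
        nxt = (x + dx, y + dy)
        if nxt in solid:
            return nxt
        x, y = nxt
    return None


cup = make_cup()
# Approach from the right along the handle row, then along a row two cells up.
on_handle_row = bump(cup, (14, 7), (-1, 0))
off_handle_row = bump(cup, (14, 5), (-1, 0))
```

Approaching along the handle row, the agent makes contact earlier than on a neighbouring row, and that difference in contact location is all it needs to feel the projection. The tilted cup with an opening at the rim is left out of this sketch.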
Letters of the alphabet can be used to place cups in a room, but in the virtual world all solid objects are most simply places whose surface can be physically touched, so the agent only needs one or more touch sensors. After that, a flatland view of the Numenta logo (optionally presented sequentially) can be painted onto the outer surface of a cup and onto other objects, including ones of various sizes on walls.
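As a minimal sketch of the letters-and-touch-sensors idea, assuming a text map where `#` is wall, `.` is open floor, and any letter marks where a cup sits: the map layout, the four-sensor arrangement, and the names `ROOM`, `is_solid` and `touch` are my own assumptions for illustration.

```python
# Hypothetical room layout: '#' is wall, '.' is open floor, letters place cups.
ROOM = [
    "#########",
    "#.......#",
    "#..A....#",
    "#.....B.#",
    "#.......#",
    "#########",
]


def is_solid(room, x, y):
    """Anything that is not open floor is a surface that can be touched."""
    return room[y][x] != "."


def touch(room, x, y):
    """Report which of four touch sensors are in contact at position (x, y)."""
    sensors = {"N": (0, -1), "E": (1, 0), "S": (0, 1), "W": (-1, 0)}
    return {
        name: is_solid(room, x + dx, y + dy)
        for name, (dx, dy) in sensors.items()
    }


# Standing just west of cup 'A', only the east sensor fires.
next_to_cup = touch(ROOM, 2, 2)
# In the top-left corner of the floor, the north and west sensors touch wall.
in_corner = touch(ROOM, 1, 1)
```

Note that the sensors do not report which letter was touched, only that something solid is there. A wall and a cup feel the same at a single point, so telling them apart has to come from where the contacts happen.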