As for HTM and spatial invariance: how does the brain do this?
It’s not all done in one go.
We know something about the processing done in the subcortical structures. We know of place cells, grid cells[7], head direction cells[6], border cells (more than likely whisker cell signaling in rodents), and goodness knows what else.[13][14] It is clear that the brain is forming an abstract representation of space. We even know the native data structure format: distributed grids.
Part of this system is a gyro-stabilized reference platform (the vestibular system) that is directly mapped to the eye tracking system to keep the eyes from being distracted by self-motion. (That little spot on the brain stem is the closest thing to a place where, with the correct instrumentation, you could measure the neural correlates of a sense of self.[1][2]) I believe that this is fed as a stream to the hippocampus. We know about the head direction signal. I’m sure there is much more.[3][5][10]
As you progress around the two oddly shaped hippocampal structures there is an interesting data reduction: the grid changes scale by about 1:1.4 (the square root of 2) relative to the adjacent area.[8] That means that as you sample longitudinally you get multiple scales of the same thing. If that pattern is projected to the same general area of cortex you would get a “halo” of scaled representations, all being imprinted/learned at the same time. It would have some of the properties of a scratch hologram[4]; these features are self-reinforcing for retrieval. You might call that scale invariance. I certainly do.
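To make the scale progression concrete, here is a rough Python sketch. It is only an illustration under simplifying assumptions: space is reduced to one dimension, the 40 cm starting spacing is a made-up value, and the names grid_spacings and multiscale_code are hypothetical. The only thing taken from the text above is the square-root-of-2 step between adjacent scales.

```python
import math

SCALE_RATIO = math.sqrt(2)  # ~1.4, the step between adjacent grid scales

def grid_spacings(base_spacing, n_modules, ratio=SCALE_RATIO):
    """Spacing of each successive module: base, base*ratio, base*ratio**2, ..."""
    return [base_spacing * ratio ** i for i in range(n_modules)]

def multiscale_code(position, spacings):
    """Phase (0..1) of a 1-D position within each module's repeating grid."""
    return [(position % s) / s for s in spacings]

# 40.0 is an arbitrary illustrative starting spacing, not a measured value.
spacings = grid_spacings(base_spacing=40.0, n_modules=5)

# The same position, expressed at every scale at once: the "halo".
code = multiscale_code(position=130.0, spacings=spacings)

for s, phase in zip(spacings, code):
    print(f"spacing {s:7.2f}  phase {phase:.3f}")
```

Because the step is the square root of 2, every second module doubles the spacing, so the same location is sampled at a whole family of related scales simultaneously.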
One of the virtues of a well-distributed neural implementation is the massive redundancy of processing power and data integration. The grid things are as good a distributed data format as anything I could have thought of. How exactly does that map to columns? I suspect that the column features are supportive of processing the kind of thing the hippocampus likes to signal. Look at the activation area or size of the “place cells”. Compare that to the cortical columns that connect to that area and the population of grid cells being sampled by those cells. That should give you a working approximation of the scope of the local processing that is being done to form the place recognition. What is it in the local grid representation that can communicate the idea of a place?[9] While grids provide a rich framework for this level of coding, I don’t think anyone has looked for them as an answer before:
“At the top of the cortical hierarchy, where information is combined across sensory systems, it was often no longer possible to match the firing patterns to any experimentally defined stimulus patterns” [7]
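On the question of how a local grid representation could signal a place, one toy way to picture it is to sum several periodic grid signals of different scales and look for the spot where they all line up. The sketch below is an assumption-laden illustration, not a model from any of the cited papers: it is one-dimensional, it uses cosines as stand-ins for grid cell firing, and the spacings, the 130 cm target, and the function names are all made up.

```python
import math

def grid_response(x, spacing, preferred):
    """Toy periodic 'grid' signal: peaks wherever x is in phase with preferred."""
    return math.cos(2 * math.pi * (x - preferred) / spacing)

def summed_response(x, spacings, preferred):
    """Sum of the grid signals across all scales at position x."""
    return sum(grid_response(x, s, preferred) for s in spacings)

# Hypothetical spacings stepping by sqrt(2); 40.0 is an arbitrary base value.
spacings = [40.0 * math.sqrt(2) ** i for i in range(5)]
preferred = 130  # hypothetical location that every module is in phase with

# Scan a 4 m track in 1 cm steps and find where the scales agree best.
positions = range(0, 401)
best = max(positions, key=lambda x: summed_response(x, spacings, preferred))
print(best, summed_response(best, spacings, preferred))
```

Each individual grid repeats many times along the track, but the sum only reaches its maximum where every scale agrees, which gives a single place-like bump out of purely periodic inputs.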
The hippocampus is famous for being longish, with one end next to the amygdala, which is well known for being hardwired to sense certain patterns.[11] This system also demonstrates an interesting property: the translation from a retina-centric reference to the coordinate system you perceive around you. I used to think that this was some complicated math transformation and wondered how the brain could calculate it. I see now that I was making it too hard: this is the “anchor” on the other end of the vestibular system. You could think of the hippocampus as a spatial or geometry co-processor. Call it an HTM reality encoder if it sounds better.
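For anyone who wants to see what that retina-centric to world-centric translation amounts to when written out explicitly, here is a small Python sketch. To be clear, this is the "complicated math" version I was imagining, not a claim about how the hippocampus does it; the function name, the 2-D simplification, and the idea of passing heading as a single angle are all assumptions for illustration.

```python
import math

def egocentric_to_allocentric(observer_xy, heading_rad, bearing_rad, distance):
    """Convert a target seen at (bearing, distance) relative to the observer's
    facing direction into world coordinates, given the observer's position and
    heading. The heading plays the role of the vestibular 'anchor'."""
    world_angle = heading_rad + bearing_rad
    x = observer_xy[0] + distance * math.cos(world_angle)
    y = observer_xy[1] + distance * math.sin(world_angle)
    return (x, y)

# Observer at (1, 2) facing "north" (pi/2), target straight ahead, 3 units away.
target = egocentric_to_allocentric((1.0, 2.0), math.pi / 2, 0.0, 3.0)
print(target)
```

Once a stable heading reference is available, the whole transformation reduces to adding an angle and an offset, which is why the anchor matters more than the arithmetic.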
If the amygdala is able to decode facial features out of the coded data, SDRs should be able to read it too. The level of communication at this point is actual situational recognition.[12]
Summary: If HTM/SDRs in combination with some supporting neural network types are configured in a SYSTEM that provides the functions that are known to be in the subcortical processing system, they should be able to do coordinate transformations, recognition, and scaling of a stream of objects in images. A spatial reference must be provided with the stream.
[1] https://open-mind.net/papers/vestibular-contributions-to-the-sense-of-body-self-and-others
[2] https://open-mind.net/papers/self-as-subject-and-experiential-ownership
[3] The role of vestibular and somatosensory systems in intersegmental control of upright stance - Rob Creath, Tim Kiemel, Fay Horak, and John J. Jeka
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2938746/
[4] Abrasion Holograms Frequently-Asked Questions
http://amasci.com/amateur/holohint.html
[5] How Basal Ganglia Outputs Generate Behavior
https://www.hindawi.com/journals/aneu/2014/768313/
[6] Topography of head direction cells in medial entorhinal cortex
https://www.ncbi.nlm.nih.gov/pubmed/24440398
[7] Grid Cells and Neural Coding in High-End Cortices
http://www.cell.com/neuron/abstract/S0896-6273(13)00900-8
[8] Topography of Place Maps along the CA3-to-CA2 Axis of the Hippocampus
https://www.ncbi.nlm.nih.gov/pubmed/26298277
[9] Computational Models of Grid Cells
http://www.cell.com/neuron/abstract/S0896-6273(11)00650-7
[10] Time Finds Its Place in the Hippocampus
http://www.cell.com/neuron/abstract/S0896-6273(13)00491-1
[11] The role of the amygdala in face perception and evaluation
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3294209/
[12] The Amygdala Modulates Memory Consolidation of Fear-Motivated Inhibitory Avoidance Learning
https://www.ncbi.nlm.nih.gov/pubmed/10995852
[13] Neuronal Implementation of Hippocampal-Mediated Spatial Behavior: A Comparative Evolutionary Perspective
https://www.ncbi.nlm.nih.gov/pubmed/16801684
[14] Landmark-Based Updating of the Head Direction System by Retrosplenial Cortex: A Computational Model
https://www.frontiersin.org/articles/10.3389/fncel.2018.00191/full