Somewhat surprised that this topic hasn’t surfaced here, especially given Pylyshyn’s descriptive analogy:
Imagine, he proposes, placing your fingers on five separate objects in a scene. As those objects move about, your fingers stay in respective contact with each of them, allowing you to continually track their whereabouts and positions relative to one another. While you may not be able to discern in this way any detailed information about the items themselves, the presence of your fingers provides a reference via which you can access such information at any time, without having to relocate the objects within the scene. Furthermore, the objects’ continuity over time is inherently maintained: you know the object referenced by your pinky finger at time t is the same object as that referenced by your pinky at t−1, regardless of any spatial transformations it has undergone, because your finger has remained in continuous contact with it.
Visual indexing theory holds that the visual perceptual system works in an analogous way. FINSTs behave like the fingers in the above scenario, pointing to and tracking the location of various objects in visual space. Like fingers, FINSTs are:
Plural. Multiple objects can be independently indexed and tracked by individual FINSTs simultaneously.
Adhesive. As indexed objects move around in the visual scene, their FINSTs move with them.
Opaque to the features of the objects they index. FINSTs reference objects according to their location only. No additional information about their referents is conveyed via the FINST mechanism itself.
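To make those three properties concrete, here is a minimal toy sketch in Python. It is not from Pylyshyn or from any published implementation: the `FinstPool` class, its method names, and the capacity of five (borrowed from the five fingers in the analogy) are all hypothetical, and greedy nearest-neighbour matching is just a simple stand-in for whatever mechanism keeps a real FINST attached to its object. It makes no attempt to handle occlusion.

```python
import math


class FinstPool:
    """Toy pool of visual indexes. Each index stores a location and nothing else."""

    def __init__(self, capacity=5):
        # Assumed limit, taken from the five-finger analogy (hypothetical value).
        self.capacity = capacity
        # index id -> (x, y). No colour, shape, or identity is stored: "opaque".
        self.locations = {}

    def assign(self, position):
        """Bind a free index to whatever is at `position`.

        Returns the index id, or None if every index is already in use.
        """
        for index_id in range(self.capacity):
            if index_id not in self.locations:
                self.locations[index_id] = position
                return index_id
        return None

    def release(self, index_id):
        """Free an index so it can be bound to something else."""
        self.locations.pop(index_id, None)

    def update(self, detections):
        """Move each index to the nearest unclaimed detection ("adhesive").

        `detections` is an unordered list of (x, y) points with no identities.
        """
        unclaimed = list(detections)
        for index_id, old in list(self.locations.items()):
            if not unclaimed:
                break  # nothing left to stick to; keep the last known location
            nearest = min(unclaimed, key=lambda p: math.dist(old, p))
            unclaimed.remove(nearest)
            self.locations[index_id] = nearest


pool = FinstPool()
a = pool.assign((0.0, 0.0))   # "plural": several indexes held at once
b = pool.assign((10.0, 0.0))
pool.update([(9.0, 1.0), (1.0, 0.5)])  # next frame, detections in arbitrary order
print(pool.locations[a], pool.locations[b])  # (1.0, 0.5) (9.0, 1.0)
```

The point of the sketch is only that continuity falls out of the bookkeeping: index `a` refers to the same thing before and after the update because the index followed it, not because anything about the object was recognized.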
I see this as plausible subcortical early visual processing: part of the mechanism that crudely parses the visual field and forces the frontal eye fields to direct gaze at things so the cortex can process them.
This is interesting, thanks.
It might well be a way the current mental context is kept active.
I recall a presentation in which a toddler playing with stacking cubes reached back without looking and picked up another cube that he wanted to put on top.
Maybe that’s how the environment map is made and tracked internally: by keeping a virtual finger in “contact” with objects of interest.
Looking at a sensory homunculus, the hand gets the lion’s share of the cortex. I think the whole Numenta ‘coffee cup’ thing is readily explained via FINST theory.
After a quick look I did not see an explanation of how FINST is realized. Are there working systems implementing FINST? It seems more like a hypothesis than an explanation. For example, how does it explain the tracking of objects that are occluded?
Good points, but the theory shows a correlation between the somatosensory system and visual objects; i.e., the ‘Fingers of Instantiation’ tying into a world model, which would accommodate occlusion. I don’t know if they ever coded anything.