Can HTM explain this optical illusion?

zimt · December 24, 2017, 1:02pm

What strikes me as interesting about this optical illusion is (1) that the interpretation shifts back and forth between circles and rectangles about every 3-6 seconds and (2) that the interpretation tends to be spatially homogeneous; at least I have a hard time perceiving rectangles in some regions and circles in other regions at the same time.

Observation (1) can likely be attributed to a prior for temporal change in the human brain, which has perhaps evolved to keep brains from getting stuck in addictive simple states (the Dark Room problem) in which model-predictive rewards are large because the sensory inputs are so simple. At the same time this prior encourages exploration: it allows the brain to get out of local optima during inference (and perhaps during thinking in general).
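As a rough illustration of how a bias toward change could produce regular switching, here is a minimal Python sketch. It swaps in a simple fatigue rule (the dominant interpretation tires until the other one takes over), which is an assumption made for illustration and not something HTM specifies or that has been measured. The function names, `fatigue_rate`, `recovery_rate`, and `threshold` are all hypothetical, and the values are hand-picked so that the dwell time lands near 4 seconds, inside the 3-6 second range mentioned above.

```python
def simulate_switching(steps=200, dt=0.1, fatigue_rate=0.25,
                       recovery_rate=0.25, threshold=1.0):
    """Toy bistable percept: the dominant interpretation fatigues
    until the other one takes over. All parameters are hypothetical."""
    fatigue = {"circles": 0.0, "rectangles": 0.0}
    dominant = "circles"
    history = []
    for _ in range(steps):
        other = "rectangles" if dominant == "circles" else "circles"
        # The interpretation currently perceived accumulates fatigue ...
        fatigue[dominant] += fatigue_rate * dt
        # ... while the suppressed one recovers (never below zero).
        fatigue[other] = max(0.0, fatigue[other] - recovery_rate * dt)
        # Once fatigue crosses the threshold, perception flips.
        if fatigue[dominant] >= threshold:
            dominant = other
        history.append(dominant)
    return history


def dwell_times(history, dt):
    """Durations (in seconds) of each completed run of one interpretation.
    The final, still-running interpretation is not counted."""
    durations = []
    run = 1
    for prev, cur in zip(history, history[1:]):
        if cur == prev:
            run += 1
        else:
            durations.append(run * dt)
            run = 1
    return durations


if __name__ == "__main__":
    history = simulate_switching()
    print(dwell_times(history, dt=0.1))
```

The sketch is deterministic, so it only shows that a fatigue-like prior is enough to prevent either interpretation from staying stable; it says nothing about why the switching in real perception is irregular.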

Observation (2) seems to suggest that the brain infers two features separately: one feature for a repeating structure (in this case a regular grid), and one for the content of the repeating structure (circle vs rectangle). It seems that top-down feedback then also sharpens the features for the respective interpretation across the field of view.

I was wondering whether HTM might be able to explain these phenomena, or alternatively whether there is some way that HTM could learn them.

DellAnnaLuca · December 28, 2017, 9:44pm

A possible explanation for (2) is that the density of cones in peripheral regions is lower than in central ones; it seems plausible that peripheral regions thus have weaker bottom-up visual signals and therefore rely more on top-down predictions/expectations than on visual input to infer what lies in front of us.

DellAnnaLuca · December 28, 2017, 9:47pm

A possible explanation for (1) is that, at first, the problem-solving priority is “what does this aggregate of white, grey and black lines represent?”.
Once a solution is found (say, “circles”) the next question is: “what does the area between the circles represent?”.
The answer to the second question is: rectangles. However, this answer increases the expectation of rectangles and decreases the expectation of circles, and thus we only see rectangles.
And so on.

zimt · December 28, 2017, 11:50pm

You raise a very good point that perception in the periphery is more constructive and that the periphery makes up most of the field of view, so it is likely that the vast majority of the field of view is strongly determined by top-down feedback. But that does not explain why the top-down feedback tends to be the same everywhere.

zimt · December 29, 2017, 10:49am

As for (1), neither interpretation leaves any part of the image uninterpreted, so this seems to be an unlikely explanation.

DellAnnaLuca · December 29, 2017, 3:56pm

I find it unlikely that “I’m seeing rectangles lying on a sea of weird lines of different widths and colors” can be considered a full interpretation. Unless something completely fits a pattern we have already seen, our brain is always on the lookout for additional meaning.

DellAnnaLuca · December 29, 2017, 4:04pm

Let me give you an example of why top-down feedback tends to be the same everywhere.

You are standing in your house. Your eyes are closed. Something brushes your left foot. Your foot alone is unable to understand what touched it. With your eyes still closed, you touch this something with your left hand: it’s hairy and it moves. It’s your dog. Now something brushes your right foot.
You expect it to be your dog, don’t you? Your foot will perceive “my dog is standing against my right foot”.

Why did this happen? Because the “sensory neurons” belonging to your left foot, the ones belonging to your right foot, and the ones belonging to your hands all share the same areas at some point higher in the hierarchy. This is necessary in order for expectations that come from outside a given sense to modulate what that sense perceives (if I am in a jungle, I need to be able to know that the sound of cracking wood might be made by a lion, even if I don’t see any fur).
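The dog example can be sketched as a single shared belief that every sensor reads from and writes to. The snippet below is only a toy using Bayes’ rule: the hypotheses, the probabilities, and the names `posterior`, `brush_likelihood` and `hairy_moving_likelihood` are all made up for illustration, and nothing here is meant as how HTM or the cortex actually implements this.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a small set of hypotheses (dicts keyed by hypothesis)."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}


# Hypothetical shared belief at the top of the hierarchy, before any touch.
prior = {"dog": 0.2, "something else": 0.8}

# Assumed likelihoods: a brush on a foot says nothing by itself,
# while "hairy and moving" under the hand strongly favors the dog.
brush_likelihood = {"dog": 0.5, "something else": 0.5}
hairy_moving_likelihood = {"dog": 0.9, "something else": 0.1}

# Left foot is brushed: the shared belief does not change.
after_left_foot = posterior(prior, brush_likelihood)

# The hand explores the object: the shared belief shifts toward "dog".
after_hand = posterior(after_left_foot, hairy_moving_likelihood)

# Right foot is brushed: same uninformative input as before, but it is
# now interpreted through the updated shared belief.
after_right_foot = posterior(after_hand, brush_likelihood)

if __name__ == "__main__":
    print(after_left_foot["dog"], after_hand["dog"], after_right_foot["dog"])
```

With these made-up numbers the belief in “dog” rises from 0.2 to about 0.69 after the hand touches it, and the right foot inherits that 0.69 even though its own input is exactly as ambiguous as the left foot’s was.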

zimt · December 30, 2017, 12:30am

That matches my intuition as well. All low-level feature detectors share the same top of the hierarchy. But this feedback is not very strict. With some meditation I can force myself to briefly see a rectangle at the top left and a circle at the bottom right, so ultimately this aspect is probably not easy to explain because it can be subject to complex emergent behavior and attention.

Not sure whether recognition of spatial repetitions has something to do with object tracking, or whether it simply emerges as a pattern detector from a solution to the universal prediction task.

When I see circles, my mind comes up with a figure/ground interpretation where the foreground consists of circles with a continuous texture of vertical lines and the background consists of a continuous texture of horizontal lines. Interestingly, the figure/ground interpretation is itself bistable: the circles can either be holes or elevations. This set of interpretations explains all of the image, but it is not stable (at least for me and everyone I have shown this optical illusion to).

DellAnnaLuca · December 30, 2017, 12:27pm

You make a good point with object tracking and attention.

Let’s take the example in the image below.

Seeing the faces completely explains the image; and yet, full stability isn’t achieved.
I was therefore wrong in my previous hypothesis.

You are probably onto something with attention: when we focus on the faces we see the faces, when we focus on the chalice we see the chalice.
What makes us shift attention, though? Why can’t we focus indefinitely on an interpretation?

I would go with an evolutionary hypothesis: being able to entertain bistability, to shift to a second interpretation when a first one is already available, is one of the hallmarks of creativity. An individual who is creative clearly has an advantage over one who isn’t.

What is the underlying neurological mechanism?
It would be interesting to set up an experiment in which, somehow, the eye gaze is fixed on a point (the faces, for example). Is it necessary to wait for an involuntary eye movement, in which the focus falls on the chalice, to switch interpretation? (In that case: are saccades predictors of creativity?)

rhyolight · December 30, 2017, 3:53pm

Moved from #htm-theory:neuroscience into #other-topics:community-lounge.
