Project : Full-layer V1 using HTM insights


So, I can’t go to sleep for a few hours without you bringing the party over here, is what you are saying ?
17 messages, really ? :grin:

Nah… great conversation guys :slight_smile:
Now I have to decide whether I shall be nitpicky or not :dark_sunglasses:

Lemme see…

I think what @spin’s talking about with respect to eye movements is that ANNs used for vision seem perfectly capable of working without them for recognition. I’d agree with him that the chemistry of receptors adapting to the current illumination level is not necessarily something the camera of a T800 would have to replicate.
Yet, until we know otherwise, it is not necessarily something we can leave out either, in a model trying to get a clue how it actually works.
Although, up to a point, we ourselves are perfectly able to massively parallel our way through image recognition without a saccade. A literate reader does not scan through letters and certainly does not saccade across every bar and loop of the calligraphy. Whole words are seemingly photographically recognized, per fixation.

@rhyolight, HTM 2016 ain’t much concerned about saccades either. What matters to the temporal thing of the model is that the input flows. You could as well fixate at a TV screen or through a train’s window. Now I don’t know of Numenta’s more recent SMI insights. I hope you guys tell us about them very soon.

I’m not necessarily aware of each and every paper on this as of 2018. I’m not sure we could say we know nothing of those, though.

I’m tempted to reply to that with two, orthogonal, yesbuts.
Yes, I think it is fair to say that the response of V1 simple cells is very specific and covers only a tiny subset of all possible mathematical functions we could dream of applying to a local area of an image, but…

  • But that’s also quite precisely the reason why they are worth studying. Why did they wire to this particular choice of function? And how is it relevant to an AGI substrate that they do so?
  • But again, simple cells are not the only cells in V1. You’ll find many others concerned with other aspects. Still a subset of all possible functions, hence still worth exploring how and why, though.

They marked those out based on rough limits for their responses to a stimulus, before testing inside those rough outlines with specific stimuli.
If you look at the following video from the timepoint I linked to (tried to… or about 18 minutes in, if that did not work)… I believe the insert on the bottom-right is a video archive of their actual protocol back in the day.

I think you’d agree that a convolution kernel being applied to a local square does not prevent its sensitivity to any specific subset of shapes within that square either.
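To make that point concrete, here is a minimal sketch, not anyone's actual model: a Gabor-like kernel (a cosine grating under a Gaussian envelope, a common textbook stand-in for a simple-cell receptive field) applied to a square patch. The function names, the 9x9 patch size and the parameter values are all assumptions picked for illustration.

```python
import math

def gabor_kernel(size=9, theta=0.0, wavelength=4.0, sigma=2.0):
    """Square kernel: cosine grating under a Gaussian envelope (even-phase Gabor)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Coordinate along the direction in which the grating oscillates
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel

def bar_patch(size=9, vertical=True):
    """Patch holding a one-pixel-wide bright bar through the centre."""
    half = size // 2
    patch = [[0.0] * size for _ in range(size)]
    for i in range(size):
        if vertical:
            patch[i][half] = 1.0   # same column on every row
        else:
            patch[half][i] = 1.0   # same row on every column
    return patch

def response(kernel, patch):
    """Dot product of kernel and patch, i.e. one position of a convolution."""
    return sum(k * p
               for k_row, p_row in zip(kernel, patch)
               for k, p in zip(k_row, p_row))

# theta = 0 makes the grating oscillate along x, so the kernel prefers vertical bars
kernel = gabor_kernel(theta=0.0)
r_vertical = response(kernel, bar_patch(vertical=True))
r_horizontal = response(kernel, bar_patch(vertical=False))
print(f"vertical bar: {r_vertical:.3f}   horizontal bar: {r_horizontal:.3f}")
```

Both bars sit inside the very same square, yet the kernel responds far more strongly to the one matching its orientation.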

Today, for more detailed outlines, we look at the wiring. The relationship between V1 dendrites and LGN axons makes inferring a precise receptive field in visual space quite a complicated matter, though.

Anyway. I expect that the wiring range is quite standard throughout V1 topology itself. But the fact that V1 retinotopy is distorted (giving much more space to the fovea than to the periphery) means that a given wiring range translates to very different sizes in visual space depending on eccentricity, as @Bitking pointed out.
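A rough back-of-the-envelope sketch of that effect, assuming an inverse-linear cortical magnification M(E) = A / (E + E2) in mm of cortex per degree of visual angle. The constants below (A = 17.3 mm, E2 = 0.75°) are ballpark figures for human V1 used purely as assumptions, and the 1 mm wiring range is made up for the example.

```python
def magnification_mm_per_deg(eccentricity_deg, a_mm=17.3, e2_deg=0.75):
    """Assumed inverse-linear cortical magnification, in mm of cortex per degree."""
    return a_mm / (eccentricity_deg + e2_deg)

def visual_extent_deg(wiring_range_mm, eccentricity_deg):
    """Visual angle covered by a fixed cortical distance at a given eccentricity."""
    return wiring_range_mm / magnification_mm_per_deg(eccentricity_deg)

# Same (hypothetical) 1 mm wiring range, very different footprint in visual space
for ecc in (0.0, 2.0, 10.0, 40.0):
    extent = visual_extent_deg(1.0, ecc)
    print(f"eccentricity {ecc:5.1f} deg -> {extent:.3f} deg of visual field")
```

Under these assumed numbers the same cortical distance covers a visual angle more than ten times larger at 10° eccentricity than at the centre of the fovea.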

Studying V1 could give us clues into that black box also. At least I hope it will: LGN and V1 form quite intricate loops. I’m always happy to call @Bitking to the rescue on many of those questions.
I don’t believe cortex and those older parts would work the same at all. Retina to thalamus to V1 looks like a very delicate clockwork, designed by ‘evolution’. And they provide carefully tuned filters, surely very concretely helpful for processing visual data (center-surround, etc). But once in V1 we’re dealing with our elusive universal learner… quite a different matter.


As for whether those structures have been tweaked since their inception prior to the development of a pallium… my guess is yes. Possibly heavily. It happens we have subtle differences in vision-related circuitry in thalamus and midbrain from one species to the next, even among mammals.

To me, the obvious explanation for this is that optimizing visual processing at the level of specialized networks is quite doable/probable/common… but that optimizing the wonderfully weird cortical substrate as a whole, without breaking something, is a much harder proposition than most.

Had to be done a few times still, though… birds don’t have the 6-layered version of it that we mammals share, but they have something covering their older parts too… and they’re not all that dumb.

I’d expect the last change to the intrinsics of the cortical fabric itself happened for us between 120 and 320 Mya… So… not exactly your everyday mutation.

And if “Gaia” herself did not find a way out of it in more than a hundred million years, we’d better pay attention to the details of how to replicate it. Even to those insignificant-looking V1 edge detectors.

…as well as to the lizardy things coupled below, if only to appease @Bitking’s wrath.


This is true.


What I was saying is that we currently only have a rough road sketched out ahead for SMI. From what I gathered reading a recent contribution from Jeff himself, your research teams are currently busy dealing with the deeper layers in V1.

Which is quite exciting for me and I’m quite impatient to get a shot at studying those details.


A recent attempt at making sense of V1, using computational models which try to stay true to current biological knowledge.

direct pdf here:

hey @Bitking, pretty griddy, their L2/3 activation maps at times ^^


The integration of visual and target signals in V4 and IT during visual object search

Going a bit upstream from the V1 area.


This is interesting: top-down modulation is inhibitory here, whereas it is excitatory in the current top layer of SMI research. Interestingly, L2/3 has short-range inhibitory connections, implicating cell grids as you pointed out.


It seems that there is a lot going on in V1.
High-dimensional geometry of population responses in visual cortex


Modeling the Visual System


Thanks for the links, @Bitking :slight_smile:

On my side, not much to report, except… I’m just starting to implement, as tightly packed as possible, my foreseen 16b-indexed/4b-valued synaptic scheme; trying to keep the thing compatible with both HTM and the possibly diverse V1 designs (Leabra & stuff).

and leaving the door open for some Calvin :dark_sunglasses:
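For anyone wondering what "16b-indexed/4b-valued" could look like once packed, here is a minimal sketch of one possible layout, not necessarily the one I will end up with: each synapse takes 20 bits (a 16-bit presynaptic index in the high bits, a 4-bit value in the low bits), written back to back so that two synapses fit in exactly 5 bytes. The function names, the bit order and the reading of the 4-bit value as a permanence or weight level are assumptions made for the example.

```python
def pack_synapses(synapses):
    """Pack (index, value) pairs, 20 bits each, into a little-endian bytes object."""
    bits = 0
    for i, (index, value) in enumerate(synapses):
        if not (0 <= index < (1 << 16) and 0 <= value < (1 << 4)):
            raise ValueError(f"synapse {i} out of range: {(index, value)}")
        word = (index << 4) | value      # 16-bit index above the 4-bit value
        bits |= word << (20 * i)         # synapse i starts at bit 20 * i
    n_bytes = (20 * len(synapses) + 7) // 8   # round up to whole bytes
    return bits.to_bytes(n_bytes, "little")

def unpack_synapses(data, count):
    """Inverse of pack_synapses; count is needed because of padding bits."""
    bits = int.from_bytes(data, "little")
    synapses = []
    for i in range(count):
        word = (bits >> (20 * i)) & 0xFFFFF   # isolate one 20-bit synapse
        synapses.append((word >> 4, word & 0xF))
    return synapses

segment = [(0, 0), (65535, 15), (1234, 7)]   # made-up example synapses
packed = pack_synapses(segment)
restored = unpack_synapses(packed, len(segment))
print(len(packed), "bytes for", len(segment), "synapses")
```

A real implementation would more likely work on fixed-size arrays in a lower-level language; the big-integer trick here is only meant to show the bit layout.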


Note the inference about visual cortex responses.