Project : Full-layer V1 using HTM insights

Thank you, those are useful references and interesting thoughts.

I’m not familiar with the inner structure of the old brain; for me it’s a black box connected to the cortex. Do you think it generally uses the same principles for vision as the cortex? I mean, is it basically the previous version re-specialized to work in tandem with the cortex, or does it work completely differently?

It would be interesting, thank you in advance. Global workspace theory and the forming of a world model are what I’m deeply interested in, but I haven’t had enough time to dig in.

Yes - there are layers. There are also sizable nodes that run more like a Boltzmann-Hopfield network. Dragons be here in these poorly charted waters.


So, I can’t go to sleep for a few hours without you bringing the party over here, is what you are saying ?
17 messages, really ? :grin:

Nah… great conversation guys :slight_smile:
Now I have to decide whether I shall be nitpicky or not :dark_sunglasses:

Lemme see…

I think what @spin’s talking about with respect to eye movements is that ANNs used for vision seem perfectly capable of working without them for recognition. I’d agree with him that the chemistry of receptors adapting to the current illumination level is not necessarily something the camera of a T800 would have to replicate.
Yet, until we know otherwise, it is not necessarily something we can leave out either, in a model trying to get a clue how it actually works.
Although, up to a point, we ourselves are perfectly able to massively parallel our way through image recognition without a saccade. A literate reader does not scan through letters and certainly does not saccade across every bar and loop of the calligraphy. Whole words are seemingly photographically recognized, per fixation.

@rhyolight, HTM 2016 ain’t much concerned about saccades either. What matters to the temporal side of the model is that the input flows. You could just as well fixate on a TV screen or look through a train’s window. Now I don’t know of Numenta’s more recent SMI insights. I hope you guys tell us about them very soon.

I’m not necessarily aware of each and every paper on this as of 2018. I’m not sure we could say we know nothing of those, though.

I’m tempted to reply to that with two, orthogonal, yesbuts.
Yes, I think it is fair to say that the response of V1 simple cells is very specific and covers only a tiny subset of all possible mathematical functions we could dream of applying to a local area of an image, but…

  • But that’s also quite precisely the reason why they are worth studying. Why did they wire to this particular choice of function? And how is it relevant to an AGI substrate that they do so?
  • But again, simple cells are not the only cells in V1. You’ll find many others concerned with other aspects. Still a subset of all possible functions, hence still worth exploring how and why, though.

They marked out those based on rough limits for their responses to a stimulus, before testing inside those rough outlines with specific stimuli.
If you look at the following video from the timepoint I linked to (tried to… or about 18 minutes in, if that did not work)… I believe the insert on the bottom-right is archive footage of their actual protocol back in the day.

I think you’d agree that a convolution kernel being applied to a local square does not prevent its sensitivity to any specific subset of shapes within that square either.
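To make that point concrete, here is a minimal sketch in plain Python. It assumes a Gabor function as a stand-in for a simple-cell-like kernel (a common modelling choice, not something established in this thread). The names `gabor_kernel`, `bar_patch` and `response`, and all parameter values, are made up for illustration.

```python
import math

def gabor_kernel(size=9, wavelength=4.0, theta=0.0, sigma=2.0):
    """Hypothetical oriented kernel: Gaussian envelope times a cosine grating."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Coordinate along the direction in which the grating varies
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * xr / wavelength))
        kernel.append(row)
    # Subtract the mean so that a uniform patch produces no response
    mean = sum(map(sum, kernel)) / (size * size)
    return [[v - mean for v in row] for row in kernel]

def bar_patch(size=9, vertical=True):
    """A one-pixel-wide bright bar through the centre of a square patch."""
    half = size // 2
    return [[1.0 if (x == half if vertical else y == half) else 0.0
             for x in range(size)]
            for y in range(size)]

def response(kernel, patch):
    """Dot product of kernel and patch: one step of a convolution."""
    return sum(k * p
               for krow, prow in zip(kernel, patch)
               for k, p in zip(krow, prow))

kernel = gabor_kernel()  # theta=0: the grating varies along x, so stripes are vertical
print("vertical bar  :", response(kernel, bar_patch(vertical=True)))
print("horizontal bar:", response(kernel, bar_patch(vertical=False)))
```

Both bars sit inside the same square, yet the kernel responds far more strongly to the one matching its orientation. The square is just the support; the selectivity lives in the weights.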

Today, for more detailed outlines, we look at the wiring. The relationship between V1 dendrites and LGN axons makes inferring a precise receptive field in visual space quite a complicated matter, though.

Anyway. I expect that the wiring range is quite standard throughout the V1 topology itself. But the fact that V1 retinotopy is distorted (giving much more space to the fovea than to the periphery) means that a given wiring range translates to very different sizes in visual space depending on eccentricity, as @Bitking pointed out.
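A rough sketch of that translation, assuming the inverse-linear form M(E) = A / (E + E2) often used for cortical magnification. The constants below (A = 17.3 mm, E2 = 0.75 deg) are in the range commonly quoted for human V1 but should be treated as illustrative assumptions, and the function names are made up.

```python
def magnification_mm_per_deg(ecc_deg, a_mm=17.3, e2_deg=0.75):
    """Assumed linear cortical magnification: mm of cortex per degree of visual angle."""
    return a_mm / (ecc_deg + e2_deg)

def visual_extent_deg(cortical_range_mm, ecc_deg):
    """Visual angle covered by a fixed cortical wiring range at a given eccentricity.

    Small-range approximation: magnification is treated as constant over the range.
    """
    return cortical_range_mm / magnification_mm_per_deg(ecc_deg)

# Same hypothetical 1 mm wiring range, different eccentricities
for ecc in (0.0, 2.0, 10.0, 40.0):
    print(f"eccentricity {ecc:5.1f} deg -> {visual_extent_deg(1.0, ecc):.3f} deg of visual space")
```

Under these assumptions the same cortical distance covers a visual angle that grows roughly linearly with eccentricity, which is the effect described above.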

Studying V1 could give us clues into that black box as well. At least I hope it will: LGN and V1 form quite intricate loops. I’m always happy to call @Bitking to the rescue on many of those questions.
I don’t believe cortex and those older parts would work the same at all. Retina to thalamus to V1 looks like a very delicate clockwork, designed by ‘evolution’. And they provide carefully tuned filters, surely very concretely helpful for processing visual data (center-surround, etc). But once in V1 we’re dealing with our elusive universal learner… quite a different matter.
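For readers unfamiliar with the center-surround filters mentioned above, here is a minimal sketch using a difference of Gaussians, which is a standard textbook approximation and not a claim about the exact retinal or thalamic circuitry. Kernel size and sigmas are arbitrary illustrative values, and the function names are hypothetical.

```python
import math

def gaussian_kernel(size, sigma):
    """Square Gaussian kernel normalized to sum to 1."""
    half = size // 2
    k = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
          for x in range(-half, half + 1)]
         for y in range(-half, half + 1)]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]

def dog_kernel(size=9, sigma_center=1.0, sigma_surround=2.5):
    """ON-center difference of Gaussians: narrow excitation minus broad inhibition."""
    center = gaussian_kernel(size, sigma_center)
    surround = gaussian_kernel(size, sigma_surround)
    return [[c - s for c, s in zip(crow, srow)]
            for crow, srow in zip(center, surround)]

kernel = dog_kernel()
print("centre weight:", kernel[4][4])   # excitatory
print("corner weight:", kernel[0][0])   # inhibitory
print("sum of weights:", sum(map(sum, kernel)))  # close to zero: uniform light is ignored
```

Because the two Gaussians each sum to 1, the kernel is blind to uniform illumination and responds to local contrast instead.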


As for whether those structures have been tweaked since their inception prior to the development of a pallium… my guess is yes. Possibly heavily. There happen to be subtle differences in vision-related circuitry in the thalamus and midbrain from one species to the next, even among mammals.

To me, the obvious explanation for this is that optimizing visual processing at the level of specialized networks is quite doable/probable/common… but that optimizing the wonderfully weird cortical substrate as a whole, without breaking something, is a much harder proposition than most.

Had to be done a few times still, though… birds don’t have the 6-layered version of it that we mammals share, but they have something covering their older parts too… and they’re not all that dumb.

I’d expect the last change to the intrinsics of the cortical fabric itself happened for us between 120 and 320 Mya… So… not exactly your everyday mutation.

And if “Gaia” herself did not find a way out of it in more than a hundred million years, we’d better pay attention to the details of how to replicate it. Even to those insignificant-looking V1 edge detectors.

…as well as to the lizardy things coupled below, if only to appease @Bitking’s wrath.


This is true.


What I was saying is that we currently only have a rough road ahead sketched out for SMI. From what I gathered reading a recent contribution from Jeff himself, your research teams are currently busy dealing with the deeper layers in V1.

Which is quite exciting for me, and I’m quite impatient to get a shot at studying those details.


A recent attempt at making sense of V1, using computational models which try to stay true to current biological knowledge.

direct pdf here:

hey @Bitking, pretty griddy, their L2/3 activation maps at times ^^


The integration of visual and target signals in V4 and IT during visual object search

Going a bit upstream from the V1 area.


This is interesting: top-down modulation is inhibitory here, whereas it is excitatory in the current top layer of SMI research. Interestingly, L2/3l has short-range inhibitory connections, implicating cell grids as you pointed out.


It seems that there is a lot going on in V1.
High-dimensional geometry of population responses in visual cortex


Modeling the Visual System


Thanks for the links, @Bitking :slight_smile:

On my side, not much to report, except… I’m just starting to implement, as tightly packed as possible, my planned 16b-indexed/4b-valued synaptic scheme, trying to keep the thing compatible with both HTM and the possibly diverse V1 designs (Leabra & stuff).
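For anyone curious what such packing could look like, here is one possible layout sketched in Python. It is purely an assumption for illustration and not the actual implementation: each synapse is a 20-bit word (16-bit presynaptic index in the high bits, 4-bit value in the low bits), written into a little-endian bit stream so that two synapses fit in exactly 5 bytes. `pack_synapses` and `unpack_synapses` are hypothetical names.

```python
def pack_synapses(synapses):
    """Pack (index, value) pairs into bytes, 20 bits per synapse.

    Assumed layout: index in the upper 16 bits of each word, value in the lower 4.
    """
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently in acc
    out = bytearray()
    for index, value in synapses:
        if not 0 <= index < (1 << 16):
            raise ValueError(f"index {index} does not fit in 16 bits")
        if not 0 <= value < (1 << 4):
            raise ValueError(f"value {value} does not fit in 4 bits")
        acc |= ((index << 4) | value) << nbits
        nbits += 20
        while nbits >= 8:          # flush complete bytes, lowest first
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:                      # flush the last, partially filled byte
        out.append(acc & 0xFF)
    return bytes(out)

def unpack_synapses(data, count):
    """Inverse of pack_synapses; count is needed because of the final padding."""
    acc = int.from_bytes(data, "little")
    result = []
    for _ in range(count):
        word = acc & 0xFFFFF
        result.append((word >> 4, word & 0xF))
        acc >>= 20
    return result

synapses = [(0, 15), (65535, 0), (1234, 7)]
packed = pack_synapses(synapses)
print(len(packed), "bytes for", len(synapses), "synapses")
print(unpack_synapses(packed, len(synapses)))
```

A real implementation would more likely live in C or similar with fixed-size arrays; the point here is only the bit budget of 2.5 bytes per synapse.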

and leaving the door open for some Calvin :dark_sunglasses:


Note the inference about visual cortex responses.


Maybe because of the discrete distribution of color blobs in V1? And the utter reliance on edges.


I have been doing my best to keep up with this thread. I want to try to apply HTM to the CIFAR10 dataset. And this seems to be a good starting point.

Anything I should know before I go down the rabbit hole?

Hey there marty.

I can’t say for sure that my endeavour here (which was on hold for almost a year) would be a good starting point for this goal. I believe you’d rather aim for an encoder which, given an image input, produces a V1-like output, and start your network modelling after this point.

But if you find stuff in there which helps you better understand what a V1 output would be, I’d be glad ^^. Although afaic, I’m still having a hard time defining it :wink:
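As a very rough sketch of the last step of such an encoder, assuming you already have a flat list of filter responses for an image (for instance from oriented edge filters): keep only the strongest few as active bits, in the spirit of the sparse representations HTM expects. The function names and the 2% default sparsity are illustrative assumptions, not an established recipe for CIFAR10.

```python
def top_k_sdr(responses, sparsity=0.02):
    """Return the sorted indices of the strongest responses (k-winners-take-all)."""
    n = len(responses)
    k = max(1, round(n * sparsity))
    # Strongest first; ties are broken by lowest index to stay deterministic
    winners = sorted(range(n), key=lambda i: (-responses[i], i))[:k]
    return sorted(winners)

def to_dense_bits(active_indices, n):
    """Expand a list of active indices into a dense 0/1 list of length n."""
    active = set(active_indices)
    return [1 if i in active else 0 for i in range(n)]

# Toy responses standing in for the output of a V1-like filter bank
responses = [0.1, 0.9, 0.3, 0.8, 0.0, 0.2, 0.4, 0.05, 0.6, 0.7]
active = top_k_sdr(responses, sparsity=0.2)
print(active)
print(to_dense_bits(active, len(responses)))
```

What the filter bank itself should be is exactly the open question of this thread, so this only shows where its output would plug in.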

Some more grist for the idea mill:

Towards a Unified View on Pathways and Functions of Neural Recurrent Processing

What are the functions of corticocortical, pulvinocortical, and cholinergic feedback to primary visual cortex? The authors argue that together they provide mechanisms for inferring and interpreting sensory inputs in context. Pennartz, @LarsMuckli & CO.


You’d have thought Matt could dev this in no time, but one year later, still can’t give more than 1 heart per bitkingey post…

Sry mate !

Will need time to extract all info from this one :slight_smile: thanks a lot



Wonderful article. After reading tons of neurological BS over the last six months, I found the article truly enjoyable and directly reflective of the actual process. While they are not entirely correct, they are not far off. Thank you for the share, it is greatly appreciated and it is definitely going in my research file for further correlation.