This is a follow-up to this post: Allocentric Grid Fields on a Moving Sensory Surface
The same underlying concepts will be built on in this post, although the mechanisms will be different.
First, consider a one-dimensional sensor with two points of sensory stimuli on it. The colors, textures, exact shapes, etc. of those points do not matter. Relating the positions of stimuli to the characteristics of those stimuli is a problem set aside for now.
Receptive fields move across (scan) the sensor. When a receptive field hits a sensory stimulus, the unit starts oscillating. Then, when the receptive field hits another sensory stimulus, if the oscillation is in its depolarizing phase, the unit produces a grid response.
Note: the sensory inputs shouldn’t be thought of as so granular. The receptive field moves continuously across the sensor rather than jumping in increments. Also, after hitting the rightmost part of the sensor, the receptive fields loop back around to the left and continue until just before the first place scanned.
That produces grid fields on the sensor, which is necessary when contacting multiple things at once. However, sometimes the sensor will contact one feature and then move to another feature. To correct for movement, the unit scans the sensor faster or slower. By doing so, it keeps its grid (if it were extended off the sensor) aligned with the feature contacted before the movement, so it can determine whether the distance between the two features is a multiple of the grid’s spacing.
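The 1D recap above can be sketched in a few lines of Python. This is a toy illustration, not a model of the biology: the function name, the cosine oscillation, and the tolerance parameter are all my own assumptions. The point is just that "start oscillating at the first stimulus, respond if depolarizing at the second" amounts to checking whether the distance between the stimuli is near a multiple of the grid spacing.

```python
import math

def grid_response(stim_a, stim_b, spacing, tolerance=0.1):
    """Toy version of the 1D scan: the receptive field starts an
    oscillation at the first stimulus, and the unit responds only if
    the oscillation is in its depolarizing phase when the scan
    reaches the second stimulus."""
    distance = abs(stim_b - stim_a)
    # Phase accumulated while scanning from the first stimulus to the second.
    phase = 2 * math.pi * distance / spacing
    # "Depolarizing phase" here: within `tolerance` cycles of a peak.
    return math.cos(phase) > math.cos(2 * math.pi * tolerance)

print(grid_response(1.0, 3.0, spacing=2.0))  # True: one full spacing apart
print(grid_response(1.0, 2.0, spacing=2.0))  # False: half a spacing apart
```

In this picture, scanning the sensor faster or slower after a movement just changes how much phase accumulates per unit of sensor distance, which is what keeps the extended grid aligned with the first feature.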
Purpose of this Version
Things get more complicated with more than one dimension. There isn’t just one way for the receptive fields to scan the sensor, and movements can involve rotations.
There were also some problems with the prior version. It would not work with more than two points: while the receptive field scans, the first feature it encounters determines the phase of its oscillation, so it can’t find the distance between the next two features. The neuroscience of the mechanisms was also shaky.
Scanning Receptive Fields and Decoupling Feature Size from Allocentricity
The unit described in the previous post is a dendritic segment. All units with the same grid spacing either respond to two points or they do not, so it makes sense to use a single neuron for all of those units. However, there isn’t an exact 1:1 correspondence between these dendritic segments and the units, so I’ll drop that term.
Each dendritic segment does not scan the sensory input on its own. Instead, the nearby neurons* scan the sensor. They only scan a limited area of the sensor in the first level of the hierarchy, so the grid cell cannot detect distances between stimuli beyond a certain amount. However, other dendritic segments are near other neurons, extending the area scanned on the sensor. This is a bit of a problem in higher regions, but it is solvable, and it’s actually useful in lower regions. Lower regions are mainly concerned with small features and small areas on the sensor. But they still need allocentric responses, to an extent. By only considering small distances between stimuli, lower regions could learn smaller features while still not needing their thalamus-driven receptive fields to be aligned with the stimulus. This mechanism decouples the size of the features considered by the region from the overall area of the sensor over which each neuron responds to features.
*Probably the same type of neuron, but let’s not make things confusing. (The steps would be separated into pre-sensory evoked inhibition and during or perhaps after that inhibition.)
The neurons which provide inputs to the grid cells only need to scan a small area. Well, technically they don’t scan. The number of neurons which respond depends on how closely their point on the cortical sheet is topographically aligned with the sensory stimulus (they can still encode things by which ones fire, but that’s not used here). Therefore, there is less input to the segment the further away it is topographically from the stimulus. This causes the NMDA spike to take longer to evoke and/or to take longer to grow enough to sufficiently impact the soma. As a result, in effect, the neurons near the dendritic segment take longer to respond the further away the stimulus is. This is equivalent to scanning the sensory input from the topographically aligned point outwards, radiating as a ring of points to which those neurons would respond. Depending on the mapping, this allows scanning more than two dimensions in the same radiating manner.
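Here’s one way to picture that, as a hedged sketch. I’m assuming input strength simply falls off with topographic distance (the 1/(1 + d) falloff and every name below are illustrative), so that weaker drive takes longer to push the NMDA spike to the point of influencing the soma:

```python
import math

def response_latency(segment_pos, stimulus_pos, gain=1.0, threshold=1.0):
    """Input to a dendritic segment is weaker the further the stimulus
    is, topographically, from the segment's neighborhood, so it takes
    longer for the NMDA spike to matter at the soma."""
    distance = math.dist(segment_pos, stimulus_pos)
    drive = gain / (1.0 + distance)  # weaker input when misaligned
    return threshold / drive         # weaker drive -> longer latency

# Nearer stimuli are "reached" first, so responses sweep outward in
# rings from the topographically aligned point, like a radiating scan.
print(response_latency((0, 0), (0, 1)) < response_latency((0, 0), (0, 3)))  # True
```

Since latency depends only on distance, every point at the same radius responds at the same time, which is what makes the effective scan a radiating ring rather than a left-to-right sweep.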
That’s the distal basal input. To fire in grid mode, the cell also needs distal apical input. There is a lot of redundancy: hypothetically, all of the basal dendritic segments will respond, since all the units in the prior version either respond or do not. That means the basal signal to the soma will be pretty strong, but still not suprathreshold on its own. So maybe the grid mode response is a 100+ Hz burst or a milder burst.
The oscillation described in the recap is local to the dendritic segment here. It starts when the segment receives input. That doesn’t seem like too big a stretch, because adjacent dendritic segments will have similar oscillatory phases anyway because of the scanning receptive fields.
Orientation and Accounting for Movement
By orientation, I mean the angle of the line between two sensory stimuli on the sensor relative to some arbitrary line on the sensor.
So far, the grid cells don’t care about orientation, only distance. So they need to detect distance along a particular axis, essentially. (There are alternatives to Cartesian coordinates and they’re probably not how the brain works but this isn’t exactly that.)
It might not be a good idea to use one axis for each dimension, because that doesn’t seem very realistic. If the brain codes other things which aren’t in binary categories as locations, such as color (which has topography like color wheels in V1, I believe), then there can be more than three dimensions, perhaps even more than ten. Also, assuming egocentric regions operate like allocentric regions with extra data about egocentric positioning, more dimensions are probably utilized to represent egocentric positions. That’s especially true if the number of dimensions depends on learning, for example if we have different coordinate systems for different concepts, or for math versus writing.
Instead, each grid cell essentially has its own axis. If there were three dimensions, there wouldn’t be three perpendicular axes. Instead, each cell would have its own line through three dimensions at a random orientation relative to the other lines. In four or more dimensions, it would still be a line. By using a line no matter how many dimensions there are, it is easier and simpler to apply the concepts from the previous post.
To find the distance along the cell’s axis, call it the cell’s “dimension line,” the cell essentially projects the line between the two stimuli onto that dimension line. This is easier and less mathy than it sounds.
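To show how little math that involves, here is the projection as a dot product with a unit vector. The function name and the tuple representation are mine, and it works the same way in two, three, or ten dimensions:

```python
import math

def distance_along_dimension_line(stim_a, stim_b, direction):
    """Project the vector between two stimuli onto the cell's
    dimension line (given by `direction`). The result is a signed
    1D distance, no matter how many dimensions the stimuli live in."""
    delta = [b - a for a, b in zip(stim_a, stim_b)]
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    return sum(d * u for d, u in zip(delta, unit))

# Stimuli separated by (3, 4): a cell whose line points along x sees
# a distance of 3; a cell whose line points along y sees 4.
print(distance_along_dimension_line((0, 0), (3, 4), (1, 0)))  # 3.0
print(distance_along_dimension_line((0, 0), (3, 4), (0, 1)))  # 4.0
```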
Up to this point, I have been talking about thick tufted (TT) cells. Slender tufted (ST) cells have wide subthreshold receptive fields like thick tufted cells, but narrow suprathreshold receptive fields. This is thought to be because they receive a lot more inhibitory input from a little ways away than TT cells. So I think they’re a good candidate to detect orientation. Perhaps they are inhibited by nearby interneurons which receive excitatory input from a particular direction. That way, if they are topographically aligned with one stimulus, they are inhibited by another stimulus depending on the angle of the line between those stimuli. If there is a particular direction from which they are not inhibited, then they respond. That way, they respond to a particular orientation.
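That inhibition scheme can be sketched as follows. Everything here is an assumption stacked on an assumption: I’m representing the second stimulus by its angle relative to the topographically aligned one, and saying the ST cell escapes inhibition only when that angle falls near its one uninhibited direction:

```python
import math

def st_cell_response(preferred_angle, other_stim_angle, tolerance=0.3):
    """The ST cell is topographically aligned with one stimulus and is
    inhibited from every direction except `preferred_angle`, so it
    fires only when the second stimulus lies in that direction."""
    # Smallest signed angle between the stimulus direction and the
    # cell's uninhibited direction.
    diff = math.atan2(math.sin(other_stim_angle - preferred_angle),
                      math.cos(other_stim_angle - preferred_angle))
    return abs(diff) < tolerance

print(st_cell_response(0.0, 0.1))          # True: stimulus in the uninhibited gap
print(st_cell_response(0.0, math.pi / 2))  # False: inhibited from that side
```

A population of such cells, each with a different `preferred_angle`, would then represent the orientation of the line between the two stimuli.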
I’m not sure about this, because ST cells are corticostriatal (possibly more so than TT cells), and it would be nice for them to serve a scanning function, so that in motor regions they could scan through the motor possibility space. Then, if TT cells jump rather than scan, that might solve some apparent contradictions about presaccadic predictive remapping: some studies found that receptive fields jump and others found that they shift.
Although, perhaps by selecting particular orientations as described they can scan particular sequences of motor possibilities. But they don’t scan anything in this system.
Also other layers exist so this whole thing is kind of forced (except some of the initial reasoning which was meant to explain some things about layer 5).
Once ST cells represent the orientation between the two points, determining the distance along the TT cell’s dimension line is simple, since TT receptive fields scan in a radiating manner. It just needs to change the rate of scanning (equivalent to stretching or contracting space).
That’s already necessary to account for movement. During movement, ST cells should use the same mechanisms they use to determine orientation (whatever those are) to determine movement direction. That way, they adjust the TT cell’s grid fields based on how far the sensor moves along that cell’s dimension line. Whether correcting for orientation or accounting for movement, ST cells project to the apical dendrite and the basal dendrite (unlike most things I say in this post, those projections pretty clearly exist). Together, one for oscillation phase and one for scanning response rate, those projections stretch or compress the grid fields, effectively warping space.
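My reading of "stretching or compressing space" in scan-rate terms: since the radiating scan sweeps outward over radial distance, scaling the scan rate by the cosine of the angle between the stimulus line (or movement direction) and the cell’s dimension line makes the grid measure projected distance instead of raw distance. The names and the explicit cosine below are my own gloss on the mechanism, not something from the post:

```python
import math

def adjusted_scan_rate(base_rate, stimulus_angle, dimension_line_angle):
    """Rescale the radiating scan so that oscillation phase accumulates
    in proportion to distance projected onto the cell's dimension line,
    not raw radial distance. Scanning slower stretches space; scanning
    faster compresses it."""
    projection = math.cos(stimulus_angle - dimension_line_angle)
    return base_rate * abs(projection)

# A stimulus line parallel to the dimension line leaves the rate alone;
# a perpendicular one shrinks it toward zero (no projected distance).
print(adjusted_scan_rate(1.0, 0.0, 0.0))  # 1.0
```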
All of the dendritic segments need the same oscillation rate to give the grid cell consistent spacing, but different grid cells should have different grid spacings, so it would be nice if a single, no-local-processing compartment like the apical apex determined that. The apical dendrite has a current called I(h), which increases distally and doesn’t play a major role elsewhere on L5 cells, I believe. I(h) might be involved in single-cell oscillations. That’s why the apical dendrite might be involved in modulating the scanning rate and accounting for orientation. The apical dendrite also receives a lot of the feedback and input from motor regions, so it’s a good fit. How exactly the apical dendrite could change oscillation rates in the basal segments is something I’m not sure about. I’m not sure it’s even necessary; I need to think about it more.
To adjust the speed at which it scans space, the basal segments can be modulated in some sort of multiplicative fashion (probably just plain synaptic input) so the NMDA spikes reach threshold to influence the soma sooner, scanning space faster.
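As a toy sketch of that (the linear ramp-to-threshold and all names are illustrative assumptions): multiplicative modulation scales how fast depolarization accumulates, so the threshold is crossed sooner and the segment effectively scans space faster:

```python
def time_to_nmda_threshold(drive, modulation=1.0, threshold=1.0):
    """With depolarization accumulating at rate drive * modulation,
    the NMDA spike influences the soma after threshold / rate time.
    Stronger multiplicative modulation -> earlier influence -> a
    faster effective scan of space."""
    return threshold / (drive * modulation)

# Doubling the modulation halves the time to reach threshold.
print(time_to_nmda_threshold(0.5))                  # 2.0
print(time_to_nmda_threshold(0.5, modulation=2.0))  # 1.0
```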
How to account for rotations during movements? I don’t know yet. It needs to happen to keep the projection onto the cell’s dimension line consistent. Perhaps ST cells do something similar to what TT cells do, but to account for rotations rather than translations.