I’ve been looking into RatSLAM, and the grid cells described there seem like a really good way of describing coordinates, as well as numbering systems in general.
It’s actually pretty similar to the number encoders already included in nupic, as long as repeating is enabled, but it seems like it has some extra properties.
For example, there are multiple encoders whose field size grows with the scale they represent, almost like the place values of binary encoded numbers. Also, I haven’t confirmed this with research, but it seems like multiple neurons in the same size grid would fire at different locations within that grid, and at more than just two locations, unlike a binary digit.
It’s more like the binary system has been expanded so each ‘bit’ centers around where the whole number under its field is.
For example, encoding 010 into 3 encoders would make the topmost encoder appear near the middle, even though its bit is 0, because it encodes a more bird’s-eye view of every number underneath it, rather than just 1 or 0.
Each encoder could then be calculated as x<-x%2^n for center+size>x>center-size to code the full number when combined.
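One possible reading of the x <- x % 2^n idea, as a minimal sketch: each encoder keeps only the remainder of the number at its own power-of-two period, and the position inside the period is what gets encoded. The function names and the normalisation to a 0..1 phase are my own assumptions for illustration, not anything from nupic or RatSLAM.

```python
def residue_code(x, num_scales):
    """Remainder of x at periods 2, 4, 8, ... (one value per encoder)."""
    return [x % (2 ** n) for n in range(1, num_scales + 1)]


def residue_phases(x, num_scales):
    """Same remainders, expressed as a fraction of each encoder's period."""
    return [(x % (2 ** n)) / float(2 ** n) for n in range(1, num_scales + 1)]


# 0b010 == 2, encoded with three encoders of period 2, 4 and 8.
print(residue_code(2, 3))
print(residue_phases(2, 3))
```

As long as x is smaller than the largest period, the largest-period remainder alone recovers x; the smaller periods add finer, repeating detail.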
What do you guys think? Should I try my hand at another number encoder to test it?
Computer people like powers of two. Nature generally does not.
The ratio between adjacent fields as you go from dorsal to ventral follows a somewhat different distribution - more of a logarithmic spacing. A commonly measured value is about 1.4 (roughly the square root of 2). See if this helps:
Buried in this is an amazing gem - "This spatial gradient mechanism is homologous to a gradient mechanism for temporal learning in the lateral entorhinal cortex and its hippocampal projections. Spatial and temporal representations may hereby arise from homologous mechanisms, thereby embodying a mechanistic “neural relativity” that may clarify how episodic memories are learned."
Connecting multiple spatial scales to decode the population activity of grid cells (Martin Stemmler, Alexander Mathis, and Andreas V. M. Herz)
A principle of economy predicts the functional architecture of grid cells (Xue-Xin Wei, Jason Prentice and Vijay Balasubramanian)
Robust and efficient coding with grid cells (Lajos Vágó and Balázs B. Ujfalussy)
What I don’t see in these papers is consideration of how this might be working in other parts of the brain. The WHERE stream is well represented but what about the WHAT stream? What do the patterns there mean and how are the two streams connected?
That is definitely going to be useful. The presentation will be useful for explaining things quickly when I need to.
Unfortunately, I have a class for which I have to read three or four research papers in full this weekend, so it could take me a while to read those properly. However, I was able to skim enough to see that the scale factor is approximately e for one dimension, and approximately sqrt(e) for two dimensions.
However, there’s enough wiggle room that I wonder if scale factors of 2 and sqrt(2) would be optimal for encoding things with computers. In the former case, I could go back to binary calculations, and in the latter, only one constant sqrt(2) would be needed for calculations.
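To compare candidate scale factors, a small test harness could generate the module periods as a geometric series and look at the phase of a position in each module. This is only a sketch under my own assumptions: `module_periods` and `phases` are hypothetical helper names, and the base period of 1.0 is arbitrary.

```python
import math


def module_periods(base_period, ratio, num_modules):
    """Periods of successive modules, each `ratio` times the previous one."""
    return [base_period * ratio ** i for i in range(num_modules)]


def phases(x, periods):
    """Position of x inside each module's period, as a fraction in [0, 1)."""
    return [(x % p) / p for p in periods]


for name, ratio in [("2", 2.0), ("sqrt(2)", math.sqrt(2.0)),
                    ("e", math.e), ("sqrt(e)", math.sqrt(math.e))]:
    periods = module_periods(1.0, ratio, 4)
    print(name, [round(p, 3) for p in periods])
```

With a ratio of sqrt(2), every second module doubles in period, so the binary-style calculations would still apply to alternate modules.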
I might just have to set up the encoders and build some tests to figure that out.
Help please, what does phase mean wrt grid cells?
Phase refers to the relative alignment of two grid patterns. If both patterns had their center nodes in alignment, they would be “in phase.” If the centers of one fall in between the centers of the other, they would be out of phase.
The thought here is that the repeating pattern behaves (mathematically) like a wheel that stamps a position each time it rolls along.
180 degrees out of phase
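In one dimension, the phase difference between two grids of the same spacing can be written as the offset between their centers, wrapped to one period and scaled to degrees. This is a simplified sketch (real grid cells are two-dimensional, and `phase_offset_degrees` is a hypothetical helper name):

```python
def phase_offset_degrees(center_a, center_b, period):
    """Phase of grid B relative to grid A, in degrees in [0, 360)."""
    return ((center_b - center_a) % period) / period * 360.0


print(phase_offset_degrees(0.0, 0.0, 1.0))  # centers aligned: in phase
print(phase_offset_degrees(0.0, 0.5, 1.0))  # centers halfway between: out of phase
```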
Alright, given the information in those articles, as well as some hexagonal coordinate system math here (https://stackoverflow.com/a/2459541/782170), I’ll start with the simplest encoders:
Almost binary encoding. The only difference is that the inverse of the number is included alongside it to satisfy the phase requirement. So 10 would be represented by both 00001010 and 11110101.
Quaternary encoding. In this case, the x and y positional values can be translated into hexagonal coordinates with the following math (from https://stackoverflow.com/a/2459541/782170)
B = (y/(3 * 2^(n-1)))%2
R = ((sqrt(3) * x - y)/(3 * 2^n))%2
For each scale n.
And, similarly to 1-dimensional, phase has to be encoded. For that, b,r=0,0 maps to 1 neuron, b,r=0,1 to another, b,r=1,0 to a third, and b,r=1,1 to a fourth.
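The two starting encoders above could be sketched roughly as follows. The assumptions here are mine: the function names are hypothetical, the binary width defaults to 8 bits, and B and R are floored before the % 2 so that each comes out as 0 or 1, which the formulas above do not state explicitly.

```python
import math

SQRT3 = math.sqrt(3.0)  # saved once, reused for every scale


def binary_with_complement(value, width=8):
    """Binary string of value plus its bitwise inverse (the 'phase' copy)."""
    if not 0 <= value < 2 ** width:
        raise ValueError("value does not fit in the given width")
    bits = format(value, "0{}b".format(width))
    inverse = "".join("1" if b == "0" else "0" for b in bits)
    return bits, inverse


def hex_bits(x, y, n):
    """(b, r) for scale n; flooring before % 2 is an assumption."""
    b = math.floor(y / (3 * 2 ** (n - 1))) % 2
    r = math.floor((SQRT3 * x - y) / (3 * 2 ** n)) % 2
    return int(b), int(r)


def quaternary_one_hot(x, y, n):
    """Four neurons per scale: (0,0), (0,1), (1,0), (1,1) in that order."""
    b, r = hex_bits(x, y, n)
    cells = [0, 0, 0, 0]
    cells[2 * b + r] = 1
    return cells


print(binary_with_complement(10))
print(quaternary_one_hot(4.0, 0.0, 1))
```

One side effect of storing the inverse alongside the number is that the combined pattern always has exactly `width` active bits, whatever the value.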
Of course, none of this is optimized in the slightest, for nupic, biological representation, or computer speed. But, it provides a simple enough starting point for me to implement, add tests, and then work on the optimizations, scale vs. resolution considerations, etc. from those papers.
Have you considered lookup tables?
I have not looked at what GPUs provide but texture mapping could be thought of as a lookup table.
That’s an idea. Though, once I generate the initial fields, I should be able to shift the on and off neurons, turn neurons on by connection, and run some local excitation/inhibition algorithms on the GPU or CPU.
I’ve been writing a lot of code on the GPU recently, so expanding into much sparser, larger representations shouldn’t give me any problems, and it’ll give me more variability than a lookup table.
I will be saving constants like sqrt(3) though.
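The "shift the on and off neurons" step could look something like this on the CPU, treating one repeating field as a ring of cells. This is a plain-Python sketch with a hypothetical `shift_ring` helper; a GPU version would do the same index arithmetic per cell.

```python
def shift_ring(cells, steps):
    """Rotate a ring of cells by `steps` positions (negative shifts left)."""
    n = len(cells)
    k = steps % n
    if k == 0:
        return list(cells)
    return cells[-k:] + cells[:-k]


print(shift_ring([1, 0, 0, 0], 1))
print(shift_ring([1, 0, 0, 0], -1))
```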
I’m not sure if texture mapping is the concept you’re looking for. Using textures maybe, but the GPU is already set up for doing lots of calculations, so those b,r transformations, a rand function to select and excite very near neurons, a rand function to select and inhibit very far neurons, and an input to translate the on neurons, should all be well within the capabilities of a GPU.
I was able to get a retina simulation as well as a v1 simulation, up to basic end stop cells, using those methods.