I have come up with a new method, which I call the Rotational Cell encoder. It encodes N-dimensional values without being affected too much by the curse of dimensionality (as the current Grid Cell encoder is, requiring exponentially more cells to represent higher and higher dimensional spaces). But I’m not totally sure that my method works, and it is purely mathematical for now. It would be great if you guys could spend some time reviewing my thoughts and catching potential problems.
The goal is simple: given an N-dimensional vector, generate an SDR that has the usual properties of an SDR, i.e. it is sparse, and two similar vectors generate similar SDRs.
The Rotational Cell Encoder (RCE for short) accepts a real-valued vector V of length N and outputs an SDR. Each RCE is made up of several Rotational Cell Modules (RCMs), similar to how Grid Cells work in HTM.
For each RCM:
- Generate a random unit vector U in (N+1)-D space.
- Form a rotation matrix M using the components of the input vector V as rotation angles in radians.
- Maybe multiply the components by a factor of 2*PI first.
- Rotating a unit vector to an arbitrary direction in N-D space takes N-1 angles, so an (N+1)-D space is needed to have N rotation angles.
- Calculate the new vector W = MU. Since M is a rotation, W should also be a unit vector.
- For each component of W, use one 1D Grid Cell Module to encode its value.
- There are multiple modules in a Rotational Cell Encoder, which should solve the ambiguity problem.
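Here is a minimal sketch of one RCM in plain Python, to make the steps above concrete. Several parts are my assumptions and not fixed by the description: the N angles are applied as a chain of planar rotations (angle k rotates axes k and k+1), which is only one of many ways to build M, and `grid_cell_1d` is a hypothetical stand-in for a 1D Grid Cell Module with one active cell, where `scale` and `cells` are made-up parameters.

```python
import math
import random

def random_unit_vector(dim, rng):
    """Sample a direction uniformly on the unit sphere in `dim` dimensions."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def rotate(u, angles):
    """Apply one planar rotation per angle; angle k acts on axes (k, k+1).

    Assumption: chaining planar rotations is just one possible way to turn
    N angles into a rotation of an (N+1)-D vector.
    """
    w = list(u)
    for k, theta in enumerate(angles):
        c, s = math.cos(theta), math.sin(theta)
        a, b = w[k], w[k + 1]
        w[k], w[k + 1] = c * a - s * b, s * a + c * b
    return w

def grid_cell_1d(x, scale, cells):
    """Hypothetical 1D grid cell module: index of the single active cell."""
    phase = (x / scale) % 1.0
    return min(int(phase * cells), cells - 1)  # guard against phase == 1.0

def rcm_encode(v, u, scale=0.5, cells=16):
    """Encode v with one module; returns one active bit index per component of W."""
    w = rotate(u, [2.0 * math.pi * x for x in v])
    return [i * cells + grid_cell_1d(wi, scale, cells) for i, wi in enumerate(w)]

rng = random.Random(0)
v = [0.1, 0.2, 0.3]                      # N = 3 input
u = random_unit_vector(len(v) + 1, rng)  # U lives in N + 1 = 4 dimensions
print(rcm_encode(v, u))
```

Each planar rotation preserves length, so W stays a unit vector here. With the 2*PI factor, inputs that differ by a whole number in any component give the same rotation, so a single module is periodic in the same way a grid cell module is.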
For each RCE:
- Encode the vector V using the RCMs
- Concatenate the SDRs generated from the previous step.
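The concatenation step can be sketched as shifting each module's active indices by the total size of the modules before it. The representation below (a list of active indices plus a module size) and the name `concatenate_sdrs` are my own choices for illustration, not part of any HTM library.

```python
def concatenate_sdrs(module_outputs):
    """Join per-module SDRs into one.

    module_outputs: list of (active_indices, module_size) pairs.
    Returns (active indices in the joined SDR, total size).
    """
    active, offset = [], 0
    for indices, size in module_outputs:
        active.extend(offset + i for i in indices)
        offset += size
    return active, offset

# Three toy modules of sizes 8, 8 and 4.
sdr, total = concatenate_sdrs([([1, 5], 8), ([0, 7], 8), ([3], 4)])
print(sdr, total)  # [1, 5, 8, 15, 19] 20
```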
Here are some issues that I have thought of. There might be more issues with this method, and lots of improvement is definitely needed.
In higher dimensional space, the mean magnitude of a unit vector’s components gets lower
So the scale/grid size of the Grid Cell Module in each RCM has to get smaller and smaller as the RCE encodes higher dimensional data. A unit vector may not be the best idea to begin with. The length of the vector U might have to grow exponentially with N to maintain a constant grid size.
I need a mathematician to confirm this.
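One way to check this numerically: for a unit vector drawn uniformly at random in d dimensions, the squared components sum to 1 and are identically distributed, so each has expected value 1/d and the typical (RMS) component size is 1/sqrt(d). If that carries over to W, the length of U would only need to grow like sqrt(N+1), not exponentially, to keep the grid size constant. The sketch below estimates the RMS component size by sampling, assuming uniformly random directions; `rms_component` is a name I made up.

```python
import math
import random

def rms_component(dim, samples=2000, seed=0):
    """Estimate the RMS size of one component of a random unit vector."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm_sq = sum(x * x for x in v)
        total += v[0] * v[0] / norm_sq  # squared first component after normalizing
    return math.sqrt(total / samples)

# Compare the estimate against 1 / sqrt(dim) for a few dimensions.
for dim in (3, 10, 100):
    print(dim, round(rms_component(dim), 3), round(1 / math.sqrt(dim), 3))
```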
The components of vector W are not uniformly distributed
Ideally, bits in an SDR should be equally likely to turn on. But the components of the result vector W are not uniformly distributed (this should be self-evident, or integrate the area under the curve; it’s not uniform), so that property does not hold. There needs to be a way to map the components back to a uniform distribution.
That’s all I have come up with for now. Please leave feedback so I can improve my idea.
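One possible way to do that mapping is to pass each component through its own cumulative distribution function (CDF), since applying a continuous variable's CDF to it yields a value that is uniform on [0, 1]. The sketch below uses an empirical CDF built from reference samples. It assumes the components of W behave like those of a uniformly random unit vector, which may not hold for real input data (in that case the reference samples would have to come from encoded data instead). `make_uniformizer` and `sample_component` are hypothetical helper names.

```python
import bisect
import math
import random

def sample_component(dim, rng):
    """One component of a uniformly random unit vector in `dim` dimensions."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return v[0] / math.sqrt(sum(x * x for x in v))

def make_uniformizer(reference):
    """Return a function mapping x to its empirical CDF value in [0, 1]."""
    ordered = sorted(reference)
    n = len(ordered)
    def to_uniform(x):
        # Fraction of reference samples that are <= x.
        return bisect.bisect_right(ordered, x) / n
    return to_uniform

rng = random.Random(0)
dim = 20
to_uniform = make_uniformizer([sample_component(dim, rng) for _ in range(5000)])

# Histogram of fresh, transformed components over 5 equal-width bins.
counts = [0] * 5
for _ in range(5000):
    u = to_uniform(sample_component(dim, rng))
    counts[min(int(u * 5), 4)] += 1
print(counts)  # the five counts should come out roughly equal
```

As a side note, in exactly 3 dimensions (N = 2) each component of a uniformly random unit vector is already uniform on [-1, 1], so the problem only shows up for larger N.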