RDSE implementation is wasting buckets?

It should be possible to extend an encoder to arbitrary ranges and resolutions if you use something like a grid cell representation for the encoding. Choose some number of cells to use per module (say 64), and a unit length scale (resolution) associated with each module. Like for example, you may choose a unit length of one for the first module and thus you would have a representation of all integer values modulo 64 (i.e. the pattern would wrap around when you go above 64 or below 0). To extend the range, you would add another module with a larger unit value (like say 64). You now have the capacity to represent up to 64^2 unique integers. To increase the resolution, you would choose a smaller length scale (like say 1/64). Of course the length scales don’t have to be powers of one another, and there may be a robustness argument to make for allowing some overlap in length scales, or multiple modules with similar length scales. The point is that it should always be possible to extend the range and/or resolution of the representation by simply adding more modules. There may be some adjustment needed in how the downstream layers connect to the encoder, but all previously learned representations should be unaffected. At worst, there may simply be additional bits that are now available from the new modules.

EDIT: Upon further reflection, I realize that what I have proposed here is similar to how we add digits to either end of a number to increase the capacity and/or resolution of the value. However, we are typically not able to attend to all of those scales at the same time. So, it may be that we are biologically constrained to only a finite number of grid cell modules and therefore we can only consider/compare values on scales of roughly two or three orders of magnitude. If two-values are not on roughly the same scale, the values are essentially incomparable to one-another (besides one being “much larger” or “much smaller”).

I’m still thinking about how this notion of extensibility (in precision and capacity) can be adapted to other non-scalar and/or higher-dimensional properties.

2 Likes