Hi, I have been toying around with the idea of getting the most out of a Grid Cell encoder. We know GC works very well, but we don’t know when it works and, more importantly, when it doesn’t. So, here we go.
Defining the problem
Before discussing whether GC works or not, we have to define what counts as “working” and develop a method to evaluate the performance. Recall the basics of encoders (I can’t remember which episode of HTM School…)
- Similar values must encode to similar SDR
- Different values have (in the best case) 0 common bits
- It is difficult for a random SDR to have a lot of common bits with another
So to evaluate how well an encoder is working, I’ll compute a heatmap, where each point (x, y) on the map is the number of overlapping bits between encode(x) and encode(y). Then, because HTM relies heavily on thresholding, the heatmap is thresholded: anything above the threshold becomes 1, otherwise 0. Next, subtract 1 from the diagonal of the matrix (since the overlap between x and itself should always be high). The loss function is the mean of the resulting matrix. The lower the better.
This is by no means a perfect loss function for the task. It fails to capture the property that neighboring values should be similar. But it should be good enough for the purpose.
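The loss described above can be sketched in a few lines of plain Python. This is a minimal illustration and not the code from the gist: SDRs are modeled as sets of active bit indices, and `encoder_loss`, `threshold`, and the two toy encoders are hypothetical names picked for the example.

```python
def overlap(a, b):
    """Number of active bits two SDRs share (SDRs are sets of bit indices)."""
    return len(a & b)


def encoder_loss(encode, values, threshold):
    """Mean of the thresholded overlap matrix with 1 subtracted on the diagonal."""
    sdrs = [encode(v) for v in values]
    n = len(sdrs)
    total = 0
    for i in range(n):
        for j in range(n):
            # Threshold: 1 if the overlap is above the threshold, else 0.
            cell = 1 if overlap(sdrs[i], sdrs[j]) > threshold else 0
            # encode(x) is expected to match itself, so the diagonal is not penalized.
            if i == j:
                cell -= 1
            total += cell
    return total / (n * n)


# Hypothetical toy encoders, only here to exercise the loss.
values = [0.0, 0.25, 0.5, 0.75]
distinct = lambda x: {int(x * 4)}  # every value gets its own bit
constant = lambda x: {0}           # every value looks the same

print(encoder_loss(distinct, values, threshold=0))  # 0.0
print(encoder_loss(constant, values, threshold=0))  # 0.75
```

An encoder that separates every value scores 0, and one that maps everything to the same SDR approaches 1 as the number of sampled values grows.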
Baseline
The heatmap generated by a scalar encoder looks like this. The color indicates how many bits overlap. The x and y axes are, well, the x and y values. In our case, both axes range from -0.5 to 1.5, centered at 0.5.
Loss: 0.203
The big blocks at two of the corners of the plot are there because the Scalar encoder isn’t configured to handle values outside of 0~1.
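To see where such corner blocks can come from, here is a toy scalar encoder. It is only a sketch under one assumption: out-of-range input gets clamped to the configured range, which may or may not be what Etaler does internally. The name `scalar_encode` and its parameters are made up for the example.

```python
def scalar_encode(x, lo=0.0, hi=1.0, size=32, active=4):
    """Toy scalar encoder: a contiguous run of `active` bits out of `size`."""
    # Assumption: out-of-range input is clamped to [lo, hi].
    x = min(max(x, lo), hi)
    start = round((x - lo) / (hi - lo) * (size - active))
    return set(range(start, start + active))


# Everything below 0 collapses onto encode(0), everything above 1 onto encode(1).
print(scalar_encode(-0.5) == scalar_encode(0.0))          # True
print(scalar_encode(1.5) == scalar_encode(1.0))           # True
print(len(scalar_encode(0.0) & scalar_encode(1.0)))       # 0
```

Under that assumption, any two values below 0 (or any two above 1) share all of their bits, which would show up as a solid block in the matching corner of the heatmap.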
And the default GridCell encoder in Etaler looks like this. As expected, GridCells can handle any real value. But it seems to have a bad overlap problem between neighboring values. This may or may not be an issue for you. Maybe your application considers the values 0.1 and 0.12 the same. That’s totally fine. In case it doesn’t, we’ll have to fix it.
Loss: 0.2328
Attempting fixes
The first solution one might come up with is to increase the number of grid cell modules and cells per module. Unfortunately that does not work. Even with 2x the modules and 2x the cells per module, the edges of the blocks get smoother, but the issue isn’t fixed.
Loss: 0.2672
Using multiple active cells per GCM is also a bad idea here. Though it does look like a good way to reduce the sensitivity of our encoder.
Loss: 0.495
Taking the union of two GridCell encoders also sounds promising. Yet again, it does not work.
Loss: 0.2502
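One way to see why the union cannot help: any bit that two values share in either encoder is still shared after the union, so the overlap between neighbors can only stay the same or grow. A small sketch, assuming "union" means a bitwise OR of the two SDRs in the same bit space (`union_encode`, `enc_a` and `enc_b` are hypothetical stand-ins, not Etaler calls):

```python
def union_encode(encoders, x):
    """OR together the SDRs (sets of active bit indices) from several encoders."""
    out = set()
    for enc in encoders:
        out |= enc(x)
    return out


# Two made-up encoders that both fail to separate 0.10 from 0.12.
enc_a = lambda x: {int(x * 4) % 8}
enc_b = lambda x: {int(x * 3 + 5) % 8}

x, y = 0.10, 0.12
overlap_a = len(enc_a(x) & enc_a(y))
overlap_b = len(enc_b(x) & enc_b(y))
overlap_union = len(union_encode([enc_a, enc_b], x) & union_encode([enc_a, enc_b], y))

# The union overlap is never below the overlap of either encoder alone.
print(overlap_a, overlap_b, overlap_union)
```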
The fundamental reason GridCell encoders fail to distinguish nearby values is that each GCM’s field of view (FOV) is too large. Reducing it works (the scale is the inverse of the FOV).
Loss: 0.0294
Or alternatively, we can scale the input! Notice the banding in this image and in the previous one. This method seems to work better.
Loss: 0.019
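Both knobs can be tried on a toy 1D grid cell encoder. This is a deliberately simplified sketch and not Etaler's implementation: each module is just a periodic binning of the input, and `PERIODS`, `CELLS` and `fov_scale` are hypothetical values and names chosen for the example.

```python
PERIODS = [0.3, 0.5, 0.7, 1.0]  # hypothetical field of view of each module
CELLS = 16                      # cells per module, one active cell per module


def grid_encode(x, fov_scale=1.0):
    """Toy 1D grid cell encoder returning a set of active bit indices."""
    bits = set()
    for m, period in enumerate(PERIODS):
        # Position of x inside the module's (scaled) period, in [0, 1).
        phase = (x / (period * fov_scale)) % 1.0
        cell = int(phase * CELLS) % CELLS
        bits.add(m * CELLS + cell)
    return bits


def overlap(a, b):
    return len(a & b)


a, b = 0.10, 0.12
raw = overlap(grid_encode(a), grid_encode(b))
small_fov = overlap(grid_encode(a, fov_scale=1 / 8), grid_encode(b, fov_scale=1 / 8))
scaled_input = overlap(grid_encode(a * 8), grid_encode(b * 8))

print(raw, small_fov, scaled_input)  # 3 0 0
```

With these made-up periods, 0.10 and 0.12 share 3 of 4 active bits by default and none once the FOV is shrunk or the input is scaled by 8. In this toy the two knobs are nearly interchangeable, so it does not reproduce the difference in loss between the two plots above.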
Failure patterns
In the process, I also discovered that with too few GCMs and cells per module, besides the bits starting to repeat very quickly, the encoder also has trouble distinguishing nearby values.
Loss: 0.2814
And applying scaling to that encoder just causes repeated SDRs.
Loss: 0.2632
Conclusion
In conclusion, if your GridCell encoder is too loose, try scaling your input value or reducing the FOV of the GCMs. If it is too sensitive, try increasing the FOV of the GCMs or setting the number of active cells per GCM to 2.
Also be careful not to use too few GCMs or too few cells per GCM. That leads to some very bad results.
Source code: https://gist.github.com/marty1885/082670c69709c0125f0da14c630719e5