I built a Scalar Matrix-Hashing encoder.
Please read the Readme.txt

The interesting thing about it is that it should support arbitrarily large numbers.
Additionally, you may not need a Spatial Pooler if the hashing works correctly, i.e. you can link the encoder directly to the TM.

The only problem I see is how to guarantee there are no collisions between the different hash functions.
For this we would need to figure out some way to adjust the matrices based on collisions that happen over time, OR figure out a way to avoid them from the start!

My main reason to post here is to hear some ideas from you on lowering collisions between the hash functions.
In the current implementation you can check collision statistics:

se.collision_cnt
se.avg_collision
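For illustration, here is a minimal sketch of what a scalar hashing encoder with collision tracking might look like. Everything here (the projection scheme, the bit count, the class name) is my own assumption made up for the example; only the `collision_cnt` / `avg_collision` names come from the post:

```python
import numpy as np

class ScalarHashEncoder:
    """Hypothetical sketch: encode a scalar by hashing it through several
    random projections ("hash functions") into a sparse binary vector,
    while counting collisions between the hash functions."""

    def __init__(self, n_bits=1024, n_hashes=16, seed=42):
        self.n_bits = n_bits
        rng = np.random.default_rng(seed)
        # one random projection per hash function, applied to (x, 1)
        self.planes = rng.normal(size=(n_hashes, 2))
        self.collision_cnt = 0   # total bit collisions seen so far
        self.calls = 0

    def encode(self, x):
        v = np.array([x, 1.0])
        # map each hash function's output to a bit index
        idxs = (np.abs(self.planes @ v) * 7919).astype(np.int64) % self.n_bits
        # a collision = two different hash functions landing on the same bit
        self.collision_cnt += len(idxs) - len(set(idxs.tolist()))
        self.calls += 1
        sdr = np.zeros(self.n_bits, dtype=np.uint8)
        sdr[idxs] = 1
        return sdr

    @property
    def avg_collision(self):
        return self.collision_cnt / max(self.calls, 1)
```

With this kind of bookkeeping you can inspect `se.collision_cnt` and `se.avg_collision` after encoding a stream of values and see how crowded the bit space is getting.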

btw, there is no similarity between nearby numbers, which may disqualify it as an encoder.

Locality sensitive hashing (LSH) is the way to go: https://en.wikipedia.org/wiki/Locality-sensitive_hashing
If you regard LSH as feature (cue) detection, you could extend LSH with some unsupervised learning of those features and create a read-out layer…

Ok, you have a list of numbers. In a recomputable way, randomly add and subtract those numbers and take the sign of the combination as a binary bit. Hey, did you just destroy all the information in the list? No, actually not: that bit tells you whether the sum of the added values was greater than the sum of the subtracted values. If you get another bit by using a different random combination, that second combination is almost exactly orthogonal to the first. If the list of numbers is sparse in some way, the compressive sensing crowd can show you can exactly recreate the list from not so many bits. Obviously, if you have two lists that are almost the same, they will tend to output the same hash bits. Simple as that really, it isn't rocket science. http://dsp.rice.edu/sites/dsp.rice.edu/files/cs/CSintro.pdf
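A quick sketch of that idea: each hash bit is the sign of a recomputable random ± combination of the input values (the function names and parameters below are mine, just for illustration):

```python
import numpy as np

def sign_hash(x, n_bits=32, seed=0):
    """Each bit is sign(r . x) for a recomputable random r in {-1, +1}^n,
    i.e. "randomly add and subtract the numbers, take the sign"."""
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=(n_bits, len(x)))
    return (signs @ x > 0).astype(np.uint8)

a = np.random.default_rng(1).normal(size=100)
b = a + 0.01 * np.random.default_rng(2).normal(size=100)  # almost the same list
c = np.random.default_rng(3).normal(size=100)             # unrelated list

hamming = lambda h1, h2: int(np.sum(h1 != h2))
# nearly-identical lists share most hash bits; unrelated lists
# disagree on roughly half of them
```

Comparing `hamming(sign_hash(a), sign_hash(b))` against `hamming(sign_hash(a), sign_hash(c))` shows the locality-sensitive property: the near-duplicate stays much closer in hash space.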

A simpler viewpoint is that if the input data is sparse, the full information content of that data can be captured with a moderate number of random correlations. It is never going to be as compressive as, say, JPEG, but it makes fewer assumptions about the data and is easy to compute.

For LSH, if the data vector has 65536 (2^16) elements and you want 65536 hash bits, you have a slight problem: you would need n·n = 65536 × 65536 add/subtract operations, which is kind of slow. You can reduce that down to n·log2(n) = 65536 × 16 add/subtract operations by doing (recomputable) random sign flipping of the input data followed by a Walsh–Hadamard transform.
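A small sketch of the fast variant, under my own naming: flip each input's sign with a recomputable random pattern, run a fast Walsh–Hadamard transform, and take the sign of each output coefficient. Each output is then a random ± combination of all the inputs, computed in O(n log n) overall:

```python
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform (unnormalized), O(n log n).
    len(x) must be a power of 2."""
    x = x.copy()
    h, n = 1, len(x)
    while h < n:
        for i in range(0, n, h * 2):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b          # butterfly: sums
            x[i + h:i + 2 * h] = a - b  # butterfly: differences
        h *= 2
    return x

def fast_lsh_bits(data, seed=0):
    """n hash bits from n inputs: recomputable random sign flips,
    then WHT, then keep only the sign of each coefficient."""
    rng = np.random.default_rng(seed)
    flips = rng.choice([-1.0, 1.0], size=len(data))
    return (fwht(data * flips) > 0).astype(np.uint8)
```

The sign flipping is what makes the combinations effectively random; without it the WHT rows are fixed, structured ± patterns and nearby inputs could hash very unevenly.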

There is an old joke that everything in AI is dot product calculations, and so it is with the above when you look into it.