I built a Scalar Matrix-Hashing encoder.
Please read the Readme.txt
The interesting thing about it is that it should support arbitrarily large numbers.
Additionally, you may not need a Spatial Pooler if the hashing works correctly, i.e. you can link the encoder directly to the TM.
The only problem I see is how to guarantee there are no collisions between the different hash functions.
For this we would need to figure out some way to adjust the matrices based on collisions that happen over time, OR figure out a way to avoid them from the start!
My main reason for posting here is to hear your ideas on lowering collisions between the hash functions.
In the current implementation you can check the collision statistics:
se.collision_cnt
se.avg_collision
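To give a rough idea of what that bookkeeping means, here is a minimal Python sketch. The attribute names collision_cnt and avg_collision match the ones above, but the hashing scheme (a*x + b mod N per hash function) and all the sizes are just illustrative assumptions, not the actual code from the repo:

```python
import numpy as np

class ScalarMatrixHashEncoder:
    """Illustrative sketch only: hash a scalar into a set of active bits via
    several random hash functions and keep running collision statistics."""

    def __init__(self, sdr_size=2048, n_hashes=40, seed=42):
        rng = np.random.default_rng(seed)
        # one pair of random coefficients per hash function (assumed scheme)
        self.coeffs = rng.integers(1, 2**31 - 1, size=(n_hashes, 2))
        self.sdr_size = sdr_size
        self.collision_cnt = 0    # total collisions seen so far
        self.encoded_cnt = 0
        self.avg_collision = 0.0  # average collisions per encoded value

    def encode(self, value):
        # each hash function maps the scalar to one bit position
        a, b = self.coeffs[:, 0], self.coeffs[:, 1]
        bits = (a * int(value) + b) % self.sdr_size
        # a collision = two hash functions landing on the same bit
        collisions = len(bits) - len(np.unique(bits))
        self.collision_cnt += collisions
        self.encoded_cnt += 1
        self.avg_collision = self.collision_cnt / self.encoded_cnt
        return np.unique(bits)

se = ScalarMatrixHashEncoder()
sdr = se.encode(123456789)
print(se.collision_cnt, se.avg_collision)
```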
By the way, there is no similarity between nearby numbers, which may disqualify it as an encoder (nearby scalars do not share bits the way HTM encoders normally require).
Locality-sensitive hashing (LSH) is the way to go: https://en.wikipedia.org/wiki/Locality-sensitive_hashing
If you regarded LSH as feature (cue) detection, extended it with some unsupervised learning of those features, and created a read-out layer…
OK, you have a list of numbers: in a recomputable way, randomly add and subtract those numbers and take the sign of the combination as a binary bit. Hey, did you just destroy all the information in the list? No, actually not. You can still say whether the sum of the added values was greater than the sum of the subtracted values or not. If you get another bit by using a different random combination, that second combination is almost exactly orthogonal to the first. If the list of numbers is sparse in some way, the compressive sensing crowd can show you can exactly recreate the list from not so many bits. Obviously, if you have two lists that are almost the same, they will tend to output the same hash bits. Simple as that really. It isn't rocket science. http://dsp.rice.edu/sites/dsp.rice.edu/files/cs/CSintro.pdf
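To make that concrete, here is a small sketch of the random add/subtract idea (sign random projections). The sizes, seed, and example lists are arbitrary:

```python
import numpy as np

def sign_hash(values, n_bits=32, seed=0):
    """Randomly add and subtract the list entries (recomputable via the seed)
    and keep only the sign of each combination as one hash bit."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    # each row is a random +1/-1 pattern: +1 means add, -1 means subtract
    signs = rng.choice([-1.0, 1.0], size=(n_bits, len(values)))
    return (signs @ values) >= 0   # boolean hash bits

a = [1.0, 2.0, 3.0, 4.0]
b = [1.0, 2.1, 3.0, 4.0]          # nearly the same list as a
c = [-5.0, 7.0, 0.5, -2.0]        # a very different list
print(np.mean(sign_hash(a) == sign_hash(b)))  # high agreement
print(np.mean(sign_hash(a) == sign_hash(c)))  # roughly half agreement
```

Two nearly identical lists agree on most bits, while an unrelated list agrees on roughly half of them, which is exactly the locality-sensitive property.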
A simpler viewpoint is that if the input data is sparse, the full information content of that data can be captured with a moderate number of random correlations. It is never going to be as compressive as, say, JPEG, but it makes fewer assumptions about the data and is easy to do.
For LSH, if the data vector has 65536 (2^16) elements and you want 65536 hash bits, you have a slight problem: you would need to do n×n (65536×65536) (±) operations, which is kind of slow. You can reduce that to n·log2(n) (65536×16) (±) operations by doing (recomputable) random sign flipping of the input data followed by a Walsh-Hadamard transform.
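Here is a quick sketch of that trick with a plain Python Walsh-Hadamard transform; nothing below comes from a particular library, and the input size and seed are just illustrative:

```python
import numpy as np

def wht(x):
    """Iterative fast Walsh-Hadamard transform (works on a copy),
    O(n log n); the length must be a power of 2."""
    x = np.array(x, dtype=float)
    n = len(x)
    h = 1
    while h < n:
        for i in range(0, n, h * 2):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    return x

def fast_sign_hash(values, seed=0):
    """Recomputable random sign flips followed by a WHT; every output element
    is already a (+/-) combination of all the inputs, so its sign is one hash bit."""
    rng = np.random.default_rng(seed)
    flips = rng.choice([-1.0, 1.0], size=len(values))
    return wht(flips * np.asarray(values, dtype=float)) >= 0

bits = fast_sign_hash(np.arange(16.0))   # 16 inputs -> 16 hash bits
print(bits.astype(int))
```

In practice people often repeat the flip-and-transform step or subsample the output so the bits behave more like independent random projections, but a single round already shows the idea.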
There is an old joke that everything in AI is dot product calculations, and so it is with the above when you look into it.