The difficulty is here in my opinion. We need to overcome the scaling issues I think. One approach I’m currently investigating is related to triadic memory. This topic was suggested by a member of this community in one of my threads and it made me discover an entire thread around it here: Triadic Memory — A Fundamental Algorithm for Cognitive Computing - #8 by cezar_t.
The problem I see with this model is the size it takes in memory even without storing any value in it yet. It’s O(n^3) in the number of bits for the triadic, which is… bad… but O(n^2) for the dyadic one. The concept is interesting. That made me make a link with Bloom filters somehow because one can see the dyadic memory as a CBF where the hash functions are somehow defined by the way the hamming distance determines the neighbors and thus the locations of the counters to be updated.
I also thought about embedding things in 2D or more-dimension spaces like you are suggesting but the difficulty is that one somehow needs to perform geometric transformations that should keep the relations between some points but transform others without breaking everything. What would be the objective function for selecting the right points and the right parameters for the transformation? I thought maybe having a reference frame for each word in a dataset of sentences and update based on co-occurence of words but that would likely break all already established relations.