Are there any previous attempts at an associative memory, or (even better) a very large database for SDRs? @SeanOConnor - for the associative memory you're working on, do you have performance vs. capacity metrics you aim for? I mean, in terms of how many patterns it can store and how many queries it can answer per second?
I have a feeling that an associative memory should scale very well with storage, which is cheaper and more power efficient than scaling with computing power, which is now the trend with transformers.
Since some research suggests a functional equivalence between associative memory and attention, one question that arises is: what is the price/performance tradeoff?
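To make that equivalence concrete, here is a minimal sketch (my own illustration, with made-up dimensions, not code from any specific paper): attention-style retrieval, i.e. a softmax over query-key similarities followed by a weighted sum, acts as content-addressable recall from a set of stored patterns.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
d = 64          # pattern dimension (arbitrary choice)
n = 100         # number of stored patterns
patterns = rng.standard_normal((n, d))

# Query with a noisy version of one stored pattern
query = patterns[42] + 0.1 * rng.standard_normal(d)

# Attention-style retrieval: softmax over similarities, then a
# weighted sum of the stored patterns (keys and values coincide here)
beta = 8.0  # inverse temperature; higher beta -> sharper retrieval
weights = softmax(beta * patterns @ query / np.sqrt(d))
retrieved = weights @ patterns

print(int(weights.argmax()))  # index of the best-matching stored pattern
```

With a sharp softmax the weight vector is nearly one-hot, so the weighted sum essentially returns the stored pattern closest to the query - which is exactly what an associative memory does.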
Just as a back-of-the-envelope estimate, a cluster with 100 x 1 TByte SSD drives would arguably match the brain's capacity of roughly 100T parameters, while the largest transformer models struggle with two orders of magnitude fewer.
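The arithmetic behind that estimate, assuming (hypothetically) 1 byte per parameter:

```python
# Back-of-the-envelope: storage needed for 100T parameters,
# assuming 1 byte per parameter (fp16 would double this)
params = 100e12
bytes_per_param = 1
total_bytes = params * bytes_per_param
print(total_bytes / 1e12)  # -> 100.0, i.e. 100 x 1 TByte SSDs
```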
Of course, lots of open questions remain, like:
- how would one index/optimize the data in order to maximize query performance?
- what performance would one get from, e.g., a 100-node cloud, and would that be sufficient to… ?
- then what? How would one design and then implement "intelligence" on top of a sufficiently large associative memory database?
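On the indexing question, one cheap option for sparse patterns is an inverted index from active bits to pattern ids, so a query only touches patterns that share at least one bit with it. A toy sketch (the class and its API are my own invention, just to illustrate the idea):

```python
from collections import defaultdict

class SDRMemory:
    """Toy associative memory for SDRs, where each pattern is a set of
    active bit indices. Queries use an inverted index (bit -> pattern
    ids) instead of scanning the whole store."""

    def __init__(self):
        self.patterns = []             # id -> frozenset of active bits
        self.index = defaultdict(set)  # active bit -> ids containing it

    def store(self, active_bits):
        pid = len(self.patterns)
        sdr = frozenset(active_bits)
        self.patterns.append(sdr)
        for bit in sdr:
            self.index[bit].add(pid)
        return pid

    def query(self, active_bits):
        # Count overlaps only for candidate patterns reached via the
        # index; return the id with the largest overlap, or None.
        overlap = defaultdict(int)
        for bit in active_bits:
            for pid in self.index[bit]:
                overlap[pid] += 1
        if not overlap:
            return None
        return max(overlap, key=overlap.get)

mem = SDRMemory()
a = mem.store({1, 5, 9, 42, 100})
b = mem.store({2, 6, 10, 43, 101})
print(mem.query({1, 5, 9, 44, 102}) == a)  # noisy query still finds a
```

With SDR sparsity around 2%, each query bit touches only a small fraction of the stored patterns, which is what would have to hold for the SSD-cluster idea above to answer queries fast.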
Sorry for the long message - Microsoft's new 500B parameter model is trained on a ~$100M machine (560 DGX A100 nodes priced at $199k each).