Before we talk about a possible design for a universal encoder, I think it would be best to make sure we are on the same page. I may not understand something fundamental about encoders, so let's take it slow...
Here's what I understand about semantic SDR encoding:
An input bit needs to represent the same thing each time it is seen. (This requires sparsity, because you can't use up every possible combination of bits the way ASCII does.) As a quick example of this concept: if the first bit represents the color blue, then whenever it is 'on' in the representation, the object described by the SDR must be at least partly blue.
Input bits that are spatially close to one another in the representation (such as the first and second indices) don't necessarily share any semantic meaning. What actually matters is that two 'similar' representations need to have overlapping 'on' bits. 'Similar' things need to have 'similar' representations.
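To make sure I have the second point right, here is a minimal sketch of what I mean by overlapping 'on' bits. It treats an SDR as nothing more than the set of indices that are 'on'. The bit meanings and the three example objects are made up for illustration; they don't come from any real encoder.

```python
def overlap(sdr_a, sdr_b):
    """Count the 'on' bits two SDRs share; each SDR is a set of on-bit indices."""
    return len(set(sdr_a) & set(sdr_b))

# Hypothetical bit meanings (assumed for this example): 0 = "blue", 19 = "small", ...
blue_ball = {0, 7, 19, 42}
blue_cube = {0, 19, 23, 57}
red_truck = {3, 11, 30, 64}

print(overlap(blue_ball, blue_cube))  # 2 (they share bits 0 and 19)
print(overlap(blue_ball, red_truck))  # 0 (nothing in common)
```

Note that the shared bits (0 and 19) are nowhere near each other in the representation; only the fact that both SDRs turn them on matters.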
The second point may be where I'm having trouble in my understanding. 'Similar' in what way? In all ways? You can have things be similar in different ways and when viewed from different angles.
Take a simple number line as an example - we have a scalar encoder where numbers that are 'similar' in the sense of 'close' share some of the same bits. However, if you want to understand numbers in the light of other mathematical concepts, you may not consider two 'close' numbers 'similar.' For example, if you want to traverse the number line with multiplication instead of addition and subtraction, then prime numbers may all be very 'similar' to each other, more so than numbers that are neighbors on the number line.
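Here is a rough sketch of that tension, assuming a toy scalar encoder of my own (a sliding window of W contiguous bits out of N_BITS; this is not NuPIC's actual ScalarEncoder, and the sizes are far less sparse than a real SDR would be). The `encode_with_primality` function is purely hypothetical: it tacks on an extra block of bits for prime numbers, just to show how a different notion of 'similar' changes which numbers overlap.

```python
N_BITS, W = 121, 21  # assumed sizes, chosen so each integer 0..100 gets its own bucket

def encode_scalar(value, min_val=0, max_val=100):
    """Toy scalar encoder: a run of W contiguous on bits whose position tracks value."""
    value = max(min_val, min(max_val, value))
    n_buckets = N_BITS - W + 1
    start = round((value - min_val) * (n_buckets - 1) / (max_val - min_val))
    return set(range(start, start + W))

def is_prime(n):
    """Trial division; fine for the small integers used here."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def encode_with_primality(value):
    """Hypothetical variant: primes additionally share a fixed block of W bits."""
    bits = encode_scalar(value)
    if is_prime(value):
        bits |= set(range(N_BITS, N_BITS + W))
    return bits

def overlap(a, b):
    return len(a & b)

print(overlap(encode_scalar(7), encode_scalar(8)))                   # 20
print(overlap(encode_scalar(7), encode_scalar(97)))                  # 0
print(overlap(encode_with_primality(7), encode_with_primality(97)))  # 21
print(overlap(encode_with_primality(7), encode_with_primality(8)))   # 20
```

With the plain scalar encoder, 7 and 97 look completely unrelated. Once 'is prime' is given its own bits, they overlap as much as 7 and 8 do. Which of the two is the 'right' encoding depends entirely on what you want the model to learn.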
Is it that the encoder needs to be specific to the application of your HTM model? That is my assumption. But a question I still have about that is: in what ways must it be tailor-made to best represent the semantic data to the HTM model? Different representations of an environment could be 'close' in many different ways. The simplest way might be 'causally' close; if I do an action (such as '+1') I get something that looks a lot like the last thing I saw (such as the next number on the number line).
Is causality an adequate guide to making semantic representations? What are other guides we could use to make semantic representations?
Let's put those two questions on hold for a second, and do a thought experiment (this should allow us to get to our conclusion):
Suppose causality was an adequate semantic guide. If so, we should be able to make a Universal Encoder algorithm. All we would have to do is make a 'causal map' (a record of which raw, or dense, representations lead to which other raw or dense representations) and encode causally linked inputs as similar sparse representations. As we see more of the environment, our causal map will get filled out, and our SDRs will change over time. Thus, as we change the semantics of how the encoder represents the environment, we must allow the encoder to reach into the HTM model and move columnar connections around to compensate for its modifications.
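To make the thought experiment a bit more tangible, here is a toy sketch of a 'causal map' encoder. Everything in it is my own assumption (the class name `CausalMapEncoder`, the `bits_per_link` parameter, and the rule itself): each newly observed transition between two raw states is given a fresh block of bits, and both states turn those bits on. It ignores fixed SDR size and sparsity, and it does not touch the hard part, which is updating the HTM model's connections when an SDR changes.

```python
from collections import defaultdict

class CausalMapEncoder:
    """Toy encoder: states linked by an observed transition share a block of bits."""

    def __init__(self, bits_per_link=4):
        self.bits_per_link = bits_per_link
        self.link_bits = {}                  # frozenset({a, b}) -> bit indices for that link
        self.state_bits = defaultdict(set)   # raw state -> its current on bits
        self.next_free_bit = 0

    def observe(self, prev_state, next_state):
        """Record that prev_state led to next_state; new links get fresh shared bits."""
        link = frozenset((prev_state, next_state))
        if prev_state == next_state or link in self.link_bits:
            return
        bits = set(range(self.next_free_bit, self.next_free_bit + self.bits_per_link))
        self.next_free_bit += self.bits_per_link
        self.link_bits[link] = bits
        self.state_bits[prev_state] |= bits
        self.state_bits[next_state] |= bits

    def encode(self, state):
        """Return a copy of the state's current on bits (empty if never seen)."""
        return set(self.state_bits.get(state, set()))

enc = CausalMapEncoder()
for n in range(5):           # walk the number line with the action '+1'
    enc.observe(n, n + 1)

print(len(enc.encode(2) & enc.encode(3)))  # 4 (causally adjacent)
print(len(enc.encode(2) & enc.encode(4)))  # 0 (not directly linked)

before = enc.encode(0)
enc.observe(0, 10)           # a newly discovered transition changes the SDR for 0
print(before == enc.encode(0))             # False
```

The last line is the crux of the problem: the representation of 0 changed after the fact, so anything the HTM model had already learned about the old SDR would need to be carried over somehow.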
I believe this is possible (unless I don't understand something fundamental to the HTM theory) because evolution has created encoders. How did evolution create them? The better the creature was able to navigate its environment (measured by the number of its offspring that reproduce), the better its encoder organ must have been at encoding SDRs for the neocortex to consume. In other words, the information about what is semantically similar (or simply the information about how best to encode the environment) reached back from the neocortex, through the algorithm of evolution, to create a highly tuned encoder. Thus it is my contention that if the encoder were appropriately intertwined with the HTM model (meaning it could get the appropriate information from the HTM model and could modify the HTM columnar connections to the input bits so as to preserve, at least in part, the model that's forming), it could evolve in real time with the HTM model. It would be, at that point, a Universal SDR Encoder algorithm.
But in order to create it, we need to answer the questions asked above: Is causality an adequate guide to making semantic representations? What are other guides we could use to make semantic representations?
Also, how does this feel to everyone? My understanding might be incomplete, so I might be oversimplifying the problem, and my interpretation of how evolution has created encoder organs may be inaccurate. But there doesn't seem to be a good explanation yet of how encoders should be created - it's more of an art than a science at this point. If it were a science we could automate it, I'm sure.