So I ran a thought experiment for this.
I thought: let's imagine an environment with only 8 states, represented by the letters A through H.
Then I connected them at random, as you can see in the image below.
Then I wrote out all the states, what they lead to, and what leads to them.
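The actual connections live in the image, so here's a minimal Python sketch with a hypothetical edge list standing in for that graph; both tables (what each state leads to, and what leads to each state) fall out of a single pass over the edges:

```python
# Hypothetical edges standing in for the random graph in the image.
edges = [("A", "B"), ("B", "C"), ("C", "E"), ("C", "H"), ("C", "B"),
         ("D", "A"), ("E", "H"), ("F", "H"), ("G", "D"), ("H", "G")]

leads_to = {}    # state -> set of states it leads to
comes_from = {}  # state -> set of states that lead to it
for src, dst in edges:
    leads_to.setdefault(src, set()).add(dst)
    comes_from.setdefault(dst, set()).add(src)

print(sorted(leads_to["C"]))    # -> ['B', 'E', 'H']
print(sorted(comes_from["H"]))  # -> ['C', 'E', 'F']
```

Nothing fancy: a single edge list is enough to recover both directions of the causal map.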
Lastly, I produced an SDR for each state that encoded the semantics of the system's causality, though of course not the likelihood of seeing any particular state, since I hadn't created that information first.
The first portion in this image, "personal identity," is created in the same manner described in my previous post above. That portion isn't actually sustainable, though: you can't have one guaranteed unique bit for every state of a system.
The to and from portions are simply a mapping of what leads to what. The state E shares a bit with F and C because all three lead to H; it shares a bit with H and B because C leads to all three of those states. So we've semantically encoded, in a generic way, the causal map of the system.

All that's left is to, as the notes below describe, increase or decrease the number of shared bits depending on how often one state leads to another in your particular application. For instance, if E leads to H constantly but F rarely leads to H, E's representation can have several 1s that correspond to "To H" whereas F only has one of those bits on.

This is where my hunch came from that we might want the encoder to be able to move columnar connections around. Say you discover H is a very popular node, so you give it more bits in the "To" representation portions. The columns that were connected to the single "To H" bit should, once you add another bit for "To H", be connected to that new bit as well.
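To make the shared-bit idea concrete, here's a minimal sketch (again with a made-up edge list, since the real graph is in the image) that gives every state one "To X" bit and one "From X" bit. A state's SDR turns on the "To X" bit for each of its successors and the "From Y" bit for each of its predecessors, so states that share a successor or predecessor automatically share bits:

```python
states = list("ABCDEFGH")

# Hypothetical edges: note C leads to E, H, and B,
# and that E, F, and C all lead to H.
leads_to = {"A": {"B"}, "B": {"C"}, "C": {"E", "H", "B"},
            "D": {"A"}, "E": {"H"}, "F": {"H"}, "G": {"D"}, "H": {"G"}}

# Invert the map to get what leads to each state.
led_from = {s: set() for s in states}
for src, dsts in leads_to.items():
    for dst in dsts:
        led_from[dst].add(src)

# One "To X" bit (indices 0..7) and one "From X" bit (8..15) per state.
to_bit = {s: i for i, s in enumerate(states)}
from_bit = {s: len(states) + i for i, s in enumerate(states)}

def sdr(state):
    """Turn on 'To X' for each successor and 'From Y' for each predecessor."""
    return ({to_bit[d] for d in leads_to.get(state, set())}
            | {from_bit[p] for p in led_from[state]})

# E and F overlap on the "To H" bit because both lead to H...
print(sdr("E") & sdr("F"))             # -> {7}, the "To H" bit
# ...and E, H, and B overlap on the "From C" bit because C leads to all three.
print(sdr("E") & sdr("H") & sdr("B"))  # -> {10}, the "From C" bit
```

This fixed-width version gives every relation exactly one bit; the frequency-weighted variant described above would instead allocate several "To H" bits and turn on more of them for states that lead to H often.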
So this is Semantic Folding for causality, nothing fancy. What if the states represented by single letters in this example were actually several characters long? Then dense spatial-representation features could be correlated and involved in the semantic encoding process. I think that in any environment that can be traversed, causal maps may be a good guide for part of the semantic encoding, but I don't think it's a one-size-fits-all solution. There's probably other information that can be pulled from the HTM model itself, or elsewhere, to help the encoder become more appropriate to the environment as explored by the intelligence.
Also, this is a 'universal encoder' but not a 'universal encoding algorithm' that can produce highly fine-tuned feature and semantic encoders, the way evolution created our organs to better serve our neocortex. So this is not the end but, I think, a clue in the right direction. Thanks for the Semantic Folding suggestion, Paul_Lamb!