Really - when it comes down to it - can you be more universal than either a single conjunction of what and where, or the transition between a pair of what-wheres? Where does it say that encoding has to be a strictly parallel process? Part of this universal encoding is the sequencing of these things as the data streams in. A sequence of what-wheres is a unit of data. A sequence is also a description of a noun-verb-noun relationship for anything that can be perceived or imagined - an action, if you will. The parsed input is transformed until it reaches this relatively high level of description. The “letters” are seemingly meaningless parsed patterns of bits until they map to the “words and phrases” held in the memory of the WHAT and WHERE streams. The temporal part of HTM drives the transitions between states in this stream.
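To make the "sequence of what-wheres as a unit of data" idea concrete, here is a rough sketch. It is a plain first-order transition table, not HTM temporal memory, and every name in it (`TransitionMemory`, the `("cup", "table")` style pairs) is something I made up for illustration.

```python
from collections import defaultdict


class TransitionMemory:
    """Toy first-order memory of transitions between (what, where) pairs.

    Assumption: each state is a hashable (what, where) tuple. Real HTM
    temporal memory is higher-order and works on sparse representations.
    """

    def __init__(self):
        # Maps each seen state to the set of states that followed it.
        self._next = defaultdict(set)

    def learn(self, sequence):
        # Record every adjacent pair in the sequence as one transition.
        for current, following in zip(sequence, sequence[1:]):
            self._next[current].add(following)

    def predict(self, state):
        # Return a copy so callers cannot mutate the stored transitions.
        return set(self._next.get(state, set()))


# A noun-verb-noun style "action": the cup moves from table to hand to mouth.
walk = [("cup", "table"), ("cup", "hand"), ("cup", "mouth")]
memory = TransitionMemory()
memory.learn(walk)
```

Asking `memory.predict(("cup", "table"))` gives back the set holding `("cup", "hand")`, which is the whole point: the transition is the unit, and the WHAT stayed put while the WHERE changed.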
The implementation is a collection of hardwired encoders, each genetically tailored to whatever it takes to transform its stream. Each stream is transformed into a data format common to all encodings so that they can be combined in useful ways: a what stream and a where stream. The joy of it all is that the system is designed to learn whatever pattern it might encounter. It’s universal. I note that grid encoding is, in fact, capable of performing these functions and has been observed doing so in the wild.
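As a sketch of what "a data format common to all encodings" could look like, the toy below maps any value from any stream onto a small set of active bit indices and then joins a what encoding with a where encoding. The hash-based encoder is only a stand-in I chose for brevity, not a grid encoding and not anything from an HTM library; `WIDTH`, `ACTIVE`, `encode`, `conjunction` and `overlap` are all hypothetical names.

```python
import hashlib

WIDTH = 256  # assumed size of the shared bit space
ACTIVE = 8   # assumed number of active bits per encoding


def encode(stream_name, value, width=WIDTH, active=ACTIVE):
    """Deterministically map a value to a set of active bit indices.

    Assumption: a hash stands in for a real, stream-specific encoder.
    """
    bits = set()
    counter = 0
    while len(bits) < active:
        key = f"{stream_name}:{value}:{counter}".encode()
        digest = hashlib.sha256(key).digest()
        bits.add(int.from_bytes(digest[:4], "big") % width)
        counter += 1
    return frozenset(bits)


def conjunction(what_bits, where_bits, width=WIDTH):
    """Join the two streams by placing the where bits in a second half."""
    return frozenset(what_bits) | frozenset(b + width for b in where_bits)


def overlap(a, b):
    """Count shared active bits, a crude similarity measure."""
    return len(a & b)


cup_on_table = conjunction(encode("what", "cup"), encode("where", "table"))
cup_in_hand = conjunction(encode("what", "cup"), encode("where", "hand"))
```

Because both streams end up as sets of bit indices, the two conjunctions share at least the `ACTIVE` bits contributed by the common WHAT, so `overlap(cup_on_table, cup_in_hand)` is at least 8 here.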
If this is in fact how the brain works, I could see more of the WHAT stream being fed to one side, the speech half, with more of the WHERE stream being parsed out on the other, the one doing spatial navigation and spatial sequences in general.
Combine this with the consciousness loop I described earlier and you would have yourself something interesting.