Universal Encoder

@Bitking Isn’t grid encoding exclusively related to location encoding?
And even if they map out which sensory pattern belongs to what in the map there is no way to verify their authenticity. Right?
PS. When can I be a Jedi?

Is there any reason that a method used in one sensory encoding would not be employed in another?

In the rat entorhinal cortex, we know that at a minimum it encodes vision, whisker barrel cells, self-motion, head direction, and vestibular sensations. All have been found to result in forming a location map.

Expand your idea of mapping - you map your way through your experiences. Some examples: navigating through a social interaction, or navigating through a book or story. A common memory trick is to place things in locations in real space - known as a memory palace. Some reflection on this can show that much of what you memorize has a spatial component. Relations (joining verbs) between things form a basic tuple describing the relationship between nouns - a spatial relationship.

Yes, because they encode different information, even if it is about the same world.

Mapping is fine but a common map just unifies the patterns externally to locations.

Despite this, I don’t see how it leads to the method described in previous replies.
I am not denying the existence of spatial components in all patterns, that’s how they are encoded.

You ask about a universal encoder. I believe that is the central thrust of this thread.

Encode to … what?

So a universal encoder transforms whatever stream comes to it into some common format that is readable by an SDR neuron.

The SDR is the reader of this code. What is it that an SDR reads? Conjunctions of spots of activity over a fairly short range.
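
To make “conjunctions of spots of activity” concrete, here is a minimal sketch of one such reader. It assumes an SDR is just a set of active bit indices and that a reader is a set of synapse positions plus a threshold. The names `segment_matches`, `segment`, and `threshold` are made up for illustration, and the short-range locality is left out.

```python
from typing import FrozenSet, Set


def segment_matches(segment: FrozenSet[int], active_bits: Set[int], threshold: int) -> bool:
    """Return True when enough of the segment's synapses land on active bits."""
    return len(segment & active_bits) >= threshold


# Hypothetical reader: five synapses, fires when at least three see activity.
segment = frozenset({3, 17, 42, 56, 90})
active = {3, 17, 42, 200, 311}

fires = segment_matches(segment, active, threshold=3)
```

The reader does not care which encoder produced the bits, only that enough of the right ones are active together.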

If it is not intermingled grid patterns it is going to end up being something very much like it. A thought for you. If we know with a very high degree of certainty that a chunk of cortex works a certain way in one spot is it a stretch to expect it to work the same way in other places?

If it helps - the nervous system evolved to run bodies; to sense the surrounding space and relationships in that space and navigate through it. All external senses feed this basic mission. We have added some nifty tricks to this basic function but that has been the guiding principle from the start.

In the brain, we have a WHAT and WHERE stream. Each parses the raw information using different rules. The WHAT stream can parse semantic meaning by the grouping of like items. They don’t have to be physically adjacent - they only have to group in spaces where an SDR equivalent of a place cell can see them. See the Cortical IO “retina” for a wonderful applied example of this idea in action.

For the WHERE stream, we already know that the hippocampus reads it to learn places.

I agree with almost everything except a few details and the gist about the similarity of encodings.

A universal encoder will encode all semantic information from any type of data into an SDR (or a bit representation in general).

It might as well, but unless all encoders encode using grid cells that store semantic information, I don’t see how all types of encodings are similar. And even if sensory modalities encoded using grid cells, I don’t think they will encode all semantics; it’s just not in the grid cell’s domain to do so. What I think is being discussed is that there is a reason why there are special cells and special, complicated processing paths for every sense organ, and why each sense organ encodes a different kind of information. If there aren’t these differences then you cannot encode different meanings.

Encoders work separately and only create SDRs (or moderately dense representations). Those SDRs (bit representations) can undeniably be combined, split, used, abused, etc. by the HTM, but that doesn’t change how they were encoded by the encoder in the first place.
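
A small sketch of what “combined” can mean here, assuming SDRs are plain sets of active bit indices. The helper names `union` and `overlap` and the example bit positions are hypothetical, not taken from any HTM library.

```python
from typing import Set


def union(*sdrs: Set[int]) -> Set[int]:
    """Combine several SDRs by keeping every bit that is active in any of them."""
    combined: Set[int] = set()
    for sdr in sdrs:
        combined |= sdr
    return combined


def overlap(a: Set[int], b: Set[int]) -> int:
    """Count the active bits two SDRs share."""
    return len(a & b)


# Made-up outputs of two different encoders.
vision = {4, 19, 33, 71}
touch = {8, 19, 52, 90}

both = union(vision, touch)
shared = overlap(vision, touch)
```

Combining the two sets neither alters nor validates the bits that each encoder produced; the union simply carries them along.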

Really - when it comes down to it - can you be more universal than either a single conjunction of what and where or the transition between a pair of what-wheres? Where does it say that encoding has to be a strictly parallel process? Part of this universal encoding is the sequencing of these things as the data streams in. A sequence of what-wheres is a unit of data. A sequence is also a description of a noun-verb-noun relationship of anything that can be perceived or imagined - an action if you will. The parsed input is transformed until you reach this relatively high level of description. The “letters” being seemingly meaningless parsed patterns of bits until they map to the “words and phrases” of the WHAT and WHERE streams’ memory. The temporal part of HTM driving the transition of states in this stream.
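
As a rough sketch of “a sequence of what-wheres is a unit of data”, the snippet below pairs a what with a where and records which pair follows which. Everything in it (`WhatWhere`, `learn_transitions`, the example walk) is invented for illustration, and it only keeps first-order transitions, which is much simpler than the high-order sequence memory in HTM.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set


class WhatWhere(NamedTuple):
    what: str   # the thing perceived
    where: str  # the place it was perceived


def learn_transitions(sequence: List[WhatWhere]) -> Dict[WhatWhere, Set[WhatWhere]]:
    """Record, for each what-where, the what-wheres seen directly after it."""
    table: Dict[WhatWhere, Set[WhatWhere]] = defaultdict(set)
    for current, following in zip(sequence, sequence[1:]):
        table[current].add(following)
    return table


walk = [
    WhatWhere("door", "hall"),
    WhatWhere("stairs", "hall"),
    WhatWhere("desk", "office"),
]
transitions = learn_transitions(walk)
```

Each entry in `transitions` is one noun-verb-noun style step: from this what-where, go to that one.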

The implementation being various hardwired encoders that are genetically tailored to whatever it takes to transform each stream. Each is transformed into some data format that is common to all encodings so that they can be combined in useful ways. A what and where stream. The joy of it all is that the system is designed to learn whatever pattern that it might encounter. It’s universal. I note that grid encoding is, in fact, capable of performing these functions and has been observed doing so in the wild.

If this is in fact how the brain does work I could see more of the WHAT stream fed to the side being the speech half, with more of the WHERE stream being parsed out on the other; the one doing spatial navigation and spatial sequences in general.

Combine this with the consciousness loop I described earlier - you would have yourself something interesting.

Maybe not. But this doesn’t pertain to encoders (hypothetically).

These various hardwired encoders are what is in question. Yes, I think they have to be different for each data type.

This format is a bit representation or an SDR, but again, falling back to the main discussion, these are formed differently by the hardwired encoders that are genetically tailored. I argue that these encoders cannot be universal for all data types.

Will read that post.

In most cases you can still guess the gist of a sentence even if the order of the words is jumbled up, and you could also make a system with a single symbol representing a single ‘word’ and get rid of letters. The point being, this is subjective. The temporal nature is taken into account but the location is subjective. Correct me on this; I stand by whatever I said about universal encoders, but regarding this, I really don’t have sufficient information. Everything can be mapped with location but does everything need to be? The way we understand language could be explained without the grid cell logic, still using HTM, right? Our brain might have evolved to map everything the way you described and so everything gets processed that way, but it might not be functionally relevant in everything.

Maybe not a stretch but it is certainly not a given and without sufficient evidence with respect to other areas of cortex, there is no need or reason to assume so.

The cortex is not activated by a grid pattern but forms the grid pattern.
Look here to see how the columns do this feat with all the elements of SDR neurons.
It’s pretty much a one-to-one mapping. Please look at slide 45: note each column’s dendrites picking up some learned local part of a pattern and the winning columns inhibiting neighbors. This is all HTM canon law.
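
For anyone who wants the “winning columns inhibiting neighbors” step spelled out, here is a toy one-dimensional version. It assumes each column already has an overlap score with the input; the function name `local_winners`, the `radius` and `min_overlap` parameters, and the leftmost-wins tie-break are my own simplifications, not the slides’ or any library’s exact rule.

```python
from typing import List


def local_winners(overlaps: List[int], radius: int, min_overlap: int = 1) -> List[int]:
    """Return indices of columns that beat every neighbor within `radius`."""
    winners = []
    for i, score in enumerate(overlaps):
        if score < min_overlap:
            continue  # too weak to compete at all
        lo = max(0, i - radius)
        hi = min(len(overlaps), i + radius + 1)
        best = max(overlaps[lo:hi])
        # Ties go to the leftmost column in the neighborhood.
        if score == best and overlaps.index(best, lo, hi) == i:
            winners.append(i)
    return winners


# Hypothetical overlap scores for seven columns.
scores = [2, 5, 3, 0, 4, 4, 1]
active_columns = local_winners(scores, radius=1)
```

The surviving columns are sparse and spread out, which is the pattern-forming behavior being described.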

I wish I could say that this is all original with me but, truth be told, William Calvin has been teaching the basic elements of this for years. I don’t agree with all of his macro systems but at the micro level, he made a lot of sense.
The entire book:

Thank you for the reference slides. Great material.

Exactly my point, with reference to the OP.

Will go through the slides in more detail. Very interesting.

You might find this post interesting:

Yes I guess my naive hope is that it’ll converge on the most useful / important features over time. But I guess I’m not sure if this would naturally happen or if it is something to be explicitly engineered.

That is why there is a WHAT and a WHERE stream. The semantics of WHERE share a common framework and can be compared with one another directly. Likewise, the semantics of WHAT share a common framework and can be compared with one another. In both cases, the higher-order formats share enough properties that representations can mingle in the same space.

I understand, and this is possible, but it won’t necessarily help in correcting the encoders themselves. Semantic meaning is something learned from the patterns based on different data, and the system will map any SDR to a location regardless of its actual semantic validity with respect to the world (which is where encoders matter).

@jordan.kay I guess the possibility of correcting encoders seems plausible when a system has a lot of knowledge about the world and thus a lot of intuition with which it can validate the new patterns. Not sure though. But certainly hope this is so.

I tend to agree but I am not sure about this yet. I am not sure how this will be the case for different data types. The different high order representations can be mapped to one location but their accuracy cannot be corrected that way. Even if the high order representations contain the location information, the rest of the information is reduced from the encoder space.

For a well-known WHERE example - the entorhinal cortex successfully combines self-motion, vision, touch, body position, and the vestibular system all in the same area. For a WHAT example, see the Cortical IO products to see how a map can fold together the semantic meaning from a significant fraction of the Wiki database. That information could be read or listened to and form the same map.
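
A toy illustration of the WHAT map idea, and explicitly not Cortical IO’s actual algorithm. It assumes text snippets have already been assigned positions on a map, and treats a word’s representation as the set of positions whose snippet contains that word. The `fingerprint` helper, the snippets, and the hand-assigned positions are all invented for this sketch.

```python
from typing import Dict, Set


def fingerprint(word: str, contexts: Dict[int, str]) -> Set[int]:
    """Map positions of every context snippet that contains the word."""
    return {pos for pos, text in contexts.items() if word in text.split()}


# Hand-placed snippets; in this toy, similar topics sit at neighboring positions.
contexts = {
    0: "the cat chased the mouse",
    1: "the dog chased the cat",
    2: "stocks fell as the market closed",
    3: "the market rallied and stocks rose",
}

cat = fingerprint("cat", contexts)
dog = fingerprint("dog", contexts)
stocks = fingerprint("stocks", contexts)
```

Words used in similar contexts end up sharing positions (`cat` and `dog`), while unrelated words do not (`cat` and `stocks`), so overlap between the sets stands in for semantic similarity.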

Given that those systemic data are semantically correct and valid, this combination is valid and helps, but it could just as well combine faulty data using the same mechanisms.

Their encoder is interesting. I don’t see how it isn’t a special case for text encoding though. I have to read a bit more before I can come to a conclusion. Though I am not inclined towards calling that a universal encoding mechanism from the info I have about it.

I have described the WHAT and WHERE streams as a flexible and powerful method to parse the physical world we find ourselves in. The end product should be compatible to the point that representations from various streams can mingle in the same neural structure. The addition of WHAT&WHERE sequences is sufficient to encode the WHEN of the world. I will allow that these systems are subject to errors (sometimes called illusions), but if the presentation is ambiguous, I don’t know that any encoder could do any better with that data.

I am not sure how much more encoding you need.

If you are looking for a perfect encoder for all information you are doomed to fail - Gödel has proven that this is a fool’s errand.

I am not disregarding the WHAT&WHERE stream as a method to parse information about the world. What I am saying is that this isn’t sufficient to be a universal encoder unless the WHAT encoding mechanism is made universal, and also that this in itself cannot be used by the HTM to rectify encoders.

I am looking forward to your example of WHAT semantic information that can’t be put into words and then encoded in a system like the one Cortical IO makes.

The reason I go to the artificial text-based system instead of my much preferred entorhinal grid system is that nobody has demonstrated that the WHAT stream works the same way that has been demonstrated for spatial information.

That’s not to say that it works this way or that way - it’s just that to the best of my knowledge - nobody has looked for it yet. For the reasons that I have elaborated above - I expect that it will be the same.

Out of curiosity - do you have a counter-proposal for a universal encoder?

Why would you want to put information into words? Lots of time and processing will be required even if we use that approach. But I like the idea of converting everything into an intermediate form; in fact, I suggested this earlier. Lots of semantics will be lost if we use words unless we spend a lot of processing power to encode every little detail using them. I wouldn’t want to train a system on video streams by converting (describing) those video streams into words, if that’s what you meant.

My current stand is that I don’t think there can be a universal encoder. But we can try using genetic algorithms.

Perhaps this is off topic, but I’ve wondered this as well. Their encoding of words seems to be similar to Latent Dirichlet allocation. Would that perhaps be a good place to start when trying to develop a universal encoder, or something that could evolve into a universal encoder?
