Using attractors to distill topology from semantics

I thought I would start a new thread to discuss ideas on how to use attractors to distill topology from semantics. This could be relevant, for example, in an implementation of semantic folding.

I’ll start the conversation with an idea I had today. The first step is one that I have used in the past. Give each unique input, when it is encountered for the first time, a randomly placed “center of mass” on an oversized 2D array of bits. Then choose a sparse sampling of bits such that bits closer to the center of mass are more likely to be chosen than bits further away from it. Thus every new input initially has a unique SDR (with some random overlap with other SDRs), but does not have any encoded semantics.
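A minimal sketch of that first step might look like the following. The map size, bit count, and Gaussian spread are illustrative parameters I’ve made up, not values from any particular implementation:

```python
import numpy as np

def new_sdr(rng, map_size=256, n_bits=40, sigma=12.0):
    """Assign a random center of mass, then sample distinct bit
    coordinates with probability that decays with distance from it."""
    center = rng.uniform(0, map_size, size=2)
    chosen = set()
    while len(chosen) < n_bits:
        # Gaussian cloud around the center; resample anything off the map
        x, y = rng.normal(center, sigma)
        xi, yi = int(round(x)), int(round(y))
        if 0 <= xi < map_size and 0 <= yi < map_size:
            chosen.add((xi, yi))
    return center, np.array(sorted(chosen), dtype=float)

rng = np.random.default_rng(0)
center, bits = new_sdr(rng)
```

Because the sampling is a Gaussian cloud rather than a fixed radius, two unrelated inputs can still overlap by chance, which matches the “some random overlap” property above.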

The idea I had today was that the bits in the 2D array could actually be moved around on the map (versus adjusting weights on the bits, which I have tried in the past). When you wanted two inputs to share some semantics, you could essentially use their centers of mass as magnets that pull on each other’s bits. Bits closer to the “magnet” would be attracted more strongly than those further away. The bits could also carry some amount of momentum, displacing other bits as they move.
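Here is one way the “magnet” update could be sketched. The strength and radius parameters, and the Gaussian falloff in particular, are assumptions on my part; any monotonically decreasing function of distance would fit the description:

```python
import numpy as np

def attract(bits, magnet, strength=0.1, radius=40.0):
    """Move each bit a step toward the magnet, with closer bits
    pulled by a larger fraction of the remaining distance."""
    delta = magnet - bits                              # vectors toward the magnet
    dist = np.linalg.norm(delta, axis=1, keepdims=True)
    pull = strength * np.exp(-(dist / radius) ** 2)    # falloff with distance
    return bits + pull * delta
```

Applying this once per co-occurrence of the two inputs would gradually clump their bits together while leaving distant bits nearly untouched. (Momentum/displacement of third-party bits is omitted here.)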

The clumped coordinates of the bits on the overall map could then be scaled down to a working size, flattened, and normalized into a smaller uniform 2D array. The SDRs for the two inputs would, in theory, now have their bits closer to each other, with some overlap due to the scaling down.
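The scale-down step could be sketched like this; the big/small map sizes are again placeholder values:

```python
import numpy as np

def to_sdr(bits, big=256, small=64):
    """Scale bit coordinates onto a smaller uniform grid. Bits that
    were clumped together on the big map collapse onto the same cells,
    which is what produces overlap between related SDRs."""
    grid = np.zeros((small, small), dtype=bool)
    scaled = np.clip((bits * small // big).astype(int), 0, small - 1)
    grid[scaled[:, 0], scaled[:, 1]] = True
    return grid
```

Note that collapsing clumped bits onto shared cells also reduces the active bit count, so sparsity would need to be managed if a fixed number of active bits is required.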

Any other ideas for how attractors could be used for generating topology? Obvious problems with the above strategy?


One of the “breakthroughs” for me was realizing that the cortical.io people form their SOM all in a single batch. I am thinking that with an attractor model that is defined as the content is added (forming and shaping pools of attraction), the map will form as you stream the training set at it, and continue to refine with use after the initial training sessions. A stream encoder to spatially distribute the training would be a key part of making this work.

This is how I see the pools forming in my mind’s eye. Of course, the data at higher levels of representation would not look like a picture of the object.

I have been noodling on how to form both the grammar and semantic content with the same training process.

The latest frisson of excitement to hit me on this is the post about a chatbot on another thread. In it, I referenced the “frames organization model” of world information; I don’t see any reason that this could not be formed using the same process.

I’ve been thinking along these lines for a while and it seems to me like a really interesting avenue for exploration.

I’m actually involved in a commercial project which has funding to develop an NLP application using similar principles. I’d be really interested to hear from anyone with time, interest and knowledge of Scala/Java and/or C++, especially in an HTM context…

In any case, looking forward to seeing how the discussion evolves. For my part, I think that any general solution for human-machine communication will have to take into account the ‘fuzzy’ way that humans use language, and my hunch is that attractors will have something to do with that.

The recent HTM School video on grid cells got me thinking of another strategy:

Start with a 2D map. Give each unique input, when it is encountered for the first time, its own unique grid-cell definition for that map (randomized scale and orientation). Also give each unique input its own float array, aligned with the map, to use for scoring coordinates on the map.

When you want two inputs to share some semantics, you determine the coordinates on the map where their grids best align, and increase the score for that point.
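A sketch of that alignment step might look like the following. The cosine-lattice grid model and the learning rate are my assumptions; any periodic firing map with a randomized scale and orientation per input would serve the same role:

```python
import numpy as np

def grid_activity(scale, theta, size=64):
    """Idealized grid-cell firing map: the sum of three cosine plane
    waves rotated 60 degrees apart, giving a hexagonal pattern."""
    ys, xs = np.mgrid[0:size, 0:size].astype(float)
    act = np.zeros((size, size))
    for k in range(3):
        ang = theta + k * np.pi / 3
        act += np.cos(2 * np.pi * (xs * np.cos(ang) + ys * np.sin(ang)) / scale)
    return act

def reinforce_alignment(score_a, score_b, grid_a, grid_b, lr=0.1):
    """Find the coordinate where both grids fire most strongly
    together, and bump both inputs' score maps there."""
    overlap = grid_a * grid_b
    peak = np.unravel_index(np.argmax(overlap), overlap.shape)
    score_a[peak] += lr
    score_b[peak] += lr
    return peak
```

Because each input's grid pattern is periodic, repeated co-occurrences reinforce a small set of coordinates rather than one arbitrary point, which is what lets those points behave like attractors.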

You use the concept of eligibility traces to determine how much two values should impact each other in a given context. For example, consider the sequence:

A -> B -> C -> D

When scoring the coordinates where the grid cells align, C will have more impact on D than B does, and B more than A. Anywhere the grid cells for more than one of the inputs align will end up scored higher, so those points become attractors for capturing the semantic similarities shared between them.

If later we see something like X -> C -> Y -> D, the positions where grid cells align between C and D become a little stronger, and so on. At any point, a representation for a particular input could be returned by generating a sparse boolean array from the float array, using a winner-takes-all strategy with a random tie breaker. The system could continue to learn online, producing better and better SDRs for the inputs as they are encountered in different contexts.
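The trace-weighting and winner-takes-all readout could be sketched as follows; the decay constant and the tiny jitter used as the tie breaker are assumed values:

```python
import numpy as np

def update_traces(traces, current, decay=0.5):
    """Decay every eligibility trace, then set the current input's
    trace to 1, so recent inputs carry more weight than older ones."""
    traces = {k: v * decay for k, v in traces.items()}
    traces[current] = 1.0
    return traces

def sdr_from_scores(scores, n_active, rng):
    """Winner-takes-all readout: the n_active highest-scoring cells
    become the SDR, with tiny random jitter breaking ties."""
    flat = scores.ravel() + rng.uniform(0, 1e-9, scores.size)
    winners = np.argpartition(flat, -n_active)[-n_active:]
    sdr = np.zeros(scores.size, dtype=bool)
    sdr[winners] = True
    return sdr.reshape(scores.shape)
```

After streaming A -> B -> C -> D through `update_traces`, C's trace is higher than B's, which is higher than A's, matching the weighting described above.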
