Universal Structure Aware Agents

Author: Tofara Moyo


The structure of data sets can be viewed with a single tessellation whose structure we describe and explore. Knowing this information gives us the ability to learn a model of the data with this structure in mind. To that end we show that if an embodied agent’s objective were to find the lowest net cumulative entropy in its state visits, that objective would suffice for it to exhibit general intelligence. We use this tessellation as a basis to build its policy.

Can you provide an example of how this might work?

If you haven’t read the paper you won’t know what a T system is either. Please read the paper.

I did, and in particular, the section on how it might apply to music. I have been playing music for more than 50 years and did not see how this could build basic music theory as it makes many invalid assumptions.

As soon as you consider things like the common 7th chords or basic jazz structures this all falls apart. Who is going to decide what is “perceived as most pleasing?” There are many perfectly valid chord progressions that build and release tension that would not fit this simple definition of pleasing.

Likewise, microtonal music, such as the works of Harry Partch, present considerable difficulties for your assumptions. I have mentioned this to you before but you seem to have ignored this.

To flesh out the problem, the musical pitch interval must be considered in combination with both the duration interval AND the trajectory of past and future pitch and duration soundings. Grace notes and chromatic leads count as pleasing, but on the pure combination of tones they would be considered dissonant. The Time Out [Take Five] album (Dave Brubeck) or the more modern Octavarium (Dream Theater) are filled with examples of odd time structures that make the pitch intervals more “pleasing.”

The “pleasing” nature will be based somewhat on the musical vocabulary of the listener. Many beginning musicians find jazz to be chaotic and unpleasant. As your musical vocabulary expands you may start to see structure in the chaos. Many jazz players only come to jazz later in their careers as they learn more about the possible musical structures. The dynamics of the progressions reach intervals of a 12th and beyond and are often further influenced by voicing to make the structures playable. This is a very big search space.

Why is it I see problems in T-structures in relation to music? Combinatoric explosions. The Partch scales are all about degrees of consonance and you can see that he explores this to a much greater depth than simple 12 note scales. When you add in the weighting of duration of both tone and silence in sequence the search space exceeds the power of current computing hardware.

I don’t see this as a tractable problem.


This may be more abstract than most of the people in this forum care about, and for that matter, most of the musicians on this forum, but there are fundamental issues with the “circle of fifths” that you may have been taught as the basis of western music. This article explains some of the problems with “just tuning” that is the basis of the equal tempered 12-tone scale.

" In Equal Temperament, the pitches become more important than the intervals as the demands of harmony and modulation necessitate that pitch identities are fixed throughout the 12 diatonic keys. If we attempt to tune the intervals Just in one key, they will not sound Just in a remote key, in other words, a Just ‘E’ in the key of ‘C’ would be different than a Just ‘E’ in the key of ‘C#’. The compromise is made so that the “E” is equally in (or out) of tune in both keys. This compromise was brought about to necessitate the advancement of musical composition, around the time of Bach, enabling composers more freedom to modulate and use all 12 keys equally, and to use more complex harmonies."

To understand why this seemingly minor detail is important consider:
“ To go back from Equal Temperament on keyed instruments is to scrap the music of two centuries. We may have entered on an evil course-it has ruined singing for instance-but we shall have to go on with it… The Indians are up against it too: they have imported the harmonium, the issue of which is inevitably European harmony, though they don’t know it. A 25-note harmonium has been invented for them but they won’t use it-too difficult-they are settling down complacently on a 12 note scale, and contenting themselves with the dozen or so ragas it will play, and scrapping the many scores of them they used to sing…it’s all very sad.” –A.H. Fox-Strangways

Not really, music is extremely important to MI. Like language, humans are the only ones that have it (OK, you over there about to say “What about bird and whale songs?” just shut up and get in the corner). Music is a language and is intimately caught up in it along with poetry, which is almost a bridge. But now we are in Consciousness territory and I really don’t want to distract the thread.

Or you can use a computer that adjusts (retunes) as you change keys.
Even with this approach, a pure circle of fifths progression is not supported through the entire pitch range.

Or you can abandon 12 tones and move to a microtonal scale.

Music occupies an interesting position (like economics) in being part rock-solid maths and part human behaviour. Let’s stick to the maths.

My instrument is piano, and I’ll see your 50 years and raise. To my ear the piano has that unique quality of built-in disharmony solely due to the well-tempered scale. It was that along with the freedom to compose in all keys which propelled Bach and later composers in the way it did. Many instruments of the orchestra do not: a violist playing in C is free to play a slightly different G from the same note in E-flat, E minor or a seventh in A. A piano in the orchestra limits the options, but without it the strings, brass and some others can be tuned to a more harmonic scale. Modern music has largely lost that option. Microtonal does not retrieve it, and other cultures are likely to be swamped.

So can the scale be discovered by analysing (or listening to) the music? Of course it can, they are just frequencies after all. The principal key? Modulations? Chord progressions? Without a doubt, and most likely has been.

Could a suitable AI generate music, by ingesting millions of tunes and associated metadata? No doubt. Could it emit another ragtime or swing or big band or music in 5 or 7 time? I’m sure it could.

But does it shed any light whatever on AGI? I very much doubt it. If the T system of this paper has value, I doubt it will be because of any of these claims.

You see there are two things here.

The first is the music system based on music theory, which is a T system. T systems ordinarily don’t come with a way to optimise them.

The second is that music has a way to be evaluated: it spits out a number telling you how well you did.

That number is the consonance level.

If an ML algorithm were to play random chords and be rewarded with the consonance level of the “chords” it makes, it would eventually learn a model for the whole body of music theory, simply by maximising that one number: consonance.

What this means is that we have a T system that can be optimised.

If the policy of an agent is based on a T system like this, with nodes assigned to pitches in an equal-tempered way, and we reward the consonance of its “chords”, it would learn a way to orthogonalise based on movements of 5ths.

It would also be a T system that can be optimised simply by rewarding consonance.

Note that nowhere did we specify the number 5, as a hyperparameter or anywhere else in the RL algorithm.

That value will be learnt as it maximises consonance.
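To make that claim concrete, here is a minimal sketch of a learner that is only rewarded with a consonance score and still settles on the fifth. Everything in it is an assumption for illustration, not the method from the paper: the table `JUST_RATIOS` (a common 5-limit just-intonation ratio per semitone step), the score `1 / (1 + log2(p*q))`, and the epsilon-greedy bandit itself. Note that the fifth wins here because 3/2 is the simplest ratio in the assumed table, so the sketch shows the mechanism rather than proving anything about music theory as a whole.

```python
import random
from fractions import Fraction
from math import log2

# Assumed just-intonation ratio for each interval, indexed by semitone count.
JUST_RATIOS = {
    1: Fraction(16, 15), 2: Fraction(9, 8), 3: Fraction(6, 5),
    4: Fraction(5, 4), 5: Fraction(4, 3), 6: Fraction(45, 32),
    7: Fraction(3, 2), 8: Fraction(8, 5), 9: Fraction(5, 3),
    10: Fraction(9, 5), 11: Fraction(15, 8),
}

def reward(semitones):
    """Hypothetical consonance score: simpler ratios p/q score higher."""
    r = JUST_RATIOS[semitones]
    return 1.0 / (1.0 + log2(r.numerator * r.denominator))

def learn_preferred_interval(steps=500, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit over intervals; returns the best-valued one."""
    rng = random.Random(seed)
    arms = sorted(JUST_RATIOS)
    value = {arm: 0.0 for arm in arms}
    count = {arm: 0 for arm in arms}
    for t in range(steps):
        if t < len(arms):
            arm = arms[t]                    # try every interval once
        elif rng.random() < epsilon:
            arm = rng.choice(arms)           # explore
        else:
            arm = max(arms, key=value.get)   # exploit the current estimate
        count[arm] += 1
        value[arm] += (reward(arm) - value[arm]) / count[arm]  # running mean
    return max(arms, key=value.get)

print(learn_preferred_interval())  # 7 semitones, i.e. a perfect fifth
```

The number 7 (or "a fifth") appears nowhere in the learner; it only falls out of the reward table.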

What’s “perceived as pleasing” is high consonance. All music theory is contingent on this. Consonance is a function of the ratios of the frequencies of the notes. As such, this measure of pleasing is not subjective.
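As a rough illustration of "a function of the ratios of the frequencies", the sketch below scores a pair of frequencies by how simple their ratio is. The specific formula is an assumption: it snaps the ratio to a nearby small-integer fraction and uses the Tenney height `log2(p*q)`. Other measures exist (Euler's gradus function, Plomp-Levelt roughness curves) and they do not always rank intervals the same way, so the choice of `consonance` and of `max_denominator` here is hypothetical.

```python
from fractions import Fraction
from math import log2

def tenney_height(ratio: Fraction) -> float:
    """log2(p*q) for a ratio p/q in lowest terms; smaller means simpler."""
    return log2(ratio.numerator * ratio.denominator)

def consonance(f1: float, f2: float, max_denominator: int = 16) -> float:
    """Score in (0, 1]; higher means a simpler frequency ratio."""
    high, low = max(f1, f2), min(f1, f2)
    # Assumption: snap the measured ratio to a nearby small-integer ratio.
    ratio = Fraction(high / low).limit_denominator(max_denominator)
    return 1.0 / (1.0 + tenney_height(ratio))

fifth = consonance(440.0, 660.0)                 # ratio 3/2
semitone = consonance(440.0, 440.0 * 16 / 15)    # ratio 16/15
print(fifth > semitone)  # True
```

The score depends only on the ratio, not on the absolute pitch, which matches the statement above; which ratio-based formula to use is still a modelling decision.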

I take it that if we consider the activations of a NN as some sort of grammar, we cannot guarantee that we are learning an optimal grammar simply by gradient descent without intentionally making it follow sensible rules of grammar.

If the system is simultaneously using music theory to order neuron activations, we will induce a grammar that has a very clear system of rules that we know work, and that can be extended to the highest maxima.

Note I am not making “music for a listener”. No one has to hear the music the activations of the NN would be making. I just want to create this grammar and get on with it. So even if chromatic chords use notes from scales that are dissonant, we just won’t consider those chords as part of our grammar.

Please could you say something about the agent I described in the paper?

Let me mention that I am not concerned with making an algorithm that makes music for music’s sake or even for a human listener.

Consider that a neural network has a collection of nodes that fire in a pattern,
while a musical keyboard has a collection of notes that are pressed in a pattern.

Don’t you find it strange that these two things are so similar yet one is taught to learn to make that pattern (through training) in an ad hoc way, while the other has a whole body of knowledge showing sensible ways to come up with a pattern?

That is why I thought to place the neural network system in the context of the musical system by mapping the nodes of the neural network to the notes of a keyboard, then rewarding activations that would have had high consonance if it were a keyboard.
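A minimal sketch of that mapping, under stated assumptions: node `i` is treated as key `i` on a keyboard (one semitone per node), a node counts as "pressed" when its activation exceeds a hypothetical `threshold`, and the reward is the mean pairwise consonance of the pressed keys using an assumed just-intonation table `RATIOS`. None of these choices come from the paper.

```python
from fractions import Fraction
from itertools import combinations
from math import log2

# Assumed just-intonation ratio for each interval class (semitones mod 12).
RATIOS = [Fraction(1, 1), Fraction(16, 15), Fraction(9, 8), Fraction(6, 5),
          Fraction(5, 4), Fraction(4, 3), Fraction(45, 32), Fraction(3, 2),
          Fraction(8, 5), Fraction(5, 3), Fraction(9, 5), Fraction(15, 8)]

def active_keys(activations, threshold=0.5):
    """Node i maps to key i; a node is 'pressed' if it fires above threshold."""
    return [i for i, a in enumerate(activations) if a > threshold]

def consonance_reward(activations, threshold=0.5):
    """Mean pairwise consonance of the pressed keys (0.0 if fewer than two)."""
    pairs = list(combinations(active_keys(activations, threshold), 2))
    if not pairs:
        return 0.0
    scores = []
    for low, high in pairs:
        r = RATIOS[(high - low) % 12]
        scores.append(1.0 / (1.0 + log2(r.numerator * r.denominator)))
    return sum(scores) / len(scores)

triad = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]     # nodes 0, 4, 7 (major triad)
cluster = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]   # nodes 0, 1, 2
print(consonance_reward(triad) > consonance_reward(cluster))  # True
```

The reward never produces sound; it only scores which nodes fired together, which is the sense in which no listener is needed.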

Music theory describes the rules for a language. These rules allow you to produce startlingly different statements (songs).

If these rules were not used, songs would sound generic.

This phenomenon can be interpreted as follows: songs that follow the rules are orthogonal to one another, while songs that ignore them all share components randomly.

In DL it has been discovered that there is some value in orthogonal activations. They allow the model to learn distinct features and make it better at generalizing and at out-of-distribution (OOD) performance.

That is some of the value we get from this system.

Note that in real life, what is pleasing depends on the listener. Here we do not need an explicit listener anywhere near the system. We just need a way of evaluating what a particular listener would like, and the system would optimise for that listener’s taste.

This system would create music that a listener obsessed with consonance would like. That’s all.
We don’t need to make jazz or any other music that doesn’t exclusively optimize consonance to get our system to work.

It can be seen that, once learnt, the firing pattern of the NN will be similar to a structured language when constrained by the music system, as opposed to an unstructured language were its pattern learned solely by gradient descent.

[Image: Untitled Diagram (9)]
This is a diagram of my agent.

Note the difference from the usual diagram.

The policy consists of nodes (a spiking network or something else).
But the policy is split into two disjoint pieces, A and B, that do not communicate directly with each other.
They still communicate indirectly, as I will show shortly.

A receives the state, which causes some particular nodes, dependent on the state, to fire.

B is stuck with nodes that fire in a pattern in a feedback manner. So this pattern is constantly firing, and different nodes go off and on, initially at random.

The particular pattern that B fires with determines the actions. B’s pattern is randomly initialized, so the actions are random at first, but they are always determined by the pattern B fires with.

This then causes the state to transition which affects the inputs to A and causes A’s nodes to fire in a particular way.

So this way A’s pattern depends indirectly on what B’s pattern does, because

B’s patterns -> Actions -> State transitions -> A’s pattern

If we assign the nodes in A and B to the notes of a musical keyboard as if they were a combined set of nodes (not disjoint), then arrange for a reward to B that rewards it according to the consonance of B *combined* with A,

then B has to play a progression that through the indirect pathway causes A to play in such a way that the two sets of nodes harmonize.

B is controlling everything. Depending on what it plays, A will respond, through the effect B has on state transitions and on which states are input to A,
and this affects what sort of reward B will get when the consonance of the set of (A and B) nodes is evaluated.
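The loop described above can be sketched as a toy program. All of the specifics are assumptions made up for illustration: A owns keys 0..11 and B owns keys 12..23 on one combined keyboard, `action_from_b` and `step_env` are hypothetical stand-ins for the action decoding and the environment, B fires at random (no learning rule is shown), and the consonance score uses an assumed just-intonation table.

```python
import random
from fractions import Fraction
from itertools import combinations
from math import log2

# Assumed just-intonation ratio for each interval class (semitones mod 12).
RATIOS = [Fraction(1, 1), Fraction(16, 15), Fraction(9, 8), Fraction(6, 5),
          Fraction(5, 4), Fraction(4, 3), Fraction(45, 32), Fraction(3, 2),
          Fraction(8, 5), Fraction(5, 3), Fraction(9, 5), Fraction(15, 8)]

def chord_consonance(keys):
    """Mean pairwise score in (0, 1]; simpler ratios score higher."""
    pairs = list(combinations(sorted(keys), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for low, high in pairs:
        r = RATIOS[(high - low) % 12]
        total += 1.0 / (1.0 + log2(r.numerator * r.denominator))
    return total / len(pairs)

def a_pattern(state):
    """A: the state alone decides which of keys 0..11 fires."""
    return {state % 12}

def action_from_b(b_keys):
    """Hypothetical decoding of B's firing pattern into an action."""
    return sum(b_keys) % 12

def step_env(state, action):
    """Toy state transition standing in for the real environment."""
    return (state + action) % 12

def run_episode(steps=20, seed=0):
    """B's pattern -> action -> state transition -> A's pattern -> reward to B."""
    rng = random.Random(seed)
    state, rewards = 0, []
    for _ in range(steps):
        b_keys = set(rng.sample(range(12, 24), 2))   # B fires at random for now
        state = step_env(state, action_from_b(b_keys))
        a_keys = a_pattern(state)                    # A responds only via the state
        rewards.append(chord_consonance(a_keys | b_keys))
    return rewards

print(len(run_episode()))  # 20
```

A never sees B's keys directly; the only path from B to A runs through the action and the state, while the reward is computed on the union of both sets of keys.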

Why is this useful?

For one, we know that the optimal strategy for B to solve the problem is to arrange for similar states to be visited. The expected state visits should have low entropy. This will make the problem more tractable.

Unpredictability is not a good thing from the perspective of B’s problem statement. Randomness is the worst case; order, or lower entropy, will naturally be favored.

So in order to form a grammar with its activations, the state must be seen as an extension of that grammar in order to modulate A nodes appropriately.

This is what we want: an agent that seeks the lowest net cumulative entropy in state visits.
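One plausible reading of "entropy in state visits", offered as an assumption since the thread does not define it, is the Shannon entropy of the empirical distribution of visited states. The helper `visit_entropy` below is a hypothetical name for that quantity.

```python
from collections import Counter
from math import log2

def visit_entropy(states):
    """Shannon entropy (in bits) of the empirical state-visit distribution."""
    counts = Counter(states)
    n = len(states)
    return -sum((c / n) * log2(c / n) for c in counts.values())

print(visit_entropy(["s0", "s0", "s0", "s0"]))  # always the same state: zero entropy
print(visit_entropy(["s0", "s1", "s2", "s3"]))  # uniform over four states: 2.0 bits
```

An agent that keeps returning to a few similar states drives this number down, while uniformly random wandering maximises it.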

Why is this useful? When classifying the state according to a T system, the agent, if it is human-looking enough, will place itself in the same tile, since this is an optimization process and that’s optimal.

From there other tiles include language and higher “thought” in the process of modulating language.

In fact the A nodes will form a manifold that interpolates these tiles in the direction of reducing entropy, which is exactly what we want.

Humans do this, and that has led to civilization, science, art, literature, etc.

Is this still a concern of yours? Do you believe that it will be restricted, or not be ordered according to some parts of music theory?

Your agent will be learning the pairs that are whole number ratios, certainly not what I would consider music so I will agree with you. The T’s will be opposite sides of these ratios.

Would it learn about scales and keys etc. in your opinion, or is it too restricted? Ideally, if it can learn these and other complex rules, that would induce a complex policy with many nuances.

Note that it’s trying to maximise return, not reward. As such it may sacrifice consonance in one turn to achieve a greater net cumulative consonance, so this way it might get to something like jazz, which perhaps is what jazz does.
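The difference between reward and return can be shown with a small worked example. The consonance values below are made-up numbers for illustration only, and the discount factor `gamma = 0.9` is an assumption; the point is just that a progression with a worse first step can still have the higher discounted return.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of gamma**t * r_t over the sequence of per-step rewards."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

# Hypothetical per-step consonance rewards for two chord progressions.
greedy = [0.8, 0.2, 0.2]    # most consonant first chord, weak follow-up
patient = [0.3, 0.9, 0.9]   # tense first chord that resolves well

print(discounted_return(greedy) < discounted_return(patient))  # True
```

An agent maximising return would prefer the `patient` progression even though its first chord earns less, which is the tension-and-release behaviour described above.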

Also, it won’t just learn the consonance of two notes at a time; it could use multiple notes in a chord.