Structuring Concept Space With The Musical Circle Of Fifths By Utilizing Music Grammar Based Activations - Tofara Moyo

## Abstract

In this paper, we explore the intriguing similarities between the structure of a discrete neural network, such as a spiking network, and the composition of a piano piece. While both involve nodes or notes that are activated sequentially or in parallel, the latter benefits from the rich body of music theory to guide meaningful combinations. We propose a novel approach that leverages musical grammar to regulate activations in a spiking neural network, allowing for the representation of symbols as attractors. By applying rules for chord progressions from music theory, we demonstrate how certain activations naturally follow others, akin to the concept of attraction. Furthermore, we introduce the concept of modulating keys to navigate different basins of attraction within the network. Ultimately, we show that the map of concepts in our model is structured by the musical circle of fifths, highlighting the potential for leveraging music theory principles in deep learning algorithms.

If you understand the abstract, here is the part on adapting this to the Thousand Brains Theory (TBT):

(PDF) Structuring Concept Space With The Musical Circle Of Fifths By Utilizing Music Grammar Based Activations (researchgate.net)

Section III: Related works

Subsection B: Grid cells and using the Thousand Brains Theory

This will be per column per layer:

So we use a Graph Neural Network (GNN) with self-attention to create the ground-truth representation, while another, fully connected SNN is used to map states to outputs. The populations will be the common activations that this network has, while the GNN will take as input a vector that represents the SNN's hypothesis at time t, which it (the ANN) outputs alongside actions. Since a learnt population has geometric significance (read the paper) and GNNs with self-attention have variable topology, we can arrange for each node of the GNN to correspond to one of the populations active at time t, inducing a configuration of the GNN that includes the angles of the populations. Only the populations that are activating will be represented as nodes in the GNN. The GNN then outputs a vector, which is its hypothesis of the object being attended to by the robot, while the SNN outputs its own hypothesis. We then train the SNN with the GNN's output as ground truth. Since the GNN's output depends only on the populations currently active (its topology), its hypothesis matches reality more closely, and so this is valid. Of course, initially this will not be true, but as learning occurs with a global reward it becomes more and more true.
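To make the data flow above concrete, here is a minimal sketch in plain NumPy. It is not the paper's implementation: the activity threshold, the (activity, cos angle, sin angle) node features, the single untrained attention layer with mean pooling, and the mean-squared-error loss are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def active_population_graph(activity, angles, threshold=0.5):
    """Keep only populations whose activity exceeds the (assumed) threshold.

    Returns the indices of the active populations and one feature row per
    node: (activity, cos(angle), sin(angle)).
    """
    active = np.flatnonzero(activity > threshold)
    feats = np.stack(
        [activity[active], np.cos(angles[active]), np.sin(angles[active])],
        axis=1,
    )
    return active, feats

def self_attention_readout(feats, w_q, w_k, w_v):
    """One dense self-attention layer over the active nodes, then mean pooling."""
    q, k, v = feats @ w_q, feats @ w_k, feats @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return (weights @ v).mean(axis=0)

def distillation_loss(snn_hypothesis, gnn_hypothesis):
    """Mean squared error, treating the GNN output as the target for the SNN."""
    return float(np.mean((snn_hypothesis - gnn_hypothesis) ** 2))

rng = np.random.default_rng(0)
n_pop, d = 8, 4  # hypothetical sizes
activity = np.array([0.9, 0.1, 0.7, 0.0, 0.8, 0.2, 0.05, 0.6])
angles = np.linspace(0.0, 2.0 * np.pi, n_pop, endpoint=False)
w_q, w_k, w_v = (rng.normal(size=(3, d)) for _ in range(3))

active, feats = active_population_graph(activity, angles)
gnn_hypothesis = self_attention_readout(feats, w_q, w_k, w_v)
snn_hypothesis = rng.normal(size=d)  # stand-in for the SNN's own output
loss = distillation_loss(snn_hypothesis, gnn_hypothesis)
```

Because the node set is rebuilt from whichever populations are active, the graph's size and layout change from step to step, which is the variable-topology property the paragraph relies on. The sketch assumes at least one population is active.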

The same GNN will evaluate each column's output. Since it will have a different topology per column, its hypothesis for the real object will affect the voted-for column (read the paper) differently, i.e. it will learn a different hypothesis.

To see why it will converge: the GNN makes its predictions based on its approximation of concepts (the nearest node from the set of activations), while the SNN makes predictions only on the total activations. Some of these activations will be noisy, i.e. they blur the concept (do not harmonise), so the GNN is working with better information and will converge faster to output more accurate, concept-based (i.e. thought-out) hypotheses, while the SNN will also converge, but only because we are training it with the more accurate signal from the GNN.
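A toy illustration of this argument, under loud assumptions: concepts are modelled as fixed prototype vectors, "nearest node" is taken to mean nearest in Euclidean distance, and the noise is hand-picked. None of these choices come from the paper, and `nearest_concept` is a hypothetical helper.

```python
import numpy as np

def nearest_concept(activation, prototypes):
    """Index of the prototype closest (Euclidean) to the activation pattern."""
    dists = np.linalg.norm(prototypes - activation, axis=1)
    return int(np.argmin(dists))

prototypes = np.eye(3)  # three hypothetical concept patterns
clean = prototypes[1]   # the concept actually present
noisy = clean + np.array([0.2, -0.1, 0.15])  # blurred ("non-harmonising") activation

idx = nearest_concept(noisy, prototypes)
snapped = prototypes[idx]

raw_error = float(np.linalg.norm(noisy - clean))        # what the SNN-style readout sees
snapped_error = float(np.linalg.norm(snapped - clean))  # what the concept-based readout sees
```

Here the snapped pattern recovers the clean concept exactly, while the raw activation keeps its blur. This is the sense in which a concept-based readout works with better information, as long as the noise is small enough not to push the activation closer to a wrong prototype.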

Initially there will be no difference between the two, and the ANN will be learning a way to represent basins as populations from both the global reward signal and the consonance-maximisation reward signal. As these concepts start to appear, the GNN gets better and better topological preferences and learns to output even better predictions than the SNN.

So it is the SNN's ability to learn a better representation of concepts that makes the GNN's hypotheses more accurate than the SNN's own.

The SNN makes its hypothesis based on fuzzy representations, while the GNN makes its predictions on symbols (the nodes of the GNN) and logic (the GNN's topology). So the GNN's hypothesis will become more accurate as the activations resemble meaningful chords more and more with training time.