When solving a problem mentally, it is useful to keep concepts related to that problem in short term memory. Our contribution is a way to generate thoughts with a function of a noise encoding, where the thoughts related to that thought are encoded with a noise vector and positioned proximaly close to the vector that generated the thought. This process is optimised with reward in mind. A novel aspect is that a thought sequence made by the agent is mapped onto different positions on the noise encoding using an agent that has the noise encoding as its state and moving in steps across it its actions. Simultaneously the thought sequence is generated by another agent with the same state and similar actions that maps the the noise encoding to thoughts (natural language). This oscillating system is connected to a reward in order for it not to collapse. If the state of the robot is mapped to this vector also, then memories are packed also in an optimal way.Because they are proximal, the network only has to manipulate its input slightly in order to consider the proximal thoughts or memories. This it learns how to do internally.
2 Likes