Crafting a Reward Function with a Heir-achy of Needs for Complex Goal Formation

Author: Tofara Moyo
Abstract

We present a simple method by which an agent learns levels of abstraction, ordered by priority, that ultimately increase the global expected reward. Each level is associated with a separate scalar output of the neural network at each time step t, which is fed back to the agent as part of the state at time t+1. The agent initially correlates these scalars with features of the state at random, but it learns the correct assignment because it is trained to make the association in a way that increases the global reward. We describe an equation that orders these scalar values and the global reward by priority, thereby inducing a hierarchy of needs for the agent. This hierarchy then forms the basis of the agent's goal formation.
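To make the feedback loop concrete, here is a rough Python sketch of the two ideas in the abstract: scalar outputs fed back into the next state, and a reward that ranks the global reward and those scalars by priority. The abstract does not give the paper's equation, so the geometric weighting below is an assumption chosen only for illustration, not the author's formula. All names (augment_state, prioritized_reward, toy_policy, rollout) are hypothetical, and toy_policy is a random stand-in for the neural network.

```python
import random


def augment_state(observation, need_signals):
    """Append the previous step's need signals to the raw observation."""
    return list(observation) + list(need_signals)


def prioritized_reward(global_reward, need_signals, base=0.1):
    """Combine rewards with geometrically shrinking weights (assumed form).

    The global reward has weight 1 and need level k has weight base**(k + 1).
    With signals bounded in [0, 1] and a small base, a full-range change at
    one priority level outweighs all lower levels combined.
    """
    total = global_reward
    for level, signal in enumerate(need_signals):
        total += (base ** (level + 1)) * signal
    return total


def toy_policy(state, num_levels, rng):
    """Random stand-in for a network: one action plus num_levels scalars in [0, 1]."""
    action = rng.choice([0, 1])
    needs = [rng.random() for _ in range(num_levels)]
    return action, needs


def rollout(num_steps=5, num_levels=3, seed=0):
    """Run the loop: scalars emitted at time t become part of the state at t+1."""
    rng = random.Random(seed)
    needs = [0.0] * num_levels  # no signals exist before the first step
    history = []
    for t in range(num_steps):
        observation = [float(t)]  # placeholder environment observation
        state = augment_state(observation, needs)
        action, needs = toy_policy(state, num_levels, rng)
        global_reward = float(action)  # placeholder environment reward
        history.append((state, prioritized_reward(global_reward, needs)))
    return history


if __name__ == "__main__":
    for state, reward in rollout():
        print(state, round(reward, 4))
```

In a real agent the random policy would be replaced by a trained network, and the combined value would serve as the training signal. Whether the paper's equation resembles this weighting is something to check against the linked publication.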

https://www.researchgate.net/publication/385249902_Crafting_a_Reward_Function_with_a_Heir-achy_of_Needs_for_Complex_Goal_Formation
