Hierarchy seems to me to be a red herring. why? Well hierarchy is essentially a tree structure.
To implement hierarchy requires a large data structure of finite width and depth to store the tree data and a means to link tree nodes to other using specific rules. Creating an arbitrary size tree is a complex programmatic affair and keeping track of exceptions even more challenging. A simpler solution would be recursion.
To implement recursion you need to be able to feed outputs back to inputs and a stack to keep track of context. Pattern need to be pushed on the stack and popped off to move up or down the levels of context, or simply cleared to exit the entire recursion.
With recursion anything can be a context.
Given the sideways behaviour of dendrites pattern storage, I posit that these could play the role of a stack like structure.
Recursion can model an arbitrary deep and wide hierarchy with simpler coding or biological hardware then a hierarchy specific coded system. It can also model graph structures, lists, relationships etc.
One of the issues with classic NN models is that backpropagation is a tuning or control path and this pre-supposes that the system has a model against which it can adjust error and learn using the backward paths. This line of thinking inherently, to be currently effective, requires a significant amount of a-priori knowledge of the world around us and makes unsupervised learning extremely challenging. Even state of the art deep learning requires vast number of examples to learn simple concepts.
It is not clear that the backwards paths are control or learning paths in the brain, given the high number of backward paths it makes more sense to me that most of the backward paths are recursive paths.
In turn, stacking these recursive units creates higher and higher levels of abstraction. All of which can be inherently hierarchical and also recursive.
A recursive model needs forward data, loop of output back to inputs, context storage and a control path to move up and down the level of virtual stacking.
Exceptions in the data paths could be a control path for learning. We learn effectively with new data that is repeated. However we don’t need vast library of examples to learn new things, using a stacked recursive structures which provide hierarchical like stacked contexts may be a way to learn from a much smaller set of examples
By learning within stacked contexts, there is already significant generalisation available from the context to slot in new data and open the door for single or few example learning.