Scroll up to page 369.
As I understand it, there's some confusion about why L5 receives a driving input from L2/3. This source could explain that connection.
This book argues that L2/3 and L5 both use recurrent axon collaterals to produce attractor states, but of different types. L2/3 uses discrete attractor states, at least in the "what" pathway. That way, you don't look at a chair and think, "that's mostly a chair, but it's also a bit of a table". Instead, you just perceive it as a chair. Two objects with some shared features would get similar representations in the spatial pooler, but they should make completely different contributions to the prediction.
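To make the "chair, not a bit of a table" idea concrete, here is a minimal sketch of a discrete attractor using a toy Hopfield-style network. This is my own illustration, not the book's model and not anything from HTM; the patterns `CHAIR` and `TABLE`, the network size, and the `recall` helper are all hypothetical.

```python
# Toy Hopfield-style network: a minimal sketch of discrete attractors.
# CHAIR and TABLE are hypothetical patterns chosen only for illustration.
CHAIR = [1, 1, 1, 1, -1, -1, -1, -1]
TABLE = [1, -1, 1, -1, 1, -1, 1, -1]
PATTERNS = [CHAIR, TABLE]
N = len(CHAIR)

# Hebbian weights: units that agree in a stored pattern excite each other.
# The diagonal is zero, so no unit drives itself.
W = [[0 if i == j else sum(p[i] * p[j] for p in PATTERNS) for j in range(N)]
     for i in range(N)]


def recall(cue, max_sweeps=10):
    """Update units one at a time until the state stops changing."""
    state = list(cue)
    for _ in range(max_sweeps):
        changed = False
        for i in range(N):
            drive = sum(W[i][j] * state[j] for j in range(N))
            # Keep the current value when the recurrent drive is exactly zero.
            new = state[i] if drive == 0 else (1 if drive > 0 else -1)
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:
            break
    return state


# A chair with one table-like feature (unit 1 flipped toward TABLE).
chair_like = [1, -1, 1, 1, -1, -1, -1, -1]
print(recall(chair_like) == CHAIR)
```

The cue shares features with both stored patterns, but the recurrent connections pull it all the way to `CHAIR` rather than leaving it in between, which is the behaviour the book attributes to L2/3.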
In contrast, L5 uses continuous attractor states because there is a continuous gradient between actions. For example, you can move your arm to a continuous range of angles. To produce this continuity, L5 mimics the discrete attractor states of L2/3 but uses its own recurrent collaterals to smooth them out into continuous attractor states.
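For comparison, here is a rough sketch of a continuous attractor, using the arm-angle example. It is a linear idealization of a ring attractor that I am assuming for illustration, not the mechanism the book proposes for L5. The ring size `N`, the cosine weights, and the helper names are hypothetical, and activity is measured relative to a baseline, so it can be negative.

```python
import math

N = 16  # number of units on the ring (hypothetical size)
PREFERRED = [2 * math.pi * i / N for i in range(N)]  # preferred angle per unit

# Recurrent weights depend only on the angular distance between two units.
W = [[(2.0 / N) * math.cos(a - b) for b in PREFERRED] for a in PREFERRED]


def bump(angle):
    """Cosine-shaped activity profile centred on `angle` (radians)."""
    return [math.cos(p - angle) for p in PREFERRED]


def step(state):
    """One pass of the activity through the recurrent weights."""
    return [sum(w * s for w, s in zip(row, state)) for row in W]


def decode(state):
    """Population-vector readout of the represented angle, in radians."""
    x = sum(s * math.cos(p) for s, p in zip(state, PREFERRED))
    y = sum(s * math.sin(p) for s, p in zip(state, PREFERRED))
    return math.atan2(y, x)


# Two nearby angles both persist; neither is pulled toward a stored prototype.
for angle in (0.30, 0.31):
    state = bump(angle)
    for _ in range(10):
        state = step(state)
    print(round(decode(state), 6))
```

Unlike the discrete case, every angle is a stable state here, so nearby actions keep nearby but distinct representations. Activity that does not fit the bump shape, such as alternating unit-by-unit noise, is removed by a single `step`.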
My thoughts on this:
I wonder if this could work in the "where" pathway, which seems to require continuous attractor states since location is continuous. Maybe it mimics the continuous attractor states of L5 in the "what" pathway. It might be useful to represent egocentric location as pairs of a possible action and the sensory input that action would produce, because that could make it easier to generate location-dependent actions. That could explain mimicking L5, if L5 cells can spike predictively. In some cases it could also be more effective to represent those action and sensory input pairs continuously, namely when similar actions lead to sensing overlapping sets of features (e.g. looking at two nearby points brings two overlapping sets of features into the visual field), so L5 in the "what" pathway could be especially suited to this task.
I think L5 shouldn't just represent actions continuously. For example, force is continuous, but the choice of which muscles perform the action is not, nor is the order in which those muscles contract. Maybe that's why L5 smooths out a discontinuous representation rather than producing its own continuous representation. That way, it can smooth out some aspects of the representation while leaving other parts discrete.
I don't understand exactly how this hypothesis would work. However, intentionally producing continuous or discontinuous representations could be useful. Right now, the spatial pooler and union pooler probably produce semi-continuous representations.