Hinton says neural nets throw away position and pose information and are therefore broken. I agree. He proposes capsule networks, a forced a priori structure. I think the 5/6 layers of the cortex, and how they function in this regard as explained by Jeff Hawkins and company, are less arbitrary and could be used for this vision function. No need for capsule networks unless you need to simplify for expediency.