I may have another theory on why neurons have thousands of synapses: to classify many very different things as being the same thing.
A single cell represents a feature, e.g. one cell represents a nose, another a mouth, another the eyes, etc. The union of these cells represents a face SDR (Cortical's video is a great demo). The cells in this SDR can be combined as inputs to a cell higher in the hierarchy that represents the face feature. These cells learn to represent their features through feed-forward competitive self-organisation (unsupervised learning). However, this type of learning only allows cells to represent similar inputs (dimensionality reduction). But how could a face cell represent a face when faces can be very dissimilar? By dissimilar I mean that a face can be turned to the side, tilted up, in bright light, in strobe light, far away, up close, etc. All of these different versions of the face need to activate the same 'face' cell.
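To make the idea concrete, here is a minimal sketch, treating an SDR as a set of active cell indices. The cell indices, receptive field, and threshold are all illustrative assumptions, not taken from any particular implementation:

```python
# Hypothetical lower-region feature cells, one index per feature.
NOSE, MOUTH, EYES = 0, 1, 2

# The face SDR is the union of the active feature cells.
face_sdr = {NOSE, MOUTH, EYES}

def face_cell_active(active_cells, receptive_field, threshold=2):
    """A higher-region cell fires when enough of its feed-forward
    inputs (its receptive field) are in the currently active set."""
    return len(active_cells & receptive_field) >= threshold

# Even with the nose missing, the face cell still activates.
print(face_cell_active({MOUTH, EYES}, face_sdr))  # True
```

The sub-threshold tolerance (2 of 3 here) is what lets a partial input still drive the higher cell, which the next paragraph relies on.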
A way to achieve this is by combining feed-forward (bottom-up) unsupervised learning with feed-back (top-down) supervised learning. The supervision does not come from any external teacher; it comes from the region above. If most of the features of the face are part of the input (eyes, mouth, hair), the face cell will still activate. The face cell's activation is passed back down to the lower region, to the eyes, mouth and hair cells that were just active. It also passes activation to the other feature cells that were not active but usually are (nose, ears). Whatever feed-forward inputs are active during this feed-back get their synapses strengthened. It is essentially inferring that whatever feed-forward input occurred in place of a nose is a nose. If this happens enough, that input will become a feed-forward driver for the nose feature. This allows many different variations of noses to be learned by the nose cell, giving invariance. In other words, given a stable context (a face), the variations are assigned to a class (the nose feature).
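The top-down learning step above can be sketched as a simple Hebbian update. The permanence values, increment size, and input indices are all assumptions for illustration:

```python
def feedback_learn(permanences, active_inputs, increment=0.1):
    """When feed-back from the region above activates this feature
    cell, strengthen its synapses to whatever feed-forward inputs
    are active right now, even if they have never driven the cell
    before. Permanences are assumed to live in [0, 1]."""
    for i in active_inputs:
        permanences[i] = min(1.0, permanences.get(i, 0.0) + increment)
    return permanences

# A novel pattern appears where a nose should be; the face cell's
# feed-back activates the nose cell, which then wires to it.
nose_synapses = {101: 0.6, 102: 0.55}  # an existing nose variant
novel_input = {201, 202}               # never-seen nose pattern
feedback_learn(nose_synapses, novel_input)
print(nose_synapses)  # 201 and 202 now have nonzero permanence
```

Repeated co-occurrence would push the new synapses toward full permanence, at which point the novel pattern alone can drive the nose cell feed-forward.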
Imagine you have only ever seen one door in your life: it has a handle, a window and a frame. When you approach it and see only the frame and window, you will expect to see a handle. However, this time the handle looks completely different. It has been mysteriously replaced with some other object. But given that the object is within the context of a door, you infer that it must be a handle. If the object were in any other context it could be something else entirely. This novel handle will become a new variant for the handle feature cell, contributing to its activity.
The "teacher" cell activates the feature cell, allowing it to form and reinforce synaptic connections to novel input features. Classic Hebbian learning: "neurons that fire together wire together".
I think this could be another reason why neurons have thousands of synapses: the dendritic segments learn hundreds of very different patterns so that the cell can represent the same thing across all of them. In other words, a cell could learn up to hundreds of different variations of the same object. All the very different variations of handles manage to activate the same handle cell. It is like a big OR operation.
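The "big OR" can be sketched with one set of synapses per dendritic segment. The segment patterns and match threshold below are illustrative assumptions:

```python
def segment_match(segment, active_inputs, threshold=3):
    # A segment matches when enough of its synapses see active input.
    return len(segment & active_inputs) >= threshold

def cell_active(segments, active_inputs):
    # The cell fires if ANY segment matches:
    # an OR over all the learned variations.
    return any(segment_match(s, active_inputs) for s in segments)

# Two very different "handle" variants stored on separate segments.
handle_segments = [
    {1, 2, 3, 4},      # e.g. a round knob pattern
    {50, 51, 52, 53},  # e.g. a lever pattern
]
print(cell_active(handle_segments, {1, 2, 3, 9}))    # True
print(cell_active(handle_segments, {50, 52, 53}))    # True
print(cell_active(handle_segments, {1, 50, 9, 99}))  # False
```

Because each segment only needs a partial match, the cell tolerates noise within a variation while still keeping the variations themselves separate.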
This could explain how different lighting, positioning, scaling, etc. of the same feature can activate the same feature cell.
This kind of reminds me of back-propagation in ANNs, except that the cortex itself is the teacher/supervisor, not another human. It is just a mapping: given a set of very different inputs you want a single output (i.e. hundreds of different images of cats all producing the activation of a single cat cell).
It is almost like learning similarities going up the hierarchy, and teaching differences going down.