After thinking about this question for a few days:
In general, questions about the difference between conventional artificial neural networks (ANNs) and the biological brain come up from time to time. These questions are especially appropriate for this forum because to some extent they also apply to “biologically constrained” (aka “biologically plausible”) models as being developed by the Numenta project.
First, define the ANNs to be considered. A classic artificial neuron consists of synaptic weights and input values; inference performs a dot product of the two and passes the result through a non-linear function (historically a squashing function). An ANN is then a layered hierarchy of such neurons in which signaling is dense.
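To make that definition concrete, here is a minimal sketch in Python/NumPy (the names and the logistic squashing function are illustrative choices of mine, not a reference implementation):

```python
import numpy as np

def artificial_neuron(weights, inputs, bias=0.0):
    """Classic artificial neuron: dot product of weights and inputs,
    passed through a non-linear squashing function (logistic sigmoid here)."""
    activation = np.dot(weights, inputs) + bias
    return 1.0 / (1.0 + np.exp(-activation))

def dense_layer(W, inputs, biases):
    """One ANN layer: every neuron sees every input (dense signaling)."""
    return 1.0 / (1.0 + np.exp(-(W @ inputs + biases)))

# A toy two-layer "hierarchy" of such neurons.
x = np.array([0.2, -1.3, 0.7])
W1, b1 = np.random.randn(4, 3), np.zeros(4)
W2, b2 = np.random.randn(2, 4), np.zeros(2)
output = dense_layer(W2, dense_layer(W1, x, b1), b2)
```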
Furthermore, for apples-to-apples comparisons, the ANN is written in software that executes on silicon-based hardware that is plugged into the wall, one way or another. The biological brain is “hardware” so the ANN’s hardware should also be part of the conversation.
Functionality
If one were to show an image of a black bear to an ANN dog breed classifier, it would likely reply “Newfoundland” – a big, black dog. The biological brain, when faced with the same image, would say “It’s a bear!”, and adrenaline would be surging even before the brain is consciously aware of the bear’s presence. The bear recognizer, the dog breed recognizer, the raccoon recognizer, etc., are all bundled together in the biological brain – they are not strictly compartmentalized – and this is an important functional characteristic of the biological brain.
Furthermore, the biological brain learns dog breeds as higher-level concepts. The image, the texture of the coat, perhaps the sound of the bark, the spoken name, and the written name are all bundled together somehow into a single concept. In contrast, classic ANNs (deep or otherwise) are incapable of doing this.
Finally, the biological brain learns in a way that is largely unsupervised, with some reinforcement learning, and it all works online in a dynamic, adaptable way. The ANN is just the opposite: it learns in a way that is heavily supervised and offline, and that learning is expensive and dynamically inflexible. See the (excellent) 2021 paper by Hole and Ahmad for further elaboration.
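As a toy illustration of that contrast (my own sketch, not taken from the paper): the first routine below needs labels and a separate training pass over a fixed dataset, while the second adjusts its weights continuously, one unlabeled input at a time, with a Hebbian-like rule.

```python
import numpy as np

# Offline, supervised: a separate training phase over a fixed labeled dataset.
def train_offline(W, X, Y, lr=0.01, epochs=100):
    for _ in range(epochs):
        for x, y in zip(X, Y):
            pred = W @ x
            W += lr * np.outer(y - pred, x)      # error-driven update needs a label y
    return W

# Online, unsupervised: one Hebbian-style update per observed input, no labels.
def observe_online(W, x, lr=0.001):
    y = W @ x
    W += lr * np.outer(y, x)                     # "fire together, wire together"
    W /= np.linalg.norm(W, axis=1, keepdims=True)  # crude normalization to keep weights bounded
    return W
```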
Infrastructure
Many of the neurons in the biological brain provide infrastructure that supports stable, reliable computation performed by other (typically excitatory) neurons. Synchronization via oscillations is performed by inhibitory neurons operating in bulk – see the 2007 Fries and Singer paper on “the gamma cycle” – and these alone account for a significant number of neurons. Other infrastructure neurons provide temporal stability.
Biological neural computation is notoriously noisy and unreliable – yet it works. The reason is that there is a lot of redundancy – what may be characterized as stochastic population coding. As an educated guess based on my background in fault-tolerant computing, when components are as unreliable as in the biological brain the redundancy could be a factor of 10 or more.
Taken together, there could be two orders of magnitude more infrastructure and redundant neurons than required for actual computation.
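A quick way to see why that kind of redundancy pays off: if each neuron reports a noisy version of the same quantity, averaging a population of N of them cuts the noise by roughly a factor of sqrt(N). A minimal simulation (my own illustration, not a model of any particular circuit):

```python
import numpy as np

rng = np.random.default_rng(0)
signal, noise_sd = 1.0, 0.5     # each neuron's output is the signal plus heavy noise

for pop_size in (1, 10, 100):
    # Each of 10,000 trials averages the responses of pop_size redundant neurons.
    responses = signal + noise_sd * rng.standard_normal((10_000, pop_size))
    print(f"{pop_size:4d} neurons -> std of population estimate ~ "
          f"{responses.mean(axis=1).std():.3f}")
# Roughly 0.5, 0.16, 0.05: the noise falls as 1/sqrt(N).
```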
The ANN counterpart as normally considered gets a free ride on both counts. In an ANN, the silicon hardware provides the infrastructure – it is not included in the ANN’s neuron count. Furthermore, silicon is extremely reliable, so there is no need for high levels of redundancy (aside from memory, there is essentially no redundancy at all).
Power Efficiency
This factor is often overlooked. The biological brain consumes about 20 watts of power, and the resulting heat is dissipated through the skull, the cardiovascular system, and so on. If you were to wrap your hand around a 20-watt light bulb, you would be impressed by what natural evolution was able to engineer. Power dissipation is a major limiting factor – especially when one considers the “engineering margins” that evolution has built in: heat stroke is very uncommon and occurs only under a lot of heat stress.
In the biological brain, dynamic power originates one way or another in action potentials (voltage spikes) that emanate from a neuron body, propagate down the axon along all of its branches, arrive at conductive synapses, potentially trigger dendritic spikes, and may result in further neuron spikes. A single spike ultimately consumes a lot of power, and one strategy for decreasing dynamic power is to increase the neuron count (although this may seem counterintuitive). Sparsity enables communication with fewer spikes, so for a fixed communication bandwidth, power is reduced as sparsity (and neuron count) is increased. It therefore appears that natural evolution may have pushed neuron counts far beyond what dense signaling (as done in ANNs) would require, in order to reach a good trade-off point for dynamic power consumption.
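A back-of-the-envelope illustration of that trade-off (a toy calculation of my own, not a biophysical model): to distinguish about a million patterns, a dense binary code needs roughly 20 units with about half of them spiking per pattern, whereas a sparse k-of-N code reaches the same capacity with only a handful of spikes – provided N is made much larger.

```python
from math import comb, log2

M = 1_000_000                       # distinct patterns the code must represent

# Dense binary code: ~log2(M) units, about half of them spiking per pattern.
dense_units = int(log2(M)) + 1      # 20 units
dense_spikes = dense_units / 2      # ~10 spikes per pattern

# Sparse k-of-N code: capacity is C(N, k); find the smallest k for each N.
for N in (100, 1_000, 10_000):
    k = 1
    while comb(N, k) < M:
        k += 1
    print(f"N={N:6d} neurons -> {k} spikes per pattern (dense: ~{dense_spikes:.0f})")
# More neurons buy fewer spikes per pattern, i.e. less dynamic power.
```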
Here again, the conventional ANN gets a free ride because its neuron count disregards power consumption in the hardware. The biological brain “pays” for high sparsity and low power consumption by using (a lot) more neurons.
Summary
If one could somehow separate out the biological neurons required for dog breed classification based on images, and nothing else, it would be a tiny portion of the brain’s neurons to begin with. Then infrastructure support, redundancy, and power efficiency collectively account for multiple orders of magnitude in neuron counts that are not “charged” to ANN neuron counts. Altogether, these things probably account for the perceived large differences in neuron counts.
A final, somewhat related observation – the neocortex contains 20 billion neurons. If we discount by two orders of magnitude to account for infrastructure, redundancy, etc., then 200 million purely computational silicon neurons would be sufficient. Today, silicon chips can be built with 20 billion transistors (or more). Comparing these numbers suggests that a future silicon neocortex is not far-fetched – especially considering that we would be tickled pink with 1% of a human neocortex.
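Spelling out the arithmetic (the transistors-per-neuron figure at the end is my own rough extrapolation, not a claim about any particular design):

```python
neocortex_neurons = 20_000_000_000            # ~20 billion neurons in the neocortex
discount = 100                                # two orders of magnitude: infrastructure, redundancy, power
computational_neurons = neocortex_neurons // discount   # 200 million "purely computational" neurons

transistors_per_chip = 20_000_000_000         # roughly what a modern chip offers
transistors_per_neuron = transistors_per_chip / computational_neurons   # ~100 transistors per silicon neuron
print(computational_neurons, transistors_per_neuron)
```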