Random question about point neurons vs biological ones


I was watching a YouTube video and they went over the history of neural networks. Anyway, one of the early networks (I believed it had 100s of neurons in the network, maybe thousands) was able to identify photos of different breeds of dogs more accurately than a human could. So my question is why is that? Why would a few 100/1000 point neurons outperform a human brain that has millions of pyramidal neurons?

Here are some guesses.

  • People don’t know dog breeds very well. I can identify the breed of like 1 in 20 dogs I see. I could get that number much higher if I studied for a week.
  • Dog breeds are a very limited thing. I think if the network needed to identify the object (maybe a cat or coffee cup), humans would’ve done much better back then.
  • Dog breeds have distinct colors and color / texture patterns. I think neural networks are good at that low level detail kind of thing, especially for a very limited slice of the real world.
  • We don’t directly sense pixels. It depends on the eye’s relative position and rotation (6 coordinates), plus other stuff like lighting, pupil dilation, lens focus.
  • The brain works differently. Point neurons can just recognize patterns, but brain neurons do more specialized functions to handle stuff like self-movement from motor output.

We do, on the level of rods and cones. They just don’t stay “unmixed” after that. That low-level detail simply doesn’t get to be acted on when we test humans.

Before responding further it would be good to have more precise numbers.
“believed” “100s… maybe 1000s” ? More precisely how many?
“different” breeds of dogs? how many different breeds?
“few 100/1000” point neurons – a few 100… or a few 1000? or…?

I meant when you look at a pixel on a screen, it’s warped etc. whereas whatever a neural network gets is much simpler, so we need more neurons to identify basic details.

There are I dunno 50 pixels in that rectangle, so 100 numbers can describe it. How many numbers would you need to describe the photo of that rectangle? It’s a lot of information to get rid of to recognize that it’s a rectangle.

I am more interested in the lines that are not there in this drawing - which neurons hold the illusionary lines?
Is it just possible that the brain is more complicated than “synapse per pixel” thinking?


I am willing to wager that if you trained a human for thousands or even millions of presentations, like you do with a neural net, that human would be pretty good at recognizing dog breeds.


So firstly, because it was designed to outperform humans. How? It was likely trained only on a specific dataset, dogs and dog breeds, so that it would do better at that one task.

Here’s a thought experiment: design yourself a system to memorize dog breeds so that you can outperform that ANN. I’m pretty sure you will succeed, assuming you are up for the task.


I don’t know why forum members are confusing this as a philosophical question.

@JamesLikesBrains tl;dr the reason is memory. DL models can inherently be more complex and store much, much more knowledge about the world than the average human. (What they currently lack is reasoning ability, which comes with scale.)

In a task as diverse as dog breeds, very few experts would be able to identify and remember all the markings of the different breeds, whereas even traditional CNNs won’t have any problem doing this for thousands of such breeds. This is why there’s a disparity in performance.

Check out [2205.01917] CoCa: Contrastive Captioners are Image-Text Foundation Models. CoCa is Google’s scaled-up model (though in terms of parameters it is pretty minuscule compared to its language counterparts) which can perform these types of tasks 0-shot, i.e. without a single training example. I won’t bore you with the technical intricacies of how it’s different from a human brain, but a good takeaway is that CoCa can easily outperform humans on such tasks due to its vast memory.


That’s true, but it had thousands of neurons, and we have 100 trillion synapses.

Why does that come with scale? Might be too off topic.

Most neural models capture two aspects of the brain - a few details of biological neurons and the overall hierarchical connections.

Perceptrons and all the derivative models capture the flexible nature of connectionist data modeling but most fudge on how those connections get trained using very non-biological hacks.

The hierarchical connections emulate the distribution of data between the layers, which leads to micro-parsing: decomposing and distributing features at the most appropriate level of representation.

This general approach can be enhanced by “doing more of the same”: adding connections, layers, and huge data sets for very intense learning in some relatively limited domain.

These two features abstracted from the brain tend to be very powerful but only capture a very small part of the overall brain topology and functions.

Even a lowly lizard or amphibian does things that even the most ambitious neural network researcher is utterly incapable of emulating at this time. These larger systems found in real brains may show us more tricks to add to our models to make them more capable. This is why I follow “biologically inspired” companies like Numenta; this is where I think the good stuff is going to come from.

Link please.

Because you aren’t using all of your synapses for detecting dog breeds, only a small relevant subset relating to vision, spatial features, etc. The model probably has millions of parameters, but that’s still very few compared with the number of synapses.

Scaling is a huge topic on its own which can’t be explained in a forum. A simplified tl;dr is as scale/size of the model increases, it “unlocks” more intellectual capabilities and displays more human-like behavior.

That is to say, scale complements models - so there’s no reason why a more parameter efficient model cannot arise tomorrow. For instance, look at GATO - A Generalist Agent.

The same network with the same weights (no modification) can play Atari, caption images, chat, stack blocks with a real robot arm and much more

This is an example of a multi-modal model (i.e. it can understand and connect text and vision modalities) that unlocks abilities like using a real robot arm (in the real world) while still maintaining other abilities, like being a chatbot and beating Atari. It also performs SOTA on meta-learning tasks. And that’s with only a couple billion parameters, which is 1000x smaller than the biggest models ever trained.

This is a pretty oversimplified form - LMK if you wanna know more and I can link you up with some papers. The rabbit hole of scaling is truly interesting because of how little we know of it :wink:


Because it can be answered at that level? Don’t you think?

While I agree with this, the statement becomes prone to error if we retrofit it with the OP’s context: that old (early) neural net he saw in “some” video. We all know we cannot easily generalize the performance of models using the same architecture, because training effectiveness is a variable.

Hardly. ANNs have no memory and store very little data, just a modest number of weights derived by training on large quantities of labelled data. An ANN trained on dogs is hopeless on cats, or on breeds it’s not seen, or on pictures taken from the wrong angle or partly obscured, or other things like a dog with splashes of paint.

And when it gets one wrong it can’t explain the mistake, let alone learn from it.

The answer is that the training process discovers a small number of distinguishing factors. After all, if you could find the right factors it only takes 10 yes/no questions to distinguish 1000 breeds of dog.
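The "10 yes/no questions" figure above can be checked with a little arithmetic. This is a minimal sketch that assumes an idealized case where every question splits the remaining breeds exactly in half, which learned features will generally not do; the function name `questions_needed` is made up for illustration.

```python
def questions_needed(num_classes: int) -> int:
    """Minimum number of perfectly informative yes/no questions needed
    to single out one of `num_classes` alternatives.

    Equivalent to ceil(log2(num_classes)), computed with integers to
    avoid floating-point rounding.
    """
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return (num_classes - 1).bit_length()


if __name__ == "__main__":
    for n in (2, 100, 1000, 1024, 1025):
        print(f"{n} breeds -> {questions_needed(n)} questions")
```

Ten binary answers give 2**10 = 1024 distinct combinations, which is why 1000 breeds fit. The hard part in practice is finding questions that are anywhere near that informative.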


Probably worth putting that as a top-level conversation on its own.
It is quite an interesting set up and even quotes J. Hawkins and S. Blakeslee. “On intelligence” as a lead.

Not when it can be answered scientifically.

That’s incorrect; a large enough ANN can understand most objects irrespective of occlusion, angle, etc. ([2205.01917] CoCa: Contrastive Captioners are Image-Text Foundation Models) without even requiring any examples, unlike vanilla CNNs. By and large, those factors do matter, but unless you’ve destroyed the image with transformations to the point that a human cannot identify it, models usually do an excellent job.

It is pretty interesting indeed :slight_smile: However, I doubt this is the best forum to discuss it, as it’s for a more neuroscientifically aligned audience, so it wouldn’t be hugely relevant for people for whom it’s outside their domain or who in general aren’t interested in DL efforts towards AGI.


After thinking about this question for a few days:

In general, questions about the difference between conventional artificial neural networks (ANNs) and the biological brain come up from time to time. These questions are especially appropriate for this forum because to some extent they also apply to “biologically constrained” (aka “biologically plausible”) models such as those being developed by the Numenta project.

First define the ANNs to be considered. A classic artificial neuron consists of synaptic weights and input values; inference performs a dot product of the two and passes the result through a non-linear function (historically a squashing function). An ANN is then a hierarchy of such artificial neurons where signaling is dense.
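For readers who have not seen one written out, here is a minimal sketch of the classic point neuron just described, using only the standard library. The logistic sigmoid stands in for the "squashing function", and the names `point_neuron` and `dense_layer` are illustrative, not taken from any particular library.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic squashing function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def point_neuron(weights: list[float], inputs: list[float]) -> float:
    """Dot product of synaptic weights and inputs, then a non-linearity."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    activation = sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(activation)


def dense_layer(weight_rows: list[list[float]], inputs: list[float]) -> list[float]:
    """Dense signaling: every neuron in the layer sees every input value."""
    return [point_neuron(row, inputs) for row in weight_rows]


if __name__ == "__main__":
    pixels = [0.2, 0.9, 0.4]
    hidden = dense_layer([[1.0, -1.0, 0.5], [0.3, 0.3, 0.3]], pixels)
    output = dense_layer([[2.0, -2.0]], hidden)  # a second level of the hierarchy
    print(hidden, output)
```

Stacking `dense_layer` calls is what makes the hierarchy. Every unit emits a real-valued output on every inference, which is the dense signaling contrasted with sparse spiking later in the post.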

Furthermore, for apples-to-apples comparisons, the ANN is written in software that executes on silicon-based hardware that is plugged into the wall, one way or another. The biological brain is “hardware” so the ANN’s hardware should also be part of the conversation.


If one were to show an image of a black bear to an ANN dog breed classifier, it would likely reply: “Newfoundland” - a big, black dog. The biological brain when faced with the same image would say: “It’s a bear!” and adrenaline would be surging even before the brain is consciously aware of the bear’s presence. The bear recognizer, the dog breed recognizer, the raccoon recognizer, etc., are all bundled together in the biological brain – they are not strictly compartmentalized – and this is an important functional characteristic of the biological brain.

Furthermore, the biological brain learns dog breeds as higher level concepts. The image, the texture of the coat, perhaps the sound of the bark, the spoken name, the written name are all bundled together somehow as a single concept. In contrast, classic ANNs (deep or otherwise) are incapable of doing this.

Finally, the biological brain learns in a way that is largely unsupervised, with some reinforcement learning, and it all works online in a dynamic, adaptable way. The ANN is just the opposite: it learns in a way that is heavily supervised and offline. Learning is expensive and dynamically inflexible. See the (excellent) 2021 paper by Hole and Ahmad for further elaboration.


Many of the neurons in the biological brain provide infrastructure that supports stable, reliable computation performed by other (typically excitatory) neurons. Synchronization via oscillations is performed by inhibitory neurons operating in bulk – see the 2007 Fries and Singer paper on “the gamma cycle”. This is a significant number of neurons. Other infrastructure neurons provide temporal stability.

Biological neural computation is notoriously noisy and unreliable – yet it works. The reason is that there is a lot of redundancy – what may be characterized as stochastic population coding. As an educated guess based on my background in fault-tolerant computing, when components are as unreliable as in the biological brain the redundancy could be a factor of 10 or more.

Taken together, there could be two orders of magnitude more infrastructure and redundant neurons than required for actual computation.

The ANN counterpart as normally considered gets a free ride on both counts. In an ANN, the silicon hardware provides the infrastructure – it is not included in the ANN’s neuron count. Furthermore, silicon is extremely reliable, so there is no need for high levels of redundancy (except for memory, no redundancy at all).

Power Efficiency

This factor is often overlooked. The biological brain consumes about 20 watts of power, and the resulting heat is dissipated through the skull, the cardiovascular system (or whatever else). If you were to wrap your hand around a 20 watt lightbulb, you would be impressed by what natural evolution was able to engineer. Power dissipation is a major limiting factor – especially when one considers the “engineering margins” that evolution has built in – heat stroke is very uncommon and only occurs when there is a lot of heat stress.

In the biological brain, dynamic power originates one way or another in action potentials (voltage spikes) that emanate from a neuron body, are propagated down an axon along all of its branches, arrive at conductive synapses, potentially trigger dendritic spikes, which may result in further neuron spikes. A single spike ultimately results in a lot of power consumption, and a strategy for decreasing dynamic power involves increasing the neuron count (although this may seem counterintuitive). Sparsity enables communication with fewer spikes, so for a fixed communication bandwidth, power is reduced as sparsity (and neuron count) is increased. Therefore, it appears that natural evolution may have increased neuron counts far beyond what would be needed with dense signaling (as done in ANNs) in order to achieve a good trade-off point regarding dynamic power consumption.
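The trade-off between neuron count and spike count can be illustrated with a toy combinatorial model. This sketch rests on simplifying assumptions that are not in the post: a message is encoded purely by which k of n neurons fire, each active neuron costs exactly one spike, and timing, noise and the static cost of keeping extra neurons alive are ignored. The function name `min_active_neurons` is hypothetical.

```python
import math


def min_active_neurons(population: int, bits: int) -> int:
    """Smallest number k of simultaneously active neurons such that the
    choice of which k out of `population` fire can distinguish at least
    2**bits different messages."""
    needed_patterns = 2 ** bits
    for k in range(population + 1):
        if math.comb(population, k) >= needed_patterns:
            return k
    raise ValueError("population too small to carry that many bits")


if __name__ == "__main__":
    bits = 20  # fixed amount of information per message
    for n in (24, 100, 1000, 2 ** 20):
        k = min_active_neurons(n, bits)
        print(f"{n:>8} neurons: {k} spikes per message, {k / n:.6%} active")
```

Under these assumptions, holding the information per message fixed and growing the population lets the number of spikes per message shrink, which is the direction of the argument above. Real dynamic power depends on much more than a spike count.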

Here again, the conventional ANN gets a free ride because its neuron count disregards power consumption in the hardware. The biological brain “pays” for high sparsity and low power consumption by using (a lot) more neurons.


If one could somehow separate out the biological neurons required for dog breed classification based on images, and nothing else, it would be a tiny portion of the brain’s neurons to begin with. Then infrastructure support, redundancy, and power efficiency collectively account for multiple orders of magnitude in neuron counts that are not “charged” to ANN neuron counts. All together, these things probably account for the perceived large differences in neuron counts.

A final somewhat related observation – the neocortex contains 20 billion neurons. If we discount by two orders of magnitude to account for infrastructure, redundancy, etc., then 200 million purely computational silicon neurons would be sufficient. Today silicon can be constructed with 20 billion transistors per chip (or more). Comparing these numbers suggests that a future silicon neocortex is not far-fetched – especially considering that we would be tickled pink with 1% of a human neocortex.
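The back-of-envelope numbers above can be laid out explicitly. All inputs are the rough figures quoted in this post (20 billion neocortical neurons, a discount of two orders of magnitude, 20 billion transistors per chip). The transistors-per-neuron ratio is simple division added for illustration, not a claim about how many transistors a silicon neuron actually needs, and the function names are made up.

```python
NEOCORTEX_NEURONS = 20_000_000_000      # figure quoted in the post
OVERHEAD_FACTOR = 100                   # infrastructure, redundancy, etc.
TRANSISTORS_PER_CHIP = 20_000_000_000   # figure quoted in the post


def computational_neurons(total_neurons: int, overhead_factor: int) -> int:
    """Neurons left after discounting infrastructure and redundancy."""
    return total_neurons // overhead_factor


def transistor_budget_per_neuron(transistors: int, neurons: int) -> int:
    """Transistors available per neuron if everything sat on one chip."""
    return transistors // neurons


if __name__ == "__main__":
    full = computational_neurons(NEOCORTEX_NEURONS, OVERHEAD_FACTOR)
    one_percent = full // 100
    print("silicon neurons for a full neocortex:", full)
    print("silicon neurons for 1% of a neocortex:", one_percent)
    print("transistors per neuron (full):",
          transistor_budget_per_neuron(TRANSISTORS_PER_CHIP, full))
    print("transistors per neuron (1%):",
          transistor_budget_per_neuron(TRANSISTORS_PER_CHIP, one_percent))
```

The estimate is only as good as the two-orders-of-magnitude discount, which the post itself presents as an educated guess.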


This is an attempt to refute a tiny part of my claim: that ANNs with or without ancillary aids will fail to correctly identify a vast range of images that biological brains handle with ease, cannot explain their mistakes and do not learn. The idea that we let these things monitor us and even drive our cars without ongoing close human supervision fills me with horror.