This sparsity is related to the actions of the inhibitory interneurons. The strongest contender in each “hub” neighborhood activates the interneurons. A winner will only survive the dominance battle with the interneurons if its mini-column has help from neighboring mini-columns through mutual axonal connections. Losing competitors are strongly suppressed.
From what I can determine, it takes both a strong local signal AND reciprocal axonal connections to maintain activation against the inhibitory pressures. Many mini-columns may try to form a hex-grid, but only the ones that are part of a larger learned pattern will win.
As a byproduct, hex-grids enforce a natural sparsity mechanism, with somewhere around 2% to 3% activation.
I suggest that this mechanism should replace the current spatial pooling used in HTM.
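As a rough illustration of the two conditions described above (a strong local signal plus reciprocal support), here is a toy two-stage sketch. It is not HTM code and not a model of real interneurons: the function names, the radii, the tolerance, and the `min_partners` threshold are all assumptions I picked for the example, and such a small example does not reproduce the 2% to 3% activation figure.

```python
import math

def local_winners(drive, inhibit_radius):
    """Stage 1: a column is a candidate if no column within
    inhibit_radius has a higher drive (ties are allowed to co-win)."""
    winners = []
    for p, d in drive.items():
        if all(d >= d2 for q, d2 in drive.items()
               if q != p and math.dist(p, q) <= inhibit_radius):
            winners.append(p)
    return winners

def supported_winners(candidates, spacing, tol, min_partners):
    """Stage 2: keep only candidates that have enough partner
    candidates at roughly the lateral-projection distance."""
    kept = []
    for p in candidates:
        partners = sum(1 for q in candidates
                       if q != p and abs(math.dist(p, q) - spacing) <= tol)
        if partners >= min_partners:
            kept.append(p)
    return kept

# Hypothetical layout: one hub with six partners at the assumed spacing.
spacing = 4.0
hexagon = [(0.0, 0.0)] + [
    (spacing * math.cos(math.radians(60 * k)),
     spacing * math.sin(math.radians(60 * k)))
    for k in range(6)
]
drive = {p: 1.0 for p in hexagon}
drive[(20.0, 20.0)] = 1.0   # strong local signal, but no partners
drive[(1.0, 0.5)] = 0.3     # weak column next to the hub
drive[(19.0, 20.0)] = 0.2   # weak column next to the isolated one

candidates = local_winners(drive, inhibit_radius=2.0)
active = supported_winners(candidates, spacing, tol=0.25, min_partners=2)
```

In this layout the isolated column at (20, 20) wins its local competition but is dropped in the second stage because it has no partners at the assumed spacing, while the seven columns of the hexagon stay active.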
As for your comment “Surely in a random distribution many neurons form shapes that are either non-equilateral triangles or cause noise in a neuron that forms a vertice of an equilateral triangle.” Given the stereotyped, roughly circular target bands of lateral axonal projections, the natural connections will be triangles. I agree with your randomness comment and, thinking laterally, this is the basis for the fact that hex-grids can form at any orientation, at any relative phase, and over a range of spatial scales.
The low-level processing of sensory inputs drives the variation in low-level hex-grid coding and is the basis for the larger Moser grids having the demonstrated variations in angle, phase, and spacing.
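To make the three degrees of freedom concrete (orientation, phase, spacing), here is a small geometric sketch. It only generates idealized lattice points; the function names and parameter values are mine and purely illustrative, not measurements.

```python
import math

def hex_grid(spacing, orientation_deg, phase, n=2):
    """Return the points i*a1 + j*a2 + phase for -n <= i, j <= n,
    where a1 and a2 are two basis vectors of length `spacing`
    that are 60 degrees apart and rotated by `orientation_deg`."""
    theta = math.radians(orientation_deg)
    a1 = (spacing * math.cos(theta), spacing * math.sin(theta))
    a2 = (spacing * math.cos(theta + math.pi / 3),
          spacing * math.sin(theta + math.pi / 3))
    return [
        (phase[0] + i * a1[0] + j * a2[0],
         phase[1] + i * a1[1] + j * a2[1])
        for i in range(-n, n + 1)
        for j in range(-n, n + 1)
    ]

def nearest_neighbor_distance(points, p):
    """Distance from p to the closest other point in the list."""
    return min(math.dist(p, q) for q in points if q != p)

# Two grids that differ in all three parameters (arbitrary values).
grid_a = hex_grid(spacing=1.0, orientation_deg=0.0, phase=(0.0, 0.0))
grid_b = hex_grid(spacing=2.5, orientation_deg=17.0, phase=(0.4, -1.1))
```

Whatever orientation and phase are chosen, every point's nearest neighbor sits at exactly the chosen spacing, so the triangles stay equilateral; only the scale, rotation, and offset of the pattern change.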
You also mentioned layers: I see this enforcement of sparsity as being mostly mediated by layer 2/3 in the feedforward direction, although it is possible that layer 5 is doing this in the feedback direction. Layers 5 and 6 are certainly triggering inhibitory interneurons.
I see this local competition to form a hex-grid running at gamma rate, with the winners asserting intermap connections at the alpha rate.
Coming back to your question:
“Also, this assumes that the neurons within a layer are all on a plane. But even a layer within the neocortex “sheet” has a thickness many times greater than a typical neuron size (if neurons have typical sizes to begin with). Doesn’t that invalidate the idea of horizontal axonal projections?”
Look at the picture below. See how the axons projecting from layer 2/3 (blue) fan up in an inverted cone or funnel shape. The area where these axonal projections make connections with the surrounding dendritic arbors (red) is roughly circular, with the angles of rising and diffusion setting the width of the interaction area. Not shown in this diagram is the sea of inhibitory interneurons that these projections pass through. As the mini-column signals recognition activity, the projecting axons will trigger these inhibitory interneurons, tending to silence nearby mini-columns. Of course, some of the nearby mini-columns may be doing the same thing, engaged in a battle over which one gets to recognize this input sensation.
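As a back-of-the-envelope version of the geometry, if the fan of axons is simplified to an ideal cone (my simplification), the radius of the circular interaction area follows from how far the axons rise and how widely they spread. The function name and the unitless values below are placeholders, not anatomical measurements.

```python
import math

def interaction_radius(rise, half_angle_deg):
    """Radius of the circular footprint of an ideal cone that rises
    by `rise` and opens with the given half-angle (in degrees)."""
    return rise * math.tan(math.radians(half_angle_deg))

# Placeholder values only: same rise, two different spreads.
narrow = interaction_radius(rise=1.0, half_angle_deg=30.0)
wide = interaction_radius(rise=1.0, half_angle_deg=60.0)
```

With the same rise, the wider spread gives the larger footprint, which is the sense in which the angles set the width of the interaction area.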