There's a possibility that genetic algorithms have more potential when they are used to alter (aka evolve) the input representation or "embedding" rather than the network itself.
All learning algorithms work reasonably well on most inputs. Sure, some outperform others in various respects, e.g. some achieve higher accuracy, others better sample efficiency or compute efficiency, others forget less, some overfit, some are better suited for noisy data, etc. But be it ANN, k-NN, linear models, trees, forests, XGBoost, whatever: there's no clear winner.
The takeaway here is that IF there are learnable correlations or patterns within some input data, all learning algorithms manage to figure out, to a greater or lesser degree, that these patterns exist.
What we (animals) seem to have is a power to simplify: to clarify the few essential features that expose a property or characteristic.
I don't know of any algorithm able to do that: to figure out not only that there's a cat in the room, but also to pinpoint the few key elements in the sensory input which make it sure that fact is true.
Some rules to combine/extract smaller parts from sensory channels, rules which can be evolved, should not be hard to implement. And with simple enough input, some learning algorithms learn very fast: tens or hundreds of milliseconds per core.
Here's a simplified schematic:
Raw input (multiple sensory channels) → Simplifier/combiner layer → Generic Learner → Results
The evolving algorithm targets the simplifier/combiner layer, which generates simple(r) representations of the raw input, NOT the learner. We know it has discovered an improved "perspective" on the input when the results of the generic learner improve.
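To make the schematic concrete, here is a minimal sketch of the pipeline in numpy. Everything here is illustrative, not from the original: the "sensory" data is synthetic, the combiner collapses groups of raw dimensions into one scalar each, and the "generic learner" is a trivial nearest-centroid classifier standing in for whatever fast learner you prefer; the evolving algorithm would only ever touch `groups`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "sensory" input: two classes that differ only in a
# handful of informative raw dimensions (the first 5)
X = rng.normal(size=(200, 100))
y = (X[:, :5].sum(axis=1) > 0).astype(int)

def combiner(X, groups):
    # Simplifier/combiner layer: each group of raw dimensions is
    # collapsed into one scalar (here, a plain sum). This is the
    # only part the evolving algorithm would modify.
    return np.stack([X[:, g].sum(axis=1) for g in groups], axis=1)

def learner_accuracy(F, y):
    # Generic learner: nearest-centroid classifier. The GA never
    # looks inside it, it only sees the resulting accuracy.
    c0, c1 = F[y == 0].mean(axis=0), F[y == 1].mean(axis=0)
    pred = np.linalg.norm(F - c1, axis=1) < np.linalg.norm(F - c0, axis=1)
    return (pred.astype(int) == y).mean()

# One candidate combiner: 5 random groups of 10 raw dimensions
groups = [rng.choice(100, 10, replace=False) for _ in range(5)]
print(learner_accuracy(combiner(X, groups), y))
```

A candidate combiner whose groups happen to cover the informative dimensions will score higher, which is exactly the signal the evolving algorithm needs.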
Here's an MNIST example task: find a topology of 20 patches of 10 pixels each such that, when the input image is represented as 20 scalars (each patch sums up the values of the pixels it contains), we get the best accuracy.
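The 784-pixels-to-20-scalars representation described above can be sketched in a few lines of numpy. The function name `encode` and the random patches are illustrative; a patch is simply a set of 10 pixel indices, with no requirement that they be spatially contiguous.

```python
import numpy as np

def encode(image, patches):
    """Represent a flattened image as one scalar per patch.

    image:   1-D array of 784 pixel values (28x28 MNIST digit)
    patches: int array of shape (20, 10); each row holds the 10
             pixel indices belonging to one patch
    """
    # Fancy indexing gives shape (20, 10); summing each row
    # collapses every patch to a single scalar
    return image[patches].sum(axis=1)

rng = np.random.default_rng(0)
patches = rng.choice(784, size=(20, 10), replace=False)  # one random patch set
image = rng.random(784)  # stand-in for a real MNIST digit
features = encode(image, patches)
print(features.shape)  # (20,)
```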
So the genetic algorithm starts with a population of 100 sets of 20 random patches (10 pixels per patch).
The testing algorithm picks 1000 digits from the training dataset.
And re-trains the same small initial network 100 times, each time with a different set of 20 patches.
Then it tests each trained network/patch-set combination.
It doesn't need to reach top accuracy, only to figure out which of the 100 patch sets allowed its "learner" to outperform the others, and to combine the winning sets (e.g. the top 20 out of 100) into the following generation of the genetic algorithm.
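The steps above can be sketched end to end. This is a toy sketch under stated assumptions, not the definitive experiment: the data is synthetic (swap in 1000 real MNIST digits for the task described), the "small network" is replaced by a nearest-centroid learner for speed, and the selection/crossover/mutation scheme is one simple choice among many.

```python
import numpy as np

rng = np.random.default_rng(0)

N_PIXELS, N_PATCHES, PATCH_SIZE = 784, 20, 10
POP, KEEP, GENERATIONS = 100, 20, 5

# Stand-in for 1000 training digits with binary labels
# (illustrative only; use real MNIST for the actual experiment)
X = rng.random((1000, N_PIXELS))
y = (X[:, :50].mean(axis=1) > 0.5).astype(int)

def encode(X, patches):
    # 20 scalars per image: each patch sums its 10 pixels
    return X[:, patches].sum(axis=2)

def fitness(patches):
    # "Generic learner", retrained from scratch for every patch set;
    # the GA only ever sees the resulting accuracy
    F = encode(X, patches)
    c0, c1 = F[y == 0].mean(axis=0), F[y == 1].mean(axis=0)
    pred = np.linalg.norm(F - c1, axis=1) < np.linalg.norm(F - c0, axis=1)
    return (pred.astype(int) == y).mean()

def random_patches():
    return rng.choice(N_PIXELS, size=(N_PATCHES, PATCH_SIZE), replace=False)

def crossover(a, b):
    # Child inherits each whole patch from either parent,
    # then one patch is replaced (mutation)
    child = np.where(rng.random((N_PATCHES, 1)) < 0.5, a, b)
    child[rng.integers(N_PATCHES)] = rng.choice(N_PIXELS, PATCH_SIZE, replace=False)
    return child

pop = [random_patches() for _ in range(POP)]
for gen in range(GENERATIONS):
    scores = np.array([fitness(p) for p in pop])
    elite = [pop[i] for i in np.argsort(scores)[-KEEP:]]  # winning 20 of 100
    pop = elite + [crossover(elite[rng.integers(KEEP)],
                             elite[rng.integers(KEEP)])
                   for _ in range(POP - KEEP)]
    print(gen, scores.max())
```

Note the key property of the setup: the learner is deliberately cheap and fixed, so each generation's 100 retrainings stay fast, and all evolutionary pressure falls on the patch topology alone.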