It’s still very much a work in progress, and I use the term simplified for a reason. I’m using a very basic implementation of the spatial pooler and temporal memory algorithms from the BAMI whitepapers to construct the proximal and basal distal connection layers respectively in my L4 implementation. I’m currently working on the L2 class and I’ve run into a bit of an issue. I’m having trouble finding a reference for the models used in the experiment, something comparable to the in-depth explanations that the BAMI whitepapers or HTMSchool give for the components and structure of the L4 algorithms. I’ve been digging through the htmresearch experiment repo to see if I can pick apart exactly how Numenta structured their model so that I can copy it, and I found this: https://github.com/numenta/htmresearch/blob/ea7f86eb6c575e5a749ce4411d1cd10b18da19a1/htmresearch/algorithms/column_pooler.py#L28
This appears to be an early implementation of some of the concepts that were used in the L2 implementation, including things like inertia (for keeping neurons active over time to create a stable representation), conditional learning (only learning when the number of active neurons is below a certain threshold) and more.
I’m hoping that I’m just being blind and there is a resource similar to the BAMI whitepapers right in front of my face and that someone here will just post a link and that will be that. Barring that, if anyone has any further materials for reading on this topic, I would appreciate the effort. Until then, I’ll just keep picking at their repo to see if I can figure out how to construct something similar.
The “Materials and Method” section of that paper does technically contain a complete description of the L2 model. [Edit: I wouldn’t say it’s a good description. It’s very terse and the equations are tough to read.]
“To determine activity in the output layer we calculate the feedforward and lateral input to each cell. Cells with enough feedforward overlap with the input layer, and the most lateral support from the previous time step become active.”
“When learning a new object a sparse set of cells in the output layer is selected to represent the new object. These cells remain active while the system senses the object at different locations. Thus, each output cell pools over multiple feature/location representations in the input layer.”
I don’t think I would call this a “complete description”. For instance, they talk about cells with “enough feedforward overlap”. How much is enough? Is the model sensitive to this parameter? Is it static or dynamic? Then it talks about cells with “the most lateral support from the previous time step”. Again, how is this threshold calculated, and what does a reasonable threshold look like?
The second paragraph is even more vague. They talk about “a sparse set of cells” being selected, but they don’t actually talk about how they’re selecting those cells or how they determine how to keep them active, etc. The column pooler implementation which appears to be used in the repo for the experiment has a lot of different operations that appear to be governing this process, and from what I can tell it’s much more complex than the “materials and methods” section makes it out to be.
The full text of the article has more details and answers some of your questions.
They apply a simple constant threshold to the feedforward overlap, but they do not state what value they used.
The lateral support uses a competition to activate, so they sort the cells by how many active distal segments each cell has and activate the cells with the most distal segments active.
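For what it’s worth, here is a rough sketch of those two rules as I read them, not the code from the paper or the repo. The function name, the `ff_threshold` and `n_active` parameters, and the tie-breaking rule are all my own assumptions; the article does not give values for any of them.

```python
import numpy as np

def select_active_cells(ff_overlap, lateral_support, ff_threshold, n_active):
    """Hypothetical sketch: constant feedforward threshold, then lateral competition.

    ff_overlap[i]      -- feedforward overlap of output cell i with the input layer
    lateral_support[i] -- number of active distal segments on cell i (previous step)
    ff_threshold       -- assumed constant threshold (value not stated in the article)
    n_active           -- assumed target number of active output cells
    """
    ff_overlap = np.asarray(ff_overlap)
    lateral_support = np.asarray(lateral_support)

    # Step 1: keep only cells with "enough" feedforward overlap.
    candidates = np.flatnonzero(ff_overlap >= ff_threshold)
    if candidates.size <= n_active:
        return np.sort(candidates)

    # Step 2: among those, the cells with the most active distal segments win.
    # A stable sort means ties go to the lower cell index (an arbitrary choice).
    order = np.argsort(-lateral_support[candidates], kind="stable")
    return np.sort(candidates[order[:n_active]])

# Example call with made-up numbers.
winners = select_active_cells(
    ff_overlap=[5, 1, 4, 6, 3],
    lateral_support=[0, 9, 2, 1, 2],
    ff_threshold=3,
    n_active=2,
)
```

Note that cell 1 has the most lateral support in the example but is excluded because its feedforward overlap is below the threshold.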
During training they activate a random set of cells, and they force them to be active for the duration of each training session. The L2 model has computer code that prevents the active cells from changing while it’s training. This aspect of the model is obviously not biologically constrained.
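A minimal sketch of that training procedure, under my own assumptions: `learn_object`, its parameters, and the use of plain Python sets in place of synapses are all hypothetical, and a real implementation would probably subsample connections rather than store every active input cell.

```python
import numpy as np

def learn_object(sensations, n_output_cells, n_active, rng):
    """Hypothetical sketch: pick a random sparse set of output cells for a new
    object and hold it fixed while pooling over every sensation.

    sensations -- iterable of collections of active input-layer cell indices,
                  one per feature/location sensed on the object
    """
    # Random sparse representation for the new object, chosen once.
    object_cells = rng.choice(n_output_cells, size=n_active, replace=False)

    # The same cells are forced to stay active for the whole training session,
    # so each one accumulates connections to every input pattern it sees.
    pooled = {int(cell): set() for cell in object_cells}
    for active_inputs in sensations:
        for cell in pooled:
            pooled[cell].update(int(i) for i in active_inputs)

    return np.sort(object_cells), pooled

rng = np.random.default_rng(0)
cells, pooled = learn_object([[1, 2], [2, 7]], n_output_cells=100, n_active=5, rng=rng)
```

The point of the sketch is only that the active set is decided up front and never changes during learning, which is the part that is enforced by code rather than by the model’s dynamics.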
Continuing to post exactly what’s in the paper that I have read multiple times is unhelpful. The parts that you’re posting are parts that appear to be in direct contradiction to the code supplied from the experiment, which is the entire reason I made this post in the first place. I’m looking for additional references that I can use to reconcile the discrepancies between what I’m seeing in the paper and what I’m seeing in the supplied code, so that I can work to recreate their findings.
For what it’s worth, there are other ways to do that task besides how they did it in the article.
You can hear about my experiments on this topic here:
Ok, in that case I would recommend sticking to what’s written in the published article.
The htmresearch repo is convoluted and full of failed and/or unpublished experiments whose code was never deleted.