Releasing BrainBlocks 0.6: Building ML Applications with HTM-Like Algorithms

@jacobeverist I'm also interested in the differences between the HGT and standard GC!
Could you please explain them to me?

The HGT is in Python because it is much less mature than the other components. There were a lot of NumPy operations that were best left in Python until we understood the algorithm well enough to optimize it for C/C++.

Basically, I generalized the concept of a “grid”: find a set of subspace basis vectors, map a point in N-space onto the subspace, apply a modulo along each subspace axis (the grid period), and then bin the point into k intervals (the bin frequency).
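
As a rough illustration, here is a minimal NumPy sketch of that pipeline (the function name and signature are my own for illustration, not the actual HGT API):

```python
import numpy as np

def hypergrid_encode(x, basis, period, bins):
    """Sketch of one hypergrid: project, wrap, bin, one-hot."""
    # Project the N-dim point onto the subspace basis vectors
    proj = basis @ x                          # shape: (subspace_dim,)
    # Apply a modulo along each subspace axis (the grid period)
    wrapped = np.mod(proj, period)
    # Bin each wrapped coordinate into `bins` intervals
    idx = np.floor(wrapped / period * bins).astype(int)
    idx = np.clip(idx, 0, bins - 1)           # guard float edge cases
    # One-hot encode each axis and concatenate into a binary array
    out = np.zeros((len(idx), bins), dtype=np.uint8)
    out[np.arange(len(idx)), idx] = 1
    return out.ravel()
```

Many such hypergrids with randomized bases, periods, and bin frequencies would then be concatenated to form the full binary representation.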

The subspace can be any dimension, but it is usually 1 or 2 dimensions. I haven’t experimented much with higher dimensions, but I have found that having many overlapping hypergrids at the same time represents high dimensions well, so long as their subspace basis vectors, grid periods, and bin frequencies are sufficiently random. How to randomize them and how they should be distributed is still something I’m studying, though I believe there are definite answers to be found here.

I’m working on a paper for this which should help explain things and give a more mathematical description.

I will follow up here with some plots to illustrate some HGT concepts.

@jacobeverist @ddigiorg
Thanks for creating and sharing this. This looks great! :clap:

I have some experience dealing with embedded sub-spaces. I’d be happy to help out if you need someone to bounce ideas off of or to discuss different ways of expressing your algorithms mathematically.

Hi,
Is there a paper or reference about Distributed Binary Patterns?
Thank you

@bela.berde working on it right now :slight_smile:

Have you had a chance to review the code? Do you have any questions? I’m still working on the documentation, so I will follow up when it’s ready.

I’d like to leverage the expertise you gained from Etaler development for things like TBB, and maybe templates if they fit somehow. Also, possibly a backend/frontend separation like you did in Etaler, although we currently only have one backend. We would have to resurrect our OpenCL code to make a GPU backend, with some re-engineering to get it to work at least as fast as the current single-core code.

The backend code is called bbcore. The C++ wrapper is applied first to create the C++ interface. Second, the Python wrapper is applied to the generated C++ interface with pybind11. A number of Python modules are generated to make up the Python package, which can be imported with import brainblocks.

Below is a glossary of our naming conventions that will help you interpret the code and how it relates to HTM terminology. Where relevant, we provide links to source or example code. The “Headers” refer to the backend ‘.h’ files, each of which has an easy-to-find companion ‘.c’ file. “Block Examples” refer to building a network of blocks manually. “Template Examples” show the use of templates that auto-assemble a network of blocks into a common architecture. The “Sklearn-Style Examples” demonstrate our Python classes that emulate the interface of scikit-learn estimators and transformers, which makes them easy to use and compare with scikit-learn’s library of classifiers and tools. Some code is implemented in Python, and we refer to that as “Python Source”.

Block - similar to a region or layer in HTM. A standard interface for all the components of BrainBlocks.

Pattern Pooler - (PP), like HTM Spatial Pooler but with differences. Header

Pattern Sequence Learner - (PSL), like HTM temporal memory but with differences. Block Example, Template Example, Multivariate Abnormalities Example, Hierarchical Abnormalities Example, Header

Scalar Encoder - same as HTM Scalar Encoder. Example, Header

Symbols Encoder - like a label encoder, although it takes integers like sklearn-style encoders. Example, Header

Persistence Encoder - a unique BrainBlocks feature that represents the passage of time while the same input is received. This creates input changes when you have long sequences like AAAAAAABBBBBBB, which helps with learning and also works great if you have missing data. Header
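
To illustrate the persistence idea with a toy sketch (this is not the actual encoder, just the concept of turning repetition into a changing signal):

```python
def persistence_counts(stream):
    """Count how long the input has stayed the same, so a repeated
    input like 'AAAA' still produces a changing value to encode."""
    counts, last, run = [], None, 0
    for x in stream:
        run = run + 1 if x == last else 0  # reset on change
        counts.append(run)
        last = x
    return counts

# persistence_counts("AAAB") == [0, 1, 2, 0]
```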

Pattern Classifier - (PC), our natively distributed classifier. It provides a supervised learning capability to HTM-like architectures. You can assign labels to sets of neurons, and it will train those neurons to activate when the labeled inputs are received. It works quite well in comparison to classic classifier algorithms. Blocks Example, Template Example, Sklearn-Style Example, Header, Python source

BlankBlock - a no-op block that is useful if you want to control the bit encoding directly from your scripts instead of using the backend tools. This is used in conjunction with the Hypergrid Transform. Example, Header

Hypergrid Transform - (HGT), a Python sklearn-style transformer that converts M-dimensional scalar vectors into numpy binary arrays. Can be input into BrainBlocks with the BlankBlock. Example, Python source

Page - these are the inputs/outputs of the blocks. A page is capable of having parent-child relationships with other pages. The content of the child pages is concatenated to create the content of the parent page. So to connect the output of an encoder to the input of a pooler, you would add the encoder output page as a child of the pooler input page. The pages have both the BitArray and ActArray representations available, which are created as needed. Header
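
A toy sketch of the parent/child concatenation idea (class and method names are illustrative, not the real bbcore API):

```python
class Page:
    """Toy page: a parent's content is the concatenation of its
    children's content; a leaf page holds its own bits."""
    def __init__(self, bits=None):
        self.bits = bits if bits is not None else []
        self.children = []

    def add_child(self, child):
        self.children.append(child)

    def content(self):
        if not self.children:
            return self.bits
        out = []
        for child in self.children:
            out.extend(child.content())
        return out

# Connect an encoder's output to a pooler's input:
encoder_out = Page([1, 0, 1])
pooler_in = Page()
pooler_in.add_child(encoder_out)
# pooler_in.content() == [1, 0, 1]
```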

BitArray - The full bit representation of neuron activity. This is compact and can represent 8 neurons per byte. Header

ActArray - The sparse active-neuron representation: an array of addresses that represent the active neurons. Sometimes this is the preferred representation, but often the BitArray outperforms it. Header

Permanence - same as HTM permanence.

Statelets - these are analogous to neurons but without any implications of biological function. Either a statelet is active or not. And like its name, it represents a fragment of some greater state representation.

Column - equivalent to HTM minicolumn. This is just a convenience referring to the geometry without needing to explain the difference between minicolumns and cortical columns. Again, we’re trying to avoid biological discussion and focus on algorithms.

CoincidenceSet - This is analogous to a dendrite with synapses. A CoincidenceSet is owned by a statelet or shared by a column of statelets (in the sequence learner block). We renamed it to describe its functional role, which is to find statelets whose activations are coincident with the statelet that owns the coincidence set. Header

Receptors - The set of statelets that a CoincidenceSet is using for input (i.e. the potential pool of inputs in HTM parlance). Again, this reflects their functional role of creating a “receptive field” for a particular statelet that owns the CoincidenceSet.

That’s all for now. Let me know if you have more questions and I’ll try to answer them and turn this into a sort of guide.

Hi @jacobeverist,

Brilliant work. I’m doing some exploring at the moment and wondering how I would set reset to signal the start of a new sequence? Could I just reset time?

Looks like we forgot to put that in. We used to have it in our old internal version, but we neglected to put it back in for the current release. We’ll put it back in as soon as we can.

A hack-ish workaround would be to add a reset code, kind of like a newline or EOF character. It would indicate the end of one sequence and the start of another, nicely breaking up your sequences and preventing them from stitching together.

This encoding has to be completely different from, and have no overlap with, all other inputs you would receive. That should get you moving forward until we put in the fix. Actually, it would help if you filed this as an issue on our GitHub page.
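
At the symbol level, the workaround might look like this (a hypothetical helper, not part of BrainBlocks):

```python
def add_reset_codes(sequences, reset_symbol="<RESET>"):
    """Insert a reserved symbol, which shares no bits with any real
    input, between sequences so they don't stitch together."""
    stream = []
    for seq in sequences:
        stream.extend(seq)
        stream.append(reset_symbol)  # marks end-of-sequence
    return stream

stream = add_reset_codes([list("AAB"), list("CCD")])
# stream == ['A', 'A', 'B', '<RESET>', 'C', 'C', 'D', '<RESET>']
```

The reset symbol's encoding must then be mapped to bits that never overlap with any real input's encoding.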

I wouldn’t use time as an input unless your waveforms are consistently occurring at the same times. Otherwise, use the PersistenceEncoder.

Thanks @jacobeverist. I’ll try it out. Issue created: https://github.com/the-aerospace-corporation/brainblocks/issues/4

I’m looking to test out BrainBlocks on my 16-channel EEG data, as @marty1885 suggested in HTM on EEG data. First on the raw signal, then on the high-dimensional cross-channel features.

You should adapt the multivariate example for cross-channel learning. It generates synthetic data which you would need to replace with your own.

For doing independent abnormality detections on each channel, adapt the hierarchical example but remove the hierarchy. Or keep it in if you want.

I’m kinda bad at reading pure C code, and I need more time to understand the details. As far as I can see, it doesn’t seem too hard to add TBB support. Adding OpenCL support will likely need a huge rewrite. I also think it is possible for BrainBlocks to share Etaler’s backend, given we both do some rewriting, if that’s something you want. I’m currently working on improving the backend design. Maybe we could discuss this over PM or mail?

I’d like to see some unit tests for BrainBlocks. The current tests only print values, so I have no idea whether they are working correctly on my machine. See Etaler’s tests.

Also, is there a design reason behind using C’s rand function? It is a bad random generator even though it’s fast-ish: tuples of consecutive values it produces fall on a small number of hyperplanes. It might cause weird edge cases and cause learning to fail.

The Python versions of the tests use pytest and will flag errors if they exist. We didn’t have any experience writing unit tests in C/C++, so we don’t have a framework for them.

Start from the C++ wrapper to make things easy. It’s functionally object-oriented, which is apparent in the C++ classes.

I have been playing with BrainBlocks (v0.7) and am enjoying its differences and relative simplicity vis-à-vis NuPIC, which help clarify my thinking around HTM. I do have a couple of questions, though, and I hope this is the right place to ask them. If you would prefer I ask them as issues on the GitHub repository, I am happy to do so. Also, it has been a while since the last post in this thread and BrainBlocks has released a new version (0.7), so I apologise if this should be a new thread (admins, feel free to move it).

Question 1: differences between ContextLearner and SequenceLearner

From my understanding of NuPIC, a context learner (temporal memory used in object detection) is a more general case of a sequence learner (temporal memory used in sequence/anomaly detection). Or, rather, the sequence learner is a specific case of the context learner where the context input is the learner’s output from the previous timestep. This plays out in the similarity between the context_learner.cpp and sequence_learner.cpp source. As expected, the init functions differ, with the latter adding the output at t - 1 as a child of the context’s input. But there is also a difference in the surprise method: the sequence learner includes the following code, which is absent in the context learner:

    // For each statelet on the active column
    for (uint32_t s = s_beg; s <= s_end; s++) {

        // Check if it is a historical statelet
        // - statelet is not the random statelet
        // - statelet has at least 1 dendrite
        //if(s != s_rand ) {
        if(s != s_rand && next_sd[s] > 0) {
            // Activate historical statelet
            output.state.set_bit(s);

            // Activate historical statelet's next available dendrite
            set_next_available_dendrite(s);
        }
    }

Is there a reason why the sequence learner requires this additional activation but the context learner does not?

Question 2: NuPIC punishes incorrect predictions, BrainBlocks does not. Why is it not needed?

Both libraries reinforce the active connections based on the active output of the learner. The NuPIC implementation goes a step further and punishes matching synapses on the context inputs that led to an incorrectly predicted (depolarised) cell. The BrainBlocks code does not punish these. Was there a reason to skip this punishment?

Thanks,
Scott

Thanks for checking out our code. Your questions are very astute!

You are correct that the ContextLearner is the more general form. The SequenceLearner is a special case of the ContextLearner, where the distal context comes from the t-1 statelet activations.

As for the extra activation code, this is a trick we learned to make more efficient use of statelets in the SequenceLearner for time-series data. We may add the same code to the ContextLearner as well, since I see no reason why it can’t be used.

Basically it does this: whenever there is a “surprise” activation of a minicolumn, we do not activate all of the statelets on that minicolumn. Instead, we activate all of the statelets that have been active in the past, to make connections to all previous context states, and we activate a single random new statelet that acts as a new hypothesized context state. This uses statelets efficiently as a resource, consuming them only as they are needed. The previous approach to minicolumn “bursting” is to activate all neurons on the minicolumn to hypothesize all possible context states, which makes learning much slower. This new, efficient approach makes sequence learning very fast.
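
A hypothetical Python paraphrase of the quoted C++ logic (names and data structures are mine, not the bbcore ones):

```python
import random

def surprise_activate(column_statelets, dendrite_counts, rng=random):
    """On a surprising column, activate only historical statelets
    (those with at least one dendrite) plus one random new statelet,
    instead of bursting every statelet on the column."""
    s_rand = rng.choice(column_statelets)  # new hypothesis statelet
    active = {s_rand}
    for s in column_statelets:
        # Historical statelet: not the random one, has >= 1 dendrite
        if s != s_rand and dendrite_counts.get(s, 0) > 0:
            active.add(s)
    return active
```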

Again, a good question with an interesting answer. By default, we do not punish incorrect predictions, but we will add the option to do either. So what’s the difference? I suppose it depends on your intended application.

If you want to make predictions of the t+1 state, punishment should be used because you are trying to find the best answer for the next step: you want to punish poor predictions and reward good ones. However, if you are doing anomaly detection, punishing poor predictions isn’t nearly as important. Without punishment, the next-step prediction represents the entire space of possible next states that have been seen in the past. This gives you a large envelope of previously seen trajectories which you can use as your “normal” behavior. Any next step that deviates from all of these trajectories represents “abnormal” behavior, which makes for a really good anomaly detection approach.

Thanks for checking out BrainBlocks and for your comments! Looking forward to hearing more from you.

A very basic question - what kinds of problems can I solve with BrainBlocks? Time-series anomalies? Anything else?
Thanks.

@Mark_Springer We’ve mostly worked on time-series anomaly detection and standard classification tasks.

I think we’re quite competitive with other approaches on abnormality detection results. Our classification results for feature vectors are “on par” with other approaches, but we haven’t extended it to image classification yet. You can see a side-by-side comparison here.

A big difference from the other approaches is that BrainBlocks only expresses an opinion within the area of its learned experience. It will give a solid “I-don’t-know” answer when encountering novel inputs. This gives you a simultaneous classification and outlier detection capability.

It took us a while to figure out what the distinguishing features of BrainBlocks actually are and why you would use it for an application. They are the following:

  1. few training inputs required: will often learn in one-shot
  2. robust to novel inputs: flags a novel input as an outlier and can find a semantically similar learned input if it exists
  3. explainable results: the internal representation and output can be communicated to the user with a trained translator network. This is similar to HTM’s time-series prediction but more generalized for an arbitrary language

The first two are “baked in” with BrainBlocks. The third requires some effort and isn’t something we’ve released yet.

Other than that, we use BrainBlocks to create experimental cognitive architectures that do things like sensorimotor learning and object recognition, similar to the HTM experiments. Does that answer your question?

Yes, thanks for the informative answer.

New release mentioned here: