EMG Processing Network Questions

Hey all, I’ve been working on a project that uses HTM.java to identify EMG signals. I’m having a bit of trouble, however, figuring out how to best go about setting up my network. I’ll go ahead and explain in more detail what I’m looking to do, and what my questions are:

I recently purchased a Myo armband, and although it is quite fun, it is pretty limited in its functionality at this point in time. The idea is to slip the armband on and have it make your phone or desktop respond in some way when you perform certain gestures. Its main issue right now is that it only recognizes about 5 or 6 different hand poses (more if you factor in the armband’s gyroscope and accelerometer sensors, but I’m just concerned with the EMG data right now). The developers of the armband released an API that allows you to access a stream of EMG data from the armband via Bluetooth. I figured this would be a perfect application for an HTM - classifying and identifying a stream of spatial and temporal data!

The armband is essentially composed of 8 EMG sensors (they’re called pods). You just slide it up your forearm and wear it near your elbow. So I’m working with up to 8 streams of EMG readings from the band - because the API allows you to access each sensor’s EMG stream individually. This is where my main question comes in: What is the “best” way to set up a network to work with this data? Should I use all the data, or just some of it? How should I feed it into the network?

My current approach has been to set up a triple layer network (with these Parameters):

// Using the beautiful fluent-style API...
network = Network.create("Network", p)
    .add(Network.createRegion("Region 1")
        .add(Network.createLayer("Layer 1", p)
            .alterParameter(Parameters.KEY.AUTO_CLASSIFY, Boolean.TRUE)
            .add(Anomaly.create())
            .add(new TemporalMemory())
            .add(new SpatialPooler()))
        .add(Network.createLayer("Layer 2", p)
            .add(new TemporalMemory())
            .add(new SpatialPooler()))
        .add(Network.createLayer("Layer 3", p)
            .add(Sensor.create(ObservableSensor::create, SensorParams.create(
                SensorParams.Keys::obs, new Object[] {"name", publisherSupplier}))))
        .connect("Layer 1", "Layer 2")
        .connect("Layer 2", "Layer 3"));

Here’s what the three header lines for the PublisherSupplier look like:

Pod1,Pod2,Pod3,Pod4,Pod5,Pod6,Pod7,Pod8
float,float,float,float,float,float,float,float
B,B,B,B,B,B,B,B

Every time step I get a byte[8] containing the most recent readings from the armband’s EMG stream. I then feed in the EMG data manually via a Publisher. So, if I got this array:

{-1,-4,4,-7,-5,2,0,3}

I would feed all eight values into the network as a single line, like so:

manualPublisher.onNext("-1,-4,4,-7,-5,2,0,3");
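As a minimal sketch of that conversion step, assuming the readings arrive as a signed `byte[8]`, a small helper can build the comma-separated line. The class and method names here are hypothetical and not part of HTM.java.

```java
// Hypothetical helper (not part of HTM.java): turns one time step of
// pod readings into the comma-separated line the Publisher expects.
final class EmgLineFormatter {

    /** Joins the readings into a single CSV line, one value per pod. */
    static String toCsvLine(byte[] readings) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < readings.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(readings[i]); // byte widens to int, so -1 prints as "-1"
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {-1, -4, 4, -7, -5, 2, 0, 3};
        System.out.println(toCsvLine(sample));
    }
}
```

The resulting string could then be passed along with `manualPublisher.onNext(EmgLineFormatter.toCsvLine(sample));`.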

Every time the network processes a line of input, I add that processing’s actual input, predictions, and anomaly score to a list. After I finish feeding in data, I graph the actual and predicted values for each pod on a line chart. So, here is the chart resulting from training a brand new network for the first time on the EMG readings created by closing and opening my fist:


You’ll notice the average anomaly score is 94% (that is calculated by summing the anomaly score for each line of input and dividing by the number of inputs).
After being trained on that same data 11 more times, the chart looks like this:

The anomaly score is down closer to 70% now. This is about as low as I can seem to get it. I only used data from one fist EMG sample in the above pictures for simplicity’s sake, but I’ll typically take about 10 samples and randomly train the network on them for up to 50 times. I even trained the network 100 times (only did that once, because it took quite a long time :slight_smile:), but even then I was only able to lower the average anomaly score to ~72% (note that the avg anomaly score is reset after each training, so the 72% was only for the 100th training). It seems like I ought to be able to do better than that… Or am I wrong?
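For reference, the per-training average described above amounts to something like the following sketch. `AnomalyAverager` is a hypothetical name, and the scores are assumed to be the raw per-line anomaly values in the range 0 to 1.

```java
// Hypothetical accumulator (not part of HTM.java) for the average anomaly
// score of one training pass: sum of per-line scores / number of lines.
final class AnomalyAverager {
    private double sum = 0.0;
    private int count = 0;

    /** Records the anomaly score of one processed input line. */
    void add(double anomalyScore) {
        sum += anomalyScore;
        count++;
    }

    /** Returns the average so far, or 0.0 if nothing has been recorded. */
    double average() {
        return count == 0 ? 0.0 : sum / count;
    }

    /** Clears the accumulator, e.g. before the next training pass. */
    void reset() {
        sum = 0.0;
        count = 0;
    }
}
```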

I found a thesis paper (pdf download) online in which the authors did something similar to what I am looking to accomplish. If you’re able, I suggest perusing pages 49-68 of the document to see how they set up their network with NuPIC (the rest of it is mostly discussion of other neural networks and their EMG data capturing device, which appears to be roughly equivalent to one pod of the myo armband).

Here’s a schematic they used to illustrate their setup:


They go much more in depth in the document, but you can get the gist of it from that schematic. As far as I can tell, our networks are pretty similar, except for their category sensor (which, as I understand it, tells the classifier whether it is looking at data from an index finger opening or closing). I’d like to include a category sensor in my network setup, or something similar, because I’ll need to distinguish between (ideally) many different EMG data streams (e.g. opening and closing a fist, extension and contraction of the fingers, wrist movements, etc…). The only other option that I can see is to have an individual network train on each hand pose and see which network outputs the lowest anomaly score for a particular set of EMG samples. That is what I was going to do until I read their paper, but I think the category sensor approach would be far more efficient.
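To make the one-network-per-pose alternative concrete, here is a rough sketch of the selection step only. It assumes each pose's network has already produced an average anomaly score for the same set of samples; `PoseSelector` and the map of scores are hypothetical.

```java
import java.util.Map;

// Hypothetical selection step (not part of HTM.java): given the average
// anomaly score each per-pose network produced for the same samples,
// pick the pose whose network was least surprised.
final class PoseSelector {

    /** Returns the pose with the lowest score, or null for an empty map. */
    static String lowestAnomalyPose(Map<String, Double> avgAnomalyByPose) {
        String bestPose = null;
        double bestScore = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, Double> entry : avgAnomalyByPose.entrySet()) {
            if (entry.getValue() < bestScore) {
                bestScore = entry.getValue();
                bestPose = entry.getKey();
            }
        }
        return bestPose;
    }
}
```

The cost of this approach is that every sample has to run through every network, which is the inefficiency mentioned above.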

The problem is that I’m not entirely sure how to implement a category sensor in HTM.java. I thought that I might be able to just add a field to each input line identifying what pose the input is from. For example, an input line from an EMG sample of a fist closing would be:

manualPublisher.onNext("fist-close,-1,-4,4,-7,-5,2,0,3");

whereas an EMG sample of a fist opening would be:

manualPublisher.onNext("fist-open,-1,-4,4,-7,-5,2,0,3");

I have not yet tried out this approach, as I wanted to get some input from the community before going much further. The main “issue” with it is that the category identifier would be processed by all stages of the network, rather than just the classifier. Granted, this might not matter much?

The authors of the thesis paper also appear to take a much different approach than I do when putting data into their network. It appears that they take 512 samples from their single EMG sensor and feed that into their network:


So they are essentially taking a section of the graph produced from their EMG sensor’s readings and feeding it into their network. Whereas I am taking a point from each of my 8 EMG sensors’ graphs at the same x-coord and feeding that into my network. If I were to take their approach, then it seems I would need to construct a network 8x as large as theirs - assuming I use the data from all 8 of the armband’s sensors.
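A sketch of what the windowed approach might look like for a single pod, assuming a sliding window over the most recent readings. `SampleWindow` is a hypothetical class; the thesis setup corresponds to a window size of 512.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sliding window (not part of HTM.java) over one pod's readings.
final class SampleWindow {
    private final int size;
    private final Deque<Byte> buffer = new ArrayDeque<>();

    SampleWindow(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        this.size = size;
    }

    /**
     * Adds one reading. Returns the latest `size` readings (oldest first)
     * once enough have arrived, or null while the window is still filling.
     */
    byte[] push(byte reading) {
        buffer.addLast(reading);
        if (buffer.size() > size) {
            buffer.removeFirst(); // drop the oldest reading
        }
        if (buffer.size() < size) {
            return null;
        }
        byte[] window = new byte[size];
        int i = 0;
        for (byte b : buffer) {
            window[i++] = b;
        }
        return window;
    }
}
```

With one such window per pod, each input would carry 8 × 512 = 4096 values, which is where the 8x network size comes from.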

I suppose that about sums it up. I’m sure I’ve left out an important detail or two, so please ask if anything doesn’t make sense and I’ll clarify. I’ve set up a GitHub repo for my project just now, so you can see the source. I’ll warn you, it isn’t the cleanest or most well documented code at this stage, as I’m still trying out a lot of different things… It’ll look better once I settle on one approach, which is why I’m posting this in the first place :slight_smile:

Again, my main questions are:

  • How should I set up my network?
  • If it is a good approach, how should I best implement a category sensor?
  • How should I go about feeding in the EMG data to my network?

Thanks in advance for any advice or input. I’m hoping some folks with more experience and knowledge in this area can help me out a bit! :smiley:


Very cool project!

I would just like to comment on the Network you’re using. I would only use one Layer unless you have some real reason to use 3. In this case (until real hierarchies are added), the more is not the merrier :slight_smile:
Other than that, I would look at the SDRCategoryEncoderTest for hints on how to set up a category encoder. As for how to express and set up the actual categories, a more experienced Numenta engineer should be able to help you further.

Cheers,
David

EDIT: What I mean by, “until real hierarchies are added” above is something very intentional. I don’t mean merely repetition of “SP->TM” structures in a Network. Hierarchy in HTMs has a discrete meaning and will result from extensive research and investigation into how the layers in a neocortical region communicate back and forth, and how the regions themselves communicate back and forth.

see this for more context…

Are you saying that there is no benefit whatsoever to using multiple layers, or just that the benefit might not be worth the cost (it does seem like the additional TM and SP bumped up the processing time significantly)? I was initially using just one layer, but I thought I’d try adding some more after reading that thesis I linked to. They seemed to be under the impression (as was I) that adding layers would allow higher level patterns to be “discovered” by the network. Although, interestingly, I was able to get the average anomaly score down to about 20-25% with a single layer (but as I said, I can’t drop it below 70% with the setup I explained in my post). Also, speaking of average anomaly scores, is that a proper way of measuring a network’s ability to recognize a series of inputs?

I was under the impression that the primary missing feature from multiple layers was feedback. Is there something more? Or maybe I’m entirely misunderstanding you… Are you saying that NuPIC (what the paper’s authors used) has this feature, but that it isn’t yet implemented in HTM.java?

[quote=“cogmission, post:2, topic:419”]
I would look at the SDRCategoryEncoderTest for hints on how to setup a category encoder.
[/quote]

Thanks. I actually fiddled around a bit with category encoders a while after posting this. I was working with the CategoryEncoder, though. If you take a peek at the documentation for it, you’ll find this:

The SDRCategoryEncoder (not yet implemented in Java)…

Is the SDRCategoryEncoder now completed (making the above line of documentation outdated)?

Also, just out of curiosity, is one of them better than the other? If that is the case, how so?

I just went in and changed my code from using a CategoryEncoder to using an SDRCategoryEncoder, and it works fantastically. Which is great, because last night, when I was trying to use the CategoryEncoder, I was getting an annoying ArrayIndexOutOfBoundsException (as is typical for me :slight_smile:). It went away just by my adding SDR to the front of CategoryEncoder, though. So thanks, you saved me some time 07

For the most part, higher level features won’t be discovered yet… until “Hierarchy” is formally added to the HTM Algorithms. The HTM.java NAPI has the ability to connect Regions just as a “place holder” for the anticipated Hierarchy features later (at least for the most part). If you find you get traction using more Regions, I’m not going to stop you :wink:

Is the SDRCategoryEncoder now completed (making the above line of documentation outdated)?

I guess you answered your own question :wink: But for others’ sake I’m going to answer yes here. It is outdated documentation that needs to be removed. (If anyone wants to create a GitHub Issue and solve this to get their name in “bright lights” or just otherwise help, I’d appreciate it.)

Also, speaking of average anomaly scores, is that a proper way of measuring a network’s ability to recognize a series of inputs?
Also, just out of curiosity, is one of them better than the other? If that is the case, how so?

I’ll leave this for one of the Numenta engineers as I don’t know the answer to this… If you don’t get an answer here, then post your question in the “NuPIC” category so it will get some attention from Numenta folks, since these are “general” questions that I can’t answer at the moment (though I do favor the SDRCategoryEncoder).

Hehe, I can do that :stuck_out_tongue:. Do I need to do anything other than edit the javadoc comment? Will that automagically cause the javadoc to be updated, via a webhook or something?

I can update the javadoc wiki link from there once you submit your PR. It requires that I update my clone then run a small script to generate the new docs - then put them in Numenta.org’s wiki site - just FYI

This paper is referring to the Zeta 1 version of NuPIC, which is not the version implemented in HTM.java. Please read H is for Hierarchy for background.

If you have 8 sensors, I believe you’ll need to have at least 8 HTM models. In our current architecture, you can only choose one predicted field to make predictions for per model. So if you are attempting to create live predictions for each sensor on the device, you would need 8 models, one for each sensor stream. You might end up passing all 8 streams into each model, but only predicting one sensor stream for each model. Or you might end up sending only one sensor stream to each model (it depends on whether sending all 8 streams to each model helps the prediction; you’ll have to experiment with that).

Also, some general notes about signal processing with HTM. You are not the first one to try this. :wink:

Sample the data if the velocity is too fast

At what rate does the device produce data? It may be too fast, in which case sampling is your friend. For example, we learned from a neurologist at the Fall 2014 hackathon that humans can identify a “pre-ictal” state in an EEG signal (a seizure is imminent) at a sampling rate of 100 Hz, but the EEG data stream itself is a massive amount of data, so very heavy sampling or aggregation of some kind is necessary to get a stream into the HTM.

A good rule of thumb for this is, if you (as a human) can identify different classifications in the data at the current sampling rate, the HTM will have a better chance as well. I’ve found that it is generally better to sample or aggregate the data aggressively until the resulting signal is simple enough for you to see the pattern, but no simpler.
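One simple form of aggregation, as a sketch: collapse every block of `factor` consecutive readings into a single value. Taking the mean of the absolute values is an assumption made here because EMG readings swing around zero and a plain mean would largely cancel out; `BlockAggregator` is a hypothetical name.

```java
// Hypothetical aggregation helper (not part of NuPIC or HTM.java):
// reduces the data rate by a fixed factor.
final class BlockAggregator {

    /**
     * Collapses each block of `factor` samples into the mean of their
     * absolute values. A trailing partial block is dropped.
     */
    static double[] meanAbsolute(double[] signal, int factor) {
        if (factor <= 0) {
            throw new IllegalArgumentException("factor must be positive");
        }
        int blocks = signal.length / factor;
        double[] out = new double[blocks];
        for (int i = 0; i < blocks; i++) {
            double sum = 0.0;
            for (int j = 0; j < factor; j++) {
                sum += Math.abs(signal[i * factor + j]);
            }
            out[i] = sum / factor;
        }
        return out;
    }
}
```

The right `factor` is whatever leaves the gesture visible to the eye when the aggregated signal is plotted, per the rule of thumb above.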

You might need to FFT

For extremely high-velocity data like EEG and sometimes EMG, you might want to apply an FFT to the data. This will expand your model space significantly, but it is one way to extract features that only exist in certain frequencies of the data. For a heavy metal example of this, you can see this video and my nupic.critic project.
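To illustrate the kind of frequency features meant here, the sketch below computes magnitude per frequency bin for one window of samples. Note that it is a naive O(n²) discrete Fourier transform rather than a true FFT, chosen only to stay dependency-free; a real project would more likely use an FFT library. `NaiveDft` is a hypothetical name.

```java
// Hypothetical, dependency-free illustration: a naive discrete Fourier
// transform (not an FFT) returning the magnitude of each frequency bin.
final class NaiveDft {

    /** Returns magnitudes for bins 0..n/2 of a real-valued window. */
    static double[] magnitudes(double[] x) {
        int n = x.length;
        double[] mag = new double[n / 2 + 1];
        for (int k = 0; k < mag.length; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = 2 * Math.PI * k * t / n;
                re += x[t] * Math.cos(angle);
                im -= x[t] * Math.sin(angle);
            }
            mag[k] = Math.sqrt(re * re + im * im);
        }
        return mag;
    }
}
```

Each bin (or a band of bins) could then become its own input field, which is how the model space grows.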

Figure out how to use “Anomaly Likelihood”

In the NuPIC (Python) codebase, we have an Anomaly Likelihood algorithm that we use to massage the raw “anomaly score” coming out of our HTM systems. This algorithm exists in HTM.java, but I don’t think it is integrated in any way with the Network API. We’ve found that using this post-process increases the value of the anomalies reported dramatically. Here are some more details:


Yes, it’s not a bad way of doing it. We used to have something of an “error score” but it turned out to be a bit misleading. I think the best way is still to keep track of how accurate the predictions are vs ground truth, plotting the error over time. The more error within a moving window, the worse the model understands the data.
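A minimal sketch of that moving-window error, assuming numeric predictions and using mean absolute error as the metric (the metric and the `MovingError` class are illustrative choices, not something NuPIC or HTM.java prescribes):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical tracker (not part of NuPIC or HTM.java): mean absolute
// error between predictions and ground truth over the last `window` steps.
final class MovingError {
    private final int window;
    private final Deque<Double> errors = new ArrayDeque<>();
    private double sum = 0.0;

    MovingError(int window) {
        if (window <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        this.window = window;
    }

    /** Records one prediction/actual pair and returns the current windowed error. */
    double update(double predicted, double actual) {
        double err = Math.abs(predicted - actual);
        errors.addLast(err);
        sum += err;
        if (errors.size() > window) {
            sum -= errors.removeFirst(); // forget the oldest error
        }
        return sum / errors.size();
    }
}
```

Plotting the value returned by `update` at each step gives the error-over-time curve described above.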


On a side note @Hopding, this is such a great first post! Well-formatted and clear. Thanks for being part of the forum!

@Hopding @rhyolight

In our current architecture, you can only choose one predicted field to make predictions for per model. So if you are attempting to create live predictions for each sensor on the device, you would need 8 models, one for each sensor stream.

I may have to look at this in order to limit the processing, but for now HTM.java doesn’t have this limitation. HTM.java instead creates 8 classifiers and you can access their Classifications by field name from the Inference object that comes out of the end of the Network.

I would like to ask a Numenta engineer whether I should limit this, since by default it creates a Classifier for every field.

Wow, thanks @rhyolight! That’s a lot to chew on :smile: I’m sure I’ll be able to improve my setup with all this great info.
