After trying some of the NuPIC examples, I want to feed SDR data (stored in a CSV file) straight into NuPIC. This SDR already carries its own semantic meaning, and I want to use it for anomaly detection.
My intention is to use the pass_through encoder to feed NuPIC directly with this data, but I’m not sure what I should include in my code to run it.
After building a custom anomaly detector following the steps here (removing the SDR classifier and adding the anomaly algorithm) with my encoded SDR, I’m always getting a score of 1.0, no matter how many iterations I run…
I’d say definitely try much larger ‘n’ and ‘w’ values for your input SDRs, at least 10x bigger. Using just 13 and 2 would really confine the spatial pooler: with only 2 active bits for every column to connect to, it has a hard time picking the best-overlapping columns. Of the 2048 columns, I bet a lot of them are connected to those 2 bits, creating a big multi-way tie in overlap scores. I can’t guarantee that’s your only problem, but I bet it is one, and I’d try that change first.
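To make the tie problem concrete, here is a rough sketch in plain Python. It is not NuPIC's spatial pooler: the random 50% potential pool, the fixed seed and the function name `overlap_histogram` are assumptions made for illustration, and permanences and boosting are ignored. It only counts how many columns land on each overlap score for one input.

```python
import random
from collections import Counter

def overlap_histogram(n, w, num_columns=2048, potential_pct=0.5, seed=42):
    """Count how many columns share each overlap score for one random input.

    Simplification (assumption): each column is connected to a random
    `potential_pct` of the n input bits; no permanences, no boosting.
    """
    rng = random.Random(seed)
    active_bits = set(rng.sample(range(n), w))
    pool_size = max(1, int(n * potential_pct))
    histogram = Counter()
    for _ in range(num_columns):
        connected = set(rng.sample(range(n), pool_size))
        histogram[len(connected & active_bits)] += 1
    return histogram

small = overlap_histogram(n=13, w=2)
large = overlap_histogram(n=130, w=20)
print("n=13,  w=2 :", sorted(small.items()))
print("n=130, w=20:", sorted(large.items()))
```

With w=2 every column can only score 0, 1 or 2, so whichever columns win are picked out of a very large tie. With the 10x larger encoding the scores spread over many more values, which gives the pooler something to discriminate on.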
Also I’m not sure but I wonder why spatial pooling is even needed if you already have your SDR? Spatial pooling takes an encoder space and turns it into an SDR within the 2048 column space by picking the best overlapping columns (to form the SDR), but if you already have your SDR then couldn’t that just go right into temporal memory? I’m not familiar with the PassThru encoder in particular so I’m just wondering how much if any of that makes sense (@rhyolight)?
I made a bigger SDR and now I’m getting good anomaly scores. However, there is a small number of transitions where I get high scores, even though my data are simple transitions between SDR states that are repeated cyclically 100 times.
And speaking again about n and w… what range of w/n (as a percentage) does NuPIC work well with?
Wait @rhyolight, so the ‘w’ value in the encoders should be just 2% of the ‘n’ value? So for a scalar encoder I should have a ‘w’ of just 6 if my ‘n’ is 300?
Also @Baquetron, glad to hear your anomaly scores have come down. If your anomaly scores are usually low but become high only at certain transitions, it would seem that those transitions are anomalous in some way. It could be that those transitions occur very infrequently compared to the others. If you’re able to plot your data somehow and highlight where the anomaly scores are getting high, I’d be glad to have a look.
Maybe I should ask @scott or @subutai about this, but generally this is what I try to shoot for. Sometimes it is not possible because you want to encode more semantics into the data, and weighting can be important.
Ok great, I’m definitely curious to hear it! If I remember right I think you once said that the ‘n’ value shouldn’t be much more than like 400, in which case the 2% ‘w’ would be just 8. Right now I’m using ‘n’ at about 300 and ‘w’ at 21 for 2 scalars (working to get that changed to x,y points with the coordinate encoder), so if that’s suboptimal I’d love to know.
That is a typical number for scalar encoding. I think the rule is to just be consistent, and keep it more sparse than dense. Once you breach a certain density threshold, some computational aspects of SDRs start to become weaker.
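For reference, the w/n figures quoted so far in this thread can be checked with a few lines of arithmetic. The helper names `sparsity` and `width_for_sparsity` are made up for this sketch, and the 2% target is just the rule of thumb discussed above, not a hard NuPIC requirement.

```python
def sparsity(w, n):
    """Fraction of active bits in an encoding with w on-bits out of n."""
    return float(w) / n

def width_for_sparsity(n, target=0.02):
    """Number of on-bits that gives roughly `target` sparsity for n bits."""
    return max(1, int(round(n * target)))

# (w, n) pairs mentioned in the thread
for w, n in [(2, 13), (6, 300), (8, 400), (21, 300)]:
    print("w=%d, n=%d -> %.2f%% active" % (w, n, 100 * sparsity(w, n)))

print("2%% of n=300 -> w=%d" % width_for_sparsity(300))
print("2%% of n=400 -> w=%d" % width_for_sparsity(400))
```

So w=21 with n=300 is about 7% active, denser than the 2% target but still well on the sparse side.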
Step 0 -> Corresponds to the change from the last SDR back to the first one
Step 9 -> Transition line from bits (1-10) to (10-19)
Step 17 -> Transition line from bits (10-19) to (19-28)
@rhyolight @sheiser1 what annoys me is that I’m getting 1.0 anomaly scores at these steps from the very first iteration…
3 more things to add:
My w/n is 10.47%
So if I’m getting a 1.0 score for the transitions mentioned in my model, I don’t have any way to tell when a new transition is added to my data, as it will get almost the same score
Is overlap needed for possible transitions? The key point is to detect anomalous behaviour between transitions, so if I train my model with a transition from 10 to 100 that happens very often, it should get low anomaly scores even though there is no SDR overlap between these numbers. Right?
Thanks again for your attention! Hope my comment hasn’t put you to sleep
I would not trust an anomaly score until it has seen a significant amount of data. Patterns arise over time, and you want the system to see each pattern several times. Also, we have had much more success triggering anomalies by using a post-process we call “anomaly likelihood”. Once you hook this up, I suggest flagging anomalies if the likelihood value is over 0.9999. (this is not biological, just a hack to smooth out the very jumpy anomaly score metric)
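As a rough illustration of the idea behind such a post-process, here is a simplified, standard-library-only sketch. It is not NuPIC's implementation (NuPIC ships its own in the `nupic.algorithms.anomaly_likelihood` module); the class name `SimpleAnomalyLikelihood`, the window sizes and the 0.5 "don't know yet" value are assumptions. It fits a Gaussian to the history of raw scores and asks how unusual the recent average is.

```python
import math
from collections import deque

class SimpleAnomalyLikelihood(object):
    """Toy likelihood: Gaussian CDF of the recent mean of raw anomaly scores."""

    def __init__(self, history=500, short_window=10, min_samples=50):
        self.scores = deque(maxlen=history)       # long-term raw scores
        self.recent = deque(maxlen=short_window)  # short-term raw scores
        self.min_samples = min_samples

    def update(self, raw_score):
        self.scores.append(raw_score)
        self.recent.append(raw_score)
        if len(self.scores) < self.min_samples:
            return 0.5  # not enough history to judge anything yet
        mean = sum(self.scores) / len(self.scores)
        var = sum((s - mean) ** 2 for s in self.scores) / len(self.scores)
        std = max(math.sqrt(var), 1e-3)  # floor avoids dividing by ~zero
        recent_mean = sum(self.recent) / len(self.recent)
        z = (recent_mean - mean) / std
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

detector = SimpleAnomalyLikelihood()
regular = []
for step in range(300):
    raw = 1.0 if step % 10 == 0 else 0.0  # jumpy but perfectly regular
    regular.append(detector.update(raw))
for _ in range(10):
    burst = detector.update(1.0)          # sustained run of high scores
print("max during regular phase:", max(regular))
print("after sustained burst:   ", burst)
```

The regular spike every 10 steps never pushes the likelihood above 0.5, while a sustained run of high raw scores drives it close to 1. That is the smoothing effect described above: isolated, expected jumps in the raw score stop looking like anomalies.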
Yes, it will take some time to learn the patterns well enough for the anomaly scores to have much meaning. As an analogy, if you’re at a company and an intern is surprised by something, it may well be a normal thing, since the intern hasn’t seen much data and is easily surprised. Whereas if the most senior manager is really surprised by something, that’s more meaningful, since that person has seen more data (more experience, basically).
Anyway, on your chart @Baquetron, is the horizontal axis time? I’m a little confused about the meaning of the vertical axis. It seems like the system is getting surprised every time there’s a shift, like at step 9 from bits (1-10) to (10-19). Since these patterns seem to have a lot of near-identical values and then suddenly shift to another series of identical values, I think it’ll take more data or repetitions for it to learn those transitions from sub-pattern to sub-pattern. For instance, if the sequence went ‘A-A-A-A-A-A-A-A-B-B-B-B-B-B-B-B-C-C-C-C-C-C-C’, the transition from the last ‘A’ to the first ‘B’ would take longer to learn than if the pattern were simply ‘A-B-C-A-B-C’. I’d say if you don’t have more data then set the pattern to repeat many times, since more repetitions will be needed to learn longer or more complex patterns.
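The difference between the two kinds of sequence can be shown with a plain lookup table. This is not how the temporal memory works internally; the function names and the shortened run length of 4 are assumptions for illustration. The sketch only counts how much preceding context is needed before the next symbol becomes unambiguous.

```python
from collections import defaultdict

def successors(sequence, order=1):
    """Map each context (the last `order` symbols) to the symbols seen next."""
    table = defaultdict(set)
    cyclic = sequence + sequence[:order]  # wrap around: the data loops
    for i in range(len(sequence)):
        context = tuple(cyclic[i:i + order])
        table[context].add(cyclic[i + order])
    return table

def ambiguous_contexts(sequence, order=1):
    """Number of contexts followed by more than one possible symbol."""
    table = successors(sequence, order)
    return sum(1 for nxt in table.values() if len(nxt) > 1)

simple = list("ABC")
repeated = list("AAAABBBBCCCC")  # shorter runs than in the post, same idea

print("A-B-C, order 1:      ", ambiguous_contexts(simple, 1))
print("runs of 4, order 1:  ", ambiguous_contexts(repeated, 1))
print("runs of 4, order 3:  ", ambiguous_contexts(repeated, 3))
print("runs of 4, order 4:  ", ambiguous_contexts(repeated, 4))
```

In ‘A-B-C’ one symbol of context is enough. With runs of identical values, ‘A’ is followed sometimes by ‘A’ and sometimes by ‘B’, and the ambiguity only disappears once the context is as long as the run. Learning that much context is what takes the extra repetitions.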
@sheiser1 @rhyolight So after digging deep into my code, I found I was computing the predicted column indices for the anomaly algorithm input incorrectly.
Do you think this is the right way to obtain them?
# Get the predicted columns from the predictive cells in the TM
prevpredictedCells = tm.getPredictiveCells()
dictpredicted = tm.mapCellsToColumns(prevpredictedCells)
prevpredictedColumnsIndices = dictpredicted.keys()
So prevpredictedColumnsIndices is the input to the anomaly algorithm on the next iteration, as explained here
However, I’m still getting unstable anomaly scores for my SDR data as well as for data from the hotgym and nyctaxi examples. I changed my encoder to take scalars and dates, and my anomaly scores differ from those of the gym and taxi OPF models.
For all these examples I iterated over the data many times and still got very unstable results.
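For what it's worth, the bookkeeping described here can be sketched without NuPIC at all. The function below is a plain-Python stand-in for the raw anomaly score (the fraction of currently active columns that were not predicted on the previous step); the name `raw_anomaly_score` and the hard-coded column lists are made up for the example, and the lists stand in for what the spatial pooler and temporal memory would return.

```python
def raw_anomaly_score(active_columns, prev_predicted_columns):
    """Fraction of active columns that the previous step did not predict."""
    active = set(active_columns)
    if not active:
        return 0.0
    predicted = set(prev_predicted_columns)
    return len(active - predicted) / float(len(active))

# Each entry: (columns active now, columns predicted for the NEXT step).
# Hypothetical values standing in for SP / TM output.
steps = [
    ([1, 2, 3, 4], [5, 6, 7, 8]),
    ([5, 6, 7, 8], [9, 10, 11, 12]),
    ([9, 10, 20, 21], []),
]

prev_predicted = []  # nothing has been predicted before the first step
scores = []
for active, predicted_next in steps:
    # Compare what is active NOW with what was predicted one step AGO ...
    scores.append(raw_anomaly_score(active, prev_predicted))
    # ... and only then store this step's predictions for the next iteration.
    prev_predicted = predicted_next

print(scores)  # [1.0, 0.0, 0.5]
```

Note that the very first step always scores 1.0, because nothing has been predicted yet, and that using the current step's predictions instead of the previous step's would compare the wrong pair of column sets.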
I’m not familiar enough with the guts of the code to say exactly, so what I’d do is look at the hotgym example ‘run.py’ file that came with NuPIC, and find where the differences are between that code and yours. If you’re feeding in the same data then you should be getting the same results, assuming the model params are the same.