Financial Anomaly detection

question

#1

Hi there! I’m new to NuPIC and I need your advice. I want to do anomaly detection on OHLCV time series and (maybe) add current orders/last trades (with last trades I’m not sure how the model will work, because they don’t have concrete time slicing/steps the way candles do). So I need advice on which direction to dig in, a starting point, and some articles on making the model more complex and effective. Maybe someone has experience with such things.


#2

You can start by watching the HTM school series on YouTube. And then start doing your experiments. Have fun and you’ll learn a lot.


#3

Nice start. Thank you!


#4

Hi,
As a first step, to understand how HTM detects anomalies, this link is very helpful. Then you can look at NAB. There are also some videos on YouTube, like the HTM School series that marty mentioned, and Numenta. Here are all the papers of the research team.


#5

So, I watched everything, walked through the tutorials, and I have many questions in the end) I tried to find anomalies in my data and the result was not satisfying :unamused: Also I found this project from Numenta, and it is similar to what I want to implement, but digging into the code didn’t give me explanations (I got stuck in the htmengine code).
The main questions are:
How to use multiple data fields which could have an anomaly simultaneously (price and volume)?
Which encoder to use with scalar data to avoid problems with setting the resolution (I’m trying to use AdaptiveScalarEncoder, but I’m getting an error and am not able to use it)?


#6

Also I was thinking about transforming my absolute data into relative data (because I can’t use AdaptiveScalarEncoder), but I’m not sure it’s worth it while I don’t have any normally working model. My data looks like:

"time","price","low","high","open","close","volume"
1545857700000,0.00000962,0.00000962,0.00000964,0.00000964,0.00000962,8079
1545858000000,0.00000964,0.00000964,0.00000964,0.00000964,0.00000964,2636

And I was thinking about transforming it into the percentage difference between the previous row and the current one. (I used a similar approach for LSTM prediction/classification.)
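A minimal sketch of that percent-change transformation, using the two sample rows above. The function name `percent_changes` and the choice to emit 0.0 when the previous value is zero are assumptions made for illustration, not something from NuPIC.

```python
import csv
import io

SAMPLE = '''"time","price","low","high","open","close","volume"
1545857700000,0.00000962,0.00000962,0.00000964,0.00000964,0.00000962,8079
1545858000000,0.00000964,0.00000964,0.00000964,0.00000964,0.00000964,2636
'''


def percent_changes(rows, fields):
    """Return one dict per row (after the first) holding the percent change
    of each field relative to the previous row."""
    out = []
    prev = None
    for row in rows:
        if prev is not None:
            changed = {"time": row["time"]}
            for name in fields:
                old, new = float(prev[name]), float(row[name])
                # Percent change is undefined when the previous value is 0;
                # this sketch arbitrarily emits 0.0 in that case.
                changed[name] = (new - old) / old * 100.0 if old else 0.0
            out.append(changed)
        prev = row
    return out


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
deltas = percent_changes(rows, ["close", "volume"])
```

The first row has no predecessor, so the output is one row shorter than the input. Percent changes are unbounded in principle, so an encoder with fixed min/max values would still need some clipping range.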


#7

The percent changes idea could help yes, certainly worth trying. I also wonder if there’d be a way to reduce the number of fields you have there without losing too much information. If any of the fields were highly correlated with each other it would be redundant and kind of muddy the waters for the model I believe.

One crude way of testing this could be to do a multiple regression model with all those fields predicting the next ‘price’ value, this would yield ‘VIF’ coefficients for each field, and if any of these were high (>= ~10) it would suggest multicollinearity. Also have you tried swarming? This is NuPIC’s native method for finding optimal parameters including which fields to include (in case you didn’t know that).
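The VIF check can also be sketched without a statistics package. Strictly speaking, the VIF of a field comes from regressing that field on the other input fields (VIF = 1 / (1 - R²)) rather than from the regression on price itself; an equivalent shortcut is the diagonal of the inverse correlation matrix, which is what this sketch uses. The function names are illustrative, and each column is assumed to vary (a constant column would divide by zero).

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    n = len(columns[0])
    means = [sum(c) / float(n) for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centered, norms)]
            for ci, ni in zip(centered, norms)]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("fields are (almost) perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Variance inflation factor of each column against all the others."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[i][i] for i in range(len(columns))]
```

With fields as similar as open, high, low and close, the correlation matrix is close to singular, so very large VIFs (or the `ValueError` above) would themselves be the signal that some fields are redundant.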

It seems to me that a big part of success with HTM (and probably any algorithm) is effectively de-noising the data without losing too much of the real signal. This seems especially needed and difficult within the financial domain due to the inherent high noise and complex signals. Have you considered using larger time increments for instance? Like using the total price change week by week rather than day by day? It just seems like daily changing values could be extremely noisy. I also wonder if there’d be another stock with some inherent relationship to this one (a competing company or something), and you could use the changing ratio between these two instead of the absolute price in dollars.
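A rough sketch of moving to larger time increments, assuming the candles are dicts with the OHLCV fields, are evenly spaced and have no gaps. The name `aggregate_candles` is hypothetical, and a trailing incomplete group is simply dropped.

```python
def aggregate_candles(candles, factor):
    """Merge every `factor` consecutive candles into one coarser candle."""
    out = []
    for i in range(0, len(candles) - factor + 1, factor):
        chunk = candles[i:i + factor]
        out.append({
            "time": chunk[0]["time"],                   # start of the period
            "open": chunk[0]["open"],                   # first open
            "high": max(c["high"] for c in chunk),      # highest high
            "low": min(c["low"] for c in chunk),        # lowest low
            "close": chunk[-1]["close"],                # last close
            "volume": sum(c["volume"] for c in chunk),  # total volume
        })
    return out
```

For example, `aggregate_candles(five_minute_candles, 3)` would turn 5 min candles into 15 min ones.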

I have no particular expertise in modeling financial data, just some considerations from my perspective.


#8

Thanks for the reply. Yes, I’m reading about swarming now, but so far I have no idea how to run it :grinning: I should clarify what exactly I’m doing: the whole idea is to “jump on the train” while a traded asset (a cryptocurrency) is being artificially pumped, so the main fields for the model should be “close” and “volume” (I’m trying to use only these 2 fields for now). Sentiment analysis will not help, because the “pumping” happens through collusion of a group of people. So on the graphs you can (mostly) see “suspicious” activity before the actual “pump” happens (growth in buying volume and price) or at the beginning of the “pump”. But it seems to me there is a high probability that it’s impossible to implement it this way. (You can find an example here: set the time (candle size) to 15 min and scroll back a little.) So before the actual “pump” started there are 2 clear huge buys (2 hours before and 3 hours before), which differ pretty clearly from the other “noisy” parts of the graph.


#9

I think scale may be another challenge for cryptocurrency predictions. From my uninformed perspective, I can see a lot of patterns in crypto valuation graphs that appear to repeat themselves, but they happen at wildly different scales (encoding neither the actual valuation nor the deltas would capture these scale differences IMO). I wonder if an encoder for relative strength index would be able to spot these types of “pump” anomalies? RSI graphs seem to be fairly normalized. I’m not a financially savvy person myself, though, so maybe there is something obviously wrong with that approach :slight_smile:
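For reference, a minimal sketch of RSI over a list of closing prices, using Wilder’s smoothing and the conventional default period of 14. The function names are illustrative, and returning 50.0 for a completely flat window is a choice made here, since conventions differ. Because gains and losses are scaled by the same factor, multiplying every price by a constant leaves the RSI unchanged, which is one sense in which it is “fairly normalized”.

```python
def _rsi_value(avg_gain, avg_loss):
    """Map average gain/loss to the 0-100 RSI scale."""
    if avg_loss == 0:
        # No losses in the window: 100 if there were gains, else flat (50).
        return 100.0 if avg_gain > 0 else 50.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def rsi(closes, period=14):
    """Return RSI values, one per close starting at index `period`."""
    if len(closes) <= period:
        return []
    deltas = [float(b) - float(a) for a, b in zip(closes, closes[1:])]
    gains = [max(d, 0.0) for d in deltas]
    losses = [max(-d, 0.0) for d in deltas]
    # Seed with simple averages over the first `period` changes.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    out = [_rsi_value(avg_gain, avg_loss)]
    # Wilder's smoothing for every later change.
    for gain, loss in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + gain) / period
        avg_loss = (avg_loss * (period - 1) + loss) / period
        out.append(_rsi_value(avg_gain, avg_loss))
    return out
```

The output stays within 0 to 100, so it could be fed to a scalar encoder with fixed min/max values; whether a pump stands out in it is a separate question.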


#10

See also: HTM Application for S&P 500 Analysis


#11

I’m not sure that using any indicators will help much. In my previous work on predicting price, indicators (especially ones like RSI and Bollinger Bands) really did increase performance a bit, BUT unlike the “classic” markets, which are flooded with traders and bots that use classic trading indicators, the cryptocurrency market is mostly used by “charity chasers” and bots trading in a high-frequency manner. The main reason why indicators “work” is that almost every trader/system uses them to decide when to buy or sell. For example, when Bollinger Bands give a sell signal, pro traders/bots who hold a large amount of assets sell them, so the market goes down and the graph repeats itself. I’m sure this is not the main reason for market movements, but there is some truth in it. The cryptocurrency market, however, lives by other, more manipulative laws (especially low-volume assets).


#12

And it really does have clear patterns at different scales (I believe you mean different trade aggregations, like 5 min candles, 30 min candles and so on). I’ve spent a huge number of hours trying to predict prices using ML without any success, but I believe it’s possible for individuals (many trading companies and hedge funds use ML to increase their profits).


#13

So in other words, a crypto “pump” campaign would not show up as any unique temporal pattern when viewed on an RSI graph. I’m sure there are other ways to normalize the data to help with the scale variance problem. Not familiar with the “candles” terminology, so can’t comment on whether or not that refers to the same thing.

– Edit –
To clarify what I am talking about, I see a pattern that looks something like this appear over and over again in crypto value charts:

image

I’ve often wondered if this pattern could be predicted by HTM. The problem is that it seems to occur at wildly different scales (both in the valuation and in time scale). It seems that, since I can identify it visually, it should be possible to devise an encoder that would allow HTM to recognize it as well.


#14

I’m sure it’s possible, but I’m not sure it’s possible for a lone individual with poor knowledge like me :grinning: On your graph you mean something like the trend changing from flat to an uptrend and, after the peak, a downtrend. But this is true for big pairs with millions of USD in daily volume. For smaller ones, your graph looks like what I’m looking for: a “pump”. Numenta has a partner which claims that they are able to predict the start and end of an uptrend on classic markets. So they have definitely spent many professional person-hours on this and are getting some encouraging results. That’s the reason why I gave up on the idea of exact price prediction (even as binary classification: “will the price at the next time step be greater or lower than the current one”) and chose something simpler, like: “oh, something strange is happening here”.
image

I’m trying to spot the GREEN area with anomaly detection, and sell in the RED one without using any complicated ML algorithm (you can achieve this with careful use of stop-losses, but it’s not ideal).


#15

About normalization of the data: I’m not sure I understood you correctly. Do you mean something like scaling the data to between 0 and 1, or using some kind of smoothing on the data to reduce the noise a little? Or something else?


#16

Exactly my thought as well. Anyone can recognize the steep climb once it is well under way, or the peak once the downward slope is well under way. If the system could start to predict these (or just call them out as anomalies) with some degree of reliability, and call them to your attention a bit earlier, that would be really interesting.

The best way I can explain what I am talking about is imagine somebody shows you a symbol or figure in small print that you haven’t seen before. Then later you see that same symbol on a large poster board, taking up a large part of your visual field. You would have no trouble recognizing it. Your visual system was somehow able to establish a coordinate system and result in an encoding that is scale invariant.

Something like that would need to be applied here. Since HTM does not yet have mechanisms for establishing coordinate spaces through movement, this would need to be handled instead by the encoder. You’d want to recognize a particular shape in the graph, even if the values being measured are encompassing a much larger (or smaller) range than previous encounters of that shape.


#17

Could be something as simple as taking average peaks and valleys over some sliding time window, and using that to define the coordinate system (0 - 1 for example). Then encode changes in the currency’s value (compared to USD or BTC for example) as float numbers within that coordinate system. Wouldn’t handle time variance, though (perhaps the scaling factor from the coordinate system could be applied to the sample rate – could have multiple HTM modules sampling at different rates)
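A rough, untested-in-HTM sketch of that sliding-window coordinate system. To keep it short it uses the plain minimum and maximum of the last `window` values instead of averaged peaks and valleys; the name `rolling_minmax` and the 0.5 returned for a flat window are assumptions of this sketch.

```python
from collections import deque


def rolling_minmax(values, window):
    """Rescale each value to 0-1 relative to the min/max of the last
    `window` values (the current value included)."""
    recent = deque(maxlen=window)
    out = []
    for value in values:
        recent.append(value)
        low, high = min(recent), max(recent)
        if high == low:
            # Flat window: no range to scale against, use the midpoint.
            out.append(0.5)
        else:
            out.append(float(value - low) / (high - low))
    return out
```

The same shape at two different price levels maps to the same output: `rolling_minmax([1, 2, 3, 2, 1], 3)` and `rolling_minmax([100, 200, 300, 200, 100], 3)` both give `[0.5, 1.0, 1.0, 0.0, 0.0]`. As noted above, this does nothing about variance in time scale.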


#18

I’m sure that I don’t understand your thought well. Do you mean something like a “descriptive function” of the current pattern which is not affected by the scalar values? Something like normalizing the data not all at once but in slices?

And then feeding the encoder these normalized slices, with values between 0 and 1? So the idea is not to detect exact values which differ from the previous data, but to find curves/shapes which differ enough from previous curves/shapes?


#19

Yes something like that. Still a rough idea and definitely untested. Let me throw together some tests to see how something like this might be set up.


#20

So for now my experiments have ended with nothing :unamused: I tried to run a large swarm on 40k historical records with 3 fields included (time, close, volume), and I was forced to stop the swarm after about 19 hours of work. Now I’m running a swarm on 3k historical records; we’ll see how long it takes…

Also, I’m pretty sure that I need my data to be normalized in order to use an encoder with max and min values (0 - 1). But I will only try that approach if swarming gives poor results.