Hi all, I am planning to develop a DDoS attack detection model from network traffic. First, is it possible to use HTM for this kind of problem, given that a DDoS attack can be detected using several properties of the network traffic?
Thank You
Hi ism1n! Welcome to the forum.
If you want a ready-to-deploy solution, get Grok. If not, you can find anomalous traffic using HTM (traffic not showing up, or too much traffic). The beauty of HTM is that it learns while it performs inference, so you don’t need to update your model every time the traffic grows.
Hi Marty,
Thanks for the reply. I did check out Grok and it is a good product. I am currently a research student and am considering using the HTM framework in my research to model this problem.
Thanks again
Welcome @osm1n and very cool application! If you’d like any help thinking through how to formulate the problem for HTM we’d be glad to, and I for one am really curious how well your system performs at detection.
What kind of dataset are you using for DDoS attack detection?
Thanks
That would be great, and I would really appreciate it. The plan is to use a dataset and a stream of network traffic coming from connected devices for online detection of the attacks, then compare the performance of HTM against other unsupervised techniques used to detect these kinds of attacks.
Thank You
I am currently looking at this dataset here
I have tried using the dataset with HTM Studio, but it seems like it won’t work since it’s a multivariate dataset. I am not sure, though.
Would you care to specify which file(s) in which folder(s) you’re referring to? I have my own homemade sort of HTM Studio for multivariate data. If you know when the real attack(s) occurred, I could run it through and evaluate the anomalies that way.
Awesome. Any of these files with the entire features here
Afterwards this file with the 10 best final features
In the datasets, attack and normal traffic are mixed together, with attack traffic marked as 1 in the attack column.
Thanks
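Since the attack column gives a 0/1 label per record, anomaly flags from any detector could be scored against it. Below is a minimal sketch of that comparison, assuming the detector output has already been thresholded into 0/1 flags aligned row by row with the labels. The function name and the sample values are made up for illustration and are not taken from the dataset.

```python
from typing import Sequence, Tuple


def precision_recall(labels: Sequence[int], flags: Sequence[int]) -> Tuple[float, float]:
    """Compare 0/1 attack labels with 0/1 anomaly flags, row by row."""
    if len(labels) != len(flags):
        raise ValueError("labels and flags must have the same length")
    pairs = list(zip(labels, flags))
    true_pos = sum(1 for label, flag in pairs if label == 1 and flag == 1)
    false_pos = sum(1 for label, flag in pairs if label == 0 and flag == 1)
    false_neg = sum(1 for label, flag in pairs if label == 1 and flag == 0)
    # Guard against division by zero when nothing was flagged or nothing was an attack.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Hypothetical values: 1 = attack in `attack`, 1 = flagged as anomalous in `flagged`.
attack = [0, 0, 1, 1, 1, 0, 0, 1]
flagged = [0, 1, 1, 1, 0, 0, 0, 1]
precision, recall = precision_recall(attack, flagged)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Per-record scoring like this is strict: an HTM model may flag only the onset of an attack, so scoring per attack window may be a fairer comparison.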
After staring at the logs for a bit, it looks like you need to transform the data a bit before applying HTM to it. Good luck!
Hey @marty1885, would you mind explaining this a bit? Like what is it about the data that calls for transformation(s)? And what kind of transformation(s) do you think would be best? Thanks, just curious for your intuitions on this.
Sure!!
First, a bit of know-how: this comes from domain knowledge, and my co-worker has been working on almost the same problem for the past 6 months.
The subcategory columns are likely to be guessed information; I should probably ignore them.
pkSeqID looks weird. I started out guessing it is the TCP packet sequence number, but UDP shouldn’t have one and that number is getting too large. I guess they are generic IDs. Ignoring them should be a good idea.
So, I’ll propose the following processing steps, waving hands.
Thanks for the detailed explanation. I do have a question, though: what if the network traffic was captured in real time, and a DDoS attack was then carried out using one of the hosts sending the traffic? Is it still possible to use HTM in this scenario?
Thanks again for your time guys.
Thanks for your reply. First, real time != regular intervals, so you still need to apply a rolling window. Secondly, yes you can, but you will have better luck finding DDoS attacks from the server side and then tracing back to the attacker. Monitoring everyone’s outbound traffic is simply noisy and unreliable in most cases. In fact, depending on the data, I might suggest using the number of connections made as the primary detection factor instead of the inbound traffic.
Alright, I am taking your suggestions to heart and getting started on the data cleanup. Thanks.
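To make the rolling-window idea concrete, here is a minimal sketch that turns irregular connection timestamps into a regularly sampled count series, which is the kind of input a scalar encoder expects. The function name, the window and step sizes, and the sample timestamps are all assumptions chosen for illustration; real values would depend on the data.

```python
from bisect import bisect_right
from typing import Iterable, List, Tuple


def rolling_connection_counts(
    timestamps: Iterable[float], window: float, step: float
) -> List[Tuple[float, int]]:
    """Count connections in the interval (t - window, t] for regularly spaced t.

    Returns (t, count) pairs, with t starting at the first timestamp and
    advancing by `step` until the last timestamp is covered.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    stamps = sorted(timestamps)
    if not stamps:
        return []
    series = []
    i = 0
    while True:
        t_end = stamps[0] + i * step
        # Index of the first event strictly after the window start.
        lo = bisect_right(stamps, t_end - window)
        # Index of the first event strictly after the window end.
        hi = bisect_right(stamps, t_end)
        series.append((t_end, hi - lo))
        if t_end >= stamps[-1]:
            break
        i += 1
    return series


# Hypothetical arrival times (in seconds) of new connections seen by the server.
events = [0.0, 0.5, 1.0, 1.25, 1.5, 4.0]
series = rolling_connection_counts(events, window=2.0, step=1.0)
for t, count in series:
    print(t, count)
```

Each count could then be fed to the model one step at a time. Windows with no connections still produce a row, so a sudden silence shows up as a value rather than as a gap.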
When you have the data cleaned I’ll run it through my multivariate anomaly script if you’d like.