I have been trying to raise awareness of the importance of blocking bot, suspicious, or dangerous web traffic on sites and IoT devices for more than a year now. However, it seems that no one is listening. Everyone shouts about hacking and major compromises like those at Yahoo and the IoT DDoS attacks. Yet apart from complaining about it, very few individual consumers seem to be actively looking to solve this problem. I was hoping to help the individual consumer rather than the enterprise because I think that might be the new and upcoming market. If you are interested in what I am doing or want to try my solution, please drop me an email. ;)
My Work & Experiments
Anyway, I wanted to share that I have been working and experimenting in my spare time on a solution that approaches the problem from a slightly different angle than the traditional detection approach. For those of you who have not been following my work, I am working on prevention rather than detection.
In my opinion, the current generation of firewalls is no longer effective because it is too static compared to the dynamic techniques used by criminal hackers. I have seen many new startups spring up to tackle this problem, along with big investments in the cyber security space recently (here and here), but I wonder how effective these tools are, or whether they are merely hype. I do hope that the best technology will win out in the long run.
From my experiments, I have come to believe that this problem can be tackled more gracefully if we look at the right data and context. I have applied machine learning to this problem to scale the prevention approach I started out with and stretch it further into the realm of prediction. Technically, my experimental approach with machine learning is rather simple. I handcraft and model a set of high-quality labels to start with. Using the labeled data, I benchmark it against a wide set of dimensions within the web traffic data. The dimensions are slowly filtered out and reduced based on empirical observations over tens of thousands of real-world live data points. This process is repeated until I can weed out the false positives and false negatives to a satisfactory balance of precision and recall. The final product is a machine learning model that I use as an unsupervised system, which I promote as a service, MB™. The results are pretty exciting and interesting so far.
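To make the loop described above a little more concrete, here is a minimal Python sketch. It is not MB™'s actual code. The dimension names, the toy data, the `flag_if_any` stand-in model, and the thresholds are all assumptions for illustration, and the automated greedy pruning stands in for the manual, empirical filtering described above.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, bool]  # hypothetical: one boolean signal per traffic dimension


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (precision, recall) for binary labels where 1 means 'bad traffic'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # bad traffic caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # good traffic flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # bad traffic missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def flag_if_any(row: Row, dims: Sequence[str]) -> int:
    """Toy stand-in for a real model: flag the visit if any kept signal fires."""
    return 1 if any(row[d] for d in dims) else 0


def prune_dimensions(rows: Sequence[Row], labels: Sequence[int], dims: Sequence[str],
                     min_precision: float, min_recall: float) -> List[str]:
    """Greedily drop a dimension whenever both thresholds still hold without it."""
    kept = list(dims)
    for dim in dims:
        trial = [d for d in kept if d != dim]
        if not trial:  # always keep at least one dimension
            break
        preds = [flag_if_any(row, trial) for row in rows]
        precision, recall = precision_recall(labels, preds)
        if precision >= min_precision and recall >= min_recall:
            kept = trial
    return kept


# Hypothetical hand-labeled visits (1 = bot or questionable, 0 = legitimate).
DIMS = ["spam_referrer", "no_user_agent", "night_visit"]
ROWS: List[Row] = [
    {"spam_referrer": True, "no_user_agent": True, "night_visit": True},
    {"spam_referrer": True, "no_user_agent": False, "night_visit": False},
    {"spam_referrer": False, "no_user_agent": True, "night_visit": True},
    {"spam_referrer": False, "no_user_agent": False, "night_visit": True},
    {"spam_referrer": False, "no_user_agent": False, "night_visit": False},
]
LABELS = [1, 1, 1, 0, 0]

if __name__ == "__main__":
    print(prune_dimensions(ROWS, LABELS, DIMS, min_precision=1.0, min_recall=1.0))
```

In this toy data the `night_visit` signal produces a false positive, so it is the dimension that gets dropped. A real system would score each candidate set on held-out traffic rather than on the same labeled rows.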
Two Dangerous Samples
Here are two recent samples that MB™ caught without any human intervention. The data is taken from my network report, recorded as the bots crawled around for information and took part in referral spamming. These machines are typically part of a botnet doing data reconnaissance for their criminal masters.
The web traffic from Syria actually belongs to an organization whose members have links to the Syrian Electronic Army. The second sample came from a South American home user who probably does not even know that his or her computer has been hacked and used as part of a botnet. The clues are not always as obvious as in these two samples, both of which include referral data from well-known referral spammers. MB™ has surprised me many times by picking out such traffic even when no obvious data is immediately available for my validation.
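The "obvious" clue in the two samples above, a referrer from a known referral spammer, can be checked with very little code. The sketch below only illustrates that single signal and is not how MB™ works. The blocklist entries are made-up placeholder domains, and `is_spam_referrer` is a hypothetical helper name.

```python
from typing import Iterable
from urllib.parse import urlsplit

# Hypothetical blocklist; real entries would come from a maintained
# list of referral-spam domains.
KNOWN_SPAM_REFERRERS = frozenset({"spam-referrer.example", "free-traffic.example"})


def is_spam_referrer(referrer: str,
                     blocklist: Iterable[str] = KNOWN_SPAM_REFERRERS) -> bool:
    """Return True if the Referer URL's host is a listed domain or a subdomain of one."""
    host = urlsplit(referrer).hostname  # None when the value has no scheme/host part
    if host is None:
        return False
    host = host.rstrip(".")  # tolerate a trailing dot; hostname is already lower-cased
    return any(host == domain or host.endswith("." + domain) for domain in blocklist)


if __name__ == "__main__":
    for ref in ("http://www.spam-referrer.example/offer?id=1", "https://example.org/"):
        print(ref, is_spam_referrer(ref))
```

Matching on the parsed host rather than on a substring of the whole URL avoids flagging a legitimate site whose path or name merely contains a spammer's domain.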
Preventing these types of web traffic from obtaining information about your website or IoT device is critical because without any information available to the criminal hackers, it is hard for them to quickly decide how to exploit your device, server or site. Trying exploits randomly quickly exposes them to other existing security tools such as malware detection, antivirus and penetration monitoring tools.
I recently deployed MB™ on my Blogger site and the results shocked me. It turns out that almost 90% of all the traffic that came to my blog was from bots, scrapers, or other questionable sources. Below is a summary report extracted from my MB™ dashboard.
I hope to hear from you soon. In the meantime, stay safe online! You can reach me at support@malleablebyte.org