Thursday 26 March 2020

AI and Cybersecurity. Part 4 - Clustering URLs

In Part 3, we applied feature scaling and dimensionality reduction to the dataset of phishing and benign URLs. As a result, we could clearly see how the URLs are distributed between the two classes based on four attributes: registrar, country, lifetime, and protocol.

But what if we don’t have labels (phishing or benign) for the URLs to begin with? Will ML still be able to detect phishing attacks? In this case, we can turn to unsupervised learning, in particular clustering. Clustering groups objects of unknown classes by their common features, so no labeled training set is needed.
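To make the idea concrete, here is a minimal sketch of k-means clustering written with the Python standard library only. The eight URL records are made-up toy values, not data from this series, and the two features (domain lifetime in days and a 0/1 flag for HTTPS) are an assumed encoding of the "lifetime" and "protocol" attributes. The helper names `min_max_scale` and `k_means` are hypothetical; in practice a library such as scikit-learn would likely be used instead.

```python
import math
import random

def min_max_scale(rows):
    """Rescale every column to [0, 1] so that no feature dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # avoid division by zero
    return [[(v - lo) / s for v, lo, s in zip(row, lows, spans)] for row in rows]

def k_means(points, k, iterations=20, seed=0):
    """Plain k-means: alternate between assigning points and moving centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy, unlabeled records (assumed values): [lifetime_days, uses_https]
urls = [
    [12, 0], [30, 0], [7, 0], [21, 0],
    [3650, 1], [2900, 1], [4100, 1], [1800, 1],
]

labels, centroids = k_means(min_max_scale(urls), k=2)
print(labels)  # cluster index assigned to each record
```

The cluster indices carry no meaning on their own: the algorithm only reports that the records fall into two groups. Deciding which group, if any, corresponds to phishing still requires an analyst or a few known examples.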

Wednesday 18 March 2020

AI and Cybersecurity. Part 3 - Dimensionality Reduction and Feature Scaling

In the previous post, we created a binary classifier for detecting phishing URLs. Here, we're going to continue exploring the data with visualization techniques.

Monday 16 March 2020

AI and Cybersecurity. Part 2 - Detecting Phishing URLs with ML

In Part 1, we got acquainted with AI paradigms and the main ML approaches: supervised, unsupervised, and reinforcement learning. Unsupervised learning looks more attractive because the training data does not need to be labeled in advance, but once enough labeled data is available, supervised learning can be a more precise instrument for detecting malicious objects such as phishing URLs.

Saturday 14 March 2020

AI and Cybersecurity. Part 1 - Intro

[Author: Alexander Adamov]

I have spent almost all my professional life working in the antivirus industry, detecting and analyzing malware. Around ten years ago, when the malware flow had increased so much that my colleagues and I did not have enough resources to analyze all of it, we started thinking about automating our efforts. How could we build a machine that autonomously detects and analyzes malware and phishing URLs day and night, and writes and publishes reports? As a result, we managed to create a robot (what we now call a 'malware sandbox') from scratch to automate most of the processes in the malware laboratory with the help of His Majesty Artificial Intelligence (AI). Since then, we have accumulated a number of ML use cases for cyberattack detection, malware analysis, and security testing that can be useful for cybersecurity professionals who have decided to leverage ML for cyberdefense. I'm going to share this knowledge in a series of blog posts that will eventually become part of a new university course, 'ML in Cybersecurity', that I plan to make open-source. I also welcome cybersecurity experts and data scientists to contribute and help universities adopt the course.