Friday 2 October 2020

Reinforcement Learning for Anti-Ransomware Testing



ML models have proven themselves to be a powerful tool for cyber defense. AI/ML is heavily used in antivirus and EDR products, next-generation firewalls, and SIEM/SOAR solutions, where supervised and unsupervised learning are applied to classification problems and to detecting anomalous behavior that may indicate the presence of an attacker. Deep learning helps filter spam emails and flag fake news to protect users against disinformation [1].

So what about AI in security testing, such as penetration and anti-malware testing? One interesting application is the work ‘Autonomous Penetration Testing using Reinforcement Learning’ by J. Schwartz of the University of Queensland [2], which proposes using Reinforcement Learning (RL) in network penetration testing to find the optimal attack path.

We decided to investigate how ML can be applied to testing anti-ransomware solutions. In particular, we chose RL to generate ransomware models and to find the optimal test strategy for bypassing an anti-ransomware solution.

RL is well known as a way to train a computer to play strategy games. For example, AlphaGo Zero, a program designed by DeepMind, reached superhuman strength at Go by playing many games against other instances of itself and using RL to improve its play [3].
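The post does not name the RL algorithm behind the results. Purely as an illustration, tabular Q-learning is one common way to learn a Q(S, A) table of the kind shown in the results. The sketch below applies a single update step; the function name, the learning rate `alpha`, the discount factor `gamma`, and the toy values are all assumptions made for this example, not details taken from the research.

```python
def q_learning_update(q, action, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step in place and return the new value.

    q is indexed as q[action][state] (actions as rows, states as columns).
    alpha and gamma are illustrative defaults, not values from the post.
    """
    # Value of the best action available in the next state.
    best_next = max(row[next_state] for row in q)
    # Move the current estimate a fraction alpha toward the TD target.
    td_target = reward + gamma * best_next
    q[action][state] += alpha * (td_target - q[action][state])
    return q[action][state]


# Toy table with 2 actions and 2 states (hypothetical numbers).
q = [
    [0.0, 1.0],
    [0.0, 2.0],
]
new_value = q_learning_update(q, action=1, state=0, reward=1.0, next_state=1)
print(new_value)
```

Repeating this update over many simulated episodes is what gradually fills such a table; how rewards and states would be defined for an anti-ransomware test is not specified here.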

Download the slides.
Ransomware simulator demo.

Some of the research results are highlighted below:

Q(S, A): rows are actions, columns are states. Column 10 is the threshold state ("all files are encrypted") and holds no Q-values.

Action   0         1         2         3         4         5         6         7         8         9
0        3.742632  0.787156  0.50037   0         0         1.007811  0.119928  2.686471  1.771443  0
1        7.105196  1.013578  1.252902  0.625865  0         5.491767  0         0.110781  0.020577  0
2        7.545032  3.370516  1.118979  0.296561  0         1.170941  0.170201  0.147976  0.215866  0.763711
3        6.681859  1.036887  1.431128  0         0         1.178761  0.10031   0.526232  0.019609  0.1805
4        3.948145  0         0         0.231532  0.120758  0.790898  0.599471  0.871498  0.483216  0
5        5.816442  0         1.158697  0.693802  0         1.778738  0.715084  0.205816  0.029846  0
6        3.734055  8.923577  2.616297  0         0         0.747448  0.252802  0.236518  0.020577  0
7        7.022468  0.056858  1.931174  0.563466  0         1.003211  0.213173  0.198259  0.377146  0
8        4.158711  0.963861  1.045184  0         0         0.459586  0.590549  0.028423  0.28698   0.01805
9        5.872631  0.463353  0.387103  0         1.524341  1.143632  0.358465  0.771859  0.112953  0.119148
10       5.959267  0.9285    8.530923  1.739734  0         1.55761   0.141551  0.244138  0.172354  0
11       6.141237  0         0.768712  0         0.144812  1.640483  0.149433  0.418004  0.055891  0.068169
12       5.179697  1.992647  0.713766  0         0.443053  0.652553  0.226234  0.173608  0.372956  0.089291
13       3.181544  1.945088  0.629294  0         0.422005  1.051774  2.283579  0.231025  0.166056  0
14       14.11996  2.516235  1.889826  3.106087  0.087103  0.705648  0.275448  0.247926  0.20654   0
15       5.828471  0.885469  0         0.163353  0         1.260362  0.080788  0.186429  0.010403  0.07444
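One way to read the Q(S, A) table above is to pick, for each state, the action with the largest Q-value (a greedy policy). The sketch below does that in plain Python. It assumes the layout read from the table header: 16 actions as rows, states 0 to 9 as columns, and the threshold column left out because it carries no values. The function name `greedy_policy` is introduced here for illustration, and the post does not say what the individual actions or states represent.

```python
# Q-values transcribed from the table above, indexed as Q[action][state]:
# 16 actions (rows) by 10 states (columns 0-9).
Q = [
    [3.742632, 0.787156, 0.50037, 0, 0, 1.007811, 0.119928, 2.686471, 1.771443, 0],
    [7.105196, 1.013578, 1.252902, 0.625865, 0, 5.491767, 0, 0.110781, 0.020577, 0],
    [7.545032, 3.370516, 1.118979, 0.296561, 0, 1.170941, 0.170201, 0.147976, 0.215866, 0.763711],
    [6.681859, 1.036887, 1.431128, 0, 0, 1.178761, 0.10031, 0.526232, 0.019609, 0.1805],
    [3.948145, 0, 0, 0.231532, 0.120758, 0.790898, 0.599471, 0.871498, 0.483216, 0],
    [5.816442, 0, 1.158697, 0.693802, 0, 1.778738, 0.715084, 0.205816, 0.029846, 0],
    [3.734055, 8.923577, 2.616297, 0, 0, 0.747448, 0.252802, 0.236518, 0.020577, 0],
    [7.022468, 0.056858, 1.931174, 0.563466, 0, 1.003211, 0.213173, 0.198259, 0.377146, 0],
    [4.158711, 0.963861, 1.045184, 0, 0, 0.459586, 0.590549, 0.028423, 0.28698, 0.01805],
    [5.872631, 0.463353, 0.387103, 0, 1.524341, 1.143632, 0.358465, 0.771859, 0.112953, 0.119148],
    [5.959267, 0.9285, 8.530923, 1.739734, 0, 1.55761, 0.141551, 0.244138, 0.172354, 0],
    [6.141237, 0, 0.768712, 0, 0.144812, 1.640483, 0.149433, 0.418004, 0.055891, 0.068169],
    [5.179697, 1.992647, 0.713766, 0, 0.443053, 0.652553, 0.226234, 0.173608, 0.372956, 0.089291],
    [3.181544, 1.945088, 0.629294, 0, 0.422005, 1.051774, 2.283579, 0.231025, 0.166056, 0],
    [14.11996, 2.516235, 1.889826, 3.106087, 0.087103, 0.705648, 0.275448, 0.247926, 0.20654, 0],
    [5.828471, 0.885469, 0, 0.163353, 0, 1.260362, 0.080788, 0.186429, 0.010403, 0.07444],
]


def greedy_policy(q):
    """Return, for each state, the index of the action with the largest Q-value."""
    n_actions = len(q)
    n_states = len(q[0])
    return [max(range(n_actions), key=lambda a: q[a][s]) for s in range(n_states)]


policy = greedy_policy(Q)
for state, action in enumerate(policy):
    print(f"state {state}: best action {action} (Q = {Q[action][state]})")
```

Under these assumptions, action 14 has the highest value in state 0 (14.11996), which is also the largest entry in the whole table.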

References

[1] Attention is All They Need: Combatting Social Media Information Operations With Neural Language Models. FireEye, 2019. Available at https://www.fireeye.com/blog/threat-research/2019/11/combatting-social-media-information-operations-neural-language-models.html 

[2] Schwartz J. Autonomous Penetration Testing using Reinforcement Learning. University of Queensland, 2018. Available at https://arxiv.org/ftp/arxiv/papers/1905/1905.05965.pdf

[3] Mastering the game of Go without human knowledge. DeepMind, 2017. Available at https://www.nature.com/articles/nature24270.epdf
