From Python Scikit-learn to Scala Apache Spark—The Road to Uncovering Botnets


The landscape of security threats an enterprise faces is vast. It is imperative for an organization to know when one of the machines within the network has been compromised. One layer of detection can take advantage of the DNS requests made by machines within the network. A request to a Command & Control (CNC) domain can act as an indication of compromise. It is thus advisable to find these domains before they come into play. The team at Akamai aims to do just that.

In this session, Aminov will share Akamai’s experience in porting their proof-of-concept (PoC) detection algorithms, written in Python, to a reliable production-level implementation using Scala and Apache Spark. He will specifically cover an algorithm they developed to detect botnet domains based on passive DNS data. The session will also include insights Akamai has gained while handing off solutions from research to development, including the transition from small-scale to large-scale data consumption, model export/import using PMML, and sampling techniques. This information is valuable for researchers and developers alike.

Session hashtag: #SFexp15

Avi Aminov, Machine Learning Scientist at Akamai Technologies

About Avi

Avi Aminov is a Machine Learning Expert at Akamai Technologies. He is part of the Enterprise Threat Research team, which is responsible for threat intelligence and reconnaissance based on DNS behavior.

Avi has almost a decade of experience in security-related R&D, focusing on fraud detection. Prior to working for Akamai, he pursued a PhD in Physics at the Israel Institute of Technology.

In his free time, Avi likes to dance and teach Salsa.