Employing Machine Learning Algorithms to Detect Unknown Scanning and Email Worms


Shubair Abdulla¹, Sureswaran Ramadass², Altyeb Altaher Altyeb³, and Amer Al-Nassiri⁴
¹ Instructional and Learning Technologies Department, Sultan Qaboos University, Oman
²,³ NAV6 Center of Excellence, Universiti Sains Malaysia, Malaysia
⁴ IT College, Ajman University of Science and Technology, UAE
 
 
Abstract: We present a worm detection system that combines the reliability of IP flow data with the effectiveness of machine learning classifiers. Typically, a host infected by a scanning or email worm initiates a significant amount of traffic without first using DNS to translate names into numeric IP addresses. Based on this observation, we capture and classify NetFlow records to extract a feature pattern for each PC on the network within a given time window. A feature pattern comprises the number of DNS requests, the number of DNS responses, the number of DNS normals, and the number of DNS anomalies. Two learning algorithms, K-Nearest Neighbors (KNN) and Naïve Bayes (NB), are used for classification. Two statistical procedures, cross-validation and the paired t-test, are applied to compare the performance of KNN and NB. Classification accuracy, false alarm rate, and training time serve as the performance metrics for deciding which algorithm is superior. The data set used to train and test the algorithms was created from 18 real-life worm variants together with a large number of benign flows.
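The comparison step named in the abstract can be illustrated with a minimal sketch of the paired t-test, assuming each classifier has been scored on the same cross-validation folds. The function name `paired_t_statistic` and its argument names are hypothetical and are not taken from the paper; the sketch uses only the standard formula t = mean(d) / (s_d / sqrt(n)) over the per-fold differences d.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for two paired score lists.

    scores_a and scores_b are hypothetical per-fold scores (for example,
    accuracies) of two classifiers evaluated on the SAME folds.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both classifiers must be scored on the same folds")
    n = len(scores_a)
    if n < 2:
        raise ValueError("at least two folds are needed")

    # Per-fold differences; pairing removes fold-to-fold variation.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]

    # Sample standard deviation of the differences (n - 1 denominator).
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("differences are constant; t is undefined")

    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1
```

The returned statistic would then be compared against a t distribution with n - 1 degrees of freedom; that lookup is left out of the sketch.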



Keywords: IP flow, NetFlow, NB, KNN, scanning worms, email worms.
 
Received September 27, 2011; accepted May 22, 2012
  
