Feature Selection Algorithm Based on Correlation between Multi Metric Network Traffic Flow Features

Yongfeng Cui1,2, Shi Dong1,2,3, and Wei Liu2

1School of Computer Science and Technology, Huazhong University of Science and Technology, China

2School of Computer Science and Technology, Zhoukou Normal University, China

3Department of Computer Science and Engineering, Washington University in St Louis, USA

Abstract: Traffic identification has been a hot research topic in recent years. To overcome the shortcomings of port-based identification and Deep Packet Inspection (DPI), machine learning algorithms have gained wide attention. However, current research focuses on traffic identification based on full-packet datasets, which makes it a great challenge to identify online traffic flows. One way to overcome this shortcoming is to treat sampled flow records as the identification object. In this paper, the flow-record dataset NOC_SET is constructed, and the inherent NETFLOW metrics and extended flow metrics are used as features. The paper proposes a feature selection algorithm, MSAS, to select features with high correlation, and classical machine learning algorithms are then used to identify traffic. Experimental results show that machine learning flow identification based on sampled flow records achieves almost the same identification results as methods based on full-packet datasets, and that the proposed MSAS algorithm can improve application identification results.
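
The abstract does not give the details of MSAS, so the following is only a minimal sketch of the general idea of correlation-based feature selection, not the authors' algorithm. It assumes a simple filter that ranks each flow metric by the absolute Pearson correlation between its values and a numeric class label and keeps the top k. The feature names (`packets`, `bytes`, `duration`), the toy records, and the helper names `pearson` and `select_features` are hypothetical and are not taken from NOC_SET.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column carries no information; treat it as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(records, labels, k):
    """Rank features by |correlation with the label| and keep the top k.

    This is an illustrative filter, not the MSAS algorithm from the paper.
    """
    names = list(records[0].keys())
    scores = {
        name: abs(pearson([r[name] for r in records], labels))
        for name in names
    }
    ranked = sorted(names, key=lambda name: scores[name], reverse=True)
    return ranked[:k], scores


# Hypothetical toy flow records (not real NOC_SET data).
records = [
    {"packets": 10, "bytes": 1000, "duration": 5},
    {"packets": 12, "bytes": 1200, "duration": 1},
    {"packets": 100, "bytes": 90000, "duration": 4},
    {"packets": 120, "bytes": 110000, "duration": 2},
]
labels = [0, 0, 1, 1]  # assumed binary application class per flow

selected, scores = select_features(records, labels, k=2)
print(selected)
```

In this toy data, `duration` is uncorrelated with the label, so the filter keeps `packets` and `bytes`; the selected columns would then be passed to a classifier.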

 

Keywords: Port identification, deep packet inspection, netflow flow, machine learning.

Received February 5, 2014; accepted April 2, 2015
