Hybrid Support Vector Machine based Feature Selection Method for Text Classification

Thabit Sabbah¹, Mosab Ayyash¹, and Mahmood Ashraf²

¹Faculty of Technology and Applied Sciences, Al-Quds Open University, Palestine

²Department of Computer Science, Federal Urdu University, Pakistan

Abstract: Automatic text classification is an effective way to organize the growing volume of online textual content. High dimensionality, however, remains a considerable impediment in this field: although many statistical methods are available to address it, none has proved effective enough. This paper proposes a machine learning based feature ranking and selection method named Support Vector Machine based Feature Ranking Method (SVM-FRM). The proposed method uses the Support Vector Machine (SVM) learning algorithm to weight and select the significant features in order to obtain better classification performance. Hybridization techniques are then applied to enhance the performance of SVM-FRM in some experimental situations. SVM-FRM and its enhancement are tested on three public text classification datasets, and the results are compared with those of statistical feature selection methods currently used for this purpose. The evaluation shows that SVM-FRM achieves higher F-measure and accuracy on balanced datasets. Moreover, the proposed hybridization techniques yield a noticeable performance improvement on an unbalanced dataset.

Keywords: Feature ranking, text classification, feature selection, SVM-based weighting, hybridization, dimensionality reduction.

Received February 12, 2018; accepted April 22, 2018
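
The abstract describes weighting and selecting features with an SVM but gives no algorithmic detail. As an illustration only, the following minimal sketch shows one common way such a ranking can be done: train a linear SVM, then rank terms by the absolute value of their learned weights and keep the top k. This is an assumption about the general idea, not the SVM-FRM procedure from the paper. The toy documents, the Pegasos-style trainer, and all function names (bag_of_words, train_linear_svm, select_top_k) are hypothetical.

```python
import random
from collections import Counter


def bag_of_words(docs):
    """Build a sorted vocabulary and a dense term-count matrix."""
    vocab = sorted({tok for d in docs for tok in d.lower().split()})
    index = {t: i for i, t in enumerate(vocab)}
    rows = []
    for d in docs:
        row = [0.0] * len(vocab)
        for term, count in Counter(d.lower().split()).items():
            row[index[term]] = float(count)
        rows.append(row)
    return vocab, rows


def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style subgradient descent on the regularized hinge loss.

    Labels must be +1 or -1. No bias term is used (a simplification).
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Shrink weights (gradient of the L2 regularizer).
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Hinge-loss update only when the margin is violated.
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def select_top_k(vocab, w, k):
    """Rank terms by |weight| (descending) and return the top k terms."""
    ranked = sorted(range(len(vocab)), key=lambda j: abs(w[j]), reverse=True)
    return [vocab[j] for j in ranked[:k]]


# Toy two-class corpus (hypothetical data, not from the paper).
docs = [
    "the team won the match",
    "the player scored a goal",
    "the match ended with a goal",
    "the bank raised the rate",
    "the market fell on the rate news",
    "the bank reported a loss",
]
labels = [1, 1, 1, -1, -1, -1]

vocab, X = bag_of_words(docs)
weights = train_linear_svm(X, labels)
selected = select_top_k(vocab, weights, k=5)
print(selected)
```

In a sketch like this, terms with large absolute weights are the ones the linear separator relies on most, so keeping only those terms reduces dimensionality before a final classifier is trained. How the paper combines this ranking with statistical methods in its hybridization step is not stated in the abstract and is not modeled here.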
