A Bi-Level Text Classification Approach for SMS Spam Filtering and Identifying Priority Messages
Naresh Kumar Nagwani
Department of Computer Science and Engineering, National Institute of Technology Raipur, India.
Abstract: Short Message Service (SMS) traffic is increasing day by day, and trillions of SMS messages are sent and received by billions of users every day. Spam messages are increasing in the same proportion. A number of recent advances have been made in the field of SMS spam detection and filtering. The objective of this work is twofold: first, to identify and classify spam messages in a collection of SMS messages, and second, to identify the priority (important) messages among the filtered non-spam messages. The broader aim is to categorize SMS messages so that they can be managed and handled effectively. The work is organized as two levels of binary classification: at the first level, SMS messages are categorized into two classes, spam and non-spam, using popular binary classifiers; at the second level, the non-spam messages are further categorized into priority and normal messages. Four popular state-of-the-art text classification techniques, namely Naïve Bayes, Support Vector Machine, Latent Dirichlet Allocation, and Non-negative Matrix Factorization, are used to categorize the SMS text messages at the different levels of classification. The proposed bi-level classification model is evaluated using the performance measures accuracy and F-measure. Combinations of classifiers at both levels are compared, and the experiments show that the Support Vector Machine algorithm performs better than the other techniques for both filtering spam messages and categorizing priority messages.
Keywords: SMS spam, priority SMS, important SMS, SMS spam filtering, bi-level binary classification.
Received March 24, 2014; accepted August 13, 2014
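
The bi-level scheme described in the abstract can be illustrated with a minimal sketch. It is not the implementation used in this work: it assumes a small hand-written multinomial Naïve Bayes classifier at both levels (chosen only because it fits in a few lines of standard-library Python), a whitespace tokenizer, and an invented six-message training set. The names `TinyNaiveBayes`, `BiLevelClassifier`, `accuracy`, `f_measure` and `TRAIN` are hypothetical and introduced here for illustration only.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class BiLevelClassifier:
    """Level 1: spam vs. non-spam. Level 2: priority vs. normal (non-spam only)."""

    def __init__(self):
        self.level1 = TinyNaiveBayes()
        self.level2 = TinyNaiveBayes()

    def fit(self, texts, labels):
        # labels are assumed to be one of "spam", "priority", "normal"
        level1_labels = ["spam" if y == "spam" else "non-spam" for y in labels]
        self.level1.fit(texts, level1_labels)
        non_spam = [(t, y) for t, y in zip(texts, labels) if y != "spam"]
        self.level2.fit([t for t, _ in non_spam], [y for _, y in non_spam])
        return self

    def predict(self, text):
        if self.level1.predict(text) == "spam":
            return "spam"
        return self.level2.predict(text)


def accuracy(y_true, y_pred):
    """Fraction of messages whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred, positive):
    """Harmonic mean of precision and recall for the class `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented toy data, for illustration only (not taken from any SMS corpus).
TRAIN = [
    ("win a free prize now", "spam"),
    ("free cash offer click now", "spam"),
    ("urgent meeting with boss at nine", "priority"),
    ("your bank otp code is 1234", "priority"),
    ("see you at lunch", "normal"),
    ("movie tonight with friends", "normal"),
]

model = BiLevelClassifier().fit([t for t, _ in TRAIN], [y for _, y in TRAIN])
for message in ["free prize offer", "urgent meeting at nine", "lunch with friends"]:
    print(message, "->", model.predict(message))
```

Because the second-level classifier is trained only on non-spam messages, either level can be swapped for a different binary classifier (for example a Support Vector Machine) without touching the other, which is what makes comparing classifier combinations across the two levels straightforward.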