A Bi-Level Text Classification Approach for SMS Spam Filtering and Identifying
Priority Messages
Naresh Kumar Nagwani
Department of Computer Science and Engineering, National Institute of Technology
Raipur, India.
Abstract:
Short Message Service (SMS) traffic is increasing day by day, with trillions of
SMS messages sent and received by billions of users every day. Spam messages are
increasing in the same proportion, and a number of recent advances have been
made in the field of SMS spam detection and filtering. The objective of this
work is twofold: first, to identify and classify spam messages in a collection
of SMS messages, and second, to identify the priority (important) messages among
the filtered non-spam messages, so that SMS messages can be managed and handled
effectively. The work is organized as two levels of binary classification. At
the first level, SMS messages are categorized into two classes, spam and
non-spam, using popular binary classifiers. At the second level, the non-spam
messages are further categorized into priority and normal messages. Four popular
state-of-the-art text classification techniques, namely naïve Bayes, support
vector machine, latent Dirichlet allocation, and non-negative matrix
factorization, are used to categorize the SMS text messages at the different
levels of classification. The proposed bi-level classification model is
evaluated using the performance measures accuracy and F-measure. Combinations of
classifiers at both levels are compared, and the experiments show that the
support vector machine algorithm performs better than the other techniques for
both filtering spam messages and categorizing priority messages.
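The two-level scheme described in the abstract can be illustrated with a
minimal sketch. It assumes a small hand-written multinomial naïve Bayes
classifier (one of the four techniques named above) at both levels. The class
NaiveBayes, the function bi_level_classify, and the toy messages are all
hypothetical and invented for illustration; they are not the paper's
implementation, features, or data.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.label_counts = Counter(labels)      # label -> number of messages
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        # Tokens never seen in training are ignored.
        tokens = [t for t in text.lower().split() if t in self.vocab]
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_docs / total_docs)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def bi_level_classify(text, spam_clf, priority_clf):
    """Level 1: spam vs. non-spam. Level 2 (non-spam only): priority vs. normal."""
    if spam_clf.predict(text) == "spam":
        return "spam"
    return priority_clf.predict(text)


# Hypothetical toy training data, invented for illustration only.
level1 = [
    ("win a free prize now", "spam"),
    ("free cash offer claim now", "spam"),
    ("meeting moved to noon", "non-spam"),
    ("your bank otp is 4821", "non-spam"),
    ("see you at lunch", "non-spam"),
    ("urgent call the hospital", "non-spam"),
]
level2 = [
    ("your bank otp is 4821", "priority"),
    ("urgent call the hospital", "priority"),
    ("meeting moved to noon", "normal"),
    ("see you at lunch", "normal"),
]

spam_clf = NaiveBayes().fit(*zip(*level1))
priority_clf = NaiveBayes().fit(*zip(*level2))

for message in ["free prize offer", "urgent bank otp", "see you at noon"]:
    print(message, "->", bi_level_classify(message, spam_clf, priority_clf))
```

Only messages that pass the first-level filter reach the second classifier, so
the second model is trained on non-spam messages alone. Any of the other
techniques named in the abstract could be substituted at either level, provided
it exposes the same fit and predict interface assumed here.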
Keywords:
SMS spam, priority SMS, important SMS, SMS spam filtering, bi-level binary
classification.
Received March 24, 2014; accepted August 13, 2014