TITLE:
Mobile SMS Spam Filtering for Nepali Text Using Naïve Bayesian and Support Vector Machine
AUTHORS:
Tej Bahadur Shahi, Abhimanu Yadav
KEYWORDS:
SMS Spam Filtering; Classification; Support Vector Machine; Naïve Bayes; Preprocessing; Feature Extraction; Nepali SMS Datasets
JOURNAL NAME:
International Journal of Intelligence Science, Vol.4 No.1, December 17, 2013
ABSTRACT:
Spam is a universal problem with which everyone is
familiar. A number of approaches are used for Spam filtering. The most common
filtering technique is content-based filtering, which uses the actual text of
a message to determine whether it is Spam or not. The content is very dynamic and
it is very challenging to represent all information in a mathematical model of
classification. For instance, in content-based Spam
filtering, the characteristics used by the filter to identify Spam messages
change constantly over time. The Naïve Bayes method represents the changing
nature of messages using probability theory, and the support vector machine (SVM)
represents it using different features. Both classification methods
are efficient in different domains, but the case of Nepali SMS or text
classification has not yet been considered, so it is interesting to compare the
performance of both methods on this problem. In this paper, Naïve
Bayes and SVM-based classification techniques are implemented to classify
Nepali SMS as Spam and non-Spam. An empirical analysis of various text cases
has been done to evaluate the accuracy of the classification methodologies
used in this study. The accuracy is found to be 87.15% for SVM and 92.74%
for Naïve Bayes.
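
The probabilistic idea behind Naïve Bayes filtering can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes whitespace tokenization, a multinomial model with add-one (Laplace) smoothing, and a handful of invented English toy messages standing in for a real Nepali SMS corpus. The function names train_naive_bayes and classify are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(messages, labels):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in zip(messages, labels):
        tokens = text.lower().split()  # assumption: whitespace tokenization
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log posterior score."""
    class_counts, word_counts, vocabulary = model
    total_messages = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        # Log prior: fraction of training messages with this label.
        score = math.log(count / total_messages)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocabulary:
                continue  # ignore words never seen during training
            # Add-one smoothing avoids zero probabilities.
            smoothed = word_counts[label][token] + 1
            score += math.log(smoothed / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Invented toy data for illustration only; not taken from the paper.
toy_messages = [
    "win free prize now",
    "free offer call now",
    "see you at lunch",
    "call me after class",
]
toy_labels = ["spam", "spam", "ham", "ham"]

model = train_naive_bayes(toy_messages, toy_labels)
print(classify("free prize offer", model))
print(classify("lunch after class", model))
```

Working in log space keeps the product of many small word probabilities from underflowing. An SVM-based filter would instead learn a separating boundary over feature vectors built from the same tokens.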