TITLE:
A Multi-Classifier Based Prediction Model for Phishing Emails Detection Using Topic Modelling, Named Entity Recognition and Image Processing
AUTHORS:
C. Emilin Shyni, S. Sarju, S. Swamynathan
KEYWORDS:
Phishing, Conditional Random Field Classifier, Latent Dirichlet Allocation, Natural Language Processing, Machine Learning, Image Segmentation, Image Processing
JOURNAL NAME:
Circuits and Systems, Vol.7 No.9, July 26, 2016
ABSTRACT: Phishing is the act of
attempting to steal a user’s financial and personal information, such as credit
card numbers and passwords, by pretending to be a trustworthy participant
in online communication. Attackers may direct users to a fake website
that could seem legitimate, and then gather useful and confidential information
using that site. To protect users from social engineering techniques
such as phishing, various measures have been developed, including improvements
in technical security. In this paper, we propose a new technique, namely, “A
Prediction Model for the Detection of Phishing e-mails using Topic Modelling,
Named Entity Recognition and Image Processing”. The features extracted are
Topic Modelling features, Named Entity features and Structural features. A
multi-classifier prediction model is used to detect phishing e-mails.
Experimental results show that the multi-classifier technique outperforms
single-classifier prediction techniques. The resulting accuracy of
phishing e-mail detection is 99%, with the highest false positive rate
being 2.1%.
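
As a rough illustration of the two reported metrics, the sketch below combines the outputs of several classifiers and then computes accuracy and false positive rate. The abstract does not say how the classifiers are combined, so simple majority voting is an assumption here, and the function names, labels and toy data are hypothetical; none of the numbers come from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine the labels from several classifiers for one e-mail.

    Assumption: simple majority voting; the paper's actual combination
    rule is not given in the abstract.
    """
    return Counter(predictions).most_common(1)[0][0]


def accuracy_and_fpr(y_true, y_pred, positive="phishing"):
    """Return (accuracy, false positive rate) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    # FPR = legitimate e-mails wrongly flagged / all legitimate e-mails
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


# Toy data (illustrative only): true labels and the votes of three
# hypothetical classifiers for each of five e-mails.
y_true = ["phishing", "phishing", "ham", "ham", "ham"]
votes = [
    ("phishing", "phishing", "ham"),
    ("phishing", "ham", "phishing"),
    ("ham", "ham", "phishing"),
    ("phishing", "phishing", "ham"),  # legitimate mail flagged by two voters
    ("ham", "ham", "ham"),
]

y_pred = [majority_vote(v) for v in votes]
accuracy, fpr = accuracy_and_fpr(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, false positive rate={fpr:.2f}")
```

With an odd number of voters and two labels there are no ties, which keeps the voting rule unambiguous in this sketch.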