TITLE:
Short Text Classification Based on Improved ITC
AUTHORS:
Liangliang Li, Shouning Qu
KEYWORDS:
ITC; Text Classification; Short Text
JOURNAL NAME:
Journal of Computer and Communications, Vol.1 No.4, October 18, 2013
ABSTRACT:
Long text classification has achieved strong results, but short text
classification still needs improvement. In this paper, we first explain why we
select the ITC feature selection algorithm rather than the conventional TFIDF,
and describe the advantages of ITC over TFIDF. We then summarize the flaws of
the conventional ITC algorithm. Finally, we present an improved ITC feature
selection algorithm that is based on the characteristics of short text
classification and combines the concepts of Documents Distribution Entropy and
Position Distribution Weight. The improved ITC algorithm matches the practical
conditions of short text classification. The experimental results show that
classification performance with the new algorithm was much better than with
the traditional TFIDF and ITC algorithms.