Short Text Classification Based on Improved ITC

PP. 22-27
DOI: 10.4236/jcc.2013.14004

ABSTRACT

Long text classification has achieved strong results, but short text classification still needs improvement. In this paper, we first explain why we choose the ITC feature selection algorithm over the conventional TFIDF and describe the advantages of ITC relative to TFIDF. We then summarize the flaws of the conventional ITC algorithm and present an improved ITC feature selection algorithm that is tailored to the characteristics of short text classification and combines the concepts of Documents Distribution Entropy and Position Distribution Weight. The improved ITC algorithm fits the actual conditions of short text classification. Experimental results show that classification performance with the new algorithm is much better than with the traditional TFIDF and ITC.
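
As a rough illustration of the two baseline weightings compared in the abstract, the sketch below contrasts plain TFIDF with ITC. The abstract does not state the formulas, so the forms used here are assumptions: TFIDF as tf * log(N/n), and ITC as log(tf + 1) * log(N/n) followed by cosine normalization, which is how ITC is commonly written in the text classification literature. The function names and the toy corpus are hypothetical. The improved algorithm's Documents Distribution Entropy and Position Distribution Weight terms are not sketched, because the abstract does not define them.

```python
import math
from collections import Counter


def document_frequencies(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return df


def tfidf_weights(doc_tokens, doc_freq, n_docs):
    """Assumed plain TFIDF: raw term frequency times log(N / n)."""
    tf = Counter(doc_tokens)
    return {t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf}


def itc_weights(doc_tokens, doc_freq, n_docs):
    """Assumed ITC: log-damped tf times idf, then cosine normalization."""
    tf = Counter(doc_tokens)
    raw = {
        t: math.log(tf[t] + 1.0) * math.log(n_docs / doc_freq[t])
        for t in tf
    }
    norm = math.sqrt(sum(w * w for w in raw.values()))
    return {t: (w / norm if norm > 0 else 0.0) for t, w in raw.items()}


if __name__ == "__main__":
    # Hypothetical toy corpus of three short texts.
    corpus = [
        ["cheap", "flight", "deal"],
        ["flight", "delay", "flight", "flight"],
        ["cheap", "phone", "deal"],
    ]
    df = document_frequencies(corpus)
    print(tfidf_weights(corpus[1], df, len(corpus)))
    print(itc_weights(corpus[1], df, len(corpus)))
```

In the second toy document, "flight" is repeated three times but also appears in another document, while "delay" appears once and only there. Under the assumed TFIDF form the repeated term receives the larger weight; under the assumed ITC form the logarithm damps the repetition and the rarer term receives the larger weight.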

Share and Cite:

Li, L. and Qu, S. (2013) Short Text Classification Based on Improved ITC. Journal of Computer and Communications, 1, 22-27. doi: 10.4236/jcc.2013.14004.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.