Gender Identification on Twitter Using the Modified Balanced Winnow

PP. 189-195
DOI: 10.4236/cn.2012.43023

ABSTRACT

With the rapid growth of web-based social networking technologies in recent years, author identification and analysis have proven increasingly useful. Authorship analysis provides information about a document’s author, often including the author’s gender. Men and women are known to write in distinctly different ways, and these differences can be used to predict gender. Drawing on these distinctions between male and female authors, this study uses a simple stream-based neural network, the Modified Balanced Winnow, to automatically discriminate gender on manually labeled tweets from the Twitter social network. The network was employed in two ways. First, the effectiveness of data stream mining was examined with an extensive list of n-gram features. Second, feature selection techniques were evaluated by drastically reducing the feature list using WEKA’s attribute selection algorithms. The stream mining approach achieved an accuracy of 82.48%, 20.81 percentage points above the baseline prediction. Feature selection improved the results by a further 16.03 percentage points, to an accuracy of 98.51%.
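To make the stream-based setup concrete, the following is a minimal Python sketch of a margin-based Balanced Winnow learner applied to n-gram features in a test-then-train loop. It follows a commonly described formulation of the Modified Balanced Winnow (two positive weight vectors, multiplicative promotion and demotion, update whenever the signed score falls within a margin) and is not taken from the paper. The character-level n-grams, the feature scaling, the parameter values (`alpha`, `beta`, `margin`, `threshold`, the initial weights), and the names `char_ngrams`, `ModifiedBalancedWinnow`, and `prequential_accuracy` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def char_ngrams(text, n_min=1, n_max=3):
    """Character n-gram features scaled so every value lies strictly in (0, 1).

    Assumption: character n-grams of length 1 to 3 plus a constant bias
    feature; the abstract does not specify the n-gram type or range.
    """
    text = text.lower()
    counts = Counter(
        text[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(text) - n + 1)
    )
    counts["<bias>"] = 1
    # Dividing by (sum + 1) keeps each value below 1, so the demotion
    # factor (1 - x) used in the update never drives a weight to zero.
    scale = sum(counts.values()) + 1
    return {gram: c / scale for gram, c in counts.items()}


class ModifiedBalancedWinnow:
    """Margin-based Balanced Winnow over sparse features (illustrative values)."""

    def __init__(self, alpha=1.5, beta=0.5, margin=1.0,
                 threshold=1.0, init_pos=2.0, init_neg=1.0):
        self.alpha, self.beta = alpha, beta          # promotion / demotion rates
        self.margin, self.threshold = margin, threshold
        self.u = defaultdict(lambda: init_pos)       # "positive" weights
        self.v = defaultdict(lambda: init_neg)       # "negative" weights

    def score(self, x):
        return sum(val * (self.u[f] - self.v[f]) for f, val in x.items()) - self.threshold

    def predict(self, x):
        return 1 if self.score(x) > 0 else -1

    def learn(self, x, y):
        """Update on one example with label y in {+1, -1}; return True if updated."""
        if y * self.score(x) > self.margin:
            return False                             # confidently correct: no change
        for f, val in x.items():
            promote = self.alpha * (1 + val)
            demote = self.beta * (1 - val)
            if y > 0:
                self.u[f] *= promote
                self.v[f] *= demote
            else:
                self.u[f] *= demote
                self.v[f] *= promote
        return True


def prequential_accuracy(clf, stream):
    """Test-then-train: predict each item before learning from it."""
    correct = total = 0
    for text, label in stream:
        x = char_ngrams(text)
        correct += clf.predict(x) == label
        clf.learn(x, label)
        total += 1
    return correct / total if total else 0.0
```

Because each example is scored before it is used for an update, the running accuracy from `prequential_accuracy` reflects performance on unseen items only, which is the usual way to evaluate a learner on a data stream. A reduced feature list, such as one produced by an attribute selection step, could be applied in this sketch by filtering the dictionary returned by `char_ngrams` before calling `learn`.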

W. Deitrick, Z. Miller, B. Valyou, B. Dickinson, T. Munson and W. Hu, "Gender Identification on Twitter Using the Modified Balanced Winnow," Communications and Network, Vol. 4 No. 3, 2012, pp. 189-195. doi: 10.4236/cn.2012.43023.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.