Clustering Categorical Data Based on Within-Cluster Relative Mean Difference - Open Journal of Statistics

OJS > Vol.7 No.2, April 2017

PP. 173-181

DOI: 10.4236/ojs.2017.72013

Author(s)

Jinxia Su, Chunjing Su

Affiliation(s)

School of Mathematics and Statistics, Lanzhou University, Lanzhou, China.

ABSTRACT

Clustering of categorical variables has received intensive attention. In a dataset with categorical features, some features contribute far more than others to the clustering procedure. In this paper, we propose a simple method that finds such distinctive features by comparing the pooled within-cluster mean relative difference, partitions the data on those features, and gives the subspace of each resulting subgroup. Applications to the zoo data and the soybean data illustrate the performance of the proposed method.
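
The abstract does not give the formula for the pooled within-cluster mean relative difference, so the following is only a rough sketch of the general idea under stated assumptions, not the authors' definition. It assumes that the difference between two values of one attribute is their Hamming distance (0 if equal, 1 otherwise), that clusters are pooled with weights proportional to their size, and that "relative" means relative to the mean difference over the whole dataset. The function names `mean_hamming_difference` and `pooled_within_cluster_ratio` and the toy data are hypothetical.

```python
from itertools import combinations


def mean_hamming_difference(values):
    """Mean pairwise mismatch rate among the values of a single attribute."""
    pairs = list(combinations(values, 2))
    if not pairs:
        return 0.0
    return sum(a != b for a, b in pairs) / len(pairs)


def pooled_within_cluster_ratio(column, labels):
    """Size-weighted within-cluster mean difference divided by the overall one.

    Assumed form for illustration only; the paper's exact statistic may differ.
    """
    overall = mean_hamming_difference(column)
    if overall == 0:
        return 0.0  # attribute is constant, so it cannot separate anything
    clusters = {}
    for value, label in zip(column, labels):
        clusters.setdefault(label, []).append(value)
    n = len(column)
    pooled = sum(
        len(vals) / n * mean_hamming_difference(vals)
        for vals in clusters.values()
    )
    return pooled / overall


# Toy data (hypothetical): two clusters of three objects each.
labels = [0, 0, 0, 1, 1, 1]
separating = ["y", "y", "y", "n", "n", "n"]  # constant inside each cluster
noisy = ["y", "n", "y", "n", "y", "n"]       # unrelated to the clusters

ratio_separating = pooled_within_cluster_ratio(separating, labels)
ratio_noisy = pooled_within_cluster_ratio(noisy, labels)
print(ratio_separating, ratio_noisy)
```

Under these assumptions, an attribute with a ratio close to zero is nearly constant inside each cluster and is a candidate distinctive attribute, while a ratio near or above one indicates that the partition explains little of that attribute's variation.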

KEYWORDS

Clustering, Categorical Variable, Distinctive Attribute, Pooled Within-Cluster Mean Relative Difference, Hamming Distance

Share and Cite:

Su, J. and Su, C. (2017) Clustering Categorical Data Based on Within-Cluster Relative Mean Difference. Open Journal of Statistics, 7, 173-181. doi: 10.4236/ojs.2017.72013.
