TITLE:
Clustering of the Values of a Response Variable and Simultaneous Covariate Selection Using a Stepwise Algorithm
AUTHORS:
Olivier Collignon, Jean-Marie Monnez
KEYWORDS:
Classification, Variable Selection, Supervised Learning, Akaike Information Criterion, Wilks’ Lambda
JOURNAL NAME:
Applied Mathematics, Vol.7 No.15, September 12, 2016
ABSTRACT: In supervised learning, the number of distinct values
of a response variable can be very high. Grouping these values into a few
clusters can make supervised classification analyses more accurate.
In addition, selecting relevant covariates is a crucial step in building
robust and efficient prediction models. In this paper we propose an algorithm
that simultaneously groups the values of a response variable into a limited
number of clusters and selects, in a stepwise manner, the covariates that best
discriminate between these clusters. Both objectives are achieved by alternating
optimization of a user-defined model selection criterion. This approach extends
an earlier version of the algorithm to a more general framework. Possible further
developments are also discussed in detail.
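
The alternating scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`wilks_lambda`, `penalized_log_wilks`, `alternate_cluster_select`), the round-robin starting partition, the greedy one-move-at-a-time search, and the per-covariate penalty of 0.05 are all assumptions made for illustration. The abstract only states that a user-defined criterion is optimized alternately over the clustering and the covariate set; here, lower criterion values are taken to be better.

```python
import numpy as np


def wilks_lambda(X, labels):
    """Wilks' lambda det(W) / det(T); smaller values mean better-separated clusters."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    total = centered.T @ centered          # total scatter matrix T
    within = np.zeros_like(total)          # pooled within-cluster scatter matrix W
    for g in np.unique(labels):
        dev = X[labels == g] - X[labels == g].mean(axis=0)
        within += dev.T @ dev
    return np.linalg.det(within) / np.linalg.det(total)


def penalized_log_wilks(X, labels, penalty=0.05):
    """Hypothetical criterion: log Wilks' lambda plus an assumed per-covariate penalty."""
    return np.log(wilks_lambda(X, labels)) + penalty * X.shape[1]


def alternate_cluster_select(X, y, n_clusters, criterion=penalized_log_wilks, max_iter=50):
    """Alternate between stepwise covariate moves and reassignment of response values."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y).tolist()
    values = sorted(set(y))
    # Arbitrary starting partition: response values dealt round-robin into clusters.
    assign = {v: i % n_clusters for i, v in enumerate(values)}

    def labels():
        return np.array([assign[v] for v in y])

    def score(cols):
        return criterion(X[:, sorted(cols)], labels())

    # Start from the single covariate with the best (lowest) criterion value.
    selected = {min(range(X.shape[1]), key=lambda j: score({j}))}
    best = score(selected)

    for _ in range(max_iter):
        improved = False
        # Step 1: clustering fixed, try adding or dropping one covariate.
        moves = [selected | {j} for j in range(X.shape[1]) if j not in selected]
        if len(selected) > 1:
            moves += [selected - {j} for j in selected]
        if moves:
            cand = min(moves, key=score)
            if score(cand) < best:
                selected, best, improved = cand, score(cand), True
        # Step 2: covariates fixed, try moving each response value to another cluster.
        for v in values:
            home = assign[v]
            if sum(c == home for c in assign.values()) == 1:
                continue  # never empty a cluster
            for c in range(n_clusters):
                assign[v] = c
                s = score(selected)
                if s < best:
                    best, home, improved = s, c, True
            assign[v] = home
        if not improved:
            break  # neither step lowered the criterion: local optimum reached
    return assign, sorted(selected), best
```

Because every accepted move strictly lowers the criterion and the number of (clustering, covariate set) pairs is finite, the loop stops at a local optimum, which need not be the global one. Any other criterion with the same signature, for example one based on the Akaike Information Criterion, could be passed in place of `penalized_log_wilks`.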