TITLE:
Model-Free Feature Screening Based on Gini Impurity for Ultrahigh-Dimensional Multiclass Classification
AUTHORS:
Zhongzheng Wang, Guangming Deng
KEYWORDS:
Ultrahigh-Dimensional, Feature Screening, Model-Free, Gini Impurity, Multiclass Classification
JOURNAL NAME:
Open Journal of Statistics, Vol.12 No.5, October 27, 2022
ABSTRACT: Data sets often contain both categorical and continuous covariates,
yet most feature screening methods for ultrahigh-dimensional classification assume
that the covariates are continuous, and few methods are applicable to mixed covariates.
To handle this non-trivial situation, we propose a model-free feature screening
procedure for ultrahigh-dimensional multiclass classification with both categorical
and continuous covariates. The proposed procedure uses Gini impurity to evaluate
the predictive power of each covariate. Under certain regularity conditions, we
prove that the procedure possesses the sure screening property and the ranking
consistency property. We demonstrate its finite-sample performance through
simulation studies and illustrate it with a real data analysis.
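
To make the idea of ranking covariates by Gini impurity concrete, the following is a minimal sketch and not the authors' exact statistic, which the abstract does not specify. It assumes a generic impurity-reduction score: the Gini impurity of the class label minus the weighted impurity within groups defined by the covariate. Grouping a continuous covariate by quantile slices is an assumption made here for illustration, and the names `gini_impurity`, `purity_gain`, and `screen` are hypothetical.

```python
import numpy as np

def gini_impurity(y):
    """Gini impurity 1 - sum_k p_k^2 of a vector of class labels."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def purity_gain(x, y, categorical=False, n_slices=4):
    """Reduction in Gini impurity of y after grouping the samples by x.

    Assumption: a continuous x is cut into n_slices quantile slices;
    a categorical x uses its own levels as groups.
    """
    if categorical:
        groups = x
    else:
        # Interior quantiles serve as slice boundaries.
        edges = np.quantile(x, np.linspace(0, 1, n_slices + 1)[1:-1])
        groups = np.digitize(x, edges)
    within = 0.0
    for g in np.unique(groups):
        mask = groups == g
        # Weight each group's impurity by its share of the sample.
        within += mask.mean() * gini_impurity(y[mask])
    return gini_impurity(y) - within

def screen(X, y, categorical_cols, d):
    """Score every column of X and return the indices of the d largest scores."""
    scores = np.array([
        purity_gain(X[:, j], y, categorical=(j in categorical_cols))
        for j in range(X.shape[1])
    ])
    return np.argsort(-scores)[:d], scores

# Toy data: 3 classes, 50 covariates, of which only columns 0 and 1 are informative.
rng = np.random.default_rng(0)
n, p = 300, 50
y = rng.integers(0, 3, size=n)
X = rng.normal(size=(n, p))
X[:, 0] = y + 0.1 * rng.normal(size=n)   # informative continuous covariate
X[:, 1] = y                              # informative categorical covariate
selected, scores = screen(X, y, categorical_cols={1}, d=2)
```

In this toy setting the two informative columns receive much larger scores than the noise columns, so keeping the top `d` covariates retains them. The number of slices and the cutoff `d` are tuning choices in this sketch and are not taken from the paper.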