TITLE:
Improvement of Misclassification Rates of Classifying Objects under Box Cox Transformation and Bootstrap Approach
AUTHORS:
Mst Sharmin Akter Sumy, Md Yasin Ali Parh, Ajit Kumar Majumder, Nayeem Bin Saifuddin
KEYWORDS:
Misclassification Rate, SVM, Box Cox Transformation, Bootstrapping
JOURNAL NAME:
Open Journal of Statistics, Vol.12 No.1, February 22, 2022
ABSTRACT: Discrimination and classification rules rest on different types of
assumptions, as do almost all statistical methods. Parametric methods are the
best choice when the data satisfy all of their underlying assumptions. When
those assumptions are violated, parametric approaches no longer provide the
better solution and nonparametric techniques are preferred. After Box-Cox
transformation, when the assumptions are satisfied, parametric methods yield
lower misclassification rates. With this problem in
mind, our aim is to compare the classification accuracy of parametric and
nonparametric approaches with the aid of Box-Cox transformation and
Bootstrapping. We applied Support Vector Machines (SVMs) and several
discrimination and classification rules to classify objects. The focus is
to critically compare SVMs with Linear Discriminant Analysis (LDA) and
Quadratic Discriminant Analysis (QDA), measuring the performance of these
techniques before and after Box-Cox transformation using misclassification
rates. From the apparent error rates, we observe that before Box-Cox
transformation, SVMs perform better than the existing classification techniques;
after Box-Cox transformation, on the other hand, the parametric techniques yield
lower misclassification rates than the nonparametric method. We also
investigated the performance of the classification techniques using the Bootstrap
approach and observed that, for small samples, Bootstrap-based classification
techniques significantly reduce the classification error rate compared with the
usual techniques. Thus, this paper proposes applying classification techniques
under the Bootstrap approach when classifying objects from small samples.
Applications to real and simulated datasets are carried out to assess the performance.
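
The workflow summarized in the abstract (transform the data, fit a classifier, compute the apparent error rate, then repeat under bootstrap resampling) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: a one-feature nearest-centroid rule stands in for LDA, QDA, and SVM; the Box-Cox parameter is fixed at 0 rather than estimated; the toy data are made up; and the bootstrap scheme (fit on each resample, evaluate on the original sample, average) is one common variant that the abstract does not confirm.

```python
import math
import random


def box_cox(x, lam):
    """Box-Cox transform of a strictly positive value x for a given lambda."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def fit_centroids(xs, ys):
    """Stand-in classifier (assumption): per-class mean of a single feature."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {label: sum(v) / len(v) for label, v in groups.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def error_rate(centroids, xs, ys):
    """Fraction of observations that the fitted rule misclassifies."""
    wrong = sum(predict(centroids, x) != y for x, y in zip(xs, ys))
    return wrong / len(ys)


def bootstrap_error(xs, ys, n_boot=200, seed=0):
    """Average error of rules fitted on bootstrap resamples (assumed scheme),
    each evaluated on the original sample."""
    rng = random.Random(seed)
    n = len(xs)
    total = 0.0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        model = fit_centroids([xs[i] for i in idx], [ys[i] for i in idx])
        total += error_rate(model, xs, ys)
    return total / n_boot


# Hypothetical small sample: one positive, right-skewed feature, two classes.
xs = [1.0, 1.2, 0.9, 1.1, 8.0, 9.5, 7.7, 10.2]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

lam = 0  # assumed value; in practice lambda is usually estimated from the data
xs_bc = [box_cox(x, lam) for x in xs]

# Apparent error rate: the rule is evaluated on the data used to fit it.
apparent = error_rate(fit_centroids(xs_bc, ys), xs_bc, ys)
boot = bootstrap_error(xs_bc, ys)
print("apparent error:", apparent, "bootstrap error:", boot)
```

Replacing `fit_centroids` and `predict` with an LDA, QDA, or SVM implementation, and running the same comparison on untransformed and transformed data, would bring the sketch closer to the comparison the abstract describes.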