Evaluation of Feature Subset Selection, Feature Weighting, and Prototype Selection for Biomedical Applications
Suzanne LITTLE, Sara COLANTONIO, Ovidio SALVETTI, Petra PERNER
DOI: 10.4236/jsea.2010.31005

Abstract

Many medical diagnosis applications are characterized by datasets in which one class is under-represented, because the disease is much rarer than the normal case. In such a situation, classifiers such as decision trees and Naïve Bayes, which generalize over the data, are not the proper choice as classification methods. Case-based classifiers, which can work on the samples seen so far, are more appropriate for such a task. We propose to calculate the contingency table and class-specific evaluation measures, in addition to the overall accuracy, when evaluating classifiers on data with these characteristics. We evaluate the different options of our case-based classifier and compare its performance to decision trees and Naïve Bayes. Finally, we give an outlook on further work.
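The evaluation idea in the abstract can be illustrated with a small sketch: derive a 2x2 contingency table from predictions and report per-class sensitivity and specificity alongside overall accuracy. This is a minimal illustration of the kind of class-specific measures the paper advocates, not the authors' actual implementation; the function name and labels are hypothetical.

```python
def class_specific_measures(y_true, y_pred, positive):
    """Per-class measures from a 2x2 contingency table.

    Hypothetical helper illustrating class-specific evaluation for
    imbalanced data; 'positive' names the rare (disease) class.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # recall on the rare class
    specificity = tn / (tn + fp) if tn + fp else 0.0  # recall on the normal class
    accuracy = (tp + tn) / len(y_true)
    return {"sensitivity": sensitivity, "specificity": specificity, "accuracy": accuracy}

# Imbalanced toy example: 2 diseased cases among 10. A classifier that
# always predicts "normal" reaches 80% overall accuracy yet detects no
# diseased case at all -- which only the class-specific measures reveal.
y_true = ["disease"] * 2 + ["normal"] * 8
y_pred = ["normal"] * 10
print(class_specific_measures(y_true, y_pred, positive="disease"))
# → {'sensitivity': 0.0, 'specificity': 1.0, 'accuracy': 0.8}
```

The toy run shows why overall accuracy alone is misleading on under-represented classes: the degenerate classifier looks acceptable (80%) while its sensitivity for the disease class is zero.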

Share and Cite:

S. LITTLE, S. COLANTONIO, O. SALVETTI and P. PERNER, "Evaluation of Feature Subset Selection, Feature Weighting, and Prototype Selection for Biomedical Applications," Journal of Software Engineering and Applications, Vol. 3 No. 1, 2010, pp. 39-49. doi: 10.4236/jsea.2010.31005.

Conflicts of Interest

The authors declare no conflicts of interest.


Copyright © 2024 by authors and Scientific Research Publishing Inc.


This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.