Comparison of Various Classification Techniques Using Different Data Mining Tools for Diabetes Diagnosis


Without diagnostic evidence, it is difficult for experts to state the grade of a disease with confidence. Diagnosis therefore usually involves many tests, which in turn require clustering or classifying large-scale data. Running many tests, however, can complicate the main diagnostic process and make the end result harder to obtain. Machine learning techniques can help resolve this difficulty. In this paper, we present a comparative study of different classification techniques using three data mining tools: WEKA, TANAGRA and MATLAB. The aim is to analyze the performance of these classification techniques on a large data set. A basic review of the selected techniques is given by way of introduction. A diabetes data set with 768 instances and 9 attributes (8 inputs and 1 output) is used to test the classification methods and to account for the differences between them. Finally, the classification technique with the potential to significantly improve on common or conventional methods is suggested for use with large-scale data, in bioinformatics, or in other general applications.
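
As a rough illustration of the kind of comparison described above, the following minimal Python sketch evaluates two classifiers on the same data with k-fold cross-validation and reports their accuracies side by side. It is not the authors' procedure and does not use WEKA, TANAGRA or MATLAB. The data is synthetic and merely shaped like the diabetes set (768 rows, 8 inputs, 1 binary output). The two classifiers (a majority-class baseline and a nearest-centroid rule) are placeholders chosen for brevity, and all function names, the choice of 10 folds and the class balance are assumptions made for illustration.

```python
import random
from collections import Counter


def make_synthetic(n=768, d=8, seed=0):
    """Hypothetical stand-in for the diabetes data: n rows, d inputs, 1 binary output."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y = 1 if rng.random() < 0.35 else 0             # assumed class balance
        x = [rng.gauss(0.8 * y, 1.0) for _ in range(d)]  # class 1 has shifted means
        rows.append((x, y))
    return rows


class MajorityClass:
    """Baseline: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label] * len(X)


class NearestCentroid:
    """Assigns each sample to the class whose training mean is closest."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            members = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(members) for col in zip(*members)]
        return self

    def predict(self, X):
        def dist2(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [min(self.centroids, key=lambda c: dist2(x, self.centroids[c])) for x in X]


def cross_val_accuracy(make_model, rows, k=10, seed=0):
    """Pooled accuracy over k folds; every row is used for testing exactly once."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    correct = 0
    for i, test in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        model = make_model().fit([x for x, _ in train], [y for _, y in train])
        preds = model.predict([x for x, _ in test])
        correct += sum(p == y for p, (_, y) in zip(preds, test))
    return correct / len(rows)


def compare(rows):
    """Evaluate each placeholder classifier on identical folds."""
    models = {"majority": MajorityClass, "nearest_centroid": NearestCentroid}
    return {name: cross_val_accuracy(cls, rows) for name, cls in models.items()}


if __name__ == "__main__":
    for name, acc in compare(make_synthetic()).items():
        print(f"{name}: {acc:.3f}")
```

Because both classifiers are scored on the same shuffled folds, the difference in accuracy reflects the methods and not the data split. A real study would substitute the actual data set and the classifiers under comparison.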

Share and Cite:

R. Rahman and F. Afroz, "Comparison of Various Classification Techniques Using Different Data Mining Tools for Diabetes Diagnosis," Journal of Software Engineering and Applications, Vol. 6 No. 3, 2013, pp. 85-97. doi: 10.4236/jsea.2013.63013.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.