On Clustering Algorithms for Biological Data


The age of the knowledge explosion requires us not only to obtain useful information represented by data but also to discover the knowledge within that information. The Human Genome Project produced a large amount of such biological data, and researchers have found clustering to be a promising approach to analyzing these data for hidden knowledge. As research on biological data has deepened, so has research on clustering algorithms. This article introduces the clustering algorithms currently in wide use, covering their main ideas, improvements, key techniques, advantages and disadvantages, and their applications in the biological field together with the problems they solve. It also briefly introduces some databases used in the biological field.

Share and Cite:

Li, X. and Zhu, F. (2013) On Clustering Algorithms for Biological Data. Engineering, 5, 549-552. doi: 10.4236/eng.2013.510B113.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.