TITLE:
Condensed Matrix Descriptor for Protein Sequence Comparison
AUTHORS:
Soumen Ghosh, Jayanta Pal, Bansibadan Maji, Dilip Kumar Bhattacharya
KEYWORDS:
Amino Acids, Condensed Matrix, Eigen Values, Matrix Invariants, ALE Index
JOURNAL NAME:
International Journal of Analytical Mass Spectrometry and Chromatography, Vol.4 No.1, March 17, 2016
ABSTRACT: This paper develops a novel way of reducing a protein sequence of any length to a real symmetric condensed 20 × 20 matrix. The condensed matrix serves as a protein sequence descriptor, so that comparing two protein sequences reduces to comparing two 20 × 20 matrices. Because each square matrix has a unique ALE index (and normalized ALE index), this index is used to build the distance matrix from which phylogenetic trees of the protein sequences are constructed. The protein sequences are then compared on the basis of these phylogenetic trees. Three proteins are used for the comparison, namely NADH dehydrogenase subunit 3 (ND3), subunit 4 (ND4) and subunit 5 (ND5), each taken from nine species: Human, Gorilla, Common Chimpanzee, Pygmy Chimpanzee, Fin Whale, Blue Whale, Rat, Mouse and Opossum.
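
As a rough illustration of the index-to-distance step, the sketch below computes the ALE index named in the keywords for a square matrix and turns a list of such indices into a pairwise distance matrix. The abstract gives neither formula, so everything here is an assumption: the ALE index is taken in its commonly cited form, half of (the sum of absolute entries divided by n, plus sqrt((n-1)/n) times the Frobenius norm); the normalized index is assumed to be that value divided by n; and the distance between two sequences is assumed to be the absolute difference of their indices. The function names are hypothetical, and the construction of the condensed 20 × 20 matrix itself is not shown because the abstract does not describe it.

```python
import math


def ale_index(m):
    """ALE index of a square matrix (assumed, commonly cited form)."""
    n = len(m)
    # Sum of absolute values of all entries.
    m1_norm = sum(abs(x) for row in m for x in row)
    # Frobenius norm: square root of the sum of squared entries.
    frobenius = math.sqrt(sum(x * x for row in m for x in row))
    return 0.5 * (m1_norm / n + math.sqrt((n - 1) / n) * frobenius)


def normalized_ale_index(m):
    """ALE index divided by the matrix order (assumed normalization)."""
    return ale_index(m) / len(m)


def distance_matrix(indices):
    """Pairwise absolute differences between per-sequence indices (assumed metric)."""
    return [[abs(a - b) for b in indices] for a in indices]


if __name__ == "__main__":
    # Toy 2 x 2 symmetric matrices standing in for condensed 20 x 20 matrices.
    toy_matrices = [
        [[1.0, 0.0], [0.0, 1.0]],
        [[2.0, 1.0], [1.0, 2.0]],
    ]
    indices = [normalized_ale_index(m) for m in toy_matrices]
    for row in distance_matrix(indices):
        print(row)
```

The resulting distance matrix could then be passed to any standard tree-building method; the abstract does not say which one the authors use.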