Statistical Methods of SNP Data Analysis and Applications

We develop statistical methods for the analysis of multidimensional genetic data and establish theorems justifying their application. We concentrate on multifactor dimensionality reduction, logic regression, random forests, and stochastic gradient boosting, along with new modifications of these methods. We use complementary approaches to study the risk of complex diseases, such as cardiovascular diseases. The roles of certain combinations of single nucleotide polymorphisms (SNPs) and non-genetic risk factors are examined. The data analysis concerning coronary heart disease and myocardial infarction was performed on the Lomonosov Moscow State University supercomputer “Chebyshev”.

A. Bulinski, O. Butkovsky, V. Sadovnichy, A. Shashkin, P. Yaskov, A. Balatskiy, L. Samokhodskaya and V. Tkachuk, "Statistical Methods of SNP Data Analysis and Applications," Open Journal of Statistics, Vol. 2 No. 1, 2012, pp. 73-87. doi: 10.4236/ojs.2012.21008.

Conflicts of Interest

The authors declare no conflicts of interest.


