Identifying predictive markers of chemosensitivity of breast cancer with random forests
Wei Hu
DOI: 10.4236/jbise.2010.31009


Several gene signatures have been identified to build predictors of chemosensitivity for breast cancer. It is crucial to understand how each gene in a signature contributes to the prediction, i.e., to make the prediction model interpretable rather than treating it as a black box. We used Random Forests (RFs) to build two interpretable predictors of pathologic complete response (pCR), each based on a different gene signature. One signature consisted of the top 31 probe sets (27 genes) differentially expressed between pCR and residual disease (RD), chosen from a previous study; the other consisted of the genes involved in the Notch signaling pathway (113 genes). Compared with the predictor in the previous study, both predictors had a higher accuracy (82% vs. 76% and 79% vs. 76%), a higher specificity (91% vs. 71% and 98% vs. 71%), and a higher positive predictive value (PPV) (68% vs. 52% and 73% vs. 52%). Furthermore, Random Forests were employed to calculate the importance of each gene in the two signatures. Our functional annotation suggested that the important genes identified by the feature selection scheme of Random Forests are of biological significance.
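As a reading aid for the three performance measures quoted above, the following minimal sketch shows how accuracy, specificity, and PPV are conventionally derived from a binary confusion matrix, with pCR treated as the positive class. The function name `classification_metrics` and the ten toy labels are illustrative assumptions; they are not the study's data or code, and the resulting values are unrelated to the percentages reported in the abstract.

```python
def classification_metrics(y_true, y_pred, positive="pCR"):
    """Compute standard binary metrics, treating `positive` as the positive class.

    Hypothetical helper for illustration; not taken from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(numerator, denominator):
        # Avoid division by zero when a class is absent from the data.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),  # all correct calls
        "sensitivity": ratio(tp, tp + fn),              # pCR cases found
        "specificity": ratio(tn, tn + fp),              # RD cases correctly called RD
        "ppv": ratio(tp, tp + fp),                      # predicted pCR that are truly pCR
    }


# Toy labels (made up for this example): 3 pCR cases and 7 RD cases.
y_true = ["pCR", "pCR", "pCR", "RD", "RD", "RD", "RD", "RD", "RD", "RD"]
y_pred = ["pCR", "pCR", "RD", "pCR", "RD", "RD", "RD", "RD", "RD", "RD"]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

A high specificity means few RD cases are wrongly predicted as pCR, while PPV also depends on how common pCR is in the cohort, which is why the two measures can differ substantially for the same predictor.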

Share and Cite:

Hu, W. (2010) Identifying predictive markers of chemosensitivity of breast cancer with random forests. Journal of Biomedical Science and Engineering, 3, 59-64. doi: 10.4236/jbise.2010.31009.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.