Statistical models for predicting number of involved nodes in breast cancer patients
Alok Kumar Dwivedi, Sada Nand Dwivedi, Suryanarayana Deo, Rakesh Shukla
DOI: 10.4236/health.2010.27098

Abstract

Clinicians need to predict the number of involved nodes in breast cancer patients to ascertain severity and prognosis and to design subsequent treatment. The distribution of involved nodes often displays over-dispersion (larger variability than expected). Until now, the negative binomial model has been used to describe this distribution, assuming that over-dispersion is due only to unobserved heterogeneity. However, the distribution of involved nodes also contains a large proportion of excess zeros (negative nodes), which can itself lead to over-dispersion. In this situation, alternative models may better account for over-dispersion due to excess zeros. This study examines data from 1152 patients who underwent axillary dissections in a tertiary hospital in India between January 1993 and January 2005. We fit and compare various count models to test their ability to predict the number of involved nodes. We also argue for using zero-inflated models in populations where all the excess zeros come from those who are at some risk of the outcome of interest. The negative binomial regression model fit the data better than the Poisson and zero hurdle/inflated Poisson regression models. However, the zero hurdle/inflated negative binomial regression models predicted the number of involved nodes much more accurately than the negative binomial model. This suggests that the number of involved nodes displays excess variability not only due to unobserved heterogeneity but also due to excess negative nodes in the data set. In this analysis, only skin changes and primary site were associated with negative nodes, whereas parity, skin changes, primary site and size of tumor were associated with a greater number of involved nodes. When the two perform nearly equally, the zero-inflated negative binomial model should be preferred over the hurdle model for describing the nodal frequency because it provides an estimate of negative nodes that are at “high-risk” of nodal involvement.
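
To make the distinction between the model families concrete, the following is a minimal sketch of their probability mass functions, not the authors' implementation. It assumes the common NB2 parameterization of the negative binomial (mean `mu`, variance `mu + alpha * mu**2`); the function names and every numeric value are illustrative assumptions and are not taken from the study data.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability of observing k involved nodes (mu > 0)."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))


def negbin_pmf(k, mu, alpha):
    """Negative binomial (NB2): mean mu, variance mu + alpha * mu**2."""
    r = 1.0 / alpha          # shape parameter
    p = r / (r + mu)         # "success" probability
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)


def zero_inflated_pmf(k, pi, base_pmf):
    """Mixture: a structural zero with probability pi, otherwise a draw
    from the base count distribution (which can also produce zeros)."""
    p = (1.0 - pi) * base_pmf(k)
    return pi + p if k == 0 else p


def hurdle_pmf(k, p_zero, base_pmf):
    """Two-part model: all zeros come from one binary process; positive
    counts follow the base distribution truncated at zero."""
    if k == 0:
        return p_zero
    return (1.0 - p_zero) * base_pmf(k) / (1.0 - base_pmf(0))


def at_risk_share_of_zeros(pi, base_pmf):
    """Under the zero-inflated model, the fraction of observed zeros that
    arise from the count process rather than from structural zeros."""
    from_counts = (1.0 - pi) * base_pmf(0)
    return from_counts / (pi + from_counts)


# Illustrative (hypothetical) parameter values only.
nb = lambda k: negbin_pmf(k, mu=3.0, alpha=1.5)
print("P(0) under NB:    ", nb(0))
print("P(0) under ZINB:  ", zero_inflated_pmf(0, 0.3, nb))
print("P(0) under hurdle:", hurdle_pmf(0, 0.5, nb))
print("Share of ZINB zeros from the count process:",
      at_risk_share_of_zeros(0.3, nb))
```

The last function reflects one reading of why the abstract prefers the zero-inflated model: only the mixture form splits zeros into two sources, so it can report how many node-negative patients come from the count process, whereas the hurdle model treats all zeros as a single class.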

Share and Cite:

Dwivedi, A., Dwivedi, S., Deo, S. and Shukla, R. (2010) Statistical models for predicting number of involved nodes in breast cancer patients. Health, 2, 641-651. doi: 10.4236/health.2010.27098.

Conflicts of Interest

The authors declare no conflicts of interest.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.