TITLE:
Decipher Clinical and Genetic Underpins of Breast Cancer Survival with Machine Learning Methods
AUTHORS:
Zhengkai Zhuang
KEYWORDS:
Machine Learning, Breast Cancer Prediction, Data Analysis, Feature Importance Comparison
JOURNAL NAME:
Advances in Breast Cancer Research, Vol.12 No.4, October 26, 2023
ABSTRACT: Breast cancer is one of the most common cancers among women worldwide, with more than two million new cases every year. The disease is associated with numerous clinical and genetic characteristics. In recent years, machine learning has been increasingly applied in medicine, including to predict the risk of malignant tumors such as breast cancer.
Using clinical and targeted sequencing data from 1980 primary breast cancer samples, this article analyzes these data to predict patient survival after a breast cancer diagnosis. After data engineering, feature selection, and a comparison of machine learning methods, the light gradient boosting machine (LightGBM) model performed best after hyperparameter tuning (precision = 0.818, recall = 0.816, F1 score = 0.817, ROC-AUC = 0.867).
The top five determinants were the clinical features age at diagnosis, Nottingham Prognostic Index, and cohort, and the genetic features rheb and nr3c1. The study sheds light on the rational allocation of medical resources and, through the identified clinical and genetic risk factors, offers insights into the early prevention, diagnosis, and treatment of breast cancer.
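
For readers unfamiliar with the four evaluation metrics quoted in the abstract, the following is a minimal pure-Python sketch of how precision, recall, F1 score, and ROC-AUC are defined for a binary outcome. It is not the study's code: the function names and the toy labels and scores are hypothetical and invented for illustration, and the abstract does not state whether its reported figures are per-class or averaged across classes.

```python
from typing import Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Binary precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, invented for illustration only (not taken from the study).
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold
    print(precision_recall_f1(y_true, y_pred))
    print(roc_auc(y_true, scores))
```

Precision, recall, and F1 depend on the chosen decision threshold, whereas ROC-AUC is computed from the ranking of the predicted scores alone, which is why the two kinds of figures can differ noticeably for the same model.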