Construction of a Risk Prediction Model for Lung Cancer Based on Lifestyle Behaviors in the UK Biobank Large-Scale Population Cohort

CHEN Ruilin, WANG Jingru, WANG Shuo, TANG Siqi, SUO Chen

Abstract

Objective  To identify lifestyle-related risk factors for incident lung cancer and to build a lung cancer risk prediction model that identifies high-risk individuals in the population and facilitates the early detection of lung cancer.

Methods  Data were obtained from the UK Biobank, a database containing information collected from 502,389 participants between March 2006 and October 2010. Criteria for identifying the high-risk population were determined on the basis of domestic and international lung cancer screening guidelines and high-quality research literature on lung cancer risk factors. Univariate Cox regression was performed to screen for risk factors of lung cancer, and a multivariable lung cancer risk prediction model was constructed using Cox proportional hazards regression, which takes survival time into account. The best-fitting model satisfying the proportional hazards assumption was selected by comparing Akaike information criterion values and Schoenfeld residual test results. The population was randomly divided into a training set and a validation set at a ratio of 7:3. The model was built on the training set, and its performance was internally validated on the validation set. The area under the receiver operating characteristic (ROC) curve (AUC) was used to evaluate model performance. The population was categorized into low-risk, moderate-risk, and high-risk groups based on a probability of occurrence of 0% to <25%, 25% to <75%, and 75% to 100%, respectively, and the proportion of affected individuals in each risk group was calculated.

Results  The study ultimately included 453,558 individuals. Over a cumulative follow-up of 5,505,402 person-years, 2,330 cases of lung cancer were diagnosed. Cox proportional hazards regression identified 10 independent variables as predictors of lung cancer: age, body mass index (BMI), education, income, physical activity, smoking status, alcohol consumption frequency, fresh fruit intake, family history of cancer, and tobacco exposure. A model was established accordingly. Internal validation showed that 8 of these variables (all except BMI and fresh fruit intake) were significant influencing factors of lung cancer (P<0.05). In the training set, the AUCs for predicting lung cancer occurrence at one, five, and ten years were 0.825, 0.785, and 0.777, respectively. In the validation set, the corresponding AUCs were 0.857, 0.782, and 0.765. Screening the high-risk population could identify 68.38% of the individuals who might develop lung cancer in the future.

Conclusion  This study established a model for predicting lung cancer risk associated with lifestyle behaviors in a large population. The model shows good discriminatory ability and can be used as a tool for developing standardized screening strategies for lung cancer.
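The evaluation steps described in the abstract (a random 7:3 split, AUC, and grouping by probability of occurrence) can be illustrated with a minimal sketch. This is not the authors' code. The function names and the toy data are hypothetical, and the thresholds are assumed to apply directly to a predicted probability. The AUC here is the plain rank-based version for a binary outcome. It ignores censoring, so it only approximates the time-dependent AUCs reported at one, five, and ten years.

```python
import random


def split_train_validation(ids, train_fraction=0.7, seed=0):
    """Randomly split ids into training and validation sets (7:3 by default)."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def auc(scores, labels):
    """Rank-based AUC: chance that a random case outscores a random non-case.

    Ties count as half a win. Censoring is ignored (simplifying assumption).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def risk_group(probability):
    """Map a probability to the abstract's bands: <25%, 25% to <75%, >=75%."""
    if probability < 0.25:
        return "low"
    if probability < 0.75:
        return "moderate"
    return "high"


def case_share_by_group(probabilities, labels):
    """Share of all observed cases that falls into each risk group."""
    counts = {"low": 0, "moderate": 0, "high": 0}
    for p, y in zip(probabilities, labels):
        if y == 1:
            counts[risk_group(p)] += 1
    total = sum(counts.values())
    return {g: (n / total if total else 0.0) for g, n in counts.items()}


# Toy data (hypothetical): predicted probabilities and observed outcomes.
probs = [0.10, 0.30, 0.80, 0.90, 0.20, 0.76, 0.50, 0.05]
labels = [0, 0, 1, 1, 0, 0, 1, 0]

train_ids, valid_ids = split_train_validation(range(10))
print(len(train_ids), len(valid_ids))      # 7 3
print(round(auc(probs, labels), 3))        # 0.933 (14 of 15 pairs ranked correctly)
print(case_share_by_group(probs, labels))  # two of the three cases fall in "high"
```

In this toy example the high-risk group captures two of the three cases. That share is the same kind of quantity as the 68.38% reported for the high-risk population in the study.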


Keywords: Lung cancer, Risk prediction, Prediction model, Risk factor

 



