Abstract:
Objective To construct a model for predicting the risk of readmission within 30 days in patients with acute exacerbation of chronic obstructive pulmonary disease (AECOPD), based on machine learning (ML) and Shapley additive explanations (SHAP).
Methods A total of 225 patients with AECOPD admitted from January 2022 to April 2024 were selected as the training set, and 123 patients with AECOPD admitted from May 2024 to August 2025 were selected as the validation set. In the training set, the LASSO algorithm was used to screen the risk variables for readmission within 30 days in patients with AECOPD. Four ML models were constructed and trained: linear discriminant analysis (LDA), mixture discriminant analysis (MDA), flexible discriminant analysis (FDA) and extreme gradient boosting (XGBoost). The predictive performance of the four ML models was evaluated using calibration curves, precision-recall curves (PRC), precision-recall gain curves (PRGC) and ROC curves. The ML models were interpreted using SHAP values, and an online application for predicting the risk of readmission within 30 days in patients with AECOPD was developed using R packages such as shiny. In the validation set, the ML model was validated using the calibration curve, decision curve and ROC curve.
Results The readmission rate within 30 days among the 225 patients with AECOPD was 18.7% (42/225). The LASSO algorithm identified age, education, diabetes mellitus, smoking history, albumin (Alb), high-sensitivity C-reactive protein (hs-CRP), D-dimer (D-D) and forced expiratory volume in the first second as a percentage of the predicted value (FEV1%pred) as the risk variables for readmission within 30 days in patients with AECOPD. Of the four ML algorithms, XGBoost showed the highest predictive accuracy, as confirmed by calibration curves, PRC, PRGC and ROC curves. The XGBoost model, interpreted and visualized with SHAP values, predicted the risk of readmission within 30 days in patients with AECOPD with high accuracy, and the online application is available at https://per-dynamic.shinyapps.io/COPD AE/. In the validation set, the calibration curve gave a C-index of 0.812, and the predictions of the XGBoost model were highly consistent with the observed outcomes. Decision curve analysis showed that the XGBoost model provided a substantial net clinical benefit when the risk threshold was between 0.123 and 0.910. ROC analysis gave an AUC of 0.853 for the XGBoost model, indicating excellent discrimination in predicting the risk of readmission within 30 days in patients with AECOPD.
Conclusions Age, FEV1%pred, Alb, D-D, hs-CRP, smoking history, education and diabetes are risk variables for readmission in patients with AECOPD. An online application that uses SHAP values to interpret the XGBoost model can accurately and conveniently predict the risk of readmission within 30 days in patients with AECOPD.
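
Note on the decision curve: the net benefit reported over the risk-threshold range in the Results can be illustrated with a minimal sketch. It assumes the standard decision-curve definition, net benefit = TP/n - (FP/n) x pt/(1 - pt), where pt is the risk threshold. The function names and the toy labels and probabilities are hypothetical and are not taken from the study data or its R code.

```python
# Minimal sketch of the net-benefit calculation behind a decision curve.
# Assumption: standard definition, NB = TP/n - FP/n * pt / (1 - pt).
# All names and values below are illustrative, not the study's data.

def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical toy data: 1 = readmitted within 30 days, 0 = not readmitted.
toy_labels = [1, 0, 1, 0]
toy_probs = [0.9, 0.2, 0.6, 0.7]

for pt in (0.2, 0.5, 0.8):
    model_nb = net_benefit(toy_labels, toy_probs, pt)
    all_nb = net_benefit_treat_all(toy_labels, pt)
    print(f"threshold={pt:.1f} model={model_nb:.3f} treat_all={all_nb:.3f}")
```

A model is usually considered clinically useful at a given threshold when its net benefit exceeds both the treat-all reference and the treat-none reference, which is fixed at zero.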