Construction of a risk prediction model for malnutrition in peritoneal dialysis patients based on machine learning algorithms

      Abstract:
      Objective To use machine learning (ML) algorithms to predict risk factors for malnutrition in peritoneal dialysis (PD) patients, and to provide a reference for nutritional management decisions.
      Methods A retrospective analysis was conducted on 171 PD patients. Nutritional status was assessed using the Subjective Global Assessment (SGA), and patients were divided into a malnutrition group (n = 69) and a well-nourished group (n = 102); the data were then preprocessed. Feature variables were screened by collinearity diagnosis and the least absolute shrinkage and selection operator (LASSO). Five ML algorithms, namely Random Forest (RF), Extreme Gradient Boosting (XGB), Support Vector Machine (SVM), K-Nearest Neighbors (KNN) and Light Gradient Boosting Machine (LightGBM), were selected for predictive modeling. After tenfold cross-validation, the models were comprehensively evaluated using the receiver operating characteristic (ROC) curve, the area under the ROC curve (AUC), the precision-recall (PR) curve, accuracy, sensitivity, specificity and F1 score. SHapley Additive exPlanations (SHAP) was introduced to interpret the optimal ML model.
      Results LASSO regression identified nine feature variables for constructing the ML models. In the comprehensive evaluation, the RF model achieved a high AUC (0.994), accuracy (0.960), sensitivity (0.905), specificity (0.967), recall (0.952) and F1 score (0.952). SHAP analysis showed that the five features contributing most were, in order, high-sensitivity C-reactive protein, hemoglobin, albumin, history of hypertension and white blood cells.
      Conclusions Among the prediction models for malnutrition in PD patients, the RF model showed the best performance and can provide a valuable basis for formulating nutritional management strategies for these patients.
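
The evaluation metrics named in the abstract can be illustrated with a minimal sketch under the standard binary definitions, in which sensitivity and recall are the same quantity, TP / (TP + FN). The abstract does not state how its separately reported recall was computed, so this is not the authors' code. The function names, the 0.5 threshold and the toy labels and scores below are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float  # same as recall under the standard binary definition
    specificity: float
    precision: float
    f1: float


def binary_metrics(y_true, y_pred):
    """Confusion-matrix metrics; label 1 = malnourished, 0 = well-nourished."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return BinaryMetrics(accuracy, sensitivity, specificity, precision, f1)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case receives a higher
    score than a random negative case; tied scores count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data, NOT from the study.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
    print(binary_metrics(y_true, y_pred))
    print("AUC:", roc_auc(y_true, scores))
```

In a cross-validated setting such as the tenfold scheme described above, these functions would typically be applied to the held-out predictions of each fold and the results averaged; that aggregation choice is likewise an assumption here.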

       
