Interpretable machine learning predicts risk of aggressive behavior in schizophrenia

武倩倩; 周三华; 李辉辉; 孙雷鸣

doi:10.13898/j.cnki.issn.2097-5252.2025.08.020

Abstract

Objective: To develop a web-based calculator, based on Shapley additive explanations (SHAP), for predicting aggressive behavior in patients with schizophrenia. Methods: A total of 238 patients with schizophrenia were enrolled. The Boruta algorithm was used to select the important feature variables for aggressive behavior. The 238 patients were randomly divided at a 3:2 ratio into a training set (n=145) and a test set (n=93), nine machine learning (ML) models were trained, and ten-fold cross-validation was performed. Receiver operating characteristic (ROC) curves were used to evaluate the ML models and identify the one with the best predictive performance, and decision curve analysis was then used to assess the clinical benefit of the ML models. SHAP was used to interpret and visualize the ML model, and an R package was used to build the web-based calculator for predicting aggressive behavior in patients with schizophrenia. Results: Of the 238 patients with schizophrenia, 76 (31.9%) exhibited aggressive behavior. The Boruta algorithm identified disease duration, the Social Support Rating Scale (SSRS), the Brief Psychiatric Rating Scale (BPRS), high-density lipoprotein (HDL), the neutrophil-to-lymphocyte ratio (NLR), and history of violence as the important feature variables for aggressive behavior. Among the nine ML algorithms, the ROC curves of both the training set and the test set showed that the extreme gradient boosting (XGBoost) model had the best performance in predicting the risk of aggressive behavior. Decision curve analysis indicated that this model had high predictive accuracy and clinical application value. The SHAP importance plot ranked the six feature variables by their contribution to the risk of aggressive behavior, in descending order, as disease duration, SSRS, BPRS, HDL, NLR, and history of violence. In the SHAP summary plot, the SHAP values of these six variables separated toward the two ends, indicating that they effectively predict the risk of aggressive behavior in patients with schizophrenia. The web-based calculator built on the XGBoost model effectively predicted the risk of aggressive behavior in patients with schizophrenia. Conclusions: A web-based calculator built on an XGBoost model with SHAP explanations, using disease duration, SSRS, BPRS, HDL, NLR, and history of violence, can accurately predict patients' risk of aggressive behavior and improve the early prevention of and intervention for aggressive behavior risk.
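The two evaluation steps named in the abstract, ROC analysis and decision curve analysis, can be illustrated with a minimal Python sketch. It is not the authors' code (the study reports using R): the function names and the ten toy labels and probabilities are hypothetical, chosen only for illustration. The only figure taken from the abstract is the prevalence of 76/238, which is used for the "treat all" reference strategy.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the chance that a random positive case scores above a random
    negative case, with ties counted as one half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Decision-curve net benefit at threshold pt: TP/n - FP/n * pt/(1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(prevalence: float, threshold: float) -> float:
    """Net benefit of flagging every patient as high risk."""
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data, NOT from the study: 1 = aggressive behavior observed.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
toy_probs = [0.9, 0.8, 0.35, 0.4, 0.2, 0.1, 0.3, 0.7, 0.6, 0.05]

auc = roc_auc(toy_labels, toy_probs)
nb_model = net_benefit(toy_labels, toy_probs, threshold=0.5)

# Prevalence reported in the abstract: 76 of 238 patients.
nb_all = net_benefit_treat_all(prevalence=76 / 238, threshold=0.2)

print(f"toy AUC = {auc:.3f}")
print(f"toy net benefit at pt=0.5 = {nb_model:.3f}")
print(f"treat-all net benefit at pt=0.2 = {nb_all:.3f}")
```

A decision curve plots the model's net benefit against the "treat all" and "treat none" (net benefit 0) strategies over a range of thresholds; a model is clinically useful at a given threshold when its curve lies above both.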
