Abstract:
Objective To construct a machine learning-based nomogram for predicting the risk of ovarian cancer (OV) and to provide clinical evidence for early OV screening.
Methods A total of 142 patients with ovarian tumors were included: 71 with OV and 71 with benign ovarian tumors. Clinical data were collected, including age, blood type, pathological diagnosis, preoperative routine blood tests, biochemical tests, coagulation function, immunological screening, and serum tumor markers (including a five-item tumor marker panel). LASSO regression was first used to screen candidate factors associated with OV, and random forest analysis was then used to rank the importance of these factors. The selected variables were entered into multivariate logistic regression to identify the final predictors, which were used to construct the OV nomogram prediction model. Finally, model performance was evaluated using the receiver operating characteristic (ROC) curve, the calibration curve, and decision curve analysis (DCA).
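As an illustration of the ROC evaluation step, the area under the curve (AUC) can be read as the probability that a randomly chosen OV case receives a higher predicted risk than a randomly chosen benign case. The following is a minimal sketch of that rank-based definition, not the authors' code; the function name `auc` and the toy scores are hypothetical and are not study data.

```python
from itertools import product


def auc(case_scores, control_scores):
    """Rank-based AUC: P(case score > control score), with ties counted as 0.5."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    # Compare every case against every control (Mann-Whitney style).
    for case, control in product(case_scores, control_scores):
        if case > control:
            wins += 1.0
        elif case == control:
            wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks, for illustration only (NOT study data).
toy_cases = [0.9, 0.8, 0.4]      # predicted risk for OV patients
toy_controls = [0.7, 0.3, 0.2]   # predicted risk for benign-tumor patients
toy_auc = auc(toy_cases, toy_controls)
print(f"toy AUC = {toy_auc:.3f}")
```

An AUC of 1.0 means every case is ranked above every control, while 0.5 corresponds to ranking no better than chance.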
Results The variables associated with OV, as screened by LASSO regression and the random forest algorithm, ranked in descending order of importance as follows: carbohydrate antigen 125 (CA125), plasma D-dimer, the premenopausal ovarian cancer risk prediction model (PREM), the postmenopausal ovarian cancer risk prediction model (POSTM), fibrinogen, platelet count, and lymphocyte percentage (LY%). Multivariate logistic regression showed that CA125 (OR=1.037, 95% CI: 1.012-1.063), PREM (OR=1.158, 95% CI: 1.011-1.327), and LY% (OR=0.910, 95% CI: 0.851-0.973) were statistically significant predictors of OV. The resulting nomogram achieved an AUC of 0.966 in ROC analysis, and the calibration curve and DCA both indicated strong clinical utility of the prediction model.
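The reported odds ratios can be related to logistic regression coefficients through the log transform, since each coefficient equals ln(OR). The sketch below, which is an illustration rather than the authors' analysis, recovers the coefficient, an approximate standard error, and a Wald z statistic from each reported OR and 95% CI. It assumes the intervals are Wald intervals that are symmetric on the log-odds scale; the function name `wald_summary` is hypothetical.

```python
import math

# Odds ratios and 95% CIs as reported in the Results: (OR, lower, upper).
reported = {
    "CA125": (1.037, 1.012, 1.063),
    "PREM": (1.158, 1.011, 1.327),
    "LY%": (0.910, 0.851, 0.973),
}


def wald_summary(odds_ratio, lower, upper, z_crit=1.96):
    """Back-calculate log-odds coefficient, SE, z, and two-sided p from an OR and CI.

    Assumes a Wald interval: exp(beta -/+ z_crit * SE).
    """
    beta = math.log(odds_ratio)                                # coefficient on log-odds scale
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)    # CI width on log scale
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))                       # two-sided normal p-value
    return beta, se, z, p


summary = {name: wald_summary(*values) for name, values in reported.items()}
for name, (beta, se, z, p) in summary.items():
    print(f"{name}: beta={beta:.4f}, SE={se:.4f}, z={z:.2f}, p={p:.4f}")
```

A positive coefficient (CA125, PREM) means higher values are associated with higher odds of OV, while the negative coefficient for LY% means higher lymphocyte percentages are associated with lower odds. Because the abstract does not report the intercept, these values alone do not yield absolute predicted probabilities.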
Conclusions Machine learning algorithms identified CA125, PREM, and LY% as key factors associated with OV. The nomogram risk prediction model built on these factors shows high predictive value for OV.