#0295
Enhancing Individual Survival Prediction in Upper Tract Urothelial Carcinoma After Radical Nephroureterectomy Using Machine and Deep Learning AI Models
C. Tsai1,2, Y. Shih1, J. Liu3, Y. Tsai4,5
1Far Eastern Memorial Hospital, Divisions of Urology, Department of Surgery, New Taipei City, Taiwan
2Yuan Ze University, Department of Electrical Engineering, Taoyuan City, Taiwan
3National Chung Cheng University, Department of Communication, Chiayi City, Taiwan
4Taipei Tzuchi Hospital, The Buddhist Tzu Chi Medical Foundation, Division of Urology, Department of Surgery, New Taipei City, Taiwan
5Buddhist Tzu Chi University, School of Medicine, Hualien City, Taiwan
Introduction:
Uro-oncologists often face the challenge of predicting long-term prognosis for patients. This study aimed to develop machine and deep learning algorithms to optimize the prediction of overall survival (OS) and cancer-specific survival (CSS) after radical nephroureterectomy (RNU) for upper tract urothelial carcinoma (UTUC).
Material and methods:
We used data from the nationwide, multicenter Taiwan UTUC Collaboration Group registry, which included oncological outcomes for 3,910 patients who underwent RNU between 1988 and 2023 at hospitals across Taiwan. A total of 54 clinicopathological variables were collected as model features. The cohort was randomly divided into a development cohort (n = 3,128; 80%) and a test cohort (n = 782; 20%). Four time-to-event survival algorithms were used for model development: Cox Proportional Hazards (CoxPH), Random Survival Forest (RSF), eXtreme Gradient Boosting Survival Embeddings (XGBSE), and DeepSurv. Model performance was then validated on the held-out test cohort.
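The random 80/20 partition described above can be sketched as follows. This is a minimal illustration only: the function name `split_cohort`, the fixed seed, and the use of integer patient IDs are assumptions made for the example, and the abstract does not state whether the actual split was stratified or how it was seeded.

```python
import random


def split_cohort(patient_ids, test_fraction=0.2, seed=0):
    """Randomly partition patient IDs into (development, test) cohorts.

    Hypothetical helper: a plain shuffle-and-cut split, with no
    stratification by outcome or center.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]


# 3,910 registry patients, represented here by placeholder integer IDs.
dev_ids, test_ids = split_cohort(range(3910))
print(len(dev_ids), len(test_ids))
```

With 3,910 patients and a 20% test fraction, this yields cohort sizes of 3,128 and 782, matching the numbers reported above.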
Results:
There were no statistically significant differences between the development and test cohorts in any clinicopathological feature or in survival, the latter assessed by the log-rank test. Model performance varied. For CSS, the RSF model achieved the highest concordance index (C-index) among the models tested: 0.826 (95% CI: 0.814-0.838) in the development cohort and 0.816 (95% CI: 0.791-0.840) in the test cohort. The XGBSE model showed a significant drop in performance in the test cohort, with a C-index of 0.725 (95% CI: 0.698-0.761), suggesting overfitting. For OS, RSF again performed best, with a C-index of 0.765 (95% CI: 0.751-0.779) in the development cohort and 0.760 (95% CI: 0.732-0.788) in the test cohort, whereas CoxPH yielded a lower test-cohort C-index of 0.706 (95% CI: 0.674-0.737). For both CSS and OS, the non-linear RSF models significantly outperformed the traditional CoxPH model in the test cohort, with C-indices of 0.816 vs. 0.793 for CSS and 0.760 vs. 0.706 for OS. The performance of the deep learning-based DeepSurv model was comparable to that of CoxPH.
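For readers unfamiliar with the metric, the sketch below shows one common way a C-index and its 95% confidence interval can be computed: Harrell's concordance over comparable pairs, with a percentile bootstrap for the interval. The abstract does not say how the reported intervals were derived, so the bootstrap is an assumption, as are the function names and the toy data; tied event times are simply skipped here for brevity.

```python
import random


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when subject i had an observed event
    strictly before time j. It is concordant when i also has the higher
    predicted risk; tied risks count as 0.5.
    """
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # censored subjects cannot be the earlier member
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def bootstrap_ci(times, events, risks, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the C-index."""
    if not any(events):
        raise ValueError("at least one observed event is required")
    rng = random.Random(seed)
    n = len(times)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        try:
            stats.append(concordance_index(
                [times[k] for k in idx],
                [events[k] for k in idx],
                [risks[k] for k in idx],
            ))
        except ValueError:
            continue  # resample had no comparable pairs; draw again
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data (not from the registry): follow-up times, event flags, risk scores.
times = [5, 8, 12, 3, 9, 15]
events = [1, 0, 1, 1, 0, 1]
risks = [0.9, 0.4, 0.3, 0.8, 0.5, 0.1]

c_index = concordance_index(times, events, risks)
ci_low, ci_high = bootstrap_ci(times, events, risks)
print(round(c_index, 3), (round(ci_low, 3), round(ci_high, 3)))
```

On the toy data, 9 of the 10 comparable pairs are ordered correctly, giving a C-index of 0.9. A value of 0.5 corresponds to random ordering and 1.0 to perfect ranking, which is the scale on which the RSF and CoxPH results above should be read.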