Predicting clinical outcomes in liver cirrhosis using machine learning and data balancing technique

Document Type : Original Article

Authors

1 Department of Mathematics and Statistics, The University of Toledo, Toledo, OH, 43606, USA

2 Department of Mathematics and Statistics, Stephen F. Austin State University, Nacogdoches, TX, 75965, USA

3 Department of Mathematics, Utah Tech University, Saint George, UT, 84770, USA

DOI: 10.21608/cjmss.2025.397747.1213

Abstract

Liver cirrhosis is a chronic, life-threatening disease that significantly impairs liver function and overall patient health. Early prediction of clinical outcomes in cirrhotic patients can support timely intervention and better treatment planning. In this study, a dataset containing real-world clinical, biochemical, and demographic data from cirrhosis patients was used to develop predictive models that classify patient outcomes into three categories: alive, deceased, and liver transplant. Fifteen machine learning algorithms were implemented under three scenarios: the original dataset with all rows containing missing values dropped, the original dataset with standard data imputation, and a balanced dataset generated through data standardization and the SMOTE oversampling technique. SMOTE was applied to address class imbalance and improve the models' ability to learn from underrepresented outcomes. Experimental results indicate that the Extra Trees classifier achieved the highest predictive performance, with an accuracy of 85.00%, an AUC of 94.36%, and an F1 score of 84.75% on the balanced (SMOTE) dataset. These findings underscore the importance of data balancing and model selection in improving outcome prediction in liver disease.
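To make the balancing step concrete, the following is a minimal sketch of SMOTE-style oversampling written with the Python standard library only. It is a simplified illustration, not the authors' implementation: the function name `smote_oversample`, the neighbour count `k`, and the toy two-feature data are all assumptions introduced here, and the features are assumed to be already standardized with no missing values. For each under-represented class, a synthetic point is placed at a random position on the line segment between an existing sample and one of its nearest neighbours from the same class.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, k=5, seed=0):
    """Add synthetic samples until every class matches the largest class.

    X: list of numeric feature lists (assumed standardized, no missing values).
    y: list of class labels, same length as X.
    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out = [list(row) for row in X]
    y_out = list(y)

    for label, count in counts.items():
        members = [row for row, lab in zip(X, y) if lab == label]
        if count == target or len(members) < 2:
            # Majority class, or too few points to interpolate between.
            continue
        for _ in range(target - count):
            base = rng.choice(members)
            # Up to k nearest same-class neighbours of the base point.
            others = [m for m in members if m is not base]
            others.sort(key=lambda m: math.dist(base, m))
            neighbour = rng.choice(others[:k])
            gap = rng.random()  # position along the segment base -> neighbour
            synthetic = [b + gap * (n - b) for b, n in zip(base, neighbour)]
            X_out.append(synthetic)
            y_out.append(label)
    return X_out, y_out


# Toy, made-up data with the three outcome labels named in the abstract.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.2, 0.2], [0.0, 0.3],
     [2.0, 2.1], [2.2, 1.9], [2.1, 2.3],
     [-2.0, 1.0], [-1.8, 1.2]]
y = ["alive"] * 6 + ["deceased"] * 3 + ["transplant"] * 2

X_bal, y_bal = smote_oversample(X, y, k=2)
balanced_counts = Counter(y_bal)
print(balanced_counts)
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the data used to report accuracy, AUC, or F1.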
