Timely detection of Chronic Kidney Disease (CKD) is important for improving patient outcomes and reducing the burden of end-stage renal failure. Machine learning (ML) techniques offer promising tools for early and accurate prediction of CKD by leveraging clinical, demographic, and lifestyle data. This study aimed to identify the most relevant clinical, demographic, and lifestyle indicators of CKD, assess the predictive performance of several ML models, and enhance model interpretability through explainable AI techniques. The study used a balanced dataset produced with the Random Over-Sampling Examples (ROSE) technique, which addressed the inherent class imbalance between CKD and non-CKD cases. Feature selection followed a hybrid approach that combined Recursive Feature Elimination (RFE) with Random Forest importance metrics to identify the most influential predictors. Five ML models were trained and evaluated: Logistic Regression, Random Forest, Support Vector Machine (SVM), Decision Tree, and XGBoost. Performance was measured using accuracy, sensitivity, specificity, the Kappa statistic, and the Area Under the Receiver Operating Characteristic Curve (AUC). Model interpretability was further examined through Shapley Additive Explanations (SHAP) analysis. Among the models tested, XGBoost attained the highest testing accuracy (97.79%) and AUC (0.9979), followed closely by Random Forest. SHAP analysis revealed that clinical markers such as Serum Creatinine, Glomerular Filtration Rate (GFR), Protein in Urine, and Fasting Blood Sugar were the most significant contributors to model predictions. Interpretability assessments confirmed that model outputs were consistent with clinical knowledge of CKD risk factors.
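As an illustration of the threshold-based evaluation metrics named above, the following minimal Python sketch computes accuracy, sensitivity, specificity, and Cohen's Kappa from binary predictions. It is not the study's code: the function name `binary_metrics` and the example labels are hypothetical, CKD is assumed to be coded as 1 and non-CKD as 0, and AUC is left out because it requires predicted scores rather than hard class labels.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity, and Cohen's Kappa.

    Assumes binary labels where 1 = CKD (positive) and 0 = non-CKD.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    # Confusion-matrix cells
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    # Sensitivity: share of true CKD cases that were detected
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of non-CKD cases correctly ruled out
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")

    # Cohen's Kappa: observed agreement corrected for chance agreement
    p_observed = accuracy
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (
        (p_observed - p_expected) / (1 - p_expected)
        if p_expected != 1
        else float("nan")
    )

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Hypothetical labels for illustration only (not data from the study)
example_true = [1, 1, 1, 1, 0, 0, 0, 0]
example_pred = [1, 1, 1, 0, 0, 0, 0, 1]
print(binary_metrics(example_true, example_pred))
```

Kappa is reported alongside accuracy because it discounts the agreement a classifier would reach by chance, which makes it a useful complement when class proportions affect how easy it is to score well on accuracy alone.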
Machine learning models, particularly ensemble methods such as XGBoost and Random Forest, can reliably predict chronic kidney disease when combined with effective feature selection and data balancing approaches. Incorporating model interpretability techniques such as SHAP values promotes transparency and fosters trust in predictive analytics for clinical applications. To improve early CKD detection and management, future research should integrate these models with clinical decision support systems and perform external validation.
Chronic Kidney Disease, Machine Learning, XGBoost, Random Forest, SHAP, Feature Selection, Data Balancing, Predictive Modeling, Healthcare Analytics.
International Journal of Trend in Scientific Research and Development (IJTSRD), online ISSN 2456-6470.