Diabetes Prediction Using Logistic Regression Machine Learning Algorithm
Abstract
Diabetes is a serious global health problem and a growing concern in Nepal because of its high risk of mortality and other complications. This study develops an early prediction model using logistic regression, a classification technique widely applied in clinical research. The model was implemented in Python with data from the Pima Indians Diabetes Database, which contains 768 patient records, each with eight independent features and one outcome variable. Exploratory data analysis was performed to extract insights and visualize trends in the dataset. To address class imbalance, the Synthetic Minority Oversampling Technique (SMOTE) was applied to generate synthetic samples for the minority class. Evaluated with a confusion matrix, the model achieved an accuracy of 77%, precision of 75%, recall of 77%, and an F1-score of 76%. To further improve performance, hyperparameter tuning was conducted using grid search, after which the model reached an accuracy of 82%. These findings suggest that logistic regression, supported by data preprocessing, resampling, and hyperparameter optimization, can serve as an effective tool for early detection of diabetes, supporting timely intervention and improved healthcare outcomes.
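Two steps named in the abstract, SMOTE oversampling and confusion-matrix evaluation, can be illustrated with a minimal standard-library sketch. This is not the study's implementation: the function names `smote_like` and `classification_metrics`, the toy two-feature points, and the confusion-matrix counts are hypothetical and chosen only for illustration. The sketch assumes the core SMOTE idea of interpolating between a minority sample and one of its nearest minority-class neighbours.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (core SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical minority-class points (two features each), not real patient data.
minority_points = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0], [2.5, 2.5]]
new_points = smote_like(minority_points, n_new=6)

# Hypothetical confusion-matrix counts, not the study's results.
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
print(len(new_points), metrics)
```

Because every synthetic point lies on a segment between two existing minority samples, oversampling adds no values outside the observed minority range. In practice, resampling is usually applied to the training split only, so that the test metrics reflect real records.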
Keywords: Personal identity, identity crisis, identity confusion, psychological dimensions, emotional processes, cognitive processes, societal expectations, psychological resilience, identity change, individual perceptions, psychological flexibility, life transitions, sense of self, psychological well-being
Author(s): Prof. Dr. Kursat Sahin Yildirimer1*, Ocak Korhan Özduru2
Email: ruponsarkar108@gmail.com
DOI: https://doi.org/10.58924/rjhss.v4.iss1.p1