Heart disease prediction, Machine learning, Logistic regression, Random forest, XGBoost
Abstract
Heart disease is the leading cause of death worldwide, and early detection followed by prompt care may reduce mortality. This project aims to deploy a machine learning model that predicts heart disease using publicly available datasets from the University of California, Irvine (UCI) Machine Learning Repository. Four machine learning algorithms (decision trees, logistic regression, random forests, and XGBoost) were used to identify patterns and risk factors associated with heart disease. The models were evaluated on four key performance metrics: accuracy, precision, recall, and F1-score. XGBoost performed best among the models evaluated, with a precision of 89% and an F1-score of 0.87. These findings highlight the role machine learning can play in improving the prediction of cardiovascular disease and in enabling earlier diagnosis. Such predictive tools could help healthcare providers move toward more personalized, preventive care and better patient outcomes.
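For readers less familiar with the four evaluation metrics named above, the following is a minimal sketch of how they can be derived from a binary confusion matrix. It is not the project's evaluation code: the function name `classification_metrics` and the label lists are hypothetical and chosen purely for illustration, with 1 assumed to mean "heart disease present" and 0 "absent".

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for illustration only; these are not results from the study.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this toy example there are three true positives, three true negatives, one false positive, and one false negative, so all four metrics come out to 0.75. In a screening setting, recall is often watched most closely, because a false negative corresponds to a missed case of disease.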