Temporal Evolution of Predictive Factors for Heart Disease: A Random Forest Analysis

Authors

  • Gema Zhu

DOI:

https://doi.org/10.61173/rfp4zs80

Keywords:

Feature importance, random forest, disease prediction, machine learning

Abstract

Heart disease remains the leading cause of death globally, profoundly affecting patients’ quality of life and placing a significant burden on healthcare systems and society. Identifying and understanding the key factors associated with heart disease is essential for its prevention, diagnosis, and treatment. This study examines how the importance of these factors has changed over time, using data from the Behavioral Risk Factor Surveillance System (BRFSS) from 2015 to 2021. It focuses on lifestyle and demographic variables for non-institutionalized adults aged 18 and older; for each year, 36 relevant variables were selected from an initial pool of more than 300 through rigorous data cleaning and normalization. A random forest algorithm was then used to evaluate feature importance across the years. The findings consistently identify BMI, Income, Age, General Health, Education, and Smoking as pivotal predictors of myocardial infarction (MI) and coronary heart disease (CHD). Although High Cholesterol and Arthritis appeared in the top ten features only once during the four years, they maintained a notable presence within the top fifteen, indicating a significant but secondary role compared with the consistently prominent factors. This variability suggests that while some factors retain stable importance, others vary in relevance with changing health trends and dataset characteristics.
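The top-ten versus top-fifteen comparison described in the abstract can be illustrated with a minimal sketch. It assumes that per-year feature importance scores are already available as dictionaries; the helper names (top_k, top_k_counts), the year labels, and every score below are hypothetical placeholders chosen for illustration, not values or code from the study.

```python
from collections import Counter


def top_k(importances, k):
    """Return the k feature names with the largest importance scores."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return ranked[:k]


def top_k_counts(importances_by_year, k):
    """Count, for each feature, the number of years it ranks in the top k."""
    counts = Counter()
    for importances in importances_by_year.values():
        counts.update(top_k(importances, k))
    return counts


# Hypothetical placeholder scores, NOT results reported by the study.
toy = {
    "year_1": {"BMI": 0.30, "Age": 0.25, "Income": 0.20,
               "Arthritis": 0.15, "Smoking": 0.10},
    "year_2": {"BMI": 0.28, "Age": 0.26, "Income": 0.22,
               "Smoking": 0.14, "Arthritis": 0.10},
}

stable = top_k_counts(toy, k=3)  # features that stay in a narrow top set
wider = top_k_counts(toy, k=4)   # features that enter once the cutoff widens
```

With real data, k would be set to 10 and 15, and a feature whose count is low at the smaller cutoff but high at the larger one would correspond to the "significant but secondary" pattern the abstract describes.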

Published

2024-12-31

Section

Articles