Objectives: This study examined patterns of clustering of intermediate risk factors for cardiovascular diseases (CVDs) in a rural South African population of middle-aged and older adults and assessed their associations with socio-demographic and lifestyle factors.
Study design: Cross-sectional analysis of secondary data.
Methods: We applied unsupervised machine learning clustering algorithms to data from a sample of 5059 men and women aged 40+ years from the Health and Aging in Africa: A Longitudinal Study of an INDEPTH Community in Rural South Africa (HAALSI) to identify natural clusters of intermediate risk factors for CVDs (body mass index, waist-to-hip ratio, total cholesterol, LDL, HDL, systolic blood pressure, diastolic blood pressure, blood glucose and triglycerides). Logistic regression models were used to assess the associations between cluster membership and socio-demographic factors (sex, age, education, wealth status) and lifestyle factors (smoking, alcohol use, fruit and vegetable intake and physical activity).
Results: The clustering algorithms identified two distinct subgroups of intermediate risk factors for CVD in the cohort: one with optimal biomarker levels, and another comprising mostly individuals with biomarker readings above optimal thresholds. The regression analysis showed higher odds of belonging to the high-risk cluster among females and with increasing age, education, and wealth. Contrary to expectations, non-smokers, non-drinkers, and those consuming at least 5 weekly fruit/vegetable servings had higher odds of being in the high-risk cluster.
Conclusion: The analysis revealed distinct clustering of CVD risk profiles, indicating the need for targeted screening and context-specific prevention strategies to address hidden cardiometabolic risk in seemingly low-risk groups.
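The two-step approach described in the Methods (cluster the biomarkers, then relate cluster membership to covariates) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the abstract does not name the clustering algorithms, so a plain two-cluster k-means is swapped in; the data are synthetic (only body mass index and systolic blood pressure, with invented means and spreads); the `female` covariate is hypothetical; and an unadjusted odds ratio from a 2x2 table stands in for the logistic regression models.

```python
import math
import random


def standardize(rows):
    """Z-score each column so no single biomarker dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def two_means(points, iters=100):
    """Two-cluster k-means (assumed stand-in for the study's algorithms).

    Initial centers: the first point and the point farthest from it.
    """
    first = points[0]
    second = max(points, key=lambda p: math.dist(p, first))
    centers = [first, second]
    labels = []
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [0 if math.dist(p, centers[0]) <= math.dist(p, centers[1]) else 1
                  for p in points]
        # Move each center to the mean of its members.
        updated = []
        for j in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == j]
            updated.append([sum(c) / len(members) for c in zip(*members)]
                           if members else centers[j])
        if updated == centers:  # converged
            break
        centers = updated
    return labels


def odds_ratio(exposed, in_cluster):
    """Unadjusted odds ratio from a 2x2 table (exposure by cluster membership)."""
    pairs = list(zip(exposed, in_cluster))
    a = sum(1 for e, c in pairs if e and c)          # exposed, in cluster
    b = sum(1 for e, c in pairs if e and not c)      # exposed, not in cluster
    c_ = sum(1 for e, c in pairs if not e and c)     # unexposed, in cluster
    d = sum(1 for e, c in pairs if not e and not c)  # unexposed, not in cluster
    if b == 0 or c_ == 0:
        raise ValueError("odds ratio undefined: empty off-diagonal cell")
    return (a * d) / (b * c_)


# Synthetic data (invented values): [BMI, systolic BP] for two latent groups.
rng = random.Random(42)
truth = [0] * 100 + [1] * 100
data = ([[rng.gauss(23, 2), rng.gauss(115, 8)] for _ in range(100)]
        + [[rng.gauss(32, 2), rng.gauss(150, 8)] for _ in range(100)])

labels = two_means(standardize(data))

# Hypothetical binary covariate, made more common in the second latent group.
female = [1 if rng.random() < (0.7 if t else 0.4) else 0 for t in truth]
print("odds ratio (female vs. cluster 1):", odds_ratio(female, labels))
```

With a single binary predictor, the odds ratio from the 2x2 table equals the exponentiated coefficient of an unadjusted logistic regression; adjusting for several covariates at once, as the study's models presumably do, would require fitting the full model.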
Journal article
2026-05-01T00:00:00+00:00
254