MACHINE LEARNING TO DETECT ALZHEIMER'S DISEASE WITH DATA ON DRUGS AND DIAGNOSES
Johanna Wallensten, Caroline Wachtler, Nenad Bogdanovic, Anna Olofsson, Miia Kivipelto, Linus Jönsson, Predrag Petrovic, Axel C. Carlsson
BACKGROUND: Integrating machine learning with medical records offers potential for early detection of Alzheimer's disease (AD), enabling timely interventions.
OBJECTIVES: This study aimed to evaluate the effectiveness of machine learning in constructing a model that predicts AD with data up to three years before diagnosis. Using clinical data, including prior diagnoses and medical treatments, we sought to enhance sensitivity and specificity in diagnostic procedures. A second aim was to identify the most important factors in the machine learning models, as these may be important predictors of AD.
DESIGN: The study employed Stochastic Gradient Boosting, a machine learning method, to identify diagnoses predictive of AD using primary healthcare data. The analyses were stratified by sex and age groups.
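As an illustration of the method named above, the following is a minimal sketch of stochastic gradient boosting for a binary outcome, not the authors' implementation. The abstract does not state the software, features, or hyperparameters used, so everything here is an assumption: the toy data, the function names (`fit_stump`, `fit_sgb`, `predict_proba`), the use of one-split stumps on binary indicators, and the values of `n_rounds`, `learning_rate`, and `subsample`. Leaf values are plain residual means, a simplification of the usual leaf update.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals, rows):
    """Choose the binary feature whose split gives the lowest squared error on `rows`."""
    best = None
    for j in range(len(X[0])):
        groups = {0: [], 1: []}
        for i in rows:
            groups[X[i][j]].append(residuals[i])
        means = {v: (sum(g) / len(g) if g else 0.0) for v, g in groups.items()}
        sse = sum((residuals[i] - means[X[i][j]]) ** 2 for i in rows)
        if best is None or sse < best[0]:
            best = (sse, j, means[0], means[1])
    _, j, left, right = best
    return j, left, right

def fit_sgb(X, y, n_rounds=50, learning_rate=0.1, subsample=0.5, seed=0):
    """Boost stumps on log-loss residuals; each round sees a random subsample."""
    rng = random.Random(seed)
    n = len(X)
    base_rate = sum(y) / n
    f0 = math.log(base_rate / (1.0 - base_rate))  # initial log-odds
    scores = [f0] * n
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of the log-loss: observed label minus predicted probability.
        residuals = [y[i] - sigmoid(scores[i]) for i in range(n)]
        # The "stochastic" part: fit each stump on a subsample drawn without replacement.
        rows = rng.sample(range(n), max(1, int(subsample * n)))
        j, left, right = fit_stump(X, residuals, rows)
        stumps.append((j, left, right))
        for i in range(n):
            scores[i] += learning_rate * (right if X[i][j] else left)
    return f0, learning_rate, stumps

def predict_proba(model, x):
    f0, lr, stumps = model
    score = f0 + sum(lr * (right if x[j] else left) for j, left, right in stumps)
    return sigmoid(score)

# Hypothetical toy data: three binary indicators (e.g. "diagnosis present"),
# with the outcome agreeing with the first indicator about 90% of the time.
data_rng = random.Random(1)
X = [[data_rng.randint(0, 1) for _ in range(3)] for _ in range(200)]
y = [x[0] if data_rng.random() < 0.9 else 1 - x[0] for x in X]

model = fit_sgb(X, y)
p_with = predict_proba(model, [1, 0, 0])
p_without = predict_proba(model, [0, 0, 0])
```

In this toy setting the fitted model assigns a higher probability when the first indicator is present, which is the kind of signal a variable-importance ranking would surface.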
SETTING: The study included individuals within Region Stockholm, Sweden, using medical records from 2010 to 2022.
PARTICIPANTS: The study analyzed clinical data for individuals over the age of 40. Patients with an AD diagnosis (ICD-10-SE codes F00 or G30) during 2010–2012 were excluded to ensure prospective modeling. In total, AD was identified in 3,407 patients aged 41–69 years and 25,796 patients aged over 69.
MEASUREMENTS: The machine learning model ranked predictive diagnoses, with performance assessed by the area under the receiver operating characteristic curve (AUC). Known and novel predictors were evaluated for their contribution to AD risk.
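For readers unfamiliar with the AUC, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen case receives a higher score than a randomly chosen non-case, with ties counted as one half. The function name `auc` and the example scores are illustrative assumptions, not data from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

# Hypothetical scores: two cases (label 1) and two non-cases (label 0).
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])  # 3 of 4 pairs ordered correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.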
RESULTS: AUC values ranged from 0.748 (women aged 41–69) to 0.816 (women over 69), with men across age groups falling within this range.
Sensitivity and specificity ranged from 0.73 to 0.79 and 0.66 to 0.79, respectively, across age and sex groups. Negative predictive values were consistently high (≥0.954), while positive predictive values were lower (0.199–0.351).
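The gap between high negative and lower positive predictive values is expected whenever the outcome is uncommon in the evaluated population. The sketch below applies Bayes' rule to show this. The prevalence of 0.10 is a hypothetical value chosen for illustration (the abstract does not report prevalence), and the function names are assumptions.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(disease | positive test) by Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

def negative_predictive_value(sensitivity, specificity, prevalence):
    """P(no disease | negative test) by Bayes' rule."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)

# Upper-end sensitivity and specificity from the abstract; prevalence is hypothetical.
ppv = positive_predictive_value(0.79, 0.79, 0.10)
npv = negative_predictive_value(0.79, 0.79, 0.10)
```

With these inputs most positive flags are false alarms while a negative result is rarely wrong, which fits a screening or risk-stratification role better than a confirmatory one.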
Additionally, we confirmed known risk factors as predictors and identified novel predictors that warrant further investigation. Key predictors included medical observations, cognitive symptoms, antidepressant treatment, visit frequency, and vitamin B12/folic acid treatment.
CONCLUSIONS: Machine learning applied to clinical data shows promise in predicting AD, with robust model performance across age and sex groups. The findings confirmed known risk factors, such as depression and vitamin B12 deficiency, while also identifying novel predictors that may guide future research. Clinically, this approach could enhance early detection and risk stratification, facilitating timely interventions and improving patient outcomes.
CITATION:
Johanna Wallensten; Caroline Wachtler; Nenad Bogdanovic; Anna Olofsson; Miia Kivipelto; Linus Jönsson; Predrag Petrovic; Axel C. Carlsson (2025): Machine learning to detect Alzheimer's disease with data on drugs and diagnoses. The Journal of Prevention of Alzheimer’s Disease (JPAD). https://doi.org/10.1016/j.tjpad.2025.100115