This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Forecasting Alzheimer’s Disease Progression with Deep Multimodal Learning: Integration of 3D MRI and Tabular Clinical Records via a Large Vision-Language Model
Citations: 0
Authors: 7
Year: 2026
Abstract
Background: Accurate forecasting of Alzheimer’s Disease (AD) progression is critical for personalized patient management and clinical trial stratification. However, current predictive models often struggle to effectively integrate high-dimensional neuroimaging with longitudinal clinical data. We introduce AD-LLaVA-3D, a novel multimodal framework designed to bridge this gap by adapting large vision-language models for volumetric and temporal forecasting.

Methods: We leveraged the LLaVA-NeXT-Video architecture to treat 3D MRI volumes as temporal sequences, enabling the model to process volumetric imaging alongside longitudinal Tabular Clinical Records (TCR). The model was trained on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohort (n=764) and evaluated using a rigorous patient-level split. We assessed its ability to forecast a suite of future clinical indicators (e.g., CDR-SB, MMSE) against traditional machine learning baselines (Lasso, Random Forest, Gradient Boosting) and specialized deep learning models (ResNet-3D, Med-Flamingo).

Results: AD-LLaVA-3D demonstrated superior predictive accuracy on the ADNI test set, achieving a Coefficient of Determination (R²) of 0.68 for the critical CDR-SB score, surpassing the best-performing baseline (R² = 0.66). Crucially, in an independent external validation on the Open Access Series of Imaging Studies (OASIS) cohort (n=76), our model exhibited exceptional generalization (R² = 0.82, MSE = 0.54), whereas comparison models showed significant performance degradation (R² < 0.60).

Conclusions: This study presents the first application of a video-based multimodal architecture for AD progression forecasting. By effectively integrating 3D MRI with tabular clinical records, AD-LLaVA-3D offers a robust, generalizable tool for monitoring disease trajectories, significantly advancing predictive capabilities beyond current unimodal or static methods.
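The abstract reports a patient-level split and scores models by R² and MSE. The following is a minimal sketch of what those two ideas mean in practice, not the authors' implementation: the function names, the `patient_id` field, and the default split fraction are all hypothetical choices made for illustration.

```python
import random


def patient_level_split(records, test_fraction=0.2, seed=0):
    """Split visit records so that no patient appears in both sets.

    `records` is a list of dicts with a hypothetical "patient_id" key.
    Splitting by patient (not by visit) avoids leaking a patient's
    earlier scans into the test set.
    """
    patient_ids = sorted({r["patient_id"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(patient_ids)
    n_test = max(1, round(len(patient_ids) * test_fraction))
    test_ids = set(patient_ids[:n_test])
    train = [r for r in records if r["patient_id"] not in test_ids]
    test = [r for r in records if r["patient_id"] in test_ids]
    return train, test


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted scores."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Under this definition, a model that always predicts the mean of the observed scores has R² = 0, and a perfect model has R² = 1, which gives a reference scale for the reported values.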
Highlights

First-in-Class Architecture: We introduce the first application of video-based Large Vision-Language Models (LVLMs) to interpret 3D volumetric MRI as a temporal sequence, capturing longitudinal neurodegeneration more effectively than static 3D-CNNs.

Robust External Validation: The model achieved superior predictive accuracy (R² = 0.82) on an independent external cohort (OASIS), demonstrating exceptional generalization beyond the training population (ADNI).

Data-Efficient Multimodal Integration: We developed a novel prompting strategy that integrates sparse Tabular Clinical Records (TCR) without artificial imputation, allowing the model to leverage incomplete real-world medical history.

Clinical Trial Enrichment: By accurately forecasting future cognitive scores (CDR-SB, MMSE), AD-LLaVA-3D serves as a precise screening tool to identify "rapid progressors" for clinical trials, potentially reducing failure rates in drug development.
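The highlights describe a prompting strategy that passes sparse tabular records to the model without imputation. The paper's actual prompt format is not given on this page, so the sketch below only illustrates the general idea under stated assumptions: missing measurements are omitted from the text rather than filled in. The function name, the `month` key, and the output layout are hypothetical.

```python
def records_to_prompt(visits, fields=("CDR-SB", "MMSE")):
    """Serialize longitudinal visits into text, skipping missing values.

    `visits` is a list of dicts with a hypothetical "month" key plus any
    subset of `fields`. A value that is absent or None is left out of
    the prompt instead of being imputed.
    """
    lines = []
    for visit in visits:
        parts = [
            f"{name}={visit[name]}"
            for name in fields
            if visit.get(name) is not None
        ]
        if parts:  # drop visits that carry no measurements at all
            lines.append(f"Month {visit['month']}: " + ", ".join(parts))
    return "\n".join(lines)


# Usage: the month-6 CDR-SB score is missing and is simply omitted.
example = records_to_prompt([
    {"month": 0, "CDR-SB": 1.5, "MMSE": 27},
    {"month": 6, "CDR-SB": None, "MMSE": 26},
])
```

Omitting a field, rather than substituting a mean or a carried-forward value, keeps the text faithful to what was actually recorded for the patient.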