This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Not the Models You Are Looking For: Traditional ML Outperforms LLMs in Clinical Prediction Tasks

2024 · 5 citations · medRxiv · Open Access


Abstract

Objectives: To determine the extent to which current Large Language Models (LLMs) can substitute for traditional machine learning (ML) as clinical predictors on electronic health record (EHR) data, we investigated factors that can affect their adoption: overall performance, calibration, fairness, and resilience to privacy protections that reduce data fidelity.

Materials and Methods: We evaluated GPT-3.5, GPT-4, and traditional ML (gradient-boosting trees) on clinical prediction tasks in EHR data from Vanderbilt University Medical Center (VUMC) and MIMIC IV. We measured predictive performance with AUROC and model calibration with the Brier score. To evaluate the impact of data privacy protections, we assessed AUROC when demographic variables are generalized. We evaluated algorithmic fairness using equalized odds and statistical parity across patient race, sex, and age. We also considered the impact of in-context learning, in which labeled examples are incorporated into the prompt.

Results: Traditional ML (AUROC: 0.847 on VUMC, 0.894 on MIMIC) substantially outperformed GPT-3.5 (AUROC: 0.537, 0.517) and GPT-4 (AUROC: 0.629, 0.602), with and without in-context learning, in both predictive performance and output probability calibration (Brier score, ML versus GPT-3.5 versus GPT-4: 0.134 versus 0.384 versus 0.251; 0.042 versus 0.06 versus 0.219). Traditional ML is more robust than GPT-3.5 and GPT-4 to the generalization of demographic information for privacy protection. GPT-4 is the fairest model according to our selected metrics, but at the cost of poor model performance.

Conclusion: These findings suggest that LLMs are much less effective and robust than locally trained ML for clinical prediction tasks, but they are getting better over time.
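The metrics named in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' evaluation code: the function names, the toy labels and probabilities, the 0.5 decision threshold, the "largest gap between groups" summaries for the fairness metrics, and the 10-year age bins are all assumptions made purely for illustration.

```python
from itertools import product


def auroc(y_true, y_score):
    """Share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared gap between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def _rate(flags):
    """Fraction of 1s in a list of 0/1 flags (0.0 for an empty list)."""
    return sum(flags) / len(flags) if flags else 0.0


def statistical_parity_difference(y_pred, groups):
    """Largest gap in positive-prediction rate between any two groups."""
    rates = [_rate([p for p, g in zip(y_pred, groups) if g == name])
             for name in set(groups)]
    return max(rates) - min(rates)


def equalized_odds_difference(y_true, y_pred, groups):
    """Larger of the between-group gaps in true- and false-positive rate."""
    tprs, fprs = [], []
    for name in set(groups):
        rows = [(y, p) for y, p, g in zip(y_true, y_pred, groups) if g == name]
        tprs.append(_rate([p for y, p in rows if y == 1]))
        fprs.append(_rate([p for y, p in rows if y == 0]))
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))


def generalize_age(age, width=10):
    """Hypothetical generalization: replace an exact age with a bin label."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


# Toy data (invented): outcomes, predicted probabilities, and a group attribute.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.8, 0.3]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
y_pred = [int(p >= 0.5) for p in y_prob]  # assumed 0.5 threshold

print("AUROC:", auroc(y_true, y_prob))
print("Brier score:", brier_score(y_true, y_prob))
print("Statistical parity difference:", statistical_parity_difference(y_pred, groups))
print("Equalized odds difference:", equalized_odds_difference(y_true, y_pred, groups))
print("Generalized age:", generalize_age(67))
```

AUROC and the Brier score are computed from predicted probabilities, whereas the two fairness summaries need thresholded predictions, so the choice of threshold affects the fairness numbers but not the other two.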

Topics

Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare · Privacy-Preserving Technologies in Data
