This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Language Models Lag Behind: Inconsistent Alignment with Recent Removal of Race from Clinical Algorithms
Citations: 0
Authors: 2
Year: 2025
Abstract
BACKGROUND: Several medical societies have revised commonly used clinical algorithms to remove race, a social construct, as a predictor. Given the growing adoption of large language models (LLMs), we assessed whether LLMs trained since October 2023 are aware of the recent removal of race from lung function reference equations and from the equations used to estimate glomerular filtration rate (eGFR).

METHODS: We evaluated 12 language models, both state-of-the-art and lightweight, from OpenAI, Meta, Microsoft, and DeepSeek. We used various prompts to ask the LLMs to identify the latest FEV1 and eGFR reference models and their input parameters, as recommended by the European Respiratory Society (ERS)/American Thoracic Society (ATS) and the American Society of Nephrology (ASN), respectively. We ran each prompt-model combination five times with default parameters and compared the model outputs with the medical societies' recommendations.

RESULTS: Regardless of model size and reasoning capability, fewer than 3% of model outputs correctly identified the removal of race from lung function equations. In contrast, 100% of outputs from state-of-the-art models and 25% of outputs from lightweight models correctly identified the removal of race from eGFR (Figure). No statistically significant difference was observed between open-source and proprietary models.

CONCLUSIONS: The evaluated LLMs consistently failed to identify the latest ERS and ATS recommendations.

[Figure F1 (object ID erj;66/suppl_69/PA2023/F1): graphic not available]
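The protocol in METHODS (each prompt-model combination run five times, each output compared against the societies' recommendations) can be illustrated with a minimal Python sketch. This is not the authors' code: `score_outputs`, `fake_ask`, and `mentions_removal` are hypothetical names, the stubbed model replies are invented placeholders, and the keyword check stands in for whatever comparison the authors actually used.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable


def score_outputs(
    models: Iterable[str],
    prompts: Iterable[str],
    ask: Callable[[str, str], str],
    is_correct: Callable[[str], bool],
    runs: int = 5,
) -> Dict[str, float]:
    """Return, per model, the fraction of outputs judged correct."""
    prompt_list = list(prompts)
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for model in models:
        for prompt in prompt_list:
            # Repeat each prompt-model combination `runs` times.
            for _ in range(runs):
                answer = ask(model, prompt)
                correct[model] += int(is_correct(answer))
                total[model] += 1
    return {model: correct[model] / total[model] for model in total}


# Hypothetical stand-ins (assumptions, not real model calls or real outputs).
def fake_ask(model: str, prompt: str) -> str:
    if model == "model-a":
        return "Race is no longer an input."
    return "Inputs: age, sex, height, race."


def mentions_removal(answer: str) -> bool:
    # Placeholder criterion; a real study would need a more careful comparison.
    return "no longer" in answer.lower()


rates = score_outputs(["model-a", "model-b"], ["p1", "p2"], fake_ask, mentions_removal)
```

Here `rates` maps each model name to its share of correct outputs across all prompts and repetitions; grouping those shares by model class (for example, state-of-the-art versus lightweight) would yield percentages of the kind reported in RESULTS.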