This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Human–AI collectives most accurately diagnose clinical vignettes
Citations: 15
Authors: 13
Year: 2025
Abstract
AI systems, particularly large language models (LLMs), are increasingly being employed in high-stakes decisions that impact both individuals and society at large, often without adequate safeguards to ensure safety, quality, and equity. Yet LLMs hallucinate, lack common sense, and are biased, shortcomings that may reflect LLMs' inherent limitations and thus may not be remedied by more sophisticated architectures, more data, or more human feedback. Relying solely on LLMs for complex, high-stakes decisions is therefore problematic. Here, we present a hybrid collective intelligence system that mitigates these risks by leveraging the complementary strengths of human experience and the vast information processed by LLMs. We apply our method to open-ended medical diagnostics, combining 40,762 differential diagnoses made by physicians with the diagnoses of five state-of-the-art LLMs across 2,133 text-based medical case vignettes. We show that hybrid collectives of physicians and LLMs outperform both single physicians and physician collectives, as well as single LLMs and LLM ensembles. This result holds across a range of medical specialties and professional experience and can be attributed to humans' and LLMs' complementary contributions that lead to different kinds of errors. Our approach highlights the potential for collective human and machine intelligence to improve accuracy in complex, open-ended domains like medical diagnostics.
Similar works
The Strengths and Difficulties Questionnaire: A Research Note
1997 · 14,516 citations
Making sense of Cronbach's alpha
2011 · 13,646 citations
QUADAS-2: A Revised Tool for the Quality Assessment of Diagnostic Accuracy Studies
2011 · 13,521 citations
A method for estimating the probability of adverse drug reactions
1981 · 11,446 citations
Evidence-Based Medicine
1992 · 4,133 citations
Authors
Institutions
- Rational (Germany) (DE)
- Max Planck Institute for Human Development (DE)
- The Human Diagnosis Project
- University of California, San Francisco (US)
- Human Immunome Project (US)
- University of Cologne (DE)
- Harvey Mudd College (US)
- University of Oxford (GB)
- Kaiser Permanente (US)
- Institute of Cognitive Sciences and Technologies (IT)
- Centre for Artificial Intelligence and Robotics (IN)
- National Research Council (RO)
- Innovation Cluster (Canada) (CA)
- Technische Universität Berlin (DE)