This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Artificial intelligence‑driven virtual tumor board enhances precision care in myelodysplastic syndromes
Citations: 0
Authors: 20
Year: 2026
Abstract
Background: Large language models (LLMs) perform well on standardized medical exam questions, but their reliability for complex hematology decision making is uncertain. We compared four general-purpose LLMs (GPT-4o, GPT-o3, Claude Sonnet 4, and DeepSeek-V3) with a Virtual MDS Panel (VMP), a coordinated multi-agent AI system in which domain-specialized, rule-bound software agents (WHO/ICC guidelines; IPSS-R/IPSS-M; NCCN) collaborate to generate tumor-board-level recommendations.

Methods: Each model generated diagnostic, prognostic, and treatment recommendations for 30 myelodysplastic syndrome cases. Nine international MDS experts from five institutions, blinded to model identity, completed 3,000 structured ratings using 5-point Likert scales for diagnosis, prognosis, and therapy and classified errors by severity.

Results: General-purpose LLMs achieved modest expert ratings (overall mean scores: 3.7 for GPT-o3, 3.2 for GPT-4o, 3.1 for DeepSeek, and 3.0 for Claude) and contained major factual errors in at least 24% of responses. The VMP increased the proportion of outputs rated 4 or higher to 87% (vs. 34-66% for general-purpose models), improved mean scores to 4.3 overall (4.3 for diagnosis, 4.4 for prognosis, and 4.1 for therapy), and reduced major errors to 8%.

Conclusions: In this blinded evaluation of 30 complex MDS cases, general-purpose LLMs produced clinically important errors at rates that raise safety concerns for autonomous hematology decision making. The VMP, a rule-bound, multi-agent architecture, approached expert-level accuracy, supporting its potential role as an effective decision-support tool for MDS in the future.
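The summary statistics reported in the abstract (mean Likert score, share of outputs rated 4 or higher, major-error rate) can be reproduced from raw expert ratings with a few lines of aggregation. A minimal sketch, using fabricated example data rather than the study's actual ratings:

```python
def summarize_ratings(ratings, major_errors):
    """Aggregate blinded expert ratings for one model.

    ratings: list of 1-5 Likert scores across cases and raters (hypothetical).
    major_errors: list of bools, True if a response contained a major error.
    """
    n = len(ratings)
    mean_score = sum(ratings) / n
    share_4_plus = sum(1 for r in ratings if r >= 4) / n
    major_error_rate = sum(major_errors) / len(major_errors)
    return {
        "mean": round(mean_score, 1),
        "rated_4_or_higher": round(share_4_plus, 2),
        "major_error_rate": round(major_error_rate, 2),
    }

# Illustrative made-up ratings for one model, not data from the paper
example = summarize_ratings([5, 4, 4, 3, 5, 4, 2, 4],
                            [False] * 7 + [True])
```

The same aggregation, run per domain (diagnosis, prognosis, therapy) and per model, would yield the per-domain means and percentages quoted in the Results section.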
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,391 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,257 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,685 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,501 citations
Authors
- David M Swoboda
- Amy E DeZern
- James T. England
- Sangeetha Venugopal
- Thomas J. Kehoe
- Brandon J. Aubrey
- Marco Gabriele Raddi
- Angela Consagra
- Jiasheng Wang
- Jonathan Andreadakis
- Gustavo Rivero
- Maximilian Stahl
- Amer M Zeidan
- Torsten Haferlach
- Andrew M. Brunner
- Rena Buckstein
- Valeria Santini
- Matteo Giovanni Della Porta
- Mikkael A Sekeres
- Aziz Nazha
Institutions
- Tampa General Hospital (US)
- Sidney Kimmel Comprehensive Cancer Center (US)
- Sunnybrook Health Science Centre (CA)
- Sylvester Comprehensive Cancer Center (US)
- University of South Florida (US)
- Massachusetts General Hospital (US)
- Azienda Ospedaliero-Universitaria Careggi (IT)
- The Ohio State University (US)
- Yale Cancer Center (US)
- Munich Leukemia Laboratory (DE)
- Humanitas University (IT)
- IRCCS Humanitas Research Hospital (IT)
- Sidney Kimmel Cancer Center (US)