This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Advancing Conversational Diagnostic AI with Multimodal Reasoning
Citations: 1
Authors: 36
Year: 2025
Abstract
Large Language Models (LLMs) have demonstrated great potential for conducting diagnostic conversations, but evaluation has been largely limited to language-only interactions, deviating from the real-world requirements of remote care delivery. Instant messaging platforms permit clinicians and patients to upload and discuss multimodal medical artifacts seamlessly in medical consultations, but the ability of LLMs to reason over such data while preserving other attributes of competent diagnostic conversation remains unknown. Here we advance the conversational diagnosis and management performance of the Articulate Medical Intelligence Explorer (AMIE) through a new capability to gather and interpret multimodal data, and to reason about it precisely during consultations. Leveraging Gemini 2.0 Flash, our system implements a state-aware dialogue framework in which conversation flow is dynamically controlled by intermediate model outputs reflecting patient states and evolving diagnoses. Follow-up questions are strategically directed by uncertainty in these patient states, leading to a more structured multimodal history-taking process that emulates experienced clinicians. We compared AMIE to primary care physicians (PCPs) in a randomized, blinded, OSCE-style study of chat-based consultations with patient actors. We constructed 105 evaluation scenarios using artifacts such as smartphone skin photos, ECGs, and PDFs of clinical documents across diverse conditions and demographics. Our rubric assessed multimodal capabilities alongside other clinically meaningful axes such as history-taking, diagnostic accuracy, management reasoning, communication, and empathy. Specialist evaluation showed AMIE to be superior to PCPs on 7/9 multimodal and 29/32 non-multimodal axes (including diagnostic accuracy). These results show clear progress in multimodal conversational diagnostic AI, but real-world translation requires further research.
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,200 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,051 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,416 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,410 citations
Authors
- Khaled Saab
- Jan Freyberg
- Chunjong Park
- Tim Strother
- Yong Cheng
- Wei‐Hung Weng
- David G. T. Barrett
- David Stutz
- Nenad Tomašev
- Anil Palepu
- Valentin Liévin
- Yash Sharma
- Roma Ruparel
- Abdullah Ahmed Ali Ahmed
- Elahe Vedadi
- Kimberly Kanada
- Cían Hughes
- Yun Liu
- Geoff Brown
- Yang Gao
- Xiang Li
- S. Sara Mahdavi
- James Manyika
- Katherine Chou
- Yossi Matias
- Avinatan Hassidim
- Dale R. Webster
- Pushmeet Kohli
- S. M. Ali Eslami
- Joëlle Barral
- Adam Rodman
- Vivek Natarajan
- Mike Schaekermann
- Tao Tu
- Alan Karthikesalingam
- Ryutaro Tanno