This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Toward expert-level medical question answering with large language models
597
Citations
35
Authors
2025
Year
Abstract
Large language models (LLMs) have shown promise in medical question answering, with Med-PaLM being the first to exceed a 'passing' score in United States Medical Licensing Examination style questions. However, challenges remain in long-form medical question answering and handling real-world workflows. Here, we present Med-PaLM 2, which bridges these gaps with a combination of base LLM improvements, medical domain fine-tuning and new strategies for improving reasoning and grounding through ensemble refinement and chain of retrieval. Med-PaLM 2 scores up to 86.5% on the MedQA dataset, improving upon Med-PaLM by over 19%, and demonstrates dramatic performance increases across MedMCQA, PubMedQA and MMLU clinical topics datasets. Our detailed human evaluations framework shows that physicians prefer Med-PaLM 2 answers to those from other physicians on eight of nine clinical axes. Med-PaLM 2 also demonstrates significant improvements over its predecessor across all evaluation metrics, particularly on new adversarial datasets designed to probe LLM limitations (P < 0.001). In a pilot study using real-world medical questions, specialists preferred Med-PaLM 2 answers to generalist physician answers 65% of the time. While specialist answers were still preferred overall, both specialists and generalists rated Med-PaLM 2 to be as safe as physician answers, demonstrating its growing potential in real-world medical applications.
Authors
- K. K. Singhal
- Tao Tu
- Juraj Gottweis
- Rory Sayres
- Ellery Wulczyn
- Mohamed Amin
- Le Hou
- Kevin Clark
- Stephen Pfohl
- Heather Cole-Lewis
- Darlene Neal
- Qazi Mamunur Rashid
- Mike Schaekermann
- Amy Wang
- Dev Dash
- Jonathan H. Chen
- Nigam H. Shah
- Sami Lachgar
- P. Mansfield
- Sushant Prakash
- Bradley Green
- Ewa Dominowska
- Blaise Agüera y Arcas
- Nenad Tomašev
- Yun Liu
- Renee Wong
- Christopher Semturs
- S. Sara Mahdavi
- Joëlle Barral
- Dale R. Webster
- Greg S. Corrado
- Yossi Matias
- Shekoofeh Azizi
- Alan Karthikesalingam
- Vivek Natarajan