This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
GPT versus Resident Physicians — A Benchmark Based on Official Board Scores
Citations: 112
Authors: 8
Year: 2024
Abstract
BACKGROUND Artificial intelligence (AI) is a burgeoning technological advancement, with considerable promise for influencing the field of medicine. As a preliminary step toward integrating AI into medical practice, it is imperative to ascertain whether model performance is comparable with that of physicians. We present a systematic comparison of performance by a large language model (LLM) versus that of a large cohort of physicians. The cohort includes all residents who took the medical specialist license examination in Israel in 2022 across the core medical disciplines: internal medicine, general surgery, pediatrics, psychiatry, and obstetrics and gynecology (OB/GYN). We provide the examinations as an accessible benchmark dataset for the medical machine learning and natural language processing communities, which may be adapted for future LLM studies.