This is an overview page with metadata for this scientific work. The full article is available from the publisher.
ChatGPT takes on the European Exam in Core Cardiology: an AI success story
1
Citations
6
Authors
2023
Year
Abstract
Background
ChatGPT, the trending novel artificial intelligence model, has triggered an ongoing debate regarding its capabilities. Recently, preliminary reports showed that it answered the majority of questions on the United States Medical Licensing Examination (USMLE) correctly. However, its ability to succeed in a more specialised, challenging, and high-stakes post-graduate test, such as a final exam for the completion of medical residency like the European Exam in Core Cardiology (EECC), is not yet known.

Purpose
We sought to evaluate the performance of ChatGPT on the EECC, testing its capability on a more demanding, high-stakes post-graduate exam in cardiology training.

Methods
A total of 488 publicly available single-answer multiple-choice questions (MCQs) were randomly obtained from three MCQ sources traditionally used in preparation for the EECC: 88 from the sample exam questions released since 2018 on the official ESC website, 200 from the 2022 edition of StudyPRN, and 200 from Braunwald's Heart Disease Review and Assessment (BHDRA). Questions containing audio or visual assets were excluded. After filtering, 362 MCQ items (ESC sample: 68, BHDRA: 150, StudyPRN: 144) were included as the input source. False and indeterminate responses were counted as incorrect.

Results
ChatGPT answered 340 of the 362 questions, with 22 indeterminate answers in total. The overall accuracy was 58.8% across all question sources; for the ESC sample, BHDRA, and StudyPRN it was 61.7%, 52.6%, and 63.8%, respectively. ChatGPT answered correctly 42/68 (4 indeterminate) of the ESC sample questions, 79/150 (11 indeterminate) of the BHDRA questions, and 92/144 (7 indeterminate) of the StudyPRN questions.

Conclusion
ChatGPT correctly answers the majority of the EECC's questions and performs within the passing-threshold range. Although it cannot yet process visual content, it can provide rational and correct answers to text-based inputs in most scenarios. The model may be able to efficiently handle a massive amount of acquired medical knowledge, but the current approach may not substitute for critical thinking, innovation, and creativity: some of the key attributes that doctors are expected to showcase.

Figures: Performance of ChatGPT on the EECC; Example of an MCQ input to ChatGPT.
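The reported accuracies follow directly from the per-source counts, with indeterminate answers scored as incorrect (accuracy = correct / total questions). A minimal sketch reproducing that arithmetic from the figures in the abstract (note that standard rounding of the per-source fractions gives 61.8%, 52.7%, and 63.9%, so the abstract's per-source percentages appear to be truncated rather than rounded; the overall figure matches either way):

```python
# Recompute the accuracy figures reported in the abstract.
# Indeterminate answers count as incorrect, so accuracy = correct / total.
results = {
    # source: (correct answers, total questions after filtering)
    "ESC sample": (42, 68),
    "BHDRA": (79, 150),
    "StudyPRN": (92, 144),
}

for source, (correct, total) in results.items():
    print(f"{source}: {correct}/{total} = {100 * correct / total:.1f}%")

total_correct = sum(c for c, _ in results.values())
total_questions = sum(t for _, t in results.values())
# 213/362 -> 58.8%, matching the overall accuracy in the abstract
print(f"Overall: {total_correct}/{total_questions} = "
      f"{100 * total_correct / total_questions:.1f}%")
```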
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,339 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,211 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,614 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,478 citations