Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
GPT-4o’s competency in answering the simulated written European Board of Interventional Radiology exam compared to a medical student and experts in Germany and its ability to generate exam items on interventional radiology: a descriptive study
16
Zitationen
5
Autoren
2024
Jahr
Abstract
PURPOSE: This study aimed to determine whether ChatGPT-4o, a generative artificial intelligence (AI) platform, was able to pass a simulated written European Board of Interventional Radiology (EBIR) exam and whether GPT-4o can be used to train medical students and interventional radiologists of different levels of expertise by generating exam items on interventional radiology. METHODS: GPT-4o was asked to answer 370 simulated exam items of the Cardiovascular and Interventional Radiology Society of Europe (CIRSE) for EBIR preparation (CIRSE Prep). Subsequently, GPT-4o was requested to generate exam items on interventional radiology topics at levels of difficulty suitable for medical students and the EBIR exam. Those generated items were answered by 4 participants, including a medical student, a resident, a consultant, and an EBIR holder. The correctly answered items were counted. One investigator checked the answers and items generated by GPT-4o for correctness and relevance. This work was done from April to July 2024. RESULTS: GPT-4o correctly answered 248 of the 370 CIRSE Prep items (67.0%). For 50 CIRSE Prep items, the medical student answered 46.0%, the resident 42.0%, the consultant 50.0%, and the EBIR holder 74.0% correctly. All participants answered 82.0% to 92.0% of the 50 GPT-4o generated items at the student level correctly. For the 50 GPT-4o items at the EBIR level, the medical student answered 32.0%, the resident 44.0%, the consultant 48.0%, and the EBIR holder 66.0% correctly. All participants could pass the GPT-4o-generated items for the student level; while the EBIR holder could pass the GPT-4o-generated items for the EBIR level. Two items (0.3%) out of 150 generated by the GPT-4o were assessed as implausible. CONCLUSION: GPT-4o could pass the simulated written EBIR exam and create exam items of varying difficulty to train medical students and interventional radiologists.
Ähnliche Arbeiten
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8.774 Zit.
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8.685 Zit.
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8.244 Zit.
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6.898 Zit.
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5.781 Zit.