This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
7870 Exploring The Role Of ChatGPT In Pediatric Endocrinology Education; Exam Preparation And Question Generation
Citations: 0
Authors: 2
Year: 2024
Abstract
Disclosure: J. Tarkoff: None. A.G. Martinez Sanchez: None.

Large language models (LLMs) hold substantial promise for improving physician knowledge and expertise. Their role in medical education and in generating differential diagnoses may be crucial, with potential positive effects on clinical outcomes. ChatGPT has been evaluated on pediatric case challenges from The New England Journal of Medicine [1], yet its performance in specialized areas such as Pediatric Endocrinology remains unexplored. This study assesses how well ChatGPT 4 answers questions from the Pediatric Endocrine Self-Assessment Program (PESAP). It also examines whether the model is suitable for creating an educational quiz for residents and fellows.

Methods: ChatGPT 4 was tested on questions from the 2021-2022 version of PESAP using the prompt: “Can you assist from the perspective of a pediatric endocrinologist with the following patient case”. Responses were evaluated for initial correctness, and performance was analyzed across the competency categories corresponding to the 7 “umbrella sections” of the tool (Adrenal, Bone, Carbohydrate and Lipid Metabolism/Obesity, Growth, Pituitary, Reproductive System, and Thyroid). We then customized the model with the PESAP questions and their detailed answers and used the customized model to create a 10-question proof-of-concept quiz for four board-certified pediatric endocrinologists. The quiz included a scoring system designed to measure the extent and depth of ChatGPT's knowledge.

Results: ChatGPT 4 answered 52% of PESAP questions correctly, with performance varying by category from 30% (Adrenal) to 78% (Reproductive System). For 16 questions, ChatGPT 4 did not provide an initial answer and had to be asked explicitly for a response. For questions related to thyroid cancer, explicit prompts were necessary to obtain responses based on the American Thyroid Association 2015 guidelines. On the endocrinologist quiz, the average score was 80%, ranging from 60% to 100%.

Discussion: Based on our assessment, using ChatGPT 4 as a final diagnostic tool, especially in pediatric endocrinology, should currently be approached with caution. However, when provided with distinct Pediatric Endocrinology case studies, ChatGPT 4 generated valuable educational questions that reinforce fundamental concepts in the field. We anticipate that as LLMs advance and receive direct medical training, their out-of-the-box diagnostic accuracy will improve, with a transformative impact on medical education.

References: 1. Barile J, Margolis A, Cason G, et al. Diagnostic Accuracy of a Large Language Model in Pediatric Case Studies. JAMA Pediatr. Published online January 2, 2024.

Presentation: 6/3/2024
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,303 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,155 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,555 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,453 citations