This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Decoding Digestive Dilemmas: ChatGPT outperforms Bard in Gastroenterology Clinical Questions from MKSAP-19
Citations: 0
Authors: 2
Year: 2024
Abstract
Aim: This study aims to evaluate the performance of large language models (LLMs), specifically OpenAI’s ChatGPT and Google Bard, in answering gastroenterology clinical questions from the Medical Knowledge Self-Assessment Program-19 (MKSAP-19), thereby assessing their potential utility in clinical decision-making within the field of gastroenterology.
Materials and Methods: A comparative analysis was conducted using a dataset of 50 gastroenterology questions from MKSAP-19, assessing the ability of ChatGPT and Bard to provide correct answers without prior training or access to MKSAP-19 materials. The performance of each LLM was evaluated based on the percentage of correct answers, with a passing score set at 50%.
Results: ChatGPT outperformed Bard, achieving a 68% success rate in answering the questions correctly, compared to Bard’s 44%. ChatGPT attempted all questions, while Bard abstained from answering two. The analysis also identified specific areas where both LLMs struggled, indicating gaps in their clinical reasoning capabilities.
Conclusions: ChatGPT demonstrated a higher efficacy in clinical decision-making for gastroenterology questions than Bard, suggesting the potential of LLMs as supplementary tools in clinical settings. However, the limitations of LLMs, including their inability to interpret images and consider real-life factors such as social determinants of health, highlight the need for further development before they can independently guide medical decisions.
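The scoring rule described in the abstract (percentage of correct answers out of 50 questions, with a passing score of 50%) can be illustrated with a minimal sketch. The abstract reports only percentages, so the raw counts below (34 and 22) are back-calculated assumptions, as is the choice to count Bard's two abstentions as incorrect and to treat exactly 50% as a pass. The class and field names are hypothetical and do not come from the study.

```python
from dataclasses import dataclass

TOTAL_QUESTIONS = 50      # size of the question set stated in the abstract
PASSING_THRESHOLD = 0.50  # passing score stated in the abstract


@dataclass
class ModelResult:
    name: str
    correct: int        # assumed count, back-calculated from the reported percentage
    abstained: int = 0  # questions the model declined to answer

    @property
    def accuracy(self) -> float:
        # Assumption: abstentions count as incorrect, so the denominator stays at 50.
        return self.correct / TOTAL_QUESTIONS

    @property
    def passed(self) -> bool:
        # Assumption: a score exactly at the threshold counts as passing.
        return self.accuracy >= PASSING_THRESHOLD


results = [
    ModelResult("ChatGPT", correct=34),         # 34/50 matches the reported 68%
    ModelResult("Bard", correct=22, abstained=2),  # 22/50 matches the reported 44%
]

for r in results:
    status = "pass" if r.passed else "fail"
    print(f"{r.name}: {r.accuracy:.0%} ({status})")
```

Under these assumptions, ChatGPT clears the 50% threshold and Bard does not, which matches the comparison reported in the Results.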
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,231 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,084 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,444 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,423 citations