This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Evaluating the accuracy and consistency of ChatGPT for the management of type 2 diabetes: A cross-sectional study
Citations: 0
Authors: 4
Year: 2025
Abstract
Large language models (LLMs) have fundamentally changed how patients and clinicians retrieve information; however, it is unclear how accurate and consistent widely available LLMs are in answering questions related to medical information. Our objective was to evaluate the accuracy and consistency of ChatGPT in answering questions related to the management of type 2 diabetes mellitus (T2DM). Three users asked ChatGPT 13 questions pertaining to medications from the top five most common classes of T2DM medications. A response was labelled inconsistent if the response provided to one user differed from the response provided to at least one other user in the same domain for the same medication. A response was labelled as inaccurate if the information provided by ChatGPT was incorrect based on the most recent FDA-approved drug label, in addition to review by an expert reviewer. Additionally, one user asked ChatGPT 26 basic questions related to the management of T2DM, in which the answer was categorized as correct or incorrect. We summarized all results using descriptive statistics. ChatGPT delivered inaccurate responses in seven out of 13 domains and inconsistent responses in seven out of 13 domains for drugs in all five classes of T2DM medication. Of ChatGPT’s responses to the 26 basic T2DM treatment questions, 7 (26%) were incorrect. In this cross-sectional study, we identified that it was common for ChatGPT to provide incorrect or inconsistent responses to enquiries related to the management of type 2 diabetes.
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,312 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,169 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,564 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,466 citations