This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
P0384 Applying ChatGPT to Improve Inflammatory Bowel Disease Diagnosis and Evaluation
0
Citations
5
Authors
2025
Year
Abstract
Background: Chat Generative Pre-Trained Transformer-v4 (ChatGPT-v4) is an unquestionable asset to healthcare professionals. This paper outlines several ways in which ChatGPT-v4 can be used to improve diagnostic accuracy and treatment strategies for ulcerative colitis (UC) and Crohn’s disease (CD), in line with the ECCO guidelines.

Methods: A 102-item questionnaire was designed to assess the accuracy, consistency, and completeness of responses to questions about the diagnosis and treatment of UC and CD. The questionnaire consisted of true/false and multiple-choice questions based on clinical scenarios reflecting real-life situations and the ECCO guidelines. It was then presented to ChatGPT-v4, and the model’s responses were evaluated using a Likert scale. The queries were posed to the artificial intelligence at 15-day intervals.

Results: In 47 responses (92%), ChatGPT adhered to the established guidelines, with deviations observed in 4 instances (8%). Of the 4 responses identified as non-compliant, 3 showed no change or improvement after the 15-day reassessment period. Mean accuracy scores differed between the two sets, at 5.45 for the first and 5.56 for the second (p = 0.606). Completeness scores showed a similar trend, with means differing by 0.10: 2.33 for the first set and 2.43 for the second (p = 0.280). Standard deviation trends for the two sets were also comparable. The analysis revealed an improvement in accuracy from the initial to the fifteenth-day assessment (5.49 and 5.56, respectively). A comparable increase was observed in completeness scores, rising from 2.15 to 2.37 between the initial and fifteenth-day assessments. Responses to imaging-related questions were more accurate, although the difference was not statistically significant (p = 0.31).

For multiple-choice questions, the AI model’s performance demonstrated greater stability and consistency (p = 0.606). The majority of answers were initially correct (90% of the total). Four of the five incorrect answers became correct on reassessment, one incorrect answer persisted, and two correct answers changed to incorrect.

Conclusion: ChatGPT-v4 has demonstrated potential as a clinical support tool in the management of inflammatory bowel diseases, including UC and CD. However, performance differed between binary and multiple-choice questions.
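The repeated-measures comparison described above (day-1 vs day-15 scores on the same items) can be illustrated with a paired t statistic. The sketch below uses only the Python standard library; the score lists are synthetic placeholders for illustration, not the study's actual data.

```python
# Illustrative paired t statistic for two equal-length sets of
# Likert-style accuracy scores, mirroring a day-1 vs day-15 re-test.
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Paired t statistic: mean of per-item differences over its standard error."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n))

# Hypothetical scores for eight items (NOT the study's data).
day1 = [5, 6, 5, 6, 5, 6, 5, 5]
day15 = [6, 6, 5, 6, 6, 6, 5, 6]
print(round(paired_t(day1, day15), 3))  # prints 2.049
```

In practice one would compare the resulting t value against a t distribution with n − 1 degrees of freedom to obtain a p-value, as the study reports (e.g. p = 0.606 for accuracy).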
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,402 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,270 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,702 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,507 citations