This is an overview page with metadata for this scientific work. The full article is available from the publisher.
S618 ChatGPT-4.0 Answers Common Irritable Bowel Syndrome Patient Queries: Accuracy and References Validity
Citations: 0 · Authors: 6 · Year: 2023
Abstract
Introduction: Artificial intelligence (AI) chatbots are becoming increasingly popular and are likely to be used frequently by patients inquiring about health-related concerns. ChatGPT was introduced in November 2022 and recently updated to version 4. We sought to assess the accuracy of answers and references provided by ChatGPT-4.0 to questions on irritable bowel syndrome (IBS), a diagnosis frequently queried online.

Methods: After reviewing the most frequently searched terms related to IBS on Google Trends, we formulated 15 questions on the topic. We entered each question into ChatGPT-4.0 in a separate chat log, asking the model to supply references for each generated answer. The accuracy of the AI's responses and of the provided references was then assessed by 3 independent gastroenterologists. Answers were evaluated using 2 grading systems: an overall grade (accurate vs inaccurate) and a granular grade (100% accurate, 100% inaccurate, accurate with missing information, partly inaccurate). References were graded as suitable, unsuitable (existent but unrelated to the answer), or nonexistent. We used free-marginal Fleiss kappa coefficients (κ) to quantify inter-rater agreement before (κpre) and after (κpost) consensus discussions, which served to rectify any grading discrepancies. When disagreement persisted, the most stringent evaluation was accepted as the definitive grade (Table 1).

Results: Overall assessment showed 80% of AI answers were accurate and 20% inaccurate (κpre=0.82 [95% confidence interval (CI) 0.58-1.00], κpost=1.00 [95% CI 1.00-1.00]). Granular grading showed 53% of answers were accurate, 33% partially inaccurate, 13% correct but incomplete, and 0% completely inaccurate (κpre=0.38 [95% CI 0.14-0.62], κpost=0.88 [95% CI 0.72-1.00]). Provided references were suitable for 33% of answers, unsuitable for 53%, and nonexistent for 13% (κpre=0.53 [95% CI 0.27-0.79], κpost=1.00 [95% CI 1.00-1.00]) (Figure 1).
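The free-marginal (Randolph) variant of Fleiss' kappa used in the Methods assumes raters are not constrained to fixed category frequencies, so chance agreement is simply 1/k for k categories. A minimal sketch of the computation follows; the function name and the demo ratings are illustrative assumptions, not the study's actual data.

```python
# Sketch of Randolph's free-marginal multirater kappa.
# Assumption: the function name and demo data are invented for illustration.

def free_marginal_kappa(ratings, n_categories):
    """ratings: list of items, each a list of category labels (one per rater)."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # Observed agreement: mean proportion of agreeing rater pairs per item.
    po = 0.0
    for item in ratings:
        counts = {}
        for label in item:
            counts[label] = counts.get(label, 0) + 1
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        po += agreeing_pairs / (n_raters * (n_raters - 1))
    po /= n_items
    # Free-marginal chance agreement: 1/k for k categories.
    pe = 1.0 / n_categories
    return (po - pe) / (1 - pe)

# Example: 3 raters grade 4 answers into 2 categories (accurate/inaccurate).
demo = [
    ["accurate", "accurate", "accurate"],
    ["accurate", "accurate", "inaccurate"],
    ["inaccurate", "inaccurate", "inaccurate"],
    ["accurate", "accurate", "accurate"],
]
kappa = free_marginal_kappa(demo, n_categories=2)  # ≈ 0.667
```

Values near 1 indicate near-perfect agreement, matching the abstract's κpost=1.00 after consensus discussion.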
Conclusion: While overall accuracy was high at 80%, ChatGPT-4.0 still missed some details or provided outdated information. However, no fully inaccurate information was given, making this model a potentially safe source for general guidance on common IBS queries. The model remains problematic for medical professionals when it comes to literature research and referencing, as ChatGPT failed to provide a sufficient number of suitable references. These findings underscore the need to improve AI's precision and reference validity in health-related information dissemination.

Figure 1. Visual Comparison of Reviewers' Assessments of ChatGPT-4.0 Responses to IBS Queries: A Clustered Column Chart.

Table 1. Questions posed to ChatGPT-4.0 and reviewers' evaluation of responses

1. What are the symptoms of IBS? Provide references. Granular: Partly Inaccurate; Overall: Accurate; References: Suitable
2. What causes IBS? Provide references. Granular: Accurate; Overall: Accurate; References: Suitable
3. Is IBS a serious condition? Provide references. Granular: Partly Inaccurate; Overall: Accurate; References: Suitable
4. How is IBS diagnosed? Provide references. Granular: Partly Inaccurate; Overall: Inaccurate; References: Unsuitable
5. What are the treatment options for IBS? Provide references. Granular: Accurate with Missing Information; Overall: Accurate; References: Nonexistent
6. What foods should I avoid if I have IBS? Provide references. Granular: Accurate; Overall: Accurate; References: Suitable
7. Are there any natural remedies for IBS? Provide references. Granular: Accurate; Overall: Accurate; References: Unsuitable
8. Can probiotics help with IBS? Provide references. Granular: Accurate; Overall: Accurate; References: Unsuitable
9. What support resources are available for people with IBS? Provide references. Granular: Accurate with Missing Information; Overall: Accurate; References: Unsuitable
10. Can Cannabinoids (CBD) improve IBS symptoms? Provide references. Granular: Partly Inaccurate; Overall: Inaccurate; References: Unsuitable
11. What causes IBS flare-ups? Provide references. Granular: Accurate; Overall: Accurate; References: Unsuitable
12. Why do people with IBS pass gas so much? Provide references. Granular: Accurate; Overall: Accurate; References: Unsuitable
13. Is there a test for IBS? Provide references. Granular: Partly Inaccurate; Overall: Inaccurate; References: Suitable
14. How to cure IBS permanently? Provide references. Granular: Accurate; Overall: Accurate; References: Unsuitable
15. How to manage IBS during pregnancy? Provide references. Granular: Accurate; Overall: Accurate; References: Nonexistent
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,436 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,311 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,753 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,523 citations