This is an overview page with metadata for this scientific work. The full article is available from the publisher.
S2981 Evaluating ChatGPT-4o’s Ability to Inform Patients of Pathology Findings Specific to Gastroenterology
Citations: 0 · Authors: 6 · Year: 2025
Abstract
Introduction: Artificial intelligence (AI) tools like ChatGPT are increasingly used for patient education and decision support. However, little is known about ChatGPT's accuracy in explaining colonoscopy pathology results, a growing concern as patients can access pathology reports through electronic medical records, often before provider review. Methods: Five standardized patient-centered questions were input into ChatGPT-4o for 5 common pathology findings: normal mucosa, hyperplastic polyps (HP), tubular adenomas, sessile serrated polyps, and inflammatory polyps. Three independent gastroenterologists reviewed each AI-generated response using a structured evaluation form that included 2 positive and 2 negative qualitative comments, along with 5 binary (Yes/No) assessments: factual accuracy, clarity of medical terms for laypersons, completeness, freedom from misleading or outdated content, and conciseness. Discrepancies were identified through side-by-side comparison of reviewer feedback, and subjective comments were used to explore patterns in evaluator disagreement. Results: Per Table 1, the reviewers often agreed that ChatGPT effectively defined the pathologies and their associated cancer risks and appropriately referred patients to their gastroenterology providers. AI performance was most comprehensive for normal mucosa and HP. However, limitations of the platform included generalizations that overlooked incomplete exams (e.g., poor bowel prep), overassurance of benignity, and occasional irrelevant or confusing details (undefined terms like "villous" or "high-grade dysplasia"). Responses about sessile serrated polyps included complex concepts ("microsatellite instability"), which also appeared in responses for unrelated findings (HP and tubular adenomas).
In addition, several follow-up recommendations were inconsistent with current post-polypectomy surveillance guidelines, both over- and underestimating the recommended interval until the next procedure. Conclusion: Because of expanding electronic medical record access, patients increasingly seek online explanations of pathology findings before clinical follow-up. While ChatGPT provides accessible summaries and encourages physician consultation, it also risks confusing patients through outdated recommendations, technical language, and overgeneralizations. Gastroenterology physicians should be aware of these limitations and proactively counsel patients about the reliability and context of AI-generated health content.
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,391 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,257 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,685 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,501 citations