This is an overview page with metadata for this scientific work. The full article is available from the publisher.
How smart are the machines? An analysis of AI responses on chronic otitis media
Citations: 0
Authors: 4
Year: 2026
Abstract
Objectives: This study aimed to comprehensively evaluate the readability, quality, and reliability of content generated by four state-of-the-art artificial intelligence (AI)-powered chatbots—OpenAI ChatGPT-4, DeepSeek v3, Google Gemini 2.5 Pro, and Grok-2—in response to frequently asked questions (FAQs) posed by patients on the Internet regarding chronic otitis media (COM).
Methods: A curated set of 25 FAQs on COM was compiled using insights from Google Trends, Semrush, and authoritative clinical sources. Each question was posed to the four chatbots via standardized prompts to ensure consistency. Responses were evaluated for quality and reliability using the Ensuring Quality Information for Patients (EQIP) tool and the modified DISCERN (mDISCERN) instrument. Readability was assessed with the Flesch Reading Ease Score (FRES) and the Flesch–Kincaid Grade Level (FKGL), providing a comprehensive appraisal of the clarity, accuracy, and patient-centeredness of the AI-generated content.
Results: Statistically significant differences were observed among the chatbots in information quality and reliability, as reflected by EQIP and mDISCERN scores (p < 0.001), with Grok-2 performing best. In contrast, no statistically significant differences were found in the readability measures (FRES and FKGL), and all responses required advanced reading skills.
Conclusion: AI-powered chatbots show promise in delivering health-related information about chronic otitis media; however, their outputs vary notably in quality and reliability while remaining uniformly challenging to read. Ongoing validation and optimization are essential to enhance the accessibility and educational value of AI-generated medical content.
Level of evidence: 5.
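As general background (not part of the article above): the two readability indices named in the abstract are computed from average sentence length and average syllables per word. A minimal Python sketch of the standard Flesch formulas follows; the syllable counter is a simplified heuristic of our own, and dedicated readability tools count syllables more carefully.

    # Illustrative sketch only: standard FRES and FKGL formulas applied to English text.
    import re

    def count_syllables(word: str) -> int:
        # Rough vowel-group heuristic; real readability tools use better counters.
        word = word.lower()
        count = len(re.findall(r"[aeiouy]+", word))
        if word.endswith("e") and count > 1:
            count -= 1
        return max(count, 1)

    def flesch_scores(text: str) -> tuple[float, float]:
        # Returns (FRES, FKGL) for a block of English text.
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        words = re.findall(r"[A-Za-z']+", text)
        syllables = sum(count_syllables(w) for w in words)
        wps = len(words) / max(len(sentences), 1)   # words per sentence
        spw = syllables / max(len(words), 1)        # syllables per word
        fres = 206.835 - 1.015 * wps - 84.6 * spw   # Flesch Reading Ease Score
        fkgl = 0.39 * wps + 11.8 * spw - 15.59      # Flesch-Kincaid Grade Level
        return fres, fkgl

    if __name__ == "__main__":
        sample = "Chronic otitis media is a long-lasting infection of the middle ear."
        print(flesch_scores(sample))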
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,436 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,311 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,753 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,523 citations