OpenAlex · Updated hourly · Last update: 07.04.2026, 14:32

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Artificial Intelligence Chatbots and Temporomandibular Disorders: A Comparative Content Analysis over One Year

2025 · 0 citations · Applied Sciences · Open Access
Open full text at the publisher

0 Citations · 7 Authors · Year: 2025

Abstract

As the use of artificial intelligence (AI) chatbots for medical queries expands, their reliability may vary as models evolve. We longitudinally assessed the quality, reliability, and readability of information on temporomandibular disorders (TMD) generated by three widely used chatbots (ChatGPT, Gemini, and Microsoft Copilot). Ten TMD questions were submitted to each chatbot at two timepoints (T1: February 2024; T2: February 2025). Two blinded evaluators independently assessed all answers using validated instruments: the Global Quality Score (GQS), PEMAT, DISCERN, CLEAR, Flesch Reading Ease (FRE), and Flesch–Kincaid Grade Level (FKGL). Analyses followed METRICS guidance. Comparisons between models and across timepoints were conducted using non-parametric tests. At T1, Copilot scored significantly lower in GQS, CLEAR appropriateness, and relevance (p < 0.01), while ChatGPT provided less evidence-based content than its counterparts (p < 0.001). Reliability was poor across models (mean DISCERN score: 34.73 ± 9.49), and readability was difficult (mean FRE: 34.64; FKGL: 14.13). At T2, performance improved across chatbots, particularly for Copilot, yet actionability remained limited and citations were inconsistent. This year-long longitudinal analysis shows an overall improvement in chatbot performance, although concerns regarding information reliability persist. These findings underscore the importance of human oversight of AI-mediated patient information, reaffirming that clinicians should remain the primary source of patient education.
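The FRE and FKGL values reported above come from the standard Flesch formulas, which combine average sentence length with average syllables per word. A minimal sketch of both formulas follows; the syllable counter is a naive vowel-group heuristic for illustration, not the tool used in the study:

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: count groups of consecutive vowels (min. 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text: str) -> tuple[float, float]:
    """Return (FRE, FKGL) using the standard Flesch formulas."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)   # words per sentence
    spw = syllables / len(words)        # syllables per word
    fre = 206.835 - 1.015 * wps - 84.6 * spw
    fkgl = 0.39 * wps + 11.8 * spw - 15.59
    return fre, fkgl
```

For context, an FRE near 35 and an FKGL above 14 (as reported) correspond to college-level reading difficulty, well above the 6th-to-8th-grade level generally recommended for patient materials.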
