This is an overview page with metadata for this scientific publication. The full article is available from the publisher.
A Comparative Analysis of Five AI Chatbots in Providing Patient Education on Smile Design
Citations: 1
Authors: 2
Year: 2025
Abstract
Background: This study aimed to evaluate and compare the accuracy, quality, readability, understandability, and actionability of responses provided by five AI chatbots—Microsoft Copilot, ChatGPT-4, ChatGPT-5, Google Gemini, and Claude Sonnet 4.5—to patient questions about smile design and anterior aesthetic dental procedures. Method: Twenty-eight patient-oriented questions were collected from Reddit and Quora. A volunteer asked these questions to the five AI chatbots on the same day in a blinded order. Each response was recorded and coded to maintain anonymity. Two prosthodontists independently assessed the responses for accuracy using a 5-point Likert scale, quality using the Global Quality Scale (GQS), and understandability and actionability using the Patient Education Materials Assessment Tool (PEMAT-P). Readability was measured with Flesch Reading Ease (FRE) and Flesch-Kincaid Grade Level (FKGL). Inter-rater reliability was calculated using Cohen's kappa. Statistical analyses were performed using Kruskal-Wallis tests for non-parametric data and ANOVA for normally distributed readability scores, with p < 0.05 considered statistically significant. Results: Significant differences were observed in accuracy (p = 0.013) and quality (p < 0.001) among the chatbots. ChatGPT-5 had lower accuracy than Google Gemini (p = 0.017) and Claude Sonnet 4.5 (p = 0.041) and lower quality than all other chatbots (p < 0.001). Readability differed significantly (FRE: p = 0.004; FKGL: p < 0.001), with ChatGPT-5 responses requiring the highest reading level. PEMAT-P scores also showed significant differences in understandability and actionability (p < 0.001), with ChatGPT-5 displaying lower scores than the other chatbots. Microsoft Copilot, ChatGPT-4, and Google Gemini generally provided higher-quality, more understandable, and more actionable information, while ChatGPT-5 and Claude Sonnet 4.5 showed limitations.
Most chatbot responses were above an eighth-grade reading level, which may challenge general patient comprehension. Conclusion: AI chatbots vary considerably in the quality and usefulness of information they provide for complex dental procedures like smile design. While some models deliver accurate and comprehensible responses, others may produce lower-quality, less actionable content. Despite high understandability in most responses, high reading levels and low actionability could limit patient comprehension and effective decision-making. Care should be taken when patients rely on AI chatbots for dental education, and further improvements are needed to enhance reliability, readability, and actionable guidance.
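The readability metrics cited above follow standard published formulas: FRE = 206.835 − 1.015 × (words/sentences) − 84.6 × (syllables/words), and FKGL = 0.39 × (words/sentences) + 11.8 × (syllables/words) − 15.59. As a minimal sketch of how such scores can be computed (the heuristic syllable counter is an assumption for illustration; the study does not specify its scoring tool), consider:

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: count groups of consecutive vowels.
    A dictionary-based counter would be more accurate."""
    word = word.lower().strip(".,;:!?\"'()")
    vowels = "aeiouy"
    count = 0
    prev_vowel = False
    for ch in word:
        is_vowel = ch in vowels
        if is_vowel and not prev_vowel:
            count += 1
        prev_vowel = is_vowel
    # Treat a final silent 'e' as non-syllabic when another syllable exists.
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def readability(text: str) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = text.split()
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)   # average words per sentence
    spw = syllables / len(words)        # average syllables per word
    fre = 206.835 - 1.015 * wps - 84.6 * spw
    fkgl = 0.39 * wps + 11.8 * spw - 15.59
    return fre, fkgl
```

Under these formulas, longer sentences and longer words lower the FRE score and raise the FKGL grade; an FKGL above 8 corresponds to the "above an eighth-grade reading level" threshold the study uses.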
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,303 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,155 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,555 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,453 citations