OpenAlex · Updated hourly · Last updated: 25.03.2026, 12:50

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

How much can large language models of Artificial Intelligence inform patients about urodynamics? A comparative analysis

2026 · 0 citations · Anatolian Current Medical Journal · Open Access
Open full text at publisher

Citations: 0
Authors: 2
Year: 2026

Abstract

Aims: To evaluate and compare the readability and informational quality of current large language models (LLMs) in providing patient information about urodynamics (UD) testing.

Methods: This cross-sectional study, conducted on October 1, 2025, analyzed five widely used LLMs: ChatGPT-5, Gemini 2.5 Pro, Grok 4, Deepseek v3.1, and Microsoft Copilot. The top 25 UD-related keywords from Google Trends (2004-2025), excluding six of them, were entered into each chatbot using identical prompts. Outputs were independently assessed with the Quality Analysis of Medical Artificial Intelligence (QAMAI) and DISCERN instruments for text quality and reliability, while the Flesch-Kincaid Reading Ease (FKRE) and Grade Level (FKGL) indices measured readability. Additionally, each LLM was asked to generate a visual depiction of a UD setting to assess the educational potential of AI-based multimodal content.

Results: The evaluated LLMs showed significant differences in readability and informational quality (p=0.001). Gemini achieved the highest FKRE score (49.0±8.4) and the lowest FKGL (9.4±1.3), indicating superior readability. Deepseek achieved the highest QAMAI (27.7±1.5) and DISCERN (71.5±6.4) scores, indicating superior quality and reliability. Copilot demonstrated lower readability and consistency scores than the other evaluated models. AI-generated visualizations of UD settings (using Gemini, GPT-5, Grok, Copilot, and DALL-E) effectively depicted the main components of the procedures.

Conclusion: LLMs show significant variability in the quality, accuracy, and readability of UD-related patient information. Deepseek delivered the most accurate and structured content, whereas Gemini provided the most understandable language. Continuous validation, guideline-based fine-tuning, and expert supervision are essential before AI chatbots can be reliably adopted in patient education and urology practice.

Topics

Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare · Electronic Health Records Systems