This is an overview page with metadata for this scientific work. The full article is available from the publisher.
How much can large language models of Artificial Intelligence inform patients about urodynamics? A comparative analysis
0
Citations
2
Authors
2026
Year
Abstract
Aims: To evaluate and compare the readability and informational quality of current large language models (LLMs) in providing patient information about urodynamics (UD) testing.

Methods: This cross-sectional study, conducted on October 1, 2025, analyzed five widely used LLMs: ChatGPT-5, Gemini 2.5 Pro, Grok 4, Deepseek v3.1, and Microsoft Copilot. The top 25 UD-related keywords searched on Google Trends (2004-2025), excluding six of them, were entered into each chatbot using identical prompts. Outputs were independently evaluated using the Quality Analysis of Medical Artificial Intelligence (QAMAI) and DISCERN instruments to assess text quality and reliability, while the Flesch-Kincaid Reading Ease (FKRE) and Grade Level (FKGL) indices measured readability. Additionally, each LLM was asked to generate a visual depiction of a UD setting to assess the educational potential of AI-based multimodal content.

Results: The evaluated LLMs showed significant differences in readability and informational quality (p=0.001). Gemini achieved the highest FKRE score (49.0±8.4) and the lowest FKGL (9.4±1.3), indicating superior readability. Deepseek achieved the highest QAMAI (27.7±1.5) and DISCERN (71.5±6.4) scores, indicating superior quality and reliability. Copilot demonstrated lower readability and consistency scores than the other evaluated models. AI-generated visualizations of UD settings (using Gemini, GPT-5, Grok, Copilot, and DALL-E) effectively depicted the main components of the procedure.

Conclusion: LLMs show significant variability in the quality, accuracy, and readability of UD-related patient information. Deepseek delivered the most accurate and structured content, whereas Gemini provided the most understandable language. Continuous validation, guideline-based fine-tuning, and expert supervision are essential before AI chatbots can be reliably adopted in patient education and urology practice.
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,303 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,155 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,555 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,453 citations