OpenAlex · Updated hourly · Last updated: 16 March 2026, 20:39

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Content Validity of AI-Generated Medical Information on Idiopathic Pulmonary Fibrosis (IPF): A Comparative Analysis of ChatGPT-4 and Gemini 1.5 Pro

2025 · 0 citations · 10 authors

Abstract

Background: Idiopathic pulmonary fibrosis (IPF) is characterized by progressively declining respiratory function and quality of life, with high mortality. Large language models (LLMs) produce coherent medical information, but their accuracy, readability, and adherence to IPF guidelines remain unconfirmed. Aim: To evaluate the reliability and accuracy of LLMs in generating medically and clinically relevant content related to IPF. Methods: Responses from ChatGPT-4 and Gemini 1.5 Pro to 23 IPF-related questions drawn from the ATS/ERS/JRS/ALAT guidelines were compared. Six independent interstitial lung disease (ILD) experts assessed the responses for accuracy (DISCERN), reliability (JAMA Benchmark Criteria), readability (Flesch-Kincaid), and guideline adherence. Model performance was compared with Mann-Whitney U tests, and inter-rater agreement was assessed with intraclass correlation coefficients (ICC). Results: Both LLMs provided partially sufficient responses, with a median JAMA Benchmark score of 2 for both models (p = 0.24). Gemini 1.5 Pro generated higher-quality treatment-related responses than ChatGPT-4, as reflected by significantly higher DISCERN scores (56 vs. 43; p < 0.001). Regarding readability, responses from both models required college-level comprehension. The ICC analysis revealed significant inter-rater variability, with lower agreement for ChatGPT-4 (ICC = 0.361) than for Gemini 1.5 Pro (ICC = 0.813). Conclusion: While both models offer coherent medical information, their reliability remains suboptimal. Further research should focus on improving the readability of AI-generated information on IPF for practical integration into clinical practice.

Similar works