OpenAlex · Updated hourly · Last updated: 02.05.2026, 17:53

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Evaluating the reliability of the responses of large language models to keratoconus-related questions

2024 · 7 citations · Clinical and Experimental Optometry
Open full text at the publisher

Citations: 7
Authors: 3
Year: 2024

Abstract

CLINICAL RELEVANCE: Artificial intelligence has evolved rapidly, and large language models (LLMs) have become promising tools for healthcare, with the ability to provide human-like responses to questions. The capabilities of these tools in addressing questions related to keratoconus (KCN) have not been previously explored.

BACKGROUND: In this study, the responses of three LLMs (ChatGPT-4, Copilot, and Gemini) to common patient questions regarding KCN were evaluated.

METHODS: Fifty real-life patient inquiries regarding general information, aetiology, symptoms and diagnosis, progression, and treatment of KCN were presented to the LLMs. The answers were evaluated by three ophthalmologists on a 5-point Likert scale ranging from 'strongly disagree' to 'strongly agree'. The reliability of the responses provided by the LLMs was assessed using the DISCERN and the Ensuring Quality Information for Patients (EQIP) scales. Readability metrics (Flesch Reading Ease Score, Flesch-Kincaid Grade Level, and Coleman-Liau Index) were calculated to evaluate the complexity of the responses.

RESULTS: Scores differed significantly among the three LLMs (p < 0.001), with ChatGPT-4 scoring highest and Copilot scoring lowest. Although ChatGPT-4 exhibited greater reliability based on the DISCERN scale, its responses were characterised by lower readability and higher complexity. While all LLMs provided responses categorised as 'extremely difficult to read', the responses provided by Copilot showed higher readability.

CONCLUSIONS: Despite the responses provided by ChatGPT-4 exhibiting lower readability and greater complexity, it emerged as the most proficient in answering KCN-related questions.
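The three readability metrics named in the abstract follow standard published formulas. A minimal Python sketch of how such scores are computed could look like the following; the vowel-group syllable counter is a rough heuristic (real tools such as the `textstat` library use more careful rules), so the resulting scores are approximate:

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: count contiguous vowel groups; real tools use dictionaries."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def readability(text: str) -> dict:
    """Compute Flesch Reading Ease, Flesch-Kincaid Grade Level, and Coleman-Liau Index."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    n_words = max(1, len(words))
    syllables = sum(count_syllables(w) for w in words)
    letters = sum(len(w) for w in words)

    wps = n_words / sentences        # average words per sentence
    spw = syllables / n_words        # average syllables per word
    L = letters / n_words * 100      # letters per 100 words (Coleman-Liau)
    S = sentences / n_words * 100    # sentences per 100 words (Coleman-Liau)

    return {
        "flesch_reading_ease": 206.835 - 1.015 * wps - 84.6 * spw,
        "flesch_kincaid_grade": 0.39 * wps + 11.8 * spw - 15.59,
        "coleman_liau_index": 0.0588 * L - 0.296 * S - 15.8,
    }
```

Higher Flesch Reading Ease means easier text, whereas the two grade-level indices rise with complexity, which is why the abstract can report ChatGPT-4 as both more reliable and harder to read.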
