This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Large language model evaluation in autoimmune disease clinical questions comparing ChatGPT 4o, Claude 3.5 Sonnet and Gemini 1.5 pro
Citations: 3
Authors: 13
Year: 2025
Abstract
Large language models (LLMs) have established a presence in providing medical services to patients and supporting clinical practice for doctors. To explore the ability of LLMs in answering clinical questions related to autoimmune diseases, this study was designed with 65 questions related to autoimmune diseases, covering five domains: concepts, report interpretation, diagnosis, prevention and treatment, and prognosis. Types of diseases include Sjögren's syndrome, systemic lupus erythematosus, rheumatoid arthritis, systemic sclerosis, and others. These questions were answered by three LLMs: ChatGPT 4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro. The responses were then evaluated by 8 clinicians based on criteria including relevance, completeness, accuracy, safety, readability, and simplicity. We analyzed the scores of the three LLMs across five domains and six dimensions and compared their accuracy in answering the report interpretation section with that of two senior doctors and two junior doctors. The results showed that the performance of the three LLMs in the evaluation of autoimmune diseases significantly surpassed that of both junior and senior doctors. Notably, Claude 3.5 Sonnet excelled in providing comprehensive and accurate responses to clinical questions on autoimmune diseases, demonstrating the great potential of LLMs in assisting doctors with the diagnosis, treatment, and management of autoimmune diseases.