This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Evaluation of large language models in rheumatology and clinical immunology: a systematic assessment based on Chinese national health professional qualification examination
0
Citations
8
Authors
2026
Year
Abstract
In recent years, large language models (LLMs) have achieved remarkable progress in natural language processing and demonstrated potential applications in medicine. However, their professional capabilities in specific medical subfields, such as immunology, still require systematic evaluation. This study systematically evaluated 11 representative LLMs, including the DeepSeek, GPT, Llama, Gemma, and Qwen series, based on the Chinese National Health Professional Qualification Examination in Rheumatology and Clinical Immunology. The evaluation covered four dimensions: basic medical knowledge, related medical knowledge, immunology knowledge, and professional practice ability. The results showed significant differences among the LLMs. DeepSeek-R1 and Qwen3 achieved the best performance, with accuracy exceeding 90%. However, performance on professional practice ability tasks remained relatively low, highlighting limitations in complex clinical applications.
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,245 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,100 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,466 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,429 citations