This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

From GPT-3.5 to GPT-5.2: a paired longitudinal evaluation of large language models in clinical neurology

2026 · 0 citations · Neurological Research

Abstract

INTRODUCTION: Large language models are increasingly used to evaluate medical data and support clinical decision making. Data on how the performance of these models has evolved are limited. This study evaluated the intergenerational development of model performance using paired methods in neurology, a discipline in which the need to synthesize and contextualize complex information is high.

METHODS: The scoring system and the 216-question clinical neurology question set from our previous study were reused in a methodological replication. The questions, evenly distributed across 12 subspecialties (18 per subspecialty), were divided into subgroups by question type, difficulty level, and qualitative characteristics. Three independent academics rated the responses for accuracy and comprehensiveness. Effect sizes were calculated using matched analyses between the two generations.

RESULTS: [...] < 0.001; r: 0.78) were significantly higher than those of the previous version, with medium to high effect sizes. Consistent performance improvement was observed across question types, difficulty levels, and qualitative characteristics. Performance was relatively low in some subspecialties.

CONCLUSION: The GPT-5.2 model demonstrated a significant performance increase over the previous model on clinical neurology questions. The increase was supported by high effect sizes, indicating potential clinical relevance. Model evolution was not homogeneous across subspecialties. Integrating the model into clinical systems with strict control mechanisms may alleviate safety concerns.

Topics

Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare · Genomics and Rare Diseases
