This is an overview page with metadata for this scientific publication. The full article is available from the publisher.
Can ChatGPT 4.0 Diagnose Epilepsy? A Study on Artificial Intelligence’s Diagnostic Capabilities
Citations: 5
Authors: 5
Year: 2025
Abstract
<b>Objectives</b>: This study investigates the potential of artificial intelligence (AI), specifically large language models (LLMs) such as ChatGPT, to enhance decision support in diagnosing epilepsy. AI tools can improve diagnostic accuracy, efficiency, and decision-making speed. The aim of this study was to compare the level of agreement in epilepsy diagnosis between human experts (epileptologists) and AI (ChatGPT), using the 2014 International League Against Epilepsy (ILAE) criteria, and to identify potential predictors of diagnostic errors made by ChatGPT. <b>Methods</b>: A retrospective analysis was conducted on data from 597 patients who visited the emergency department for either a first epileptic seizure or a recurrence. Diagnoses made by experienced epileptologists were compared with those made by ChatGPT 4.0, which was prompted with the 2014 ILAE epilepsy definition. The agreement between human and AI diagnoses was assessed using Cohen's kappa statistic. Sensitivity and specificity were compared using 2 × 2 contingency tables, and multivariate analyses were performed to identify variables associated with diagnostic errors. <b>Results</b>: Neurologists diagnosed epilepsy in 216 patients (36.2%), while ChatGPT diagnosed it in 109 patients (18.2%). The agreement between neurologists and ChatGPT was very low, with a Cohen's kappa value of -0.01 (95% confidence interval, CI: -0.08 to 0.06). ChatGPT's sensitivity was 17.6% (95% CI: 14.5-20.6), specificity was 81.4% (95% CI: 78.2-84.5), positive predictive value was 34.8% (95% CI: 31.0-38.6), and negative predictive value was 63.5% (95% CI: 59.6-67.4). ChatGPT made diagnostic errors in 41.7% of the cases, with errors more frequent in older patients and in those with specific medical conditions. Correct classification was associated with acute symptomatic seizures of unknown etiology.
<b>Conclusions</b>: ChatGPT 4.0 does not match human clinicians' performance in diagnosing epilepsy: it performs poorly at identifying epilepsy but better at recognizing non-epileptic cases. The overall concordance between human clinicians and AI is extremely low. Further research is needed to improve the diagnostic accuracy of ChatGPT and other LLMs.
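The agreement statistics reported in the abstract can be recomputed from the 2 × 2 contingency table (neurologist vs. ChatGPT diagnosis). The cell counts below are not given in the abstract; they are reconstructed from the reported totals and percentages (216 of 597 diagnosed by neurologists, 109 by ChatGPT, sensitivity 17.6%, specificity 81.4%), so they are approximate up to rounding. A minimal sketch:

```python
# Approximate 2x2 contingency table, reconstructed from the abstract's
# reported totals and percentages (not taken from the original data):
tp, fn = 38, 178   # neurologist diagnosed epilepsy (n = 216)
fp, tn = 71, 310   # neurologist ruled out epilepsy (n = 381)
n = tp + fn + fp + tn  # 597 patients

# Standard diagnostic accuracy measures
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)   # positive predictive value
npv = tn / (tn + fn)   # negative predictive value

# Cohen's kappa: observed agreement corrected for chance-expected agreement
p_obs = (tp + tn) / n
p_exp = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n**2
kappa = (p_obs - p_exp) / (1 - p_exp)
```

With these reconstructed counts the sketch reproduces the reported values (sensitivity ≈ 17.6%, specificity ≈ 81.4%, PPV ≈ 34.9%, NPV ≈ 63.5%, kappa ≈ -0.01), which illustrates why a near-zero kappa means the two raters agree no better than chance despite a moderate raw agreement rate.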
Similar Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,245 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,100 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,466 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,429 citations