This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Towards Community-Based Evaluation of AI in Neurology: Development of a Headache Diagnosis Dataset for Large Language Models
Citations: 0
Authors: 3
Year: 2025
Abstract
Diagnosing headache disorders remains a clinical challenge due to the heterogeneity of headache phenotypes and the absence of objective biomarkers. This study presents a curated dataset of 50 clinical headache case examples, comprising both real (n = 34) and synthetic (n = 16) cases, categorized across 20 diagnoses according to ICHD-3 criteria. The dataset enables the evaluation of large language models (LLMs) for diagnostic accuracy in headache medicine. Three GPT-based models were tested using different prompting strategies, with diagnostic performance assessed at both diagnosis and group levels. Top-1 accuracy ranged from 24% to 63% at the diagnosis level and up to 92% at the group level. The results highlight the potential of LLMs in supporting differential diagnosis of headache disorders, while also emphasizing the need for further validation with larger, diverse datasets. Future efforts will focus on expanding real-world data through clinical collaborations and benchmarking LLMs against medical professionals to assess their utility in clinical decision-making.
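The two evaluation levels mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each case has one reference diagnosis and one top-ranked model prediction, and that group-level scoring maps each diagnosis to its broader headache group before comparing. The labels, the mapping `DIAGNOSIS_TO_GROUP`, and the function names are hypothetical and are not taken from the dataset or the study's code.

```python
from typing import Dict, List

# Hypothetical diagnosis-to-group mapping; illustrative labels only.
DIAGNOSIS_TO_GROUP: Dict[str, str] = {
    "migraine without aura": "migraine",
    "migraine with aura": "migraine",
    "episodic cluster headache": "trigeminal autonomic cephalalgias",
    "chronic tension-type headache": "tension-type headache",
}


def top1_accuracy(predicted: List[str], reference: List[str]) -> float:
    """Fraction of cases whose top-ranked prediction equals the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one case is required")
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


def to_groups(labels: List[str]) -> List[str]:
    """Map diagnosis labels to their (assumed) headache groups."""
    return [DIAGNOSIS_TO_GROUP[label] for label in labels]


# Four made-up cases: the first prediction misses the exact diagnosis
# but still falls into the correct group.
reference = [
    "migraine without aura",
    "migraine with aura",
    "episodic cluster headache",
    "chronic tension-type headache",
]
predicted = [
    "migraine with aura",
    "migraine with aura",
    "episodic cluster headache",
    "chronic tension-type headache",
]

diagnosis_acc = top1_accuracy(predicted, reference)
group_acc = top1_accuracy(to_groups(predicted), to_groups(reference))
print(f"diagnosis-level top-1 accuracy: {diagnosis_acc:.2f}")
print(f"group-level top-1 accuracy: {group_acc:.2f}")
```

In this toy example the group-level score is higher than the diagnosis-level score, because a prediction of the wrong subtype within the right group counts as correct only at the group level. This matches the direction of the gap reported in the abstract, though the numbers here are invented.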
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,260 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,116 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,493 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,438 citations