OpenAlex · Aktualisierung stündlich · Letzte Aktualisierung: 17.03.2026, 12:14

Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

Evaluating the strengths and weaknesses of large language models in answering neurophysiology questions

2023·3 Zitationen·Research Square (Research Square)Open Access
Volltext beim Verlag öffnen

3

Zitationen

4

Autoren

2023

Jahr

Abstract

<title>Abstract</title> <bold>Background: </bold>Large language models (LLMs), such as ChatGPT, Google's Bard, and Anthropic's Claude, demonstrate impressive natural language capabilities. Assessing their competence in specialized domains such as neurophysiology is important for determining their utility in research, education, and clinical applications. <bold>Objectives:</bold>This study evaluates and compares the performance of LLMs in answering neurophysiology questions in English and Persian across different topics and cognitive levels. <bold>Methods:</bold>Twenty questions spanning 4 topics (general, sensory system, motor system, and integrative) and 2 cognitive levels (lower-order and higher-order) were presented to the LLMs. Physiologists scored the essay-style responses from 0-5 points. Statistical analysis compared the scores at themodel, language, topic, and cognitive levels. <bold>Results:</bold>Overall,the models performed well (mean score=3.56/5), with no significant difference between language or cognitive levels. Performance was the strongest in themotor system (mean=4.52) and the weakest in integrative topics (mean=2.1). Detailed qualitative analysis revealed inconsistencies and gaps in reasoning. <bold>Conclusions:</bold> Thisstudy provides insights into LLMs’ capabilities and limitations in neurophysiology. The models exhibit competence in fundamental concepts but face challenges in advanced reasoning and integration. Targeted training could address gaps in knowledge and causal reasoning. As LLMs evolve, rigorous domain-specific assessments will be important to gauge progress.

Ähnliche Arbeiten

Autoren

Institutionen

Themen

Artificial Intelligence in Healthcare and EducationTopic ModelingExplainable Artificial Intelligence (XAI)
Volltext beim Verlag öffnen