This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Performance comparison of large language models in boron neutron capture therapy knowledge assessment
Citations: 0
Authors: 8
Year: 2026
Abstract
Accelerator-based boron neutron capture therapy (BNCT) is a binary radiation therapy that has developed rapidly in recent years. This study systematically evaluated and compared the performance of four mainstream model families [ChatGPT, Bard (Gemini), Claude, and ERNIE Bot] in answering BNCT-related knowledge questions, providing a reference for exploring their potential in BNCT professional education. Forty-seven bilingual BNCT questions covering key concepts, clinical practice, and reasoning tasks were constructed. The four model families were tested across five rounds in two languages and question formats. Accuracy, reasoning ability, uncertainty expression, and version effects were analyzed. ChatGPT (72.8%) and Claude (70.4%) showed significantly higher overall accuracy than Bard (Gemini) (62.0%) and ERNIE Bot (55.6%) (p < 0.001). Both high-performance models performed significantly better on reasoning-based questions than on fact-based questions (p < 0.001). The average performance improvement from version updates (7.51 ± 8.46 percentage points) was numerically higher than the change during same-version maintenance (0.61 ± 8.68 percentage points, p = 0.126). Although language and question format showed statistically significant effects, the effect sizes were minimal (partial η² < 0.01). Uncertainty acknowledgment rates varied significantly among the model families (4.7% to 23.7%, p = 0.003). ChatGPT can provide relatively accurate knowledge for the popularization of BNCT. However, existing general-purpose LLMs still cannot accurately answer all BNCT questions and show significant differences in uncertainty expression.
Similar works
Fully optimized contracted Gaussian basis sets of triple zeta valence quality for atoms Li to Kr
1994 · 9,101 citations
Synthesis of borophenes: Anisotropic, two-dimensional boron polymorphs
2015 · 2,684 citations
Experimental realization of two-dimensional boron sheets
2016 · 1,850 citations
Cucurbituril Homologues and Derivatives: New Opportunities in Supramolecular Chemistry
2003 · 1,799 citations
Frustrated Lewis Pair Chemistry: Development and Perspectives
2015 · 1,733 citations