OpenAlex · Updated hourly · Last updated: 24.04.2026, 21:54

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Cognitively Biased Prompt Effects on Large Language Model Accuracy for Radiology Board-Style Examination Questions

2026 · 0 citations · Radiology: Artificial Intelligence

0 Citations · 5 Authors · Year: 2026

Abstract

Large language models (LLMs) are increasingly explored for radiology-related applications, yet their vulnerability to cognitive biases remains undercharacterized. The aim of this study was to investigate whether targeted prompts exploiting cognitive biases degrade LLM accuracy on radiology board-style questions. Ten contemporary LLMs were evaluated on 200 text-based and 200 multimodal American Board of Radiology examination-style questions under baseline and three cognitive bias prompts: authority bias prompts (ABPs), complexity bias prompts (CBPs), and anchoring bias prompts (AnBPs). Two mitigation approaches, a prompt bias audit and a one-shot mitigation strategy, were also evaluated. Under baseline prompts, models achieved a mean accuracy of 84.8 ± 5.5% (154-186 of 200) for text-based and 59.5 ± 7.7% (101-143 of 200) for multimodal questions. All models showed reduced accuracy under cognitively biased prompts, with ABP, CBP, and AnBP yielding absolute declines of 21.1%, 10.1%, and 4.4%, respectively, for text questions (P < .001 for each), and 44.9%, 44.4%, and 39.6%, respectively, for multimodal questions (P < .001 for each). The prompt bias audit increased accuracy by 5.6% for text-based and 15.8% for multimodal questions, while the one-shot mitigation yielded gains of 4.0% for text questions and 24.9% for multimodal questions. These findings demonstrate that LLMs are susceptible to cognitively biased inputs. ©RSNA, 2026.
