This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Exploring the capabilities and limitations of large language models in nuclear medicine knowledge with primary focus on GPT-3.5, GPT-4 and Google Bard

2024 · 7 citations · Journal of Medical Artificial Intelligence · Open Access

Abstract

Although large language models (LLMs) represent a technological advancement with the potential to transform online information search and retrieval, the possibility of them generating false information has led to significant concerns. This pilot study assessed the accuracy of three prominent LLMs (GPT-3.5, GPT-4, and Bard) in answering nuclear medicine questions at the level of medical students and general practitioners. We tested each LLM with 20 questions, presented as prompts in a four-choice single best-answer format, and assessed accuracy using the correct response rate. The questions varied in complexity, encompassing the remember, understand, and apply levels of Bloom's cognitive taxonomy. Our results showed a correct response rate of 85.0% for GPT-3.5, 95.0% for GPT-4, and 90.0% for Bard. The questions answered incorrectly by the LLMs included not only questions at the apply level, but also questions at the more basic understand and remember levels. This result suggests that LLMs were not yet able to correctly answer all nuclear medicine questions at the level of medical students and general practitioners. This could imply that caution should be exercised when using LLMs as a tool for retrieving medical information related to nuclear medicine.
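The correct response rate reported in the abstract is simply the share of the 20 questions answered correctly. The following is a minimal sketch of that arithmetic, not the authors' code. The function name is hypothetical, and the per-model counts (17, 19, 18) are not stated in the abstract; they are inferred here from the reported percentages and the 20-question test size.

```python
def correct_response_rate(num_correct: int, num_questions: int) -> float:
    """Return the percentage of questions answered correctly."""
    if num_questions <= 0:
        raise ValueError("num_questions must be positive")
    if not 0 <= num_correct <= num_questions:
        raise ValueError("num_correct must be between 0 and num_questions")
    return 100.0 * num_correct / num_questions


# Test size stated in the abstract.
NUM_QUESTIONS = 20

# Assumption: counts inferred from the reported rates (85.0%, 95.0%, 90.0%),
# not taken directly from the paper.
inferred_correct = {"GPT-3.5": 17, "GPT-4": 19, "Bard": 18}

rates = {
    model: correct_response_rate(count, NUM_QUESTIONS)
    for model, count in inferred_correct.items()
}

for model, rate in rates.items():
    print(f"{model}: {rate:.1f}%")
```

With only 20 questions, each single answer shifts the rate by 5 percentage points, so the gaps between the three models correspond to one or two questions.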

Topics

Radiomics and Machine Learning in Medical Imaging
Artificial Intelligence in Healthcare and Education
