This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Comparative Evaluation the Knowledge of Large Language Models about Response Evaluation Criteria in Solid Tumors?
0
Citations
3
Authors
2025
Year
Abstract
PURPOSE: To evaluate the diagnostic prowess of eight cutting-edge large language models (LLMs) in applying the RECIST 1.1 guidelines for oncologic imaging and to compare their performance with that of board-certified radiologists. This study explores the potential of LLMs as transformative adjuncts in cancer follow-up imaging.
MATERIAL AND METHOD: In this experimental cross-sectional study, 50 text-based and 30 case-based multiple-choice questions (MCQs) derived from RECIST 1.1 were administered to eight LLMs (including ChatGPT variants, Claude 3 Opus and 3.5 Sonnet, Google Gemini 1.5 Pro, Meta Llama 3.1 405B, Mistral Large 2, and Perplexity Pro) and to two junior radiologists with seven years of experience. Responses were independently scored as correct or incorrect, and non-parametric statistical analyses were performed to compare performance across groups.
RESULTS: Strikingly, all LLMs demonstrated competence comparable to that of the radiologists, with only minor performance variations. Claude 3.5 Sonnet led the pack, achieving 83.3% accuracy on case-based questions and 90% on text-based questions. The other models also performed robustly, with no significant differences in case-based assessments between LLMs and radiologists.
CONCLUSION: Our findings may herald a major change in the reporting of follow-up imaging for cancer patients, which plays an important role in clinical practice. The exceptional performance of the LLMs, particularly Claude 3.5 Sonnet, underscores the promise of LLMs as revolutionary tools in oncologic imaging. These models not only support radiologists but may soon redefine clinical workflows, setting a new benchmark for diagnostic excellence in radiology.
Similar works
New response evaluation criteria in solid tumours: Revised RECIST guideline (version 1.1)
2008 · 28,834 citations
TNM Classification of Malignant Tumours
1987 · 16,123 citations
A survey on deep learning in medical image analysis
2017 · 13,528 citations
Reduced Lung-Cancer Mortality with Low-Dose Computed Tomographic Screening
2011 · 10,749 citations
The American Joint Committee on Cancer: the 7th Edition of the AJCC Cancer Staging Manual and the Future of TNM
2010 · 9,104 citations