This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Performance Review of Meta LLaMa 3.1 in Thoracic Imaging and Diagnostics
Citations: 3
Authors: 3
Year: 2025
Abstract
Background: The integration of artificial intelligence (AI) in radiology has opened new possibilities for improving diagnostic accuracy, with large language models (LLMs) showing potential for supporting clinical decision-making. While proprietary models like ChatGPT have gained attention, open-source alternatives such as Meta LLaMa 3.1 remain underexplored. This study aims to evaluate the diagnostic accuracy of LLaMa 3.1 in thoracic imaging and to discuss broader implications of open-source versus proprietary AI models in healthcare.

Methods: Meta LLaMa 3.1 (8B parameter version) was tested on 126 multiple-choice thoracic imaging questions selected from Thoracic Imaging: A Core Review by Hobbs et al. These questions required no image interpretation. The model's answers were validated by two board-certified diagnostic radiologists. Accuracy was assessed overall and across subgroups, including intensive care, pathology, and anatomy. Additionally, a narrative review introduces three widely used AI platforms in thoracic imaging: DeepLesion, CheXNet, and 3D Slicer.

Results: LLaMa 3.1 achieved an overall accuracy of 61.1%. It performed well in intensive care (90.0%) and terms and signs (83.3%) but showed variability across subgroups, with lower accuracy in normal anatomy and basic imaging (40.0%). Subgroup analysis revealed strengths in infectious pneumonia and pleural disease, but notable weaknesses in lung cancer and vascular pathology.

Conclusion: LLaMa 3.1 demonstrates promise as an open-source NLP tool in thoracic diagnostics, though its performance variability highlights the need for refinement and domain-specific training. Open-source models offer transparency and accessibility, while proprietary models deliver consistency. Both hold value, depending on clinical context and resource availability.
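The abstract reports percentages but not raw counts. As a minimal sketch of how the overall figure relates to the question set, the snippet below searches for every whole number of correct answers that would round to a reported accuracy. The function name `consistent_correct_counts` and the assumption that percentages are rounded to one decimal place are illustrative choices, not something stated in the paper.

```python
def consistent_correct_counts(reported_pct, total, decimals=1):
    """Return every integer count of correct answers whose accuracy,
    rounded to `decimals` places, equals the reported percentage.

    Assumption (not from the paper): reported values are rounded
    to one decimal place.
    """
    target = round(reported_pct, decimals)
    return [
        k for k in range(total + 1)
        if round(100 * k / total, decimals) == target
    ]


# Overall result from the abstract: 61.1% on 126 questions.
print(consistent_correct_counts(61.1, 126))  # prints [77]
```

Under that rounding assumption, 77 of 126 correct answers is the only count consistent with 61.1%. The same check cannot be applied to the subgroup figures (90.0%, 83.3%, 40.0%), because the abstract does not give the subgroup sizes.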
Related works
New response evaluation criteria in solid tumours: Revised RECIST guideline (version 1.1)
2008 · 28,988 citations
TNM Classification of Malignant Tumours
1987 · 16,123 citations
A survey on deep learning in medical image analysis
2017 · 13,698 citations
Reduced Lung-Cancer Mortality with Low-Dose Computed Tomographic Screening
2011 · 10,808 citations
The American Joint Committee on Cancer: the 7th Edition of the AJCC Cancer Staging Manual and the Future of TNM
2010 · 9,118 citations