This is an overview page with metadata for this scholarly article. The full article is available from the publisher.
Artificial Intelligence for CT and MRI Protocoling: A Meta-Analysis of Traditional Machine Learning, BERT, and Large Language Models
Citations: 1 · Authors: 3 · Year: 2025
Abstract
<b>BACKGROUND.</b> Examination protocoling is a resource-intensive task. Various artificial intelligence (AI) approaches have been investigated to automate this process. <b>OBJECTIVE.</b> The purpose of this study was to evaluate performance of traditional machine learning (ML) models, bidirectional encoder representations from transformers (BERT) models, and large language models (LLMs) for automated CT and MRI protocoling. <b>EVIDENCE ACQUISITION.</b> MEDLINE, Embase, Scopus, Web of Science, IEEE Xplore, and Google Scholar databases were searched through July 2025 for studies reporting the performance of an AI-based technique in assigning protocols for CT or MRI requisitions. Accuracy results were separately extracted for all models tested in each study and pooled using a random-effects meta-analysis. AI approaches were compared using Welch <i>t</i> tests. Common sources of error were qualitatively summarized. <b>EVIDENCE SYNTHESIS.</b> The final analysis included 23 studies, comprising 1,196,259 imaging requisitions. Requisition subspecialties included body imaging (<i>n</i> = 4), musculoskeletal imaging (<i>n</i> = 3), neuroradiology (<i>n</i> = 6), thoracic imaging (<i>n</i> = 1), and multiple subspecialties (<i>n</i> = 9). Sixteen studies evaluated traditional ML models, eight evaluated BERT models, and five evaluated LLMs. Task-specific model fine-tuning was performed in three studies for traditional ML models, all studies for BERT models, and one study for LLMs. The overall pooled protocoling accuracy was 85% (95% CI, 83-87%). The pooled accuracy was 83% (95% CI, 80-85%) for traditional ML models, 87% (95% CI, 85-89%) for BERT models, and 86% (95% CI, 83-89%) for LLMs; these pooled accuracies were not significantly different between any pairwise combination of the three AI approaches (all <i>p</i> > .05). 
Among 30 distinct models (14 traditional ML models, nine BERT models, seven LLMs), the top-10 performing models comprised two traditional ML models, six BERT models (including the top performing model [BioBERT, a biomedical-domain BERT; accuracy, 93%]), and two LLMs. Common sources of error included ambiguous requisition text, data imbalance yielding incorrect protocol assignments for low-volume protocols, the presence of multiple clinically reasonable protocols for given requisitions, and difficulty handling requisitions containing terms strongly associated with disparate protocols. <b>CONCLUSION.</b> The top-performing AI models for automated CT and MRI protocoling included predominantly fine-tuned BERT models. <b>CLINICAL IMPACT.</b> AI tools show strong potential to help streamline radiologist workflows, possibly through hybrid AI-radiologist approaches. Fine-tuned LLMs warrant further exploration. <b>TRIAL REGISTRATION.</b> PROSPERO identifier CRD420251088671.
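The abstract states that per-study accuracy results were pooled using a random-effects meta-analysis. As an illustration only (the study's exact implementation is not given here), the sketch below shows one common way to do this: the DerSimonian-Laird estimator, which pools per-study proportions after folding an estimate of between-study variance (tau²) into each study's weight. All function and variable names are hypothetical.

```python
import math

def random_effects_pool(accs, ns):
    """Illustrative DerSimonian-Laird random-effects pooling of proportions.

    accs: per-study accuracies (0-1); ns: per-study sample sizes.
    Returns (pooled, ci_low, ci_high) with a 95% Wald confidence interval.
    """
    # Within-study variance of each proportion (binomial approximation).
    v = [p * (1 - p) / n for p, n in zip(accs, ns)]
    w = [1 / vi for vi in v]  # fixed-effect (inverse-variance) weights
    p_fixed = sum(wi * pi for wi, pi in zip(w, accs)) / sum(w)
    # Cochran's Q heterogeneity statistic and the DL estimate of tau^2.
    q = sum(wi * (pi - p_fixed) ** 2 for wi, pi in zip(w, accs))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(accs) - 1)) / c)
    # Random-effects weights add tau^2 to each study's own variance.
    w_re = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * pi for wi, pi in zip(w_re, accs)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se
```

For example, `random_effects_pool([0.83, 0.87, 0.86], [500, 800, 300])` returns a pooled accuracy between the smallest and largest study estimates, with a wider CI than a fixed-effect pool whenever tau² > 0.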
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,231 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,084 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,444 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,423 citations