This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Development and External Validation of an Artificial Intelligence Model for Identifying Radiology Reports Containing Recommendations for Additional Imaging
Citations: 17 · Authors: 11 · Year: 2023
Abstract
<b>BACKGROUND.</b> Reported rates of recommendations for additional imaging (RAIs) in radiology reports are low. Bidirectional encoder representations from transformers (BERT), a deep learning model pretrained to understand language context and ambiguity, has potential for identifying RAIs and thereby assisting large-scale quality improvement efforts. <b>OBJECTIVE.</b> The purpose of this study was to develop and externally validate an artificial intelligence (AI)-based model for identifying radiology reports containing RAIs. <b>METHODS.</b> This retrospective study was performed at a multisite health center. A total of 6300 radiology reports generated at one site from January 1, 2015, to June 30, 2021, were randomly selected and split in a 4:1 ratio to create training (<i>n</i> = 5040) and test (<i>n</i> = 1260) sets. A total of 1260 reports generated at the center's other sites (including academic and community hospitals) from April 1 to April 30, 2022, were randomly selected as an external validation group. Referring practitioners and radiologists of varying subspecialties manually reviewed report impressions for the presence of RAIs. A BERT-based technique for identifying RAIs was developed using the training set. Performance of the BERT-based model and a previously developed traditional machine learning (TML) model was assessed in the test set. Finally, performance was assessed in the external validation set. The code for the BERT-based RAI model is publicly available. <b>RESULTS.</b> Among a total of 7419 unique patients (4133 women, 3286 men; mean age, 58.8 years), 10.0% of 7560 reports contained an RAI. In the test set, the BERT-based model had 94.4% precision, 98.5% recall, and an F1 score of 96.4%. In the test set, the TML model had 69.0% precision, 65.4% recall, and an F1 score of 67.2%. In the test set, accuracy was greater for the BERT-based model than for the TML model (99.2% vs 93.1%, <i>p</i> < .001). 
In the external validation set, the BERT-based model had 99.2% precision, 91.6% recall, an F1 score of 95.2%, and 99.0% accuracy. <b>CONCLUSION.</b> The BERT-based AI model accurately identified reports with RAIs, outperforming the TML model. High performance in the external validation set suggests the potential for other health systems to adapt the model without requiring institution-specific training. <b>CLINICAL IMPACT.</b> The model could potentially be used for real-time EHR monitoring for RAIs and other improvement initiatives to help ensure timely performance of clinically necessary recommended follow-up.
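As a quick sanity check, each F1 score reported in the abstract is the harmonic mean of the corresponding precision and recall. A minimal sketch, using only the values stated above (the function name is illustrative, not from the paper's released code):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Values reported in the abstract, expressed as fractions.
bert_test = f1_score(0.944, 0.985)  # BERT-based model, test set
tml_test = f1_score(0.690, 0.654)   # TML model, test set
bert_ext = f1_score(0.992, 0.916)   # BERT-based model, external validation

print(f"{bert_test:.1%}")  # 96.4%
print(f"{tml_test:.1%}")   # 67.2%
print(f"{bert_ext:.1%}")   # 95.2%
```

All three computed values round to the F1 scores reported in the abstract.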
Similar Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,231 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,084 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,444 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,423 citations