This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Opening the ‘black box’ of the silent phase evaluation for artificial intelligence: a scoping review and critical analysis
0
Citations
25
Authors
2025
Year
Abstract
‘Silent’ evaluation refers to the prospective, non-interventional testing of artificial intelligence (AI) model performance in the intended clinical setting without affecting patient care or institutional operations. The silent evaluation phase has received less attention than in silico algorithm development or formal clinical evaluations, despite increasing recognition of this type of evaluation as a critical phase in an effective translation process for healthcare AI tools. There are currently no formal guidelines for conducting silent AI evaluations in health settings. We undertook a scoping review to identify silent AI evaluations described in the literature, aiming to summarize current practices for the conduct of silent evaluations. We screened PubMed, Web of Science, and Scopus databases for articles fitting our criteria for silent AI evaluations, or ‘silent trials’, published from 2015 to 2025. A total of 891 articles were identified, and 75 met the criteria for inclusion in the final review. We found wide variance in terminology, description, and rationale for silent evaluations; this led to substantial heterogeneity in what was reported. Overwhelmingly, papers reported measurement of AUC, precision/recall, positive and negative predictive values and similar technical performance metrics. Far fewer studies reported the verification of outputs against an in-situ clinical ground truth, and, when reported, the comprehensiveness of such verification was highly variable. We noted relatively less discussion of sociotechnical components such as stakeholder engagement and human-computer interaction elements.
We conclude that there is an opportunity to bring together diverse evaluative practices (e.g., from data science, human factors, and other fields) if the silent evaluation phase is to be maximally effective as a translational mechanism. These gaps mirror challenges in effective translation of AI tools from “computer to bedside” and identify opportunities to improve silent evaluation protocols that address key translational needs. This is important as healthcare organizations and regulatory bodies worldwide seek guidance for gathering meaningful evidence of the impact of AI tools on clinical practice.
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,200 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,051 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,416 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,410 citations
Authors
- Lana Tikhomirov
- Carolyn Semmler
- Noah Prizant
- Srijan Bhasin
- Georgia Kenyon
- Anton van der Vegt
- Lauren Erdman
- Lyle J. Palmer
- Ahmed Abdullahi Mohamud
- Judy Wawira Gichoya
- Seyi Soremekun
- Mark Sendak
- James A. Anderson
- Stephen Pfohl
- Ian Stedman
- Daniel Ehrmann
- Karin Verspoor
- Jethro C.C. Kwong
- Lesley-Anne Farmer
- Alex John London
- Ismail Akrout
- Shalmali Joshi
- Elena Dicus
- Xiaoxuan Liu
- Melissa D. McCradden