This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Comparison of AI-Generated Radiology Impressions: A Multi-Stakeholder Evaluation
Citations: 0
Authors: 14
Year: 2026
Abstract
Objective: To evaluate the quality, safety, and clinical utility of AI-generated radiology impressions compared with human-authored impressions across multiple clinical stakeholder groups.

Materials & Methods: A retrospective, blinded evaluation was conducted using 200 oncologic computed-tomography reports from a U.S. academic cancer center. Three impression types were assessed for each report: original radiologist-authored impressions, impressions generated by a custom domain-specific AI model fine-tuned on institutional data, and impressions generated by a general-purpose large language model. Original authoring radiologists, independent radiologists, and oncologists evaluated the impressions using structured Likert-scale metrics assessing completeness, correctness, conciseness, clarity, clinical utility, and potential patient harm. Pairwise comparisons were performed using Wilcoxon signed-rank and two-proportion z-tests.

Results: Custom-model AI impressions demonstrated near parity with human-authored impressions across most quality metrics. Original radiologists rated their own impressions as slightly more complete, while independent radiologists found no significant differences between original and custom-model impressions. Generic-model impressions were longer and were rated as more complete but significantly less concise. Patient-harm ratings were uniformly low. Radiologists preferred original and custom-model impressions over generic-model impressions, whereas oncologists showed no significant preference.

Discussion: Evaluation outcomes varied by stakeholder group, highlighting differing priorities between radiologists and oncologists. Low inter-rater agreement across several quality metrics suggests that impression quality is inherently subjective and context dependent rather than defined by a single objective standard.

Conclusion: AI-generated radiology impressions, particularly those produced by custom domain-specific models, can achieve quality and safety comparable to human-authored impressions. These findings support the use of AI as an adaptable drafting aid that complements radiologist judgment.
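The abstract's pairwise comparisons rest on two standard tests: the paired Wilcoxon signed-rank test for Likert ratings and the two-proportion z-test for binary outcomes. The sketch below is illustrative only, not the authors' analysis code; it uses the normal approximation for both tests, and all function names and example ratings are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def wilcoxon_signed_rank(x, y):
    """Paired Wilcoxon signed-rank test with normal approximation.
    Returns (W, two-sided p). Zero differences are dropped; tied
    absolute differences receive average ranks."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average rank for the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    w = min(w_plus, w_minus)  # test statistic: smaller rank sum
    mu = n * (n + 1) / 4
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mu) / sigma  # w <= mu, so z <= 0
    return w, min(2 * NormalDist().cdf(z), 1.0)

def two_proportion_z(x1, n1, x2, n2):
    """Two-proportion z-test with pooled standard error.
    Returns (z, two-sided p)."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    return z, 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical paired Likert ratings (e.g. clarity of human vs. AI drafts):
human_ratings = [4, 5, 3, 4, 5, 4, 3, 5]
ai_ratings    = [3, 4, 3, 4, 4, 3, 3, 4]
w, p = wilcoxon_signed_rank(human_ratings, ai_ratings)
```

For larger metrics tables, `scipy.stats.wilcoxon` and `statsmodels.stats.proportion.proportions_ztest` provide the same tests with exact-distribution options and continuity corrections.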
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,245 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,102 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,468 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,429 citations