This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
RADO: Trustworthy Radiology Impression Generation using Safety and Faithfulness based Preference Optimization
Citations: 0
Authors: 6
Year: 2026
Abstract
Radiology impression generation involves producing concise, clinically meaningful summaries from detailed imaging findings such as CT and MRI scans, serving as a critical aid in diagnosis and treatment planning. However, recent studies highlight a severe shortage of radiologists, particularly in low- and middle-income countries, where there is fewer than one radiologist per 100,000 people, making timely expert interpretation a significant challenge. While advances in AI, especially large language models (LLMs), offer the potential to automate this task, current systems often suffer from hallucinations, omissions of key clinical details, and a lack of linguistic clarity, raising serious concerns about their safety and reliability in real-world clinical settings. In this work, we address this issue by introducing RADO, a novel framework for radiology impression generation that integrates safety, faithfulness, and linguistic refinement rewards for preference optimization. To support robust evaluation, we introduce RIB, a real-world benchmark dataset curated and annotated by radiologists, comprising 1,429 annotated CT and MRI findings and impressions across 27 study types. RADO enforces critical safety and factuality constraints via carefully designed reward models and achieves state-of-the-art performance across multiple automatic and human evaluation metrics. Our framework significantly outperforms existing baselines, demonstrating improved factual consistency, reduced omissions, and higher clinical relevance, thus advancing the safety and reliability of generative AI in high-stakes medical applications. The code and dataset associated with the work are made available at RADO. Disclaimer: This work includes descriptions of medical reports related to the subject of the study, which some readers may find sensitive or potentially distressing.
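As a rough illustration of how several reward signals can feed preference optimization, the sketch below scores candidate impressions with three hypothetical rewards (safety, faithfulness, linguistic quality), combines them with a weighted sum, and selects a chosen/rejected pair of the kind that preference-optimization methods train on. The abstract does not specify how RADO combines its rewards, so the `Candidate` fields, the weights, and the best-versus-worst pairing rule are all assumptions made for illustration, not the authors' method.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Candidate:
    """A generated impression with hypothetical reward scores in [0, 1]."""
    text: str
    safety: float        # assumed output of a safety reward model
    faithfulness: float  # assumed output of a faithfulness reward model
    fluency: float       # assumed output of a linguistic-quality reward model


def combined_reward(c: Candidate,
                    weights: Tuple[float, float, float] = (0.4, 0.4, 0.2)) -> float:
    """Weighted sum of the three rewards; the weights are illustrative only."""
    w_safety, w_faith, w_fluency = weights
    return w_safety * c.safety + w_faith * c.faithfulness + w_fluency * c.fluency


def build_preference_pair(candidates: Sequence[Candidate]) -> Tuple[str, str]:
    """Return (chosen, rejected): the highest- and lowest-scoring candidates."""
    if len(candidates) < 2:
        raise ValueError("need at least two candidates to form a preference pair")
    ranked = sorted(candidates, key=combined_reward, reverse=True)
    return ranked[0].text, ranked[-1].text


# Toy example with made-up impressions and scores.
candidates = [
    Candidate("No acute intracranial abnormality.", 0.9, 0.8, 0.7),
    Candidate("Findings suggest a fracture not mentioned in the report.", 0.2, 0.3, 0.9),
]
chosen, rejected = build_preference_pair(candidates)
```

Pairs built this way could then be passed to a preference-optimization trainer; which objective RADO actually uses is not stated in the abstract.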
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,626 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,532 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,046 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,843 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations