This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Extractive Radiology Reporting with Memory-based Cross-modal Representations
Citations: 1
Authors: 4
Year: 2025
Abstract
Radiology report generation (RRG) produces detailed textual descriptions for radiographs, serving as a crucial task for medical analysis and diagnosis. Most existing RRG approaches follow the multimodal text generation paradigm, using autoregressive models to generate reports token by token; they therefore risk producing invalid content and are limited by low processing speed. Although advanced architectures, such as pre-trained models and large language models (LLMs), have been applied to RRG and achieve good performance, they still face the aforementioned risk and speed limitation, especially since LLMs may introduce hallucinations. Considering that radiology reports are highly patternized, and that their sentences convey specific meanings independently and are frequently reused, we propose a new extractive radiograph reporting (ERR) workflow and design a dedicated framework that efficiently and accurately extracts appropriate sentences from existing radiological cases to compose reports. Our approach employs a memory module to store important medical information and to enhance the encoding of the input radiograph with better cross-modal representations, which are then used to match sentences during extraction. We conducted experiments on two widely used benchmark datasets; the results demonstrate that our approach outperforms strong baselines and achieves results comparable to existing state-of-the-art generative models. Further analyses confirm that our ERR approach not only produces reports with reliable content but also ensures high training and inference efficiency.
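The extraction step described above can be illustrated with a minimal sketch: a radiograph embedding (which the paper enhances via a memory module) is scored against precomputed sentence embeddings from existing reports, and the best-matching sentences are extracted to form the new report. All names, dimensions, and the random embeddings below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

EMBED_DIM = 128          # shared cross-modal embedding size (assumed)
corpus_sentences = [     # sentence bank drawn from existing radiology cases
    "The lungs are clear without focal consolidation.",
    "No pleural effusion or pneumothorax is seen.",
    "The cardiac silhouette is mildly enlarged.",
    "Degenerative changes of the thoracic spine.",
]
# Stand-in for learned sentence embeddings in the shared space.
sentence_embeds = rng.normal(size=(len(corpus_sentences), EMBED_DIM))

def extract_report(image_embed: np.ndarray, top_k: int = 2) -> list[str]:
    """Return the top-k corpus sentences by cosine similarity to the image."""
    sims = sentence_embeds @ image_embed
    sims = sims / (np.linalg.norm(sentence_embeds, axis=1)
                   * np.linalg.norm(image_embed))
    best = np.argsort(sims)[::-1][:top_k]
    return [corpus_sentences[i] for i in best]

# Stand-in for the memory-enhanced radiograph encoder output.
image_embed = rng.normal(size=EMBED_DIM)
report = extract_report(image_embed)
print(report)
```

Because every output sentence is copied verbatim from real reports, this retrieval-style decoding avoids the invalid-content risk of token-by-token generation and requires only one similarity computation per candidate sentence.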