This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Who Matters More in Radiology Report Generation: Vision Encoders or Language Models?
Citations: 0
Authors: 7
Year: 2025
Abstract
The rapid development of Multimodal Large Language Models (MLLMs) has advanced Radiology Report Generation (RRG). While much of this progress is driven by increasingly powerful Large Language Models (LLMs), the roles of both the vision encoder and the LLM remain underexplored, especially in domain-specific contexts. In this work, we systematically study how different vision encoders and LLMs affect RRG performance, analyzing the task from both vision- and language-centric perspectives. Through extensive evaluation, we show that domain-adapted vision encoders and LLMs significantly enhance the quality and clinical relevance of generated reports. These findings offer practical guidance for building effective MLLMs in medical imaging.