OpenAlex · Updated hourly · Last updated: 15.03.2026, 00:38

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Who Matters More in Radiology Report Generation: Vision Encoders or Language Models?

2025 · 0 citations · 7 authors

Abstract

The rapid development of Multimodal Large Language Models (MLLMs) has advanced Radiology Report Generation (RRG). While much of this progress is driven by increasingly powerful Large Language Models (LLMs), the roles of both the vision encoder and the LLM remain underexplored, especially in domain-specific contexts. In this work, we systematically study how different vision encoders and LLMs affect RRG performance, analyzing the task from both vision- and language-centric perspectives. Through extensive evaluation, we show that domain-adapted vision encoders and LLMs significantly enhance the quality and clinical relevance of generated reports. These findings offer practical guidance for building effective MLLMs in medical imaging.

Similar works