This is an overview page with metadata for this scientific work. The full article is available from the publisher.
A clinically accessible small multimodal radiology model and evaluation metric for chest X-ray findings
Citations: 21
Authors: 27
Year: 2025
Abstract
Large foundation models show promise in biomedicine but face challenges in clinical use due to performance gaps, accessibility, cost, and lack of scalable evaluation. Here we show that open-source small multimodal models can bridge these gaps in radiology by generating free-text findings from chest X-ray images. Our data-centric approach leverages 697K curated radiology image-text pairs to train a specialized, domain-adapted chest X-ray encoder. We integrate this encoder with pre-trained language models via a lightweight adapter that aligns image and text modalities. To enable robust, clinically relevant evaluation, we develop and validate CheXprompt, a GPT-4-based metric for assessing factual accuracy aligned with radiologists' evaluations. Benchmarked with CheXprompt and other standard factuality metrics, LLaVA-Rad (7B) achieves state-of-the-art performance, outperforming much larger models like GPT-4V and Med-PaLM M (84B). While not immediately ready for real-time clinical deployment, LLaVA-Rad is a scalable, privacy-preserving and cost-effective step towards clinically adaptable multimodal AI for radiology.
Authors
- Juan Manuel Zambrano Chaves
- Shih-Cheng Huang
- Yanbo Xu
- Hanwen Xu
- Naoto Usuyama
- Sheng Zhang
- Fei Wang
- Yujia Xie
- Mahmoud Khademi
- Ziyi Yang
- Hany Awadalla
- Julia Gong
- Houdong Hu
- Jianwei Yang
- Chunyuan Li
- Jianfeng Gao
- Yu Gu
- Cliff Wong
- Mu Wei
- Tristan Naumann
- Muhao Chen
- Matthew P. Lungren
- Akshay Chaudhari
- Serena Yeung
- Curtis P. Langlotz
- Sheng Wang
- Hoifung Poon