This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Adapting Vision Foundation Models for Real-time Ultrasound Image Segmentation
Citations: 0
Authors: 9
Year: 2025
Abstract
We propose a novel approach that adapts hierarchical vision foundation models for real-time ultrasound image segmentation. Existing ultrasound segmentation methods often struggle to adapt to new tasks and rely on costly manual annotations, while real-time approaches generally fail to match state-of-the-art performance. To overcome these limitations, we introduce an adaptive framework that leverages the vision foundation model Hiera to extract multi-scale features, interleaved with DINOv2 representations to enhance visual expressiveness. These enriched features are then decoded to produce precise and robust segmentation. We conduct extensive evaluations on six public datasets and one in-house dataset, covering both cardiac and thyroid ultrasound segmentation. Experiments show that our approach outperforms state-of-the-art methods across multiple datasets and excels with limited supervision, surpassing nnUNet by over 20% on average in the 1% and 10% data settings. Our method achieves an inference speed of ~77 FPS with TensorRT on a single GPU, enabling real-time clinical applications.
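To put the reported throughput in context, the following minimal sketch converts a frame rate into a per-frame time budget and times an arbitrary inference callable. It is an illustration only, not the authors' benchmark: the names `frame_budget_ms`, `measure_fps`, `infer`, and `warmup` are hypothetical, and the timing loop assumes a synchronous callable.

```python
import time


def frame_budget_ms(fps: float) -> float:
    """Per-frame time budget in milliseconds implied by a throughput in FPS."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def measure_fps(infer, frames, warmup: int = 5) -> float:
    """Average frames per second of `infer` over `frames`.

    `infer` is a hypothetical stand-in for a model's forward pass. It is
    assumed to be synchronous, i.e. it returns only once the result is ready.
    """
    frames = list(frames)
    if not frames:
        raise ValueError("frames must be non-empty")
    # Warm-up calls are executed but not timed.
    for frame in frames[:warmup]:
        infer(frame)
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    if elapsed == 0:
        return float("inf")
    return len(frames) / elapsed


if __name__ == "__main__":
    # About 77 FPS leaves roughly 13 ms per frame.
    print(f"{frame_budget_ms(77):.2f} ms per frame")
    # Dummy workload standing in for a segmentation model.
    print(f"{measure_fps(lambda x: sum(range(1000)), range(200)):.0f} FPS")
```

At ~77 FPS each frame has a budget of roughly 13 ms. When timing GPU inference, kernels typically run asynchronously, so the device would need to be synchronized before each clock reading for the measurement to be meaningful; the sketch above does not do this.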
Related Works
Deep Residual Learning for Image Recognition
2016 · 216,943 citations
U-Net: Convolutional Networks for Biomedical Image Segmentation
2015 · 86,371 citations
ImageNet classification with deep convolutional neural networks
2017 · 75,550 citations
Very Deep Convolutional Networks for Large-Scale Image Recognition
2014 · 75,407 citations
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
2016 · 52,915 citations