This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Make me an Expert: Distilling from Generalist Black-Box Models into Specialized Models for Semantic Segmentation
Citations: 0
Authors: 7
Year: 2025
Abstract
The rise of Artificial Intelligence as a Service (AIaaS) democratizes access to pre-trained models via Application Programming Interfaces (APIs), but it also raises a fundamental question: how can local models be effectively trained using black-box models that expose neither their weights, training data, nor logits, a constraint under which current domain adaptation paradigms become impractical? To address this challenge, we introduce the Black-Box Distillation (B2D) setting, which enables local model adaptation under realistic constraints: (1) the API model is open-vocabulary and trained on large-scale general-purpose data, and (2) access is limited to one-hot predictions only. We identify that open-vocabulary models exhibit significant sensitivity to input resolution, with different object classes being segmented optimally at different scales, a limitation we term the "curse of resolution". Our method, ATtention-Guided sCaler (ATGC), addresses this challenge by leveraging DINOv2 attention maps to dynamically select optimal scales for black-box model inference. ATGC scores the attention maps with entropy to identify informative scales for pseudo-labelling, enabling effective distillation. Experiments demonstrate substantial improvements under black-box supervision across multiple datasets while requiring only one-hot API predictions. Our code is available at https://github.com/yasserben/ATGC.
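The abstract describes scoring attention maps with entropy to pick an informative inference scale. A minimal sketch of that idea is shown below; the normalization scheme, the choice of selecting the lowest-entropy (most concentrated) map, and all function names are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def attention_entropy(attn: np.ndarray) -> float:
    """Shannon entropy of an attention map, after normalizing it
    into a probability distribution (lower entropy = more peaked)."""
    p = attn.astype(np.float64).ravel()
    p = p / p.sum()
    p = p[p > 0]  # avoid log(0)
    return float(-(p * np.log(p)).sum())

def select_scale(attn_maps_by_scale: dict) -> float:
    """Given attention maps computed at several input scales, return
    the scale whose map is most concentrated (lowest entropy)."""
    return min(attn_maps_by_scale,
               key=lambda s: attention_entropy(attn_maps_by_scale[s]))

# Toy usage: a uniform map (uninformative) vs. a peaked map (informative).
maps = {
    0.5: np.ones((4, 4)),            # uniform attention -> high entropy
    1.0: np.eye(4) * 10.0 + 1e-6,    # concentrated attention -> low entropy
}
best = select_scale(maps)  # picks 1.0, the peaked map's scale
```

In the paper's setting this score would be computed per scale from DINOv2 attention before querying the black-box API, so that pseudo-labels are requested only at scales where the attention is informative.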
Related Works
MizAR 60 for Mizar 50
2023 · 74,119 cit.
ImageNet: A large-scale hierarchical image database
2009 · 60,460 cit.
Microsoft COCO: Common Objects in Context
2014 · 41,105 cit.
Fully convolutional networks for semantic segmentation
2015 · 36,286 cit.
Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
2017 · 20,305 cit.
Authors
Institutions
- Télécom Paris (FR)
- Centre National de la Recherche Scientifique (FR)
- École Polytechnique (FR)
- Laboratoire d'Informatique de l'École Polytechnique (FR)
- Institut Polytechnique de Paris (FR)
- University of Bergamo (IT)
- Nvidia (US)
- Institut national de recherche en sciences et technologies du numérique (FR)