This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Understanding and Mitigating the Soft Error of Contrastive Language-Image Pre-training Models
Citations: 1
Authors: 6
Year: 2024
Abstract
In recent years, MultiModal Large Language Models (MM-LLMs) built on Contrastive Language-Image Pre-training (CLIP) have achieved state-of-the-art results in many fields. CLIP bridges the gap between language models and image models, enables zero-shot image classification, and delivers excellent performance in tasks such as text-to-image generation, image style transfer, and long video generation. However, there are few studies on the fault tolerance of CLIP under soft errors, which hinders the deployment of multimodal large models in safety-critical fields. Based on an analysis of the fault tolerance of common multimodal large models, we propose a soft error mitigation framework. According to the experiments in this paper, the framework can effectively detect soft errors and mitigate their effects.
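Soft errors of the kind the abstract refers to are typically modeled as single-bit flips in model parameters. The paper's own injection and mitigation method is not described here; as a hedged illustration only, the following sketch shows why a single flipped bit in a float32 weight can matter: a flip in a high-order exponent bit changes the value by many orders of magnitude, while a low-order mantissa flip is usually negligible. The function name `flip_bit` is a hypothetical helper, not from the paper.

```python
import struct

def flip_bit(value: float, bit: int) -> float:
    """Flip one bit (0 = LSB of mantissa) of a float32 value,
    emulating a single-event upset in memory."""
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    packed = struct.unpack("<I", struct.pack("<f", value))[0]
    flipped = packed ^ (1 << bit)
    # Reinterpret the corrupted bits back as a float32.
    return struct.unpack("<f", struct.pack("<I", flipped))[0]

w = 0.5
print(flip_bit(w, 30))  # exponent bit flipped: value explodes to ~1.7e38
print(flip_bit(w, 0))   # mantissa LSB flipped: ~0.5, barely perturbed
```

This magnitude asymmetry is why many mitigation schemes (e.g. range checks on activations or weights) can catch the dangerous exponent-bit flips cheaply while ignoring benign mantissa noise.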
Similar Works
Rethinking the Inception Architecture for Computer Vision
2016 · 30,326 citations
MobileNetV2: Inverted Residuals and Linear Bottlenecks
2018 · 24,398 citations
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
2020 · 21,297 citations
CBAM: Convolutional Block Attention Module
2018 · 21,274 citations
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
2015 · 18,491 citations