Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
TextObfuscator: Making Pre-trained Language Model a Privacy Protector via Obfuscating Word Representations
18
Zitationen
9
Autoren
2023
Jahr
Abstract
In real-world applications, pre-trained language models are typically deployed on the cloud, allowing clients to upload data and perform compute-intensive inference remotely. To avoid sharing sensitive data directly with service providers, clients can upload numerical representations rather than plain text to the cloud. However, recent text reconstruction techniques have demonstrated that it is possible to transform representations into original words, suggesting that privacy risk remains. In this paper, we propose TextObfuscator, a novel framework for protecting inference privacy by applying random perturbations to clustered representations. The random perturbations make the representations indistinguishable from surrounding clustered representations, thus obscuring word information while retaining the original word functionality. To achieve this, we utilize prototypes to learn clustered representation, where tokens of similar functionality are encouraged to be closer to the same prototype during training.Additionally, we design different methods to find prototypes for token-level and sentence-level tasks, which can improve performance by incorporating semantic and task information.Experimental results on token and sentence classification tasks show that TextObfuscator achieves improvement over compared methods without increasing inference cost.
Ähnliche Arbeiten
k-ANONYMITY: A MODEL FOR PROTECTING PRIVACY
2002 · 8.395 Zit.
Calibrating Noise to Sensitivity in Private Data Analysis
2006 · 6.867 Zit.
Communication-Efficient Learning of Deep Networks from Decentralized\n Data
2016 · 5.591 Zit.
Deep Learning with Differential Privacy
2016 · 5.587 Zit.
Large-Scale Machine Learning with Stochastic Gradient Descent
2010 · 5.559 Zit.