This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Differential Privacy on Large Language Models for Privacy Preserving Clinical Coding

2025 · 0 citations

Abstract

Recent advances in Large Language Models (LLMs) have significantly improved performance across a range of Natural Language Processing (NLP) tasks. In some fields, particularly healthcare, the risk of data leakage in research data management is a critical concern when employing LLMs. To protect data privacy, recent studies have adopted approaches such as de-identification, which masks personally identifiable information. However, these anonymisation techniques remain vulnerable to various attacks, including linkage attacks, attribute inference attacks, and membership inference attacks. Differential privacy is a robust anonymisation technique that addresses data leakage by constraining the influence of individual data samples during model training. Nonetheless, balancing utility against privacy protection remains challenging. Moreover, while differential privacy has been studied extensively for tabular and image data, its application in NLP, especially to clinical data, is limited. In this paper, we explore the integration of differential privacy into the fine-tuning of LLMs on clinical data, covering a range of model sizes and privacy standards within a healthcare context. We use these LLMs to generate synthetic medical notes and assess the privacy and utility of our differential privacy training approach by deploying the synthetic notes in a downstream clinical coding task. Our findings demonstrate that synthetic data from differential privacy-based LLMs achieve classification accuracy comparable or superior to that obtained with non-differential privacy-based LLMs.
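
The abstract does not say which mechanism is used to constrain the influence of individual samples. A common choice for differentially private training is DP-SGD, which clips each per-sample gradient and adds Gaussian noise before averaging. The following is a minimal illustrative sketch of that aggregation step, not the authors' implementation; the function names and the parameters `max_norm` and `noise_multiplier` are hypothetical and chosen only for this example.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale one per-sample gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(per_sample_grads, max_norm, noise_multiplier, rng):
    """Clip each sample's gradient, sum, add Gaussian noise, then average.

    Assumption: noise standard deviation is noise_multiplier * max_norm,
    as in the usual DP-SGD formulation.
    """
    dim = len(per_sample_grads[0])
    clipped = [clip_gradient(g, max_norm) for g in per_sample_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [n / len(per_sample_grads) for n in noisy]


# Toy per-sample gradients (made-up values, two parameters each).
rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
update = private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=rng)
```

Clipping bounds how much any single record can move the update, and the noise scale is tied to that bound; a larger `noise_multiplier` gives stronger privacy at the cost of a noisier update, which is the utility and privacy trade-off the abstract refers to.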

Topics

Privacy-Preserving Technologies in Data
Machine Learning in Healthcare
Artificial Intelligence in Healthcare and Education
