This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Auditing the Shadows: A Review of Methods to Detect Shared Training Data in Large Language Models
0 Citations · 1 Author · Year 2025
Abstract
Large language models (LLMs) are often trained on undisclosed data. This practice has intensified debates about transparency, copyright compliance, and reproducibility. Through this lens, this article systematically reviews methodologies for detecting shared training data across LLMs. Our review spans five methodological families: (1) lexical/semantic overlap metrics, which compare output similarity but struggle to separate shared data from knowledge convergence; (2) memorization analysis, which identifies verbatim regurgitation of rare training examples yet risks extracting copyrighted content; (3) temporal alignment, which leverages models' knowledge cutoffs to infer shared data timelines; (4) adversarial susceptibility correlation, which measures shared failure modes under attack; and (5) synthetic fingerprinting, which embeds detectable artifacts in training data. Our review reveals several critical gaps: existing methods are highly siloed, with little cross-community dialogue; evaluation protocols are inconsistent; and ethical risks are understudied, as some techniques may violate data-privacy law. We propose a taxonomy of auditing techniques, an approach for applying it, and a case study demonstrating its merit. Finally, this review highlights an under-explored knowledge gap and sets a roadmap for future research directions.
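To make the first methodological family concrete, the following is a minimal sketch of a lexical overlap metric (word n-gram Jaccard similarity) applied to two model outputs for the same prompt. It is an illustrative assumption, not the method proposed in the article; the function names and example strings are hypothetical, and as the abstract notes, high overlap may reflect knowledge convergence rather than shared training data.

```python
from collections import Counter

def ngrams(text, n=3):
    """Return the multiset of word n-grams in a text."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

def jaccard_overlap(a, b, n=3):
    """Jaccard similarity between the n-gram sets of two model outputs.

    A high score across many prompts is weak evidence of shared training
    data; it must be corroborated by the other method families.
    """
    ga, gb = set(ngrams(a, n)), set(ngrams(b, n))
    if not ga and not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)

# Hypothetical completions from two different models for one prompt.
out_a = "the eiffel tower was completed in 1889 in paris france"
out_b = "the eiffel tower was completed in 1889 and stands in paris"
score = jaccard_overlap(out_a, out_b)  # a value strictly between 0 and 1 here
```

In practice such scores would be aggregated over a large prompt set and compared against a baseline of independently trained models, since any single pair of outputs can overlap by chance.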
Related Works
The global landscape of AI ethics guidelines
2019 · 4,511 cit.
The Limitations of Deep Learning in Adversarial Settings
2016 · 3,858 cit.
Trust in Automation: Designing for Appropriate Reliance
2004 · 3,382 cit.
Fairness through awareness
2012 · 3,269 cit.
Mind over Machine: The Power of Human Intuition and Expertise in the Era of the Computer
1987 · 3,183 cit.