Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
MPCI-Bench: A Benchmark for Multimodal Pairwise Contextual Integrity Evaluation of Language Model Agents
0
Zitationen
2
Autoren
2026
Jahr
Abstract
As language-model agents evolve from passive chatbots into proactive assistants that handle personal data, evaluating their adherence to social norms becomes increasingly critical, often through the lens of Contextual Integrity (CI). However, existing CI benchmarks are largely text-centric and primarily emphasize negative refusal scenarios, overlooking multimodal privacy risks and the fundamental trade-off between privacy and utility. In this paper, we introduce MPCI-Bench, the first Multimodal Pairwise Contextual Integrity benchmark for evaluating privacy behavior in agentic settings. MPCI-Bench consists of paired positive and negative instances derived from the same visual source and instantiated across three tiers: normative Seed judgments, context-rich Story reasoning, and executable agent action Traces. Data quality is ensured through a Tri-Principle Iterative Refinement pipeline. Evaluations of state-of-the-art multimodal models reveal systematic failures to balance privacy and utility and a pronounced modality leakage gap, where sensitive visual information is leaked more frequently than textual information. We will open-source MPCI-Bench to facilitate future research on agentic CI.
Ähnliche Arbeiten
The global landscape of AI ethics guidelines
2019 · 4.725 Zit.
The Limitations of Deep Learning in Adversarial Settings
2016 · 3.886 Zit.
Trust in Automation: Designing for Appropriate Reliance
2004 · 3.512 Zit.
Fairness through awareness
2012 · 3.302 Zit.
AI4People—An Ethical Framework for a Good AI Society: Opportunities, Risks, Principles, and Recommendations
2018 · 3.202 Zit.