This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Biosecure-LLM Framework: Protecting LLMs from Cyberbiosecurity Threats and the Case for Independent AI Safety Governance
Citations: 0
Authors: 5
Year: 2026
Abstract
Large Language Models (LLMs) are becoming critical infrastructure in scientific, healthcare, and governmental contexts. As frontier AI laboratories increasingly partner with government agencies, a fundamental question arises: Who should control the safety and policy-enforcement layers that constrain model behavior? Current safety mechanisms (LLM guardrails) are typically designed for generic "harmlessness" and operate by detecting semantic patterns and refusing requests. However, they are inadequate governance instruments because they cannot implement auditable, domain-specific controls tied to external regulatory policy objects (e.g., control lists or rules governing personally identifiable information). Even a perfectly aligned model cannot express institution-specific policy without an external control layer. This paper argues that the logical separability of policy enforcement from model inference, demonstrated by firewall-style architectures, demands corresponding institutional separability. Concentrating both model development and safety governance within the same commercial entities creates unacceptable conflicts of interest, regulatory capture risks, and accountability gaps. We propose that policy control layers be housed within independent regulatory bodies, governmental agencies, or trusted third parties rather than the organizations that build and profit from the underlying models. Drawing on the Biosecure-LLM framework as a technical proof-of-concept, we demonstrate that such separation is architecturally feasible and argue that it is well suited to verifiable compliance.
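To make the abstract's notion of "logical separability" concrete, here is a minimal sketch, assuming a firewall-style control layer of the kind the abstract describes. This is not the paper's Biosecure-LLM implementation; every name in it (PolicyObject, PolicyFirewall, guarded_completion, model_inference) is a hypothetical illustration. The point is architectural: policy objects are loaded from an external authority, every decision is written to an audit log, and the enforcement layer never depends on any particular model backend.

```python
# Hypothetical sketch of a firewall-style policy layer, logically
# separate from model inference. Not the paper's actual code.
import re
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class PolicyObject:
    """An externally governed rule, e.g. an entry on a control list."""
    rule_id: str
    pattern: str   # regex matched against the request text
    action: str    # "deny" or "allow"
    authority: str  # which body owns and maintains this rule


@dataclass
class PolicyFirewall:
    """Sits between the client and the model; enforces external policy."""
    policies: list[PolicyObject] = field(default_factory=list)
    audit_log: list[dict] = field(default_factory=list)

    def evaluate(self, request_text: str) -> tuple[bool, str]:
        # First matching rule wins; unmatched requests pass through.
        for rule in self.policies:
            if re.search(rule.pattern, request_text, re.IGNORECASE):
                allowed = rule.action == "allow"
                self._log(request_text, rule.rule_id, allowed)
                return allowed, rule.rule_id
        self._log(request_text, "default-allow", True)
        return True, "default-allow"

    def _log(self, text: str, rule_id: str, allowed: bool) -> None:
        # Append-only record; in a real deployment this would be signed
        # and held by the governing body, not the model vendor.
        self.audit_log.append({
            "ts": datetime.now(timezone.utc).isoformat(),
            "rule": rule_id,
            "allowed": allowed,
            "request_hash": hash(text),  # placeholder for a real digest
        })


def model_inference(prompt: str) -> str:
    """Stand-in for any LLM backend; the firewall never imports it."""
    return f"[model response to: {prompt!r}]"


def guarded_completion(firewall: PolicyFirewall, prompt: str) -> str:
    allowed, rule_id = firewall.evaluate(prompt)
    if not allowed:
        return f"Request refused under rule {rule_id}."
    return model_inference(prompt)


if __name__ == "__main__":
    fw = PolicyFirewall(policies=[
        PolicyObject("CL-001", r"select agent|toxin synthesis",
                     "deny", "independent regulatory body"),
    ])
    print(guarded_completion(fw, "Summarize lab safety guidelines."))
    print(guarded_completion(fw, "Describe toxin synthesis routes."))
    print(json.dumps(fw.audit_log, indent=2))
```

Note the design choice this sketch is meant to highlight: because the firewall holds the policy objects and the audit log but has no dependency on the model backend, nothing in the architecture requires the two to be operated by the same organization, which is the institutional-separability argument the abstract makes.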
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,287 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,140 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,534 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,450 citations