This is an overview page with metadata for this scientific publication. The full article is available from the publisher.
Decoding Drift: Trustworthy Prompt Optimization in High-Stakes AI Environments
Citations: 0
Authors: 5
Year: 2025
Abstract
The reliability and trustworthiness of Large Language Models (LLMs) become critical as high-stakes industries such as healthcare, finance, and legal systems adopt them. However, prompt drift (minor cumulative deviations in model behaviour caused by differences in prompt structure and contextual framing) poses a major threat to consistent model outputs. We introduce Trustworthy Prompt Optimization (TPO), a systematic and comprehensive methodology that mitigates drift in three major ways: (1) a drift-sensitive evaluation criterion measuring semantic and policy deviation in LLM responses, (2) a reinforcement-learning-based prompt tuning algorithm that balances performance and interpretability, and (3) a module enabling human-in-the-loop calibration in high-stakes decision situations. Experiments on three benchmark datasets covering clinical, fraud-detection, and legal-reasoning tasks show that the proposed TPO framework improves output stability by up to 27 percent and factual faithfulness by 19 percent over competing prompt-engineering baselines. This study lays the groundwork for ethically sound and reproducible prompt design in safety-sensitive AI products, providing a template for regulatory compliance and ethical assurance in LLM deployment.
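The abstract does not specify how output stability is measured; purely as an illustration (all names here are hypothetical and not from the paper), a drift-sensitive stability score over repeated LLM responses to the same prompt could be sketched as mean pairwise token-set similarity:

```python
from itertools import combinations

def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity between two responses."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def stability_score(responses: list[str]) -> float:
    """Mean pairwise similarity across responses.

    1.0 means all responses are identical (no drift);
    lower values indicate greater drift.
    """
    pairs = list(combinations(responses, 2))
    if not pairs:  # zero or one response: trivially stable
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

A real implementation would likely use semantic embeddings rather than surface tokens, but the aggregation idea (comparing repeated outputs pairwise) is the same.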
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,245 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,102 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,468 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,429 citations