This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Prompt Governance in Financial AI: Comparative Performance of Structured Frameworks in Insurance and Investment Tasks
Citations: 0
Authors: 1
Year: 2026
Abstract
Large Language Models (LLMs) are increasingly deployed in high-stakes financial domains, yet their output stability and factual reliability remain central concerns. We present the first systematic benchmark of four prompt governance approaches (Baseline zero-shot, Few-shot, CRISPE, and ACTFIT) across two representative financial tasks: life insurance risk classification and investment portfolio allocation. Using three commercial LLMs (Claude Sonnet 4.5, GPT-4.1, and Gemini 2.5 Flash), we evaluate 60 simulated cases through 2,160 API calls, measuring accuracy, hallucination rate, cross-run consistency, and a composite Efficiency Score. Evaluation employs two independent LLM judges from different model families (Claude Opus 4.6 and GPT-4o) with inter-rater agreement of κ = 0.97 for accuracy, validated by a stratified human sample (n = 98, κ = 0.82 judge-human). ACTFIT achieves 67.0% accuracy versus 44.9% for the baseline (+22.1pp), with particularly pronounced effects on investment tasks (+46.1pp). ACTFIT yields the highest Efficiency Score (0.537 vs. 0.342 baseline, +56.8%). GPT-4.1 combined with ACTFIT reaches the dataset peak: 81.7% accuracy with 0% hallucination at US$ 0.006 per correct response. CRISPE produces a surprising negative finding: accuracy below the baseline (41.1% vs. 44.9%) with a 49.1% hallucination rate, demonstrating that poorly designed governance degrades performance. Our findings suggest that institutional prompt governance offers a cost-effective alternative to fine-tuning for financial applications, with implications for regulatory compliance and operational risk management.
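The abstract reports judge agreement as Cohen's κ and framework gains in percentage points. As a rough illustration of how those two quantities are typically computed, here is a minimal sketch. It is not the authors' evaluation code: the function names and the eight judge verdicts are hypothetical, and only the 67.0% and 44.9% accuracy figures come from the abstract.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items where both raters give the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


def percentage_point_gain(treated_pct, baseline_pct):
    """Absolute difference between two percentages, in percentage points."""
    return round(treated_pct - baseline_pct, 1)


# Hypothetical verdicts from two judges on eight responses (not from the paper).
judge_1 = ["correct", "correct", "wrong", "wrong", "correct", "wrong", "correct", "wrong"]
judge_2 = ["correct", "correct", "wrong", "wrong", "correct", "wrong", "wrong", "wrong"]

kappa = cohens_kappa(judge_1, judge_2)

# Accuracy figures quoted in the abstract: ACTFIT 67.0% vs. baseline 44.9%.
gain = percentage_point_gain(67.0, 44.9)

print(f"kappa = {kappa:.2f}, gain = +{gain}pp")
```

A percentage-point gain is an absolute difference between two rates, whereas the Efficiency Score improvement quoted in the abstract is a relative change, so the two figures are not directly comparable.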
Similar works
Transaction-Cost Economics: The Governance of Contractual Relations
1979 · 9,994 citations
Open Innovation: The New Imperative for Creating and Profiting from Technology
2003 · 9,454 citations
The dynamics of crowdfunding: An exploratory study
2013 · 3,977 citations
Opinion Paper: “So what if ChatGPT wrote it?” Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy
2023 · 3,281 citations
Open Innovation: The New Imperative for Creating and Profiting from Technology
2004 · 2,823 citations