OpenAlex · Updated hourly · Last updated: 13 March 2026, 23:53

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Prompt Governance in Financial AI: Comparative Performance of Structured Frameworks in Insurance and Investment Tasks

2026 · 0 citations · Zenodo (CERN European Organization for Nuclear Research) · Open Access

Citations: 0 · Authors: 1 · Year: 2026

Abstract

Large Language Models (LLMs) are increasingly deployed in high-stakes financial domains, yet their output stability and factual reliability remain central concerns. We present the first systematic benchmark of four prompt governance approaches (Baseline zero-shot, Few-shot, CRISPE, and ACTFIT) across two representative financial tasks: life insurance risk classification and investment portfolio allocation. Using three commercial LLMs (Claude Sonnet 4.5, GPT-4.1, and Gemini 2.5 Flash), we evaluate 60 simulated cases through 2,160 API calls, measuring accuracy, hallucination rate, cross-run consistency, and a composite Efficiency Score. Evaluation employs two independent LLM judges from different model families (Claude Opus 4.6 and GPT-4o) with inter-rater agreement of κ = 0.97 for accuracy, validated by a stratified human sample (n = 98, κ = 0.82 judge-human). ACTFIT achieves 67.0% accuracy versus 44.9% for the baseline (+22.1pp), with particularly pronounced effects on investment tasks (+46.1pp). ACTFIT yields the highest Efficiency Score (0.537 vs. 0.342 baseline, +56.8%). GPT-4.1 combined with ACTFIT reaches the dataset peak: 81.7% accuracy with 0% hallucination at US$ 0.006 per correct response. CRISPE produces a surprising negative finding: accuracy below the baseline (41.1% vs. 44.9%) with a 49.1% hallucination rate, demonstrating that poorly designed governance degrades performance. Our findings suggest that institutional prompt governance offers a cost-effective alternative to fine-tuning for financial applications, with implications for regulatory compliance and operational risk management.
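The abstract reports agreement as Cohen's κ and improvements in percentage points and relative terms. The sketch below illustrates how such figures are commonly computed; it is not the authors' evaluation code. The function names and the eight judge labels are hypothetical, and only the accuracy and Efficiency Score values are taken from the abstract.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def percentage_point_gain(new_pct, base_pct):
    """Absolute difference between two percentages, in percentage points."""
    return new_pct - base_pct


def relative_gain(new_value, base_value):
    """Relative improvement over the baseline, as a fraction."""
    return (new_value - base_value) / base_value


# Hypothetical verdicts from two judges on eight responses (illustration only).
judge_a = ["correct", "correct", "incorrect", "incorrect",
           "correct", "incorrect", "correct", "incorrect"]
judge_b = ["correct", "correct", "incorrect", "incorrect",
           "correct", "incorrect", "correct", "correct"]
kappa = cohen_kappa(judge_a, judge_b)

# Figures quoted in the abstract: ACTFIT versus the zero-shot baseline.
accuracy_gain_pp = percentage_point_gain(67.0, 44.9)
efficiency_gain = relative_gain(0.537, 0.342)

print(f"kappa = {kappa:.2f}")
print(f"accuracy gain = {accuracy_gain_pp:.1f} pp")
print(f"efficiency gain = {efficiency_gain:.1%}")
```

Computed from the rounded scores quoted in the abstract, the relative Efficiency Score gain comes out at about 57.0%; the reported +56.8% presumably derives from unrounded values.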

Topics

FinTech, Crowdfunding, Digital Finance · Artificial Intelligence in Healthcare and Education · Financial Distress and Bankruptcy Prediction