This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
SAFE-PE, A Systematic Assessment Framework for Evaluating Prompt Engineering in Generative AI
Citations: 0
Authors: 2
Year: 2026
Abstract
Prompt engineering is emerging as an essential practice for using generative AI in domains such as software, learning, health, and creativity. However, the field still lacks a clear framework for assessing prompt quality, reliability, and reproducibility. Comparisons and best practices are hard to establish because current efforts mostly rely on trial and error or on task-specific benchmarks. We propose SAFE-PE, A Systematic Assessment Framework for Evaluating Prompt Engineering, which defines standard measures and principles, including multi-dimensional evaluation metrics. It combines quantitative measures (accuracy, diversity, robustness) with qualitative ones (interpretability, fairness, ethics) to give a holistic picture of prompt performance. We demonstrate the framework's effectiveness through case studies of Meta AI's LLaMA large language models on summarization, question answering, and code generation. SAFE-PE provides a systematic assessment procedure that advances prompt engineering as a scientific discipline and gives practitioners a practical means of using generative AI responsibly.
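The abstract does not say how SAFE-PE computes or combines its dimensions, so the following is only an illustrative sketch of what a multi-dimensional prompt report could look like. Everything in it is an assumption made for illustration: the names `exact_match_accuracy`, `distinct_token_ratio`, `rubric_to_unit`, and `PromptReport`, the 1 to 5 rubric scale for the qualitative dimensions, and the weighted mean used as an overall score. None of it is taken from the paper.

```python
from dataclasses import dataclass


def exact_match_accuracy(outputs, references):
    """Fraction of outputs equal to their reference after trimming and lowercasing."""
    if not outputs or len(outputs) != len(references):
        raise ValueError("outputs and references must be non-empty and equal in length")
    hits = sum(o.strip().lower() == r.strip().lower() for o, r in zip(outputs, references))
    return hits / len(outputs)


def distinct_token_ratio(outputs):
    """Diversity proxy (assumed): unique whitespace tokens divided by total tokens."""
    tokens = [tok for text in outputs for tok in text.lower().split()]
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def rubric_to_unit(rating, low=1, high=5):
    """Map a human rubric rating on [low, high] onto [0, 1] (assumed scale)."""
    if not low <= rating <= high:
        raise ValueError("rating outside rubric range")
    return (rating - low) / (high - low)


@dataclass
class PromptReport:
    """Hypothetical container: all six dimensions are expected on a 0..1 scale."""
    accuracy: float
    diversity: float
    robustness: float
    interpretability: float
    fairness: float
    ethics: float

    def overall(self, weights=None):
        """Weighted mean over all dimensions; equal weights unless specified."""
        scores = vars(self)
        weights = weights or {name: 1.0 for name in scores}
        total = sum(weights[name] for name in scores)
        return sum(scores[name] * weights[name] for name in scores) / total


if __name__ == "__main__":
    report = PromptReport(
        accuracy=exact_match_accuracy(["Paris", "paris ", "Lyon"], ["Paris"] * 3),
        diversity=distinct_token_ratio(["a b", "a c"]),
        robustness=0.5,  # placeholder value; no robustness measure is defined here
        interpretability=rubric_to_unit(4),
        fairness=rubric_to_unit(5),
        ethics=rubric_to_unit(3),
    )
    print(round(report.overall(), 3))
```

Keeping the six dimensions as separate fields, rather than reporting only the aggregate, lets a reader see which dimension drives a difference between two prompts. The single overall score is a convenience and depends entirely on the chosen weights.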