This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Multi-Criteria Evaluation of Large Language Models (LLMs): Balancing Performance and Security
Citations: 0
Authors: 3
Year: 2026
Abstract
Because of their functionality and practicality, Large Language Models (LLMs) have been widely discussed, and a large number of benchmarks have been run to evaluate them, especially their efficiency. Yet despite their numerous applications and the significant benefits they offer, LLMs have proven extremely susceptible to attacks of various kinds because of their many, often unknown, vulnerabilities, a characteristic that benchmarks often ignore. This paper therefore aims to develop a multi-criteria method to help stakeholders select the most suitable Large Language Model based on both its efficiency in carrying out tasks of various kinds, such as math and reasoning, and its ability to resist a wide range of security vulnerabilities, such as prompt injection and jailbreaking. The study used the Analytic Hierarchy Process (AHP) together with tools developed to evaluate the capabilities of LLMs in multi-interaction dialogues and an LLM vulnerability scanner, applied to open-source models. The analysis showed that a more efficient model is not necessarily a safer one. It also provides an efficient method for analyzing both model performance and security issues.
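To make the AHP step concrete, the following is a minimal sketch of how pairwise comparisons between criteria can be turned into weights and then into a model ranking. It is not the paper's implementation: the two criteria, the comparison values, the model names, and the scores are all hypothetical, and the row geometric-mean method is used here as a common approximation of AHP's principal-eigenvector weights.

```python
import math

# Saaty's random consistency index, keyed by matrix size n.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Approximate AHP priority weights via the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Return the consistency ratio (CR); values below 0.1 are usually accepted."""
    n = len(matrix)
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    lambda_max = sum(x / w for x, w in zip(aw, weights)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


def rank_models(criteria_weights, scores):
    """Rank models by the weighted sum of their normalized criterion scores."""
    totals = {
        name: sum(w * s for w, s in zip(criteria_weights, values))
        for name, values in scores.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical judgment: security is twice as important as performance.
# Row/column order: [performance, security].
pairwise = [[1.0, 0.5],
            [2.0, 1.0]]
weights = ahp_weights(pairwise)

# Hypothetical normalized scores in [0, 1]: [performance, security].
scores = {"model_a": [0.9, 0.4], "model_b": [0.7, 0.8]}
ranking = rank_models(weights, scores)
```

With these illustrative numbers, `model_a` has the higher performance score but `model_b` ranks first once security is weighted in, which mirrors the abstract's point that a more efficient model is not necessarily a safer one.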
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,260 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,116 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,493 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,438 citations