This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Towards Responsible AI: Evaluating Large Language Models Across Trustworthy Dimensions
0
Citations
4
Authors
2025
Year
Abstract
Since the boom of Large Language Models (LLMs), these models have increasingly been integrated into real-world applications to boost work efficiency. This raises an urgent question about their trustworthiness across properties such as truthfulness, safety, robustness, privacy, fairness, and ethics. While prior work has analyzed and benchmarked individual properties, comprehensive and standardized trust evaluation remains limited, especially for open-source models. This study systematically evaluates the trustworthiness of six state-of-the-art open-source large language models, ranging from 1 billion to 10.7 billion parameters. Using the TrustLLM framework, it benchmarks the models across these properties on over 20 tasks, applying unified metrics designed for both generative and classification tasks. The results uncover notable disparities among models, revealing that increases in scale do not uniformly translate into better trustworthiness. Moreover, no single model performed best across all trustworthiness properties. This study therefore further reinforces the need for holistic evaluation of LLMs' trustworthiness.
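The abstract's central observation, that no model was uniformly better across all trustworthiness properties, can be illustrated with a small sketch of multi-dimensional score comparison. The model names and per-dimension scores below are hypothetical placeholders, not results from the paper; only the six dimension names come from the abstract.

```python
# Hypothetical per-dimension trustworthiness scores in [0, 1] for two
# illustrative models (names and numbers are invented for this sketch).
DIMENSIONS = ["truthfulness", "safety", "robustness",
              "privacy", "fairness", "ethics"]

scores = {
    "model_a": {"truthfulness": 0.72, "safety": 0.88, "robustness": 0.61,
                "privacy": 0.79, "fairness": 0.66, "ethics": 0.81},
    "model_b": {"truthfulness": 0.80, "safety": 0.70, "robustness": 0.75,
                "privacy": 0.68, "fairness": 0.74, "ethics": 0.69},
}

def mean_score(per_dim: dict) -> float:
    """Simple unweighted average across dimensions."""
    return sum(per_dim.values()) / len(per_dim)

def dominates(a: dict, b: dict) -> bool:
    """True if model a scores at least as well as b on every dimension."""
    return all(a[d] >= b[d] for d in DIMENSIONS)

# A higher average does not imply per-dimension superiority: here
# model_a has the better mean, yet neither model dominates the other.
print(mean_score(scores["model_a"]) > mean_score(scores["model_b"]))  # True
print(dominates(scores["model_a"], scores["model_b"]))                # False
print(dominates(scores["model_b"], scores["model_a"]))                # False
```

This is why an aggregate score alone can mask trade-offs: a holistic evaluation must report the full per-dimension profile, as the study argues.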
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,316 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,177 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,575 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,468 citations