This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
A Conceptual Framework for Simulated Self-Assessment and Meta-Evaluation of Generative AI Models
Citations: 0
Authors: 5
Year: 2026
Abstract
The increasing integration of generative artificial intelligence (GenAI) into scientific research raises the question of whether such systems can be evaluated not only through external benchmarks but also through structured analysis of their own meta-evaluative responses. This study introduces a conceptual framework for simulated self-assessment of GenAI models, formalized through a multidimensional self-assessment profile and a metacognitive self-assessment index (MSI). The proposed framework integrates quantitative criteria capturing hallucination propensity, knowledge currency, formal-structure handling, source validity, and terminological precision. To evaluate the reliability of model-generated self-assessments, psychometric instruments traditionally used in human metacognition research—MAI, SRIS, and SDQ—are adapted for large language models. Experimental results across multiple GPT models indicate that, despite the absence of genuine introspective mechanisms, GenAI systems can produce internally consistent and moderately calibrated meta-evaluative responses. These findings suggest that simulated self-assessment, when interpreted within a rigorous methodological framework and combined with external validation, can serve as a complementary quantitative tool for trust analysis and reliability assessment of generative models.
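The abstract names five criteria and an aggregate index but does not give the formula, so the following is only a minimal sketch under stated assumptions: every criterion is scored in [0, 1] with higher meaning more reliable, the index is an equal-weight mean, and calibration is summarized as the mean absolute gap between self-reported and externally measured scores. The names `SelfAssessmentProfile`, `aggregate_index`, and `calibration_gap` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SelfAssessmentProfile:
    """Hypothetical five-criterion profile; each score is assumed to lie in [0, 1]."""
    hallucination_resistance: float   # assumed convention: 1 - hallucination propensity
    knowledge_currency: float
    formal_structure_handling: float
    source_validity: float
    terminological_precision: float

    def scores(self) -> list[float]:
        # Return the criterion scores in declaration order.
        return [getattr(self, f.name) for f in fields(self)]


def aggregate_index(profile: SelfAssessmentProfile) -> float:
    """Equal-weight mean of the criteria (an assumption, not the paper's MSI formula)."""
    values = profile.scores()
    for value in values:
        if not 0.0 <= value <= 1.0:
            raise ValueError("each score must lie in [0, 1]")
    return sum(values) / len(values)


def calibration_gap(self_reported: SelfAssessmentProfile,
                    external: SelfAssessmentProfile) -> float:
    """Mean absolute difference between self-reported and external scores."""
    pairs = list(zip(self_reported.scores(), external.scores()))
    return sum(abs(s - e) for s, e in pairs) / len(pairs)


if __name__ == "__main__":
    # Illustrative numbers only; they are not results from the paper.
    claimed = SelfAssessmentProfile(0.8, 0.6, 0.7, 0.5, 0.9)
    measured = SelfAssessmentProfile(0.6, 0.6, 0.5, 0.5, 0.7)
    print(f"aggregate index (self-reported): {aggregate_index(claimed):.2f}")
    print(f"calibration gap: {calibration_gap(claimed, measured):.2f}")
```

A smaller gap would indicate that the model's self-reported profile tracks the external measurements more closely. The weighting and the gap metric are design choices made here for illustration and could differ from those used in the study.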
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,611 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,504 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,025 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,835 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations