This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Minimum Reporting Items for Clear Evaluation of Accuracy Reports of Large Language Models in Healthcare (MI-CLEAR-LLM): 2025 Updates
5
Citations
7
Authors
2025
Year
Abstract
Recent systematic reviews have raised concerns about the quality of reporting in studies evaluating the accuracy of large language models (LLMs) in medical applications. Incomplete and inconsistent reporting hampers the ability of reviewers and readers to assess study methodology, interpret results, and evaluate reproducibility. To address this issue, the MInimum reporting items for CLear Evaluation of Accuracy Reports of Large Language Models in healthcare (MI-CLEAR-LLM) checklist was developed. This article presents an extensively updated version. While the original version focused on proprietary LLMs accessed via web-based chatbot interfaces, the updated checklist incorporates considerations relevant to application programming interfaces and self-managed models, typically based on open-source LLMs. As before, the revised MI-CLEAR-LLM focuses on reporting practices specific to LLM accuracy evaluations: specifically, the reporting of how LLMs are specified, accessed, adapted, and applied in testing, with special attention to methodological factors that influence outputs. The checklist includes essential items across categories such as model identification, access mode, input data type, adaptation strategy, prompt optimization, prompt execution, stochasticity management, and test data independence. This article also presents reporting examples from the literature. Adoption of the updated MI-CLEAR-LLM can help ensure transparency in reporting and enable more accurate and meaningful evaluation of studies.
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,200 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,051 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,416 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,410 citations