Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Ein externer Link zum Volltext ist derzeit nicht verfügbar.
Performance Assessment Strategies for Language Model Applications in Healthcare
0
Zitationen
3
Autoren
2025
Jahr
Abstract
Language models (LMs) represent an emerging paradigm within artificial intelligence, with applications throughout the medical enterprise. A comprehensive understanding of the clinical task and awareness of the variability in performance when implemented in actual clinical environments lays the foundation for the LM application assessment. Presently, a prevalent method for evaluating the performance of these generative models relies on quantitative benchmarks. Such benchmarks have limitations and may suffer from train-to-the-test overfitting, optimizing performance for a specified test set at the cost of generalizability across other tasks and data distributions. Evaluation strategies leveraging human expertise and utilizing cost-effective computational models as evaluators are gaining interest. We discuss current state-of-the-art methodologies for assessing the performance of LM applications in healthcare and medical devices.
Ähnliche Arbeiten
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8.357 Zit.
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8.221 Zit.
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7.640 Zit.
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5.776 Zit.
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5.482 Zit.