This is an overview page with metadata for this scientific work. The full article is available from the publisher.
“Small” Large Language Models in the hospital: an evaluation study on real-world data in a resources-constrained setting (Preprint)
Citations: 0
Authors: 17
Year: 2025
Abstract
BACKGROUND: Large Language Models (LLMs) offer promise for healthcare but face challenges of scale, privacy, and limited evidence in non-English settings. Smaller, locally deployable LLMs remain underexplored.

OBJECTIVE: To assess the feasibility of small open-source LLMs (1–24B parameters) in French-language clinical tasks and provide a reproducible hospital-based evaluation framework.

METHODS: Six state-of-the-art small LLMs from the Mistral, Phi-4, Llama-3.1, Meditron-3, and Falcon 3 model families were tested in a zero-shot setting on de-identified discharge letters across seven use cases, including information extraction, translation, summarization, and clinical decision support. Performance was measured with F1 scores, readability indices, embedding similarity, and structured clinician reviews.

RESULTS: The models achieved high recall in simple retrieval tasks (up to 99.6%) but showed poor performance in detection of protected health information, adverse-event extraction, summarization, and decision support. Translation quality varied, with general-purpose models outperforming medical-focused models.

CONCLUSIONS: In localized resource-constrained deployments, small LLMs are suitable for basic tasks but insufficient for complex reasoning or clinical decision-making. Our framework supports context-specific evaluation for safe adoption in hospitals.
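The abstract reports both recall and F1 scores for extraction tasks. As a minimal sketch of how these metrics relate, the snippet below computes set-based precision, recall, and F1 for a single document. The function name `precision_recall_f1` and the medication lists are hypothetical illustrations, not the study's evaluation code or data, and the paper may use a different matching scheme (for example token-level or fuzzy matching).

```python
def precision_recall_f1(predicted, gold):
    """Set-based precision, recall, and F1 for extracted items (exact match)."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: medications annotated by a clinician vs. model output.
gold_items = ["aspirin", "metformin", "ramipril", "atorvastatin"]
model_items = ["aspirin", "metformin", "ramipril", "ibuprofen", "paracetamol"]

precision, recall, f1 = precision_recall_f1(model_items, gold_items)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this made-up case the model retrieves three of four annotated items (recall 0.75) but also emits two spurious ones (precision 0.60), which is why a high recall figure alone does not imply a high F1.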
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,231 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,084 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,444 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,423 citations