This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Application of large language models in clinical record correction: a comprehensive study on various retraining methods
Citations: 6
Authors: 8
Year: 2024
Abstract
OBJECTIVES: We evaluate the effectiveness of large language models (LLMs), specifically GPT-based (GPT-3.5 and GPT-4) and Llama-2 models (13B and 7B architectures), in autonomously assessing clinical records (CRs) to enhance medical education and diagnostic skills.
MATERIALS AND METHODS: Various techniques, including prompt engineering, fine-tuning (FT), and low-rank adaptation (LoRA), were implemented and compared on Llama-2 7B. These methods were assessed using prompts in both English and Spanish to determine their adaptability to different languages. Performance was benchmarked against GPT-3.5, GPT-4, and Llama-2 13B.
RESULTS: GPT-based models, particularly GPT-4, demonstrated promising performance closely aligned with specialist evaluations. Application of FT on Llama-2 7B improved text comprehension in Spanish, equating its performance to that of Llama-2 13B with English prompts. Low-rank adaptation significantly enhanced performance, surpassing GPT-3.5 results when combined with FT. This indicates LoRA's effectiveness in adapting open-source models for specific tasks.
DISCUSSION: While GPT-4 showed superior performance, FT and LoRA on Llama-2 7B proved crucial in improving language comprehension and task-specific accuracy. Identified limitations highlight the need for further research.
CONCLUSION: This study underscores the potential of LLMs in medical education, providing an innovative, effective approach to CR correction. Low-rank adaptation emerged as the most effective technique, enabling open-source models to perform on par with proprietary models. Future research should focus on overcoming current limitations to further improve model performance.
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,557 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,447 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,944 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,797 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations