OpenAlex · Updated hourly · Last updated: 27 Apr 2026, 13:10

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

sLLM: A Memory-Efficient Fine-Tuning and Evaluation Pipeline for Medical Large Language Models

Open full text at the publisher

Citations: 0 · Authors: 4 · Year: 2025

Abstract

Recent large language models (LLMs) have shown strong performance in medical question answering (QA), but their use remains limited by training and inference costs, sensitivity to medical-domain prompts, and the lack of standardized evaluation frameworks. To address this, a system was built for fine-tuning and evaluating medical LLMs. Using QLoRA for low-memory fine-tuning, the system integrates the Hugging Face Accelerate framework for multi-GPU distributed training, and lm-eval-harness provides robust automated evaluation. The validity of the system is demonstrated using the MedGemma 2B model and the KorMedMCQA benchmark. The experimental results show that sLLM achieves 78.87% accuracy on MedQA while maintaining training efficiency, suggesting that prompt engineering can outperform meticulously calibrated models and offering a cost-effective way to deploy medical LLMs. This work presents a scalable, efficient, and reproducible approach to developing high-performance medical LLMs, laying a foundation for future clinical integration through transparent systems.
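The evaluation side of the pipeline described above relies on EleutherAI's lm-eval-harness. A minimal sketch of such an invocation is shown below; the checkpoint path and the `kormedmcqa` task name are assumptions for illustration, not taken from the paper, and should be replaced with the actual fine-tuned model and the benchmark tasks the authors used.

```shell
# Sketch: evaluate a QLoRA-fine-tuned checkpoint with lm-eval-harness.
# "./medgemma-2b-qlora" is a hypothetical local path; load_in_4bit=True
# is forwarded to the Hugging Face loader to keep inference memory low.
lm_eval --model hf \
  --model_args pretrained=./medgemma-2b-qlora,load_in_4bit=True \
  --tasks kormedmcqa \
  --batch_size 8 \
  --output_path results/
```

The harness reports per-task accuracy in the JSON written to `--output_path`, which is how headline numbers such as the 78.87% figure above would typically be produced.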

Topics

Topic Modeling · Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education