This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Efficient Domain-Specific LLMs: Energy Profiling in Medical QA Tasks
Citations: 0
Authors: 4
Year: 2025
Abstract
A current challenge in computational technology is the increasing energy consumption associated with the development and deployment of domain-specific large language models (LLMs), such as those used in the medical domain. In this work, we explore the trade-offs between performance and energy consumption across several small-scale LLMs (1B–8B parameters), fine-tuned for medical multiple-choice question answering using efficient methods such as Low-Rank Adaptation (LoRA) and quantization. For our experiments, we used two entry-level professional hardware setups representative of accessible workstation environments, measuring energy and time consumption across multiple models. The results show that moderate-sized models, such as LLaMA-3.1-8B, can achieve accuracy levels comparable to much larger biomedical-tuned models, such as PMC-LLaMA-13B, while requiring substantially less hardware. Additionally, smaller models, such as the 1B and 3B parameter versions, demonstrate notable efficiency during inference, making them suitable for deployment on edge devices, which are traditionally highly constrained by energy and computational resources. This paper offers practical insights into the deployment of medical models under realistic hardware limitations, supporting the goals of energy-aware and accessible machine learning.
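The energy-and-time accounting the abstract describes can be illustrated with a minimal sketch. All power draws, latencies, and model labels below are hypothetical placeholders for illustration, not measurements from the paper.

```python
# Minimal sketch of energy-per-answer accounting, as in the abstract's
# profiling setup: energy (joules) = average power draw (watts) x time (s).
# The figures used here are invented placeholders, not the paper's data.

def energy_joules(avg_power_w: float, elapsed_s: float) -> float:
    """Energy consumed at a given average power over an interval."""
    return avg_power_w * elapsed_s

def energy_per_answer(avg_power_w: float, total_s: float,
                      num_questions: int) -> float:
    """Average energy spent per answered multiple-choice question."""
    return energy_joules(avg_power_w, total_s) / num_questions

# Hypothetical example: a small (1B-class) model answers 100 questions
# in 50 s at an average draw of 120 W.
small = energy_per_answer(120.0, 50.0, 100)    # 60.0 J per answer
# A hypothetical 8B-class model: 200 s at 300 W for the same 100 questions.
large = energy_per_answer(300.0, 200.0, 100)   # 600.0 J per answer
```

Comparisons of this kind, combined with accuracy on the same question set, are what allow the paper to weigh smaller models' efficiency against larger models' accuracy.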