OpenAlex · Updated hourly · Last updated: 15.03.2026, 00:38

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Efficient Inference Offloading for Mixture-of-Experts Large Language Models in Internet of Medical Things

2024 · 6 citations · 4 authors · Electronics · Open Access

Abstract

Despite recent significant advancements in large language models (LLMs) for medical services, the difficulty of deploying LLMs in e-healthcare hinders complex medical applications in the Internet of Medical Things (IoMT). People are increasingly concerned about e-healthcare risks and privacy protection. Existing LLMs struggle both to provide accurate medical question answering (Q&A) and to meet the deployment resource demands of the IoMT. To address these challenges, we propose MedMixtral 8x7B, a new medical LLM based on the mixture-of-experts (MoE) architecture with an offloading strategy, enabling deployment in the IoMT and improving privacy protection for users. Additionally, we find that the significant factors affecting latency are the device interconnection method, the location of the offloading servers, and disk speed.
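The abstract describes an offloading strategy that lets an MoE model run on hardware that cannot hold every expert in accelerator memory. The metadata page carries no code, so the following is only a minimal illustrative sketch of the general technique, not MedMixtral 8x7B's actual implementation: it assumes a per-layer GPU cache of experts with least-recently-used (LRU) eviction to host memory, and all names here (OffloadedExperts, fetch, gpu_capacity) are hypothetical.

```python
# Hypothetical sketch of expert offloading for one MoE layer (not the
# paper's implementation). Expert weights stay in host memory by
# default; only the experts selected by the router for the current
# batch are moved to the GPU, with LRU eviction when the cache is full.
from collections import OrderedDict

import torch
import torch.nn as nn


class OffloadedExperts:
    def __init__(self, experts: list[nn.Module], gpu_capacity: int,
                 device: str = "cuda"):
        self.experts = experts            # full expert list, host-resident when not cached
        self.gpu_capacity = gpu_capacity  # max experts resident on the GPU at once
        self.device = device
        self.gpu_cache: "OrderedDict[int, nn.Module]" = OrderedDict()

    def fetch(self, idx: int) -> nn.Module:
        # Cache hit: mark this expert as most recently used.
        if idx in self.gpu_cache:
            self.gpu_cache.move_to_end(idx)
            return self.gpu_cache[idx]
        # Cache miss: evict the least-recently-used expert back to host memory.
        if len(self.gpu_cache) >= self.gpu_capacity:
            _, evicted = self.gpu_cache.popitem(last=False)
            evicted.to("cpu")
        expert = self.experts[idx].to(self.device)
        self.gpu_cache[idx] = expert
        return expert

    def forward(self, x: torch.Tensor, expert_ids: list[int]) -> torch.Tensor:
        # Simplified uniform mixing of the routed experts; a real MoE
        # layer would weight each expert's output by learned gate scores.
        outputs = [self.fetch(i)(x) for i in expert_ids]
        return torch.stack(outputs).mean(dim=0)
```

Under a scheme like this, end-to-end latency is dominated by how quickly evicted and fetched expert weights move between host storage and the accelerator, which is consistent with the abstract's finding that the interconnection method, offloading-server location, and disk speed are the significant latency factors.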


Topics

COVID-19 diagnosis using AI · Privacy-Preserving Technologies in Data · Artificial Intelligence in Healthcare and Education