This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Efficient Inference Offloading for Mixture-of-Experts Large Language Models in Internet of Medical Things
Citations: 6
Authors: 4
Year: 2024
Abstract
Despite recent significant advances in large language models (LLMs) for medical services, the difficulty of deploying LLMs in e-healthcare hinders complex medical applications in the Internet of Medical Things (IoMT). People are increasingly concerned about e-healthcare risks and privacy protection. Existing LLMs struggle to provide accurate answers to medical questions (Q&As) and to meet the resource demands of deployment in the IoMT. To address these challenges, we propose MedMixtral 8x7B, a new medical LLM based on the mixture-of-experts (MoE) architecture with an offloading strategy, enabling deployment on the IoMT and improving privacy protection for users. Additionally, we find that the factors that most affect latency are the device interconnection method, the location of the offloading servers, and disk speed.
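The abstract names an MoE architecture with an offloading strategy but does not specify its mechanics. As a rough illustration only, not the paper's method, the sketch below shows one common offloading pattern: expert weights parked on slow storage (CPU here, standing in for disk) are moved to the compute device only when the router selects them, with a small LRU cache bounding on-device memory. All names and parameters (`OffloadedMoELayer`, `cache_size`, the toy dimensions) are hypothetical.

```python
# Hypothetical sketch of expert offloading in an MoE layer; the actual
# MedMixtral 8x7B offloading strategy may differ.
from collections import OrderedDict
import torch
import torch.nn as nn

class OffloadedMoELayer(nn.Module):
    def __init__(self, num_experts=8, d_model=64, d_ff=256, cache_size=2):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)
        # Experts live on "slow" storage (CPU here, standing in for disk)
        # and are kept in a plain list so .to() never moves them all at once.
        self.cpu_experts = [
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ]
        self.cache = OrderedDict()  # expert id -> on-device expert (LRU order)
        self.cache_size = cache_size

    def _fetch(self, idx, device):
        # LRU policy: evict the least recently used expert when the cache
        # is full, then move the requested expert onto the compute device.
        if idx in self.cache:
            self.cache.move_to_end(idx)
        else:
            if len(self.cache) >= self.cache_size:
                self.cache.popitem(last=False)
            self.cache[idx] = self.cpu_experts[idx].to(device)
        return self.cache[idx]

    def forward(self, x, top_k=2):
        scores = self.router(x)                       # (batch, num_experts)
        weights, chosen = scores.topk(top_k, dim=-1)  # top-k routing
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for k in range(top_k):
            for idx in chosen[:, k].unique().tolist():
                mask = chosen[:, k] == idx
                expert = self._fetch(idx, x.device)
                out[mask] += weights[mask, k:k+1] * expert(x[mask])
        return out

layer = OffloadedMoELayer()
y = layer(torch.randn(4, 64))  # only routed experts are fetched on demand
```

Under such a scheme, every cache miss triggers a weight transfer, which is consistent with the abstract's finding that disk speed and the interconnect between storage and the compute device dominate latency.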
Similar Works
Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study
2020 · 22,609 citations
La certeza de lo impredecible: Cultura Educación y Sociedad en tiempos de COVID19
2020 · 19,271 citations
A Multi-Modal Distributed Real-Time IoT System for Urban Traffic Control (Invited Paper)
2024 · 14,254 citations
UNet++: A Nested U-Net Architecture for Medical Image Segmentation
2018 · 8,503 citations
Review of deep learning: concepts, CNN architectures, challenges, applications, future directions
2021 · 7,117 citations