This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Large Language Model‐Embedded Intelligent Robotic Scrub Nurse with Multimodal Input for Enhancing Surgeon–Robot Interaction
Citations: 0
Authors: 5
Year: 2025
Abstract
Scrub nurses have crucial responsibilities, particularly in handling instrument‐related tasks. However, significant mental burdens and unfamiliarity with instruments can lead to various human errors. Consequently, the research community has explored robotic prototypes. Unfortunately, these prototypes often focus on specific instrument‐handling tasks or offer non‐intuitive interaction methods, hindering social acceptance. This article proposes a surgeon‐friendly robotic scrub nurse platform that addresses multiple instrument‐related tasks, including grasping and transferring, automatic sorting, and counting. To the best of the authors’ knowledge, this is the first prototype to incorporate audiovisual input modalities and a large language model (LLM) for smooth and intuitive interaction between the surgeon and the robot. Specifically, vision artificial intelligence (AI) provides accurate instrument detection results using oriented bounding boxes with an average precision of 97.6%, guiding robot motion planning. The speech AI recognizes the surgeon's voice commands. The LLM further interprets multimodal information to trigger different robot actions via the “tool use” capability, achieving a standalone success rate of 94% with an average action latency of less than 1 s on a real robotic scrub nurse hardware platform. Physical validation demonstrated that the proposed prototype successfully completed all assigned tasks, proving its feasibility and effectiveness.
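The "tool use" step described in the abstract, where the LLM's interpretation of the surgeon's request triggers a robot action, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and does not come from the article: the function names `grasp_and_transfer`, `sort_instruments`, and `count_instruments`, the `TOOLS` registry, and the JSON shape of the tool call are all hypothetical, and the real system would command robot hardware rather than return strings.

```python
import json
from typing import Callable, Dict

# Hypothetical robot actions (illustrative stand-ins, not the authors' API).
def grasp_and_transfer(instrument: str) -> str:
    return f"transferring {instrument}"

def sort_instruments() -> str:
    return "sorting instruments"

def count_instruments() -> str:
    return "counting instruments"

# Registry mapping tool names the LLM may emit to callable actions.
TOOLS: Dict[str, Callable[..., str]] = {
    "grasp_and_transfer": grasp_and_transfer,
    "sort_instruments": sort_instruments,
    "count_instruments": count_instruments,
}

def dispatch(tool_call_json: str) -> str:
    """Parse an assumed LLM tool call of the form
    {"name": ..., "arguments": {...}} and run the matching action."""
    call = json.loads(tool_call_json)
    name = call.get("name")
    args = call.get("arguments", {})
    if name not in TOOLS:
        # Refuse unknown tools instead of guessing an action.
        return f"unknown tool: {name}"
    try:
        return TOOLS[name](**args)
    except TypeError as exc:
        # Raised when the arguments do not match the action's signature.
        return f"bad arguments for {name}: {exc}"

if __name__ == "__main__":
    print(dispatch('{"name": "grasp_and_transfer", "arguments": {"instrument": "scalpel"}}'))
```

Rejecting unknown tool names and mismatched arguments, rather than falling back to a default action, is a design choice in this sketch; the article's abstract does not state how the prototype handles such cases.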