OpenAlex · Aktualisierung stündlich · Letzte Aktualisierung: 12.03.2026, 04:46

Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

MT-FiST: A Multi-Task Fine-Grained Spatial-Temporal Framework for Surgical Action Triplet Recognition

2023·12 Zitationen·IEEE Journal of Biomedical and Health Informatics
Volltext beim Verlag öffnen

12

Zitationen

5

Autoren

2023

Jahr

Abstract

Surgical action triplet recognition plays a significant role in helping surgeons facilitate scene analysis and decision-making in computer-assisted surgeries. Compared to traditional context-aware tasks such as phase recognition, surgical action triplets, comprising the instrument, verb, and target, can offer more comprehensive and detailed information. However, current triplet recognition methods fall short in distinguishing the fine-grained subclasses and disregard temporal correlation in action triplets. In this article, we propose a multi-task fine-grained spatial-temporal framework for surgical action triplet recognition named MT-FiST. The proposed method utilizes a multi-label mutual channel loss, which consists of diversity and discriminative components. This loss function decouples global task features into class-aligned features, enabling the learning of more local details from the surgical scene. The proposed framework utilizes partial shared-parameters LSTM units to capture temporal correlations between adjacent frames. We conducted experiments on the CholecT50 dataset proposed in the MICCAI 2021 Surgical Action Triplet Recognition Challenge. Our framework is evaluated on the private test set of the challenge to ensure fair comparisons. Our model apparently outperformed state-of-the-art models in instrument, verb, target, and action triplet recognition tasks, with mAPs of 82.1% (+4.6%), 51.5% (+4.0%), 45.50% (+7.8%), and 35.8% (+3.1%), respectively. The proposed MT-FiST boosts the recognition of surgical action triplets in a context-aware surgical assistant system, further solving multi-task recognition by effective temporal aggregation and fine-grained features.

Ähnliche Arbeiten

Autoren

Institutionen

Themen

Surgical Simulation and TrainingCardiac, Anesthesia and Surgical OutcomesArtificial Intelligence in Healthcare and Education
Volltext beim Verlag öffnen