This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Evaluating Large Language Models for Decision Support in Minimally Invasive Spine Surgery Triage and Procedural Categories
Citations: 0
Authors: 8
Year: 2025
Abstract
Study Design: Vignette-based cross-sectional study.

Objective: Generative artificial intelligence (AI) programs such as large language models (LLMs) are reshaping treatment decision-making, yet applications in minimally invasive spine surgery (MISS) remain scarce. This study examines whether OpenAI's ChatGPT-5 Pro and Google's Gemini 2.5 Pro reproduce expert management categories from published MISS cases and measures agreement at the procedural and binary triage levels.

Methods: We constructed 90 clinical vignettes from published case reports and prompted each LLM to assign one or more of ten predefined categories with two-sentence rationales. Agreement with the reference was assessed using Jensen-Shannon divergence (JSD), Stuart-Maxwell tests, Cohen's κ, and McNemar's test for surgical vs non-surgical triage.

Results: Divergence from the reference was small, with Jensen-Shannon divergence 0.115 (ChatGPT-5 Pro) and 0.112 (Gemini 2.5 Pro), and smaller between models at 0.073. Paired multinomial tests found differences from the reference (Stuart-Maxwell χ<sup>2</sup>(9) = 24.8 and 26.0; <i>P</i> = 0.007 and 0.006) but not between models (14.4; <i>P</i> = 0.108). Case-level agreement was slight for ChatGPT-5 Pro and fair for Gemini 2.5 Pro (κ = 0.146 and 0.245). Collapsing categories to surgical vs non-surgical improved agreement (κ = 0.415 and 0.587 vs reference; 0.692 between models) with no bias in rates (<i>P</i> ≥ 0.401).

Conclusions: LLMs may differentiate between surgical and non-surgical triage, but procedure selection should remain expert-led until systems mature. These findings establish a baseline for integrating LLMs into surgical triage workflows and highlight the promise and limitations of generative AI in precision spine care.
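The two agreement measures named in the Methods section can be sketched in a few lines. The following is a minimal illustration, not the study's analysis code: the category distributions and triage labels are invented placeholder values, and the study's ten categories are represented only by position.

```python
from math import log2

# Hypothetical distributions over ten procedural categories
# (illustrative values only, not the published data).
reference = [0.20, 0.15, 0.12, 0.10, 0.10, 0.09, 0.08, 0.07, 0.05, 0.04]
llm_model = [0.18, 0.17, 0.11, 0.09, 0.12, 0.08, 0.09, 0.06, 0.06, 0.04]

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; skips zero-probability terms."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jensen_shannon_divergence(p, q):
    """Symmetric JSD: mean KL divergence of p and q from their midpoint."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

jsd = jensen_shannon_divergence(reference, llm_model)

def cohens_kappa(a, b):
    """Chance-corrected agreement between two binary raters."""
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    p1a, p1b = sum(a) / n, sum(b) / n
    p_expected = p1a * p1b + (1 - p1a) * (1 - p1b)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical binary triage labels (1 = surgical, 0 = non-surgical).
expert = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
llm    = [1, 0, 0, 1, 0, 1, 1, 1, 0, 1]
kappa = cohens_kappa(expert, llm)

print(f"JSD = {jsd:.3f}, kappa = {kappa:.3f}")
```

With base-2 logarithms the JSD is bounded in [0, 1], so values near 0.1, as reported in the Results, indicate closely matched category distributions; κ near 0.58 would fall in the "moderate" band, comparable to the binary-triage agreement reported above.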
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,312 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,169 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,564 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,466 citations