This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Large Language Models for Zero-Shot Procedure Extraction in Orthopedic Surgery: A Comparative Evaluation
Citations: 0
Authors: 17
Year: 2025
Abstract
Background: Operative notes in electronic health records contain critical information for understanding surgical care, yet manual coding is time-consuming, costly, and inconsistent. Large language models (LLMs) promise to transform this process by automatically extracting detailed procedure information, a capability with significant implications for scaling clinical registries and advancing surgical research.

Methods: We conducted a large-scale evaluation of state-of-the-art LLMs for zero-shot structured information extraction from orthopedic clinical notes. Fourteen open-source and proprietary models were tested on 800 real operative notes, annotated by both an orthopedic surgeon and an administrator using a curated list of 74 procedure classes. We compared model outputs to human annotations, assessing accuracy and exploring the effects of model scale, reasoning capabilities, and prompt design.

Results: Across models, LLMs consistently outperformed administrator-assigned labels, achieving macro-F1 scores above 0.6 and improving over administrative coding by up to 10 points. Larger models and reasoning capabilities further boosted performance, though gains plateaued beyond 30 billion parameters. Performance varied by procedure frequency, revealing clear strengths and persistent challenges for rare or complex cases.

Conclusion: Modern LLMs can already outperform routine administrative coding in extracting detailed surgical procedure data, pointing to a future where registry curation could be faster, cheaper, and more consistent. Full alignment with surgical experts remains an open challenge, especially for rare procedures, which emphasizes the need for domain adaptation and thoughtful deployment. Our findings illustrate how general-purpose LLMs can advance automated clinical data curation and inform the next generation of surgical informatics.
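The abstract reports macro-F1, which averages the per-class F1 score with equal weight for every class, so rare procedure classes count as much as common ones. The following is a minimal sketch of that metric, not the authors' evaluation code. It assumes exactly one procedure label per note (the abstract does not say whether notes can carry several), and the function name `macro_f1` and the labels "THA", "TKA", and "ACL" are hypothetical.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over every class seen in gold or pred.

    Assumes one label per note; `gold` and `pred` are equal-length sequences.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the true class but was missed

    classes = set(gold) | set(pred)
    if not classes:
        return 0.0

    # F1 = 2*TP / (2*TP + FP + FN); the denominator is nonzero for any
    # class that appears in gold or pred.
    scores = [2 * tp[c] / (2 * tp[c] + fp[c] + fn[c]) for c in classes]
    return sum(scores) / len(scores)


# Hypothetical labels, for illustration only.
gold = ["THA", "THA", "TKA", "ACL"]
pred = ["THA", "TKA", "TKA", "THA"]
score = macro_f1(gold, pred)
```

In this toy input the single "ACL" note is missed, which contributes an F1 of 0 for that class and pulls the average down by a full third. This is one way to see why performance on rare procedures weighs heavily on a macro-averaged score.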
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,250 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,109 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,482 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,434 citations