This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Predicting 30-Day Postoperative Mortality and American Society of Anesthesiologists Physical Status Using Retrieval-Augmented Large Language Models: Development and Validation Study (Preprint)
Citations: 0
Authors: 3
Year: 2025
Abstract
BACKGROUND
Accurately assessing perioperative risk is critical for informed surgical planning and patient safety. However, current prediction models often rely on structured data and overlook the nuanced clinical reasoning embedded in free-text preoperative notes. Recent advances in large language models (LLMs) have opened opportunities for harnessing unstructured clinical data, yet their application in perioperative prediction remains limited by concerns about factual accuracy. Retrieval-augmented generation (RAG) offers a promising solution: it grounds LLM outputs in domain-specific knowledge sources, potentially improving both predictive accuracy and clinical interpretability.

OBJECTIVE
This study aimed to investigate whether integrating LLMs with RAG can improve the prediction of 30-day postoperative mortality and American Society of Anesthesiologists (ASA) physical status classification using unstructured preoperative clinical notes.

METHODS
We conducted a retrospective cohort study using 24,491 medical records from a tertiary medical center, including preoperative anesthesia assessments, discharge summaries, and surgical information. To extract clinical insights from free-text data, we used the LLaMA 3.1-8B language model with RAG, with MedEmbed for text embedding and Miller's Anesthesia as the primary retrieval source. We evaluated model performance under various configurations, varying the embedding model, the chunk size, and the use of few-shot prompting. Machine learning (ML) models, including random forest, support vector machines (SVM), Extreme Gradient Boosting (XGBoost), and logistic regression, were trained on structured features as baselines.

RESULTS
A total of 520 (2.1%) patients experienced in-hospital 30-day postoperative mortality. The ASA physical status distribution was as follows: class I: 535 (2.2%); class II: 15,272 (62.4%); class III: 8024 (32.8%); class IV: 606 (2.5%); and class V: 54 (0.22%). For 30-day postoperative mortality prediction, the LLaMA-RAG model achieved an F1-score of 0.4663 (95% CI 0.4654-0.4672), versus 0.2369 (95% CI 0.2341-0.2397) without few-shot prompting, 0.0879 (95% CI 0.0717-0.1041) without RAG, and 0.0436 (95% CI 0.0292-0.0580) without either few-shot prompting or RAG. Among ML models, XGBoost scored 0.4459 (95% CI 0.4176-0.4742); random forest, 0.3953 (95% CI 0.3791-0.4115); logistic regression, 0.2720 (95% CI 0.2647-0.2793); and SVM, 0.2474 (95% CI 0.2275-0.2673). For ASA classification, LLaMA-RAG achieved a micro F1-score of 0.8409 (95% CI 0.8238-0.8551), versus 0.6546 (95% CI 0.6430-0.6796) without few-shot prompting, 0.6340 (95% CI 0.6157-0.6535) without RAG, and 0.4238 (95% CI 0.3952-0.4490) without either few-shot prompting or RAG. In comparison, XGBoost achieved 0.8273 (95% CI 0.8209-0.8498); logistic regression, 0.7940 (95% CI 0.7671-0.7950); random forest, 0.7847 (95% CI 0.7637-0.7868); and SVM, 0.7697 (95% CI 0.7637-0.7697). Notably, the LLaMA-RAG model demonstrated exceptional sensitivity in identifying rare but high-risk cases, such as ASA class V patients and postoperative deaths.

CONCLUSIONS
The LLaMA-RAG model significantly improved the prediction of postoperative mortality and ASA classification, especially for rare high-risk cases. By grounding outputs in domain knowledge, RAG enhanced both accuracy and prompt-driven interpretability over the ML and ablation models, highlighting its promise for real-world clinical decision support.
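The retrieval-augmented, few-shot setup described in the methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the prompt wording, and the word-based chunking are assumptions made for illustration, the bag-of-words `embed` function is only a stand-in for a real embedding model such as MedEmbed, and the reference text is a placeholder paraphrasing the standard ASA class definitions rather than a passage from Miller's Anesthesia. The call to the language model itself is omitted.

```python
import math
from collections import Counter

def chunk_text(text, chunk_size):
    """Split a reference text into chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(w.strip(".,;:()").lower() for w in text.split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, top_k=2):
    """Return the `top_k` chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]

def build_prompt(note, retrieved, few_shot):
    """Assemble retrieved passages, few-shot examples, and the new note."""
    parts = ["Reference passages:"]
    parts += [f"- {chunk}" for chunk in retrieved]
    parts.append("Examples:")
    for example_note, example_label in few_shot:
        parts.append(f"Note: {example_note}\nASA class: {example_label}")
    parts.append(f"Note: {note}\nASA class:")
    return "\n".join(parts)

# Placeholder knowledge base (hypothetical, one chunk per definition).
chunks = [
    "ASA I is a normal healthy patient",
    "ASA II is a patient with mild systemic disease",
    "ASA III is a patient with severe systemic disease",
]
# Hypothetical few-shot example and preoperative note.
few_shot = [("Healthy adult, no medications, elective hernia repair.", "I")]
note = "Patient with mild systemic disease, well controlled hypertension"

prompt = build_prompt(note, retrieve(note, chunks, top_k=1), few_shot)
print(prompt)
```

Removing the retrieved passages or the few-shot examples from `build_prompt` would correspond, loosely, to the "without RAG" and "without few-shot prompting" ablations reported in the results.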
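The results are reported as an F1-score for mortality (a binary outcome with 2.1% prevalence) and a micro F1-score for the five ASA classes. The following Python sketch shows how these two metrics are commonly computed, using invented toy labels rather than study data; the abstract does not state how the confidence intervals were derived, so that step is not shown.

```python
def binary_f1(y_true, y_pred, positive=1):
    """F1 for the positive class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, and FN over all classes first."""
    pairs = list(zip(y_true, y_pred))
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        tp += sum(1 for t, p in pairs if t == label and p == label)
        fp += sum(1 for t, p in pairs if t != label and p == label)
        fn += sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Toy mortality labels: 2 deaths among 100 patients (invented data).
deaths_true = [0] * 98 + [1] * 2
always_survive = [0] * 100                # predicts no deaths at all
one_hit = [0] * 97 + [1] + [1] + [0]      # one false alarm, one hit, one miss

print(binary_f1(deaths_true, always_survive))
print(binary_f1(deaths_true, one_hit))

# Toy ASA labels (invented data): three of five predictions are correct.
asa_true = ["II", "II", "III", "I", "IV"]
asa_pred = ["II", "III", "III", "I", "III"]
print(micro_f1(asa_true, asa_pred))
```

In the toy mortality example, a classifier that predicts survival for everyone is 98% accurate yet has an F1-score of 0, which is why F1 is informative for rare outcomes. For single-label multiclass problems such as ASA classification, micro F1 equals overall accuracy, so it is dominated by the common classes II and III.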