This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Abstract 2747: Source discipline matters: Guideline anchored large language model outperforms Open Evidence for decision support in acute leukemias.
Citations: 0
Authors: 8
Year: 2026
Abstract
Background
Acute leukemia is one of the most complex and rapidly evolving domains in hematologic oncology, where treatment selection depends on factors such as molecular subtype and performance status. The National Comprehensive Cancer Network (NCCN) provides updated, lineage-specific algorithms for Acute Myeloid Leukemia (AML) and Acute Lymphoblastic Leukemia (ALL), yet these guidelines are dense and frequently revised. Large language models (LLMs) may assist clinicians in synthesizing these data, but the reliability of their outputs depends critically on their evidence sources. This study compared an NCCN-anchored retrieval-augmented model (RAG GPT-5) with Open Evidence (OE), a model linked to journal-based sources such as NEJM and JAMA, to assess accuracy, safety, and guideline concordance in acute leukemia decision support.

Methods
Forty de-identified AML and ALL vignettes were independently evaluated by two models: Open Evidence (O1) and an NCCN-anchored retrieval-augmented GPT-5 model (O2). Reviewers were blinded to model identity and rated each response using a modified Generative Performance Score (mGPS = Guideline Concordance − Hallucination Penalty; range −1.0 to +1.0). Statistical comparison used independent-samples t-tests.

Results
The RAG model (O2) demonstrated significantly higher overall performance (mean = 0.84, SD = 0.25) compared with Open Evidence (O1; mean = 0.70, SD = 0.32); t(≈78) = −2.17, p = 0.033. Qualitative review revealed key distinctions in clinical reasoning:
• Open Evidence frequently hallucinated agents (e.g., ipilimumab), omitted prior therapy context, and failed to adjust for infection recovery or cardiac risk before chemotherapy.
• RAG GPT-5 exclusively cited NCCN recommendations, with minor rounding errors (e.g., ATRA dose), and occasionally defaulted to conservative but still guideline-concordant dosing (e.g., daunorubicin).
• Neither model fully addressed dual-tumor or BCR-ABL-positive scenarios, and both under-recognized recent updates such as menin inhibitors for MLL-rearranged AML, which are emerging but not yet NCCN-listed.
Variance was smaller for the RAG system, indicating more consistent performance across cases.

Conclusions
In acute leukemias, evidence source materially alters LLM behavior and reliability. Guideline-anchored retrieval produced significantly more NCCN-concordant recommendations and fewer hallucinations than OE. While both systems occasionally missed nuanced treatment history or recent investigational agents, only OE introduced clinically unsafe suggestions. These findings support NCCN-anchored RAG as the safer and more consistent foundation for LLM-based decision support in acute leukemias, where precision and patient context are paramount. Future work should expand to relapse and transplant scenarios with prospective clinician validation.

Citation Format: Peter Palumbo, Connor Yost, Emilio Del Toro, Demetrios Garbis, Peter Odutola, Yash Kumar, Arturo Loaiza, Matthew Sullivan. Source discipline matters: Guideline anchored large language model outperforms Open Evidence for decision support in acute leukemias [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2026; Part 1 (Regular Abstracts); 2026 Apr 17-22; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2026;86(7 Suppl):Abstract nr 2747.
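To make the scoring and statistics concrete, the mGPS formula and the independent-samples comparison described in the Methods can be sketched in Python. This is a minimal illustration only: the per-vignette scores below are hypothetical placeholders (not the study's data), and the pooled-variance t-statistic shown is one standard form of the independent-samples test; the abstract does not specify whether a pooled or Welch variant was used.

```python
import math

def mgps(guideline_concordance, hallucination_penalty):
    """Modified Generative Performance Score: concordance minus
    hallucination penalty, clamped to the reported range [-1.0, +1.0]."""
    return max(-1.0, min(1.0, guideline_concordance - hallucination_penalty))

def pooled_t(a, b):
    """Independent-samples t-statistic (pooled variance) for two score
    lists; returns the t value and degrees of freedom (na + nb - 2)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)  # sample variance of a
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)  # sample variance of b
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)  # pooled variance
    t = (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2

# Hypothetical placeholder ratings, NOT the study's data:
o1_scores = [mgps(0.8, 0.2), mgps(1.0, 0.0), mgps(0.5, 0.3)]  # Open Evidence
o2_scores = [mgps(1.0, 0.0), mgps(0.9, 0.0), mgps(1.0, 0.1)]  # RAG GPT-5
t, df = pooled_t(o1_scores, o2_scores)
```

With 40 vignettes per model, the degrees of freedom would be 40 + 40 − 2 = 78, matching the reported t(≈78); a negative t here corresponds to O1 scoring below O2, as in the abstract.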