This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Effect of human-AI teams on oncology prescreening: Final analysis of a randomized trial.
1
Citations
10
Authors
2025
Year
Abstract
1508 Background: Eligibility assessment for oncology clinical trials – "prescreening" – relies on manual review of unstructured clinical notes, which is error-prone and time-consuming. Artificial intelligence (AI) language models that merge deep learning with oncologist-derived rules (neurosymbolic AI) can enhance prescreening by automating eligibility extraction from longitudinal electronic health records (EHRs), yet real-world evaluations are limited. We compared the accuracy and efficiency of traditional vs. AI-augmented (Human+AI) prescreening.

Methods: In this randomized non-inferiority trial, two research coordinators (RCs) abstracted 12 common trial eligibility criteria from complete EHRs of patients with advanced non-small cell lung cancer (NSCLC) or colorectal cancer (CRC) treated in a community oncology practice. Before the trial, gold-standard abstraction was performed by 3 independent oncologist reviewers. Charts were randomized in blocks of 20 to be reviewed alone (Human-alone) or with augmentation by a pretrained neurosymbolic model (Human+AI) in a paired design, such that each RC reviewed each patient chart. The primary aim was to evaluate noninferiority (margin ±5%) and subsequent superiority of chart-level accuracy (proportion of correctly abstracted elements per chart relative to the gold standard) for Human+AI vs. Human-alone. Secondary outcomes were criterion-level accuracy (proportion of correctly abstracted elements across charts for each eligibility criterion) and efficiency (median abstraction time per chart). Paired t-tests and Wilcoxon rank-sum tests assessed differences between Human+AI and Human-alone. We descriptively compared the accuracy of both arms vs. the AI algorithm (AI-alone).

Results: Among 356 charts (196 NSCLC, 160 CRC), Human+AI accuracy was noninferior and superior to Human-alone (76.1% vs. 71.5%, p < 0.001); both Human arms were superior to AI-alone (59.9%). Human+AI had the greatest criterion-level accuracy for 7 of 12 criteria. Efficiency was similar between the Human arms (32.1 vs. 31.8 min, p = 0.51).

Conclusions: AI-augmented prescreening was more accurate than RC or AI prescreening alone. Human+AI teaming most improved accuracy for biomarker, staging, and response criteria. While Human+AI did not save time, efficiency gains may be realized as RCs become more familiar with AI eligibility models. AI language models can enhance RC prescreening and identification of trial-eligible patients.

Clinical trial information: NCT06561217.

Accuracy across arms (%):

Criteria                            Human-Alone   Human+AI   AI-Alone   p-value
Overall                                 71.5         76.1       59.9     <0.001
Neoplasm
  Cancer Type                           86.9         86.4       73.3      0.80
  Stage Group                           71.7         73.4       57.0      0.57
  M Stage                               43.9         57.0*      60.2     <0.001
  N Stage                               50.5         66.3*      52.6     <0.001
  T Stage                               56.3         71.6*      54.3     <0.001
Biomarker
  Biomarker Tested?                     84.6         93.2**     88.1     <0.001
  Biomarker Result                      67.9         79.0*      32.5     <0.001
  Biomarker Result Interpretation       80.8         91.3*      35.7     <0.001
Other
  Outcome                               23.7         35.9*      55
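To make the primary analysis concrete, below is a minimal Python sketch of the paired noninferiority-then-superiority test on chart-level accuracy described in the Methods, followed by a rank-sum comparison of abstraction times. The data are simulated and all variable names are assumptions; the trial's actual analysis code is not part of this abstract.

```python
# Hypothetical sketch of the paired noninferiority/superiority analysis.
# Simulated data only; effect sizes are chosen to roughly echo the abstract.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_charts = 356
margin = 0.05  # noninferiority margin of +/-5 percentage points

# Simulated chart-level accuracies: the proportion of the 12 criteria
# abstracted correctly per chart, relative to the oncologist gold standard.
acc_human = rng.normal(0.715, 0.15, n_charts).clip(0, 1)
acc_team = (acc_human + rng.normal(0.046, 0.10, n_charts)).clip(0, 1)

# Paired differences: each chart is reviewed under both conditions.
diff = acc_team - acc_human

# Step 1: noninferiority, one-sided test of H0: mean diff <= -margin.
t_ni = stats.ttest_1samp(diff, popmean=-margin, alternative="greater")
# Step 2: if noninferior, test superiority, H0: mean diff <= 0.
t_sup = stats.ttest_1samp(diff, popmean=0.0, alternative="greater")

print(f"mean accuracy: Human-alone {acc_human.mean():.3f}, "
      f"Human+AI {acc_team.mean():.3f}")
print(f"noninferiority p = {t_ni.pvalue:.4g}; superiority p = {t_sup.pvalue:.4g}")

# Efficiency: abstraction time per chart, compared (per the abstract)
# with a Wilcoxon rank-sum test.
time_human = rng.normal(31.8, 8.0, n_charts)
time_team = rng.normal(32.1, 8.0, n_charts)
print(f"rank-sum p = {stats.ranksums(time_team, time_human).pvalue:.3f}")
```

Testing noninferiority first and superiority second is a standard gatekeeping sequence: the second test is interpreted only if the first succeeds, which is why the abstract can report "noninferior and superior" as a single stepwise finding without inflating the type I error rate.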
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,245 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,100 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,466 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,429 citations