This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Real-time implementation and validation of a natural language processing tool to improve clinical trial enrollment in pancreatic cancer.
Citations: 0
Authors: 8
Year: 2025
Abstract
127 Background: Clinical trial enrollment in pancreatic cancer (PDAC) remains suboptimal, especially among minority patients. Artificial intelligence (AI)-powered screening has been proposed as a method to reduce disparities. However, both advanced AI models and more traditional models that use logic-based rules, termed natural language processing (NLP), lack prospective validation in real-world settings. The aim of this study was to evaluate the real-time performance of an NLP tool to identify patients eligible for PDAC trials at a high-volume academic cancer center.
Methods: We prospectively validated Deep6 AI, a program that uses traditional NLP to screen clinical notes, imaging and pathology reports, and laboratory results, for all new patients seen for PDAC by medical, surgical, and radiation oncology (September 2024 to January 2025). Rule-based screening algorithms were built for each trial based on eligibility criteria, and patients were screened after their initial visit. Concurrently, manual screening was conducted for all minority patients and served as the reference standard. Model performance was assessed using precision (true positives/flagged matches) in the full cohort and recall (true positives/all true matches) and F1 score (harmonic mean of precision and recall) within the manually screened subset. Secondary analyses evaluated performance by age, sex, self-reported race and ethnicity, clinical stage, and trial characteristics.
Results: Of the 402 patients screened, Deep6 identified potential matches among 105 (26.1%) patients with an average of 1.6 potential trial matches per patient (172 total). A total of 48 matches were true matches (precision 27.9%). Manual review of 101 minority patients identified 65 patient-trial matches missed by Deep6 (recall 14.5%, F1 score 19.6%). Among matched patients, precision did not differ by age, sex, race, ethnicity, or stage. On univariate analysis, increasing age (OR 0.97, p = 0.011) and locally advanced disease (ref. metastatic; OR 0.49, p = 0.047) were associated with lower odds of matching. Among 18 trials, Deep6 performance was highest in trials for resectable tumors (average precision 40.4%) and accounted for an average of 20.0% of verified matches. Trials with interpretive eligibility criteria, such as imaging-based assessments of disease extent or progression, had lower average precision (7.0%) and captured 9.0% of matched patients.
Conclusions: NLP tools can support equitable screening for trials. However, their effectiveness is greatest in trials where eligibility criteria align closely with structured data and rule-based logic. Trials with complex eligibility requirements may require more advanced AI models. These findings support hybrid workflows that integrate NLP, advanced AI, and manual review to improve the accuracy, scalability, and equity of clinical trial screening.
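The precision, recall, and F1 definitions given in the Methods can be illustrated with a minimal Python sketch. The function names are illustrative and not taken from the study. Only the full-cohort precision uses counts reported in the abstract (48 true matches out of 172 flagged matches). The counts in the second example are hypothetical, since the abstract does not report the true-positive count within the manually screened subset.

```python
def precision(true_positives: int, flagged: int) -> float:
    """True positives divided by all flagged matches."""
    return true_positives / flagged if flagged else 0.0


def recall(true_positives: int, missed: int) -> float:
    """True positives divided by all true matches (found plus missed)."""
    total_true = true_positives + missed
    return true_positives / total_true if total_true else 0.0


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Counts reported in the abstract: 48 true matches among 172 flagged matches.
full_cohort_precision = precision(48, 172)
print(f"Full-cohort precision: {full_cohort_precision:.1%}")  # 27.9%

# Hypothetical subset counts, chosen only to show the arithmetic:
# 20 true positives among 50 flagged matches, with 60 true matches missed.
p = precision(20, 50)   # 0.40
r = recall(20, 60)      # 0.25
print(f"Hypothetical F1: {f1_score(p, r):.3f}")
```

Because the F1 score is a harmonic mean, it is pulled toward the lower of the two inputs, so a tool with low recall scores poorly on F1 even when its precision is moderate.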