OpenAlex · Updated hourly · Last updated: 23.03.2026, 13:08

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

291P Harnessing AI and clinical guidelines to find metastatic breast cancer patients not tested and treated according to guidelines

2025 · 0 citations · ESMO Real World Data and Digital Oncology · Open Access

Citations: 0 · Authors: 7 · Year: 2025

Abstract

Background: Advances in natural language processing (NLP) and large language models (LLMs) offer promising approaches for automated data extraction. In China, NLP has been applied in specific database studies, but wider adoption is limited by a lack of algorithm transparency and performance consistency. This study assessed a rule-based NLP approach and a fine-tuned LLM for extracting key variables from Chinese electronic health records.
Methods: In this pilot project, we retrospectively sampled clinical data of 480 NSCLC patients in the National Anti-tumor Drug Surveillance System between 2018 and 2023. Two automated approaches were compared with manual review: (1) a well-validated NLP approach integrating BERT-based named entity recognition and ALBERT-based relation extraction; (2) an exploratory Qwen3-30B-based LLM fine-tuned for few-shot learning. Accuracy and F1-scores were assessed across variables, including demographics, clinical characteristics, treatment history, and mortality. Time efficiency was measured.
Results: The NLP and LLM models achieved overall F1-scores of 72.0% and 76.0%, respectively. While both showed comparable accuracy, the NLP performed better on semi-structured variables such as TNM staging (88.0% vs. 82.0%) and ECOG status (79.0% vs. 63.9%). The LLM demonstrated stronger performance on context-dependent variables, with higher F1-scores for start date (LLM 88.00% vs. NLP 72.0%), end date (65.00% vs. 56.0%), and progression in first-line therapy (92.0% vs. 87.0%). With the LLM, manual effort was reduced from 168 to 40 hours for quality control and optimization. Further NLP optimization was not conducted due to prior training on 100,000 diverse sentences derived from over two million records.
Conclusions: The fine-tuned LLM outperformed the rule-based NLP overall, particularly in extracting complex, context-dependent variables. These results underscore the potential of NLP and LLMs to improve research efficiency, lower data curation costs, and enable scalable real-world evidence generation in oncology research in China. Further research-specific refinement is essential to meet rigorous accuracy standards and ensure robustness across diverse medical documentation.
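
The abstract reports per-variable F1-scores against manual review but does not say how matches were counted. The following is a minimal sketch of one plausible scoring scheme, assuming exact-match comparison per patient and treating a wrong extracted value as both a false positive and a false negative. The function names and the sample ECOG values are hypothetical and are not taken from the study.

```python
from typing import Optional, Sequence


def f1_score(extracted: Sequence[Optional[str]],
             reference: Sequence[Optional[str]]) -> float:
    """F1 for one variable; None means 'no value recorded'.

    Assumption (not from the abstract): exact string match against
    the manually reviewed reference value.
    """
    if len(extracted) != len(reference):
        raise ValueError("extracted and reference must be the same length")
    tp = fp = fn = 0
    for pred, gold in zip(extracted, reference):
        if pred is not None and pred == gold:
            tp += 1
        else:
            if pred is not None:
                fp += 1  # a value was extracted but is wrong or spurious
            if gold is not None:
                fn += 1  # a reference value was missed or mis-extracted
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which a quantity dropped, e.g. hours of manual effort."""
    return (before - after) / before


# Hypothetical ECOG values for four patients (illustration only).
manual_review = ["1", "0", None, "2"]
model_output = ["1", "1", None, "2"]
ecog_f1 = f1_score(model_output, manual_review)

# Manual effort figures as reported in the abstract: 168 -> 40 hours.
effort_saved = relative_reduction(168, 40)
```

Under this scheme the hypothetical example scores two correct extractions out of three attempted and three expected. Applied to the reported figures, the drop from 168 to 40 hours corresponds to a reduction of roughly 76% in manual effort.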

Topics

Artificial Intelligence in Healthcare and Education · Radiomics and Machine Learning in Medical Imaging · Cancer Genomics and Diagnostics