Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
Using Natural Language Processing to Improve Efficiency of Manual Chart Abstraction in Research: The Case of Breast Cancer Recurrence
173
Zitationen
7
Autoren
2014
Jahr
Abstract
The increasing availability of electronic health records (EHRs) creates opportunities for automated extraction of information from clinical text. We hypothesized that natural language processing (NLP) could substantially reduce the burden of manual abstraction in studies examining outcomes, like cancer recurrence, that are documented in unstructured clinical text, such as progress notes, radiology reports, and pathology reports. We developed an NLP-based system using open-source software to process electronic clinical notes from 1995 to 2012 for women with early-stage incident breast cancers to identify whether and when recurrences were diagnosed. We developed and evaluated the system using clinical notes from 1,472 patients receiving EHR-documented care in an integrated health care system in the Pacific Northwest. A separate study provided the patient-level reference standard for recurrence status and date. The NLP-based system correctly identified 92% of recurrences and estimated diagnosis dates within 30 days for 88% of these. Specificity was 96%. The NLP-based system overlooked 5 of 65 recurrences, 4 because electronic documents were unavailable. The NLP-based system identified 5 other recurrences incorrectly classified as nonrecurrent in the reference standard. If used in similar cohorts, NLP could reduce by 90% the number of EHR charts abstracted to identify confirmed breast cancer recurrence cases at a rate comparable to traditional abstraction.
Ähnliche Arbeiten
Research electronic data capture (REDCap)—A metadata-driven methodology and workflow process for providing translational research informatics support
2008 · 50.779 Zit.
Gene Ontology: tool for the unification of biology
2000 · 44.315 Zit.
STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets
2018 · 18.994 Zit.
Haploview: analysis and visualization of LD and haplotype maps
2004 · 14.687 Zit.
A translation approach to portable ontology specifications
1993 · 12.495 Zit.