Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
Ontology-based venous thromboembolism risk assessment model developing from medical records
32
Zitationen
6
Autoren
2019
Jahr
Abstract
BACKGROUND: Padua linear model is widely used for the risk assessment of venous thromboembolism (VTE), a common but preventable complication for inpatients. However, genetic and environmental differences between Western and Chinese population limit the validity of Padua model in Chinese patients. Medical records which contain rich information about disease progression, are useful in mining new risk factors related to Chinese VTE patients. Furthermore, machine learning (ML) methods provide new opportunities to build precise risk prediction model by automatic selection of risk factors based on original medical records. METHODS: Medical records of 3,106 inpatients including 224 VTE patients were collected and various types of ontologies were integrated to parse unstructured text. A workflow of ontology-based VTE risk prediction model, that combines natural language processing (NLP) and machine learning (ML) technologies, was proposed. Firstly ontology terms were extracted from medical records, then sorted according to their calculated weights. Next importance of each term in the unit of section was evaluated and finally a ML model was built based on a subset of terms. Four ML methods were tested, and the best model was decided by comparing area under the receiver operating characteristic curve (AUROC). RESULTS: Medical records were first split into different sections and subsequently, terms from each section were sorted by their weights calculated by multiple types of information. Greedy selection algorithm was used to obtain significant sections and terms. Top terms in each section were selected to construct patients' distributed representations by word embedding technique. Using top 300 terms of two important sections, namely the 'Progress Note' section and 'Admitting Diagnosis' section, the model showed relatively better predictive performance. Then ML model which utilizes a subset of terms from two sections, about 110 terms, achieved the best AUC score, of 0.973 ± 0.006, which was significantly better compared to the Padua's performance of 0.791 ± 0.022. Terms found by the model showed their potential to help clinicians explore new risk factors. CONCLUSIONS: In this study, a new VTE risk assessment model based on ontologies extraction from raw medical records is developed and its performance is verified on real clinical dataset. Results of selected terms can help clinicians to discover meaningful risk factors.
Ähnliche Arbeiten
"Why Should I Trust You?"
2016 · 14.789 Zit.
Coding Algorithms for Defining Comorbidities in ICD-9-CM and ICD-10 Administrative Data
2005 · 10.555 Zit.
A Comprehensive Survey on Graph Neural Networks
2020 · 8.989 Zit.
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8.598 Zit.
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8.124 Zit.