This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Ontology-based venous thromboembolism risk assessment model developing from medical records

2019 · 32 citations · BMC Medical Informatics and Decision Making · Open Access

Abstract

BACKGROUND: The Padua linear model is widely used to assess the risk of venous thromboembolism (VTE), a common but preventable complication in inpatients. However, genetic and environmental differences between Western and Chinese populations limit the validity of the Padua model in Chinese patients. Medical records, which contain rich information about disease progression, are useful for mining new risk factors related to Chinese VTE patients. Furthermore, machine learning (ML) methods provide new opportunities to build precise risk prediction models by automatically selecting risk factors from the original medical records.

METHODS: Medical records of 3,106 inpatients, including 224 VTE patients, were collected, and various types of ontologies were integrated to parse the unstructured text. A workflow for an ontology-based VTE risk prediction model that combines natural language processing (NLP) and ML technologies was proposed. First, ontology terms were extracted from the medical records and sorted according to their calculated weights. Next, the importance of each term was evaluated at the section level, and finally an ML model was built on a subset of terms. Four ML methods were tested, and the best model was chosen by comparing the area under the receiver operating characteristic curve (AUROC).

RESULTS: Medical records were first split into sections, and the terms from each section were then sorted by weights calculated from multiple types of information. A greedy selection algorithm was used to identify significant sections and terms. The top terms in each section were selected to construct distributed representations of patients using a word embedding technique. Using the top 300 terms of two important sections, namely the 'Progress Note' section and the 'Admitting Diagnosis' section, the model showed relatively better predictive performance. An ML model that uses a subset of about 110 terms from these two sections then achieved the best AUROC, 0.973 ± 0.006, significantly better than the Padua model's 0.791 ± 0.022. The terms found by the model showed potential to help clinicians explore new risk factors.

CONCLUSIONS: In this study, a new VTE risk assessment model based on ontology extraction from raw medical records was developed, and its performance was verified on a real clinical dataset. The selected terms can help clinicians discover meaningful risk factors.
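
The abstract names a greedy selection step scored by AUROC but does not spell out the procedure. The sketch below is one plausible reading, not the authors' implementation: it walks through terms in weight order and keeps a term only if adding it raises the AUROC. The scorer (`score_patients`, a simple count of matched terms), the function names, and the toy records are all assumptions made for illustration; the paper itself uses word embeddings and trained ML models.

```python
from typing import List, Sequence, Set, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def score_patients(records: List[Set[str]], terms: List[str]) -> List[float]:
    """Toy scorer (assumption): number of selected terms present in each record."""
    return [float(sum(t in rec for t in terms)) for rec in records]


def greedy_select(
    records: List[Set[str]],
    labels: Sequence[int],
    ranked_terms: List[str],
    min_gain: float = 1e-9,
) -> Tuple[List[str], float]:
    """Visit terms in weight order; keep a term only if it improves AUROC."""
    selected: List[str] = []
    best = 0.5  # AUROC of an uninformative scorer
    for term in ranked_terms:
        candidate = auroc(labels, score_patients(records, selected + [term]))
        if candidate > best + min_gain:
            selected.append(term)
            best = candidate
    return selected, best


# Hypothetical toy data: each record is the set of ontology terms found in it.
records = [
    {"immobility", "fever"},
    {"immobility", "cough"},
    {"fracture", "fever"},
    {"cough"},
    {"fever"},
    {"cough", "fever"},
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = VTE, 0 = no VTE (made up for the example)
ranked_terms = ["immobility", "fever", "fracture", "cough"]  # assumed weight order

selected, best = greedy_select(records, labels, ranked_terms)
print(selected, best)  # ['immobility', 'fracture'] 1.0
```

On real data the AUROC would be estimated on held-out patients rather than on the records used for selection; otherwise the greedy step overfits the selected term set.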

Topics

Machine Learning in Healthcare; Biomedical Text Mining and Ontologies; Artificial Intelligence in Healthcare
