This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Enhancing Recall Using Data Cleaning for Biomedical Big Data

2020 · 4 citations

Abstract

In clinical practice, large amounts of heterogeneous medical data are generated daily. This data has the potential to be used for biomedical research and as a diagnostic reference for physicians. However, heterogeneous data must be integrated before it can be analyzed. The integration process includes a pre-processing data cleaning phase that eliminates inconsistencies and errors originating from each data source. In this paper, we describe a workflow for cleaning heterogeneous biomedical data sources. Our novel data cleaning approach can be applied to replace missing text and to increase the number of relevant cases retrieved by search queries. When the threshold for missing category replacement is met, our results show that our method achieves a missing content replacement precision of 85%, an improvement of 18% over the baseline state of our datasets.

Topics

Artificial Intelligence in Healthcare; Artificial Intelligence in Healthcare and Education; AI in cancer detection
