Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
Toward a Literature-Driven Definition of Big Data in Healthcare
209
Zitationen
4
Autoren
2015
Jahr
Abstract
OBJECTIVE: The aim of this study was to provide a definition of big data in healthcare. METHODS: A systematic search of PubMed literature published until May 9, 2014, was conducted. We noted the number of statistical individuals (n) and the number of variables (p) for all papers describing a dataset. These papers were classified into fields of study. Characteristics attributed to big data by authors were also considered. Based on this analysis, a definition of big data was proposed. RESULTS: A total of 196 papers were included. Big data can be defined as datasets with Log(n∗p) ≥ 7. Properties of big data are its great variety and high velocity. Big data raises challenges on veracity, on all aspects of the workflow, on extracting meaningful information, and on sharing information. Big data requires new computational methods that optimize data management. Related concepts are data reuse, false knowledge discovery, and privacy issues. CONCLUSION: Big data is defined by volume. Big data should not be confused with data reuse: data can be big without being reused for another purpose, for example, in omics. Inversely, data can be reused without being necessarily big, for example, secondary use of Electronic Medical Records (EMR) data.
Ähnliche Arbeiten
Biostatistical Analysis
1996 · 35.449 Zit.
UCI Machine Learning Repository
2007 · 24.319 Zit.
An introduction to ROC analysis
2005 · 20.871 Zit.
The use of the area under the ROC curve in the evaluation of machine learning algorithms
1997 · 7.165 Zit.
A method of comparing the areas under receiver operating characteristic curves derived from the same cases.
1983 · 7.077 Zit.