This is an overview page with metadata for this scientific work. The full article is available from the publisher.

Flexible Imputation of Missing Data

2018 · 722 citations · Journal of Statistical Software · Open Access

Abstract

Missingness is a commonly occurring phenomenon in many applications. Determining a suitable analytical approach in the absence of complete observations is a major focus of scientific inquiry because of the extra complexity that missing data introduce. Incompleteness generally complicates the statistical analysis in terms of reduced statistical power, biased parameter estimates, and degraded confidence intervals, and thereby may lead to false inferences. Developments in computational statistics have produced flexible missing-data procedures with a sound statistical basis. One of these procedures is multiple imputation (MI), a stochastic simulation technique in which the missing values are replaced by m > 1 simulated versions. Subsequently, each of the simulated complete data sets is analyzed by standard methods, and the results are combined into a single inferential statement that formally incorporates missing-data uncertainty into the modeling process. MI has gained widespread acceptance and popularity in the last few decades. It has some well-accepted advantages: First, MI allows researchers to use conventional models and software; an imputed data set may be analyzed by literally any method that would be appropriate if the data were complete. As computing environments and statistical models grow increasingly complex, the value of using familiar methods and software becomes more pronounced. Second, there are still many classes of problems for which no direct maximum likelihood procedure is available. Even when such a procedure exists, MI can be more attractive because separating the imputation phase from the analysis phase lends greater flexibility to the entire process. Lastly, MI singles out missing data as a source of random variation distinct from ordinary sampling variability. Van Buuren's work is one of the few books that focus exclusively on MI.
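
The combining step described above can be sketched in a few lines of Python. The abstract does not name the combining rules; this sketch assumes the standard ones commonly called Rubin's rules, in which the total variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance. The function name `pool_rubin` and its inputs are hypothetical and are not part of the mice package.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors.

    Hypothetical helper: `estimates` holds one estimate per imputed data
    set, `variances` the matching squared standard errors.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need m > 1 estimates with matching variances")
    q_bar = mean(estimates)          # pooled point estimate
    u_bar = mean(variances)          # within-imputation variance
    b = variance(estimates)          # between-imputation variance (divides by m - 1)
    t = u_bar + (1 + 1 / m) * b      # total variance of the pooled estimate
    return q_bar, u_bar, b, t
```

The between-imputation term is what separates missing-data uncertainty from ordinary sampling variability: it is zero only if every imputed data set yields the same estimate.
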
The book can be regarded as an extended tutorial on the practical application of the R package mice, which is available on CRAN (Comprehensive R Archive Network). The name of the package stands for "multiple imputation by chained equations". This MI technique (also known as fully conditional specification, FCS) is built upon a series of univariate models. Sequential regression equations are specified in a cyclic fashion, by taking the type and nature of a variable given others (e.g., logistic model for a binary variable, proportional odds model for
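
The cyclic idea behind chained equations can be illustrated with a deliberately small Python sketch. It is not the mice implementation: it assumes only two numeric variables, uses a plain linear regression for each, and does not draw the regression parameters from their posterior, which a proper MI procedure would do. The names `fit_line` and `impute_pair` are hypothetical, and each variable is assumed to have at least two distinct observed values.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Least-squares slope, intercept, and residual standard deviation."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    sigma = (rss / max(len(x) - 2, 1)) ** 0.5
    return slope, intercept, sigma


def impute_pair(x, y, n_cycles=10, seed=0):
    """Return one imputed copy of two numeric lists; None marks a missing value."""
    rng = random.Random(seed)
    cols = [list(x), list(y)]
    miss = [[v is None for v in c] for c in cols]
    # Initialise each gap with a random draw from that column's observed values.
    for c, m in zip(cols, miss):
        observed = [v for v in c if v is not None]
        for i, is_missing in enumerate(m):
            if is_missing:
                c[i] = rng.choice(observed)
    for _ in range(n_cycles):
        for target in (0, 1):  # cycle through the variables in turn
            pred = cols[1 - target]
            # Fit the univariate model on rows where the target was observed.
            rows = [i for i, is_missing in enumerate(miss[target]) if not is_missing]
            slope, intercept, sigma = fit_line(
                [pred[i] for i in rows], [cols[target][i] for i in rows]
            )
            # Replace only the originally missing entries: prediction plus noise.
            for i, is_missing in enumerate(miss[target]):
                if is_missing:
                    cols[target][i] = intercept + slope * pred[i] + rng.gauss(0.0, sigma)
    return cols[0], cols[1]
```

Calling `impute_pair` with m different seeds would give m completed data sets. Swapping the linear model for one suited to each variable's type is where the flexibility of the approach comes from.
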

Author

Hakan Demirtaş

Topics

Machine Learning and Data Classification
Machine Learning in Healthcare
