Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
HIOC: a hybrid imputation method to predict missing values in medical datasets
24
Zitationen
3
Autoren
2021
Jahr
Abstract
Purpose Decision support systems developed using machine learning classifiers have become a valuable tool in predicting various diseases. However, the performance of these systems is adversely affected by the missing values in medical datasets. Imputation methods are used to predict these missing values. In this paper, a new imputation method called hybrid imputation optimized by the classifier (HIOC) is proposed to predict missing values efficiently. Design/methodology/approach The proposed HIOC is developed by using a classifier to combine multivariate imputation by chained equations (MICE), K nearest neighbor (KNN), mean and mode imputation methods in an optimum way. Performance of HIOC has been compared to MICE, KNN, and mean and mode methods. Four classifiers support vector machine (SVM), naive Bayes (NB), random forest (RF) and decision tree (DT) have been used to evaluate the performance of imputation methods. Findings The results show that HIOC performed efficiently even with a high rate of missing values. It had reduced root mean square error (RMSE) up to 17.32% in the heart disease dataset and 34.73% in the breast cancer dataset. Correct prediction of missing values improved the accuracy of the classifiers in predicting diseases. It increased classification accuracy up to 18.61% in the heart disease dataset and 6.20% in the breast cancer dataset. Originality/value The proposed HIOC is a new hybrid imputation method that can efficiently predict missing values in any medical dataset.
Ähnliche Arbeiten
Biostatistical Analysis
1996 · 35.449 Zit.
UCI Machine Learning Repository
2007 · 24.319 Zit.
An introduction to ROC analysis
2005 · 20.888 Zit.
Prediction of Coronary Heart Disease Using Risk Factor Categories
1998 · 9.596 Zit.
The use of the area under the ROC curve in the evaluation of machine learning algorithms
1997 · 7.166 Zit.