This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Exploring the Utility of Anonymized EHR Datasets in Machine Learning Experiments in the Context of the MODELHealth Project

2022 · 2 citations · Applied Sciences · Open Access

Abstract

This paper applies machine learning to a clinical dataset that was anonymized using the Mondrian algorithm. (1) Background: The increasing digitization of health data makes the preservation of patient privacy a necessity; however, the effect of data anonymization on the performance of machine learning models remains to be explored. (2) Methods: The original EHR-derived dataset was anonymized by applying the Mondrian algorithm for various k values and quasi-identifier (QI) attribute sets. Logistic regression, decision tree, k-nearest neighbors (KNN), Gaussian naive Bayes, and support vector machine models were applied to the different dataset versions. (3) Results: The classifiers demonstrated different degrees of resilience to the anonymization, with the decision tree and KNN models showing remarkably stable performance, as opposed to the Gaussian naive Bayes model. The choice of the QI set attributes and the generalized information loss value played a more important role than the size of the QI set or the k value. (4) Conclusions: Data anonymization can reduce the performance of certain machine learning models, although an appropriate selection of classifier and parameter values can mitigate this effect.
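
The anonymization step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified, greedy median-split variant of Mondrian that handles only numeric quasi-identifiers and ranks attributes by raw (not domain-normalized) range. The loss function follows one common definition of generalized information loss (mean normalized range width per QI cell), which is assumed here and may differ from the paper's. The records, the column layout (age, zip code, label), and all function names are hypothetical.

```python
from collections import Counter
from statistics import median


def mondrian_partitions(records, qi, k):
    """Greedy median-split partitioning (simplified Mondrian, numeric QIs only)."""
    if k < 1:
        raise ValueError("k must be at least 1")

    def width(part, i):
        values = [r[i] for r in part]
        return max(values) - min(values)

    def split(part):
        # Try the widest quasi-identifier first, then fall back to narrower ones.
        for i in sorted(qi, key=lambda i: width(part, i), reverse=True):
            m = median(r[i] for r in part)
            left = [r for r in part if r[i] <= m]
            right = [r for r in part if r[i] > m]
            if len(left) >= k and len(right) >= k:
                return split(left) + split(right)
        return [part]  # no allowable cut: this is a final equivalence class

    return split(list(records))


def generalize(partitions, qi):
    """Replace each QI value with the (min, max) range of its partition."""
    out = []
    for part in partitions:
        ranges = {i: (min(r[i] for r in part), max(r[i] for r in part)) for i in qi}
        out.extend(tuple(ranges.get(i, v) for i, v in enumerate(r)) for r in part)
    return out


def generalized_information_loss(anonymized, original, qi):
    """Mean normalized range width over all QI cells (assumed definition).

    0 means no generalization; 1 means every cell spans the full domain.
    """
    total = 0.0
    for i in qi:
        lo = min(r[i] for r in original)
        hi = max(r[i] for r in original)
        if hi == lo:
            continue  # a constant attribute cannot lose information
        total += sum((u - l) / (hi - lo) for l, u in (r[i] for r in anonymized))
    return total / (len(anonymized) * len(qi))


# Hypothetical toy records: (age, zip code, class label).
records = [
    (25, 10431, 0), (27, 10433, 1), (31, 10435, 0), (36, 10437, 0),
    (42, 10439, 1), (47, 10441, 1), (53, 10443, 0), (58, 10445, 1),
]
qi = (0, 1)  # age and zip code are treated as quasi-identifiers

for k in (2, 4):
    anonymized = generalize(mondrian_partitions(records, qi, k), qi)
    class_sizes = Counter(tuple(r[i] for i in qi) for r in anonymized)
    loss = generalized_information_loss(anonymized, records, qi)
    print(f"k={k}: smallest class={min(class_sizes.values())}, loss={loss:.3f}")
```

On this toy data the loss grows as k increases, because larger equivalence classes force wider ranges. Each anonymized version could then be fed to the classifiers listed in the abstract to compare their performance against the original data.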

Institutions

National Technical University of Athens (GR)

Topics

Privacy-Preserving Technologies in Data; Data Quality and Management; Artificial Intelligence in Healthcare and Education
