OpenAlex · Updated hourly · Last updated: 2 May 2026, 10:33

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Identifying and handling data bias within primary healthcare data using synthetic data generators

2024 · 23 citations · Heliyon · Open Access

Citations: 23 · Authors: 4 · Year: 2024

Abstract

Advanced synthetic data generators can simulate data samples that closely resemble sensitive personal datasets while significantly reducing the risk of individual identification. The use of these advanced generators holds enormous potential in the medical field, as it allows for the simulation and sharing of sensitive patient data. This enables the development and rigorous validation of novel AI technologies for accurate diagnosis and efficient disease management. Despite the availability of massive ground truth datasets (such as UK-NHS databases that contain millions of patient records), the risk of biases being carried over to data generators still exists. These biases may arise from the under-representation of specific patient cohorts due to cultural sensitivities within certain communities or standardised data collection procedures. Machine learning models can exhibit bias in various forms, including the under-representation of certain groups in the data. This can lead to missing data and inaccurate correlations and distributions, which may also be reflected in synthetic data. Our paper aims to improve synthetic data generators by introducing probabilistic approaches to first detect difficult-to-predict data samples in ground truth data and then boost them when applying the generator. In addition, we explore strategies to generate synthetic data that can reduce bias and, at the same time, improve the performance of predictive models.
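The detect-then-boost idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: it assumes that some probabilistic classifier has already produced, for each ground-truth record, the predicted probability of its true label, and it treats records below a hypothetical threshold as difficult to predict. The function names, the threshold of 0.5, and the boost factor of 3.0 are all illustrative assumptions. Boosting is approximated here by weighted resampling of the data that a generator would then be fitted on.

```python
import random


def flag_hard_samples(true_class_probs, threshold=0.5):
    """Return indices of records whose predicted probability for the
    true label falls below `threshold` (an assumed cut-off)."""
    return [i for i, p in enumerate(true_class_probs) if p < threshold]


def boost_weights(n_samples, hard_indices, boost=3.0):
    """Give every record weight 1.0, and flagged records weight `boost`
    (an assumed factor)."""
    weights = [1.0] * n_samples
    for i in hard_indices:
        weights[i] = boost
    return weights


def resample(records, weights, k, seed=0):
    """Draw k records with replacement, proportionally to their weights."""
    rng = random.Random(seed)
    return rng.choices(records, weights=weights, k=k)


# Toy data: four records and hypothetical true-label probabilities.
records = ["a", "b", "c", "d"]
true_class_probs = [0.9, 0.2, 0.8, 0.4]

hard = flag_hard_samples(true_class_probs)
weights = boost_weights(len(records), hard)
training_set = resample(records, weights, k=1000)
```

In this toy setup, records "b" and "d" are flagged and therefore appear more often in `training_set`, so a generator trained on it would see the hard cases more frequently than in the original data.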

Topics

Machine Learning in Healthcare · Medical Coding and Health Information · Insurance, Mortality, Demography, Risk Management