This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Data-centric approach to uncover and mitigate representation bias in AI segmentation of esophageal tumors on CT

2026 · 0 citations · European Journal of Radiology Artificial Intelligence · Open Access


Abstract

Accurate CT-based segmentation of primary esophageal tumors is critical for disease management. While AI models show promise for automated segmentation, the underrepresentation of certain subgroups may limit their robustness and generalizability. However, it remains unclear which factors contribute to reduced model effectiveness. In this retrospective study, segmentation models using nnU-Net were trained on baseline CT images. We systematically excluded tumors based on histological subtype and anatomical location to assess the impact on segmentation performance, measured by Dice similarity coefficient (DSC) and tumor detection rate. Additionally, we explored the effect of including intravenous contrast-enhanced scans in the training set. A total of 275 patients were included: 71 from the Erasmus University Medical Center for training and internal testing, and 204 from the Memorial Sloan Kettering Cancer Center for external testing. Excluding squamous cell carcinoma tumors from the training set significantly reduced average DSC for these lesions on the external test set (p < 0.0001; Cohen’s d = 0.79), while results for adenocarcinoma remained stable (p > 0.1; Cohen’s d = -0.03). Similarly, excluding mid and upper esophageal tumors decreased segmentation performance (p < 0.01; Cohen’s d = 0.90 and 0.72 for mid and upper tumors, respectively). Finally, incorporating contrast-enhanced CT improved performance with a large effect size compared to a model trained only on non-contrast scans, with a mean DSC increase of 39% on contrast-enhanced images. Underrepresentation of histological subtypes, tumor locations, and imaging protocols in training data introduces representation bias that can considerably reduce segmentation performance. Expanding the training set to include these factors helps mitigate this bias and improves the robustness and generalizability of AI models.

• Training data diversity strongly impacts esophageal tumor segmentation generalizability.
• Underrepresentation of subgroups introduces representation bias and reduces AI model performance.
• Including diverse tumor locations, histologies and imaging protocols improves model robustness.
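
The abstract reports results as Dice similarity coefficients (DSC) and Cohen's d effect sizes. The following is a minimal sketch of how these two quantities are commonly computed, not the authors' evaluation code. It assumes binary masks flattened to sequences of 0/1 and uses the pooled-standard-deviation form of Cohen's d for two independent samples; the study may have used a different variant. The function names `dice` and `cohens_d`, the empty-mask convention, and all numbers below are illustrative assumptions, not study data.

```python
from math import sqrt
from statistics import mean, stdev


def dice(pred, truth):
    """Dice similarity coefficient between two binary masks.

    Both masks are flat sequences of 0/1 values of equal length
    (an assumption of this sketch; real masks are 3D volumes).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Convention chosen for this sketch: two empty masks agree perfectly.
        return 1.0
    return 2.0 * overlap / total


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Toy values only: per-case DSC for a hypothetical model trained with a
# subgroup included versus the same subgroup excluded.
dsc_included = [0.80, 0.75, 0.70, 0.85]
dsc_excluded = [0.60, 0.55, 0.65, 0.50]

print(dice([1, 1, 0, 0], [1, 0, 0, 0]))
print(cohens_d(dsc_included, dsc_excluded))
```

A positive d here means the first group has the higher mean DSC, which matches the sign convention suggested by the abstract (a positive d where exclusion reduced performance).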

Topics

Esophageal Cancer Research and Treatment
Radiomics and Machine Learning in Medical Imaging
Artificial Intelligence in Healthcare and Education
