OpenAlex · Updated hourly · Last updated: 1 April 2026, 20:59

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Importance of stratified sampling for use in the development of training and test sets: medical imaging AI applications

Year: 2025 · Citations: 3 · Authors: 4

Abstract

The purpose of our study was to understand the importance of stratified sampling across multiple dataset characteristics (attributes) to yield appropriate training and test sets in developing and evaluating AI. Sampling algorithms are widely used to split data into training and test sets in AI model development. Datasets are often split so that disease classes are balanced between the training set and the test set. However, other patient characteristics, such as demographic attributes (age, race, ethnicity, and sex), can also be used for stratification. Here, we measured the similarity of subsets stratified on demographic attributes and disease classes. To do this, we built on our previous work using the Jensen-Shannon distance (JSD), a measure of similarity between two distributions in which lower values indicate greater similarity. Previously, we had measured the similarity across datasets in terms of separate demographic attributes and disease states. In this study, we used a multidimensional JSD score that incorporates multiple demographic attributes and disease state into a single score. We calculated JSD scores that allowed us to compare the similarity of the subsets produced by the stratified sampling algorithm for each attribute separately and for all attributes combined (i.e., multidimensional JSD). A secondary aim of our study was to validate this generalized stratified sampling algorithm, which is used to sequester images in the Medical Imaging and Data Resource Center (MIDRC) database. The third aim was to calculate an upper limit for the JSD score to calibrate our intuition on the performance of the stratified sampling algorithm as compared to random sampling. The multidimensional JSD was calculated using an aggregate method, which lists all possible combinations of attributes (demographic and disease state), counts the instances of each combination in each dataset, and compares the resulting distributions using the JSD.
Our results show that the multidimensional JSD scores ranged from 0.1843 to 0.2159 for random sampling and from 0.1468 to 0.1674 for stratified sampling. Because a lower JSD means greater similarity, this indicates that the stratified sampling framework yields training and test sets with a higher degree of similarity than random sampling. These results indicate the need for stratified sampling when training and testing AI.
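
The aggregate method described in the abstract can be illustrated with a minimal sketch. It is not the authors' or MIDRC's implementation. The function names, the record layout (one dict per case), and the use of base-2 logarithms (which bounds the distance between 0 and 1) are all assumptions made for illustration. Attribute combinations that occur in neither dataset contribute nothing to the score, so the sketch iterates only over combinations observed in at least one dataset.

```python
import math
from collections import Counter


def joint_counts(records, attributes):
    """Count cases for each observed combination of attribute values."""
    return Counter(tuple(record[a] for a in attributes) for record in records)


def jensen_shannon_distance(counts_p, counts_q):
    """Jensen-Shannon distance between two count tables.

    Assumption: base-2 logarithms, so the result lies in [0, 1].
    """
    total_p = sum(counts_p.values())
    total_q = sum(counts_q.values())
    divergence = 0.0
    # Combinations absent from both tables would add zero, so the union
    # of observed keys is sufficient.
    for key in set(counts_p) | set(counts_q):
        p = counts_p.get(key, 0) / total_p
        q = counts_q.get(key, 0) / total_q
        m = 0.5 * (p + q)  # mixture distribution
        if p > 0:
            divergence += 0.5 * p * math.log2(p / m)
        if q > 0:
            divergence += 0.5 * q * math.log2(q / m)
    # The distance is the square root of the divergence; clamp tiny
    # negative rounding errors before taking the root.
    return math.sqrt(max(divergence, 0.0))


def multidimensional_jsd(set_a, set_b, attributes):
    """Single JSD score over the joint distribution of all attributes."""
    return jensen_shannon_distance(
        joint_counts(set_a, attributes),
        joint_counts(set_b, attributes),
    )


# Hypothetical toy data: each case carries one demographic attribute
# and a disease state.
ATTRS = ["sex", "disease"]
train = [
    {"sex": "F", "disease": "pos"},
    {"sex": "F", "disease": "neg"},
    {"sex": "M", "disease": "pos"},
    {"sex": "M", "disease": "neg"},
]
test = [
    {"sex": "F", "disease": "pos"},
    {"sex": "M", "disease": "neg"},
    {"sex": "M", "disease": "neg"},
]

print("per-attribute JSD (sex):", multidimensional_jsd(train, test, ["sex"]))
print("multidimensional JSD:", multidimensional_jsd(train, test, ATTRS))
```

Passing a single attribute name gives the per-attribute score, while passing the full list gives the combined score, which mirrors the two kinds of comparison mentioned in the abstract. Identical subsets score 0, and subsets with no attribute combination in common score 1 under the base-2 assumption.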

Topics

Artificial Intelligence in Healthcare and Education