This is an overview page with metadata for this scholarly article. The full article is available from the publisher.
Evaluating Text-to-Image Generated Photorealistic Images of Human Anatomy
Citations: 0
Authors: 9
Year: 2024
Abstract
Background
Generative AI models that can produce photorealistic images from text descriptions have many applications in medicine, including medical education and synthetic data generation. However, it can be challenging to evaluate and compare their heterogeneous outputs, so a systematic approach enabling image and model comparisons is needed.

Methods
We develop an error classification system for annotating errors in AI-generated photorealistic images of humans and apply our method to a corpus of 240 images generated with three different models (DALL-E 3, Stable Diffusion XL, and Stable Cascade) using 10 prompts with 8 images per prompt. The error classification system identifies five error types with three severities across five anatomical regions and specifies an associated quantitative scoring method based on aggregated proportions of errors per expected count of anatomical components in the generated image. We assess inter-rater agreement by double-annotating 25% of the images and calculating Krippendorff's alpha, and we compare results across the three models and ten prompts quantitatively using a cumulative score per image.

Findings
The error classification system, accompanying training manual, generated image collection, annotations, and all associated scripts are available from our GitHub repository at https://github.com/hastingslab-org/ai-human-images. Inter-rater agreement was relatively poor, reflecting the subjectivity of the error classification task. Model comparisons revealed that DALL-E 3 performed consistently better than Stable Diffusion; however, the latter generated images reflecting more diversity in personal attributes. Images with groups of people were more challenging for all the models than images of individuals or pairs, and some prompts were challenging for all models.

Interpretation
Our method enables systematic comparison of AI-generated photorealistic images of humans; our results can serve to catalyse improvements in these models for medical applications.

Funding
This study received support from the University of Zurich's Digital Society Initiative and the Swiss National Science Foundation under grant agreement 209510.

Research in context

Evidence before this study
The authors searched PubMed and Google Scholar for publications evaluating text-to-image model outputs for medical applications between 2014 (when generative adversarial networks first became available) and 2024. While the bulk of evaluations focused on task-specific networks generating single types of medical image, a few evaluations emerged exploring the novel general-purpose text-to-image diffusion models more broadly for applications in medical education and synthetic data generation. However, no previous work has attempted to develop a systematic approach to evaluating these models' representations of human anatomy.

Added value of this study
We present an anatomical error classification system, the first systematic approach to evaluating AI-generated images of humans that enables model and prompt comparisons. We apply our method to a corpus of generated images to compare the state-of-the-art large-scale model DALL-E 3 with two models from the Stable Diffusion family.

Implications of all the available evidence
While our approach enables systematic comparisons, it remains limited by subjectivity and is labour-intensive for images with many represented figures. Future research should explore automating aspects of the evaluation through coupled segmentation and classification models.
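The per-image scoring described in the Methods section (aggregated proportions of errors per expected count of anatomical components) can be illustrated with a minimal Python sketch. This is a hypothetical reconstruction from the abstract alone, not the authors' published scoring code: the function name, the dictionary data structures, and the choice to sum region-wise proportions into a cumulative score are all assumptions.

```python
def image_error_score(errors_by_region, expected_components):
    """Hypothetical cumulative error score for one generated image.

    errors_by_region: dict mapping anatomical region -> number of
        annotated errors in that region.
    expected_components: dict mapping anatomical region -> expected
        count of anatomical components for that region in the image.

    Assumption: each region contributes the proportion of errors per
    expected component count, and region proportions are summed into
    a single cumulative per-image score (higher = worse).
    """
    score = 0.0
    for region, expected in expected_components.items():
        if expected > 0:
            score += errors_by_region.get(region, 0) / expected
    return score
```

For example, an image with one error among two expected head components and two errors among four expected hand components would receive a cumulative score of 0.5 + 0.5 = 1.0 under these assumptions.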
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,200 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,051 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,416 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,410 citations