OpenAlex · Updated hourly · Last updated: 18 March 2026, 00:10

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Comparative evaluation of generative artificial intelligence models for synthetic knee radiograph augmentation in clinical research

2026 · 0 citations · BMC Medical Imaging · Open Access

0 citations · 10 authors · 2026
Abstract

In this study, the capability of state-of-the-art generative models to synthesize realistic knee radiographs was evaluated to address dataset scarcity in osteoarthritis (OA) research. Three generative frameworks were trained on 10,042 real knee X-rays: Style Generative Adversarial Network 3 (StyleGAN3), a Stable Diffusion + Cycle-consistent Generative Adversarial Network (CycleGAN) pipeline, and Deep Convolutional Generative Adversarial Network (DCGAN). Image quality was assessed using Fréchet Inception Distance (FID), while visual fidelity was evaluated via a Visual Turing Test conducted by two orthopedic surgeons and a musculoskeletal radiologist. For anatomical fidelity, the Joint Line Convergence Angle (JLCA) was compared between real and synthetic images. Inter- and intra-observer reliability for JLCA was measured using intraclass correlation coefficients (ICC). StyleGAN3 achieved the best performance (FID 10.84; lower is better), showing high visual and anatomical fidelity. The Stable Diffusion + CycleGAN pipeline achieved a moderate FID of 39.79, suggesting that adversarial enhancements improved the diffusion-based synthesis. DCGAN showed lower quality, with an FID of 74.15. Expert accuracy in distinguishing real from synthetic images ranged between 36% and 88%, confirming difficulty in visual differentiation. JLCA measurements showed no significant difference between real (4.19 ± 3.07°) and synthetic (3.36 ± 2.19°) images generated by DCGAN (p = 0.12). Similarly, Diffusion + CycleGAN (3.91 ± 2.59° vs. 3.72 ± 2.52°, p = 1.00) and StyleGAN3 (4.27 ± 3.01° vs. 3.60 ± 2.37°, p = 0.25) showed no statistically significant differences. These results indicate that all evaluated generative models maintained high anatomical fidelity relative to real radiographs. Inter-observer agreement was strong, with ICC values ranging between 0.83 and 0.97. Intra-observer reliability was also excellent. StyleGAN3 generated the most realistic knee radiographs. Diffusion-based pipelines showed promising results when enhanced with adversarial networks. These findings underscore the potential of generative AI to mitigate data limitations in orthopedic research.
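The abstract reports inter-observer ICC values for the JLCA measurements but does not say which ICC form was used. As an illustration only, the sketch below assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement) computed from two-way ANOVA mean squares. The function name `icc_2_1` and the angle values are hypothetical and are not taken from the study.

```python
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` holds one row per image and one column per observer.
    The choice of ICC(2,1) is an assumption; the abstract does not name the form.
    """
    n = len(ratings)       # number of images (subjects)
    k = len(ratings[0])    # number of observers (raters)
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete n x k table with n >= 2 and k >= 2")

    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares: images, observers, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical JLCA readings in degrees: 5 images, 3 observers.
example_jlca = [
    [4.1, 4.3, 4.0],
    [2.0, 2.4, 2.1],
    [6.5, 6.2, 6.6],
    [3.3, 3.1, 3.4],
    [5.0, 5.2, 4.9],
]
print(round(icc_2_1(example_jlca), 3))
```

The sketch does not handle the degenerate case where every reading is identical, which makes the denominator zero. A consistency-type ICC would omit the observer term in the denominator, so the form actually used in the paper should be checked against the full text before comparing numbers.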

Similar works