This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Comparative evaluation of generative artificial intelligence models for synthetic knee radiograph augmentation in clinical research
0
Citations
10
Authors
2026
Year
Abstract
In this study, the capability of state-of-the-art generative models to synthesize realistic knee radiographs was evaluated to address dataset scarcity in osteoarthritis (OA) research. Three generative frameworks were trained on 10,042 real knee X-rays: Style Generative Adversarial Network 3 (StyleGAN3), a Stable Diffusion + Cycle-Consistent Generative Adversarial Network (CycleGAN) pipeline, and Deep Convolutional Generative Adversarial Network (DCGAN). Image quality was assessed using Fréchet Inception Distance (FID), while visual fidelity was evaluated via a Visual Turing Test conducted by two orthopedic surgeons and a musculoskeletal radiologist. Joint Line Convergence Angle (JLCA) was compared between real and synthetic images to assess anatomical fidelity. Inter- and intra-observer reliability for JLCA was measured using intraclass correlation coefficients (ICC). StyleGAN3 achieved the best performance (FID 10.84), showing high visual and anatomical fidelity. The Stable Diffusion + CycleGAN pipeline achieved a moderate FID of 39.79, suggesting that adversarial enhancement improved the diffusion-based synthesis. DCGAN showed lower quality, with an FID of 74.15. Expert accuracy in distinguishing real from synthetic images ranged from 36% to 88%, confirming difficulty in visual differentiation. Furthermore, JLCA measurements showed no significant difference between real (4.19 ± 3.07°) and synthetic (3.36 ± 2.19°) images generated by DCGAN (p = 0.12). Similarly, Diffusion + CycleGAN (3.91 ± 2.59° vs. 3.72 ± 2.52°, p = 1.00) and StyleGAN3 (4.27 ± 3.01° vs. 3.60 ± 2.37°, p = 0.25) showed no statistically significant differences. These results indicate that all evaluated generative models maintained high anatomical fidelity relative to real radiographs. Inter-observer agreement was strong, with ICC values ranging from 0.83 to 0.97. Intra-observer reliability was also excellent. StyleGAN3 generated the most realistic knee radiographs. Diffusion-based pipelines showed promising results when enhanced with adversarial networks. These findings underscore the potential of generative AI to mitigate data limitations in orthopedic research.
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,245 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,102 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,468 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,429 citations