This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Are AI-generated electrocardiograms clinically accurate? Benchmarking accuracy of AI-generated ECGs: a multiplatform performance study of public LLMs
Citations: 0
Authors: 12
Year: 2026
Abstract
Background: The use of generative AI to simulate electrocardiograms (ECGs) is expanding in medical education and digital cardiology. However, the diagnostic accuracy of ECGs produced by publicly accessible AI platforms has not been systematically evaluated. This study assessed whether synthetic ECGs generated by general-purpose AI services can accurately represent pre-specified arrhythmias.
Purpose: To evaluate the diagnostic accuracy and interpretability of ECGs generated by three widely available public AI services when prompted to simulate specific cardiac rhythms.
Methods: Bard, Bing Image Creator, and DALL-E were each prompted to generate ECG strips for ten common cardiac rhythms: sinus rhythm, sinus tachycardia, sinus bradycardia, atrial fibrillation, atrial flutter, ventricular tachycardia, ventricular fibrillation, complete heart block, supraventricular tachycardia, and asystole. Each platform produced four ECGs per rhythm (n=120). After excluding duplicates and non-ECG outputs (n=7), 113 ECGs remained. Three blinded physicians, including a cardiologist, independently reviewed each image and attempted to diagnose the rhythm. Discrepancies were resolved via adjudication. Accuracy was defined as agreement between the prompted rhythm and final expert consensus. Results were stratified by platform and rhythm.
Results: Only 37 of 113 ECGs (32.7%) accurately matched the intended rhythm. Additionally, 25.1% were uninterpretable due to graphical artefacts, physiologically implausible tracings, or distorted morphology. Bard produced the highest proportion of correct ECGs (84.5%) but primarily retrieved existing online images. Bing and DALL-E achieved rhythm-matched outputs in only 12.5% and 10% of cases, respectively. Atrial flutter (58.3%) and atrial fibrillation (50%) were the most accurately generated rhythms.
Conclusion: Synthetic ECGs generated by public AI tools demonstrate poor and inconsistent diagnostic accuracy. While Bard produced more rhythm-matched images, these were often retrieved rather than generated. These findings highlight the current limitations of publicly available generative AI for ECG simulation and support the need for domain-specific models before integration into clinical education.
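The accuracy definition and the headline counts in the abstract can be re-derived with a few lines of arithmetic. The Python sketch below is illustrative only and is not the authors' analysis code: the function name `accuracy` and the toy label lists are hypothetical, and only the design figures (3 platforms, 10 rhythms, 4 ECGs each, 7 exclusions, 37 matches) come from the abstract.

```python
from fractions import Fraction


def accuracy(prompted, consensus):
    """Share of images whose adjudicated rhythm equals the prompted rhythm.

    Hypothetical helper: mirrors the abstract's definition of accuracy as
    agreement between the prompted rhythm and the final expert consensus.
    """
    if len(prompted) != len(consensus):
        raise ValueError("label lists must have the same length")
    if not prompted:
        raise ValueError("no images to score")
    matches = sum(p == c for p, c in zip(prompted, consensus))
    return Fraction(matches, len(prompted))


# Design figures reported in the abstract.
platforms, rhythms, per_rhythm = 3, 10, 4
generated = platforms * rhythms * per_rhythm  # ECGs requested in total
analysed = generated - 7                      # after excluding duplicates and non-ECG outputs
overall = Fraction(37, analysed)              # rhythm-matched ECGs / analysed ECGs

# Toy labels, invented purely to exercise the helper (not study data).
toy_prompted = ["atrial flutter", "asystole", "sinus rhythm", "atrial fibrillation"]
toy_consensus = ["atrial flutter", "uninterpretable", "sinus rhythm", "sinus tachycardia"]
toy = accuracy(toy_prompted, toy_consensus)

print(f"{generated} generated, {analysed} analysed")
print(f"overall accuracy: {float(overall):.1%}")
print(f"toy accuracy: {float(toy):.1%}")
```

Using `Fraction` keeps the counts exact until the final formatting step, so the rounded percentage can be compared directly with the figure quoted in the Results.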
Similar works
A Real-Time QRS Detection Algorithm
1985 · 7,622 citations
An Overview of Heart Rate Variability Metrics and Norms
2017 · 6,371 citations
Power Spectrum Analysis of Heart Rate Fluctuation: A Quantitative Probe of Beat-to-Beat Cardiovascular Control
1981 · 5,055 citations
The impact of the MIT-BIH Arrhythmia Database
2001 · 4,497 citations
Decreased heart rate variability and its association with increased mortality after acute myocardial infarction
1987 · 3,988 citations