This is an overview page with metadata for this scientific work. The full article is available from the publisher.
VP34.14: Machine learning algorithms in ultrasound: quality assurance metrics for annotations used in training
Citations: 0
Authors: 11
Year: 2020
Abstract
To reduce annotator bias in fetal ultrasound artificial intelligence (AI) models, variation in ultrasound annotation between sonologists needs to be determined. Our aim was to develop metrics to quantify this variability and establish baseline intra- and inter-annotator agreement. Two experienced sonologists (A1 and A2) annotated individual image frames from fetal ultrasound videos. Each frame was manually labelled by drawing bounding boxes around anatomical features (11 labels). Pairwise comparisons were undertaken between annotations performed 2 weeks apart (intra-annotator agreement) and between sonologists (inter-annotator agreement). Percentage agreement and pooled kappa values were calculated for the following metrics: exact matching of labels on a frame-by-frame basis; Intersection over Union (IoU) between bounding box areas of >50% for each label; and the number of matching frames labelled with each anatomical feature. We annotated 18,717 frames from 20 videos. Intra-annotator agreement was high, with exact matching of labels on a frame-by-frame basis in 84.0% for A1 and 91.9% for A2. The average IoU (>50%) was 94.6% and 96.8% of annotations for A1 and A2, respectively. Agreement in the number of matching frames labelled with each anatomical feature for A1 and A2 was 91.8% and 95.7% (pooled kappa 0.93 (95% CI 0.93-0.94) and 0.96 (0.96-0.97)), respectively, demonstrating excellent intra-annotator agreement. Inter-annotator agreement was also excellent: 71.4% of frames had exactly matching labels and IoU (>50%) was 92.9%. The number of matching frames labelled with each anatomical feature was 76.9% (pooled kappa 0.86 (0.84-0.87)), demonstrating high inter-annotator agreement. We propose a framework to quantify agreement between sonologists when manually annotating ultrasound videos, and report intra- and inter-annotator agreement using these tests. These estimates of agreement can be used as a guide for acceptable levels of agreement.
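The abstract's bounding-box agreement criterion (IoU above 50% between two annotators' boxes for the same label) can be sketched in a few lines. This is a minimal illustration of the standard IoU computation, not the authors' actual pipeline; the box format `(x_min, y_min, x_max, y_max)` and the helper names are assumptions for this example.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes yield zero intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def boxes_agree(box_a, box_b, threshold=0.5):
    """Agreement criterion from the abstract: IoU > 50% (hypothetical helper)."""
    return iou(box_a, box_b) > threshold

# Two boxes sharing half their area: IoU = 50 / 150 ≈ 0.33, below the threshold.
print(boxes_agree((0, 0, 10, 10), (5, 0, 15, 10)))  # → False
```

Percentage agreement for a label would then be the fraction of frame pairs for which `boxes_agree` returns `True`.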
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,456 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,332 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,779 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,533 citations