OpenAlex · Updated hourly · Last updated: 20.03.2026, 06:16

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Accuracy and efficiency of using artificial intelligence for data extraction in systematic reviews. A noninferiority study within reviews

2026 · 0 citations · Open Access

Citations: 0 · Authors: 10 · Year: 2026

Abstract

Background: Systematic reviews are important for informing public health policies and program selection; however, they are time- and resource-intensive. Artificial intelligence (AI) offers a way to reduce these labour-intensive requirements for various aspects of systematic review production, including data extraction. To date, there is limited robust evidence evaluating the accuracy and efficiency of AI for data extraction. This study within a review (SWAR) aimed to determine whether human data extraction assisted by an AI research assistant (Elicit®) is noninferior to human-only data extraction in terms of accuracy (i.e. agreement) and time-to-completion. Secondary aims included comparing error types and costs.

Methods: A two-arm noninferiority SWAR was conducted to compare AI-assisted and human-only data extraction from 50 RCTs of chronic disease interventions. Participants were randomised to extract all data required for conducting a review, using either the AI-assisted or the human-only method. Accuracy was assessed using a three-point rubric by an independent assessor blinded to group allocation, based on agreement between the extracted data and the assessor. Accuracy scores were standardized to a 0–100 scale. Analysis included overall and subgroup accuracy (by data group and data type) using paired t-tests. Time-to-completion was self-reported by data extractors. Errors were coded by type and severity, and costs were calculated for data extraction, preparation of files, training and the Elicit® Pro subscription.

Results: There was no difference in overall accuracy between the AI-assisted and human-only arms (mean difference (MD) 0.57 on a 0–100 scale, 95% confidence interval (CI) -1.29, 2.43). Subgroup analysis by data group found AI-assisted extraction to be more accurate than human-only extraction for data variables describing ‘intervention and control group’ (MD 4.75, 95% CI 2.13, 7.38), but otherwise no subgroup differences were observed. AI-assisted data extraction was significantly faster (MD 24.82 mins, 95% CI 18.80, 30.84). Error types were similar between arms (missed or omitted data: AI-assisted 3.6%, human-only 3.4%), as was error severity (minor errors: AI-assisted 6.7%, human-only 6.5%), and the AI-assisted arm cost $181.98 less than human-only data extraction across the 50 studies.

Conclusion: AI-assisted data extraction using Elicit® showed noninferior accuracy, faster completion times, similar error types and severity, and lower costs compared to human-only extraction. These efficiency gains, without loss in accuracy, suggest that AI-assisted data extraction can replace one human-only data extractor in future systematic reviews of RCTs. Future research should explore different models of AI data extraction, such as two AI-assisted extractors or an AI-only extractor paired with a human-only extractor, and should compare AI-assisted with AI-only extraction.
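
The noninferiority comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis code. It assumes the mean difference is defined as AI-assisted minus human-only, replaces the t distribution with a normal approximation to stay dependency-free, and uses a hypothetical noninferiority margin of 5 points, since the abstract does not report the margin. The function names `paired_ci` and `is_noninferior` are illustrative only.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def paired_ci(ai_scores, human_scores, level=0.95):
    """Mean paired difference (AI-assisted minus human-only) with a CI.

    Assumption: a normal approximation stands in for the t distribution
    used by a paired t-test, so the interval is slightly too narrow for
    small samples.
    """
    diffs = [a - h for a, h in zip(ai_scores, human_scores)]
    md = mean(diffs)
    se = stdev(diffs) / sqrt(len(diffs))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return md, md - z * se, md + z * se


def is_noninferior(ci_lower, margin):
    """Noninferiority holds if the lower CI bound stays above -margin."""
    return ci_lower > -margin


# Overall accuracy result reported in the abstract: MD 0.57, 95% CI -1.29, 2.43.
# HYPOTHETICAL_MARGIN is an assumption for illustration, not a value from the study.
HYPOTHETICAL_MARGIN = 5.0
print(is_noninferior(-1.29, HYPOTHETICAL_MARGIN))
```

Under these assumptions, the reported lower bound of -1.29 would sit above any margin larger than 1.29 points, which is consistent with the noninferiority conclusion stated in the abstract.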
