This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Using Elicit AI research assistant for data extraction in systematic reviews: a feasibility study across environmental and life sciences
Citations: 4
Authors: 7
Year: 2025
Abstract
Data extraction in systematic reviews, maps, and meta-analyses is time-consuming and prone to human error or subjective judgment. Large language models offer potential for automating this process, yet their performance has been evaluated in only a limited range of platforms, disciplines, and review types. We assessed the performance of the Elicit platform across diverse data extraction tasks using journal articles from seven systematic-like reviews in the life and environmental sciences. Human-extracted data served as the gold standard. For each review, we used eight articles for prompt development and another eight for testing. Initial prompts were iteratively refined until they exceeded 87% accuracy, for up to five rounds. We then tested extraction accuracy, reproducibility across user accounts, and the effect of Elicit’s high-accuracy mode. Of the 90 prompts considered, 70 exceeded the 87% accuracy threshold when compared with gold-standard values, but accuracy tended to be lower when they were tested on a new set of articles. Repeating data extractions with different Elicit user accounts resulted in 90% agreement on extracted values, though supporting quotes and reasoning matched in only 46% and 30% of cases, respectively. In high-accuracy mode, value matches dropped to 77%, with just 10% quote matches and 0% reasoning matches. Extraction accuracy did not differ by data type. Elicit also helped identify eight (<1%) errors in the gold-standard data. Our results show that Elicit can complement, but not replace, human data extractors. Elicit may be best used as a secondary reviewer and as a way to evaluate the clarity of data extraction protocols. Prompts must be fine-tuned and independently validated.
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,445 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,325 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,761 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,530 citations