This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Human Researchers are Superior to Large Language Models in Writing a Systematic Review in a Comparative Multitask Assessment
0
Citations
8
Authors
2025
Year
Abstract
Background: The capability of Large Language Models (LLMs) to support and facilitate research activities has sparked growing interest in their integration into scientific workflows. This paper evaluates the performance of 6 different LLMs, compared with human researchers, in conducting the various tasks necessary to produce a systematic literature review.
Methods: The evaluation of the 6 LLMs was split into 3 tasks: literature search, article screening and selection (task 1); data extraction and analysis (task 2); and final paper drafting (task 3). Their results were compared with a human-produced systematic review on the same topic, which served as the reference standard. The evaluation was repeated in two rounds to assess reproducibility and the improvement of LLMs over time.
Results: Of the 18 scientific articles to be identified from the literature for task 1, the best LLM identified 13. Data extraction and analysis for task 2 was cumbersome and only partially accurate. The full papers generated by LLMs for task 3 were short and uninspiring, and often did not fully adhere to the standard template for a systematic review.
Conclusion: Currently, LLMs are not capable of independently conducting a scientific systematic review. However, their capabilities are advancing rapidly and, with appropriate supervision, they can provide valuable support throughout the review process.
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,239 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,095 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,463 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,428 citations