This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Capability of ChatGPT to support the screening process of scoping reviews: A feasibility study
3
Citations
2
Authors
2024
Year
Abstract
Background
The time-consuming screening phase in health-related evidence syntheses is increasingly supported by artificial intelligence (AI). However, scoping reviews have not benefited as much as systematic reviews from such AI tools, as they use conceptual rather than keyword-specific search strategies to address broad research questions. Context-understanding chatbots based on large language models could potentially enhance the efficiency of scoping review screenings. This study evaluates the performance of ChatGPT against an open-access AI tool used for abstract screening in systematic reviews, as well as the costs involved.

Methods
Leveraging data from a prior scoping review, we compared the performance of ChatGPT 4.0 and 3.5 against Rayyan, using human researchers' decisions as a benchmark. A random set of 50 included and 50 excluded abstracts was used to train Rayyan's algorithm and to develop ChatGPT's prompt. ChatGPT 4.0's evaluation was repeated after 5-7 days to assess response consistency. We computed performance metrics including sensitivity, specificity, and accuracy.

Results
ChatGPT 4.0 and 3.5 achieved 68% accuracy, 11% precision, 99% negative predictive value, and 67% specificity. Sensitivity was high at 88-89% for ChatGPT 4.0 and 84% for ChatGPT 3.5. ChatGPT 4.0 showed substantial interrater reliability between the two evaluations and moderate reliability compared to ChatGPT 3.5. The cost of deployment varied: Rayyan was free, ChatGPT 3.5 cost $9.06, and ChatGPT 4.0 cost $505.72.

Conclusions
Given the exponential increase in publications, effective mechanisms to support the screening phase of scoping reviews are needed. Our feasibility study using ChatGPT to decide on abstracts' inclusion or exclusion achieved good performance metrics. Given further positive evaluation, such chatbots might be incorporated into the standard screening process, possibly replacing a second screener, saving time and costs, and accelerating evidence synthesis.
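The metrics reported above (sensitivity, specificity, precision, negative predictive value, accuracy) all derive from a standard screening confusion matrix of true/false positives and negatives. A minimal sketch of these calculations; the counts below are hypothetical for illustration and are not the study's actual data:

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute abstract-screening performance metrics from a confusion matrix.

    tp: relevant abstracts correctly flagged for inclusion
    fp: irrelevant abstracts wrongly flagged for inclusion
    tn: irrelevant abstracts correctly excluded
    fn: relevant abstracts wrongly excluded
    """
    return {
        "sensitivity": tp / (tp + fn),               # share of relevant abstracts caught
        "specificity": tn / (tn + fp),               # share of irrelevant abstracts excluded
        "precision": tp / (tp + fp),                 # positive predictive value
        "npv": tn / (tn + fn),                       # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical example: 89 of 100 relevant abstracts flagged,
# 670 of 1000 irrelevant abstracts correctly excluded.
m = screening_metrics(tp=89, fp=330, tn=670, fn=11)
```

With these hypothetical counts, sensitivity is 0.89 and specificity 0.67; note how a low precision can coexist with a high negative predictive value when relevant abstracts are rare, which matches the general pattern of screening tasks.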
Key messages
• ChatGPT performs well in supporting screening for scoping reviews, outperforming a traditional AI tool at reasonable costs.
• Employing chatbots like ChatGPT could potentially cut costs and time in scoping reviews.
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,292 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,143 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,539 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,452 citations