This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Assessing data extraction in randomized clinical trials with large language models
Citations: 0
Authors: 9
Year: 2026
Abstract
Data extraction is an essential step in evidence synthesis but remains time-consuming and prone to human error. Large language models (LLMs) such as ChatGPT-4 and Claude 3 Opus may offer partial automation solutions. This proof-of-concept study evaluated their preliminary performance in extracting data from full-text randomized controlled trial (RCT) reports within systematic reviews. Two previously validated systematic reviews published in European Urology (105 trials in total) were selected. Standardized prompts were created and optimized with ChatGPT-4 and tested independently across trials using both ChatGPT-4 (paid version) and Claude 3 Opus via the standard web interface. Each prompt was executed three times, and only the first output was used to calculate the proportion of correctly extracted items (Pacc). Extracted data were compared with verified gold-standard datasets. For binary outcomes, ChatGPT-4 and Claude 3 Opus showed high accuracy in group size extraction (Pacc: 91%-94%) and moderate accuracy for event counts (Pacc: 57%-71%). For continuous outcomes, group size accuracy was moderate (Pacc: 59%-69%), while mean and standard deviation extraction was poor (Pacc: 24%-56%). Test-retest reliability was substantial to almost perfect. Current LLMs can assist in automating data extraction for binary outcomes but remain inconsistent for continuous outcomes. These preliminary results should be interpreted with caution. Further research using larger datasets and iterative prompt refinement is needed before LLMs can be reliably integrated into systematic review workflows.
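The abstract's central metric, Pacc, is the proportion of extracted items that match a verified gold-standard dataset. A minimal sketch of that comparison is shown below; the field names and the `pacc` helper are illustrative assumptions, not the authors' actual pipeline.

```python
# Hypothetical sketch: computing the proportion of correctly extracted
# items (Pacc) by comparing LLM output against a gold-standard dataset.
# Field names are illustrative, not taken from the study.

def pacc(extracted: dict, gold: dict) -> float:
    """Proportion of gold-standard items whose extracted value matches exactly."""
    if not gold:
        raise ValueError("gold-standard dataset is empty")
    correct = sum(1 for key, value in gold.items() if extracted.get(key) == value)
    return correct / len(gold)

# Example: group sizes and event counts for two trial arms (binary outcome)
gold = {"n_treatment": 52, "n_control": 53,
        "events_treatment": 12, "events_control": 18}
llm_output = {"n_treatment": 52, "n_control": 53,
              "events_treatment": 12, "events_control": 17}

print(f"Pacc = {pacc(llm_output, gold):.0%}")  # 3 of 4 items match -> 75%
```

In the study, Pacc was computed per item type (e.g., group sizes vs. event counts), which is why accuracy ranges such as 91%-94% and 57%-71% are reported separately.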
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,200 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,051 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,416 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,410 citations
Authors
Institutions
- Peking University (CN)
- Wuhan University (CN)
- Zhongnan Hospital of Wuhan University (CN)
- Hubei University of Science and Technology (CN)
- Qianjiang Central Hospital (CN)
- Hubei Provincial Center for Disease Control and Prevention (CN)
- Unidad de Cirugía Artroscópica (ES)
- Hubei University of Technology (CN)
- Temple University (US)