OpenAlex · Updated hourly · Last updated: 09.04.2026, 05:02

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

From Automation to Strategy: Managing AI Uncertainty in Meta-analysis with Context Engineering (Preprint)

2025 · 0 citations · Open Access
Open full text at publisher

Citations: 0 · Authors: 6 · Year: 2025

Abstract

<sec> <title>BACKGROUND</title> The application of Large Language Models (LLMs) to systematic reviews and meta-analyses promises to accelerate evidence synthesis. However, prior research has focused on automating discrete tasks, such as abstract screening, and by concentrating primarily on performance metrics it has failed to validate a reliable end-to-end workflow. Moreover, LLMs introduce new forms of AI-enabled uncertainty, challenging traditional validation metrics and creating a need for new management strategies. </sec>
<sec> <title>OBJECTIVE</title> This study aims to propose and validate a novel strategic framework, "Context Engineering," designed to navigate the uncertainties of LLM-driven research and manage an LLM to perform a reliable end-to-end meta-analysis. </sec>
<sec> <title>METHODS</title> We designed a five-layer (Instruction, Knowledge, Tool, History, Formatting) context engineering framework to enable ChatGPT-5 to automate a meta-analysis, managing the LLM's workflow from literature search to statistical synthesis while ensuring methodological rigor. We benchmarked the framework's performance by tasking the LLM with replicating a previously published meta-analysis. </sec>
<sec> <title>RESULTS</title> The LLM pipeline included 19 final studies, demonstrating a low recall of 27.5% compared to the 40 studies in the reference meta-analysis. However, despite the divergent study cohorts, the pooled OR for non-advanced adenoma was nearly identical between the LLM pipeline and the original study (1.46 vs 1.45, respectively). This outcome, a high-fidelity result despite low screening recall, presents a novel finding in contrast to prior literature focused solely on screening performance. Critically, for advanced adenoma, the LLM produced a more conservative estimate (OR 1.70 vs 2.06), a finding consistent with the visibly greater symmetry of its corresponding funnel plot. This suggests a potentially lower risk of publication bias in the LLM-selected evidence base. Furthermore, the identification of 8 unique studies missed by the original review reinforces that, despite a lower recall, the pipeline's overall process led to a robust and accurate final synthesis. </sec>
<sec> <title>CONCLUSIONS</title> The strategic implication of this study is that by managing LLMs with a structured framework like context engineering, their inherent uncertainties can be navigated to produce reliable and potentially more robust final results. Our work is the first to demonstrate that, with a structured approach, an LLM can function as an independent research agent capable of producing these trustworthy outcomes, moving beyond its role as a simple assistant tool. This methodology has the potential to accelerate the pace of evidence synthesis in medicine. </sec>
<sec> <title>CLINICALTRIAL</title> Not Applicable (this study was a methodological evaluation designed to replicate a previously published meta-analysis (PROSPERO: CRD42022308533) using a large language model). </sec>
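The statistical synthesis step described in METHODS, pooling study-level odds ratios into a single estimate such as the reported pooled OR of 1.46, is conventionally done on the log-odds scale with inverse-variance weights. The sketch below illustrates that standard calculation under a fixed-effect assumption; the study data are hypothetical placeholders, and this is not the paper's actual pipeline code.

```python
import math

def pooled_or(studies):
    """Inverse-variance (fixed-effect) pooling of odds ratios.

    Each study is a (odds_ratio, ci_lower, ci_upper) tuple with a 95% CI.
    The SE of each log-OR is recovered from the CI width, since a 95% CI
    spans 2 * 1.96 standard errors on the log scale.
    """
    num = den = 0.0
    for or_, lo, hi in studies:
        log_or = math.log(or_)
        se = (math.log(hi) - math.log(lo)) / (2 * 1.96)  # SE of log-OR
        w = 1.0 / se ** 2                                # inverse-variance weight
        num += w * log_or
        den += w
    return math.exp(num / den)  # back-transform pooled log-OR to an OR

# Hypothetical example studies: (OR, 95% CI lower, 95% CI upper)
studies = [(1.30, 0.90, 1.88), (1.60, 1.10, 2.33), (1.45, 1.00, 2.10)]
print(round(pooled_or(studies), 2))
```

In practice a random-effects model (e.g. DerSimonian-Laird) would typically be used when between-study heterogeneity is expected; it extends this calculation by inflating each study's variance with a heterogeneity estimate before weighting.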

Topics

Artificial Intelligence in Healthcare and Education · Radiomics and Machine Learning in Medical Imaging · Explainable Artificial Intelligence (XAI)