OpenAlex · Updated hourly · Last updated: March 29, 2026, 23:17

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Large Language Models in Randomized Controlled Trials Design (Preprint)

2024 · 0 citations · Open Access

Citations: 0
Authors: 7
Year: 2024

Abstract

BACKGROUND
Randomized controlled trials (RCTs) face challenges such as limited generalizability, insufficient recruitment diversity, and high failure rates, often due to restrictive eligibility criteria and inefficient patient selection. Large language models (LLMs) have shown promise in various clinical tasks, but their potential role in RCT design remains underexplored.

OBJECTIVE
This study investigates the ability of LLMs, specifically GPT-4-Turbo-Preview, to assist in designing RCTs that enhance generalizability and recruitment diversity and reduce failure rates, while maintaining clinical safety and ethical standards.

METHODS
We conducted a non-interventional, observational study analyzing 20 parallel-arm RCTs, comprising 10 completed and 10 ongoing studies published after January 2024 to mitigate pretraining biases. The LLM was tasked with generating RCT designs based on input criteria, including eligibility, recruitment strategies, interventions, and outcomes. The accuracy of LLM-generated designs was quantitatively assessed by comparing them to clinically validated ground truth data from ClinicalTrials.gov. Qualitative assessments were performed using Likert scale ratings (1–3) for domains such as safety, accuracy, objectivity, pragmatism, inclusivity, and diversity.

RESULTS
The LLM achieved an overall accuracy of 72% in replicating RCT designs. Recruitment and intervention designs demonstrated high agreement with the ground truth, achieving 88% and 93% accuracy, respectively. However, LLMs showed lower accuracy in designing eligibility criteria (55%) and outcomes measurement (53%). Qualitative evaluations showed that LLM-generated designs scored above 2 points across all domains, indicating strong clinical alignment. In particular, LLMs enhanced diversity and pragmatism, which are key factors in improving RCT generalizability and addressing failure rates.

CONCLUSIONS
LLMs, such as GPT-4-Turbo-Preview, have demonstrated potential in improving RCT design, particularly in recruitment and intervention planning, while enhancing generalizability and addressing diversity. However, expert oversight and regulatory measures are essential to ensure patient safety and ethical standards. The findings support further integration of LLMs into clinical trial design, although continued refinement is necessary to address limitations in eligibility and outcomes measurement.
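The per-domain accuracies reported in the results are consistent with the 72% overall figure if the four design domains are weighted equally. A minimal sketch of that consistency check (equal weighting across domains is an assumption; the abstract does not state how the overall score was aggregated):

```python
# Per-domain accuracy figures reported in the RESULTS section.
domain_accuracy = {
    "recruitment": 0.88,
    "intervention": 0.93,
    "eligibility": 0.55,
    "outcomes_measurement": 0.53,
}

# Assumption: the overall score is an unweighted mean across the four domains.
overall = sum(domain_accuracy.values()) / len(domain_accuracy)
print(f"Overall accuracy: {overall:.0%}")  # prints "Overall accuracy: 72%"
```

Under this equal-weighting assumption the mean is 72.25%, which rounds to the reported 72%; a weighting by number of criteria per domain could give a different figure.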

Topics

Radiomics and Machine Learning in Medical Imaging · Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare