This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Analysis of Eligibility Criteria Clusters Based on Large Language Models for Clinical Trial Design
Citations: 0
Authors: 8
Year: 2024
Abstract
Objectives: Clinical trials (CTs) are essential for improving patient care by evaluating the safety and efficacy of new treatments. A key component of CT protocols is the study population defined by the eligibility criteria. This study aims to evaluate the effectiveness of large language models (LLMs) in encoding eligibility criterion information to support CT protocol design.
Materials and Methods: We extracted eligibility criterion sections, phases, conditions, and interventions from CT protocols available in the ClinicalTrials.gov registry. Eligibility sections were split into individual rules using a criterion tokenizer and embedded using LLMs. The resulting representations were clustered. The quality and relevance of the clusters for protocol design were evaluated through three experiments: intrinsic alignment with protocol information and human expert assessment of cluster coherence, extrinsic evaluation through CT-level classification tasks, and eligibility section generation.
Results: Sentence embeddings fine-tuned on biomedical corpora produce the clusters with the highest alignment to CT-level information. Human expert evaluation confirms that the clusters are well-structured and coherent. Despite the high information compression, the clusters retain significant CT information, reaching up to 97% of the classification performance obtained with raw embeddings. Finally, eligibility sections automatically generated using clusters achieve 95% of the ROUGE scores obtained with a generative LLM.
Conclusions: We show that clusters derived from sentence-level LLM embeddings are effective in summarizing complex eligibility criterion data while retaining relevant CT protocol details. Clustering-based approaches provide a scalable enhancement to CT design that balances information compression with accuracy.
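The split, embed, and cluster steps described in the abstract can be illustrated with a toy sketch. This is not the authors' pipeline: the abstract does not name the criterion tokenizer, the embedding model, or the clustering algorithm, so everything here is an assumption. The line-based splitter, the bag-of-words vectors (a stand-in for LLM sentence embeddings), the greedy threshold clustering, and the names `split_criteria`, `embed`, `cosine`, and `cluster` are all hypothetical and chosen only to keep the example self-contained.

```python
import math
import re
from collections import Counter


def split_criteria(section):
    """Split an eligibility section into rules, one per non-empty line.

    Assumption: one rule per line; the paper's tokenizer is not described here.
    """
    rules = []
    for line in section.splitlines():
        rule = line.strip(" \t-*•")  # drop bullet markers and padding
        if rule:
            rules.append(rule)
    return rules


def embed(rule):
    """Toy bag-of-words vector; a stand-in for an LLM sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", rule.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(rules, threshold=0.5):
    """Greedy clustering (assumed, not the paper's method).

    Each rule joins the first cluster whose seed rule is similar enough;
    otherwise it starts a new cluster.
    """
    clusters = []  # list of (seed_vector, member_rules)
    for rule in rules:
        vec = embed(rule)
        for seed, members in clusters:
            if cosine(vec, seed) >= threshold:
                members.append(rule)
                break
        else:
            clusters.append((vec, [rule]))
    return [members for _, members in clusters]


# Hypothetical eligibility section, invented for illustration only.
section = """
- Age 18 years or older
- Age 65 years or older
- Pregnant or breastfeeding women
- Women who are pregnant or breastfeeding
"""

groups = cluster(split_criteria(section))
for i, members in enumerate(groups):
    print(i, members)
```

With this input, the two age rules fall into one group and the two pregnancy rules into another. A realistic setup would replace `embed` with a sentence-embedding model and `cluster` with a standard clustering algorithm; the sketch only shows how individual criteria are reduced to a small set of cluster labels.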
Similar works
"Why Should I Trust You?"
2016 · 14,227 citations
A Comprehensive Survey on Graph Neural Networks
2020 · 8,601 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,116 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,493 citations
Artificial intelligence in healthcare: past, present and future
2017 · 4,387 citations