This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Using a large language model artificial intelligence agent to improve the efficiency of clinical quality measure evidence evaluation: a case study
Citations: 0
Authors: 4
Year: 2026
Abstract
OBJECTIVES: To evaluate the feasibility and performance of a large language model (LLM)-based artificial intelligence (AI) agent, implemented within a structured Claim-Argument-Evidence System (CAES), for supporting the review of clinical quality measure (CQM) evidence in the Centers for Medicare & Medicaid Services Consensus-Based Entity (CBE) endorsement process.

METHODS: The CBE conducted a pilot study using a previously endorsed measure. CAES extracted claims and citations from a submitted diagnostic performance measure for pneumonia, automatically retrieved additional relevant evidence from PubMed abstracts, and assessed the quality, confidence, and agreement of evidence supporting each claim. The system's assessments were compared with the judgement of a subject matter expert (SME).

RESULTS: CAES completed the assessment in approximately 5 hours. The SME agreed with the CAES-assigned claim statuses for 69% of claims, was neutral for 11%, and disagreed for 14%. Disagreements primarily stemmed from the need for contextual interpretation beyond abstracts.

DISCUSSION: Manual evaluation of CQM evidence requires significant time and resources, estimated at over 2400 labour hours per review cycle, limiting efficiency and transparency. The AI agent evaluated 64 claims and 355 claim-evidence pairs related to the pneumonia diagnosis measure. It assigned claim statuses based on evidence strength and generated justifications.

CONCLUSION: This pilot demonstrated the feasibility and potential of LLM-based AI agents to improve the efficiency and transparency of evidence review for CQMs. Further development is needed to incorporate additional data sources and extend applicability across the measure development lifecycle.
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,611 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,504 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,025 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,835 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations