This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Confidence-linked and uncertainty-based staged framework for phenotype validation using large language models

2025 · 2 citations · Journal of the American Medical Informatics Association

Citations: 2 · Authors: 12 · Year: 2025

Abstract

OBJECTIVES: This study develops and validates the confidence-linked and uncertainty-based staged (CLUES) framework by integrating large language models (LLMs) with uncertainty quantification to assist manual chart review while ensuring reliability through a selective human review.

MATERIALS AND METHODS: The CLUES framework assesses stroke-related hospitalizations using imaging reports for 1739 patients across 24 Korean hospitals (2011-2022). Uncertainty was quantified via entropy from LLM-derived confidence values. Our framework operated in 3 stages: (1) zero-shot prompting with ensemble averaging, where high-uncertainty cases advanced to stage 2, (2) few-shot prompting using retrieved low-uncertainty cases, with remaining high-uncertainty cases proceeding to stage 3, and (3) manual chart review for final uncertain cases. Performance was evaluated against physician-labeled data using F1-score and Cohen's Kappa.

RESULTS: Among 1072 test cases, stage 1 classified 507 cases as low uncertainty, while 565 were high uncertainty. Stage 2 reclassified 280 cases as low uncertainty, leaving 285 for manual review. Low-uncertainty cases consistently outperformed high-uncertainty cases in both stages (weighted F1-scores: 0.94 vs 0.57 in stage 1 and 0.82 vs 0.58 in stage 2). The overall framework performance showed a progressive improvement in F1-scores from 0.840 (stage 1) to 0.878 (stage 2) to 0.955 (stage 3).

DISCUSSION: The CLUES framework reduced manual review burden by 75% while maintaining high accuracy. By integrating uncertainty quantification with selective human oversight, it provides an efficient and reliable approach to phenotype validation.

CONCLUSION: This framework demonstrates the effective integration of LLMs into clinical workflows while ensuring human oversight, enhancing both accuracy and efficiency.
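The staged routing described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract states only that uncertainty is an entropy computed from LLM-derived confidence values and that stage 1 uses ensemble averaging. The binary label, the use of Shannon entropy in bits, the 0.5-bit threshold, and the names `binary_entropy`, `is_low_uncertainty` and `resolve_stage` are all assumptions made here for illustration.

```python
import math
from statistics import fmean


def binary_entropy(p: float) -> float:
    """Shannon entropy in bits of a yes/no prediction with confidence p.

    Assumption: the phenotype label is binary. The abstract does not
    state the label space or the exact entropy formula.
    """
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully confident prediction carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def is_low_uncertainty(confidence: float, threshold: float = 0.5) -> bool:
    """Hypothetical cut-off; the paper's actual threshold is not given here."""
    return binary_entropy(confidence) <= threshold


def resolve_stage(stage1_confidences: list[float],
                  stage2_confidence: float,
                  threshold: float = 0.5) -> int:
    """Return the stage (1, 2 or 3) at which a case would be settled.

    Stage 1: average the confidences of several zero-shot runs (ensemble).
    Stage 2: a single few-shot confidence for cases stage 1 left uncertain.
    Stage 3: manual chart review for whatever is still uncertain.
    """
    if is_low_uncertainty(fmean(stage1_confidences), threshold):
        return 1
    if is_low_uncertainty(stage2_confidence, threshold):
        return 2
    return 3


# Case counts reported in the abstract's RESULTS section.
test_cases, stage1_low, stage2_low = 1072, 507, 280
manual_cases = test_cases - stage1_low - stage2_low  # 285 cases
manual_share = manual_cases / test_cases              # fraction sent to review

if __name__ == "__main__":
    print(resolve_stage([0.98, 0.95, 0.99], 0.97))  # confident ensemble
    print(resolve_stage([0.60, 0.40, 0.55], 0.96))  # settled by few-shot
    print(resolve_stage([0.60, 0.40, 0.55], 0.55))  # goes to manual review
    print(f"manual review share: {manual_share:.1%}")
```

With the reported counts, 285 of 1072 test cases (about 27%) reach manual review, which is broadly in line with the roughly 75% reduction in review burden stated in the abstract. A lower threshold would send more cases to later stages, trading review effort for reliability.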

Similar works