OpenAlex · Aktualisierung stündlich · Letzte Aktualisierung: 18.03.2026, 10:15

Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

Enhancing Diagnostic Precision: Utilising a Large Language Model to Extract U Scores from Thyroid Sonography Reports

2025·1 Zitationen·Studies in health technology and informaticsOpen Access
Volltext beim Verlag öffnen

1

Zitationen

8

Autoren

2025

Jahr

Abstract

This study evaluates the performance of ChatGPT-4, a Large Language Model (LLM), in automatically extracting U scores from free-text thyroid ultrasound reports collected from University Hospitals Birmingham (UHB), UK, between 2014 and 2024. The LLM was provided with guidelines on the U classification system and extracted U scores independently from 14,248 de-identified reports, without access to human-assigned scores. The LLM-extracted scores were compared to initial clinician-assigned and refined U scores provided by expert reviewers. The LLM achieved 97.7% agreement with refined human U scores, successfully identifying the highest U score in 98.1% of reports with multiple nodules. Most discrepancies (2.5%) were linked to ambiguous descriptions, multi-nodule reports, and cases with human-documented uncertainty. While the results demonstrate the potential for LLMs to improve reporting consistency and reduce manual workload, ethical and governance challenges such as transparency, privacy, and bias must be addressed before routine clinical deployment. Embedding LLMs into reporting workflows, such as Online Analytical Processing (OLAP) tools, could further enhance reporting quality and consistency.

Ähnliche Arbeiten