OpenAlex · Updated hourly · Last updated: 20.03.2026, 00:46

This is an overview page with metadata for this scientific work. The full article is available from the publisher.

Can Modern NLP Systems Reliably Annotate Chest Radiography Exams? A Pre-Purchase Evaluation and Comparative Study of Solutions from AWS, Google, Azure, John Snow Labs, and Open-Source Models on an Independent Pediatric Dataset

2025 · 0 citations · Open Access

Citations: 0

Authors: 6

Year: 2025

Abstract

This study compares four commercial clinical NLP tools - Amazon Comprehend Medical, Google Healthcare NLP, Azure Clinical NLP, and SparkNLP - alongside the dedicated radiograph labelers CheXpert and CheXbert for pediatric chest radiograph (CXR) report labeling. Using 95,008 pediatric CXR reports from a large academic hospital, we extracted entities and assertion statuses (positive, negative, uncertain) from findings and impressions, mapped them to 13 categories (12 disease categories and a No Findings category), and compared performance using Fleiss' Kappa and accuracy against a pseudo-ground truth. Entity extraction varied widely: SparkNLP extracted 49,688 unique entities, Azure 31,543, AWS 27,216, and Google 16,477. Assertion accuracy ranged from 50% (AWS) to 76% (SparkNLP), while CheXpert and CheXbert achieved 56%. Results reveal substantial performance variability, emphasizing the need for validation and careful review before deploying NLP tools for pediatric clinical report labeling.
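The abstract reports agreement between labelers via Fleiss' Kappa. As a minimal sketch of that metric (not the study's actual code; the category names and toy counts below are illustrative assumptions), Fleiss' Kappa can be computed from per-item category counts as follows:

```python
# Hedged sketch: Fleiss' kappa for chance-corrected agreement among
# multiple labelers, the metric named in the abstract. Data here is a
# toy example, not from the paper.

def fleiss_kappa(ratings):
    """ratings: one row per item, giving how many raters assigned the
    item to each category, e.g. [[3, 1, 0], ...]. Every row must sum
    to the same number of raters."""
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_cats = len(ratings[0])
    # Per-item observed agreement P_i
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_bar = sum(p_items) / n_items
    # Chance agreement P_e from marginal category proportions
    totals = [sum(row[j] for row in ratings) for j in range(n_cats)]
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals)
    return (p_bar - p_e) / (1 - p_e)

# Toy example: 4 reports rated by 4 labelers into 3 assertion classes
# (positive, negative, uncertain)
counts = [
    [4, 0, 0],  # unanimous
    [3, 1, 0],
    [0, 4, 0],  # unanimous
    [2, 1, 1],
]
print(round(fleiss_kappa(counts), 3))  # → 0.382
```

Values near 0 indicate agreement no better than chance, while values near 1 indicate near-perfect agreement, which is why the metric suits comparing several labelers over the same reports.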
