This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Can Modern NLP Systems Reliably Annotate Chest Radiography Exams? A Pre-Purchase Evaluation and Comparative Study of Solutions from AWS, Google, Azure, John Snow Labs, and Open-Source Models on an Independent Pediatric Dataset
Citations: 0
Authors: 6
Year: 2025
Abstract
This study compares four commercial clinical NLP tools (Amazon Comprehend Medical, Google Healthcare NLP, Azure Clinical NLP, and SparkNLP) alongside the dedicated radiograph labelers CheXpert and CheXbert for pediatric chest radiograph (CXR) report labeling. Using 95,008 pediatric CXR reports from a large academic hospital, we extracted entities and assertion statuses (positive, negative, uncertain) from findings and impressions, mapped them to 13 categories (12 disease categories and a No Findings category), and compared performance using Fleiss' kappa and accuracy against a pseudo-ground truth. Entity extraction varied widely: SparkNLP extracted 49,688 unique entities, Azure 31,543, AWS 27,216, and Google 16,477. Assertion accuracy ranged from 50% (AWS) to 76% (SparkNLP), while CheXpert and CheXbert achieved 56%. These results reveal substantial performance variability, underscoring the need for validation and careful review before deploying NLP tools for pediatric clinical report labeling.
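The abstract's agreement statistic, Fleiss' kappa, generalizes Cohen's kappa to more than two annotators. A minimal sketch of the standard formula is below; the input layout (rows as reports, columns as label categories, cell values as how many of the compared labelers assigned that category) is an illustrative assumption, not the study's actual data format.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters assigning subject i to category j.

    Assumes every subject was rated by the same number of raters
    (e.g. each report labeled by all NLP tools under comparison).
    """
    N = len(counts)           # number of subjects (e.g. CXR reports)
    n = sum(counts[0])        # raters per subject (e.g. labelers compared)
    k = len(counts[0])        # number of categories (e.g. assertion statuses)

    # Observed agreement: mean per-subject pairwise agreement
    P_i = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts]
    P_bar = sum(P_i) / N

    # Chance agreement from the marginal category proportions
    p_j = [sum(row[j] for row in counts) / (N * n) for j in range(k)]
    P_e = sum(p * p for p in p_j)

    return (P_bar - P_e) / (1 - P_e)
```

For example, `fleiss_kappa([[3, 0], [0, 3]])` (three raters, two subjects, unanimous on each) yields 1.0, the maximum agreement; values near 0 indicate chance-level agreement, which is why the paper pairs kappa with raw accuracy against the pseudo-ground truth.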
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,260 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,116 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,493 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,438 citations