OpenAlex · Updated hourly · Last updated: 7 May 2026, 07:55

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Hybrid ReGex and Natural Language Inference Model as a Zero-Shot Classifier for Extracting Data From Medical Reports

2025 · 0 citations · JCO Clinical Cancer Informatics

Citations: 0 · Authors: 7 · Year: 2025

Abstract

PURPOSE: This study presents a new method based on regular expressions (ReGex) and artificial intelligence for extracting relevant medical data from clinical reports. This hybrid approach is designed to address the limitations of each technique. The pipeline is evaluated for its effectiveness in extracting key clinical information from prostate cancer medical reports.

METHODS: We developed a hybrid pipeline that combines ReGex for initial data extraction with a Natural Language Inference model for classification. This approach was retrospectively applied to 1,000 reports randomly selected among all consultation reports of patients with prostate cancer treated at the institute, focusing on identifying key clinical information such as rectal bleeding, dysuria, pollakiuria, and hematuria. The model's performance was evaluated using precision, recall, accuracy, F1-score, and Cohen's kappa coefficient.

RESULTS: The pipeline demonstrated high performance, with precision scores ranging from 0.778 to 0.954 and recall consistently high at 0.920 to 1.00. F1-scores indicated balanced accuracy across symptoms, and Cohen's kappa values (0.871 to 0.951) reflected strong agreement with physician-labeled data.

CONCLUSION: The proposed pipeline is both efficient and fast while being computationally lightweight. It achieves high accuracy in extracting medical data from clinical reports, making it an effective and practical tool for clinical research and health care applications.
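The two-stage idea in the abstract (regular expressions to locate candidate text, then a Natural Language Inference step to decide whether the symptom is actually present) and the reported metrics can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the keyword patterns, the hypothesis template, the 0.5 threshold, and especially `toy_entailment`, which is a negation-cue placeholder standing in for a real NLI model so the example runs without extra dependencies.

```python
import re
from typing import Callable, Dict, List

# Hypothetical keyword patterns; the abstract does not give the real expressions.
SYMPTOM_PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "rectal bleeding": re.compile(r"rectal\s+bleeding", re.IGNORECASE),
    "dysuria": re.compile(r"dysuri", re.IGNORECASE),
    "pollakiuria": re.compile(r"pollakiuri", re.IGNORECASE),
    "hematuria": re.compile(r"h(?:a)?ematuri", re.IGNORECASE),
}

# An NLI scorer maps (premise, hypothesis) to an entailment probability.
NliScorer = Callable[[str, str], float]


def split_sentences(report: str) -> List[str]:
    """Naive sentence splitter on terminal punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", report) if s.strip()]


def candidate_sentences(report: str, symptom: str) -> List[str]:
    """Stage 1: keep only sentences whose text matches the symptom pattern."""
    pattern = SYMPTOM_PATTERNS[symptom]
    return [s for s in split_sentences(report) if pattern.search(s)]


def toy_entailment(premise: str, hypothesis: str) -> float:
    """Placeholder for a real NLI model (assumption): negation cues mean no entailment."""
    negated = re.search(r"\b(no|not|denies|without)\b", premise, re.IGNORECASE)
    return 0.05 if negated else 0.95


def classify(report: str, symptom: str,
             nli: NliScorer = toy_entailment, threshold: float = 0.5) -> bool:
    """Stage 2: a symptom is present if any candidate sentence entails the hypothesis."""
    hypothesis = f"The patient has {symptom}."  # assumed template
    return any(nli(s, hypothesis) >= threshold
               for s in candidate_sentences(report, symptom))


def evaluate(y_true: List[bool], y_pred: List[bool]) -> Dict[str, float]:
    """Precision, recall, accuracy, F1 and Cohen's kappa for binary labels."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    tn = sum((not t) and (not p) for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    n = len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / n
    # Chance agreement: product of marginal rates, summed over both classes.
    p_e = ((tp + fp) / n) * ((tp + fn) / n) + ((tn + fn) / n) * ((tn + fp) / n)
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 1.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy,
            "f1": f1, "kappa": kappa}


if __name__ == "__main__":
    report = "The patient reports dysuria since March. No hematuria was observed."
    for symptom in SYMPTOM_PATTERNS:
        print(symptom, classify(report, symptom))
```

In a real setting, `toy_entailment` would be replaced by a call to a pretrained NLI model, with the regex stage keeping the number of model calls small; that division of labor is one plausible reading of why the abstract describes the pipeline as computationally lightweight.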

Similar works