This is an overview page with metadata for this scientific work. The full article is available from the publisher.
ACCELERATING PATHOLOGY REPORT DIGITIZATION: A MULTI-ENGINE OCR AND LLM FRAMEWORK FOR HEALTHCARE APPLICATIONS
Citations: 0
Authors: 1
Year: 2026
Abstract
Digitization and structuring of pathology reports are essential in modern healthcare for enhancing patient care, data analytics, and medical research. This study presents a framework called Dual-integrated Text Extraction using Hybrid OCR Engines (DiText-OCR), which leverages multiple OCR tools and domain-specific dictionaries to accurately digitize diverse text types, including printed text and low-quality scans. The extracted text is further processed using Large Language Models (LLMs) for named entity recognition, relationship extraction, and data structuring. The resulting structured data are integrated into healthcare databases and systems, enabling applications in clinical decision support, research, and analytics while ensuring interoperability. Despite its effectiveness, the framework faces challenges, such as handling non-standard report formats, maintaining patient privacy, and addressing the current limitations of OCR and LLM technologies in medical contexts. Future research aims to integrate this system with electronic health records, extend its application to other medical documents, and utilize structured data for advanced research and predictive analytics. By addressing these challenges, the proposed framework has the potential to revolutionize medical data management, ultimately improving patient outcomes, enhancing clinical efficiency, and fostering innovation in healthcare.
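The abstract does not specify how DiText-OCR combines the outputs of its OCR engines, so the following is only a minimal sketch of one plausible approach, under stated assumptions: two engines return token-aligned (text, confidence) pairs, the higher-confidence token wins, and the winner is snapped to a small domain dictionary by fuzzy matching. The function names, the sample dictionary, and the 0.8 similarity cutoff are all hypothetical and are not taken from the paper.

```python
from difflib import get_close_matches

# Hypothetical domain dictionary; a real one would be far larger.
DOMAIN_TERMS = ["carcinoma", "biopsy", "margin", "invasive", "ductal"]


def pick_token(candidates):
    """Return the text of the highest-confidence (text, confidence) pair."""
    return max(candidates, key=lambda c: c[1])[0]


def correct_token(token, vocabulary, cutoff=0.8):
    """Snap a token to the closest dictionary term if similar enough.

    Matching is done on the lowercased token; unmatched tokens are
    returned unchanged.
    """
    matches = get_close_matches(token.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def merge_ocr_outputs(engine_a, engine_b, vocabulary=DOMAIN_TERMS):
    """Merge two token-aligned OCR outputs, then apply dictionary correction.

    Assumes both engines produced the same number of tokens in the same
    order, which real OCR output does not guarantee.
    """
    merged = []
    for cand_a, cand_b in zip(engine_a, engine_b):
        best = pick_token([cand_a, cand_b])
        merged.append(correct_token(best, vocabulary))
    return " ".join(merged)


# Made-up sample outputs with typical OCR confusions (l/I, m/rn, i/1).
engine_a = [("invasive", 0.95), ("ductaI", 0.60), ("carcinorna", 0.55)]
engine_b = [("invas1ve", 0.70), ("ductal", 0.58), ("carcinoma", 0.50)]
result = merge_ocr_outputs(engine_a, engine_b)
print(result)
```

In this sketch the dictionary step matters because the higher-confidence engine is not always the correct one: the second and third tokens are chosen from the engine that misread them, and the fuzzy match against the dictionary repairs them. The corrected text would then be passed on to the LLM stage for entity and relationship extraction, which is not shown here.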