This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Automatic International Classification of Diseases Coding System: Deep Contextualized Language Model With Rule-Based Approaches
23
Citations
13
Authors
2022
Year
Abstract
BACKGROUND: The tenth revision of the International Classification of Diseases (ICD-10) is widely used for epidemiological research and health management. The clinical modification (CM) and procedure coding system (PCS) of ICD-10 were developed to describe more clinical details with increasing diagnosis and procedure codes and applied in disease-related groups for reimbursement. The expansion of codes made the coding time-consuming and less accurate. The state-of-the-art model using deep contextual word embeddings was used for automatic multilabel text classification of ICD-10. In addition to input discharge diagnoses (DD), the performance can be improved by appropriate preprocessing methods for the text from other document types, such as medical history, comorbidity and complication, surgical method, and special examination.
OBJECTIVE: This study aims to establish a contextual language model with rule-based preprocessing methods to develop the model for ICD-10 multilabel classification.
METHODS: [...] score and the micro area under the receiver operating characteristic curve were used to compare the model's performance with that of different preprocessing methods.
RESULTS: [...] score that significantly increased from 0.670 (95% CI 0.663-0.678) to 0.726 (95% CI 0.719-0.732) with a combination of discharge diagnoses, surgical methods, and keywords of special examination. With our preprocessing methods, the model had the highest area under the receiver operating characteristic curve of 0.853 (95% CI 0.849-0.855) and 0.831 (95% CI 0.827-0.834) for ICD-10-CM and ICD-10-PCS, respectively.
CONCLUSIONS: The performance of our model with the pretrained contextualized language model and rule-based preprocessing method is better than that of the state-of-the-art model for ICD-10-CM or ICD-10-PCS. This study highlights the importance of rule-based preprocessing methods based on coder coding rules.
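The abstract names the micro area under the receiver operating characteristic curve as an evaluation metric for multilabel classification. As a rough illustration only, and not the authors' implementation, micro-averaging can be sketched by pooling every (document, code) pair into one list and computing a single AUROC over the pooled pairs. The function name `micro_auroc` and its input layout are assumptions made for this sketch.

```python
from typing import Sequence


def micro_auroc(
    y_true: Sequence[Sequence[int]],
    y_score: Sequence[Sequence[float]],
) -> float:
    """Micro-averaged AUROC for multilabel output (hypothetical helper).

    y_true[i][j] is 1 if document i carries code j, else 0.
    y_score[i][j] is the model's score for that (document, code) pair.
    """
    # Pool every (document, code) pair into one flat binary problem.
    labels = [t for row in y_true for t in row]
    scores = [s for row in y_score for s in row]
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative pair")

    # Rank the pooled scores; tied scores share their average rank.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # Mann-Whitney U of the positives, normalized to the range [0, 1].
    rank_sum_pos = sum(r for r, t in zip(ranks, labels) if t == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)
```

Because all pairs are pooled before ranking, frequent codes contribute more pairs and therefore weigh more heavily than rare codes, which is the usual distinction between micro- and macro-averaging.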
Similar works
The meaning and use of the area under a receiver operating characteristic (ROC) curve.
1982 · 21,611 citations
Coding Algorithms for Defining Comorbidities in ICD-9-CM and ICD-10 Administrative Data
2005 · 10,529 citations
Adapting a clinical comorbidity index for use with ICD-9-CM administrative databases
1992 · 10,501 citations
Comorbidity Measures for Use with Administrative Data
1998 · 9,825 citations
Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond
2007 · 6,251 citations