OpenAlex · Updated hourly · Last updated: 15 March 2026, 12:34

This is an overview page with metadata for this scientific work. The full article is available from the publisher.

Collaborative and privacy-preserving workflows on a clinical data warehouse: an example developing natural language processing pipelines to detect medical conditions

2023 · 1 citation · 8 authors · Open Access

Abstract

Objective: To develop and validate advanced natural language processing pipelines that detect 18 conditions in clinical notes written in French, including 16 comorbidities of the Charlson index, while exploring a collaborative and privacy-preserving workflow.

Materials and methods: The detection pipelines relied on rule-based algorithms for named entity recognition and on machine learning algorithms for entity qualification. We used a large language model pre-trained on millions of clinical notes, along with clinical notes annotated in the context of three cohort studies related to oncology, cardiology and rheumatology, respectively. The overall workflow was designed to foster collaboration between studies while complying with the privacy constraints of the data warehouse. We estimated the added value of both the advanced technologies and the collaborative setting.

Results: The 18 pipelines reached macro-averaged F1-score, positive predictive value, sensitivity and specificity of 95.7 (95% CI 94.5-96.3), 95.4 (95% CI 94.0-96.3), 96.0 (95% CI 94.0-96.7) and 99.2 (95% CI 99.0-99.4), respectively. F1-scores were superior to those observed using either alternative technologies or non-collaborative settings. The models were shared through a secured registry.

Conclusions: We demonstrated that a community of investigators working on a common clinical data warehouse could efficiently and securely collaborate to develop, validate and use sensitive artificial intelligence models. In particular, we provided efficient and robust natural language processing pipelines that detect conditions mentioned in clinical notes.
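As an illustration of how the macro-averaged figures reported in the results are usually read, the following minimal sketch computes F1-score, positive predictive value, sensitivity and specificity per condition and then averages them with equal weight per condition. It uses the standard textbook definitions; the abstract does not describe the authors' evaluation code, and the `Counts` class, the function names, the condition labels and the confusion counts below are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Counts:
    """Confusion counts for one condition (hypothetical values only)."""
    tp: int  # condition present and detected
    fp: int  # condition absent but detected
    fn: int  # condition present but missed
    tn: int  # condition absent and not detected


def ratio(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def condition_metrics(c: Counts) -> dict:
    """Standard per-condition metrics, as fractions in [0, 1]."""
    ppv = ratio(c.tp, c.tp + c.fp)          # positive predictive value
    sensitivity = ratio(c.tp, c.tp + c.fn)  # recall
    specificity = ratio(c.tn, c.tn + c.fp)
    f1 = ratio(2 * ppv * sensitivity, ppv + sensitivity)
    return {"f1": f1, "ppv": ppv,
            "sensitivity": sensitivity, "specificity": specificity}


def macro_average(per_condition: dict) -> dict:
    """Unweighted mean of each metric across conditions (macro-averaging)."""
    rows = [condition_metrics(c) for c in per_condition.values()]
    return {name: mean(r[name] for r in rows) for name in rows[0]}


# Hypothetical counts for two conditions; not taken from the study.
example = {
    "condition_a": Counts(tp=8, fp=2, fn=2, tn=88),
    "condition_b": Counts(tp=5, fp=0, fn=5, tn=90),
}
macro = macro_average(example)
```

Because each condition contributes equally to a macro-average, a rare condition weighs as much as a common one, which is why this summary is a common choice when prevalence differs widely across conditions.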

Related works