OpenAlex · Updated hourly · Last updated: 12.04.2026, 02:43

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Large language models in Portuguese for healthcare: a systematic review

2026 · 0 citations · Research on Biomedical Engineering · Open Access

Citations: 0
Authors: 7
Year: 2026

Abstract

This study addresses Large Language Models (LLMs) pre-trained in Portuguese for healthcare applications, focusing on contextual embeddings. Research on LLMs for natural language processing (NLP) tasks in Portuguese is limited, especially within healthcare, and much of the existing research has focused on high-resource languages such as English. However, LLMs demonstrate potential in clinical decision support, diagnosis assistance, patient care, and other healthcare applications. In view of this, the present work assesses the current state of LLMs in Portuguese for healthcare. Our Systematic Literature Review (SLR) followed standard protocols: search, screening based on inclusion/exclusion criteria, quality assessment, data extraction, and analysis. We identified 32 models, mostly based on BERTimbau, mBERT, and BioBERTpt. The most prevalent adaptation strategies were fine-tuning, domain-adaptive pre-training, training from scratch, and zero-shot learning. Several datasets have been used, including clinical records, social media, and scientific repositories. LLMs in Portuguese are being applied in mental health, general medicine, COVID-19, oncology, and other related areas, most often for classification tasks, followed by named entity recognition (NER), topic modeling, question answering, text generation, summarization, de-identification, and conversational agents. Our study identified key gaps and opportunities: (1) recent LLMs such as T5, Qwen, DeepSeek, and BART, among others, remain unexplored; (2) fine-tuning details are insufficiently reported, hindering reproducibility; (3) coverage of healthcare fields is limited; (4) clinical and hospital data are widely used but not shared; (5) social media data require caution due to potential inconsistencies; and (6) data privacy is overlooked.

Similar works