This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Fine-Tuning GPT on Biomedical NLP Tasks: An Empirical Evaluation
Citations: 17
Authors: 3
Year: 2024
Abstract
The emergence of Large Language Models (LLMs) has marked a transformative era in the field of Natural Language Processing (NLP), propelling AI capabilities to unprecedented heights. LLMs, exemplified by models such as GPT-3, GPT-4, and Galactica, have demonstrated extraordinary aptitude for understanding and generating human-like text. What makes OpenAI's GPT-3 and its successors stand out is their impressive performance with in-context learning, which allows them to tackle tasks effectively with minimal examples or context. This adaptability underscores GPT-3's versatility, making it a dominant force in various NLP applications. However, recent studies have revealed certain limitations of GPT-3, particularly in specialized domains such as the biomedical field, where it performs worse than smaller language models under few-shot settings on tasks like named entity recognition (NER) and question answering (QA). This paper therefore fine-tunes GPT-3 as a general-purpose language model using biomedical data and assesses its performance across seven biomedical natural language processing tasks: named entity recognition, relation extraction, question answering, document classification, relation classification, text similarity, and molecular property prediction. We demonstrate that the fine-tuned GPT-3 models excel in tasks such as question answering and document classification while lagging behind state-of-the-art models in others. These results motivate careful consideration of the potential benefits of fine-tuning, pre-training with domain-specific data, or even augmenting the model to improve overall performance.
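For concreteness, the kind of fine-tuning workflow the abstract describes could look like the minimal sketch below. It assumes OpenAI's public fine-tuning API and a hypothetical JSONL file of prompt/completion pairs derived from a biomedical corpus; the paper's actual tooling, data format, and base model are not specified on this page.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload the training data: one JSON object per line, e.g.
# {"prompt": "Extract disease mentions: ...", "completion": " diabetes mellitus"}
# (the file name and prompt format are illustrative assumptions).
training_file = client.files.create(
    file=open("biomedical_train.jsonl", "rb"),
    purpose="fine-tune",
)

# Launch a fine-tuning job on a GPT-3-era completion model.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="davinci-002",
)
print(job.id, job.status)

The resulting model could then be queried like any other completion model and scored per task, for example with F1 for NER or exact match for QA.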