OpenAlex · Updated hourly · Last updated: 2026-03-20, 04:26

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

A benchmark for large language models in bioinformatics

2023 · 8 citations · Open Access

8 citations · 10 authors · Year: 2023

Abstract

The rapid advancements in artificial intelligence, particularly in Large Language Models (LLMs) such as GPT-4, Gemini, and LLaMA, have opened new avenues for computational biology and bioinformatics. We report the development of BioLLMBench, a novel framework designed to evaluate LLMs on bioinformatics tasks. This study assessed GPT-4, Gemini, and LLaMA through 2,160 experimental runs, focusing on 24 distinct tasks across six key areas: domain expertise, mathematical problem-solving, coding proficiency, data visualization, research paper summarization, and machine learning model development. Tasks ranged from fundamental to expert-level challenges, and each area was evaluated using seven specific metrics. A Contextual Response Variability Analysis was implemented to understand how model responses varied under different conditions. Results showed diverse performance: GPT-4 led in most tasks, achieving 91.3% proficiency in domain knowledge, while Gemini excelled in mathematical problem-solving with a 97.5% proficiency score. GPT-4 also outperformed the other models in machine learning model development, whereas Gemini and LLaMA struggled to generate executable code. All models faced challenges in research paper summarization, scoring below 40% on the ROUGE metric. Model performance variance increased when using a new chat window, though average scores remained similar. The study also discusses the limitations and potential misuse risks of these models in bioinformatics.
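The sub-40% summarization scores above are reported "using the ROUGE metric". The paper does not specify which ROUGE variant or implementation was used, but as a rough illustration of what such a score measures, here is a minimal sketch of ROUGE-1 F1, the unigram-overlap variant, computed from scratch (the function name and the whitespace tokenization are assumptions for this sketch, not the authors' protocol):

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """ROUGE-1 F1: clipped unigram overlap between reference and candidate.

    Precision = overlap / candidate length, recall = overlap / reference
    length; F1 is their harmonic mean. Tokenization here is plain
    lowercased whitespace splitting, a simplification.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter intersection clips each unigram's count to the smaller side.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

# Example: 5 of 6 unigrams match on each side, so F1 = 5/6 ≈ 0.833.
score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
```

A score "below 40%" therefore means that, on average, well under half of the reference summary's words (weighted by precision as well) were recovered by the model-generated summary.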
