This is an overview page with metadata for this scientific work. The full article is available from the publisher.

Assessing the Proficiency of Large Language Models on Funduscopic Disease Knowledge (Preprint)

2024 · 0 citations · Open Access


Abstract

BACKGROUND: Large language models (LLMs) have significantly transformed the field of natural language processing, with cutting-edge models like ChatGPT currently leading the way in medical AI.

OBJECTIVE: This study aimed to assess the performance of five distinct LLMs (GPT-3.5, ChatGPT-4, PaLM2, Claude 2, and SenseNova) in comparison to two human cohorts (a group of funduscopic disease experts and a group of ophthalmologists) on the specialized subject of funduscopic disease.

METHODS: Five distinct LLMs and two distinct human groups independently completed a 100-item funduscopic disease test. The performance of these entities was assessed by comparing their average scores, response stability, and answer confidence, thereby establishing a basis for evaluation.

RESULTS: Among all the LLMs, GPT-4 and PaLM2 exhibited the most substantial average correlation. Additionally, GPT-4 achieved the highest average score and demonstrated the utmost confidence during the exam. In comparison to human cohorts, GPT-4 exhibited comparable performance to ophthalmologists, albeit falling short of the expertise demonstrated by funduscopic disease specialists.

CONCLUSIONS: The study provided evidence of the exceptional performance of GPT-4 in the domain of funduscopic disease. With continued enhancements, validated LLMs have the potential to yield unforeseen advantages in enhancing healthcare for both patients and physicians.


Topics

Artificial Intelligence in Healthcare and Education; Biomedical Text Mining and Ontologies; AI in cancer detection
