OpenAlex · Updated hourly · Last updated: 23.03.2026, 04:53

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Exploring the Proficiency of ChatGPT 3.5, 4, and 4 with Vision in the Chile Medical Licensing Exam (Preprint)

2023 · 0 citations · 5 authors · Open Access

Open full text at publisher

Abstract

BACKGROUND: The deployment of OpenAI's ChatGPT 3.5 and its successors, ChatGPT 4 and 4 with Vision (4V), has notably influenced the medical field. These models have demonstrated remarkable performance on medical exams worldwide and show potential for educational applications. However, their effectiveness in non-English contexts, particularly on Chile's medical licensing exam, a critical step for medical practitioners in Chile, is less explored. This gap highlights the need to evaluate ChatGPT's adaptability to diverse linguistic and cultural settings.

OBJECTIVE: This study aims to evaluate the proficiency of ChatGPT versions 3.5, 4, and 4V in answering questions in the format of the EUNACOM (Examen Único Nacional de Conocimientos de Medicina), a major medical examination in Chile.

METHODS: Three official drills from the University of Chile, mirroring the EUNACOM's structure and difficulty, were used to test ChatGPT versions 3.5, 4, and 4V. Each version completed three rounds of each drill. Responses in each round were systematically categorized and analyzed to determine the accuracy rate.

RESULTS: All versions of ChatGPT passed the EUNACOM-style exams, with version 4 outperforming 3.5 and 4V. A detailed analysis revealed higher accuracy in Surgery and Psychiatry questions for all versions, while performance dipped in Internal Medicine and Public Health. Version 4V did not outperform the other two versions, despite having access to the figures accompanying the questions.

CONCLUSIONS: The study shows that ChatGPT can pass the EUNACOM, with distinct proficiencies across versions 3.5, 4, and 4V. Notably, advances in artificial intelligence (AI) did not significantly improve performance on image-based questions. The variation in proficiency across medical fields suggests the need for more nuanced AI training. While AI shows promise in medical education, its limited depth and variable expertise highlight the need for medical curricula to emphasize critical thinking and reflective practice, ensuring the effective integration of AI in patient-centered care.

Topics

Artificial Intelligence in Healthcare and Education
Radiomics and Machine Learning in Medical Imaging