This is an overview page with metadata for this scientific work. The full article is available from the publisher.
AI Performance on Image-based Medical Case Scenarios: A Cross-Sectional Comparative Study
Citations: 0
Authors: 6
Year: 2025
Abstract
Background: Large language models (LLMs) have shown remarkable progress in text-based tasks, but their ability to interpret and respond to image-based clinical scenarios remains underexplored. This study evaluated and compared the performance of ChatGPT-5 and Claude in answering subjective image-based medical case questions.

Methods: A cross-sectional comparative study was conducted using 71 subjective questions based on dermatological case scenarios designed by the research team. Each AI system generated responses to identical visual and textual inputs without external assistance. Two experienced dermatologists, blinded to model identity, independently scored the responses against standard answers. Inter-rater reliability was assessed using intraclass correlation coefficients (ICC), and comparative analyses employed Mann–Whitney U tests, Bland–Altman plots, and correlation metrics.

Results: Both evaluators demonstrated excellent inter-rater reliability (ICC > 0.86). Claude achieved higher mean scores (27.39 ± 11.44) than ChatGPT-5 (25.53 ± 11.45; p < 0.001). Claude also showed stronger correlation with reference standards (ρ = 0.88 vs. 0.83), lower mean absolute error (14.76% vs. 19.98%), and reduced root mean square error (7.24 vs. 9.24). Bland–Altman analysis revealed minimal systematic bias between evaluators, indicating consistent scoring reliability.

Conclusions: Both multimodal LLMs demonstrated strong competence in interpreting image-based medical scenarios. Claude exhibited a modest but consistent advantage in diagnostic reasoning and clinical alignment. These findings support the potential of LLMs as supplementary educational tools in visual disciplines such as dermatology, emphasizing the importance of model selection, supervised use, and continued evaluation as AI integration in medical education expands.
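The comparative analyses named in the Methods section (Mann–Whitney U test, rank correlation with a reference standard, MAE/RMSE, and Bland–Altman bias) can be sketched with SciPy and NumPy. All score arrays below are hypothetical illustrations, not the study's data; ICC itself is not in SciPy and would need a dedicated package.

```python
import numpy as np
from scipy import stats

# Hypothetical evaluator scores for two models and a reference standard
# (made-up numbers for illustration only).
chatgpt = np.array([22, 30, 18, 27, 25, 31, 20, 29, 24, 26], dtype=float)
claude = np.array([25, 32, 19, 29, 27, 33, 23, 30, 26, 28], dtype=float)
reference = np.array([26, 33, 21, 30, 28, 34, 24, 31, 27, 29], dtype=float)

# Mann-Whitney U: non-parametric comparison of the two score distributions.
u_stat, p_value = stats.mannwhitneyu(claude, chatgpt, alternative="two-sided")

# Spearman rank correlation of each model's scores with the reference.
rho_claude, _ = stats.spearmanr(claude, reference)
rho_chatgpt, _ = stats.spearmanr(chatgpt, reference)

# Error metrics against the reference standard.
mae = np.mean(np.abs(claude - reference))
rmse = np.sqrt(np.mean((claude - reference) ** 2))

# Bland-Altman summary: mean difference (bias) and 95% limits of agreement.
diff = claude - chatgpt
bias = diff.mean()
loa = (bias - 1.96 * diff.std(ddof=1), bias + 1.96 * diff.std(ddof=1))

print(f"U={u_stat:.1f}, p={p_value:.3f}, rho={rho_claude:.2f}, "
      f"MAE={mae:.2f}, RMSE={rmse:.2f}, bias={bias:.2f}")
```

The same pattern extends to paired per-question comparisons (e.g. `stats.wilcoxon`) when both models answer the identical 71 cases.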
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,245 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,100 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,466 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,429 citations