OpenAlex · Updated hourly · Last updated: 24 March 2026, 05:39

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Evaluating AI Reasoning Models in Pediatric Medicine: A Comparative Analysis of o3-mini and o3-mini-high

2025 · 4 citations · Open Access
Citations: 4
Authors: 5
Year: 2025

Abstract

Artificial intelligence (AI) is increasingly playing a crucial role in modern medicine, particularly in clinical decision support. This study compares the performance of two OpenAI reasoning models, o3-mini and o3-mini-high, in answering 900 pediatric clinical questions derived from the MedQA-USMLE dataset. The evaluation focuses on accuracy, response time, and consistency to determine their effectiveness in pediatric diagnostic and therapeutic decision-making. The results indicate that o3-mini-high achieves a higher accuracy (90.55% vs. 88.3%) and faster response times (64.63 seconds vs. 71.63 seconds) compared to o3-mini. The chi-square test confirmed that these differences are statistically significant (χ² = 328.9675, p < 0.00001). Error analysis revealed that o3-mini-high corrected more errors from o3-mini than vice versa, but both models shared 61 common errors, suggesting intrinsic limitations in training data or model architecture. Additionally, accessibility differences between the models were considered. While DeepSeek-R1, evaluated in a previous study, offers unrestricted free access, OpenAI’s o3 models have message limitations, potentially influencing their suitability in resource-constrained environments. Future improvements should aim at reducing shared errors, optimizing o3-mini’s accuracy while maintaining efficiency, and refining o3-mini-high for enhanced performance. Implementing an ensemble approach that leverages both models’ strengths could provide a more robust AI-driven clinical decision support system, particularly in time-sensitive pediatric scenarios such as emergency care and neonatal intensive care units.
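
The chi-square statistic reported in the abstract can be checked against the other figures the abstract gives. The following Python sketch is an illustration, not the authors' analysis code. It assumes that the rounded accuracies correspond to 815 (o3-mini-high) and 795 (o3-mini) correct answers out of 900, that the 61 shared errors are the questions both models answered incorrectly, and that the test was a Pearson chi-square test of independence without continuity correction on the resulting 2×2 table of paired outcomes. All function and variable names are hypothetical.

```python
def paired_table(n_total, correct_a, correct_b, shared_errors):
    """Build the 2x2 table of paired outcomes for two models.

    Returns (both_right, only_b_wrong, only_a_wrong, both_wrong).
    """
    errors_a = n_total - correct_a
    errors_b = n_total - correct_b
    both_wrong = shared_errors
    only_a_wrong = errors_a - shared_errors  # B answered these correctly
    only_b_wrong = errors_b - shared_errors  # A answered these correctly
    both_right = n_total - both_wrong - only_a_wrong - only_b_wrong
    return both_right, only_b_wrong, only_a_wrong, both_wrong


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the table [[a, b], [c, d]],
    without continuity correction (1 degree of freedom)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# ASSUMPTION: counts reconstructed from the rounded percentages in the abstract.
N_QUESTIONS = 900
CORRECT_O3_MINI = 795        # about 88.3% of 900
CORRECT_O3_MINI_HIGH = 815   # about 90.55% of 900
SHARED_ERRORS = 61           # stated in the abstract

table = paired_table(N_QUESTIONS, CORRECT_O3_MINI, CORRECT_O3_MINI_HIGH, SHARED_ERRORS)
chi2 = pearson_chi2_2x2(*table)
print(table, round(chi2, 4))
```

Under these assumptions the table is 771 (both correct), 24 (only o3-mini-high wrong), 44 (only o3-mini wrong), and 61 (both wrong), and the statistic evaluates to roughly 328.97, in line with the reported value. The 44 versus 24 split is also consistent with the statement that o3-mini-high corrected more of o3-mini's errors than the reverse.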

Similar works