This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Comparative Evaluation of Diagnostic and Management Capabilities of Infiniti AI and ChatGPT-4o in Corneal Diseases
Citations: 0
Authors: 3
Year: 2025
Abstract
BACKGROUND: Artificial intelligence (AI), particularly large language models (LLMs), is rapidly transforming medical education and clinical decision support. Ophthalmology, a specialty heavily reliant on pattern recognition, presents a promising domain for LLM integration. While general-purpose models like ChatGPT-4o have demonstrated strong performance in ophthalmic tasks, domain-specific systems such as Infiniti AI, built with a retrieval-augmented generation (RAG) framework, claim advantages by grounding responses in peer-reviewed ophthalmic literature. This study compares ChatGPT-4o (OpenAI, San Francisco, CA, USA) and Infiniti AI (Sinjab Academy, UAE) in corneal disease case scenarios.
MATERIALS AND METHODS: Twenty corneal cases were selected from the University of Iowa EyeRounds database, covering infectious, inflammatory, degenerative, developmental, and systemic associations. ChatGPT-4o, Infiniti AI, and a fellowship-trained cornea specialist independently evaluated each case. Diagnostic and management responses were scored against American Academy of Ophthalmology preferred practice pattern guidelines using a four-point scale (0-3). Statistical comparisons were performed using paired t-tests and Wilcoxon signed-rank tests.
RESULTS: ChatGPT-4o significantly outperformed Infiniti AI across all categories. Diagnostic accuracy was higher for ChatGPT-4o (2.37 ± 0.81) than Infiniti AI (1.13 ± 0.71, p < 0.001, Cohen's d = 1.35). Management scores were also superior (2.65 ± 0.65 vs 1.98 ± 0.65, p < 0.001, d = 1.37). Overall, ChatGPT-4o achieved a mean total score of 5.00 ± 1.22 compared with 3.10 ± 1.10 for Infiniti AI (p < 0.001, d = 1.75).
CONCLUSIONS: ChatGPT-4o demonstrated greater diagnostic and management accuracy than Infiniti AI in corneal disease scenarios, highlighting the current strength of general-purpose LLMs over specialized retrieval-based systems. Nonetheless, both models remain prone to hallucinations and should serve as adjuncts to, rather than replacements for, expert judgment. Further refinement of ophthalmology-specific models is warranted to improve safety and clinical utility.
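As an illustration of the paired comparison named in the methods, the following minimal sketch computes a paired t statistic and a paired-samples effect size (Cohen's d_z, the mean difference divided by the standard deviation of the differences). The abstract does not state which variant of Cohen's d the authors used, so d_z is an assumption here, and the scores are hypothetical values on the 0-3 scale, not data from the study. The function name `paired_summary` is likewise invented for this example.

```python
import math
import statistics


def paired_summary(scores_a, scores_b):
    """Return (mean difference, paired t statistic, Cohen's d_z) for paired scores."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # Work on per-case differences, since each case is scored for both models.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t and d_z are undefined")
    t_stat = mean_diff / (sd_diff / math.sqrt(n))  # compare against t with n - 1 df
    d_z = mean_diff / sd_diff  # assumed effect-size variant (not confirmed by the source)
    return mean_diff, t_stat, d_z


# Hypothetical per-case scores (0-3 scale); NOT the study's data.
model_a = [3, 2, 3, 1, 2, 3]
model_b = [1, 1, 2, 0, 2, 1]

mean_diff, t_stat, d_z = paired_summary(model_a, model_b)
print(f"mean difference = {mean_diff:.2f}, t = {t_stat:.2f}, d_z = {d_z:.2f}")
```

The sketch omits the p-value and the Wilcoxon signed-rank test; in practice these would typically come from a statistics library such as SciPy (`scipy.stats.ttest_rel` and `scipy.stats.wilcoxon`).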