OpenAlex · Updated hourly · Last updated: 15.03.2026, 16:14

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Effectiveness of Various General large language models in Clinical Consensus and Case Analysis in Dental Implantology: A Comparative Study

2024 · 1 citation · Open Access
Open full text at publisher

Citations: 1
Authors: 5
Year: 2024

Abstract

Background: This study evaluates and compares ChatGPT-4.0, Gemini 1.5, Claude 3, and Qwen 2.1 in answering dental implant questions. The aim is to help doctors in underserved areas choose the most suitable large language model (LLM) for their procedures, thereby improving access to dental care and clinical decision-making.

Methods: Two dental implant specialists, each with over twenty years of clinical experience, evaluated the models. Questions were categorized into simple true/false questions, complex short-answer questions, and real-life case analyses. Performance was measured using precision, recall, and Bayesian inference-based evaluation metrics.

Results: ChatGPT-4 exhibited the most stable and consistent performance on both simple and complex questions. Gemini performed well on simple questions but was less stable on complex tasks. Qwen provided high-quality answers for specific cases but showed variability. Claude 3 had the lowest performance across various metrics. Statistical analysis indicated significant differences between models in diagnostic performance but not in treatment planning.

Conclusions: ChatGPT-4 is the most reliable model for handling medical questions, followed by Gemini. Qwen shows potential but lacks consistency, and Claude 3 performs poorly overall. Combining multiple models is recommended for comprehensive medical decision-making.

Topics

Artificial Intelligence in Healthcare and Education · Dental Radiography and Imaging · Radiomics and Machine Learning in Medical Imaging