Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
Effectiveness of Various General large language models in Clinical Consensus and Case Analysis in Dental Implantology: A Comparative Study
1
Zitationen
5
Autoren
2024
Jahr
Abstract
<title>Abstract</title> Background This study evaluates and compares ChatGPT-4.0, Gemini 1.5, Claude 3, and Qwen 2.1 in answering dental implant questions. The aim is to help doctors in underserved areas choose the best LLMs(Large Language Model) for their procedures, improving dental care accessibility and clinical decision-making. Methods Two dental implant specialists with over twenty years of clinical experience evaluated the models. Questions were categorized into simple true/false, complex short-answer, and real-life case analyses. Performance was measured using precision, recall, and Bayesian inference-based evaluation metrics. Results ChatGPT-4 exhibited the most stable and consistent performance on both simple and complex questions. Gemini performed well on simple questions but was less stable on complex tasks. Qwen provided high-quality answers for specific cases but showed variability. Claude-3 had the lowest performance across various metrics. Statistical analysis indicated significant differences between models in diagnostic performance but not in treatment planning. Conclusions ChatGPT-4 is the most reliable model for handling medical questions, followed by Gemini. Qwen shows potential but lacks consistency, and Claude-3 performs poorly overall. Combining multiple models is recommended for comprehensive medical decision-making.
Ähnliche Arbeiten
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8.239 Zit.
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8.095 Zit.
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7.463 Zit.
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5.776 Zit.
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5.428 Zit.