This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Assessment of AI-Driven Large Language Models for Orthodontic Aesthetic Scoring Using the IOTN-AC
Citations: 5
Authors: 2
Year: 2025
Abstract
<b>Background/Objectives</b>: The aim of this study was to evaluate the accuracy of aesthetic assessments performed by artificial intelligence (AI)-based large language models (LLMs) using the Aesthetic Component of the Index of Orthodontic Treatment Need (IOTN-AC), which is widely applied to determine the need for orthodontic treatment.

<b>Methods</b>: A total of 150 frontal intraoral photographs from patients in the permanent dentition, scored from 1 to 10 on the IOTN-AC, were assessed by two AI-based LLMs (ChatGPT-5 and ChatGPT-5 Pro). Two experienced clinicians independently scored all photographs, with one evaluator's scores used as the reference (κ = 0.91, ICC = 0.88). Model performance was analyzed by comparing IOTN-AC scores and treatment need classifications. In addition, performance parameters such as accuracy, precision, specificity, and sensitivity were evaluated. Statistical analyses included Spearman correlation, Cohen's Kappa, ICC, Mean Absolute Error (MAE), Wilcoxon signed-rank test, and Bland-Altman analysis.

<b>Results</b>: Both models demonstrated positive and significant correlations with the reference values for scoring and classification (<i>p</i> < 0.001). Compared to GPT-5 Pro, the GPT-5 model exhibited superior performance, with a lower error rate (MAE = 1.47) and higher classification accuracy (66.7%). Bland-Altman analysis showed that most predictions fell within the 99% confidence interval, and regression analysis revealed no systematic bias (<i>p</i> > 0.05). Conversely, the models failed to achieve consistently high performance in each of the performance parameters.

<b>Conclusions</b>: The findings revealed that although AI-based LLMs are promising, statistical accuracy alone is insufficient for safe clinical use, and they should demonstrate consistently high performance across all parameters.
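The agreement measures named in the abstract (MAE on the 1 to 10 grades, then accuracy, sensitivity, specificity, and Cohen's Kappa on the treatment need classes) can be illustrated with a minimal sketch. It is not the authors' analysis code. The scores below are invented toy data, the function names are hypothetical, and the grade cut-offs (1 to 4, 5 to 7, 8 to 10) are an assumption based on a commonly used IOTN-AC grouping that the abstract does not state.

```python
from collections import Counter


def need_category(score):
    """Map an IOTN-AC grade (1-10) to a treatment need class.

    The cut-offs are an assumption, not taken from the abstract.
    """
    if not 1 <= score <= 10:
        raise ValueError("IOTN-AC grades run from 1 to 10")
    if score <= 4:
        return "no/slight"
    if score <= 7:
        return "borderline"
    return "definite"


def mean_absolute_error(reference, predicted):
    """Average absolute difference between paired grades."""
    return sum(abs(r - p) for r, p in zip(reference, predicted)) / len(reference)


def accuracy(reference, predicted):
    """Share of pairs where both labels are identical."""
    return sum(r == p for r, p in zip(reference, predicted)) / len(reference)


def sensitivity_specificity(reference, predicted, positive):
    """One-vs-rest sensitivity and specificity for the class `positive`."""
    tp = fn = tn = fp = 0
    for r, p in zip(reference, predicted):
        if r == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    # Assumes both positive and negative cases exist in the reference labels.
    return tp / (tp + fn), tn / (tn + fp)


def cohen_kappa(reference, predicted):
    """Unweighted Cohen's Kappa: agreement corrected for chance."""
    n = len(reference)
    observed = accuracy(reference, predicted)
    ref_counts, pred_counts = Counter(reference), Counter(predicted)
    expected = sum(ref_counts[c] * pred_counts[c] for c in ref_counts) / n**2
    return (observed - expected) / (1 - expected)


# Invented toy data: clinician reference grades vs. model grades.
reference = [1, 3, 5, 6, 8, 10, 4, 9]
predicted = [2, 3, 4, 6, 8, 7, 5, 9]

ref_classes = [need_category(s) for s in reference]
pred_classes = [need_category(s) for s in predicted]

mae = mean_absolute_error(reference, predicted)
acc = accuracy(ref_classes, pred_classes)
sens, spec = sensitivity_specificity(ref_classes, pred_classes, "definite")
kappa = cohen_kappa(ref_classes, pred_classes)

print(f"MAE={mae:.2f} accuracy={acc:.3f} "
      f"sensitivity={sens:.3f} specificity={spec:.3f} kappa={kappa:.3f}")
```

In this toy data the model never over-calls a "definite" case but misses one, so specificity is perfect while sensitivity is not. This is the kind of uneven profile the abstract's conclusion warns about: a single summary figure can look acceptable while individual parameters fall short.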
Similar works
A 15-year study of osseointegrated implants in the treatment of the edentulous jaw
1981 · 4,809 citations
Research diagnostic criteria for temporomandibular disorders: review, criteria, examinations and specifications, critique.
1992 · 3,829 citations
Theory and Methods of Scaling
1958 · 2,793 citations
The effects of surgical exposures of dental pulps in germ-free and conventional laboratory rats
1965 · 2,324 citations
Matrix Metalloproteinases and Other Matrix Proteinases in Relation to Cariology: The Era of ‘Dentin Degradomics’
2015 · 2,308 citations