This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Can Large Language Models Detect Periapical Lesions in Anterior Teeth? A Comparative Study
Citations: 0
Authors: 10
Year: 2025
Abstract
This study evaluated the diagnostic performance of two large language models (LLMs), ChatGPT 5.0 (OpenAI) and Gemini Flash 2.0 (Google Inc.), in detecting periapical lesions on periapical radiographs using a standardized multimodal prompt. Seventy-five anonymized periapical radiographs of maxillary and mandibular anterior teeth were analyzed, evenly distributed between cases with and without lesions. A calibrated endodontic specialist provided the reference diagnosis. Each image was independently assessed five times by both LLMs using the prompt: "Does this image show a periapical lesion? Answer 'Yes' or 'No'. If 'Yes', which tooth?". Balanced accuracy, sensitivity, specificity, and F1-score were calculated with 95% confidence intervals (CIs) obtained via bootstrap resampling. Performance was also stratified by diagnostic difficulty, and the models were compared using the exact McNemar test (α = 0.05). ChatGPT 5.0 showed higher overall performance than Gemini Flash 2.0, with a sensitivity of 0.97 (95% CI: 0.95-0.99), a specificity of 0.11 (95% CI: 0.07-0.16), and a balanced accuracy of 0.54 (95% CI: 0.52-0.57). Gemini Flash 2.0 achieved a sensitivity of 0.84 (95% CI: 0.79-0.89), a specificity of 0.11 (95% CI: 0.07-0.16), and a balanced accuracy of 0.48 (95% CI: 0.44-0.51). Both models showed high false-positive rates and frequent errors in tooth localization. The McNemar test confirmed a significant difference between the models (p < 0.05), favoring ChatGPT 5.0. Both LLMs demonstrated high sensitivity but poor specificity, resulting in intermediate diagnostic performance and a bias toward positive classifications. General-purpose LLMs are therefore not yet suitable for radiographic diagnostic use.
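The metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names, the toy labels, and the percentile bootstrap over individual observations are assumptions made here for illustration (the study's five repeated readings per image may call for resampling at the image level instead). Labels are coded 1 for "lesion present" and 0 for "no lesion".

```python
import random
from math import comb


def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity, balanced accuracy and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    nan = float("nan")
    sensitivity = tp / (tp + fn) if tp + fn else nan
    specificity = tn / (tn + fp) if tn + fp else nan
    f1 = 2 * tp / (2 * tp + fp + fn) if 2 * tp + fp + fn else nan
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "f1": f1,
    }


def bootstrap_ci(y_true, y_pred, metric, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for one metric (assumed resampling scheme)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        value = diagnostic_metrics(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )[metric]
        if value == value:  # skip NaN from resamples lacking one class
            stats.append(value)
    if not stats:
        return float("nan"), float("nan")
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lower, upper


def exact_mcnemar(y_true, pred_a, pred_b):
    """Two-sided exact McNemar p-value from the discordant pairs."""
    triples = list(zip(y_true, pred_a, pred_b))
    only_a = sum(1 for t, a, b in triples if a == t and b != t)
    only_b = sum(1 for t, a, b in triples if a != t and b == t)
    n = only_a + only_b
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(only_a, only_b)
    # Binomial(n, 0.5) tail probability, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy data only, not taken from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
model_a = [1, 1, 1, 1, 1, 1, 1, 0]
model_b = [1, 1, 1, 0, 1, 1, 1, 1]
print(diagnostic_metrics(y_true, model_a))
print(bootstrap_ci(y_true, model_a, "balanced_accuracy"))
print(exact_mcnemar(y_true, model_a, model_b))
```

Balanced accuracy is the mean of sensitivity and specificity, which is why a model that answers "Yes" for almost every image can reach a sensitivity near 1 while its balanced accuracy stays close to 0.5.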
Related works
The long-term efficacy of currently used dental implants: a review and proposed criteria of success.
1986 · 3,692 citations
The Gingival Index, the Plaque Index and the Retention Index Systems
1967 · 3,644 citations
The burden of oral disease: challenges to improving oral health in the 21st century.
2005 · 3,579 citations
Periodontitis: Consensus report of workgroup 2 of the 2017 World Workshop on the Classification of Periodontal and Peri‐Implant Diseases and Conditions
2018 · 3,066 citations
Osseointegrated Titanium Implants: Requirements for Ensuring a Long-Lasting, Direct Bone-to-Implant Anchorage in Man
1981 · 2,647 citations