This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Thyroid Nodule Experts Evaluating ChatGPT’s Assessment of Thyroid Nodules Classified by the Bethesda System for Reporting Thyroid Cytopathology

2025 · 0 citations · Journal of Otolaryngology - Head and Neck Surgery · Open Access

Abstract

Importance: ChatGPT has emerged as a medical resource through advanced language processing. Patients with thyroid nodules classified under The Bethesda System for Reporting Thyroid Cytopathology (TBSRTC) may use it to complement discussions with physicians.

Objective: We aimed to determine whether ChatGPT's recommendations on managing thyroid nodules classified by TBSRTC align with those of experienced thyroid specialists.

Setting/Participants: A multidisciplinary panel of 5 thyroid cancer specialists, including otolaryngologists and endocrinologists, from 3 university-affiliated teaching hospitals in Montreal, Canada, evaluated the responses.

Intervention/Exposure: ChatGPT-3.5 was prompted with 4 questions for each of the 6 Bethesda categories regarding the meaning and management of thyroid nodules, generating 24 responses for evaluation.

Main Outcome/Measures: We assessed ChatGPT's accuracy against the latest American Thyroid Association (ATA) guidelines using a 4-point Likert scale (<50%, 50-74%, 75-89%, >90%). Additionally, specialists rated their comfort or reluctance in recommending ChatGPT as a complementary tool for patient discussions.

Results: Of the 24 ChatGPT-generated responses, 19 (79.2%) demonstrated moderate to good consistency with the ATA guidelines. The mean consistency score was 3.38/4 and median was 3.5. Consensus (IQR ≤ 1) was achieved in 23 out of 24 responses (95.8%), reflecting strong inter-rater reliability. Consistency scores were highest in Bethesda I-III and declined progressively in higher-risk categories, with the lowest mean score observed in Bethesda VI. Similarly, an upward trend in clinician reluctance was observed from Bethesda I through VI, indicating greater caution in recommending ChatGPT responses for patients suspicious for or diagnosed with malignancy (Bethesda V-VI).

Conclusion and Relevance: While ChatGPT's responses generally align with specialist recommendations, they are not fully reliable. ChatGPT lacks the ability to serve as an independent or accurate source of medical advice for thyroid nodule management. It remains a useful complement for patient discussions, especially in low-risk scenarios, but further improvements are necessary to make it a safe, reliable component of patient care in complex cases.

Topics

Artificial Intelligence in Healthcare and Education · Thyroid Cancer Diagnosis and Treatment · Clinical Reasoning and Diagnostic Skills
