OpenAlex · Updated hourly · Last updated: 21.04.2026, 16:53

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Evaluating the Clinical Reasoning Capabilities of AI Language Models in Diagnosing and Treating Depression

2025 · 0 citations · European Psychiatry · Open Access
Open full text at publisher

Citations: 0
Authors: 4
Year: 2025

Abstract

Introduction: Artificial intelligence (AI) language models are increasingly accessible tools that offer potential support in mental health care. Despite their promise in revolutionizing mental health care through symptom assessment and treatment suggestions, concerns about their validity, accuracy, ethical considerations, and risk management persist. This study evaluates the clinical reasoning capabilities of two leading AI language models in assessing a clinical case vignette of Major Depressive Disorder (MDD).

Objectives: To evaluate the diagnostic accuracy, risk assessment proficiency, and quality of treatment recommendations provided by ChatGPT and Claude when applied to a standardised clinical vignette of a case of MDD.

Methods: A clinical vignette describing a 50-year-old male patient exhibiting symptoms consistent with MDD was presented to both ChatGPT 4o and Claude 3.5 Sonnet. The patient had significant cardiac disease, leading to unemployment, social withdrawal, and passive suicidal ideation. Both AI models were asked five identical questions regarding: (1) diagnosis, (2) severity assessment, (3) first-line treatment recommendations, (4) optimal antidepressant selection, and (5) suicide risk evaluation. Two psychiatrists independently reviewed the responses for accuracy, comprehensiveness, and alignment with established guidelines and evidence-based treatment for depression with comorbid cardiac disease.

Results: Both AI models correctly diagnosed MDD and accurately recognized the severity of the case, given the presence of suicidal ideation and significant functional impairment. Both offered comprehensive treatment recommendations, including pharmacotherapy and psychotherapy, and specifically suggested Sertraline as the antidepressant of choice due to its favourable cardiac safety profile. Both models assessed the patient as having a moderate to high suicide risk and provided a reasonably thorough analysis of risk and protective factors. However, limitations were noted in their ability to fully incorporate individualized patient nuances and psychosocial factors.

Conclusions: ChatGPT 4o and Claude 3.5 Sonnet demonstrated significant capabilities in clinical reasoning, providing diagnoses and treatment recommendations that align with best clinical practices. Their responses were largely accurate and comprehensive, indicating potential utility as supportive tools for healthcare professionals. AI models may assist non-specialists in preliminary assessment and management but are not substitutes for professional psychiatric evaluation. Caution is advised in relying on AI for clinical decision-making, and further refinement is necessary to enhance their ability to integrate patient-centred care and adhere to ethical guidelines, mitigating the risks associated with self-diagnosis and inappropriate treatment.

Disclosure of Interest: None declared.

Topics

Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare