This is an overview page with metadata for this scientific work. The full article is available from the publisher.

49-PUB: Diagnosis of Polycystic Ovary Syndrome with General-Purpose Online Large Language Models—A Prospective Evaluation Study of ChatGPT-3.5, ChatGPT-4o, Llama-3, and Claude-3.5

2025 · 0 citations · Diabetes

Abstract

Introduction and Objective: To assess the trustworthiness and precision of general-purpose online large language models (LLMs) in providing polycystic ovary syndrome (PCOS) diagnosis-related information. Methods: The study recruited 125 consecutive women with suspected PCOS at the PCOS subspecialty clinic of Shanghai Tenth People's Hospital between October 2023 and March 2024; 50 women with PCOS and 30 women without PCOS were ultimately included. Four popular LLMs (OpenAI's ChatGPT-3.5 and ChatGPT-4o, Meta's Llama-3, and Anthropic's Claude-3.5) were prompted to diagnose PCOS or no PCOS based on de-identified medical records. Two professional physicians derived the binary diagnosis from each LLM's description. Results: Among the 80 included patients with suspected PCOS, all LLMs could provide a diagnostic impression of PCOS from the textual prompts, and all achieved 100% sensitivity and negative predictive value (NPV). Claude-3.5 demonstrated the best performance, with an accuracy of 93.8%, followed by Llama-3 at 85.0%, ChatGPT-4o at 78.7%, and ChatGPT-3.5 at 70.0% (P for trend < 0.0001). Specificity and positive predictive value (PPV) followed similar trends, with Claude-3.5 performing best and ChatGPT-3.5 worst. Conclusion: Our findings highlight the potential of LLMs, especially Claude-3.5, as the groundwork for a new generation of artificial intelligence tools that can aid clinical physicians in PCOS diagnosis. Disclosure: Y. Zhang: None. H. Hui: None. P. Li: None. X. Shao: None. M. Cai: None. D. Dilimulati: None. H. Chen: None. S. Qu: None. M. Zhang: None. H. Ji: None. W. Song: None.
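The reported metrics all follow from a 2×2 confusion matrix, and the short Python sketch below shows the arithmetic. The abstract does not publish per-model counts, so the true-negative counts here are inferred assumptions: with 50 PCOS and 30 non-PCOS cases, 100% sensitivity implies 50 true positives and 0 false negatives, and each reported accuracy then fixes the number of true negatives. The names `diagnostic_metrics` and `INFERRED_TN` are illustrative, not taken from the study.

```python
from fractions import Fraction

N_PCOS, N_NO_PCOS = 50, 30  # cohort sizes stated in the abstract


def diagnostic_metrics(tp, fp, tn, fn):
    """Return standard binary diagnostic metrics as exact fractions."""
    return {
        "sensitivity": Fraction(tp, tp + fn),
        "specificity": Fraction(tn, tn + fp),
        "ppv": Fraction(tp, tp + fp),
        "npv": Fraction(tn, tn + fn),
        "accuracy": Fraction(tp + tn, tp + fp + tn + fn),
    }


# ASSUMPTION: true-negative counts inferred from the reported accuracies
# (e.g. 93.8% of 80 patients is 75 correct, so 75 - 50 = 25 true negatives).
# These counts are not given in the abstract.
INFERRED_TN = {
    "Claude-3.5": 25,
    "Llama-3": 18,
    "ChatGPT-4o": 13,
    "ChatGPT-3.5": 6,
}

results = {
    model: diagnostic_metrics(tp=N_PCOS, fp=N_NO_PCOS - tn, tn=tn, fn=0)
    for model, tn in INFERRED_TN.items()
}

for model, metrics in results.items():
    summary = ", ".join(f"{name}={float(value):.3f}" for name, value in metrics.items())
    print(f"{model}: {summary}")
```

Under these inferred counts, the accuracies are 75/80, 68/80, 63/80, and 56/80, which match the reported 93.8%, 85.0%, 78.7%, and 70.0% up to rounding. Because no model missed a PCOS case, the differences between models come entirely from false positives, which is why specificity and PPV follow the same ranking as accuracy.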

Topics

Radiomics and Machine Learning in Medical Imaging; Artificial Intelligence in Healthcare and Education
