This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Outstanding performance of ChatGPT on the obstetrics and gynecology board certification examination in Japan: Document and image-based questions analysis
Citations: 1
Authors: 4
Year: 2024
Abstract
ChatGPT is an artificial intelligence (AI) language model available online, trained on vast amounts of text to excel in natural language processing. Newer versions, such as ChatGPT-4, can also interpret images and files, extending their usefulness across various fields.1, 2 Regarding whether ChatGPT can answer medical examination questions correctly, previous research has shown that it can achieve passing scores across various medical fields.3 However, its performance in obstetrics and gynecology remains unclear, and it is still unknown how ChatGPT performs on image-based questions that require the interpretation of imaging tests and physical findings. Herein, we aimed to investigate ChatGPT's performance on the obstetrics and gynecology board certification examination conducted in Japan, focusing on both document-based and image-based questions.

The Japan Society of Obstetrics and Gynecology conducts the obstetrics and gynecology board certification examination in Japan annually. Eligibility is granted to those who have obtained a medical license, completed 2 years of junior residency, and undergone at least 3 years of training as obstetrician-gynecologists at a designated training facility. The examination covers four fields: perinatology, gynecologic oncology, reproductive endocrinology, and women's healthcare, and consists of approximately 120 multiple-choice questions. It includes not only document-based questions but also image-based questions that must be answered on the basis of ultrasound, magnetic resonance imaging, computed tomography, pathological images, cardiotocogram evaluation, and clinical photographs. For multiple-choice questions, an answer was considered incorrect unless all selected choices were correct.

ChatGPT-4 is an advanced version of OpenAI's conversational AI, offering improved language understanding, image interpretation, and generation capabilities; it provides more accurate, context-aware responses than its predecessors. On the board certification examinations of the past 3 years, ChatGPT-4 performed strongly, with accuracy rates of 70.2%, 64.8%, 66.7%, and 77.3% in perinatology, gynecologic oncology, reproductive endocrinology, and women's healthcare, respectively, although the accuracy rates in the actual examinations have not been disclosed. Additionally, ChatGPT-4's accuracy on image-based questions did not differ significantly from its accuracy on document-based questions (Table 1). These results suggest that ChatGPT has promising capabilities for accurately answering both document-based and image-based clinical questions in obstetrics and gynecology. Further evaluation of whether these models can make accurate judgments in real-world clinical scenarios is essential; continued advancement of AI models could further enhance their value in medical settings.

T.N. contributed to the conception, design, acquisition, analysis, and interpretation of data and wrote the manuscript. R.Y. contributed to the data analysis and interpretation and provided manuscript guidance. A.S. and A.O. contributed to the data interpretation, provided manuscript guidance, and supervised the research. All authors have reviewed and approved the final manuscript and agree to be accountable for all aspects of the study, ensuring its accuracy and integrity. The authors declare no conflicts of interest for this article. Data are available on request from the authors.
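The scoring rule described above (an answer counts as correct only if every selected choice is correct) can be sketched as follows. This is a minimal illustrative sketch, not the authors' analysis code; the function names and the example data are hypothetical.

```python
def is_correct(selected: set, answer_key: set) -> bool:
    """All-or-nothing rule: the selected choices must exactly match the key,
    with no required choice missing and nothing extra selected."""
    return selected == answer_key

def accuracy(responses) -> float:
    """Fraction of questions scored correct under the all-or-nothing rule.
    `responses` is a list of (selected_choices, answer_key) pairs."""
    correct = sum(is_correct(sel, key) for sel, key in responses)
    return correct / len(responses)

# Hypothetical example: three questions, one requiring two choices.
responses = [
    ({"a"}, {"a"}),            # exact match -> correct
    ({"b", "c"}, {"b", "c"}),  # both required choices -> correct
    ({"b"}, {"b", "c"}),       # partial selection -> incorrect
]
print(round(accuracy(responses), 2))  # -> 0.67
```

Under this rule, partially correct selections on multi-answer questions contribute nothing to the accuracy rate, which makes the reported per-field percentages a strict measure of performance.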