This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Investigation of Studies on ChatGPT's Ability to Answer Anatomy Questions: A Self-Evaluation by ChatGPT and Comparison with an Evaluation by Gemini
Citations: 1 · Authors: 5 · Year: 2025
Abstract
INTRODUCTION: There is controversy about ChatGPT's (OpenAI, San Francisco, CA) potential to answer anatomy questions and play a significant role in anatomy education. We aimed to assess ChatGPT's ability to locate and summarize literature about its performance in answering anatomy questions. We also aimed to explore Gemini's (Google, Mountain View, CA) ability to perform this task.

METHODS: We asked ChatGPT and Gemini to list and summarize five studies: 1) about ChatGPT's ability to answer anatomy questions, 2) comparing ChatGPT with another artificial intelligence (AI) tool in terms of answering anatomy questions, and 3) showing that ChatGPT answered anatomy questions with a success rate lower than 70%. For each query, we measured how many studies were 1) correctly identified and 2) accurately summarized, and we assessed the presence of any bias.

RESULTS: ChatGPT performed excellently in the first query (100%/100%), poorly in the second (20%/20%), and moderately in the third (60%/40%). It conducted a strict self-evaluation and did not hallucinate. Gemini's performance was 100%/40%, 60%/40%, and 80%/60% in the three queries, respectively. It was inflexible in finding a variety of publications, and it often hallucinated and exhibited bias against ChatGPT.

CONCLUSIONS: ChatGPT and Gemini generated reliable responses only when they were simply asked to detect studies about ChatGPT's ability to answer anatomy questions. They particularly struggled to locate and outline comparative studies. While Gemini performed generally better, it hallucinated and exhibited bias against ChatGPT. Ongoing evolution may enhance ChatGPT's and Gemini's potential to contribute to anatomical research.
Similar Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,560 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,451 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,948 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,797 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations