This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Comparative Analysis of ChatGPT, GPT-4, and Microsoft Copilot Chatbots for GRE Test
Citations: 7
Authors: 4
Year: 2024
Abstract
This paper analyzes how well three artificial intelligence chatbots (Copilot, ChatGPT, and GPT-4) perform when answering questions from standardized tests, mainly the Graduate Record Examination (GRE). A total of 137 questions covering different forms of quantitative reasoning and 157 questions in verbal categories were used to assess the chatbots' capabilities. The paper reports the performance of each chatbot across the various skills and question styles tested in the exam. It also explores the chatbots' proficiency in answering image-based questions and illustrates the uncertainty level of each chatbot. The results show varying degrees of success among the chatbots. ChatGPT primarily makes arithmetic errors, whereas the largest share of errors made by Copilot and GPT-4 is conceptual. GPT-4 nevertheless exhibited the highest success rates, particularly in tasks involving complex language understanding and image-based questions. The results highlight the ability of these chatbots to help examinees pass the GRE with a high score, which encourages their use in test preparation. The results also show the importance of preventing access to similar chatbots when tests are conducted online, as they were during the COVID-19 pandemic, to ensure a fair environment for all test takers competing for higher education opportunities.