This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Comparative Evaluation of ChatGPT, Gemini, and DeepSeek in Educational Problem Solving
Citations: 0
Authors: 4
Year: 2025
Abstract
This study compares the performance of three large language models (ChatGPT, Gemini, and DeepSeek) on a set of programming-related educational problems from the Aizu Online Judge (AOJ) platform. The evaluation focuses on problem-solving accuracy and code characteristics, with additional comparisons to human Java submissions to contextualize model performance. Metrics include CPU time, memory usage, and code size, enabling a detailed analysis of solution quality and efficiency. Results indicate that ChatGPT consistently achieves the most efficient solutions while maintaining high accuracy, often matching the fastest human submissions. Gemini and DeepSeek also demonstrate strong accuracy but tend to produce less optimized code in computationally demanding cases. These findings contribute to understanding how current LLMs can address structured problem-solving tasks within educational environments.
Related Work
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,436 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,311 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,753 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,523 citations