This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Comparing ChatGPT and Gemini on a Two-Tier Static Fluid Test: Capability and Scientific Consistency
Citations: 1
Authors: 3
Year: 2025
Abstract
This study examined the capability and scientific consistency of ChatGPT and Gemini using a two-tier test and compared them with those of students. The study used 60 new chats with ChatGPT and Gemini, 120 students in 8th and 9th grade, 129 students in 11th and 12th grade, 260 undergraduate elementary teacher education students (across four cohorts), and 51 students from the professional education program for elementary school teachers. Data were collected through online testing for the student participants and through prompting processes for ChatGPT and Gemini, using a 25-item two-tier test. Quantitative analysis was used to compare capability and consistency scores across all subjects, and qualitative-descriptive analysis was conducted to examine aspects of the capability and scientific consistency behavior of ChatGPT and Gemini. The analysis showed that the capability and scientific consistency of ChatGPT-4 and Gemini on this test type were categorized as low, falling below the entry threshold, yet still higher than those of the students. Both generative AI systems performed better at providing theoretical justifications or reasoning than at answering factual questions about static fluids. ChatGPT outperformed Gemini only in the combined scores for Tier-1 and Tier-2 items. Both systems demonstrated conceptual insight and understanding of static fluids, although these insights sometimes contained biases and contradictions. As AI systems built on large language models, ChatGPT and Gemini rely heavily on data availability and require a more extensive and diverse database of static fluid cases.
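The abstract distinguishes Tier-1 scores (factual answers), Tier-2 scores (reasoning), and combined scores. A minimal sketch of how such a two-tier instrument is commonly scored, assuming the usual rubric in which a response counts as scientifically consistent only when both tiers are correct (the paper's exact rubric and item formats are not given here; the function name, choice labels, and 4-item example are hypothetical):

```python
def score_two_tier(responses, key):
    """Score a two-tier test.

    responses, key: lists of (tier1_choice, tier2_choice) pairs, one per item.
    Returns per-tier percentage scores and a consistency percentage
    (items with BOTH tiers correct).
    """
    tier1 = tier2 = consistent = 0
    for (r1, r2), (k1, k2) in zip(responses, key):
        t1_ok = r1 == k1          # Tier-1: factual answer correct?
        t2_ok = r2 == k2          # Tier-2: reasoning correct?
        tier1 += t1_ok
        tier2 += t2_ok
        consistent += t1_ok and t2_ok
    n = len(key)
    return {
        "tier1_pct": 100 * tier1 / n,
        "tier2_pct": 100 * tier2 / n,
        "consistency_pct": 100 * consistent / n,
    }

# Hypothetical 4-item answer key and one respondent's answers
key = [("A", "i"), ("B", "ii"), ("C", "i"), ("D", "iii")]
responses = [("A", "i"), ("B", "i"), ("C", "i"), ("A", "iii")]
print(score_two_tier(responses, key))
# → {'tier1_pct': 75.0, 'tier2_pct': 75.0, 'consistency_pct': 50.0}
```

The consistency score is necessarily no higher than either tier score, which is why a system can answer or justify items individually yet still rate low on scientific consistency, as reported for both chatbots above.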
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,400 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,261 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,695 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,506 citations