This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Exploring the Potential of ChatGPT for Evaluating English Essays in a Criterion‐Based Assessment
Citations: 3
Authors: 3
Year: 2025
Abstract
Open access to novel AI tools offers unprecedented opportunities for human–AI collaboration in writing instruction and assessment. While research on using generative AI tools like ChatGPT in these contexts is emerging, more is needed to understand their effectiveness as Automated Writing Evaluation (AWE) tools. This study explores the potential of ChatGPT (GPT‐3.5) to assist teachers and learners in the North Atlantic Treaty Organization (NATO) by evaluating English writing based on holistic scoring criteria. Using a mixed‐methods approach, the study compared ChatGPT's ratings with human ratings on 100 writing tests to assess inter‐rater reliability. It also analyzed the justifications provided by both human raters and ChatGPT to evaluate how well ChatGPT understood the rating criteria at different proficiency levels and whether its rationales could provide effective feedback for learners and support teacher feedback practices. Results showed strong agreement between ChatGPT's and human ratings, with ChatGPT demonstrating a similar understanding of the rating scales and offering justifications with elements of effective feedback. These findings indicate that ChatGPT holds promise as an AWE tool, providing meaningful feedback and valuable insights into holistic rating scales. This study encourages further exploration of AI in the L2 classroom and suggests leveraging AI to enhance writing pedagogy and classroom‐based assessment.
Similar works
BLEU
2001 · 20,987 citations
Evaluating the Effectiveness of Large Language Models in Representing Textual Descriptions of Geometry and Spatial Relations (Short Paper)
2023 · 14,120 citations
Enriching Word Vectors with Subword Information
2017 · 9,610 citations
A unified architecture for natural language processing
2008 · 5,178 citations
A new readability yardstick.
1948 · 5,077 citations