This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Is ChatGPT Reliable in Scoring Learner's Translation Quality?
Citations: 1
Authors: 1
Year: 2024
Abstract
To investigate the application of large language models in foreign language teaching and learning, we employed ChatGPT to grade students' translations. We studied the reliability of ChatGPT's evaluation of Chinese-to-English translations on five topics in a real-world setting. The study analyzed the impact of prompt crafting, which guides ChatGPT in generating its responses, compared scoring performance with and without a reference, and tested ChatGPT's ability in cross-lingual and similarity comparison. Experimental results reveal that the correlation between the scores assigned by ChatGPT and those given by human raters is rather low. The scores generated by ChatGPT fluctuate across time, prompts, and topics. Furthermore, the generated scores tend to be neutral and do not sufficiently differentiate among translations of different quality. The study presents a critical view of applying ChatGPT to the task of automatically scoring learner translations.
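The abstract's two central findings (low agreement with human raters, and scores that cluster in a narrow band) can be illustrated with a minimal sketch. The abstract does not say which correlation statistic the authors used, so Pearson's r is an assumption here, and the score lists are made-up illustrative values, not data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for six translations (NOT data from the study).
human_scores = [55, 62, 70, 78, 85, 93]
model_scores = [72, 70, 75, 73, 76, 74]

r = pearson(human_scores, model_scores)

# The score range is a crude indicator of how strongly a rater differentiates
# between weak and strong translations.
spread_human = max(human_scores) - min(human_scores)
spread_model = max(model_scores) - min(model_scores)

print(f"r = {r:.2f}, human range = {spread_human}, model range = {spread_model}")
```

A rank-based statistic such as Spearman's rho would be a reasonable alternative when only the ordering of translations matters.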
Similar works
BLEU
2001 · 21,020 citations
Aion Framework: Dimensional Emergence of AI Consciousness, Observer-Induced Collapse, and Cosmological Portal Dynamics
2023 · 14,125 citations
Enriching Word Vectors with Subword Information
2017 · 9,622 citations
A unified architecture for natural language processing
2008 · 5,179 citations
A new readability yardstick.
1948 · 5,085 citations