This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
A BLEU-Based Evaluation of ChatGPT's Chinese-to-English Translation
Citations: 0
Authors: 2
Year: 2025
Abstract
Political text translation poses distinct challenges: it requires precise ideological expression, cultural sensitivity, and terminological consistency, all of which extend beyond conventional linguistic accuracy. While ChatGPT demonstrates growing capabilities in machine translation, its performance on specialized political discourse remains underexplored. This study evaluates ChatGPT's Chinese-to-English translation quality on the 2023 Chinese Government Work Report, combining BLEU metrics with human assessment across three criteria: syntax and grammar, cultural and ideological accuracy, and fluency and coherence. Three experienced translators rated ChatGPT's translations on a 6-point scale, while BLEU scores provided automated evaluation. The results reveal a clear discrepancy: BLEU scores remained low (0.31 to 0.37), whereas human evaluation showed moderate performance with notable variation across criteria. ChatGPT achieved its highest scores in fluency and coherence (5.53 average) but struggled with cultural and ideological accuracy (4.43 average), particularly in preserving the precision of political terminology and contextual appropriateness. Critical issues include generic translations of politically specific terms and inadequate handling of culturally embedded expressions. The key finding is that BLEU alone is insufficient for assessing the quality of political text translation, owing to its single-reference constraint and its inability to capture ideological nuance; human evaluation is therefore necessary for meaningful assessment of specialized-domain translation. This research contributes to understanding AI translation capabilities in political discourse and provides evidence-based recommendations for developing more appropriate evaluation frameworks for specialized translation domains.
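To make the single-reference constraint concrete, the following is a minimal sketch of sentence-level BLEU (clipped n-gram precision up to 4-grams, uniform weights, brevity penalty, no smoothing). It is not the study's evaluation pipeline: the function names `ngrams` and `bleu`, the whitespace tokenization, and all example sentences are assumptions made for illustration, and the sentences are not taken from the Government Work Report.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform weights (illustrative only)."""
    cand = candidate.lower().split()  # naive whitespace tokenization (assumption)
    refs = [r.lower().split() for r in references]
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        total = sum(cand_counts.values())
        if total == 0:
            return 0.0
        # Clip each candidate n-gram by its maximum count in any single reference.
        max_ref = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        if clipped == 0:
            return 0.0  # without smoothing, one empty order zeroes the score
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty against the reference length closest to the candidate length.
    ref_len = min((len(r) for r in refs), key=lambda length: (abs(length - len(cand)), length))
    bp = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


# Hypothetical sentences, invented for this example.
reference_a = "the government will steadily advance reform and opening up"
reference_b = "the government will push forward reform and opening up in a steady manner"
candidate = "the government will steadily push forward reform and opening up"

single = bleu(candidate, [reference_a])
multi = bleu(candidate, [reference_a, reference_b])
print(f"single reference: {single:.2f}")
print(f"two references:   {multi:.2f}")
```

In this toy case the candidate is an acceptable rendering, yet it is penalized against the single reference for choosing "push forward" over "advance"; adding a second reference that happens to use the same wording raises the score. Neither score says anything about whether a political term was rendered with the right ideological precision, which is the gap the human assessment is meant to fill.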