This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Detecting Human vs AI-Generated Text in Urdu: A Comparative Study of Deep Learning and Transformer Models
Citations: 0 · Authors: 2 · Year: 2025
Abstract
The emergence of large language models (LLMs) such as GPT-4 has blurred the line between human- and machine-generated writing, raising concerns about misinformation, academic integrity, and content authenticity. This challenge is particularly acute for Urdu, a low-resource language with limited datasets and NLP tools. To address this gap, we employed the Urdu Human and AI Text (UHAT) dataset and conducted a comparative study of deep learning and transformer-based models for binary classification. Conventional sequence models such as RNN, LSTM, and GRU with Word2Vec and FastText embeddings were benchmarked against pretrained transformers including BERT, mBERT, UrduBERT, and DistilBERT. Experimental results show that transformer models significantly outperform recurrent architectures. Among them, multilingual BERT (mBERT) achieved the best performance with 91.67% accuracy and F1-score, surpassing UrduBERT and previous benchmarks such as XLM-RoBERTa on the HLU corpus. These findings establish strong baselines for Urdu AI-text detection and underscore the potential of multilingual transformers in low-resource settings.
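The abstract reports accuracy and F1-score as the headline metrics for the binary human-vs-AI classification task. A minimal pure-Python sketch of how these two metrics are computed is shown below; the labels are hypothetical toy data, not drawn from the UHAT dataset.

```python
# Illustrative sketch of the evaluation metrics named in the abstract
# (accuracy and F1-score) for binary human-vs-AI text classification.
# The label lists below are hypothetical toy examples, not UHAT data.

def binary_metrics(y_true, y_pred):
    """Return (accuracy, f1) for binary labels; 1 = AI-generated (assumed convention)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1

acc, f1 = binary_metrics([1, 0, 1, 1, 0, 0, 1, 0], [1, 0, 1, 0, 0, 0, 1, 1])
print(f"accuracy={acc:.2f}, f1={f1:.2f}")  # accuracy=0.75, f1=0.75
```

When classes are balanced, as in a paired human/AI corpus, accuracy and F1 tend to agree closely, which is consistent with the abstract reporting the same 91.67% figure for both.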
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,250 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,109 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,482 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,434 citations