OpenAlex · Updated hourly · Last updated: 16.03.2026, 20:39

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Toward Inclusive AI: Building a Bangla Chatbot Using GPT-Neo and Optimized BPE Tokenizer

2025 · 0 citations · 2 authors

Abstract

This study leverages the GPT-Neo architecture to develop an interactive Bangla chatbot, an advancement in the fields of AI and natural language processing (NLP). It highlights the significance of providing NLP tools for Bangla, which is spoken by over 230 million people but remains underrepresented in computational linguistics, and employs GPT-Neo, a GPT-2-like architecture, for its superior language understanding and generation capabilities. The research involves curating a custom dataset from various Bangla sources to reflect the language’s complexities, followed by rigorous pretraining and then fine-tuning on a conversational dataset to enhance the chatbot’s ability to produce contextually coherent and fluent dialogues. A notable contribution is an advanced Byte Pair Encoding (BPE) tokenizer optimized for Bangla, which significantly improves the model’s efficiency in language processing. The paper details the methodology for dataset preparation, model training, and tokenizer development, positioning the chatbot as a pivotal tool for Bangla language processing and setting the stage for further efforts to make AI more inclusive of linguistically diverse communities. Finally, it discusses potential applications, ethical considerations, and directions for future research.
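The abstract does not say how the Bangla tokenizer was built or optimized. Purely as an illustration of the general Byte Pair Encoding idea it names, the following is a minimal sketch in plain Python: it repeatedly merges the most frequent adjacent symbol pair in a toy word list. The function names (`learn_bpe_merges`, `merge_pair`, `encode`), the sample words, and the choice to start from Unicode code points are all assumptions made for this example, not details taken from the paper.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace each occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy version)."""
    # Each distinct word becomes a tuple of code points, weighted by frequency.
    # Assumption: a real Bangla tokenizer might start from bytes or grapheme
    # clusters instead; the source does not say which.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across the whole corpus.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        # Pick the most frequent pair; ties go to the lexicographically larger pair.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def encode(word, merges):
    """Segment `word` by applying the learned merges in the order they were learned."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    corpus = ["বাংলা", "বাংলা", "বাংলাদেশ"]  # hypothetical toy corpus
    rules = learn_bpe_merges(corpus, num_merges=3)
    print(encode("বাংলাদেশ", rules))
```

Joining the pieces returned by `encode` always reproduces the input word, since merging only concatenates adjacent symbols. A production tokenizer would add details this sketch omits, such as pre-tokenization, special tokens, and handling of unseen characters.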

Topics

AI in Service Interactions
Artificial Intelligence in Healthcare and Education
Digital Mental Health Interventions