OpenAlex · Updated hourly · Last updated: 16.05.2026, 00:58

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Integrating transformers to outperform classical and deep learning in ChatGPT-in-education sentiment analysis with an explainable web app

2026 · 0 citations · Discover Artificial Intelligence · Open Access
Open full text at publisher

Citations: 0 · Authors: 4 · Year: 2026

Abstract

Artificial intelligence (AI) is reshaping education by enabling personalized learning and increasing student engagement. The rapid adoption of tools such as ChatGPT, however, raises questions about efficacy, academic integrity, and ethics. Previous research on public sentiment toward ChatGPT in education has relied mainly on small-scale surveys or static datasets, often without modern natural language processing (NLP) techniques. This study fills that gap with a systematic comparative evaluation of classical machine learning, deep learning, and transformer-based models for sentiment analysis of ChatGPT-related educational discourse, identifying the best-performing approach and deploying it in an explainable, user-friendly web application for non-technical stakeholders. A corpus of 236,275 tweets related to ChatGPT in education was collected and benchmarked systematically. Neutral tweets were removed, yielding a binary classification task with positive and negative labels, and the data was divided into an 80/20 train/test split. A Naive Bayes model with TF-IDF vectorization served as the classical baseline; the deep learning tier comprised Long Short-Term Memory (LSTM) and Bidirectional LSTM (Bi-LSTM) models with Word2Vec embeddings; and the transformer tier was represented by a fine-tuned DistilBERT model. The models were evaluated on accuracy, precision, recall, and F1-score. The fine-tuned DistilBERT achieved the highest accuracy at 98.81%, outperforming the LSTM (95.40%), the Bi-LSTM (94.66%), and the Naive Bayes baseline (81.77%), indicating the strength of transformer-based models over classical machine learning and deep learning approaches on this dataset.
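The classical baseline described in the abstract (TF-IDF features feeding a Naive Bayes classifier, with an 80/20 train/test split) can be sketched as below. This is a minimal illustration assuming scikit-learn; the tweet texts and labels are toy placeholders, not the paper's 236,275-tweet dataset, and the hyperparameters are illustrative choices, not the authors'.

```python
# Sketch of the classical tier: TF-IDF vectorization + Multinomial Naive
# Bayes on a binary (positive/negative) sentiment task, 80/20 split.
# Data below is a toy placeholder standing in for the tweet corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score

texts = [
    "ChatGPT helps me study faster",         # positive
    "great tool for personalized learning",  # positive
    "love using chatgpt for homework help",  # positive
    "chatgpt makes lectures more engaging",  # positive
    "chatgpt encourages cheating in exams",  # negative
    "worried about plagiarism and chatgpt",  # negative
    "chatgpt answers are often wrong",       # negative
    "banning chatgpt from our classroom",    # negative
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = positive, 0 = negative

# 80/20 split, stratified so both classes appear in the test set.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42
)

vectorizer = TfidfVectorizer(lowercase=True, ngram_range=(1, 2))
clf = MultinomialNB()
clf.fit(vectorizer.fit_transform(X_train), y_train)

preds = clf.predict(vectorizer.transform(X_test))
accuracy = accuracy_score(y_test, preds)
```

On the real corpus, the same pipeline would simply be fitted on the 80% training partition of the labeled tweets; precision, recall, and F1 can be obtained the same way via `sklearn.metrics.classification_report`.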
A Streamlit-based web application was developed that integrates LIME-based explainability to make sentiment predictions accessible to educators and policymakers. This study is limited to English-language Twitter data and relies on a single domain and a single train/test split, so the results may not fully generalize beyond this context. While these findings are promising, broader generalization warrants further investigation. Future research should explore multilingual datasets, additional social media platforms, and more advanced transformer architectures to enhance generalizability and expand applicability in diverse educational contexts.
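The explainability layer surfaces per-word attributions for each prediction. LIME does this by fitting a local surrogate model on perturbed copies of the input; the sketch below approximates the same idea with a simpler leave-one-word-out occlusion score (how much the positive-class probability drops when a word is removed). This is a named stand-in for the paper's actual LIME integration, and the model and data are toy placeholders.

```python
# LIME-style word attributions via occlusion: score each word by the drop
# in P(positive) when that word is removed from the input. The underlying
# classifier here is a toy TF-IDF + Naive Bayes pipeline, not the paper's
# fine-tuned DistilBERT.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = [
    "ChatGPT helps me study faster",
    "great tool for personalized learning",
    "love using chatgpt for homework help",
    "chatgpt makes lectures more engaging",
    "chatgpt encourages cheating in exams",
    "worried about plagiarism and chatgpt",
    "chatgpt answers are often wrong",
    "banning chatgpt from our classroom",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = positive, 0 = negative

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(texts, labels)

def word_attributions(text):
    """Score each word by the drop in P(positive) when it is occluded."""
    base = model.predict_proba([text])[0][1]  # P(positive) for full text
    words = text.split()
    scores = []
    for i, word in enumerate(words):
        occluded = " ".join(words[:i] + words[i + 1:])
        scores.append((word, base - model.predict_proba([occluded])[0][1]))
    # Most influential words (by absolute probability change) first.
    return sorted(scores, key=lambda s: abs(s[1]), reverse=True)

attributions = word_attributions("chatgpt helps with homework")
```

In a Streamlit front end, a function like this (or `lime.lime_text.LimeTextExplainer` against the deployed model's probability function) would feed a highlighted-text view so non-technical users can see which words drove a prediction.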


Topics

Artificial Intelligence in Healthcare and Education · Social Media in Health Education · Hate Speech and Cyberbullying Detection