This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

GPT-3.5 for Data Augmentation in Automatic Essay Scoring: A Preliminary Analysis

2025 · 0 citations · Open Access

Abstract

Machine learning models are sensitive to the datasets used to train them. Dealing with limited or imbalanced datasets is challenging, and a commonly adopted approach to mitigate this limitation is data augmentation. For example, expanding the training set in a computer vision problem may involve rotating and resizing images; however, this task is more complex when dealing with textual data. This work investigates the use of GPT-3.5 for data augmentation in a dataset of argumentative essays from the National High School Exam (ENEM), which is used as a selection criterion for entry into public universities in Brazil. More specifically, we adopted traditional Natural Language Processing (NLP) techniques for essay scoring and compared the results with and without data augmentation. Our results show that the long argumentative essays generated by GPT in the data augmentation process did not improve the performance of the NLP models. Moreover, GPT could not adequately classify its own synthetic data, suggesting that the generated data is of poor quality, and it did not outperform the NLP models in classifying real data.

Topics

Artificial Intelligence in Healthcare and Education · Computational and Text Analysis Methods · Online Learning and Analytics
