OpenAlex · Updated hourly · Last updated: 18 March 2026, 02:00

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

GPT-3.5 for Data Augmentation in Automatic Essay Scoring: A Preliminary Analysis

2025 · 0 citations · 7 authors · Open Access

Abstract

Machine learning models are sensitive to the dataset used during their training. Dealing with limited or imbalanced datasets is challenging, and a commonly adopted approach to mitigate this limitation is data augmentation. For example, expanding the training set in a computer vision problem may involve rotating and resizing images; however, this task is more complex when dealing with textual data. This work investigates the use of GPT-3.5 for data augmentation in a dataset of argumentative essays from the National High School Exam (ENEM), which is used as a selection criterion for entry into public universities in Brazil. More specifically, we adopted traditional Natural Language Processing (NLP) techniques for essay scoring and compared the results with and without data augmentation. Our results show that the long argumentative essays generated by GPT in the data augmentation process did not improve the performance of NLP models. Moreover, GPT could not adequately classify its own synthetic data, suggesting poor quality of the generated data, and did not outperform NLP models in classifying real data.

Similar works