This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Using GPT‐4 to Augment Imbalanced Data for Automatic Scoring
Citations: 1
Authors: 3
Year: 2025
Abstract
Machine learning‐based automatic scoring faces challenges when student responses are imbalanced across scoring categories. To address this, we introduce a novel text data augmentation framework, specifically tailored for imbalanced datasets in automatic scoring, that leverages GPT‐4, a generative large language model. Our experimental dataset consisted of student‐written responses to four science items. We crafted prompts for GPT‐4 to generate responses, especially for minority scoring classes, to enhance the dataset. We then fine‐tuned DistilBERT for automatic scoring on both the augmented and the original datasets. Model performance was assessed using accuracy, precision, recall, and F1 metrics. Our findings revealed that incorporating GPT‐4‐augmented data significantly improved model performance, particularly in terms of precision and F1 scores. The extent of improvement varied with the specific dataset and the proportion of augmented data used. Notably, the amount of augmented data required to achieve stable improvement in automatic scoring varied between 20% and 40%. Comparisons with models trained on additional student‐written responses suggest that GPT‐4‐augmented models perform on par with those trained on student data. This research highlights the potential and effectiveness of data augmentation with generative large language models such as GPT‐4 for addressing imbalanced datasets in automatic assessment.
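The workflow described in the abstract (top up minority scoring classes with generated responses, then compare models on accuracy, precision, recall, and F1) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration, not the authors' implementation: the helper names `augmentation_plan` and `macro_scores` are hypothetical, the augmentation proportion is interpreted as a share of the original dataset size, the budget is split in proportion to each class's shortfall, and the metrics are macro-averaged. The abstract does not specify any of these choices.

```python
from collections import Counter


def augmentation_plan(labels, ratio):
    """Hypothetical helper: number of synthetic responses to request per class.

    Assumption: `ratio` is the augmented share relative to the original
    dataset size (e.g. 0.4 for 40%). The budget is split across classes in
    proportion to how far each class falls short of the largest class,
    rounded down.
    """
    counts = Counter(labels)
    largest = max(counts.values())
    deficits = {c: largest - n for c, n in counts.items()}
    total_deficit = sum(deficits.values())
    budget = round(ratio * len(labels))
    if total_deficit == 0:  # already balanced, nothing to generate
        return {c: 0 for c in counts}
    return {c: budget * d // total_deficit for c, d in deficits.items()}


def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 (assumed averaging)."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(classes)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Toy score distribution: class 0 dominates, classes 1 and 2 are minorities.
    toy_labels = [0] * 6 + [1] * 3 + [2] * 1
    print(augmentation_plan(toy_labels, ratio=0.4))
    print(macro_scores([0, 0, 1, 1], [0, 1, 1, 1]))
```

In this sketch, the per-class counts would determine how many GPT‐4 prompts to issue for each minority score level, and `macro_scores` would be applied to held-out predictions from models fine-tuned with and without the generated responses.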