This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Corpus-Based Evaluation Models for Quality Assurance of AI-Generated ESL Learning Materials
0 citations · 2 authors · 2022
Abstract
This study addresses the problem that AI-generated ESL learning materials can appear fluent yet vary in accuracy, level appropriateness, and coherence, weakening quality assurance for large-scale cloud and enterprise deployment. The purpose was to develop and validate a corpus-based evaluation model that links corpus indicators to stakeholder quality judgments. Using a quantitative cross-sectional, case-based design, N = 120 evaluators assessed 80 AI-generated texts across four categories (reading passages, dialogues, grammar explanations, and practice prompts) using a five-point Likert instrument. Key dependent variables were overall QA and subscales for accuracy, clarity, coherence, level appropriateness, and pedagogical usefulness; key independent variables were readability control index, lexical appropriacy score, cohesion score, lexical diversity (HD-D), and grammar error rate (errors per 100 words). Analyses used descriptive statistics, Cronbach’s alpha, Pearson correlations, and multiple regression with text-type stability checks. Overall perceived quality was acceptable (overall QA M = 3.84, SD = 0.53), with clarity highest (M = 3.96) and accuracy lowest (M = 3.72). Reliability was strong (overall α = .91). Corpus-to-human alignment was substantial: readability control correlated with level appropriateness (r = .61), cohesion with coherence (r = .58), lexical appropriacy with clarity (r = .52) and usefulness (r = .49), and grammar error rate with accuracy (r = −.67), all p < .001. A five-predictor regression model predicted overall QA (F(5, 74) = 21.64, p < .001; R² = .59; Adj. R² = .56), with grammar error rate the strongest predictor (β = −.41), followed by readability (β = .29), cohesion (β = .24), and lexical appropriacy (β = .21); performance remained stable across text types (R² = .52–.61).
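The reported regression amounts to a standardized linear predictor. A minimal sketch in Python, assuming z-scored indicators and using only the four beta weights reported in the abstract (the fifth predictor, lexical diversity, has no reported coefficient and is therefore omitted here):

```python
def predicted_qa_z(grammar_error_rate_z, readability_z, cohesion_z,
                   lexical_appropriacy_z):
    """Standardized prediction of overall QA from the reported beta weights.

    Inputs are z-scored corpus indicators; the return value is a z-score on
    the overall QA scale. Lexical diversity (HD-D) was in the five-predictor
    model, but its beta was not reported, so it is left out of this sketch.
    """
    return (-0.41 * grammar_error_rate_z   # strongest predictor, negative
            + 0.29 * readability_z
            + 0.24 * cohesion_z
            + 0.21 * lexical_appropriacy_z)
```

Read this way, a text at the sample mean on every indicator gets a predicted QA z-score of 0, and each additional standard deviation of grammar error rate lowers predicted quality by 0.41 SD.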
Implications are that organizations can operationalize QA as automated gates for error density, readability bands, cohesion thresholds, and vocabulary profile alignment, then reserve human review for borderline cases to improve safety, consistency, and turnaround time in enterprise content workflows. Average indicators were overall readability 0.64, lexical appropriacy 0.71, cohesion 0.59, lexical diversity 0.82, and grammar error rate 2.40 per 100 words.
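The automated-gate idea in the implications can be sketched as simple threshold checks that route borderline texts to human review. The thresholds below are illustrative placeholders anchored loosely to the reported sample averages (readability 0.64, lexical appropriacy 0.71, cohesion 0.59, grammar error rate 2.40 per 100 words); the study does not prescribe cutoff values:

```python
def qa_gate(readability, lexical_appropriacy, cohesion, errors_per_100_words):
    """Route a generated text: 'pass', 'human_review', or 'reject'.

    All thresholds are illustrative, not values from the study.
    Returns the decision and the list of tripped gates.
    """
    flags = []
    if errors_per_100_words > 3.0:          # error-density gate
        flags.append("error_density")
    if not (0.5 <= readability <= 0.8):     # readability band
        flags.append("readability_band")
    if cohesion < 0.5:                      # cohesion threshold
        flags.append("cohesion")
    if lexical_appropriacy < 0.6:           # vocabulary profile alignment
        flags.append("lexical_appropriacy")

    if not flags:
        return "pass", flags
    if len(flags) == 1:                     # borderline: one gate tripped
        return "human_review", flags
    return "reject", flags
```

With the sample-average indicators, a text passes all four gates; tripping exactly one gate sends it to human review, which is the "reserve human review for borderline cases" workflow the abstract describes.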
Similar Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,200 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,051 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,416 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,410 citations