This is an overview page with metadata for this scholarly article. The full article is available from the publisher.
Natural Language Processing and Generative AI in the Automated Scoring and Feedback of Reflective Writing in Medical Education: A Validity and Fairness Analysis
Citations: 0
Authors: 4
Year: 2025
Abstract
The study explored the application of Natural Language Processing (NLP) and generative AI tools in assessing reflective writing submitted by medical students in Ghana. It evaluated the validity, fairness, and cultural alignment of AI-generated feedback by comparing AI-generated scores with human rater assessments and analyzing discrepancies across demographic groups. A total of 180 reflective essays were sampled, with an equal number (n = 60) collected from each of three universities. Quantitative methods included Cohen's Kappa and Intraclass Correlation Coefficients (ICC) to assess inter-rater agreement, while logistic regression and multiple regression models examined potential biases across gender, university affiliation, and English proficiency. Qualitative data were gathered through interviews with students and faculty to explore perceptions of fairness, trust, and the AI's capacity to capture cultural and linguistic nuances. Results indicated that the AI system demonstrated strong inter-rater reliability, with Cohen's Kappa values of 0.74 (AI vs. Rater 1) and 0.76 (AI vs. Rater 2), and ICC values of 0.78 and 0.80, respectively. Human raters showed higher agreement with each other (Cohen's Kappa = 0.81, ICC = 0.85). However, significant discrepancies were found across demographic groups, particularly for English proficiency: students with lower proficiency tended to receive higher AI scores than human-rater scores (log-odds = 0.45, p = 0.001). Thematic analysis of the qualitative interviews revealed concerns over the lack of empathy in AI feedback, misalignment with cultural and linguistic nuances, and mixed levels of trust in AI-generated assessments. These findings suggest that while AI holds promise for improving efficiency in assessment, careful attention must be given to its limitations in fairness and cultural sensitivity.

The study concluded with recommendations for improving AI systems through contextual adaptation, hybrid assessment models, faculty training, and regular bias audits to ensure equitable and effective use of AI in educational settings.
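For readers unfamiliar with the agreement statistic reported in the abstract, Cohen's Kappa measures how much two raters agree beyond what chance alone would produce. A minimal sketch of the computation follows; the rating lists are hypothetical illustrations on a 3-point rubric, not data from the study.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same items."""
    assert len(ratings_a) == len(ratings_b) and ratings_a
    n = len(ratings_a)
    # Observed agreement: fraction of items both raters scored identically.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected chance agreement, from each rater's category marginals.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    categories = set(count_a) | set(count_b)
    p_e = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    # Kappa rescales observed agreement relative to the chance baseline.
    return (p_o - p_e) / (1 - p_e)

# Hypothetical AI and human-rater scores (not the study's data).
ai     = [2, 3, 3, 1, 2, 2, 3, 1, 2, 3]
rater1 = [2, 3, 2, 1, 2, 3, 3, 1, 2, 3]
print(round(cohens_kappa(ai, rater1), 3))  # → 0.688
```

By convention, values in the 0.61–0.80 band (such as the study's 0.74 and 0.76) are typically read as substantial agreement, which is the basis for the abstract's "strong inter-rater reliability" claim.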
Similar Works
The qualitative content analysis process
2008 · 21,614 citations
Making sense of Cronbach's alpha
2011 · 13,700 citations
Standards for Reporting Qualitative Research
2014 · 10,978 citations
Health professionals for a new century: transforming education to strengthen health systems in an interdependent world
2010 · 5,688 citations
Audit and feedback: effects on professional practice and healthcare outcomes
2012 · 5,491 citations