This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Validation of the Use of a Large Language Model for Detecting Sentiment in Student Course Evaluation
Citations: 0
Authors: 11
Year: 2026
Abstract
Background and Objectives: The use of large language models and natural language processing (NLP) in medical education has expanded rapidly in recent years. Because of the documented risks of bias and errors, these artificial intelligence (AI) tools must be validated before being used for research or education. Traditional and novel conceptual frameworks can be used. This study aimed to validate the application of an NLP method, a bidirectional encoder representations from transformers (BERT) model, to identify the presence and patterns of sentiment in end-of-course evaluations from M3 (medical school year 3) core clerkships at multiple institutions.

Methods: We used the Patino framework, designed for the use of artificial intelligence in health professions education, as a guide for validating the NLP. Written comments from de-identified course evaluations at four schools were coded by teams of two human coders, and human-human interrater reliability statistics were calculated. Humans identified key terms to train the BERT model. The trained BERT model predicted the sentiments of a set of comments, and human-NLP interrater reliability statistics were calculated.

Results: A total of 364 discrete comments were evaluated in the human phase. The proportions of positive (30.6%–61.0%), negative (4.9%–39.5%), neutral (9.8%–19.0%), and mixed (1.7%–27.5%) sentiments varied by school. Human-human and human-AI interrater reliability also varied by school, and the two were comparable.

Conclusions: Several conceptual frameworks offer models for validation of AI tools in health professions education. A BERT model, with training, can detect sentiment in medical student course evaluations with an interrater reliability similar to that of human coders.
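The validation step described above compares human-human and human-AI agreement with an interrater reliability statistic. A minimal sketch of one common such statistic, Cohen's kappa, in plain Python (this is not the authors' actual analysis code, and the sentiment labels below are hypothetical examples):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance.

    rater_a and rater_b are equal-length sequences of category labels
    (e.g. "positive", "negative", "neutral", "mixed").
    """
    assert len(rater_a) == len(rater_b) and len(rater_a) > 0
    n = len(rater_a)
    # Observed agreement: fraction of items where the raters match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement under chance, from each rater's label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum((counts_a[l] / n) * (counts_b[l] / n) for l in labels)
    return (observed - expected) / (1 - expected)

# Hypothetical comparison: one human coder's labels vs. a model's predictions.
human = ["positive", "negative", "neutral", "mixed", "positive", "positive"]
model = ["positive", "negative", "neutral", "positive", "positive", "mixed"]
print(round(cohens_kappa(human, model), 3))  # 0.5 for these example labels
```

The same function can score a human-human pair and a human-model pair, which is the comparison the study's conclusion rests on.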
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,245 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,100 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,466 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,429 citations
Authors
Institutions
- Rush University Medical Center (US)
- Michigan State University (US)
- University of Nebraska Medical Center (US)
- University of Toledo Medical Center (US)
- University of Kentucky (US)
- University of Kansas (US)
- Eastern Virginia Medical School (US)
- Old Dominion University (US)
- Rosalind Franklin University of Medicine and Science (US)
- Nova Southeastern University (US)