This is an overview page with metadata for this scientific work. The full article is available from the publisher.
An Explication and Classroom Field Study of the Virtual Human Interaction Lab’s Expert (VHIL-E) LLM
Citations: 0
Authors: 7
Year: 2026
Abstract
VHIL-E is the Virtual Human Interaction Lab's Expert, a large language model (LLM) representing the lab's research, teaching, and outreach on virtual and augmented reality. We first motivate the project and then present best practices for implementing a retrieval-augmented generation (RAG) pipeline, specifically the "Seven Cs": <i>collecting</i>, <i>cleaning</i>, <i>classifying</i>, <i>chunking</i>, <i>creating</i> embeddings, <i>correlating</i> embeddings into an index, and <i>connecting</i> the index to an LLM. We collected academic publications, transcribed public talks, dedicated interviews and news articles about the lab's research, and various curriculum materials from Virtual People, the lecture course taught about the lab's research for over two decades, resulting in over 2.3 million words broken into ∼10,000 chunks. In study 1, we compared performance on a multiple-choice test of 231 questions between various implementations of the RAG system and traditional Base GPT models. Models generally performed well, scoring between 83 and 90 percent, similar to student performance on course examinations. In study 2, an open-ended task implemented over 10 weeks in the Fall 2025 Virtual People course, students (<i>N</i> = 89) used VHIL-E to query and understand the course materials and logged hallucinations, defined as "egregiously wrong answers." Students then chose the single worst wrong answer over 8 weeks. They then compared the Base/RAG Hybrid, which prioritized the RAG but allowed ChatGPT to consult its general intelligence, to the RAG Constrained, which limited substantive information to the embedded index. Allowing VHIL-E access to GPT produced more than twice as many hallucinations as constraining it to the index. We discuss implications for scholars who build and use RAG-based LLM applications.
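The last four of the "Seven Cs" (chunking, creating embeddings, correlating them into an index, and connecting the index to an LLM) can be sketched as a toy pipeline. Everything below is illustrative, not the authors' implementation: the fixed-size word chunker, the bag-of-words "embedding" (a stand-in for a real embedding model), and the brute-force cosine index are all assumptions chosen to keep the sketch self-contained.

```python
# Toy sketch of a RAG pipeline's indexing-and-retrieval steps.
# All function names and the bag-of-words "embedding" are illustrative
# stand-ins, not the VHIL-E implementation described in the abstract.
import math
from collections import Counter

def chunk(text, size=50):
    """Chunking: split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(chunk_text):
    """Creating embeddings: a toy bag-of-words vector (a real system
    would call an embedding model here)."""
    return Counter(chunk_text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_index(docs):
    """Correlating embeddings into an index: (chunk, vector) pairs."""
    return [(c, embed(c)) for d in docs for c in chunk(d)]

def retrieve(index, query, k=1):
    """Connecting the index to an LLM: fetch the top-k chunks that
    would be prepended to the prompt to ground the model's answer."""
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [c for c, _ in ranked[:k]]
```

In a "RAG Constrained" configuration like the one study 2 favors, the prompt would instruct the model to answer only from the retrieved chunks; the hybrid configuration would additionally let the model fall back on its general knowledge.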
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,418 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,288 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,726 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,516 citations