This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Empowering Informal Caregivers of Persons with Early-Stage Dementia with Large Language Models: Challenges and Opportunities (Preprint)
Citations: 0
Authors: 4
Year: 2025
Abstract
BACKGROUND: Informal caregivers of individuals with early-stage dementia need relevant knowledge and support, including awareness of, access to, and use of comprehensive resources for both the person with dementia and the caregiver. With appropriate strategies and early-stage support, informal caregivers can play a vital role in enhancing the well-being of individuals with dementia and potentially slowing its progression. While large language models (LLMs) can provide easy access to caregiving knowledge, the risks, perceived challenges, and ways to improve LLM-generated responses in practice remain underexplored.

OBJECTIVE: In this study, we aim to (1) examine the risks and perceived challenges of using a baseline ChatGPT-4o, an internet-accessible artificial intelligence (AI) model, for dementia caregiving support and (2) understand how an enhanced version of ChatGPT-4o, equipped with up-to-date dementia caregiving knowledge, can mitigate these risks and challenges.

METHODS: We compiled 32 representative questions asked by informal caregivers seeking guidance on early-stage dementia, sourced from a local office specializing in dementia services and from researchers in the field. Next, we developed two conditions of ChatGPT-4o: C1, the baseline model available for public use, and C2, an experimental version enhanced through prompt engineering and grounded in a conceptual framework, drawn from health science and gerontology literature, that is designed to empower caregivers providing early-stage dementia support. Using these conditions, we generated 64 responses (32 pairs, one pair per question). Twelve experts evaluated the LLM-generated responses using validated tools measuring accuracy, reasoning, clarity, usefulness, trust, satisfaction, safety, harm, and relevance. A Mann-Whitney U test was used to compare the two conditions. After the survey, we conducted interviews to explore the experts' perceived differences, remaining challenges, and design opportunities. Interviews were transcribed and analyzed using descriptive thematic analysis.

RESULTS: Responses in C2 showed significant improvements over C1 in three criteria: actionability, relevance, and perceived satisfaction. However, no significant differences were found in the remaining six: response accuracy, the model's ability to understand the question, intelligibility, trustworthiness, response safety, and perceived harm. Qualitative analysis of the interviews provided deeper insight into two areas: differences between responses from the baseline and experimental conditions, and potential explanations for these differences. The twelve experts commented on dimensions including wordiness, level of detail, empathy, satisfaction, accuracy, relevance, and potential bias. While both models were seen as somewhat verbose, responses from the experimental model were generally viewed as more detailed, relevant, and actionable. Although accuracy was perceived as comparable across models, participants expressed greater satisfaction with the experimental model's responses.

CONCLUSIONS: Results indicate that both conditions generated responses perceived as reasonable and intelligible. However, the experimental model offered more relevant, practical guidance on caregiving needs, providing specific information aligned with the 32 test questions along with actionable recommendations. This led to higher perceived satisfaction compared to the baseline model.
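The condition comparison described in the methods can be illustrated with a minimal Python sketch of a two-sided Mann-Whitney U test. The abstract does not say which software, rating scale, or test variant the authors used, so everything here is an assumption: the function names `midranks` and `mann_whitney_u` are hypothetical, the p-value uses the tie-corrected normal approximation without continuity correction, and the ratings are invented placeholder values on an assumed 1-5 scale, not data from the study.

```python
from math import erfc, sqrt


def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, approximate two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = midranks(combined)

    # U for x is its rank sum minus the smallest rank sum it could have.
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2

    # Tie correction: sum of (t^3 - t) over groups of t tied values.
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t**3 - t for t in counts.values())

    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # every observation identical: no evidence of a shift
        return u1, 1.0

    z = (u1 - n1 * n2 / 2) / sqrt(variance)
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u1, p_value


if __name__ == "__main__":
    # Hypothetical ratings for one criterion (NOT study data).
    c1_ratings = [3, 4, 3, 2, 4, 3, 3, 4]  # baseline condition C1
    c2_ratings = [4, 5, 4, 4, 5, 3, 5, 4]  # enhanced condition C2
    u, p = mann_whitney_u(c1_ratings, c2_ratings)
    print(f"U = {u}, approximate two-sided p = {p:.4f}")
```

A rank-based test of this kind suits ordinal ratings because it compares the ordering of scores rather than their means. For small samples, an exact p-value from an established statistics library would be preferable to the normal approximation used in this sketch.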