This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Evaluating Large Language Models to Support Dementia Caregivers: Identifying Opportunities for Improvement
Citations: 0
Authors: 5
Year: 2025
Abstract
Awareness of and access to dementia caregiving resources are crucial for informal caregivers of people with early-stage dementia. Large language models (LLMs) offer easy access to caregiving information, but the risks, challenges, and ways to improve LLM-generated responses remain understudied. This mixed-methods study evaluated LLMs, including the baseline ChatGPT-4o model and an enhanced version refined through prompt engineering grounded in health science and gerontology literature, for their ability to support informal dementia caregivers. The study aimed to assess key factors influencing preferred responses from LLMs and to identify related risks and challenges, thereby informing opportunities for improvement. Surveys and interviews with 12 stakeholders, including 10 healthcare professionals and 2 caregivers, were conducted to assess model responses to questions commonly asked by caregivers. The responses were assessed using validated multidimensional measures based on a human evaluation framework for LLMs in healthcare. Survey results showed that the enhanced ChatGPT-4o model scored significantly higher in actionability, relevance, and satisfaction than the baseline ChatGPT-4o version. However, no significant differences were observed between the models in accuracy, understanding, intelligibility, trust, safety, or potential harm. Key themes from the interview data that influenced preferred responses included wordiness, in-depth content, empathy, actionability, accuracy, relevance, and bias. Overall, while both models’ responses were perceived as overly verbose, the enhanced model provided more comprehensive and caregiver-centered information than the baseline ChatGPT-4o. These findings suggest that LLMs refined with domain-grounded prompt engineering can deliver more practical and satisfying guidance for dementia caregiving. Incorporating domain-specific frameworks into model design may enable scalable, evidence-based support for caregivers in real-world settings.