This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Examination of the Quality and Readability of Chatbot Responses to Patient Questions: A Synthesis of Recent Studies (Preprint)
Citations: 0
Authors: 2
Year: 2025
Abstract
<sec> <title>BACKGROUND</title> Patient use of chatbots to obtain medical information has been anticipated with both optimism and pessimism. The simplicity of asking questions and receiving immediate answers has prompted investigators to examine the quality and readability of chatbot responses. We sought to review the current results at this nascent stage of chatbot development. </sec>

<sec> <title>OBJECTIVE</title> To evaluate current data on the quality and readability of chatbot responses to patient questions. </sec>

<sec> <title>METHODS</title> We searched multiple databases to identify studies that evaluated response quality using the DISCERN instrument, which is designed to assess written material intended for patients. From these studies, we extracted the DISCERN scores, the number of words used in the questions, the number of questions asked, the number of evaluators, and, if recorded, the readability of the responses. We also examined a measure of the rank of the journals in which the studies were published. We combined these parameters in a multiple linear regression model to determine potential associations with response quality. </sec>

<sec> <title>RESULTS</title> We identified 32 studies that conducted 57 tests using multiple chatbots. The average number of words in chatbot prompts ranged from 6 to 41, and the number of questions ranged from 3 to 119. As response quality increased, readability decreased. Forty-two percent of tests produced average responses rated "good" or higher, and only one response test was below college-level readability. In simple linear regression, a higher DISCERN score was associated with more prompt words and more questions. In a multiple linear regression model, higher DISCERN scores were associated with the number of questions and with the use of three or more evaluators, were inversely associated with journal rank, and were not associated with the number of prompt words. </sec>

<sec> <title>CONCLUSIONS</title> The variable quality and poor readability of chatbot responses to patient questions reinforce pessimism about their role. However, the principles of prompt engineering (the art of asking questions) have yet to be rigorously applied. Therefore, we remain optimistic that response quality and readability will improve. </sec>
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,239 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,095 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,463 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,428 citations