OpenAlex · Updated hourly · Last updated: 13 Mar 2026, 11:02

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Artificial intelligence in health education: a comparative study of chatbots responding to hypertension patient questions

2025 · 2 citations · European Journal of Preventive Cardiology · Open Access

2 citations · 6 authors · Year: 2025

Abstract

Background
AI-powered chatbots built on large language models may be able to answer questions from patients with hypertension accurately, empathetically, and in easy-to-read language. Given staffing shortages within the healthcare system, chatbots may be a viable alternative for diagnosing or educating patients about diseases, particularly chronic conditions that are highly prevalent in the community, such as hypertension. This study evaluates how well three such chatbots deliver quality responses.

Methods
One hundred questions were randomly selected from the Reddit forum r/hypertension and submitted to three publicly available chatbots (ChatGPT-3.5, Microsoft Copilot, and Gemini), anonymized as A, B, and C. Two independent medical professionals rated the accuracy and empathy of the responses on Likert scales. In addition, all 300 responses were analyzed with the WebFX readability tool, which scores text on several readability scales: the Flesch-Kincaid Reading Ease Score, Flesch-Kincaid Grade Level, Gunning Fog Score, SMOG Index, Coleman-Liau Index, and Automated Readability Index.

Results
In total, 300 responses were evaluated. Chatbot A consistently produced the longest responses. On the Flesch-Kincaid Reading Ease scale, all chatbot responses were written in advanced language, with Chatbot A's responses the most difficult to comprehend. The Flesch-Kincaid Grade Level results likewise showed Chatbot A's responses to be the most sophisticated, using language typical of college-level writing. Chatbots B and C achieved identical Gunning Fog and SMOG Index scores, while Chatbot A again scored highest, underscoring its tendency to generate highly advanced responses. The Coleman-Liau Index and Automated Readability Index scores also confirmed the high reading comprehension level required for Chatbot A's responses. The readability values on all scales differed significantly among the chatbots (Table 1). Figure 1 presents a pie chart of the distribution of question categories and subcategories.

Conclusions
The study indicates that while all three chatbots can produce professional responses, their readability varies significantly. These findings underscore the potential of AI chatbots in patient education, but they also highlight the need for further optimization to make their outputs more comprehensible. Reading levels suited to medical professionals can pose challenges for laypersons who may not be familiar with medical terminology. This discrepancy underscores the need for future research to evaluate and optimize the readability of chatbot responses.

Table 1: Readability comparison
Figure 1: Frequency of questions
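For readers unfamiliar with these scales: the study used the WebFX tool, but the six indices it reports are all computed from standard published formulas. Below is a minimal Python sketch, not the study's actual pipeline, that estimates all six scores for a piece of text. The syllable counter is a naive vowel-group heuristic (production tools use pronunciation dictionaries), and the sample text is an invented stand-in for a chatbot response.

```python
import re

def count_syllables(word: str) -> int:
    """Rough vowel-group heuristic; real readability tools use dictionaries."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1  # drop a likely silent trailing 'e'
    return max(n, 1)

def readability(text: str) -> dict:
    # Sentence and word segmentation via simple regexes.
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(len(words), 1)
    syllables = sum(count_syllables(w) for w in words)
    letters = sum(len(w) for w in words)
    # "Complex"/polysyllabic words (3+ syllables) drive Gunning Fog and SMOG.
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)

    wps = n_words / sentences  # words per sentence
    spw = syllables / n_words  # syllables per word
    return {
        "Flesch Reading Ease": 206.835 - 1.015 * wps - 84.6 * spw,
        "Flesch-Kincaid Grade": 0.39 * wps + 11.8 * spw - 15.59,
        "Gunning Fog": 0.4 * (wps + 100 * complex_words / n_words),
        "SMOG": 1.0430 * (complex_words * 30 / sentences) ** 0.5 + 3.1291,
        "Coleman-Liau": 0.0588 * (100 * letters / n_words)
                        - 0.296 * (100 * sentences / n_words) - 15.8,
        "ARI": 4.71 * letters / n_words + 0.5 * wps - 21.43,
    }

if __name__ == "__main__":
    # Invented sample text, not taken from the study's data.
    sample = ("High blood pressure often has no symptoms. "
              "Regular monitoring and medication adherence reduce "
              "cardiovascular risk.")
    for name, score in readability(sample).items():
        print(f"{name}: {score:.1f}")
```

Note how every index is driven by sentence length and word length or syllable counts, which is why the longer, more elaborate responses attributed to Chatbot A score as harder to read across all six scales.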
