This is an overview page with metadata for this scientific work. The full article is available from the publisher.
SAT-690 Chatbots & Obesity: Are the Responses Accurate?
Citations: 0
Authors: 9
Year: 2025
Abstract
Abstract Disclosure: E. Pan: None. G. Wu: None. S. Sidhu: None. A. Sidhu: None. I. Chim: None. A. Ashok: None. A. Madala: None. V. Toram: None. R. Toram: None. Background: More than 1 billion people worldwide are obese: 650 million adults, 340 million adolescents, and 39 million children, and this number is still increasing. At the same time, an estimated 462 million individuals are affected by type 2 diabetes, corresponding to 6.28% of the world's population. Certain studies and reports indicate that in some regions women may have slightly higher or lower obesity prevalence than men, influenced by factors such as socio-economic conditions, lifestyle, and access to healthcare. Purpose: Determine whether chatbots can give medically accurate responses for obesity patients, and observe whether there are disparities in responses across patient demographics. Methods: Four questions were formulated: two targeted the causes of obesity, and two were nearly identical diagnosis questions that differed only in the patient's race and diabetic condition. These last two questions were specifically designed to reveal any differences in the chatbots' responses based on demographic factors. The questions were posed to four chatbots: Claude, Gemini, ChatGPT 4o Mini, and ChatGPT 4o. Textual responses from each chatbot were recorded and scored twice on a scale of 1 to 5: once manually and once as a self-score by the chatbot. Results: The average manual score for all models was greater than 3, indicating a baseline of complexity and accuracy for all of the chatbots tested. Question 2 showed the highest accuracy and the lowest variability, suggesting that chatbots respond better to factual queries. Questions 1, 3, and 4 all displayed high variability in response scores, and these three questions' median scores were also significantly lower than that of question 2.
Unlike question 2, these three questions focused on more abstract topics, suggesting that question type may affect response quality. Conclusion: Chatbots showed significantly less spread and were more accurate on question 2 than on the other, more abstract questions. Responses for the patient in question 4 were more varied but scored higher overall than those for the patient in question 3, emphasizing the need to better educate chatbots on differences in factors such as race or medical condition, which can cause differences in response quality and accuracy. Overall, the results highlight a need to better train chatbots to handle different query types and patient demographics. Presentation: Saturday, July 12, 2025