OpenAlex · Updated hourly · Last updated: 10 April 2026, 18:38

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Artificial Intelligence as a Guiding Voice: Evaluating Chatbots for Users with Visual Impairments

2026 · 0 citations

Citations: 0 · Authors: 5 · Year: 2026

Abstract

The article presents a comparative analysis of two chatbots, ChatGPT and DeepSeek, as assistive tools for individuals with visual impairments. The objective of the study was to evaluate and compare the performance, reliability, and operating costs of both technologies in a simulated voice-controlled online shopping environment. In the study scenario, 30 healthy participants performed a series of typical tasks using voice commands in a simulated online store. A combination of predefined commands and free speech was used, enabling evaluation of the models' behaviour in both predictable and improvised situations. ChatGPT demonstrated a substantial performance advantage, both in model initialisation and in response generation. Its average response time was almost four times shorter than that of DeepSeek for the first command variant and about 3.5 times shorter for the second variant. In addition, ChatGPT's average model initialisation time was almost 20 times shorter than DeepSeek's. In contrast, DeepSeek was markedly more economical, with a total API usage cost of $0.76 compared to $2.21 for ChatGPT. Reliability, measured as the number of correct responses, errors, and unexpected failure cases, and compliance with user instructions, used to assess how correctly user commands were assigned to predefined functions and objects, were comparable between the two models. ChatGPT generated 537 correct and 63 incorrect responses, while DeepSeek generated 536 correct and 64 incorrect responses, with a small number of unexpected failure cases (four for ChatGPT and two for DeepSeek). Statistical analysis using the Wilcoxon rank-sum test showed that the differences in response times and initialisation times between the models were statistically significant (p < 0.001). The findings of this study may contribute substantially to the development of systems that support individuals with visual impairments.
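The reliability and cost figures reported in the abstract can be put on a common scale with a little arithmetic. The following is a minimal sketch, not part of the study: the dictionary layout and function names are illustrative, and it assumes that the correct and incorrect counts together make up all scored responses (600 per model), which the abstract does not state explicitly.

```python
# Figures copied from the abstract; the structure and names are illustrative only.
REPORTED = {
    "ChatGPT": {"correct": 537, "incorrect": 63, "unexpected_failures": 4, "api_cost_usd": 2.21},
    "DeepSeek": {"correct": 536, "incorrect": 64, "unexpected_failures": 2, "api_cost_usd": 0.76},
}


def correct_rate(stats):
    """Share of correct responses, assuming correct + incorrect covers all scored responses."""
    total = stats["correct"] + stats["incorrect"]
    return stats["correct"] / total


def cost_per_correct(stats):
    """Total API cost divided by the number of correct responses (USD)."""
    return stats["api_cost_usd"] / stats["correct"]


for name, stats in REPORTED.items():
    print(
        f"{name}: {correct_rate(stats):.1%} correct, "
        f"${cost_per_correct(stats):.4f} per correct response"
    )

# Ratio of total API costs between the two models.
cost_ratio = REPORTED["ChatGPT"]["api_cost_usd"] / REPORTED["DeepSeek"]["api_cost_usd"]
print(f"ChatGPT cost / DeepSeek cost: {cost_ratio:.2f}x")
```

Under that assumption, both models answer roughly 89% of commands correctly, while ChatGPT's total API cost is about 2.9 times DeepSeek's.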

Topics

AI in Service Interactions · Artificial Intelligence in Healthcare and Education · Digital Mental Health Interventions