This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Comparison of the accuracy and reliability of ChatGPT-4o and Gemini in answering HIV-related questions
Citations: 0
Authors: 2
Year: 2025
Abstract
Background: This is the first study to evaluate the accuracy and reliability of the ChatGPT and Gemini chatbots on HIV.
Methods: A total of 156 questions about HIV in three categories (CDC, guideline, and social media) were posed to both ChatGPT and Gemini. Two infectious disease experts scored the chatbots' answers on a scale of 1 to 4 (1 = completely wrong, 4 = completely correct). The reproducibility of both chatbots was also analysed.
Results: The mean score across all questions was 3.69 ± 0.72 for ChatGPT and 3.55 ± 0.81 for Gemini (p = 0.051). The rate of completely correct answers was 81.4% for ChatGPT and 71.8% for Gemini (p = 0.045). ChatGPT answered guideline questions with lower accuracy than CDC questions (47.9% vs. 97.1%, p < 0.001) and social media questions (47.9% vs. 94.9%, p < 0.001). Similarly, Gemini answered guideline questions with lower accuracy than CDC questions (35.4% vs. 88.4%, p < 0.001) and social media questions (35.4% vs. 87.2%, p < 0.001). By topic, both chatbots had their lowest accuracy rate on ‘Prevention and Treatment’ (67.2% for ChatGPT, 54.7% for Gemini). The reproducibility of the answers was 94.8% for ChatGPT and 90.3% for Gemini.
Conclusion: ChatGPT and Gemini answered the CDC and social media questions with high accuracy. However, both chatbots need improvement on guideline questions and on questions about ‘Prevention and Treatment’. These applications therefore need to be improved for use by healthcare professionals.
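As a plausibility check on the reported rates of completely correct answers (81.4% vs. 71.8% of 156 questions, p = 0.045), the comparison can be sketched in a few lines of Python. This is a sketch under assumptions: the abstract does not say which statistical test the authors used, so the uncorrected Pearson chi-square test below is a guess. The counts are reconstructed from rounded percentages, and the function names are hypothetical. Because both chatbots answered the same questions, a paired test such as McNemar's might be more appropriate, but it would need per-question data that the abstract does not provide.

```python
import math

N_QUESTIONS = 156  # questions posed to each chatbot (from the abstract)


def count_from_rate(rate_percent: float, n: int) -> int:
    """Reconstruct an integer count from a rounded percentage (an approximation)."""
    return round(rate_percent / 100 * n)


def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Uncorrected Pearson chi-square for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom, the chi-square
    survival function equals erfc(sqrt(x / 2)), so no external library is needed.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, math.erfc(math.sqrt(stat / 2))


# Assumed counts of completely correct answers, derived from the reported rates.
chatgpt_correct = count_from_rate(81.4, N_QUESTIONS)
gemini_correct = count_from_rate(71.8, N_QUESTIONS)

chi2, p_value = pearson_chi2_2x2(
    chatgpt_correct, N_QUESTIONS - chatgpt_correct,
    gemini_correct, N_QUESTIONS - gemini_correct,
)
print(f"ChatGPT: {chatgpt_correct}/{N_QUESTIONS}, Gemini: {gemini_correct}/{N_QUESTIONS}")
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

Under these assumptions the p-value comes out at about 0.045, consistent with the reported figure. That agreement does not confirm which test the authors actually used.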