This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Benchmarking AI chatbots: assessing their accuracy in identifying hijacked medical journals

2025 · 2 citations · Diagnosis

Abstract

OBJECTIVES: The challenges posed by questionable journals to academia are very real, and being able to detect hijacked journals would be valuable to the research community. Using an artificial intelligence (AI) chatbot may be a promising approach to early detection. The purpose of this research is to analyze and benchmark the performance of different AI chatbots in identifying hijacked medical journals.

METHODS: This study utilized a dataset comprising 21 previously identified hijacked journals and 10 newly detected hijacked journals, alongside their respective legitimate versions. ChatGPT, Gemini, Copilot, DeepSeek, Qwen, Perplexity, and Claude were selected for benchmarking. Three question types were developed to assess AI chatbots' performance in providing information about hijacked journals, identifying hijacked websites, and verifying legitimate ones.

RESULTS: The results show that current AI chatbots can provide general information about hijacked journals, but cannot reliably identify either real or hijacked journal titles. While Copilot performed better than others, it was not error-free.

CONCLUSIONS: Current AI chatbots are not yet reliable for detecting hijacked journals and may inadvertently promote them.

Topics

Artificial Intelligence in Healthcare and Education · AI in Service Interactions · Misinformation and Its Impacts
