This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Benchmarking AI chatbots: assessing their accuracy in identifying hijacked medical journals
Citations: 2
Authors: 3
Year: 2025
Abstract
OBJECTIVES: The challenges posed by questionable journals to academia are very real, and being able to detect hijacked journals would be valuable to the research community. Using an artificial intelligence (AI) chatbot may be a promising approach to early detection. The purpose of this research is to analyze and benchmark the performance of different AI chatbots in identifying hijacked medical journals. METHODS: This study utilized a dataset comprising 21 previously identified hijacked journals and 10 newly detected hijacked journals, alongside their respective legitimate versions. ChatGPT, Gemini, Copilot, DeepSeek, Qwen, Perplexity, and Claude were selected for benchmarking. Three question types were developed to assess AI chatbots' performance in providing information about hijacked journals, identifying hijacked websites, and verifying legitimate ones. RESULTS: The results show that current AI chatbots can provide general information about hijacked journals, but cannot reliably identify either real or hijacked journal titles. While Copilot performed better than others, it was not error-free. CONCLUSIONS: Current AI chatbots are not yet reliable for detecting hijacked journals and may inadvertently promote them.
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,578 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,470 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,984 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,814 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations