This is an overview page with metadata for this scientific article. The full article is available from the publisher.
FROM ALGORITHMS TO EVIDENCE: ASSESSING ARTIFICIAL INTELLIGENCE CHATBOTS AND DRUG DATABASES FOR DETECTING CARDIO-DIABETIC DRUG INTERACTIONS
Citations: 0
Authors: 6
Year: 2025
Abstract
Objective: Electronic drug information resources are widely accessible and commonly used by healthcare professionals to identify drug-drug interactions (DDIs). With rapid advances in artificial intelligence (AI), AI-powered chatbots have demonstrated potential for detecting DDIs. However, resources vary in the scope, completeness, and consistency of the information they provide. This study conducts a comparative evaluation of drug interaction databases and AI chatbots to assess their reliability in DDI identification.

Methods: Three databases (Lexicomp, Drugs.com, and DrugBank) and three AI-powered chatbots (ChatGPT, Copilot, and Gemini) were compared. Each resource was scored for scope as the percentage of interactions for which it had an entry. A completeness score was calculated for each resource based on whether it described clinical effects, severity, mechanism, clinical management, and risk factors. Consistency of information was assessed with Fleiss' kappa (κ), estimated using the Statistical Package for the Social Sciences (SPSS), version 29.0 (IBM, USA).

Results: A total of 150 drug pairs were selected for the present study. The scope score was highest (100%) for Lexicomp, ChatGPT, and Gemini. The completeness score was highest (100%) for all AI-powered chatbots, followed by Drugs.com (90%) and Lexicomp (85.2%). Fleiss' kappa was used to determine inter-resource agreement on DDI severity classification; overall agreement was fair (κ=0.28, p<0.001). Cohen's kappa coefficients were calculated to evaluate pairwise agreement among the resources; the overall mean kappa (κ=0.51, p<0.01) indicated a moderate level of agreement.

Conclusion: Significant differences among the resources were observed in severity classification.
Using Lexicomp as the reference standard, an accuracy assessment was performed, and sensitivity, specificity, and predictive values varied among resources. Inter-resource agreement on DDI presence or absence was moderate overall, with traditional databases showing stronger pairwise agreement than AI chatbots.
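The paper estimates Fleiss' kappa with SPSS; the statistic itself is straightforward to compute from a subjects-by-categories count matrix. The following is a minimal pure-Python sketch of the standard Fleiss' kappa formula, using hypothetical severity-rating counts (the function name and example data are illustrative, not taken from the study):

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a subjects x categories count matrix.

    ratings[i][j] = number of raters (here: drug information resources)
    assigning subject i (a drug pair) to category j (a severity class).
    Every row must sum to the same number of raters n.
    """
    N = len(ratings)          # number of subjects (drug pairs)
    n = sum(ratings[0])       # raters per subject (resources compared)
    k = len(ratings[0])       # number of categories (severity classes)

    # Marginal proportion of all assignments falling in category j
    p = [sum(row[j] for row in ratings) / (N * n) for j in range(k)]

    # Observed agreement for each subject i
    P = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in ratings]

    P_bar = sum(P) / N            # mean observed agreement
    P_e = sum(pj * pj for pj in p)  # chance-expected agreement
    return (P_bar - P_e) / (1 - P_e)


# Hypothetical example: 6 resources rating 3 drug pairs into
# {minor, moderate, major}. Values below are illustrative only.
example = [
    [6, 0, 0],   # all resources agree: minor
    [1, 4, 1],   # mostly moderate
    [0, 2, 4],   # mostly major
]
print(round(fleiss_kappa(example), 3))
```

Values near 1 indicate strong agreement; by the conventional Landis-Koch labels, the study's κ=0.28 on severity classification falls in the "fair" band.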
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,292 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,143 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,539 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,452 citations