
This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

FROM ALGORITHMS TO EVIDENCE: ASSESSING ARTIFICIAL INTELLIGENCE CHATBOTS AND DRUG DATABASES FOR DETECTING CARDIO-DIABETIC DRUG INTERACTIONS

2025 · 0 citations · 6 authors · International Journal of Applied Pharmaceutics · Open Access

Abstract

Objective: Electronic drug information resources are widely accessible and commonly used by healthcare professionals to identify drug-drug interactions (DDIs). With rapid advances in artificial intelligence (AI), AI-powered chatbots have shown potential for detecting DDIs. However, the scope, completeness, and consistency of the information provided vary across resources. This study conducts a comparative evaluation of drug interaction databases and AI chatbots to assess their reliability in DDI identification.

Methods: Three databases (Lexicomp, Drugs.com, and DrugBank) and three AI-powered chatbots (ChatGPT, Copilot, and Gemini) were compared. Each resource was scored for scope as the percentage of interactions for which it had an entry. A completeness score was calculated based on whether a resource described clinical effects, severity, mechanism, clinical management, and risk factors. Consistency of the information was assessed with Fleiss' kappa (κ), estimated in the Statistical Package for the Social Sciences (SPSS), version 29.0 (IBM, USA).

Results: A total of 150 drug pairs were evaluated. The scope score was highest (100%) for Lexicomp, ChatGPT, and Gemini. The completeness score was highest (100%) for all AI-powered chatbots, followed by Drugs.com (90%) and Lexicomp (85.2%). Fleiss' kappa for inter-resource agreement on DDI severity classification indicated fair overall agreement (κ=0.28, p<0.001). Cohen's kappa coefficients for pairwise agreement between resources yielded an overall mean of κ=0.51 (p<0.01), indicating moderate agreement.

Conclusion: Significant differences among the resources were observed in severity classification. With Lexicomp as the reference standard, an accuracy assessment showed variable sensitivity, specificity, and predictive values across resources. Inter-resource agreement on DDI presence/absence was moderate overall, with traditional databases showing stronger pairwise agreement than AI chatbots.
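The abstract reports Fleiss' kappa for overall agreement on severity classification and Cohen's kappa for pairwise agreement. The sketch below is not the authors' code (the study used SPSS 29.0); it implements both statistics from their standard definitions on invented severity labels, purely to illustrate what the reported κ values measure.

```python
# Minimal sketch of the agreement statistics named in the abstract.
# The severity ratings below are hypothetical, for illustration only.
from collections import Counter
from itertools import combinations

SEVERITIES = ["minor", "moderate", "major"]

def fleiss_kappa(ratings):
    """ratings: one list per drug pair, holding one severity label per resource."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(item) for item in ratings]  # label frequencies per item
    # Per-item agreement P_i = (sum_j n_ij^2 - n) / (n(n - 1)), averaged over items.
    p_bar = sum(
        (sum(c[s] ** 2 for s in SEVERITIES) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_items
    # Expected chance agreement from the marginal label proportions.
    p_j = [sum(c[s] for c in counts) / (n_items * n_raters) for s in SEVERITIES]
    p_e = sum(p ** 2 for p in p_j)
    return (p_bar - p_e) / (1 - p_e)

def cohen_kappa(a, b):
    """Pairwise agreement between two resources' label lists."""
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n          # observed agreement
    p_e = sum((a.count(s) / n) * (b.count(s) / n) for s in SEVERITIES)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical severity calls for 5 drug pairs by 3 resources (one column each).
ratings = [
    ["major", "major", "moderate"],
    ["moderate", "moderate", "moderate"],
    ["minor", "moderate", "minor"],
    ["major", "major", "major"],
    ["moderate", "minor", "moderate"],
]

print(f"Fleiss' kappa: {fleiss_kappa(ratings):.2f}")
for i, j in combinations(range(3), 2):
    ki = cohen_kappa([r[i] for r in ratings], [r[j] for r in ratings])
    print(f"Cohen's kappa (resource {i} vs {j}): {ki:.2f}")
```

On conventional benchmarks, κ between 0.21 and 0.40 is read as "fair" and 0.41 to 0.60 as "moderate" agreement, which is how the abstract characterizes its κ=0.28 and κ=0.51 results.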
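The accuracy assessment in the conclusion, with Lexicomp as the reference standard, reduces to a 2×2 confusion matrix per resource. The sketch below uses hypothetical presence/absence flags (not the study's data) to show how sensitivity, specificity, and the predictive values are derived.

```python
# Hypothetical sketch of the accuracy assessment: Lexicomp's calls serve as the
# reference standard; the candidate row could be any other resource's calls.
reference = [1, 1, 0, 1, 0, 1, 0, 0]  # Lexicomp: interaction present (1) / absent (0)
candidate = [1, 0, 0, 1, 1, 1, 0, 0]  # e.g., a chatbot's calls for the same pairs

tp = sum(r == 1 and c == 1 for r, c in zip(reference, candidate))
tn = sum(r == 0 and c == 0 for r, c in zip(reference, candidate))
fp = sum(r == 0 and c == 1 for r, c in zip(reference, candidate))
fn = sum(r == 1 and c == 0 for r, c in zip(reference, candidate))

print(f"Sensitivity: {tp / (tp + fn):.2f}")  # TP / (TP + FN)
print(f"Specificity: {tn / (tn + fp):.2f}")  # TN / (TN + FP)
print(f"PPV:         {tp / (tp + fp):.2f}")  # TP / (TP + FP)
print(f"NPV:         {tn / (tn + fn):.2f}")  # TN / (TN + FN)
```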
