This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Clinical Knowledge and Reasoning Abilities of AI Large Language Models in Pharmacy: A Comparative Study on the NAPLEX Exam
16
Citations
4
Authors
2023
Year
Abstract
Objective: This study evaluates the capabilities and limitations of three large language models (LLMs), GPT-3, GPT-4, and Bard, in the pharmaceutical sciences by assessing their pharmaceutical reasoning abilities on a sample North American Pharmacist Licensure Examination (NAPLEX). We also analyze the potential impacts of LLMs on pharmaceutical education and practice.
Methods: A sample NAPLEX exam consisting of 137 multiple-choice questions was obtained from an online source. The questions were entered into the user interface of GPT-3, GPT-4, and Bard, and the answers each LLM provided were compared with the answer key.
Results: GPT-4 outperformed GPT-3 and Bard, answering 78.8% of the questions correctly. This score was 11% higher than Bard's and 27.7% higher than GPT-3's. However, on questions that required multiple selections, the performance of each LLM decreased significantly: GPT-4, GPT-3, and Bard correctly answered only 53.6%, 13.9%, and 21.4% of these questions, respectively.
Conclusion: Among the three LLMs evaluated, GPT-4 was the only model capable of passing the NAPLEX exam. Nevertheless, given the continuous evolution of LLMs, it is reasonable to anticipate that future models will pass the exam effortlessly. This highlights the significant potential of LLMs to impact the pharmaceutical field. Hence, we must evaluate both the positive and negative implications of integrating LLMs into pharmaceutical education and practice.
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,336 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,207 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,607 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,476 citations