This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
6ER-040 Development of a Python script for automated data extraction from pharmaceutical articles and comparison of ChatGPT 4.5 and 5.0 model performance
Citations: 0
Authors: 7
Year: 2026
Abstract
<h3>Background and Importance</h3> Automating the extraction of scientific data is a key challenge in accelerating literature reviews in pharmacy. In summer 2025, a preliminary study demonstrated that a conversational agent (ChatGPT 4.0) could extract data from pharmaceutical articles with an average accuracy of 85 ± 13% based on 23 predefined criteria used for populating the <i>Impact Pharmacie</i> platform.<sup>1</sup>
<h3>Aim and Objectives</h3> To develop a Python script interfaced with the ChatGPT API and to compare the data extraction accuracy of models 4.5 and 5.0, in both French (Fr) and English (En), against human analysis.
<h3>Material and Methods</h3> A descriptive comparative study was conducted on 26 interventional pharmacy studies. A Python script, based on a standardised 23-question prompt, generated responses from ChatGPT 4.5 and 5.0. Five pharmacy students independently assessed compliance (accurate, inaccurate, incomplete) for each question according to a standardised operating procedure. The outputs generated by ChatGPT were organised into an Excel spreadsheet (Microsoft, Seattle, WA, USA) for subsequent analysis.
<h3>Results</h3> For ChatGPT 4.5, mean compliance reached 72 ± 9% (Fr) and 77 ± 8% (En). The lowest-performing items (<60%) concerned the formulation of secondary objectives, intervention description, and study duration. ChatGPT 5.0 significantly improved performance, achieving 86 ± 8% (Fr) and 87 ± 9% (En). For this model, differences between languages were minimal (<2%). Model 5.0 showed enhanced identification of study design, evaluation parameters, and methodological limitations. As an example, the query processed with ChatGPT 5.0 required 3,422,364 tokens, corresponding to a total computation time of 380 minutes and a cost of CAD $18.18. Remaining errors were mainly related to secondary objectives and intervention duration.
Non-compliance may be partly attributed to an insufficient token limit or to instructions that were not specific enough about the expected level of detail in the responses.
<h3>Conclusion and Relevance</h3> Integrating a Python script with the ChatGPT API enables reliable, bilingual automated extraction of scientific data from article PDFs. ChatGPT 5.0 demonstrated a 13- to 15-point improvement in compliance compared with version 4.5, supporting its use in the development of standardised pharmaceutical data analysis tools. The findings support the potential automation of updates to the <i>Impact Pharmacie</i> platform.
<h3>References and/or Acknowledgements</h3> 1. Impact Pharmacie: Home [Internet]. [cited 2025 Oct 8]. Available from: https://impactpharmacie.org/index.php?p=greeter.php
<h3>Conflict of Interest</h3> No conflict of interest.
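The compliance percentages and the token cost reported in the Results come down to simple arithmetic, which the following minimal Python sketch illustrates. It is not the authors' script. It assumes that compliance is the share of questions rated "accurate" for each article and that the ± value is a sample standard deviation across articles; the abstract specifies neither. The function names and the toy ratings are hypothetical; only the three rating labels and the token and cost figures are taken from the abstract.

```python
from statistics import mean, stdev

# Rating labels as named in the abstract; everything else is hypothetical.
RATINGS = ("accurate", "inaccurate", "incomplete")


def article_compliance(ratings):
    """Percentage of questions rated 'accurate' for one article (assumed definition)."""
    if not ratings:
        raise ValueError("no ratings supplied")
    unknown = set(ratings) - set(RATINGS)
    if unknown:
        raise ValueError(f"unexpected rating labels: {sorted(unknown)}")
    return 100.0 * sum(r == "accurate" for r in ratings) / len(ratings)


def summarise(per_article_ratings):
    """Mean and sample SD of per-article compliance, in percent."""
    scores = [article_compliance(r) for r in per_article_ratings]
    return mean(scores), stdev(scores)


def cost_per_million_tokens(total_cost, total_tokens):
    """Normalise a total cost to a price per one million tokens."""
    return total_cost / total_tokens * 1_000_000


# Toy data: 3 articles x 4 questions (the study used 26 articles x 23 questions).
toy = [
    ["accurate", "accurate", "accurate", "incomplete"],    # 75 %
    ["accurate", "accurate", "inaccurate", "incomplete"],  # 50 %
    ["accurate", "accurate", "accurate", "accurate"],      # 100 %
]
avg, sd = summarise(toy)
print(f"compliance: {avg:.0f} ± {sd:.0f} %")  # compliance: 75 ± 25 %

# Figures reported in the abstract for the ChatGPT 5.0 run.
rate = cost_per_million_tokens(18.18, 3_422_364)
print(f"about CAD {rate:.2f} per million tokens")
```

Under these assumptions, the reported run works out to roughly CAD 5.31 per million tokens. If "incomplete" answers were given partial credit instead, only `article_compliance` would need to change.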