This is an overview page with metadata for this scientific work. The full article is available from the publisher.

6ER-040 Development of a python script for automated data extraction from pharmaceutical articles and comparison of ChatGPT 4.5 and 5.0 model performance

2026 · 0 citations

Abstract

<h3>Background and Importance</h3> Automating the extraction of scientific data is a key challenge in accelerating literature reviews in pharmacy. In summer 2025, a preliminary study demonstrated that a conversational agent (ChatGPT 4.0) could extract data from pharmaceutical articles with an average accuracy of 85 ± 13% based on 23 predefined criteria used for populating the <i>Impact Pharmacie</i> platform.<sup>1</sup>

<h3>Aim and Objectives</h3> To develop a Python script interfaced with the ChatGPT API and compare the data extraction accuracy of models 4.5 and 5.0, in both French (Fr) and English (En), against human analysis.

<h3>Material and Methods</h3> A descriptive comparative study was conducted on 26 interventional pharmacy studies. A Python script, based on a standardised 23-question prompt, generated responses from ChatGPT 4.5 and 5.0. The outputs generated by ChatGPT were organised into an Excel spreadsheet (Microsoft, Redmond, WA, USA) for subsequent analysis. Five pharmacy students independently assessed compliance (accurate, inaccurate, incomplete) for each question according to a standardised operating procedure.

<h3>Results</h3> For ChatGPT 4.5, mean compliance reached 72 ± 9% (Fr) and 77 ± 8% (En). The lowest-performing items (<60%) concerned the formulation of secondary objectives, intervention description, and study duration. ChatGPT 5.0 significantly improved performance, achieving 86 ± 8% (Fr) and 87 ± 9% (En). Differences between languages were minimal (<2%). Model 5.0 showed enhanced identification of study design, evaluation parameters, and methodological limitations. As an example, the query processed with ChatGPT 5.0 required 3,422,364 tokens, corresponding to a total computation time of 380 minutes and a cost of CAD $18.18. Remaining errors were mainly related to secondary objectives and intervention duration. Non-compliance may be partly attributed to an insufficient token limit or to instructions that were not specific enough regarding the expected level of detail in the responses.

<h3>Conclusion and Relevance</h3> Integrating a Python script with the ChatGPT API enables reliable, bilingual automated extraction of scientific data from article PDFs. ChatGPT 5.0 demonstrated a 13- to 15-point improvement in compliance compared with version 4.5, supporting its use in the development of standardised pharmaceutical data analysis tools. The findings support the potential automation of updates to the <i>Impact Pharmacie</i> platform.

<h3>References and/or Acknowledgements</h3> 1. Impact Pharmacie: Home [Internet]. [cited 2025 Oct 8]. Available from: https://impactpharmacie.org/index.php?p=greeter.php

<h3>Conflict of Interest</h3> No conflict of interest.
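
The abstract reports compliance as a mean ± standard deviation but does not spell out how it was calculated. The following is a minimal Python sketch of one plausible calculation, not the authors' script. It assumes that only answers rated "accurate" count as compliant, that compliance is computed per article, and that the spread is the sample standard deviation across articles. The `ratings` data, `compliance_pct`, and `summarise` are hypothetical names and values introduced for illustration only.

```python
from statistics import mean, stdev

# Hypothetical ratings: one list per article, one label per question.
# The study used 23 questions and 26 articles; three of each are used
# here only to keep the example short.
ratings = {
    "article_01": ["accurate", "accurate", "incomplete"],
    "article_02": ["accurate", "inaccurate", "incomplete"],
    "article_03": ["accurate", "accurate", "accurate"],
}


def compliance_pct(labels):
    """Percentage of questions rated 'accurate' (assumed definition)."""
    return 100 * sum(label == "accurate" for label in labels) / len(labels)


def summarise(ratings_by_article):
    """Return (mean, sample SD) of the per-article compliance percentages."""
    scores = [compliance_pct(labels) for labels in ratings_by_article.values()]
    return mean(scores), stdev(scores)


avg, sd = summarise(ratings)
print(f"{avg:.0f} ± {sd:.0f}%")
```

Treating "incomplete" as non-compliant is the strictest reading of the three rating categories. A partial-credit scheme would change only `compliance_pct`.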

Topics

Artificial Intelligence in Healthcare and Education; Social Media in Health Education; Digital Mental Health Interventions
