This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Creating a biomedical knowledge base by addressing GPT inaccurate responses and benchmarking context
2024 · 4 citations · 25 authors
Abstract
We created GNQA, a generative pre-trained transformer (GPT) knowledge base driven by a performant retrieval-augmented generation (RAG) system, with a focus on aging, dementia, Alzheimer’s and diabetes. We uploaded a corpus of three thousand peer-reviewed publications on these topics into the RAG system. To address concerns about inaccurate responses and GPT ‘hallucinations’, we implemented a context provenance tracking mechanism that enables researchers to validate responses against the original material and to obtain references to the original papers. To assess the effectiveness of contextual information, we collected evaluations and feedback from both domain expert users and ‘citizen scientists’ on the relevance of GPT responses. A key innovation of our study is automated evaluation by way of a RAG assessment system (RAGAS). RAGAS combines human expert assessment with AI-driven evaluation to measure the effectiveness of RAG systems. When evaluating the responses to their questions, human respondents give a “thumbs-up” 76% of the time. Meanwhile, RAGAS scores 90% on answer relevance for questions posed by experts, and 74% on answer relevance when GPT generates the questions. With RAGAS we created a benchmark that can be used to continuously assess the performance of our knowledge base. Full GNQA functionality is embedded in the free GeneNetwork.org web service, an open-source system containing over 25 years of experimental data on model organisms and humans. The code developed for this study is published under a free and open-source software license at https://git.genenetwork.org/gn-ai/tree/README.md .
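The context provenance tracking described in the abstract can be illustrated with a minimal sketch. This is not the GNQA implementation: the `Passage` type, the `doc_id` field, the word-overlap ranking (a stand-in for whatever retrieval the real system uses), and the function names are all assumptions made for illustration. The only point shown is that each retrieved passage carries an identifier of its source paper, so a response can be returned together with references to check it against.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    doc_id: str  # hypothetical identifier of the source publication
    text: str    # passage text held in the retrieval index


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, corpus: list[Passage], k: int = 2) -> list[Passage]:
    """Return up to k passages ranked by word overlap with the question.

    Word overlap is an assumption standing in for real retrieval.
    """
    q_tokens = tokenize(question)
    scored = [(len(q_tokens & tokenize(p.text)), p) for p in corpus]
    scored = [(score, p) for score, p in scored if score > 0]
    # list.sort is stable, so ties keep their corpus order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:k]]


def answer_with_provenance(question: str, corpus: list[Passage], k: int = 2) -> dict:
    """Bundle the retrieved context with the identifiers of its source papers.

    A real system would pass the context to a language model at this point;
    the sketch only returns the context and its references.
    """
    context = retrieve(question, corpus, k)
    return {
        "question": question,
        "context": [p.text for p in context],
        "references": [p.doc_id for p in context],
    }
```

Because every passage keeps its `doc_id` from indexing through to the response, a reader can trace each piece of context back to a publication without trusting the generated text alone.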
Similar works
UCSF Chimera—A visualization system for exploratory research and analysis
2004 · 46,998 citations
SciPy 1.0: fundamental algorithms for scientific computing in Python
2020 · 35,598 citations
Clustal W and Clustal X version 2.0
2007 · 28,864 citations
The REDCap consortium: Building an international community of software platform partners
2019 · 22,672 citations
Array programming with NumPy
2020 · 20,643 citations
Authors
- Shelby S. Darnell
- Rupert W. Overall
- Andrea Guarracino
- Vincenza Colonna
- Flavia Villani
- Erik Garrison
- Arun Isaac
- Priscilla Muli
- Frederick Muriuki Muriithi
- Alexander Kabui
- Munyoki Kilyungi
- Felix Lisso
- Adrian Kibet
- Brian Muhia
- Harm Nijveen
- Siamak Yousefi
- David G. Ashbrook
- P.-S. Huang
- G. Edward Suh
- Muhammad Umar
- Christopher Batten
- Hao Chen
- Śaunak Sen
- Robert W. Williams
- Pjotr Prins
Institutions
- University of Tennessee Health Science Center (US)
- Humboldt-Universität zu Berlin (DE)
- University College London (GB)
- University of Nairobi (KE)
- African Institute for Development Policy (KE)
- Strathmore University (KE)
- Pwani University (KE)
- Nairobi Hospital (KE)
- Wageningen University & Research (NL)
- Cornell University (US)
- Nvidia (United States) (US)
- Nvidia (United Kingdom) (GB)