This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Crowdsourcing a Training Dataset of Question-and-Answer Pairs for AI-Enabled Health Information Tools on Sexually Transmitted Infections: Protocol for a Cross-Sectional Exploratory Survey Study (Preprint)
Citations: 0
Authors: 8
Year: 2024
Abstract
Background: Sexually transmitted infections are a significant public health concern, particularly in sub-Saharan Africa, where their prevalence remains high. Promoting awareness and reducing stigma are essential strategies for addressing this challenge, but those affected often have limited access to accurate and culturally appropriate health information. Innovative solutions are therefore needed to improve sexual health literacy and encourage informed health-seeking behaviors. Artificial intelligence (AI)-enabled tools, such as chatbots, have emerged as promising avenues for delivering accurate and accessible health information. However, their potential is constrained by the lack of contextualized datasets, which are crucial for ensuring their effectiveness and relevance to diverse populations.

Objective: This study aims to develop an open access, contextualized dataset of question-and-answer pairs on sexual health and sexually transmitted infections to support the development and training of digital and AI-enabled health information tools.

Methods: Using a crowdsourcing approach, questions are being collected from participants aged ≥15 years via online platforms, paper-based submissions, and in-person interactions at public events across sub-Saharan Africa. Each question will be anonymized and reviewed by medical professionals, who will provide accurate, evidence-based answers. The dataset will then be processed, including cleaning and tagging for AI training, in adherence to the findability, accessibility, interoperability, and reusability (FAIR) principles. The final dataset will be published as open access.

Results: Data collection began on June 12, 2024, and is ongoing. The data collection process was piloted in Kigali, Rwanda, where 132 questions were collected. As of August 2025, the study had collected over 5620 question-and-answer pairs. The collected data are being processed in parallel with collection, in collaboration with health workers who provide evidence-based answers to the questions and contribute new questions based on their clinical experience. The data cleaning and processing will enhance the utility of the data for AI applications.

Conclusions: The final dataset will be published as open access in 2025, contributing to the development of AI-driven health tools and promoting public health literacy.

International Registered Report: DERR1-10.2196/70005
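The cleaning and tagging step described in the Methods can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the `QAPair` fields, the keyword-based `tag_question` helper, and the JSON Lines output are hypothetical and are not taken from the study, whose actual schema and tagging procedure are not described in the abstract. Anonymization and clinical review are out of scope here.

```python
import json
import re
from dataclasses import asdict, dataclass, field
from typing import Iterable, List


@dataclass
class QAPair:
    """One question-and-answer record (hypothetical schema, not the study's)."""
    question: str
    answer: str
    language: str = "en"
    tags: List[str] = field(default_factory=list)


# Hypothetical keyword-to-tag map, for illustration only.
TAG_KEYWORDS = {
    "testing": ("test", "screening"),
    "prevention": ("condom", "prevent"),
}


def clean_text(text: str) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


def tag_question(question: str) -> List[str]:
    """Return the sorted tags whose keywords occur in the question."""
    lowered = question.lower()
    return sorted(
        tag
        for tag, words in TAG_KEYWORDS.items()
        if any(word in lowered for word in words)
    )


def build_record(question: str, answer: str, language: str = "en") -> QAPair:
    """Clean both texts and attach tags derived from the cleaned question."""
    cleaned_question = clean_text(question)
    return QAPair(
        question=cleaned_question,
        answer=clean_text(answer),
        language=language,
        tags=tag_question(cleaned_question),
    )


def to_jsonl(records: Iterable[QAPair]) -> str:
    """Serialize records as JSON Lines, one object per line."""
    return "\n".join(json.dumps(asdict(r), ensure_ascii=False) for r in records)


if __name__ == "__main__":
    record = build_record(
        "  Where can I get a  test? ",
        " Placeholder answer written by a reviewer. ",
    )
    print(to_jsonl([record]))
```

A line-oriented format such as JSON Lines is one common choice for training data because each record can be read and validated independently; the study may well use a different format.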
Related works
An interactive web-based dashboard to track COVID-19 in real time
2020 · 11,125 citations
The Analysis of Spatial Association by Use of Distance Statistics
1992 · 5,858 citations
Detecting influenza epidemics using search engine query data
2008 · 4,376 citations
A spatial scan statistic
1997 · 3,993 citations
Earthquake shakes Twitter users
2010 · 3,659 citations