This is an overview page with metadata for this scholarly article. The full article is available from the publisher.
Evaluating the performance of a generative AI model in assessing qualitative health research articles' adherence to objective reporting standards
Citations: 0
Authors: 4
Year: 2026
Abstract
As qualitative research increasingly informs patient-centred care, rapid assessment of existing evidence against research guidelines is needed to inform practice settings. We evaluate the performance of Claude, a generative AI model, in assessing qualitative articles' adherence to a consensus-based reporting guideline. The Consolidated Criteria for Reporting Qualitative Research (COREQ), commonly used in qualitative research, serves as the reference criteria list for testing Claude's performance. Fifteen articles from a systematic scoping review were extracted for analysis. Structured prompts were applied to Claude to evaluate whether each COREQ criterion was met for each article. Two independent reviewers checked the model's results for concordance and accuracy. F1 scores, balanced accuracy (BA), the Matthews correlation coefficient (MCC), and other performance metrics were tabulated at the criterion, criterion-domain, and article levels. Four main categories were identified from the performance results: (1) balanced (6/32 criteria, 18.75%), (2) under-reported (2/32, 6.25%), (3) mixed errors (9/32, 28.13%), and (4) information-limited (15/32, 46.88%) clusters. Results show heterogeneity among the different clusters of criteria. While balanced criteria perform consistently across a range of metrics, criteria in under- or over-reported clusters require targeted prompt adjustments. Information-limited criteria require a larger sample of articles to verify results. Clearly defined criteria outperformed criteria that were broadly defined or required interpretation. Segmenting criteria into performance clusters allows researchers to identify areas of incongruence, so that specific strategies to modify prompts may be utilised for any given set of research articles. Customised, expertly crafted approaches can allow rapid extraction of valuable insights that may inform patient-centred recommendations and practice guidelines.
Similar works
The PRISMA 2020 statement: an updated guideline for reporting systematic reviews
2021 · 87,332 cit.
Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement
2009 · 82,929 cit.
The Measurement of Observer Agreement for Categorical Data
1977 · 77,362 cit.
Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement
2009 · 63,124 cit.
Measuring inconsistency in meta-analyses
2003 · 61,792 cit.