Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

Evaluating large language models for selection of statistical test for research: A pilot study

2024·5 Zitationen·Perspectives in Clinical ResearchOpen Access

Volltext beim Verlag öffnen

Zitationen

Autoren

2024

Jahr

Abstract

Background: In contemporary research, selecting the appropriate statistical test is a critical and often challenging step. The emergence of large language models (LLMs) has offered a promising avenue for automating this process, potentially enhancing the efficiency and accuracy of statistical test selection. Aim: This study aimed to assess the capability of freely available LLMs - OpenAI's ChatGPT3.5, Google Bard, Microsoft Bing Chat, and Perplexity in recommending suitable statistical tests for research, comparing their recommendations with those made by human experts. Materials and Methods: A total of 27 case vignettes were prepared for common research models with a question asking suitable statistical tests. The cases were formulated from previously published literature and reviewed by a human expert for their accuracy of information. The LLMs were asked the question with the case vignettes and the process was repeated with paraphrased cases. The concordance (if exactly matching the answer key) and acceptance (when not exactly matching with answer key, but can be considered suitable) were evaluated between LLM's recommendations and those of human experts. Results: = 0.0059. Conclusion: The LLMs, namely, ChatGPT, Google Bard, Microsoft Bing, and Perplexity all showed >75% concordance in suggesting statistical tests for research case vignettes with all having acceptance of >95%. The LLMs had a moderate level of agreement among them. While not a complete replacement for human expertise, these models can serve as effective decision support systems, especially in scenarios where rapid test selection is essential.

Autoren

Institutionen

Themen

Artificial Intelligence in Healthcare and EducationMeta-analysis and systematic reviewsMental Health via Writing

Volltext beim Verlag öffnen

Evaluating large language models for selection of statistical test for research: A pilot study

Abstract

Ähnliche Arbeiten

Autoren

Institutionen

Themen