This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
EXAMINING THE ROLE OF CHATGPT IN THE EVALUATION OF SCIENTIFIC ABSTRACTS
Citations: 0 · Authors: 9 · Year: 2024
Abstract
Background: Reviewing abstracts for inclusion in scientific conferences involves substantial time and effort. Recent innovations in Large Language Models (LLMs), such as OpenAI’s ChatGPT, have the potential to reduce the burden on reviewers. This study aimed to test that potential by programmatically grading abstracts using custom-trained LLMs.

Methods: Abstracts and reviewer grades of American Hernia Society (AHS, 2021) and American Plastic Surgery Association (ASPS, 2021–2022) abstracts were obtained. Two new models of the state-of-the-art ChatGPT-3.5-turbo-1106 were fine-tuned on corpora of these abstracts and grades. We trained for three epochs with a standard loss function and four inputs: a “system” directive, the grading rubric, the abstract, and the average reviewer grade on a five-point scale. The models were evaluated with the 2023 abstracts and grades from both societies. The trained models graded the abstracts and ranked them using three custom algorithms: QuickSort, which iteratively compared two abstracts at a time; Quartile, which ranked four; and Bridging, which combined batches of fifteen. Mean differences and Spearman correlation coefficients were calculated between actual and predicted grades/rankings.

Results: The trained models successfully followed the 2023 rubrics and, in less than 20 minutes, produced predicted grades centered near the actual values: the mean difference was 0.0719 ± 0.881 for AHS and 0.104 ± 0.944 for ASPS. However, variance was high, and the predicted rankings had at best weak rank correlations with the actual rankings.

Conclusion: Custom-trained LLMs present a novel method for evaluating abstracts efficiently, but further research is needed to fine-tune the models for more precise results.
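The evaluation described above compares predicted and actual grades via mean difference and Spearman rank correlation. A minimal, stdlib-only sketch of those two metrics (the grade values below are made up for illustration, not the study's data):

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks, averaging ranks across tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        # extend j over a run of tied values
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) *
           sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

# Illustrative five-point-scale grades (hypothetical values)
actual    = [3.2, 4.1, 2.5, 3.8, 4.5]
predicted = [3.0, 4.4, 2.9, 3.5, 4.2]

mean_diff = mean(p - a for p, a in zip(predicted, actual))
rho = spearman(actual, predicted)
```

In practice a library routine such as `scipy.stats.spearmanr` would be used; the hand-rolled version here only shows what the statistic measures.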
Similar Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,380 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,243 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,671 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,496 citations