This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Can we trust the evaluation on ChatGPT?

2023 · 64 citations · Open Access

Abstract

ChatGPT, the first large language model with mass adoption, has demonstrated remarkable performance in numerous natural language tasks. Despite its evident usefulness, evaluating ChatGPT's performance in diverse problem domains remains challenging due to the closed nature of the model and its continuous updates via Reinforcement Learning from Human Feedback (RLHF). We highlight the issue of data contamination in ChatGPT evaluations, with a case study in stance detection. We discuss the challenge of preventing data contamination and ensuring fair model evaluation in the age of closed and continuously trained models.

Institutions

Indiana University Bloomington (US)

Topics

Topic Modeling; Natural Language Processing Techniques; Artificial Intelligence in Healthcare and Education
