This is an overview page with metadata for this scientific publication. The full article is available from the publisher.
Large Language Models as Assessors: On the Impact of Relevance Scales
Citations: 0
Authors: 6
Year: 2025
Abstract
We systematically investigate how different scales, and their conversions, affect LLMs’ ability to provide reliable pointwise relevance judgments across multiple prompting strategies and model sizes. Using a popular TREC collection, we compare model outputs with both crowd and expert annotations, analyzing alignment, stability, and signs of potential data contamination.
Similar Works
2019 · 31,478 citations
Techniques to Identify Themes
2003 · 5,367 citations
Answering the Call for a Standard Reliability Measure for Coding Data
2007 · 4,055 citations
Basic Content Analysis
1990 · 4,045 citations
Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts
2013 · 3,035 citations