OpenAlex · Updated hourly · Last updated: 06.05.2026, 19:04

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

ChatGPT-simulated sentence plausibility in event contexts, with teens, younger and older adults, in fiction and newspaper texts

2026 · 0 citations · Frontiers in Language Sciences · Open Access

Citations: 0 · Authors: 1 · Year: 2026

Abstract

The purpose of this study was to determine to what extent a large language model (LLM) produces simulations that are close enough to human world knowledge to serve as pilot data for human experimentation. LLMs are developing rapidly, and if they become sufficiently accurate databases of human world knowledge, they could give empirical research access to a comprehensive model of that knowledge. This possibility was assessed by simulating human plausibility ratings and their variation depending on (i) the presence vs. absence of an event description, (ii) the age of LLM-simulated participants (Pilot 1, Pilot 2, and Experiment 1a), and (iii) LLM-simulated participant expectations of distinct text sources/genres (Experiment 1b). In four pilot studies and two main experiments, ChatGPT-4o/5 plausibility ratings were simulated from the graphical user interface using written prompts, factorial designs, Latin-square counterbalanced lists, and N = 200 simulated participants per between-participant factor level. In this way, a setup much like that of in-laboratory experiments with human participants was simulated. As a baseline, plausibility ratings generated via the LLM chat interface were compared against human plausibility ratings reported in prior research. Overall, ChatGPT produced simulated ratings that, on average, were higher for plausible than for implausible sentences, and higher when an event description supported the event conveyed by the target sentence. The model also revealed fine-grained differences depending on simulated participant age, context-sentence relations, and genre. These results can be used to guide the formulation of testable hypotheses for future research with human participants.
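The Latin-square counterbalancing mentioned in the abstract can be illustrated with a minimal sketch. It assumes a 2 x 2 within-item design (plausibility crossed with presence of an event description) and 8 items; the condition labels, the item count, and the function name `latin_square_lists` are hypothetical and are not taken from the study.

```python
from itertools import product


def latin_square_lists(n_items, conditions):
    """Build one presentation list per condition rotation.

    Each list maps item index -> condition. Rotating the condition
    assigned to each item across lists means every item appears in
    every condition exactly once over all lists, and every list
    contains each condition equally often (when n_items is a
    multiple of the number of conditions).
    """
    k = len(conditions)
    return [
        {item: conditions[(item + list_idx) % k] for item in range(n_items)}
        for list_idx in range(k)
    ]


# Assumed 2 x 2 design; labels are illustrative only.
conditions = [
    f"{plausibility}/{context}"
    for plausibility, context in product(
        ("plausible", "implausible"), ("context", "no_context")
    )
]

lists = latin_square_lists(n_items=8, conditions=conditions)

for list_idx, assignment in enumerate(lists):
    print(f"List {list_idx + 1}: {assignment}")
```

Under these assumptions, each simulated participant would be assigned to one of the four lists, so no participant rates the same item in more than one condition.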

Topics

Artificial Intelligence in Healthcare and Education · Neurobiology of Language and Bilingualism · Topic Modeling