OpenAlex · Updated hourly · Last updated: 15 March 2026, 07:03

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

AI-Generated Messaging for Life Events Using Structured Prompts: A Comparative Study of GPT With Human Experts and Machine Learning

2025 · 0 citations · IEEE Access · Open Access

Citations: 0 · Authors: 14 · Year: 2025

Abstract

Large Language Models (LLMs) are increasingly used to generate social media posts, emails, narratives, and other forms of communication, yet few studies systematically assess how well these outputs align with their original prompt intent. This study evaluates the effectiveness of a zero-shot structured narrative prompt for generating 24,000 life event messages (birth, death, hiring, and firing events) using OpenAI’s GPT-4. A manually tagged sample of 2,880 messages shows that 87.43% align with the intended life event when framed as X (formerly Twitter) posts. To scale this evaluation, we train nine machine learning (ML) models, including BERT, Keras-based architectures, eXtreme Gradient Boosting, Random Forest, and Support Vector Machines, per life event type, resulting in 36 binary classifiers. We apply an ensemble approach based on simple majority agreement to classify the remaining 21,120 messages. Manual validation of a 1% sample (n = 212) confirms a 90.57% match rate with human reviewers, with binomial tests confirming statistical significance above a 75% threshold across all event types (P-values ranging from 0.00025 to 0.0115). While valid messages are reliably identified, classifying invalid cases remains more challenging. This work offers a reproducible framework for validating AI-generated messaging and provides practical guidance for prompt-based applications. Limitations include the narrow event scope, exclusive use of English text, and reliance on structured prompts, which may not generalize to open-ended tasks.
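
The two evaluation steps named in the abstract, majority agreement across nine classifiers and a binomial test against a 75% threshold, can be illustrated with a minimal Python sketch. This is not the authors' code. The function names and the vote vector are hypothetical, "simple majority" is assumed to mean more than half of the votes, and the test is assumed to be an exact one-sided binomial test. The count of 192 matches is inferred from the reported 90.57% of n = 212 and is pooled over all event types, whereas the reported P-values are per event type, so the result below is not expected to reproduce them.

```python
from math import comb


def majority_vote(votes):
    """Return True if more than half of the binary votes are positive."""
    return sum(votes) > len(votes) / 2


def binomial_p_value_greater(successes, n, p0):
    """Exact one-sided p-value P(X >= successes) for X ~ Binomial(n, p0)."""
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(successes, n + 1)
    )


# Hypothetical votes from nine classifiers for one message (1 = valid).
votes = [1, 1, 0, 1, 1, 0, 1, 1, 0]
is_valid = majority_vote(votes)

# Pooled illustration (assumption): 192 of 212 corresponds to the
# reported 90.57% match rate; 0.75 is the threshold from the abstract.
matches, sample_size, threshold = 192, 212, 0.75
p_value = binomial_p_value_greater(matches, sample_size, threshold)

print(is_valid, p_value)
```

With nine voters a tie cannot occur, so the majority rule needs no tie-breaking logic. A small p-value here indicates only that the pooled match rate is unlikely under a true rate of 75%.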

Similar works