This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Interactive LLMs with Human-in-the-Loop for Ethical Content Generation
Citations: 0
Authors: 3
Year: 2025
Abstract
Large Language Models (LLMs) have achieved impressive breakthroughs in reasoning and fluency, yet they pose ethical risks, including bias, misinformation, and toxicity. Human-in-the-Loop (HITL) supervision is an effective approach to aligning model behavior with human values. This article presents a literature review of HITL alignment methods, namely Reinforcement Learning from Human Feedback (RLHF), Direct Preference Optimization (DPO), Constitutional AI (CAI), and Reinforcement Learning from AI Feedback (RLAIF), and proposes an Interactive Alignment Lifecycle that incorporates human supervision throughout the LLM development pipeline. Synthesizing research from 2020-2025, the results reveal that interactive feedback can reduce harmful outputs by up to 71% and increase beneficial content by up to 50%. However, evaluator bias, scalability, and over-refusal remain open problems. Future studies should focus on optimized hybrid feedback, evolving constitutions, and standardized governance frameworks for sustainable ethical alignment.
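The abstract lists Direct Preference Optimization (DPO) among the surveyed alignment methods. As a hedged illustration (not taken from the paper itself), the per-pair DPO objective can be sketched in plain Python: the loss penalizes the policy when its log-probability margin between a human-preferred and a rejected response does not exceed the reference model's margin. The function name and example log-probabilities below are hypothetical.

```python
import math

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log sigmoid(beta * (policy margin - reference margin)).

    Each argument is a sequence log-probability; beta controls how strongly
    deviations from the reference model are penalized.
    """
    # How much more the policy prefers the chosen response than the reference does.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    # Numerically plain logistic loss on the scaled margin.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Positive margin (policy leans toward the chosen response more than the
# reference) drives the loss below log(2); a zero margin yields exactly log(2).
loss = dpo_loss(-4.0, -7.0, -5.0, -6.0)
```

In a full training loop, this scalar would be averaged over a batch of preference pairs and minimized with respect to the policy's parameters, with the reference model held fixed.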
Related Works
Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
2017 · 20,576 citations
Generative Adversarial Nets
2023 · 19,892 citations
Visualizing and Understanding Convolutional Networks
2014 · 15,300 citations
"Why Should I Trust You?"
2016 · 14,396 citations
On a Method to Measure Supervised Multiclass Model’s Interpretability: Application to Degradation Diagnosis (Short Paper)
2024 · 13,164 citations