OpenAlex · Updated hourly · Last updated: 12 April 2026, 10:59

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Big Data Versus Big GPU: Evolving Requirements and Governance Dynamics of AI Training Data

2025 · 2 citations · International Journal of Digital Law and Governance · Open Access

Citations: 2 · Authors: 3 · Year: 2025

Abstract

Pre-trained large language models (LLMs), epitomized by ChatGPT, have leveraged a cornucopia of “big data” to attain substantial leaps in artificial intelligence (AI). Whereas the diminishing returns from pre-training and the depletion of available training data have become evident, the post-training scaling law bolstered by “big GPU” has surfaced as an overriding strategy. Since 2024, post-trained models exemplified by o1 and DeepSeek-R1 have been widely acclaimed as successes in logic-intensive fields like advanced scientific problem-solving, serving as a bellwether for artificial general intelligence (AGI). Driven by the two cardinal elements of computing power and task-specific datasets, the data training processes of post-trained models exhibit more erratic and uncontrollable tendencies, which may be a menace to core societal domains and precipitate systemic friction vis-à-vis the existing data governance derived from pre-trained models. At this watershed moment, this article aims to conduct a comprehensive comparison of training data paradigms between pre-trained and post-trained models and to further develop cogent and favorable governance responses to mitigate emerging risks. Consequently, data security must be established as a prerequisite for AI development, and a lifecycle-based governance framework for AI training data in blended models can be introduced in the metamorphosis toward “bigger AI models”.


Topics

Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education