This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Can AI Predict Scientific Research? A Case Study on the NLP Domain
0
Citations
4
Authors
2025
Year
Abstract
Using document embeddings and cluster analysis, this study proposes a predictive framework for tracking the evolution of the academic literature. We used Doc2Vec to create semantic embeddings for ACL papers from 2018 to 2021, then applied K-means clustering and identified key papers within the resulting clusters. These key papers served as prompts for a large language model (LLM) to predict the content of 2022 publications. The accuracy of these predictions was assessed by comparing the AI-generated content with the actual 2022 papers on two main criteria: surface similarity (measured by cosine similarity between embeddings) and semantic consistency (assessed using BERTScore). Empirical validation shows that the LLM-generated predictions achieve substantial surface similarity (mean cosine similarity of 0.82) and moderate semantic consistency (BERTScore of 0.79) with the actual 2022 publications, but show a diminished ability to predict entirely new methodological paradigms.
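The clustering and scoring steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy 2-D vectors stand in for Doc2Vec embeddings, the plain Lloyd's k-means stands in for whatever K-means implementation was used, and the rule "key paper = cluster member closest to the centroid" is an assumption, since the abstract does not say how key papers were chosen. All function names are hypothetical.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def squared_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def assign(vectors, centroids):
    """Label each vector with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(v, centroids[c]))
        for v in vectors
    ]


def kmeans(vectors, k, iterations=50, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        labels = assign(vectors, centroids)
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign(vectors, centroids)


def key_papers(vectors, centroids, labels):
    """Assumed rule: per cluster, the member closest to the centroid."""
    keys = {}
    for c, centroid in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            keys[c] = min(members, key=lambda i: squared_distance(vectors[i], centroid))
    return keys


def mean_cosine(predicted, actual):
    """Mean cosine similarity over aligned (predicted, actual) pairs."""
    pairs = list(zip(predicted, actual))
    return sum(cosine_similarity(p, a) for p, a in pairs) / len(pairs)


# Toy stand-ins for paper embeddings: two well-separated groups.
toy_embeddings = [
    [1.0, 0.1], [0.9, 0.2], [1.1, 0.0],
    [0.1, 1.0], [0.2, 0.9], [0.0, 1.1],
]
toy_centroids, toy_labels = kmeans(toy_embeddings, k=2)
toy_keys = key_papers(toy_embeddings, toy_centroids, toy_labels)
print("key paper index per cluster:", toy_keys)
```

In a real pipeline the embeddings would come from a trained Doc2Vec model (for example via gensim) and the clustering from a library such as scikit-learn; the pairing of predicted and actual papers in `mean_cosine` is likewise an assumption about how the reported mean was computed.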
Similar works
2019 · 31,639 citations
Techniques to Identify Themes
2003 · 5,381 citations
Answering the Call for a Standard Reliability Measure for Coding Data
2007 · 4,071 citations
Basic Content Analysis
1990 · 4,045 citations
Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts
2013 · 3,061 citations