This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Scaling research aim identification: Language models for classifying scientific and societal‐oriented studies
Citations: 2
Authors: 5
Year: 2025
Abstract
The classification of research according to its aims has been a longstanding focus in the fields of quantitative science studies and R&D statistics. Since 1963, the Organisation for Economic Co‐operation and Development (OECD) has employed a classical distinction among basic, applied, and experimental research. Building on this framework, our previous work highlighted the utility of differentiating between scientific and societal progress as two primary research objectives. This distinction enabled the quantitative analysis of scientific publication abstracts and the development of an automated method for large‐scale classification. In the current study, we systematically evaluate text classification techniques, including traditional text mining models, classification tools, BERT‐based language models, and decoder‐only large language models (LLMs) such as ChatGPT. Our findings show that the fine‐tuned GPT‐4o‐mini model performs best among single‐model approaches, although traditional and BERT‐based models outperform it on certain fine‐grained classification tasks. Leveraging majority voting strategies to combine their strengths yields performance comparable to closed‐source GPT models. A case study on 10 biomedical journals further validates the method, demonstrating strong alignment between journal scopes, model predictions, and outputs generated by the fine‐tuned GPT‐4o‐mini model. These results highlight the robustness and practical effectiveness of the proposed methodology for nuanced research aim classification.
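The abstract mentions combining heterogeneous classifiers via majority voting. The paper's exact scheme is not reproduced on this page; the following is a minimal sketch of hard majority voting over per-document label predictions, with hypothetical model names and illustrative labels:

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine label predictions from several classifiers by hard majority vote.

    predictions_per_model: a list with one entry per model, each an equally
    long list of predicted labels (one per document). Ties are broken by the
    order in which labels first appear among the votes.
    """
    n_docs = len(predictions_per_model[0])
    combined = []
    for i in range(n_docs):
        votes = [model_preds[i] for model_preds in predictions_per_model]
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Hypothetical predictions from three classifiers (labels are illustrative)
svm_preds  = ["scientific", "societal", "scientific"]
bert_preds = ["scientific", "societal", "societal"]
gpt_preds  = ["societal",  "societal", "scientific"]

print(majority_vote([svm_preds, bert_preds, gpt_preds]))
# → ['scientific', 'societal', 'scientific']
```

With an odd number of models, ties between two labels cannot occur for binary classification, which is one reason ensembles of three or five classifiers are common.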