OpenAlex · Updated hourly · Last updated: 18 March 2026, 18:30

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Similarity Metric for Data Optimization and Efficient Training of Reactive Machine Learning Force Fields for Hydrocarbon Radiolysis

2025 · 2 citations · Journal of Chemical Theory and Computation
Citations: 2 · Authors: 3 · Year: 2025

Abstract

Radiolysis is a common approach to sterilize polymers, chemically modify them for upcycling, and accelerate their decomposition for recycling purposes. Reactive molecular dynamics (MD) simulations provide a powerful tool to generate atomic-level trajectories of the reactive processes and quantify radiolytic chemical degradation pathways. For this, machine learning (ML) surrogate models for reactive force fields with quantum mechanical accuracy are now widely used, which require ML training data sets that can provide information on atomic environments for target chemical systems. However, radiolysis chemistry can be highly complex and diverse, which poses significant challenges for generating training data to parametrize ML models. In this regard, we developed a method for optimizing the training data set using a cosine similarity metric to help guide training set selection for radiolysis of polyethylene, a model hydrocarbon polymer, as well as to enhance the transferability of our reactive ML force field (MLFF) to a variety of molecular and polymeric systems. Our approach performs atom-by-atom comparisons between local atomic environments to pinpoint important data points associated with rare and localized events, such as radiolysis damage within structures. We apply this approach to train the Chebyshev Interaction Model for Efficient Simulation (ChIMES) MLFF model, which expresses the atomic interaction potentials in terms of linear combinations of many-body Chebyshev polynomials. We first show that our method can reduce our training set size by ∼70% while improving overall accuracy compared to more standard MD model fitting approaches. We then validate our optimum model against diverse hydrocarbon simulation data, including simple alkanes and systems with unsaturated carbon bonds, over a wide range of thermodynamic conditions. 
Finally, we use our ChIMES model to perform MD simulations of radiolytic damage with large-scale systems that help avoid system size effects. Overall, our approach yields an MD force field that retains most of the accuracy of the underlying quantum method while improving computational efficiency by many orders of magnitude. Our efforts will have an impact on future hydrocarbon polymer radiolysis studies, where the chemical details of the polymer-radiation interactions can have a strong effect on the resulting products observed in experiments.
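
The abstract describes selecting training data by comparing local atomic environments with a cosine similarity metric. A minimal sketch of that general idea follows; it is not the authors' implementation. It assumes each atomic environment has already been encoded as a nonzero descriptor vector (the abstract does not say which descriptor is used), and the function names, the greedy selection rule, and the threshold of 0.95 are all hypothetical choices made for illustration.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    # Guard against division by zero; descriptors are assumed to be nonzero.
    a_norm = np.maximum(np.linalg.norm(a, axis=1, keepdims=True), 1e-12)
    b_norm = np.maximum(np.linalg.norm(b, axis=1, keepdims=True), 1e-12)
    return (a / a_norm) @ (b / b_norm).T


def select_dissimilar(descriptors, threshold=0.95):
    """Greedy filter (hypothetical rule): keep an environment only if its
    cosine similarity to every environment kept so far is below threshold.

    descriptors: array of shape (n_environments, n_features).
    Returns the indices of the retained environments.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    kept = [0]  # always keep the first environment as a starting point
    for i in range(1, len(descriptors)):
        sims = cosine_similarity_matrix(descriptors[i:i + 1], descriptors[kept])
        if sims.max() < threshold:
            kept.append(i)
    return kept
```

With five toy descriptors in which the second and fourth are near-duplicates of the first and third, `select_dissimilar` keeps indices 0, 2, and 4. Near-duplicate environments are dropped, while environments unlike anything already kept, such as those around a rare damage site, are retained.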

Topics

Machine Learning in Materials Science · Computational Drug Discovery Methods · Artificial Intelligence in Healthcare and Education