<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Combining Programmable Potentials and Neural Networks for Materials Problems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ryan Mohr</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Allan Avila</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Soham Gosh</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ananta Bhattarai</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Muqiao Yang</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xintian Feng</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Head-Gordon</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ruslan Salakhutdinov</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maria Fonoberova</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Igor Mezić</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>Force field methods can be used in molecular dynamics simulations to compute and infer macro-level properties of chemical systems. However, these force fields often predict erroneous values due to not being accurate at the quantum level. Ideally, one would be able to simulate the system from first principles (quantum level) and scale up to get the material properties. However, this is computationally infeasible. In this paper, we develop a Programmable Potentials methodology that utilizes high-fidelity QM/MM data sets to inform molecular dynamics (MD) potentials. The method learns the quantum-level interaction logic of the system from the data and uses these encoding functions to modify standard MD potentials, like Lennard-Jones, which capture the correct long-range behavior but do poorly in the near range. We test the method on the adsorption of hydrocarbons in a zeolite framework. We show that the encoding functions, and the resulting model, generalize well outside their training set and that encoding functions trained with data from a simple hydrocarbon such as CH4 can build good models for more complex hydrocarbons such as C2H6 and C3H8.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        We seek to use data from small quantum mechanical
simulations to build generative models at the molecular scale in
order to predict bulk material properties. This is not
achievable with current methods. Modeling and solving a system
on the level of the Schrödinger equation scales
exponentially with the number of electrons in the system, making
it infeasible for any system of reasonable size. Density
functional theory and periodicity assumptions on the system
(so-called QM/MM methods) mitigate this problem somewhat,
but are still infeasible for many systems
        <xref ref-type="bibr" rid="ref2">(Jensen 2007)</xref>
        . For
example, using QM/MM methods for direct simulation of
molecular dynamics (MD) for zeolite-catalyzed reactions on
the nanosecond time scale requires the sequential evaluation
of roughly a million energy and force evaluations, each
of which requires about 100 seconds using 8 cores. This
equates to approximately 0.1 billion seconds for a simulation,
or over 3 years.
      </p>
      <p>
        Our approach constructs a generative model by taking
classical molecular dynamics potentials (such as Lennard-Jones
potentials), which capture the correct long-range behavior,
and combines them with small neural networks, which encode
the quantum-level logic and correct the near-range behavior.
The encoding functions multiplicatively modify the classical
molecular potentials. Our approach is termed Programmable
Potentials with Neural Networks (PP-NN). The formulation of
Programmable Potentials
        <xref ref-type="bibr" rid="ref4">(Thakur, Mohr, and Mezić 2016)</xref>
        prior to this work relied on the researcher to manually
determine and specify the interaction logic and encode it in a
Boolean algebra. This was a time-consuming process and
possible only for simple reactions. The idea we present here is
to learn this logic automatically by leveraging neural networks
and tunable proximity functions, thereby replacing the
“hand-tuned” encoding functions. An additional advantage of using
neural networks as the encoding functions, and implementing
them in packages such as TensorFlow or PyTorch, is that one
can leverage the auto-differentiation routines of those
frameworks, so that moving from training to deployment in a
molecular dynamics simulation bypasses an error-prone, manual
differentiation step to get the force fields.
      </p>
      <p>
        We applied our methods to the materials science problem
of hydrocarbons binding in a zeolite framework, both undoped
(MFI) and doped (HMFI). Doped zeolites are nano-porous
doped silica catalysts that are the world's most widely used
catalytic framework for cracking, dehydrogenation, and
oligomerization of hydrocarbons (Primo and Garcia 2014). The
classic example of doping is replacement of a small fraction of
Si by Al and an accompanying proton to create a solid acid
catalyst
        <xref ref-type="bibr" rid="ref1">(Corma 1995)</xref>
        . An illustration is shown in Figure 1. Doping a zeolite
turns a nonfunctional material into a functional material,
where catalyst function is controlled by short-range electronic
effects from doping, as well as shape-selective effects (pore
size and geometry)
        <xref ref-type="bibr" rid="ref3">(Smit and Maesen 2008)</xref>
        , and long-range effects
(remote charges and polarization) from the extended
framework (Mansoor et al. 2018).
      </p>
    </sec>
    <sec id="sec-2">
      <title>Methods</title>
      <p>Pure force field methods often do not capture a potential
field correctly. While giving the correct form at long range,
the short range energies can be incorrect. Ideally, one would
like to use a full quantum mechanics simulation, but this
is intractable except for small systems. QM/MM hybrid
approaches are introduced to relax these limitations. As
mentioned above, however, these methods still suffer from
computational intractability if one would like to simulate them
over a reasonable time scale, which would be needed to
extract macroscale properties of the material.</p>
      <p>Programmable Potentials is a force field method that
takes into account the reaction logic of the system, quantum
or otherwise, given a data set. Since it is a force field method,
this makes it amenable for molecular dynamics simulations.
Using small, high-fidelity QM/MM data sets gives a force
field model that can be evaluated outside its training set,
setting the stage for long time-scale simulations that are still
informed by quantum level information.</p>
      <p>The Programmable Potential model takes the form
E_{PP} = \sum_j \sum_{k&gt;j} V_{jk} S_{jk},   (1)
where V_{jk} is a pairwise potential between atoms j and k,
such as the Lennard-Jones potential in Eq. (3), and S_{jk} is
an encoding function that encodes the interaction logic of
the system. In bond-breaking reaction systems, the encoding
functions are usually restricted to have values between 0 and
1 and act as a switch that turns individual bonds on or off. In
static systems, like binding of guest molecules such as CH4
in a host zeolite framework, the encoding function values are
not squashed in this way.</p>
      <p>
        In the original formulation of Programmable Potentials
        <xref ref-type="bibr" rid="ref4">(Thakur, Mohr, and Mezić 2016)</xref>
        , the coarse-level logic of
the system was encapsulated in the encoding functions. This
logic was first constructed using a Boolean algebra
generated by indicator functions on intervals [0, σ). Once the logic
was constructed, the indicator functions were replaced with
smooth variants (see Eq. (2)) whose parameters were then
optimized over a training data set. This process could become
quite complicated as the system moved past very simple
reactions and could require many man-hours and much
ingenuity. The formulation in this paper replaces these hand-tuned
encoding functions with encoding functions built from small
neural networks that automatically learn the interaction logic.
The neural networks used were densely connected networks
with, generally, no more than 8 layers and between 8 and 64
neurons per layer.
      </p>
      <p>For reactions between a small number of atoms, we can
apply an encoding function for each interaction. However,
for the zeolite problem, using an encoding function for
every interaction is infeasible. This would require
optimizing thousands of neural networks simultaneously due to the
number of atoms in the zeolite framework. Therefore, we
learned an encoding function for each unique interaction
type and limited them to zeolite atoms close to the
hydrocarbon. For example, every silicon-carbon interaction used the
same encoding function. This reduced the number of
encoding functions that needed to be trained from thousands to 4
(for the undoped zeolite). This also enabled transfer learning
to other systems by learning a “universal” encoding function
for each interaction type.</p>
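      <p>The sharing scheme above can be sketched as a lookup table with one function per pair type; the make_encoder form and the sigma/n values below are hypothetical stand-ins for the trained neural networks.</p>

```python
def make_encoder(sigma, n):
    """Stand-in encoding function S(r) = 1 - h(r; sigma, n)."""
    return lambda r: 1.0 - 1.0 / (1.0 + (r / sigma) ** n)

# Four shared functions for the undoped zeolite: one per unique pair type.
encoders = {
    ("Si", "C"): make_encoder(2.8, 6),
    ("Si", "H"): make_encoder(2.4, 6),
    ("O", "C"): make_encoder(2.6, 6),
    ("O", "H"): make_encoder(2.2, 6),
}

def encode(zeolite_species, hydrocarbon_species, r):
    """Every atom pair of a given type uses the same shared function."""
    return encoders[(zeolite_species, hydrocarbon_species)](r)
```

Because every silicon-carbon pair reuses one function, only four functions need training instead of thousands, and the same table can seed transfer learning on a new system.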
      <p>For each unique atom pair type interaction, we use one
neural network to learn the corresponding encoding
function. For example, in the undoped zeolite problem, we have
four separate neural networks in total, encoding SSiC, SSiH,
SOC, and SOH respectively. Each network takes the pairwise
distance as input and outputs the modified Lennard-Jones
potential. Recall that the modification of a pairwise potential
only happens for atoms in the QM region. For atoms
outside the QM region, we keep the original LJ pairwise
potential for the energy calculation. The overall process is
summarized in Algorithm 1.</p>
      <p>An additional advantage of using neural networks as the
encoding functions, and implementing them in frameworks
such as TensorFlow or PyTorch, is that one can leverage the
auto-differentiation methods of those frameworks, so that
moving from training to deployment in a molecular
dynamics simulation bypasses an error-prone, manual
differentiation step to get the force fields.</p>
      <p>Finally, the choice of the functional form (1), rather than
a pure neural network, gives a generative model. Neural
networks interpolate well, but often have difficulty
extrapolating outside the data sets they were trained on. Our methods
extrapolate well outside the training set.</p>
      <p>Algorithm 1: Programmable Potentials with Neural Networks
/* Add modified pairwise potentials for atoms in the QM region */
for atom j in hydrocarbon do
    for zeolite atom k in QM region do
        E_PP ← E_PP + V_jk(r_jk) S_jk(r_jk)
/* Keep the original pairwise potentials for atoms outside the QM region */
for atom j in hydrocarbon do
    for zeolite atom k not in QM region do
        E_PP ← E_PP + V_jk(r_jk)</p>
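      <p>A runnable sketch of Algorithm 1, assuming hypothetical coordinate dictionaries and simple closed-form stand-ins for the pair potential V and the trained encoding function S (in PP-NN the latter is a neural network); all parameter values are illustrative.</p>

```python
import math

def lennard_jones(r, eps=0.3, r_min=3.5):
    """Stand-in 12-6 LJ pair potential with well depth eps at r = r_min."""
    x = (r_min / r) ** 6
    return eps * (x * x - 2.0 * x)

def encoding(r, sigma=2.5, n=6):
    """Stand-in encoding function S(r) = 1 - h(r; sigma, n)."""
    return 1.0 - 1.0 / (1.0 + (r / sigma) ** n)

def pp_energy(hydrocarbon, zeolite, qm_region):
    """Algorithm 1: modified pair terms inside the QM region, plain LJ outside."""
    e_pp = 0.0
    for xj in hydrocarbon.values():
        for k, xk in zeolite.items():
            r = math.dist(xj, xk)
            if k in qm_region:
                e_pp += lennard_jones(r) * encoding(r)  # V_jk * S_jk
            else:
                e_pp += lennard_jones(r)                # V_jk only
    return e_pp
```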
      <p>The encoding functions S_jk are structured as follows.
Pairwise distances between atoms are the inputs. For each
pairwise distance input, a proximity function is formed. The
proximity function has the form
h(r; σ, n) = 1 / (1 + (r/σ)^n),   (2)
where the parameter σ controls the cutoff distance and the
parameter n controls how sharp the cutoff is. These are both
tunable parameters, allowing the model to determine the
correct scales during optimization. Figure 2 shows proximity
functions at different parameter values n with a fixed cutoff
distance σ = 1. The proximity functions and their negations
(1 - h) are then fed into a neural network which combines
them to learn the interaction logic from the data set. Figure 3
sketches the basic architecture of the encoding function. We
call this architecture Programmable Potentials with Neural
Networks (PP-NN). We compare this new architecture with
the baseline model consisting of pure Lennard-Jones
interactions.</p>
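      <p>The proximity function of Eq. (2) and the feature pair fed to the encoding network can be written directly; the default σ and n values below are the illustrative ones from the discussion, not trained values.</p>

```python
def proximity(r, sigma=1.0, n=4):
    """Proximity function h(r; sigma, n) = 1 / (1 + (r / sigma)**n), Eq. (2)."""
    return 1.0 / (1.0 + (r / sigma) ** n)

def encoder_inputs(r, sigma=1.0, n=4):
    """Feature pair fed to the encoding network: h and its negation 1 - h."""
    h = proximity(r, sigma, n)
    return [h, 1.0 - h]
```

Note that h is 1/2 exactly at r = σ, approaches 1 well inside the cutoff, approaches 0 well outside it, and sharpens as n grows.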
      <p>In addition to comparing the new PP-NN architecture with
the baseline LJ model, we also compare it with a variety
of Programmable Potentials with manually specified
encoding functions. As an example, consider developing a
programmable potential for CH4 in the zeolite framework with
manually specified interaction logic. The molecular
dynamics (MD) baseline potential that we are utilizing is the 12-6
Lennard-Jones (LJ) potential, shown below in equation (3):
V_{LJ}(r_{ij}) = \sqrt{\epsilon_i \epsilon_j} \left[ \left( \frac{R_i + R_j}{r_{ij}} \right)^{12} - 2 \left( \frac{R_i + R_j}{r_{ij}} \right)^{6} \right],   (3)
where r_{ij} is the interatomic distance between atoms i and j,
\sqrt{\epsilon_i \epsilon_j} is the minimum of the potential energy, and R_i + R_j
yields the distance at which the potential is minimal.</p>
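      <p>Equation (3) as a function; the ε and R values below are made-up illustrative parameters, not the Q-Chem-provided ones.</p>

```python
import math

def v_lj(r_ij, eps_i=0.2, eps_j=0.45, r_i=1.7, r_j=1.5):
    """12-6 LJ of Eq. (3): minimum -sqrt(eps_i * eps_j) at r_ij = R_i + R_j."""
    eps = math.sqrt(eps_i * eps_j)
    x = ((r_i + r_j) / r_ij) ** 6
    return eps * (x * x - 2.0 * x)
```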
      <p>The baseline LJ potential is modified with encoding
functions according to unique atom pair types. Therefore, we
only develop encoding functions for Si-C, Si-H, O-C, O-H,
Al-C, and Al-H interactions. We have selected an encoding
function which only modifies the potentials of atoms whose
interatomic distance is within a cutoff radius. Furthermore,
the encoding function depends only on the interatomic distance
between the zeolite and methane atom of interest. For
example, the LJ potential between a carbon and a zeolite atom
will be modified by an encoding function that depends only
on the interatomic distance between the carbon and zeolite
atom. The analytic form of the resulting “hand-tuned”
potential for the Si-C and O-H interactions is shown below:</p>
      <p>V(r_{SiC}) = V_{LJ}(r_{SiC}) S(r_{SiC}) = V_{LJ}(r_{SiC}) (1 - h(r_{SiC}))
V(r_{OH_1}) = V_{LJ}(r_{OH_1}) S(r_{OH_1}) = V_{LJ}(r_{OH_1}) (1 - h(r_{OH_1})),   (4)
where h(r; σ, n) is the standard
proximity function that is used within the programmable
potentials methodology.</p>
      <p>As mentioned above, the programmable potential seeks
to modify the LJ potentials of zeolite atoms that fall within
a cutoff radius of the hydrocarbon. This procedure
essentially divides the entire system into two parts, where one is
governed by the molecular dynamics interactions and the
other is governed by the quantum mechanical interactions.
However, the QM/MM data provided by Q-Chem was
generated in a slightly different manner. Specifically, the QM
region was defined by Q-Chem to be composed of the
closest silicon atom to the hydrocarbon along with eight
additional neighbors of that silicon atom (4 oxygen and 4
silicon atoms). Therefore, the Q-Chem data was generated via
a connectivity-based QM region, whereas our hand-tuned
programmable potentials were generated via a radial-cutoff
QM region. The neural network programmable potential that
we have developed is also a connectivity-based potential.
Therefore, we have also focused on developing
connectivity-based programmable potentials with hand-tuned logic
to serve as an additional baseline for the neural network
programmable potentials.</p>
      <p>We list the models tested for the undoped zeolite problem
below.
• Case 1: Pure Lennard-Jones (LJ) potential using
Q-Chem provided LJ parameters. No optimization. This case
serves as the baseline.
• Case 2: Pure Lennard-Jones potential with optimization
of LJ parameters for carbon-hydrogen interactions.
• Case 3: Radial-distance-based programmable potential
using Q-Chem provided LJ parameters and optimization of
carbon-hydrogen encoding functions.
• Case 4: Radial-distance-based programmable potentials
with optimization of the LJ parameters and encoding functions
for carbon-hydrogen interactions.
• Case 5: Connectivity-based programmable potential
using Q-Chem provided LJ parameters and optimization of
carbon-hydrogen encoding functions.
• Case 6: Connectivity-based programmable potentials with
optimization of the LJ parameters and encoding functions
for C/H interactions.
• Case 7 (NN in plots): Connectivity-based programmable
potential with neural network using Q-Chem provided LJ
parameters. This architecture is the focus of this paper.</p>
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <p>We found that our models generalize well outside the
training set. For example, undoped MFI has 12 chemically
distinguishable silicon atoms, called symmetry regions and
labeled T1 to T12. For all models, only the T1 symmetry data
was used for training. The 11 other symmetry regions were
used for testing. Despite being trained on less than 10% of
the data, the model beat a pure Lennard-Jones baseline on
the T1 symmetry and also generalized well to the other
symmetry regions. We compare the results with the true data
using the mean relative error</p>
      <p>MRE = \frac{1}{n} \sum_{i=1}^{n} \frac{|T_i - F_i|}{\max_i(T) - \min_i(T)},   (5)
where T_i is the true energy of the system (from high-fidelity
QM/MM computations) at sample location i and F_i is the
energy of the model at sample location i.</p>
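      <p>Equation (5) as a function over paired lists of true and model energies:</p>

```python
def mean_relative_error(true_vals, model_vals):
    """Eq. (5): mean |T_i - F_i|, normalized by the range of the true data."""
    span = max(true_vals) - min(true_vals)
    n = len(true_vals)
    return sum(abs(t - f) for t, f in zip(true_vals, model_vals)) / (n * span)
```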
      <p>We also tested how well our trained models generalized
to other hydrocarbons. For example, we trained encoding
functions using the CH4, undoped zeolite (MFI), T1
symmetry data. These trained encoding functions were then used
to build a model for C2H6 in MFI. This new model beat a
pure Lennard-Jones potential for C2H6+MFI on almost all
the symmetry regions. With a small amount of retraining of
the CH4-learned encoding functions using C2H6+MFI T1
symmetry data, the model beat a pure Lennard-Jones
potential on all symmetry regions and was comparable or beat
a Programmable Potentials model trained directly on the
C2H6+MFI T1 symmetry data (see Figure 6). We also tested
transferring the CH4 encoding functions to C3H8 (Figure 7).
One lesson learned here was that the capacity of the
neural network needed to match the complexity of the
problem we wanted to transfer the learning to. For example, the
first approach used fully-connected neural networks having
4 layers with 8 neurons each. While it adequately learned
the CH4+MFI data (with good generalizations to the other
symmetries), it did not do a good job of transferring to the
C2H6+MFI data sets. Increasing the capacity of the
network (5 layers with 16, 32, 64, 32, 16 neurons respectively)
allowed good transferability. The conjecture is that the
network needs to have enough capacity to match the complexity
of the quantum logic of the problem to be transferred to.
Currently, this is done in an ad hoc manner. Developing more
formalized or precise heuristics or methods to determine such
a match requires additional effort and could be a topic for
future research.</p>
      <p>Finally, we tested transferring encoding functions from
the undoped MFI framework to the doped HMFI
framework, a more difficult problem (Figure 8). Doped zeolite is
formed by replacing a small fraction of silicon atoms with
aluminum atoms. These aluminum atoms present a
challenge to transferring the encoding functions. This is due
to the undoped MFI data sets not containing any
information about how aluminum reacts with the hydrocarbons.
The introduction of aluminum can be considered a black
swan event from the viewpoint of the encoding functions.
Despite this, the encoding functions generalized well from
CH4+MFI (methane in undoped zeolite) to CH4+HMFI
(methane in doped zeolite). Again the encoding functions
were only trained using CH4+MFI T1 symmetry data and
generalized well to the other symmetry regions of CH4 in
doped zeolite (of which there are only 10 instead of 12).</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusions</title>
      <p>We have presented the Programmable Potentials with
Neural Networks (PP-NN) methodology in the context of
adsorbing hydrocarbons in zeolite frameworks. The
methodology builds generative models out of classical molecular
dynamics potentials—such as Lennard-Jones potentials, which
give the correct long-range behavior—and small, targeted
neural networks that learn the quantum level logic between
atoms if given QM/MM data sets or other quantum
mechanical data sets.</p>
      <p>The learned encoding functions can be trained on
simpler hydrocarbons and transferred to more complex
hydrocarbons, where they beat the baseline Lennard-Jones potential.
Using the previously trained encoding functions as initial
conditions for the more complex hydrocarbons gives better
performance than an encoding function that was randomly
initialized and trained on the more complex hydrocarbon data
set. This ability to transfer seems to require a higher-capacity
network than is needed to just learn the potential for the
simpler hydrocarbon.</p>
      <p>Looking forward, the computational catalysis community
stands to benefit from the development of efficient,
accurate and transferable machine-learning architectures such as
we have developed in this program. This community has
high standards for acceptable accuracy, because significant
errors in the predicted energies enter observable quantities
such as relative populations of binding sites, turnover
frequencies, etc., in the exponential of Boltzmann factors or
Arrhenius rate constants. Small errors in the relative binding
energies can lead to erroneous predictions for gas
adsorption, and likewise can even lead to erroneous predictions
of reaction mechanisms. The goal in these communities is
to have ML learned force fields that have the accuracy of
quantum methods, but at the cost of force field methods. The
Programmable Potentials neural network (PP-NN)
methodology is a promising initial step towards this goal. Since,
once trained, the PP-NN has an execution time on the order
of classical force field methods, molecular dynamics
simulations with quantum-level accuracy could be run in
hours or days rather than years for a comparable QM/MM
simulation.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgements</title>
      <p>This work was supported under DARPA contract
HR001119-9-0026.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Corma</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>1995</year>
          .
          <article-title>Inorganic Solid Acids and Their Use in Acid-Catalyzed Hydrocarbon Reactions</article-title>
          .
          <source>Chem. Rev</source>
          .
          <volume>95</volume>
          (
          <issue>3</issue>
          ):
          <fpage>559</fpage>
          -
          <lpage>614</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Jensen</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <year>2007</year>
          .
          <article-title>Introduction to Computational Chemistry</article-title>
          . John Wiley &amp; Sons, second edition.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Smit</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Maesen</surname>
            ,
            <given-names>T. L. M.</given-names>
          </string-name>
          <year>2008</year>
          .
          <article-title>Towards a molecular understanding of shape selectivity</article-title>
          .
          <source>Nature</source>
          <volume>451</volume>
          (
          <issue>7179</issue>
          ):
          <fpage>671</fpage>
          -
          <lpage>678</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Thakur</surname>
            ,
            <given-names>G. S.</given-names>
          </string-name>
          ; Mohr, R.; and Mezić,
          <string-name>
            <surname>I.</surname>
          </string-name>
          <year>2016</year>
          .
          <article-title>Programmable Potentials: Approximate N-body potentials from coarse-level logic</article-title>
          .
          <source>Scientific Reports</source>
          <volume>6</volume>
          (
          <issue>1</issue>
          ):
          <fpage>33415</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Mansoor</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ; Van der Mynsbrugge, J.; Head-Gordon, M.; and Bell, A. T.
          <year>2018</year>
          .
          <article-title>Impact of long-range electrostatic and dispersive interactions on theoretical predictions of adsorption and catalysis in zeolites</article-title>
          .
          <source>Catal. Today</source>
          <volume>312</volume>
          (SI):
          <fpage>51</fpage>
          -
          <lpage>65</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>