<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Combining Programmable Potentials and Neural Networks for Materials Problems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ryan Mohr</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Allan Avila</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Soham Gosh</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ananta Bhattarai</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Muqiao Yang</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xintian Feng</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Head-Gordon</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ruslan Salakhutdinov</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maria Fonoberova</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Igor Mezić</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>Force field methods can be used in molecular dynamics simulations to compute and infer macro-level properties of chemical systems. However, these force fields often predict erroneous values due to not being accurate at the quantum level. Ideally, one would be able to simulate the system from first principles (quantum level) and scale up to get the material properties. However, this is computationally infeasible. In this paper, we develop a Programmable Potentials methodology that utilizes high-fidelity QM/MM data sets to inform molecular dynamics (MD) potentials. The method learns the quantum-level interaction logic of the system from the data and uses these encoding functions to modify standard MD potentials, like Lennard-Jones, which capture the correct long-range behavior but do poorly in the near range. We test the method on the adsorption of hydrocarbons in a zeolite framework. We show that the encoding functions, and the resulting model, generalize well outside their training set and that encoding functions trained with data from a simple hydrocarbon such as CH4 can build good models for more complex hydrocarbons such as C2H6 and C3H8.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        We seek to use data from small quantum mechanical
simulations to build generative models at the molecular scale in
order to predict bulk material properties. This is not
achievable with current methods. Modeling and solving a system
on the level of the Schrödinger equation scales
exponentially with the number of electrons in the system, making
it infeasible for any system of reasonable size. Density
functional theory and periodicity assumptions on the system
(so-called QM/MM methods) mitigate this problem somewhat,
but are still infeasible for many systems
        <xref ref-type="bibr" rid="ref2">(Jensen 2007)</xref>
        . For
example, using QM/MM methods for direct simulation of
molecular dynamics (MD) for zeolite-catalyzed reactions on
the nanosecond time scale requires the sequential evaluation
of roughly a million energy and force evaluations, each
of which requires about 100 seconds using 8 cores. This
equates to approximately 0.1 billion seconds for a simulation,
or over 3 years.
      </p>
      <p>
        Our approach constructs a generative model by taking
classical molecular dynamics potentials (such as Lennard-Jones
potentials), which capture the correct long-range behavior,
and combines them with small neural networks, which encode
the quantum-level logic and correct the near-range behavior.
The encoding functions multiplicatively modify the classical
molecular potentials. Our approach is termed Programmable
Potentials with Neural Networks (PP-NN). The formulation of
Programmable Potentials
        <xref ref-type="bibr" rid="ref4">(Thakur, Mohr, and Mezić 2016)</xref>
        prior to this work relied on the researcher to manually
determine and specify the interaction logic and encode it in a
Boolean algebra. This was a time-consuming process and
possible only for simple reactions. The idea we present here is
to learn this logic automatically by leveraging neural networks
and tunable proximity functions, thereby replacing the
“hand-tuned” encoding functions. An additional advantage of using
neural networks as the encoding functions, and implementing
them in packages such as TensorFlow or PyTorch, is that one
can leverage the auto-differentiation routines of those
frameworks, so that moving from training to deployment in a
molecular dynamics simulation bypasses an error-prone, manual
differentiation step to get the force fields.
      </p>
      <p>
        We applied our methods to the materials science problem
of hydrocarbons binding in a zeolite framework, both undoped
(MFI) and doped (HMFI). Doped zeolites are nano-porous
doped silica catalysts that are the world's most widely used
catalytic framework for cracking, dehydrogenation, and
oligomerization of hydrocarbons (Primo and Garcia 2014). The
classic example of doping is replacement of a small fraction of
Si by Al and an accompanying proton to create a solid acid
catalyst
        <xref ref-type="bibr" rid="ref1">(Corma 1995)</xref>
        . An illustration is shown in Figure 1. Doping a zeolite
turns a nonfunctional material into a functional material,
where catalyst function is controlled by short-range electronic
effects from doping, as well as shape-selective effects (pore
size and geometry)
        <xref ref-type="bibr" rid="ref3">(Smit and Maesen 2008)</xref>
        , and long-range effects
(remote charges and polarization) from the extended
framework (Mansoor et al. 2018).
      </p>
    </sec>
    <sec id="sec-2">
      <title>Methods</title>
      <p>Pure force field methods often do not capture a potential
field correctly. While giving the correct form at long range,
the short range energies can be incorrect. Ideally, one would
like to use a full quantum mechanics simulation, but this
is intractable except for small systems. QM/MM hybrid
approaches are introduced to relax these limitations. As
mentioned above, however, these methods still suffer from
computational intractability if one would like to simulate them
over a reasonable time scale, which would be needed to
extract macroscale properties of the material.</p>
      <p>Programmable Potentials is a force field method that
takes into account the reaction logic of the system, quantum
or otherwise, given a data set. Since it is a force field method,
this makes it amenable for molecular dynamics simulations.
Using small, high-fidelity QM/MM data sets gives a force
field model that can be evaluated outside its training set,
setting the stage for long time-scale simulations that are still
informed by quantum level information.</p>
      <p>The Programmable Potential model takes the form
E_{PP} = \sum_j \sum_{k&gt;j} V_{jk} S_{jk},   (1)
where V_{jk} is a pairwise potential between atoms j and k,
such as the Lennard-Jones potential in Eq. (3), and S_{jk} is
an encoding function that encodes the interaction logic of
the system. In bond-breaking reaction systems, the encoding
functions are usually restricted to have values between 0 and
1 and act as a switch that turns individual bonds on or off. In
static systems, like binding of guest molecules such as CH4
in a host zeolite framework, the encoding function values are
not squashed in this way.</p>
      <p>
        In the original formulation of Programmable Potentials
        <xref ref-type="bibr" rid="ref4">(Thakur, Mohr, and Mezić 2016)</xref>
        , the coarse-level logic of
the system was encapsulated in the encoding functions. This
logic was first constructed using a Boolean algebra
generated by indicator functions on intervals [0, σ). Once the logic
was constructed, the indicator functions were replaced with
smooth variants (see Eq. (2)) whose parameters were then
optimized over a training data set. This process could become
quite complicated as the system moved past very simple
reactions and could require many man-hours and much
ingenuity. The formulation in this paper replaces these hand-tuned
encoding functions with encoding functions built from small
neural networks that automatically learn the interaction logic.
The neural networks used were densely connected networks
with, generally, no more than 8 layers and between 8 and 64
neurons per layer.
      </p>
      <p>For reactions between a small number of atoms, we can
apply an encoding function for each interaction. However,
for the zeolite problem, using an encoding function for
every interaction is infeasible. This would require
optimizing thousands of neural networks simultaneously due to the
number of atoms in the zeolite framework. Therefore, we
learned an encoding function for each unique interaction
type and limited them to zeolite atoms close to the
hydrocarbon. For example, every silicon-carbon interaction used the
same encoding function. This reduced the number of
encoding functions that needed to be trained from thousands to 4
(for the undoped zeolite). This also enabled transfer learning
to other systems by learning a “universal” encoding function
for each interaction type.</p>
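      <p>The sharing scheme above can be sketched as a lookup table with one function per pair type; the make_encoder form and the sigma/n values below are hypothetical stand-ins for the trained neural networks.</p>

```python
def make_encoder(sigma, n):
    """Stand-in encoding function S(r) = 1 - h(r; sigma, n)."""
    return lambda r: 1.0 - 1.0 / (1.0 + (r / sigma) ** n)

# Four shared functions for the undoped zeolite: one per unique pair type.
encoders = {
    ("Si", "C"): make_encoder(2.8, 6),
    ("Si", "H"): make_encoder(2.4, 6),
    ("O", "C"): make_encoder(2.6, 6),
    ("O", "H"): make_encoder(2.2, 6),
}

def encode(zeolite_species, hydrocarbon_species, r):
    """Every atom pair of a given type uses the same shared function."""
    return encoders[(zeolite_species, hydrocarbon_species)](r)
```

Because every silicon-carbon pair reuses one function, only four functions need training instead of thousands, and the same table can seed transfer learning on a new system.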
      <p>For each unique atom pair type interaction, we use one
neural network to learn the corresponding encoding
function. For example, in the undoped zeolite problem, we have
four separate neural networks in total, encoding SSiC, SSiH,
SOC, and SOH respectively. Each network takes the pairwise
distance as input and outputs the modified Lennard-Jones
potential. Recall that the modification of a pairwise potential
only happens for atoms in the QM region. For atoms
outside the QM region, we keep the original LJ pairwise
potential for the energy calculation. The overall process is
summarized in Algorithm 1.</p>
      <p>An additional advantage of using neural networks as the
encoding functions, and implementing them in frameworks
such as TensorFlow or PyTorch, is that one can leverage the
auto-differentiation methods of those frameworks, so that
moving from training to deployment in a molecular
dynamics simulation bypasses an error-prone, manual
differentiation step to get the force fields.</p>
      <p>Finally, the choice of the functional form (1), rather than
a pure neural network, gives a generative model. Neural
networks interpolate well, but often have difficulty
extrapolating outside the data sets they were trained on. Our methods
extrapolate well outside the training set.</p>
      <p>Algorithm 1: Programmable Potentials with Neural Networks
/* Add modified pairwise potentials for atoms in the QM region */
for atom j in hydrocarbon do
    for zeolite atom k in QM region do
        E_PP ← E_PP + V_jk(r_jk) S_jk(r_jk)
/* Keep the original pairwise potentials for atoms outside the QM region */
for atom j in hydrocarbon do
    for zeolite atom k not in QM region do
        E_PP ← E_PP + V_jk(r_jk)</p>
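      <p>A runnable sketch of Algorithm 1, assuming hypothetical coordinate dictionaries and simple closed-form stand-ins for the pair potential V and the trained encoding function S (in PP-NN the latter is a neural network); all parameter values are illustrative.</p>

```python
import math

def lennard_jones(r, eps=0.3, r_min=3.5):
    """Stand-in 12-6 LJ pair potential with well depth eps at r = r_min."""
    x = (r_min / r) ** 6
    return eps * (x * x - 2.0 * x)

def encoding(r, sigma=2.5, n=6):
    """Stand-in encoding function S(r) = 1 - h(r; sigma, n)."""
    return 1.0 - 1.0 / (1.0 + (r / sigma) ** n)

def pp_energy(hydrocarbon, zeolite, qm_region):
    """Algorithm 1: modified pair terms inside the QM region, plain LJ outside."""
    e_pp = 0.0
    for xj in hydrocarbon.values():
        for k, xk in zeolite.items():
            r = math.dist(xj, xk)
            if k in qm_region:
                e_pp += lennard_jones(r) * encoding(r)  # V_jk * S_jk
            else:
                e_pp += lennard_jones(r)                # V_jk only
    return e_pp
```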
      <p>The encoding functions S_jk are structured as follows.
Pairwise distances between atoms are the inputs. For each
pairwise distance input, a proximity function is formed. The
proximity function has the form
h(r; σ, n) = 1 / (1 + (r/σ)^n),   (2)
where the parameter σ controls the cutoff distance and the
parameter n controls how sharp the cutoff is. These are both
tunable parameters, allowing the model to determine the
correct scales during optimization. Figure 2 shows proximity
functions at different parameter values n with a fixed cutoff
distance σ = 1. The proximity functions and their negations
(1 - h) are then fed into a neural network which combines
them to learn the interaction logic from the data set. Figure 3
sketches the basic architecture of the encoding function. We
call this architecture Programmable Potentials with Neural
Networks (PP-NN). We compare this new architecture with
the baseline model consisting of pure Lennard-Jones
interactions.</p>
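      <p>The proximity function of Eq. (2) and the feature pair fed to the encoding network can be written directly; the default σ and n values below are the illustrative ones from the discussion, not trained values.</p>

```python
def proximity(r, sigma=1.0, n=4):
    """Proximity function h(r; sigma, n) = 1 / (1 + (r / sigma)**n), Eq. (2)."""
    return 1.0 / (1.0 + (r / sigma) ** n)

def encoder_inputs(r, sigma=1.0, n=4):
    """Feature pair fed to the encoding network: h and its negation 1 - h."""
    h = proximity(r, sigma, n)
    return [h, 1.0 - h]
```

Note that h is 1/2 exactly at r = σ, approaches 1 well inside the cutoff, approaches 0 well outside it, and sharpens as n grows.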
      <p>In addition to comparing the new PP-NN architecture with
the baseline LJ model, we also compare it with a variety
of Programmable Potentials with manually specified
encoding functions. As an example, consider developing a
programmable potential for CH4 in the zeolite framework with
manually specified interaction logic. The molecular
dynamics (MD) baseline potential that we are utilizing is the 12-6
Lennard-Jones (LJ) potential, shown below in equation (3):
V_{LJ}(r_{ij}) = \sqrt{\epsilon_i \epsilon_j} \left[ \left( \frac{R_i + R_j}{r_{ij}} \right)^{12} - 2 \left( \frac{R_i + R_j}{r_{ij}} \right)^{6} \right],   (3)
where r_{ij} is the interatomic distance between atoms i and j,
\sqrt{\epsilon_i \epsilon_j} is the minimum of the potential energy, and R_i + R_j
yields the distance at which the potential is minimal.</p>
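      <p>Equation (3) as a function; the ε and R values below are made-up illustrative parameters, not the Q-Chem-provided ones.</p>

```python
import math

def v_lj(r_ij, eps_i=0.2, eps_j=0.45, r_i=1.7, r_j=1.5):
    """12-6 LJ of Eq. (3): minimum -sqrt(eps_i * eps_j) at r_ij = R_i + R_j."""
    eps = math.sqrt(eps_i * eps_j)
    x = ((r_i + r_j) / r_ij) ** 6
    return eps * (x * x - 2.0 * x)
```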
      <p>The baseline LJ potential is modified with encoding
functions according to unique atom pair types. Therefore, we
only develop encoding functions for Si-C, Si-H, O-C, O-H,
Al-C, and Al-H interactions. We have selected an encoding
function which only modifies the potentials of atoms whose
interatomic distance is within a cutoff radius. Furthermore,
the encoding function depends only on the interatomic distance
between the zeolite and methane atom of interest. For
example, the LJ potential between a carbon and a zeolite atom
will be modified by an encoding function that depends only
on the interatomic distance between the carbon and zeolite
atom. The analytic form of the resulting “hand-tuned”
potential for the Si-C and O-H interactions is shown below:</p>
      <p>V(r_{SiC}) = V_{LJ}(r_{SiC}) S(r_{SiC}) = V_{LJ}(r_{SiC}) (1 - h(r_{SiC}))
V(r_{OH_1}) = V_{LJ}(r_{OH_1}) S(r_{OH_1}) = V_{LJ}(r_{OH_1}) (1 - h(r_{OH_1})),   (4)
where h(r; σ, n) is the standard
proximity function that is used within the programmable
potentials methodology.</p>
      <p>As mentioned above, the programmable potential seeks
to modify the LJ potentials of zeolite atoms that fall within
a cutoff radius of the hydrocarbon. This procedure
essentially divides the entire system into two parts, where one is
governed by the molecular dynamics interactions and the
other is governed by the quantum mechanical interactions.
However, the QM/MM data provided by Q-Chem was
generated in a slightly different manner. Specifically, the QM
region was defined by Q-Chem to be composed of the
closest silicon atom to the hydrocarbon along with eight
additional neighbors of that silicon atom (4 oxygen and 4
silicon atoms). Therefore, the Q-Chem data was generated via
a connectivity-based QM region, whereas our hand-tuned
programmable potentials were generated via a radial-cutoff
QM region. The neural network programmable potential that
we have developed is also a connectivity-based potential.
Therefore, we have also focused on developing
connectivity-based programmable potentials with hand-tuned logic
to serve as an additional baseline for the neural network
programmable potentials.</p>
      <p>We list the models tested for the undoped zeolite problem
below.
• Case 1: Pure Lennard-Jones (LJ) potential using
Q-Chem provided LJ parameters. No optimization. This case
serves as the baseline.
• Case 2: Pure Lennard-Jones potential with optimization
of LJ parameters for carbon-hydrogen interactions.
• Case 3: Radial-distance-based programmable potential
using Q-Chem provided LJ parameters and optimization of
carbon-hydrogen encoding functions.
• Case 4: Radial-distance-based programmable potentials
with optimization of the LJ parameters and encoding functions
for carbon-hydrogen interactions.
• Case 5: Connectivity-based programmable potential
using Q-Chem provided LJ parameters and optimization of
carbon-hydrogen encoding functions.
• Case 6: Connectivity-based programmable potentials with
optimization of the LJ parameters and encoding functions
for C/H interactions.
• Case 7 (NN in plots): Connectivity-based programmable
potential with neural network using Q-Chem provided LJ
parameters. This architecture is the focus of this paper.</p>
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <p>We found that our models generalize well outside the
training set. For example, undoped MFI has 12 chemically
distinguishable silicon atoms, called symmetry regions and
labeled T1 to T12. For all models, only the T1 symmetry data
was used for training. The 11 other symmetry regions were
used for testing. Despite being trained on less than 10% of
the data, the model beat a pure Lennard-Jones baseline on
the T1 symmetry and also generalized well to the other
symmetry regions. We compare the results with the true data
using the mean relative error</p>
      <p>MRE = \frac{1}{n} \sum_{i=1}^{n} \frac{|T_i - F_i|}{\max_i(T) - \min_i(T)},   (5)
where T_i is the true energy of the system (from high-fidelity
QM/MM computations) at sample location i and F_i is the
energy of the model at sample location i.</p>
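      <p>Equation (5) as a function over paired lists of true and model energies:</p>

```python
def mean_relative_error(true_vals, model_vals):
    """Eq. (5): mean |T_i - F_i|, normalized by the range of the true data."""
    span = max(true_vals) - min(true_vals)
    n = len(true_vals)
    return sum(abs(t - f) for t, f in zip(true_vals, model_vals)) / (n * span)
```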
      <p>We also tested how well our trained models generalized
to other hydrocarbons. For example, we trained encoding
functions using the CH4, undoped zeolite (MFI), T1
symmetry data. These trained encoding functions were then used
to build a model for C2H6 in MFI. This new model beat a
pure Lennard-Jones potential for C2H6+MFI on almost all
the symmetry regions. With a small amount of retraining of
the CH4-learned encoding functions using C2H6+MFI T1
symmetry data, the model beat a pure Lennard-Jones
potential on all symmetry regions and was comparable or beat
a Programmable Potentials model trained directly on the
C2H6+MFI T1 symmetry data (see Figure 6). We also tested
transferring the CH4 encoding functions to C3H8 (Figure 7).
One lesson learned here was that the capacity of the
neural network needed to match the complexity of the
problem we wanted to transfer the learning to. For example, the
first approach used fully-connected neural networks having
4 layers with 8 neurons each. While it adequately learned
the CH4+MFI data (with good generalizations to the other
symmetries), it did not do a good job of transferring to the
C2H6+MFI data sets. Increasing the capacity of the
network (5 layers with 16, 32, 64, 32, 16 neurons respectively)
allowed good transferability. The conjecture is that the
network needs to have enough capacity to match the complexity
of the quantum logic of the problem to be transferred to.
Currently, this is done in an ad hoc manner. Developing more
formalized or precise heuristics or methods to determine such
a match requires additional effort and could be a topic for
future research.</p>
      <p>Finally, we tested transferring encoding functions from
the undoped MFI framework to the doped HMFI
framework, a more difficult problem (Figure 8). Doped zeolite is
formed by replacing a small fraction of silicon atoms with
aluminum atoms. These aluminum atoms present a
challenge to transferring the encoding functions. This is due
to the undoped MFI data sets not containing any
information about how aluminum reacts with the hydrocarbons.
The introduction of aluminum can be considered a black
swan event from the viewpoint of the encoding functions.
Despite this, the encoding functions generalized well from
CH4+MFI (methane in undoped zeolite) to CH4+HMFI
(methane in doped zeolite). Again the encoding functions
were only trained using CH4+MFI T1 symmetry data and
generalized well to the other symmetry regions of CH4 in
doped zeolite (of which there are only 10 instead of 12).</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusions</title>
      <p>We have presented the Programmable Potentials with
Neural Networks (PP-NN) methodology in the context of
adsorbing hydrocarbons in zeolite frameworks. The
methodology builds generative models out of classical molecular
dynamics potentials—such as Lennard-Jones potentials, which
give the correct long-range behavior—and small, targeted
neural networks that learn the quantum level logic between
atoms if given QM/MM data sets or other quantum
mechanical data sets.</p>
      <p>The learned encoding functions can be trained on
simpler hydrocarbons and transferred to more complex
hydrocarbons, where they beat the baseline Lennard-Jones potential.
Using the previously trained encoding functions as initial
conditions for the more complex hydrocarbons gives better
performance than an encoding function that was randomly
initialized and trained on the more complex hydrocarbon data
set. This ability to transfer seems to require a higher-capacity
network than is needed to just learn the potential for the
simpler hydrocarbon.</p>
      <p>Looking forward, the computational catalysis community
stands to benefit from the development of efficient,
accurate and transferable machine-learning architectures such as
we have developed in this program. This community has
high standards for acceptable accuracy, because significant
errors in the predicted energies enter observable quantities
such as relative populations of binding sites, turnover
frequencies, etc., in the exponential of Boltzmann factors or
Arrhenius rate constants. Small errors in the relative binding
energies can lead to erroneous predictions for gas
adsorption, and likewise can even lead to erroneous predictions
of reaction mechanisms. The goal in these communities is
to have ML learned force fields that have the accuracy of
quantum methods, but at the cost of force field methods. The
Programmable Potentials neural network (PP-NN)
methodology is a promising initial step towards this goal. Since,
once trained, the PP-NN has an execution time on the order
of classical force field methods, molecular dynamics
simulations with quantum-level accuracy could be run in
hours or days rather than years for a comparable QM/MM
simulation.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgements</title>
      <p>This work was supported under DARPA contract
HR001119-9-0026.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Corma</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>1995</year>
          .
          <article-title>Inorganic Solid Acids and Their Use in Acid-Catalyzed Hydrocarbon Reactions</article-title>
          .
          <source>Chem. Rev</source>
          .
          <volume>95</volume>
          (
          <issue>3</issue>
          ):
          <fpage>559</fpage>
          -
          <lpage>614</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Jensen</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <year>2007</year>
          .
          <article-title>Introduction to Computational Chemistry</article-title>
          . John Wiley &amp; Sons, second edition.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Smit</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Maesen</surname>
            ,
            <given-names>T. L. M.</given-names>
          </string-name>
          <year>2008</year>
          .
          <article-title>Towards a molecular understanding of shape selectivity</article-title>
          .
          <source>Nature</source>
          <volume>451</volume>
          (
          <issue>7179</issue>
          ):
          <fpage>671</fpage>
          -
          <lpage>678</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Thakur</surname>
            ,
            <given-names>G. S.</given-names>
          </string-name>
          ; Mohr, R.; and Mezić,
          <string-name>
            <surname>I.</surname>
          </string-name>
          <year>2016</year>
          .
          <article-title>Programmable Potentials: Approximate N-body potentials from coarse-level logic</article-title>
          .
          <source>Scientific Reports</source>
          <volume>6</volume>
          (
          <issue>1</issue>
          ):
          <fpage>33415</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Mansoor</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ; Van der Mynsbrugge, J.; Head-Gordon, M.; and Bell, A. T.
          <year>2018</year>
          .
          <article-title>Impact of long-range electrostatic and dispersive interactions on theoretical predictions of adsorption and catalysis in zeolites</article-title>
          .
          <source>Catal. Today</source>
          <volume>312</volume>
          (SI):
          <fpage>51</fpage>
          -
          <lpage>65</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>