<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>PhyloTransformer: A Self-supervised Discriminative Model for SARS-CoV-2 Viral Mutation Prediction Based on a Multi-head Self-attention Mechanism</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yingying Wu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shusheng Xu</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shing-Tung Yau</string-name>
          <email>yau@math.harvard.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yi Wu</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Harvard University, Center of Mathematical Sciences and Applications</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Shanghai Qi Zhi Institute</institution>
          ,
          <addr-line>Shanghai</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Tsinghua University, Institute for Interdisciplinary Information Sciences</institution>
          ,
          <addr-line>Beijing</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Houston</institution>
          ,
<addr-line>Houston</addr-line>
          ,
          <country country="US">United States</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>1</volume>
      <fpage>9</fpage>
      <lpage>20</lpage>
      <abstract>
<p>In this article, we developed PhyloTransformer, a Transformer-based self-supervised discriminative model, which can model genetic mutations that may lead to viral reproductive advantage. We trained PhyloTransformer on 1,765,297 severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequences to infer fitness advantages by directly modeling the nucleic acid sequence mutations. PhyloTransformer utilizes advanced techniques from natural language processing to enable efficient and accurate intra-sequence dependency modeling over the entire RNA sequence. We compared the prediction accuracy of novel mutations and novel combinations between our method and baseline models that only take local segments as input, and found that PhyloTransformer outperformed every baseline method with statistical significance. We also predicted the occurrence of mutations in each nucleotide of the receptor binding motif (RBM) and predicted modifications of N-glycosylation sites. We anticipate that the viral mutations predicted by PhyloTransformer may identify potential mutations of threat to guide therapeutics and vaccine design for effective targeting of future SARS-CoV-2 variants. Keywords: COVID-19, SARS-CoV-2, spike protein, variants of concern, PhyloTransformer, self-supervised neural network. KDH@IJCAI 2023: The 6th international workshop on knowledge. †These authors contributed equally.</p>
      </abstract>
      <kwd-group>
<kwd>COVID-19</kwd>
        <kwd>SARS-CoV-2</kwd>
        <kwd>spike protein</kwd>
        <kwd>variants of concern</kwd>
        <kwd>PhyloTransformer</kwd>
        <kwd>self-supervised neural network</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>CoV-2) is the causative agent of Coronavirus disease 2019
(COVID-19). The unprecedented COVID-19 pandemic
is one of three major pathogenic zoonotic disease
outbreaks caused by  -coronaviruses in the past two decades
[1, 2]. Severe acute respiratory syndrome coronavirus
(SARS-CoV) emerged in 2002, infecting 8,000 people with
drome coronavirus (MERS-CoV) emerged in 2012 with
2,300 cases and a 35% mortality rate [5]. The third
outbreak, mediated by SARS-CoV-2, emerged in 2019 with
a mortality rate of 3.6% [6] and 219 million cases have
been reported as of October 2021.</p>
      <sec id="sec-1-1">
<p>After the emergence of SARS-CoV-2 in late 2019, the virus exhibited relative evolutionary stasis for approximately 11 months. Since the end of 2020, SARS-CoV-2 has consistently acquired approximately two mutations per month [7], resulting in novel variants of concern (VOCs).</p>
<p>As more individuals became vaccinated against SARS-CoV-2 [10], the immune profile of the human population changed. Mutations can have an impact on fitness advantages [9]: they may alter transmissibility [11, 12], angiotensin converting enzyme 2 (ACE2) binding affinity [13], or antigenicity [14], and some introduce an optimized trade-off to improve overall fecundity. Heavily mutated lineages have also been reported, such as the lineage B.1.1.298, which harbors the following four amino acid substitutions: ΔH69–V70, Y453F, I692V, and M1229I [15]. Some mutations may amplify other mutations, providing an improved fitness advantage. For example, the combination of E484K, K417N, and N501Y results in the highest degree of conformational alteration compared to either E484K or N501Y alone [16].</p>
        <p>We used the hCoV-19/Wuhan/WIV04/2019 sequence (WIV04) as our reference sequence, which is the official reference sequence employed by GISAID (EPI_ISL_402124). WIV04 represented the consensus of several early submissions for the β-coronavirus responsible for COVID-19 [19], and was isolated by the Wuhan Institute of Virology from a clinical sample of bronchoalveolar lavage fluid for RNA extraction and metagenomic next-generation sequencing. The consensus sequence was obtained by de novo assembly [20].</p>
<p>Figure 1: PhyloTransformer prediction paradigm.</p>
        <p>Based on WIV04, we define a mutation as the change in a nucleotide at a particular position that is different from the reference sequence. We define a mutation at a particular position that only occurs in the testing set but does not occur within the training set as a novel mutation, which signifies a mutation that is novel for the training set. We define all the novel mutations over an RNA sequence as a novel combination, i.e., a combination of mutations that does not occur in the training data. The prediction of novel mutations aims to predict single mutations, while the prediction of novel combinations aims to predict a collection of single mutations that jointly occur in a mutated sequence.</p>
        <p>Variants requiring immediate attention are circulating, which highlights the urgent need to develop effective prevention and treatment strategies. While vaccination has been the most important and effective preventive measure, it is also facing challenges. The mRNA vaccine BNT162b2 (Pfizer–BioNTech) has 95% efficacy against COVID-19 [17]. However, the estimated effectiveness of the vaccine against the B.1.1.7 variant was 89.5% (95% CI, 85.9 to 92.3) at 14 or more days after the second dose, and 75.0% (95% CI, 70.5 to 78.9) against the B.1.351 variant [18] at 14 or more days after the second dose. Several studies have characterized multiple mutations that change the antigenic phenotype, and thus elucidate how these mutations affect antibody-mediated neutralization. Variants containing these mutations are potentially highly virulent and have received much recent attention. However, it remains unknown whether more infectious variants exist, along with the likelihood that they will appear and transmit. Designing vaccines after a novel variant has emerged is not optimal because the variant could potentially compromise existing vaccines and spread among the population. Thus, more infections might generate further variants, leading to a never-ending pandemic.</p>
        <p>In order to win the race against the rapidly evolving SARS-CoV-2, an intelligent system capable of forecasting potential VOCs before they actually appear is urgently required. Therefore, in order to infer fitness advantages, we proposed PhyloTransformer, which models constraints from natural sequences, including long-range dependencies between positions. We hope that PhyloTransformer can be used to predict novel mutations and novel combinations of mutations in SARS-CoV-2, as depicted in Fig. 1. Thus, we anticipate that when variants of high consequence arise, existing vaccines based on PhyloTransformer predictions will have already been developed that target those strains.</p>
        <p>The prediction accuracies of novel mutations and novel combinations were evaluated after the predicting models PhyloTransformer, Local Transformer, and ResNet-18 converged. We first performed lag 1 autocorrelation to test the correlation between accuracy scores obtained from models that are one checkpoint apart. The autocorrelation tests were performed on the small, medium, and large datasets for predicting novel mutations and novel combinations, with a total of 18 tests. We found no time dependency between the 10 accuracy scores in each of these 18 tests. For the other classical machine learning models, we repeated the experiment 10 times for each dataset. The details are reported in Box 1C.</p>
        <p>In this section, we first evaluated PhyloTransformer-generated predictions of novel mutations and novel combinations. Next, we compared the accuracy of each prediction with those obtained from baseline models. We then reported our predictions in the receptor binding motif (RBM). Finally, we predicted modifications of N-glycosylation sites to help identify mutations associated with altered glycosylation that might be favored during viral evolution. The detailed model architecture and training process are reported in the Methodology section.</p>
<sec id="sec-1-1-2">
          <title>Predicting Novel Mutations</title>
          <p>We evaluated the efficacy of PhyloTransformer in predicting novel mutations and compared it to baseline model predictions from three datasets of different sizes spanning different time frames. The prediction results are reported in Box 1.</p>
          <p>Box 1 | Prediction Accuracy. A. Prediction accuracy of novel mutations from the small, medium, and large datasets based on PhyloTransformer and the best baseline methods. B. Prediction accuracy of novel combinations trained with the small, medium, and large datasets based on PhyloTransformer and the best baseline methods. The accuracy improvement for each indicated model was calculated by dividing the number of correct predictions by the expected number of correct random guesses. C. Prediction accuracy of PhyloTransformer- and baseline method-generated predictions of novel mutations and novel combinations. Sig. Phylo: p-value with respect to PhyloTransformer. Sig. Local: p-value with respect to Local Transformer.</p>
          <p>For each mutation, we masked the nucleotide in the reference sequence, predicted which nucleotide it would mutate to, and selected the nucleotide with the highest confidence as our prediction. The prediction accuracy is the proportion of positions that are predicted correctly among all novel positions in the testing set. The prediction accuracy of random guessing is exactly 1/3. We evaluated the prediction efficacy averaged over 10 checkpoints after the convergence of PhyloTransformer, Local Transformer, and our baseline models on the three datasets, with the variance marked either below or above. Next, we reported the model predictions from each dataset, which are displayed in Box 1A.</p>
          <p>We performed a two-sample z-test of proportions and found that, for each baseline model, the best prediction accuracy of novel mutations from the large dataset among the 10 checkpoints was significantly less than that of PhyloTransformer. Local Transformer had the best performance among the baseline models, but its average over 10 checkpoints was still 11% lower than that of PhyloTransformer on the large dataset, with statistical significance, as shown in Box 1C. Table 1 reports the 20 novel mutations predicted with the greatest probability by PhyloTransformer trained on the large dataset.</p>
        </sec>
        <sec id="sec-1-1-3">
          <title>Predicting Novel Combinations</title>
          <p>If a sequence in the testing set does not exist in the training set, we compared it to the reference sequence, then masked the mutated positions and generated predictions at these positions. If the model predicts all the mutations in this sequence correctly, we say that it predicted a novel combination correctly. The accuracy of predicting novel combinations is the proportion of sequences whose combinations are predicted correctly among all the sequences in the testing set.</p>
          <p>The difficulty of predicting novel combinations changes as the size of the dataset changes, so we measure our prediction efficacy by accuracy improvement, defined as the following: Improvement := Model Acc. / Random Guessing Acc.</p>
          <p>For the small dataset, there were 2.26 mutations on average with a standard deviation (SD) = 5.06; for the medium dataset, there were 3.06 mutations on average with an SD = 2.56; and for the large dataset, there were 8.75 mutations on average with an SD = 2.87. For the small dataset, random guessing resulted in an accuracy of 13.30% with an SD = 1.12%; for the medium dataset, random guessing resulted in an accuracy of 5.42% with an SD = 0.12%; and for the large dataset, random guessing resulted in an accuracy of 0.26% with an SD = 0.012%. The prediction results are summarized in Box 1B, where the accuracy improvement value was defined as follows: given the dataset (small, medium, or large), take the number of correct predictions generated by the indicated model and divide that value by the expected number of correct random guesses.</p>
          <p>We performed a two-sample z-test of proportions to determine whether the accuracy of predicting novel combinations by the baseline models was significantly less than that of PhyloTransformer on the large dataset. The prediction accuracy of PhyloTransformer among the 10 checkpoints was higher than that generated by all of the baseline models, with statistical significance. Local Transformer was no longer the best baseline model: ResNet-18 and random forest outperformed Local Transformer for the task of predicting novel combinations.</p>
        </sec>
        <sec id="sec-1-1-4">
          <title>Predictions in the Spike Protein RBM</title>
          <p>SARS-CoV-2 infects human cells through binding of the viral surface spike protein to its receptor on human cells, the ACE2 protein. Because of its role in viral entry, the RBD is a dominant determinant of zoonotic cross-species transmission. Although SARS-CoV-2 does not cluster within SARS and SARS-related coronaviruses, the RBDs of SARS-CoV and SARS-CoV-2 share structural similarities, probably due to their shared zoonotic ancestry. This similarity implies convergent evolution for improved binding to ACE2 between the SARS-CoV and SARS-CoV-2 RBDs. Therefore, we focused our predictions on the spike protein RBD. The total length of the SARS-CoV-2 spike protein is 1,273 amino acids, and its structural features are listed below:</p>
          <p>• A signal peptide is located at the N-terminus (residues 1–13).
• The S1 subunit (residues 14–685) is responsible for receptor binding. The S1 subunit contains an N-terminal domain (residues 14–305), a C-terminal domain 0 (residues 306–330), an RBD (residues 331–527), a C-terminal domain 1 (residues 528–590), and a C-terminal domain 2 (residues 591–685).
• The S2 subunit (residues 686–1273) is responsible for receptor binding and membrane fusion. The S2 subunit contains cleavage sites (residues 686–815) at S1/S2 and S2′, a fusion peptide (residues 816–855), a fusion peptide region (residues 856–911), a heptapeptide repeat sequence 1 (residues 912–984), a center helix (residues 985–1034), a connector domain (residues 1035–1080), a connector domain 1 (residues 1081–1147), a heptapeptide repeat sequence 2 (residues 1163–1213), a transmembrane domain (residues 1213–1237), and a cytoplasmic domain (residues 1237–1273) [21].</p>
          <p>The spike protein RBM comprises amino acids 438 to 506. Yi et al. [22] compared the SARS-CoV-2 and SARS-CoV RBD affinity for hACE2 by creating single amino acid substitution mutations in the SARS-CoV and SARS-CoV-2 RBM sequences. The authors found that receptor binding was enhanced by introducing amino acid changes at P499, Q493, F486, A475, and L455, which are all localized to the RBM. PhyloTransformer trained with the large dataset predicted only two mutations. The first mutation was predicted at amino acid 488, changing it from C to R, which is closely adjacent to F486. The second mutation was predicted at amino acid 497, changing it from F to S, once again right next to P499. The close proximity of the introduced mutations and the predicted mutations indicates that PhyloTransformer is potentially capable of capturing meaningful genetic phenomena and can generate effective predictions. Our prediction results are reported in Table 2.</p>
        </sec>
        <sec id="sec-1-1-5">
          <title>Prediction of Glycosylation Site Modifications</title>
          <p>The SARS-CoV-2 spike protein is heavily glycosylated. Viral glycosylation plays a vital role in viral pathobiology, including antibody resistance, target recognition, viral entry, and host immune modulation [23]. Glycosylation sites facilitate immune evasion by shielding epitopes from antibody neutralization; therefore, they are under selective pressure. Since glycosylation site modifications of the SARS-CoV-2 spike protein will likely impact the overall activities of SARS-CoV-2 replication and escape from immune surveillance [24], we examined glycosylation site model predictions. We reported our results on the N-glycosylation sites to help identify mutations associated with altered glycosylation that are favored during viral evolution. PhyloTransformer predicted mutations of the following three glycosylation sites: N122, N331, and N343. Table 3 shows the predicted mutations in the spike protein changing N to a different amino acid. Figure 3 summarizes the predicted mutations, including existing mutations (left) and novel mutations (right), with predictions mutating away from amino acid N highlighted.</p>
        </sec>
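<p>The accuracy improvement metric used in Box 1B can be illustrated with a short sketch (the prediction counts are hypothetical; 0.26% is the large-dataset random-guessing accuracy quoted above):</p>

```python
def accuracy_improvement(n_correct, n_sequences, random_guess_acc):
    """Improvement = model accuracy divided by random-guessing accuracy,
    i.e., correct predictions over the expected number of correct random guesses."""
    return (n_correct / n_sequences) / random_guess_acc

# Hypothetical counts: 130 of 1,000 test sequences predicted correctly,
# against a random-guessing accuracy of 0.26% (large dataset).
improvement = accuracy_improvement(130, 1000, 0.0026)
print(round(improvement, 2))  # 50.0
```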
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Methodology</title>
      <p>Technical Background
In this section, we will briefly review the history of
sequence models that led to the development of
Transformer and then introduce our PhyloTransformer model.</p>
      <p>The recurrent neural network (RNN) is the standard
neural sequence model which extends the conventional
feedforward neural network with a recurrent hidden state
dependent on the previous timestep. RNN and its variants,
such as the long short-term memory (LSTM) [25] and the
gated recurrent unit (GRU) [26], have been widely
applied to important AI tasks, including language modeling
[27], speech recognition [28], handwriting recognition
[29], and machine translation [30]. However, RNNs are
difficult to train in practice since the gradients tend to
either vanish or explode as the sequence length increases
[31]. In addition, these models encode a source sequence
into a fixed-length vector, which becomes a bottleneck
when tackling particularly long sequences. Therefore,
the attention mechanism was introduced [32] to augment
RNNs with an additional variable-length representation
when encoding the input sequence. The attention
mechanism allows the model to only focus on a subset of the
input sequence for decoding. The Transformer model
comprises a purely attention-based network architecture
without RNN backbones to directly capture intra-position
dependencies via the self-attention mechanism [33]. In
self-attention, each sequence item has direct access to all
the other positions, which yields a more powerful global
representation of the sequence. This feature also inspires
biological applications due to the long-range interactions
of genetic sequences. However, the following challenges
in modeling mutations on RNA sequences remain:
• Length adaptation: most natural language
processing (NLP) models deal with sequence lengths
of a few hundred to a thousand, but the RNA
sequence of SARS-CoV-2 is much longer: the
genome of SARS-CoV-2 is 29,903 nucleotides in
length [34], and the spike protein has 3,819
nucleotides.
• Mutation sparsity: due to the proofreading
functions of coronaviruses [8], mutations in the
SARS-CoV-2 genome are rare. Our dataset shows
consistency in this regard.</p>
<p>The regular Transformer scales quadratically with respect
to the input sequence length, and the sparsity of
mutations might lead to the generative Transformer model
overfitting the identical parts while ignoring the
mutations. Therefore, to adapt to biological problems and
address issues regarding genetic mutations, a new model
that tackles the length and sparsity issues commonly
encountered in existing deep neural network architectures
is required. To address these two challenges, we propose
PhyloTransformer, which is a linear time complexity
discriminative model based on the Transformer architecture.</p>
      <p>The time and space linearity are achieved by adopting
FAVOR+ from Performer [35], which performs an
unbiased fast attention approximation with low variance. The
mutation sparsity issue is addressed by directly modeling
the mutations using the MLM training objective from
BERT [36], which is a discriminative variant of
Transformer for supervised NLP tasks. A detailed description
of PhyloTransformer architecture is presented in the next
section.</p>
<p>Model Development
We adopted a discriminative approach to model the mutation probability at a particular position in the RNA sequence. Let P(x_i = n | r) denote the probability of the i-th nucleotide changing to n given the reference sequence r. We will demonstrate how to predict P(x_i | r) with PhyloTransformer and the baseline models in this section.</p>
<p>The PhyloTransformer Model
The PhyloTransformer model adopts a Transformer-based network, which utilizes the full spike sequence of 3,819 nucleotides as input and generates output mutation probabilities at particular positions. We followed the MLM pre-training objective from BERT [36]. Note that the attention mechanism in Transformer [33] calculates attention matrices with a shape of L × L (where L is the length of the sequence) to capture the relationships between nucleotides. In order to reduce the computational complexity of the attention matrix, we adopted the FAVOR+ technique from Performer [35], which performs approximate attention computation in linear time. In the following content, we first present the network architecture of PhyloTransformer. Next, we introduce FAVOR+ for fast low-rank approximation of the regular full-rank attention computation in linear time. Finally, the overall training process is discussed in detail.</p>
<p>Bidirectional Transformer Encoder:
Let r = (r_1, r_2, ..., r_L) denote the reference sequence, where r_i is the nucleotide at position i in the RNA sequence. We first applied trainable projections to map each r_i, together with its position information, to three embedding vectors, q_i, k_i, and v_i, for attention computation. Suppose the dimension of each embedding is d. The output of the attention layer is computed by the following equation:</p>
      <p>
Attention(Q, K, V) = A · V = softmax(Q K^T / √d) V,
        (
        <xref ref-type="bibr" rid="ref1">1</xref>
        )
where A ∈ ℝ^(L×L) is the attention matrix, and Q = [q_1; q_2; ...; q_L], K = [k_1; k_2; ...; k_L], and V = [v_1; v_2; ...; v_L] are embedding matrices in ℝ^(L×d) whose rows q_i, k_i, and v_i are the three embeddings. After the attention layer is computed, we further applied a feed-forward layer with a residual connection. An attention layer and a feed-forward layer compose a single Transformer module. We stacked the Transformer modules to form the overall network architecture of our PhyloTransformer model.</p>
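<p>As a concrete illustration, Equation (1) can be sketched in NumPy (a toy example with random embeddings and illustrative dimensions, not the paper's implementation):</p>

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    """Full attention: A = softmax(Q K^T / sqrt(d)); output = A V."""
    d = Q.shape[-1]
    A = softmax(Q @ K.T / np.sqrt(d))  # attention matrix, shape (L, L)
    return A @ V

rng = np.random.default_rng(0)
L, d = 8, 4  # toy sequence length and embedding dimension
Q, K, V = (rng.standard_normal((L, d)) for _ in range(3))
out = attention(Q, K, V)
print(out.shape)  # (8, 4)
```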
<p>FAVOR+:
In the original attention mechanism, the time complexity of computing the attention layer by Equation (
        <xref ref-type="bibr" rid="ref1">1</xref>
        ) is O(L^2), which becomes computationally intractable when L is large. The Performer [35] model proposed kernelizable attention by deriving a mapping φ to decouple the attention matrix A into Q′ and K′, where Q′ = φ(Q), K′ = φ(K), and Q′, K′ ∈ ℝ^(L×r) with r ≪ L. In this case, the attention layer can be computed by the following equations:</p>
      <p>
Attention(Q, K, V) = D^(−1) (Q′ ((K′)^T V)),
        (
        <xref ref-type="bibr" rid="ref2">2</xref>
        )
D = diag(Q′ ((K′)^T 1_L)),
        (
        <xref ref-type="bibr" rid="ref3">3</xref>
        )
where 1_L is an all-ones vector of length L. Since Q′, K′ ∈ ℝ^(L×r) and V ∈ ℝ^(L×d), the computational complexity decreases to O(L) with respect to the small constant r, making it computationally feasible to handle particularly long sequences such as RNA data.</p>
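<p>The kernelized computation of Equations (2) and (3) can be sketched as follows; for brevity, this sketch substitutes a simple deterministic positive feature map (elu(x) + 1) for Performer's random-feature mapping φ, so it illustrates the linear-time computation rather than FAVOR+ itself:</p>

```python
import numpy as np

def feature_map(X):
    """Positive feature map standing in for Performer's random-feature phi."""
    return np.where(X > 0, X + 1.0, np.exp(X))  # elu(x) + 1, always positive

def linear_attention(Q, K, V):
    """Attention via Equations (2)-(3): D^-1 (Q' ((K')^T V)),
    with D = diag(Q' ((K')^T 1_L)); cost is linear in the sequence length L."""
    Qp, Kp = feature_map(Q), feature_map(K)
    KV = Kp.T @ V                     # (K')^T V, computed without any L x L matrix
    normalizer = Qp @ Kp.sum(axis=0)  # Q' ((K')^T 1_L), one entry per position
    return (Qp @ KV) / normalizer[:, None]

rng = np.random.default_rng(0)
L, d = 8, 4  # toy dimensions
Q, K, V = (rng.standard_normal((L, d)) for _ in range(3))
out = linear_attention(Q, K, V)
print(out.shape)  # (8, 4)
```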
      <p>Training process:</p>
<p>We denoted the reference sequence as r = (r_1, r_2, ..., r_L) and the mutated sequence as x = (x_1, x_2, ..., x_L), where r_i and x_i refer to the nucleotide at position i. On average, there were 0.0592% mutations in the small dataset, 0.0801% mutations in the medium dataset, and 0.2291% mutations in the large dataset. These numbers refer to the average fraction of positions at which x_i differs from r_i in the respective dataset.</p>
      <p>During the training process, we masked certain positions in x and used the model to predict the nucleotides of x at those masked positions. Fig. 4 shows the workflow of our model: the spike RNA sequence is masked at mutated positions and some random positions, encoded with six stacked Transformer layers, and decoded into predictions for the mutated sequence given the reference sequence. Specifically, we first identified the set of mutated positions M = (m_1, ..., m_k), where m_1, ..., m_k are the positions at which x differs from r, and selected several unchanged positions N = (n_1′, ..., n_j′) such that |M ∪ N| covers 1.5% of the sequence. Next, we applied a masking function f_m(x_i) to each nucleotide x_i at the masked positions. Namely, ∀ i ∈ M ∪ N, the masking function f_m changes x_i as determined by the equation:</p>
      <p>
f_m(x_i) = &lt;mask&gt; in 80% of cases; Random({A, T, C, G}) in 10% of cases; x_i in 10% of cases,
        (
        <xref ref-type="bibr" rid="ref4">4</xref>
        )
where &lt;mask&gt; is a special masking token. The masking function f_m acts on 1.5% of the entire set of nucleotides and randomly maps each nucleotide from this masking subset to (1) the special token &lt;mask&gt; (80% chance), (2) a random substitution (10%), or (3) itself (10%).</p>
      <p>Denoting the masked sequence as x̃, we encode x̃ with the stacked Transformer modules and represent each nucleotide as a hidden vector h_i from the model output. Next, the probability distribution of the i-th nucleotide position over {A, T, C, G} is computed as follows:</p>
      <p>
p(x_i | x̃) = softmax(W h_i), ∀ i ∈ M ∪ N,
        (
        <xref ref-type="bibr" rid="ref5">5</xref>
        )
where W are trainable parameters. The probability of all the masked nucleotides is given by the following equation:</p>
      <p>
p(x | x̃) = ∏_{i ∈ M ∪ N} p(x_i | x̃).
        (
        <xref ref-type="bibr" rid="ref6">6</xref>
        )
      </p>
      <p>The model is optimized to minimize the negative log probability over all the mutated sequences from the training set D with respect to different masking positions, as determined by the equation:</p>
      <p>
L(θ) = − Σ_{x ∈ D} E [log p(x | x̃)] = − Σ_{x ∈ D} E [ Σ_{i ∈ M ∪ N} log p(x_i | x̃) ].
        (
        <xref ref-type="bibr" rid="ref7">7</xref>
        )
      </p>
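<p>The masking rule of Equation (4) can be sketched as follows (the token string "[MASK]", the function names, and the random seed are illustrative assumptions, not the paper's code):</p>

```python
import random

NUCLEOTIDES = ["A", "T", "C", "G"]
MASK = "[MASK]"  # stands in for the special masking token

def mask_sequence(x, mutated_positions, mask_fraction=0.015, seed=0):
    """Cover all mutated positions plus random unchanged positions until
    1.5% of the sequence is selected, then apply the 80%/10%/10% rule."""
    rng = random.Random(seed)
    target = max(len(mutated_positions), int(round(mask_fraction * len(x))))
    positions = set(mutated_positions)
    while len(positions) != target:
        positions.add(rng.randrange(len(x)))
    x_tilde = list(x)
    for i in positions:
        roll = rng.random()
        if roll >= 0.9:            # 10% of cases: keep the nucleotide itself
            pass
        elif roll >= 0.8:          # 10% of cases: a random substitution
            x_tilde[i] = rng.choice(NUCLEOTIDES)
        else:                      # 80% of cases: the special mask token
            x_tilde[i] = MASK
    return x_tilde

base = random.Random(1)
seq = [base.choice(NUCLEOTIDES) for _ in range(1000)]
masked = mask_sequence(seq, mutated_positions=[10, 250, 777])
print(len(masked), masked.count(MASK))
```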
<p>Since most of the masked positions are mutated positions, our model is trained to concentrate on mutation prediction. Meanwhile, the randomly chosen positions (i.e., those in N) also improved the robustness of our model.</p>
<p>Local models
In addition to PhyloTransformer, which considers the full sequence, we also examined baseline methods, which predict P(x_i | r) based on local segments from the spike RNA sequence. There are a total of 3,819 nucleotides in the spike sequence. We can obtain a local segment of 15 nucleotides centered around each nucleotide with sequence padding. Thus, we can obtain 3,819 segments of 15 nucleotides from the full spike RNA sequence. The center position of each segment is masked. We adopted various classification methods (including neural models and non-neural methods) to predict the center nucleotide based on the other nearby nucleotides. During the training phase, we split all training spike RNA sequences into segments and generated a local dataset with repeated segments filtered out. The training process is shown in Appendix A, where any classification method could be used, such as the standard Transformer, ResNet-18, MLP, logistic regression, KNN, random forest, and gradient boosting.</p>
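<p>The local-segment construction described above can be sketched as follows (the padding character "N" and the function name are illustrative assumptions):</p>

```python
def local_segments(seq, window=15, pad="N"):
    """One window-sized segment per position, centered on that position and
    padded past the sequence ends; the center nucleotide is the target that
    the local baseline models are trained to predict from its neighbors."""
    half = window // 2
    padded = pad * half + seq + pad * half
    segments = []
    for i in range(len(seq)):
        w = padded[i:i + window]
        center = w[half]                    # the masked prediction target
        context = w[:half] + w[half + 1:]   # the 14 surrounding nucleotides
        segments.append((context, center))
    return segments

seq = "ATGGTCATTGCCGC"  # toy stand-in for the 3,819-nucleotide spike sequence
segs = local_segments(seq)
print(len(segs), segs[0])  # one (context, center) pair per nucleotide
```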
    </sec>
    <sec id="sec-3">
      <title>Conclusion</title>
<p>The overall goal of our research is to train a state-of-the-art sequence model using existing viral genetic sequence data to identify SARS-CoV-2 variants that may have evolutionary advantages and become emerging VOCs.</p>
      <p>In this paper, we developed the PhyloTransformer model, a novel deep neural network with a multi-head self-attention mechanism. PhyloTransformer was subjected to an advanced training methodology to predict potential mutations that may lead to enhanced virus transmissibility or resistance to antisera. Our computational platform may be helpful in guiding the design of therapeutics and vaccines for effective targeting of emerging SARS-CoV-2 VOCs, as well as novel mutants of other viruses that may cause pandemics.</p>
      <p>Ethics Statement: This research was based on the SARS-CoV-2 sequences in the Global Initiative for Sharing All Influenza Data (GISAID) database (https://www.gisaid.org/). No human subject information is involved in the data.</p>
      <p>[13] T. N. Starr, A. J. Greaney, S. K. Hilton, D. Ellis, K. H. Crawford, A. S. Dingens, M. J. Navarro, J. E. Bowen, M. A. Tortorici, A. C. Walls, et al., Deep mutational scanning of SARS-CoV-2 receptor binding domain reveals constraints on folding and ACE2 binding, Cell 182 (2020) 1295–1310.</p>
      <p>[14] E. C. Thomson, L. E. Rosen, J. G. Shepherd, R. Spreafico, A. da Silva Filipe, J. A. Wojcechowskyj, C. Davis, L. Piccoli, D. J. Pascall, J. Dillen, et al., Circulating SARS-CoV-2 spike N439K variants maintain fitness while evading antibody-mediated immunity, Cell 184 (2021) 1171–1187.</p>
      <p>[15] R. Lassaunière, J. Fonager, M. Rasmussen, A. Frische, C. Polacek Strandh, T. B. Rasmussen, A. Bøtner, A. Fomsgaard, Working paper on SARS-CoV-2 spike mutations arising in Danish mink, their spread to humans and neutralization data, 2020. URL: https://files.ssi.dk/Mink-cluster-5-short-report_AFO2.</p>
      <p>[16] G. Nelson, O. Buzko, P. R. Spilman, K. Niazi, S. Rabizadeh, P. R. Soon-Shiong, Molecular dynamic simulation reveals E484K mutation enhances spike RBD-ACE2 affinity and the combination of E484K, K417N and N501Y mutations (501Y.V2 variant) induces conformational change greater than N501Y mutant alone, potentially resulting in an escape mutant, bioRxiv (2021).</p>
      <p>[17] F. P. Polack, S. J. Thomas, N. Kitchin, J. Absalon, A. Gurtman, S. Lockhart, J. L. Perez, G. P. Marc, E. D. Moreira, C. Zerbini, et al., Safety and efficacy of the BNT162b2 mRNA Covid-19 vaccine, New England Journal of Medicine (2020).</p>
      <p>[18] L. J. Abu-Raddad, H. Chemaitelly, A. A. Butt, Effectiveness of the BNT162b2 Covid-19 vaccine against the B.1.1.7 and B.1.351 variants, New England Journal of Medicine (2021).</p>
      <p>[19] P. Okada, R. Buathong, S. Phuygun, T. Thanadachakul, S. Parnmen, W. Wongboot, S. Waicharoen, S. Wacharapluesadee, S. Uttayamakul, A. Vachiraphan, et al., Early transmission patterns of coronavirus disease 2019 (COVID-19) in travellers from Wuhan to Thailand, January 2020, Eurosurveillance 25 (2020) 2000097.</p>
      <p>[20] P. Zhou, X.-L. Yang, X.-G. Wang, B. Hu, L. Zhang, W. Zhang, H.-R. Si, Y. Zhu, B. Li, C.-L. Huang, et al., A pneumonia outbreak associated with a new coronavirus of probable bat origin, Nature 579 (2020) 270–273.</p>
      <p>[21] Y. Huang, C. Yang, X.-f. Xu, W. Xu, S.-w. Liu, Structural and functional properties of SARS-CoV-2 spike protein: potential antivirus drug development for COVID-19, Acta Pharmacologica Sinica 41 (2020) 1141–1149.</p>
      <p>[22] C. Yi, X. Sun, J. Ye, L. Ding, M. Liu, Z. Yang, X. Lu, Y. Zhang, L. Ma, W. Gu, et al., Key residues of the receptor binding motif in the spike protein of SARS-CoV-2 that interact with ACE2 and neutralizing antibodies, Cellular &amp; Molecular Immunology 17 (2020) 621–630.</p>
      <p>[23] K. J. Doores, The HIV glycan shield as a target for broadly neutralizing antibodies, The FEBS Journal 282 (2015) 4679–4691.</p>
      <p>[24] D. Hoffmann, S. Mereiter, Y. J. Oh, V. Monteil, R. Zhu, D. Canena, L. Hain, E. Laurent, C. Gruber, M. Novatchkova, et al., Identification of lectin receptors for conserved SARS-CoV-2 glycosylation sites, bioRxiv (2021).</p>
      <p>[25] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation 9 (1997) 1735–1780.</p>
      <p>[26] K. Cho, B. Van Merriënboer, D. Bahdanau, Y. Bengio, On the properties of neural machine translation: Encoder-decoder approaches, arXiv preprint arXiv:1409.1259 (2014).</p>
      <p>[27] T. Mikolov, M. Karafiát, L. Burget, J. Černockỳ, S. Khudanpur, Recurrent neural network based language model, in: Eleventh Annual Conference of the International Speech Communication Association, 2010.</p>
      <p>[28] A. Graves, A.-r. Mohamed, G. Hinton, Speech recognition with deep recurrent neural networks, in: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, 2013, pp. 6645–6649.</p>
      <p>[29] A. Graves, M. Liwicki, S. Fernández, R. Bertolami, H. Bunke, J. Schmidhuber, A novel connectionist system for unconstrained handwriting recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 31 (2008) 855–868.</p>
      <p>[30] N. Kalchbrenner, P. Blunsom, Recurrent continuous translation models, in: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013, pp. 1700–1709.</p>
      <p>[31] Y. Bengio, P. Simard, P. Frasconi, Learning long-term dependencies with gradient descent is difficult, IEEE Transactions on Neural Networks 5 (1994) 157–166.</p>
      <p>[32] D. Bahdanau, K. Cho, Y. Bengio, Neural machine translation by jointly learning to align and translate, arXiv preprint arXiv:1409.0473 (2014).</p>
      <p>[33] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is all you need, in: Advances in Neural Information Processing Systems, 2017, pp. 5998–6008.</p>
      <p>[34] D. Kim, J.-Y. Lee, J.-S. Yang, J. W. Kim, V. N. Kim, H. Chang, The architecture of SARS-CoV-2 transcriptome, Cell 181 (2020) 914–921.</p>
      <p>[35] K. M. Choromanski, V. Likhosherstov, D. Dohan, X. Song, A. Gane, T. Sarlos, P. Hawkins, J. Q. Davis, A. Mohiuddin, L. Kaiser, D. B. Belanger, L. J. Colwell, A. Weller, Rethinking attention with performers, in: International Conference on Learning Representations, 2021. URL: https://openreview.net/forum?id=Ua6zuk0WRH.</p>
      <p>[36] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Association for Computational Linguistics, Minneapolis, Minnesota, 2019, pp. 4171–4186. URL: https://aclanthology.org/N19-1423. doi:10.18653/v1/N19-1423.</p>
    </sec>
    <sec id="sec-4">
      <title>A. Dataset Details</title>
      <p>We list the details of our three datasets in Table 4.</p>
    </sec>
    <sec id="sec-5">
      <title>B. Local Models</title>
    </sec>
    <sec id="sec-6">
      <title>C. Training details</title>
      <p>[Table 4 fragment, flattened by extraction: all three datasets start on 01/01/2020; the remaining columns give per-dataset percentages of VOC mutations and other mutations, whose row labels were not recovered.]</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.-L.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <article-title>Origin and evolution of pathogenic coronaviruses</article-title>
          ,
          <source>Nature Reviews Microbiology</source>
          <volume>17</volume>
          (
          <year>2019</year>
          )
          <fpage>181</fpage>
          -
          <lpage>192</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>E.</given-names>
            <surname>De Wit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. Van</given-names>
            <surname>Doremalen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Falzarano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. J.</given-names>
            <surname>Munster</surname>
          </string-name>
          ,
          <article-title>SARS and MERS: recent insights into emerging coronaviruses</article-title>
          ,
          <source>Nature Reviews Microbiology</source>
          <volume>14</volume>
          (
          <year>2016</year>
          )
          <fpage>523</fpage>
          -
          <lpage>534</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Patel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Gutman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Abboud</surname>
          </string-name>
          ,
          <article-title>Orthopaedic considerations following COVID-19: lessons from the 2003 SARS outbreak</article-title>
          ,
          <source>JBJS Reviews</source>
          <volume>8</volume>
          (
          <year>2020</year>
          )
          <elocation-id>e20</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Hui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. I.</given-names>
            <surname>Azhar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. A.</given-names>
            <surname>Madani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ntoumi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Dar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ippolito</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. D.</given-names>
            <surname>Mchugh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z. A.</given-names>
            <surname>Memish</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Drosten</surname>
          </string-name>
          , et al.,
          <article-title>The continuing 2019-nCoV epidemic threat of novel coronaviruses to global health: the latest 2019 novel coronavirus outbreak in Wuhan, China</article-title>
          ,
          <source>International Journal of Infectious Diseases</source>
          <volume>91</volume>
          (
          <year>2020</year>
          )
          <fpage>264</fpage>
          -
          <lpage>266</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R. L.</given-names>
            <surname>Graham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Baric</surname>
          </string-name>
          ,
          <article-title>Recombination, reservoirs, and the modular spike: mechanisms of coronavirus cross-species transmission</article-title>
          ,
          <source>Journal of Virology</source>
          <volume>84</volume>
          (
          <year>2010</year>
          )
          <fpage>3134</fpage>
          -
          <lpage>3146</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Baud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Nielsen-Saines</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Musso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Pomar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Favre</surname>
          </string-name>
          ,
          <article-title>Real estimates of mortality following COVID-19 infection</article-title>
          ,
          <source>The Lancet Infectious Diseases</source>
          <volume>20</volume>
          (
          <year>2020</year>
          )
          <fpage>773</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Worobey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pekar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. B.</given-names>
            <surname>Larsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. I.</given-names>
            <surname>Nelson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Hill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. B.</given-names>
            <surname>Joy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rambaut</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Suchard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. O.</given-names>
            <surname>Wertheim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lemey</surname>
          </string-name>
          ,
          <article-title>The emergence of SARS-CoV-2 in Europe and North America</article-title>
          ,
          <source>Science</source>
          <volume>370</volume>
          (
          <year>2020</year>
          )
          <fpage>564</fpage>
          -
          <lpage>570</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>E. C.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Blanc</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vignuzzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Denison</surname>
          </string-name>
          ,
          <article-title>Coronaviruses lacking exoribonuclease activity are susceptible to lethal mutagenesis: evidence for proofreading and potential therapeutics</article-title>
          ,
          <source>PLoS Pathogens</source>
          <volume>9</volume>
          (
          <year>2013</year>
          )
          <elocation-id>e1003565</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>O. A.</given-names>
            <surname>MacLean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Orton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. B.</given-names>
            <surname>Singer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. L.</given-names>
            <surname>Robertson</surname>
          </string-name>
          ,
          <article-title>No evidence for distinct types in the evolution of SARS-CoV-2</article-title>
          ,
          <source>Virus Evolution</source>
          <volume>6</volume>
          (
          <year>2020</year>
          )
          <elocation-id>veaa034</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>L.</given-names>
            <surname>Yurkovetskiy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. E.</given-names>
            <surname>Pascal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tomkins-Tinch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. P.</given-names>
            <surname>Nyalile</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Baum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. E.</given-names>
            <surname>Diehl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Dauphin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Carbone</surname>
          </string-name>
          , et al.,
          <article-title>Structural and functional analysis of the D614G SARS-CoV-2 spike protein variant</article-title>
          ,
          <source>Cell</source>
          <volume>183</volume>
          (
          <year>2020</year>
          )
          <fpage>739</fpage>
          -
          <lpage>751</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Y. J.</given-names>
            <surname>Hou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chiba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Halfmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ehre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kuroda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. H.</given-names>
            <surname>Dinnon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Leist</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Schäfer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Nakajima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Takahashi</surname>
          </string-name>
          , et al.,
          <article-title>SARS-CoV-2 D614G variant exhibits efficient replication ex vivo and transmission in vivo</article-title>
          ,
          <source>Science</source>
          <volume>370</volume>
          (
          <year>2020</year>
          )
          <fpage>1464</fpage>
          -
          <lpage>1468</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>E.</given-names>
            <surname>Volz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Hill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. T.</given-names>
            <surname>McCrone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Price</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jorgensen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Á.</given-names>
            <surname>O'Toole</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Southgate</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Johnson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Jackson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. F.</given-names>
            <surname>Nascimento</surname>
          </string-name>
          , et al.,
          <article-title>Evaluating the effects of SARS-CoV-2 spike mutation D614G on transmissibility and pathogenicity</article-title>
          ,
          <source>Cell</source>
          <volume>184</volume>
          (
          <year>2021</year>
          )
          <fpage>64</fpage>
          -
          <lpage>75</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>