<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ELiRF-UPV at TASS 2019: Transformer Encoders for Twitter Sentiment Analysis in Spanish</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jose-Angel Gonzalez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lluís-Felip Hurtado</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ferran Pla</string-name>
          <email>fplag@dsic.upv.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>VRAIN: Valencian Research Institute for Artificial Intelligence Universitat Politecnica de Valencia</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <fpage>571</fpage>
      <lpage>578</lpage>
      <abstract>
        <p>This paper describes the participation of the ELiRF research group of the Universitat Politecnica de Valencia in the TASS 2019 Workshop, framed within the XXXV edition of the International Congress of the Spanish Society for the Processing of Natural Language. We present the approach used for the Monolingual InterTASS task of the workshop, as well as the results obtained and a discussion of them. Our participation has focused mainly on employing the encoders of the Transformer model, based on self-attention mechanisms, achieving competitive results in the task addressed.</p>
      </abstract>
      <kwd-group>
        <kwd>Twitter</kwd>
        <kwd>Sentiment Analysis</kwd>
        <kwd>Transformer Encoders</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The Sentiment Analysis workshop at SEPLN (TASS) has been proposing a set of
tasks related to Twitter sentiment analysis in order to evaluate different
approaches presented by the participants. In addition, it develops free resources,
such as corpora annotated with polarity, topic, political tendency, or aspects,
which are very useful for comparing different approaches to the proposed
tasks.</p>
      <p>
        In this eighth edition of TASS [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], several tasks are proposed for global
sentiment analysis on different Spanish variants. The organizers propose two
different tasks: 1) monolingual sentiment analysis and 2) crosslingual sentiment
analysis. In the first task, only a specific language can be used to train
and to evaluate the system; in contrast, in the second task, any combination of
the corpora can be used to train the systems. For both tasks, the organizers
provide five different corpora of tweets written in Spanish variants from Spain,
Costa Rica, Peru, Uruguay, and Mexico.
      </p>
      <p>
        This article summarizes the participation of the ELiRF-UPV team of the
Universitat Politecnica de Valencia only in the first task. Our approach uses
two state-of-the-art approaches that have provided competitive results in English
sentiment analysis and machine translation [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>The rest of the article is structured as follows. Section 2 presents a description
of the addressed task. In Section 3, we describe the architecture of the proposed
system. Section 4 summarizes the conducted experimental evaluation and the
achieved results. Finally, some conclusions and possible future work are presented
in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>Task description</title>
      <p>The organization has defined two subtasks: Task 1, monolingual SA, and Task
2, crosslingual SA. These tasks consist of assigning a global polarity to tweets
(N, NEU, NONE, or P). In Task 1, only one Spanish variant can be used,
both for training and testing the system. In contrast, in Task 2, any
combination of Spanish variants can be considered, with the only restriction that those
considered in the training set cannot be used in the test set.</p>
      <p>For both subtasks, five different corpora were considered for several Spanish
variants. First, the InterTASS-ES corpus (Spain) is composed of a training
partition of 1125 samples, a validation set of 581 samples, and a test set of 1706
samples. InterTASS-CR (Costa Rica) is composed of 777 training samples, 390 for
validation, and 1166 for testing. InterTASS-PE (Peru) is formed by 966 training
samples, 498 validation samples, and 1464 test samples. InterTASS-UY (Uruguay)
contains 943 training samples, 486 validation samples, and 1428 test samples. Finally,
InterTASS-MX (Mexico) has 989 training samples, 510 validation samples, and 1500 test samples.</p>
      <p>The tweet distribution according to their polarity in the InterTASS corpus
training sets is shown in Table 1.</p>
      <p>As can be seen in Table 1, the training corpora are unbalanced and
biased toward the N and P classes, except in the InterTASS-PE corpus, where
the most frequent class is NONE. Moreover, the NEU class is always the least
populated, except in the case of Uruguay.</p>
    </sec>
    <sec id="sec-3">
      <title>System</title>
      <p>In this section, we discuss the system architecture proposed to address the first
task of TASS 2019, as well as the resources used and the
preprocessing applied to the tweets.</p>
      <sec id="sec-3-1">
        <title>Resources and preprocessing</title>
        <p>
          In order to learn a word embedding model from Spanish tweets, we downloaded
87 million tweets of several Spanish variants. To provide the embedding layer of
our system with a rich semantic representation of the Twitter domain, we use
300-dimensional word embeddings extracted from a skip-gram model [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] trained
on the 87 million tweets using the Word2Vec framework [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
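        <p>To make the skip-gram objective concrete, the (center, context) training pairs it is built on can be extracted as in the following sketch. This is illustrative only: the actual embeddings were trained with the Word2Vec framework, and the function names, whitespace tokenization, and window size here are assumptions.</p>
        <preformat>
```python
from collections import Counter

def build_vocab(tweets, min_count=1):
    """Map each token to an integer id, most frequent tokens first."""
    counts = Counter(tok for tweet in tweets for tok in tweet.split())
    kept = [tok for tok, c in counts.most_common() if c >= min_count]
    return {tok: i for i, tok in enumerate(kept)}

def skipgram_pairs(tweet, vocab, window=2):
    """(center, context) training pairs used by the skip-gram objective."""
    ids = [vocab[t] for t in tweet.split() if t in vocab]
    pairs = []
    for i, center in enumerate(ids):
        # every neighbor within the window, excluding the center itself
        for j in range(max(0, i - window), min(len(ids), i + window + 1)):
            if j != i:
                pairs.append((center, ids[j]))
    return pairs
```
        </preformat>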
      </sec>
      <sec id="sec-3-2">
        <title>Transformer Encoders</title>
        <p>
          Our system is based on the Transformer [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] model. Initially proposed for
machine translation, the Transformer model dispenses with convolutions and
recurrences to learn long-range relationships. Instead of these kinds of mechanisms, it
relies on multi-head self-attention, where multiple attentions among the terms of
a sequence are computed in parallel to take into account different relationships
among them.
        </p>
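        <p>The multi-head scaled dot-product attention mechanism can be sketched in a few lines of NumPy. This is a minimal sketch: the learned query, key, and value projection matrices of each head are omitted for brevity, which is a simplification of the full mechanism.</p>
        <preformat>
```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, per head."""
    d_k = Q.shape[-1]
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)       # (h, T, T)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)               # row-wise softmax
    return weights @ V                                      # (h, T, d_k)

def multi_head_self_attention(X, h=8):
    """Split the model dimension d into h heads that attend in parallel,
    then concatenate the head outputs back to dimension d."""
    T, d = X.shape
    heads = X.reshape(T, h, d // h).transpose(1, 0, 2)      # (h, T, d_k)
    out = scaled_dot_product_attention(heads, heads, heads)
    return out.transpose(1, 0, 2).reshape(T, d)
```
        </preformat>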
        <p>Concretely, we use only the encoder part in order to extract vector
representations that are useful to perform sentiment analysis. We denote this encoding
part of the Transformer model as Transformer Encoder. Figure 1 shows a
representation of the proposed architecture for sentiment analysis.</p>
        <p>
          The input of the model is a tweet X = {x1, x2, ..., xT : xi ∈ {0, ..., V}}, where
T is the maximum length of the tweet and V is the vocabulary size. This tweet
is sent to a d-dimensional fixed embedding layer, E, initialized with the weights
of our embedding model. Moreover, to take into account positional information,
we also experimented with the sine and cosine functions proposed in [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
After the combination of the word embeddings with the positional information,
dropout [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] is used to drop input words with a certain probability p. On top
of these representations, Nx transformer encoders are applied, which rely on
multi-head scaled dot-product attention with h different heads. To do this, we
used an architecture similar to the one described in [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. It includes layer
normalization [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] and residual connections.
        </p>
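        <p>The sine and cosine positional encodings we experimented with can be computed as in the following self-contained sketch, whose default shapes match our setting of T = 50 and d = 300.</p>
        <preformat>
```python
import numpy as np

def positional_encoding(T, d):
    """PE(pos, 2i)   = sin(pos / 10000^(2i/d))
       PE(pos, 2i+1) = cos(pos / 10000^(2i/d))"""
    pos = np.arange(T)[:, None]                  # (T, 1)
    i = np.arange(d // 2)[None, :]               # (1, d/2)
    angles = pos / np.power(10000.0, 2 * i / d)
    pe = np.zeros((T, d))
    pe[:, 0::2] = np.sin(angles)                 # even dimensions
    pe[:, 1::2] = np.cos(angles)                 # odd dimensions
    return pe
```
        </preformat>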
        <p>Since a vector representation is required to train classifiers on top of these
encoders, a global average pooling mechanism is applied to the output of the
last encoder, and its result is used as input to a feed-forward neural network, with only
one hidden layer, whose output layer computes a probability distribution over
the four classes of the task, C = {P, N, NEU, NONE}.</p>
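        <p>The pooling and classification head can be sketched as follows. The ReLU activation of the hidden layer is an assumption, since the text does not fix the activation function.</p>
        <preformat>
```python
import numpy as np

def classifier_head(H, W_h, b_h, W_o, b_o):
    """Global average pooling over the T encoder outputs, followed by a
    one-hidden-layer feed-forward network with a softmax output over the
    four polarity classes {P, N, NEU, NONE}."""
    v = H.mean(axis=0)                        # (d,)  global average pooling
    hidden = np.maximum(0.0, v @ W_h + b_h)   # ReLU activation (assumed)
    logits = hidden @ W_o + b_o               # (4,)  one score per class
    probs = np.exp(logits - logits.max())     # numerically stable softmax
    return probs / probs.sum()
```
        </preformat>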
        <p>
          We use Adam as the update rule, with β1 = 0.9 and β2 = 0.999, and Noam [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] as the
learning rate schedule. The weighted cross entropy is used
as the loss function. Only the class distribution of the Spanish variant is considered
to weight the cross entropy, which is used for all language variants.
We fixed some hyper-parameters to carry out the experimentation, concretely:
batch size = 32, dk = 64, dff = d, and T = 50. Other hyper-parameters, such
as p or the warmup steps, were set following results obtained in preliminary
experiments: p = 0.7, warmup steps = 5 epochs, and h = 8.
        </p>
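        <p>The Noam schedule increases the learning rate linearly during warmup and afterwards decays it with the inverse square root of the step number. A sketch follows; since we specify the warmup in epochs, the step-based default value below is an assumption for illustration.</p>
        <preformat>
```python
def noam_lr(step, d_model=300, warmup=4000):
    """lr = d_model^-0.5 * min(step^-0.5, step * warmup^-1.5);
    the peak learning rate is reached exactly at step == warmup."""
    step = max(step, 1)  # guard against step 0
    return d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)
```
        </preformat>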
        <p>
          Moreover, we compare our proposal, which is based on transformer encoders
(TE), with other deep learning systems, such as Deep Averaging Networks
(DAN) [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] and Attention Long Short-Term Memory Networks [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] (Att-LSTM),
that are commonly used in related text classification tasks, obtaining very
competitive results. Concretely, these implementations are the systems proposed by
our team in the TASS 2018 edition, which achieved very competitive results [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
        <p>In order to study how some system mechanisms (positional encodings) or
hyper-parameters (Nx) affect the results obtained in terms of macro-F1 (MF1),
macro-recall (MR), macro-precision (MP), and Accuracy (Acc), we conducted
some additional experiments. Concretely, we removed the positional
information and used Nx ∈ {1, 2} encoders. All the configurations were applied
only to the Spanish subtask, and the best two configurations were also used in the
remaining subtasks. All these results are shown in Table 2.</p>
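        <p>The evaluation metrics can be computed as in the following sketch. We assume here the TASS convention of obtaining MF1 as the harmonic mean of MP and MR, rather than averaging per-class F1 scores; this convention is an assumption of the sketch.</p>
        <preformat>
```python
def macro_scores(y_true, y_pred, classes=("P", "N", "NEU", "NONE")):
    """Macro-precision, macro-recall, and macro-F1: per-class precision and
    recall averaged with equal weight, regardless of class frequency."""
    precisions, recalls = [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        pred_c = sum(p == c for p in y_pred)   # predicted as class c
        true_c = sum(t == c for t in y_true)   # gold labels of class c
        precisions.append(tp / pred_c if pred_c else 0.0)
        recalls.append(tp / true_c if true_c else 0.0)
    mp = sum(precisions) / len(classes)
    mr = sum(recalls) / len(classes)
    mf1 = 2 * mp * mr / (mp + mr) if (mp + mr) else 0.0
    return mp, mr, mf1
```
        </preformat>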
        <p>As can be seen in Table 2 for systems 1-TE-Pos and 2-TE-Pos on subtask
ES, the use of positional information decreases the system performance. This
seems to indicate that the positional information, represented by sine and cosine
functions added to the word embeddings, is useless to the classifier. However,
Att-LSTM, which takes into account the positional
information through its internal memory, obtains better results than 1-TE-Pos and
2-TE-Pos in almost all the metrics. These results show that the way the
positional information is considered affects the performance of the systems in this
task.</p>
        <p>The best results in terms of MR are achieved by the 1-TE-NoPos model.
Due to this fact, the 1-TE-NoPos model also outperforms the 2-TE-NoPos model
in terms of MF1, although the 2-TE-NoPos model achieves better results in
the MP measure. This behavior is observed in almost all the Spanish variants,
except on the MX subtask, where both models obtain similar results in terms of
MF1.</p>
        <p>Moreover, in the ES variant, several configurations of the TE model
outperform the systems proposed by our team in previous editions of TASS (DAN and
Att-LSTM) by a margin of 5 points of MF1, mainly due to the improvements
in terms of MR (6 points) and MP (3 points).</p>
        <p>
          Table 3 shows the results at the class level for each variant, obtained with our best
model (1-TE-NoPos). It is interesting to observe the improvements
achieved by our system for the NONE class compared to our results in previous
editions for this class. Generally, the results for the N class are better than those
obtained for the other classes, except in the PE variant. In this case, the NONE
class is the easiest to detect, since it is the most frequent class in the corpus.
The results for the P class are generally better than those for the NEU and
NONE classes, except in the PE variant. As observed in all the previous editions
of TASS [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], the NEU class obtains the worst results.
        </p>
        <p>The confusion matrix of our best system (1-TE-NoPos) for the ES variant
is shown in Table 4. It can be seen that the worst classified class (NEU)
is usually confused with the N and P classes. This seems to indicate that our
model detects the presence of sentiment (positive or negative), but is unable to
detect when both polarities neutralize each other.</p>
        <p>Finally, the 1-TE-NoPos system was used to label the test set of each
variant. The results obtained by this model (MF1, MP, and MR) and the
ranking of our system in the competition are shown in Table 5. As can be
seen, our system ranked first in the ES subtask and second in all the
remaining variants.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>We have proposed a system based on the encoder part of the Transformer
architecture in order to extract useful word representations that are discriminative for
sentiment analysis on tweets from several Spanish variants. The results
obtained by our system are very promising, being the first- or second-ranked
system on almost all the Spanish variants. This is especially significant, considering
that these results were obtained without extensive experimentation on
the hyper-parameters of the model, and that these hyper-parameters were only tuned
on the ES subtask. This opens the door to future improvements by exploring
modifications to the architecture and its hyper-parameters.</p>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgements</title>
      <p>This work has been partially supported by the Spanish MINECO and FEDER
funds under project AMIC (TIN2017-85854-C4-2-R) and by the GiSPRO project
(PROMETEU/2018/176). The work of Jose-Angel Gonzalez is financed by the
Universitat Politecnica de Valencia under grant PAID-01-17.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>1. Ambartsoumian, A., Popowich, F.: Self-attention: A better building block for sentiment analysis neural network classifiers. In: WASSA@EMNLP (2018)</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>2. Ba, L.J., Kiros, R., Hinton, G.E.: Layer normalization. CoRR abs/1607.06450 (2016), http://arxiv.org/abs/1607.06450</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>3. Díaz-Galiano, M.C., et al.: Overview of TASS 2019. CEUR-WS, Bilbao, Spain (2019)</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>4. Gonzalez, J., Hurtado, L., Pla, F.: ELiRF-UPV en TASS 2017: Análisis de Sentimientos en Twitter basado en Aprendizaje Profundo (ELiRF-UPV at TASS 2017: Sentiment Analysis in Twitter based on Deep Learning). In: Proceedings of TASS 2017: Workshop on Semantic Analysis at SEPLN, TASS@SEPLN 2017, co-located with the 33rd SEPLN Conference (SEPLN 2017), Murcia, Spain, September 18th, 2017. pp. 29-34 (2017), http://ceur-ws.org/Vol-1896/p2_elirf_tass2017.pdf</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>5. Gonzalez, J., Hurtado, L., Pla, F.: ELiRF-UPV en TASS 2018: Análisis de Sentimientos en Twitter basado en Aprendizaje Profundo (ELiRF-UPV at TASS 2018: Sentiment Analysis in Twitter based on Deep Learning). In: Proceedings of TASS 2018: Workshop on Semantic Analysis at SEPLN, TASS@SEPLN 2018, co-located with the 34th SEPLN Conference (SEPLN 2018), Sevilla, Spain, September 18th, 2018. pp. 37-44 (2018), http://ceur-ws.org/Vol-2172/p2_elirf_tass2018.pdf</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>6. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735-1780 (Nov 1997). https://doi.org/10.1162/neco.1997.9.8.1735</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>7. Iyyer, M., Manjunatha, V., Boyd-Graber, J., Daumé III, H.: Deep unordered composition rivals syntactic methods for text classification. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). pp. 1681-1691. Association for Computational Linguistics, Beijing, China (Jul 2015). https://doi.org/10.3115/v1/P15-1162, https://www.aclweb.org/anthology/P15-1162</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>8. Letarte, G., Paradis, F., Giguère, P., Laviolette, F.: Importance of self-attention for sentiment analysis. In: Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. pp. 267-275. Association for Computational Linguistics, Brussels, Belgium (Nov 2018), https://www.aclweb.org/anthology/W18-5429</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>9. Mikolov, T., Sutskever, I., Chen, K., Corrado, G., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2. pp. 3111-3119. NIPS'13, Curran Associates Inc., USA (2013), http://dl.acm.org/citation.cfm?id=2999792.2999959</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>10. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research 15, 1929-1958 (2014), http://jmlr.org/papers/v15/srivastava14a.html</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>11. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. In: Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R. (eds.) Advances in Neural Information Processing Systems 30, pp. 5998-6008. Curran Associates, Inc. (2017)</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>