<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Palomino-Ochoa at TASS 2020: Transformer-based Data Augmentation for Overcoming Few-Shot Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Daniel Palomino</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>José Ochoa-Luna</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universidad Católica San Pablo</institution>
          ,
          <addr-line>Quinta Vivanco s/n St, Arequipa, Arequipa</addr-line>
          ,
          <country country="PE">Peru</country>
        </aff>
      </contrib-group>
      <fpage>171</fpage>
      <lpage>178</lpage>
      <abstract>
<p>This paper describes the participation of the Department of Computer Science at Universidad Católica San Pablo (UCSP) in the TASS 2020 Workshop. We have developed sentiment analysis algorithms for the monovariant and multivariant challenges. In both cases, our approach is based on transfer learning using BERT language modeling. We also propose a procedure based on this language model to generate contextual data augmentation, aimed at enlarging the training dataset and preventing overfitting. Our design choices allow us to achieve results comparable to the state of the art on the provided TASS benchmark datasets.</p>
      </abstract>
      <kwd-group>
        <kwd>Transformer</kwd>
        <kwd>Few-Shot Learning</kwd>
        <kwd>Data Augmentation</kwd>
        <kwd>Sentiment Analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Deriving an effective algorithm for Spanish Twitter sentiment analysis has long been pursued
since the first TASS challenge in 2012 [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Nowadays, despite recent advances in algorithms
(Deep Learning [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]) and word encoding (embeddings [
        <xref ref-type="bibr" rid="ref3 ref4 ref5">3, 4, 5</xref>
        ], the basic polarity detection task
has not been completely solved. Moreover, whereas it is usually claimed that a transfer learning
approach can smoothly solve any classification task in NLP [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], this is not usually the case
when it is applied to Spanish. The task becomes even harder when several language variants
are provided. In fact, low Macro-F1 values were reported in a previous TASS workshop [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>In this paper we still rely on transfer learning to solve the classification task, but our approach
has been carefully designed bearing in mind that the Spanish language has several variants.</p>
      <p>
        In NLP it is common to use text input encoded as word embeddings. Those embeddings,
which allow us to encode semantic similarities among words, can be defined using several
approaches such as Word2vec [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], GloVe [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and FastText [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], to name a few. When we reuse
pre-trained word embeddings in several tasks, we are indirectly employing a transfer learning
scheme.
      </p>
      <p>
        Nowadays, this kind of encoding has evolved to a language model encoding. The idea is to
use language context in order to better encode words [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Overall, the aim is to transfer the
knowledge encoded in the language model to a specific task, in this case polarity detection in
sentiment analysis.
      </p>
      <p>
        Our proposed classifier relies on three components: a multivariant Spanish corpus, a general
language model and a data augmentation step. Thus, we start by training a general language
model based on BERT [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] using a multivariant Spanish corpus. This language model allows us
to learn general features of the Spanish language. The final prediction is enhanced using an
unsupervised data augmentation process.
      </p>
      <p>
        These simple design choices allow us to obtain results comparable to the state of the art on the two subtasks of the polarity classification task presented in the 2020 TASS challenge [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] (this work does not cover the emotion detection task [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]).
      </p>
      <p>The rest of the paper is structured as follows. In Section 2, we describe the task at hand. In
Section 3, the system is explained. The experimental setup is described in Section 4. Results
and conclusions are presented in Sections 5 and 6, respectively.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Task description</title>
      <p>The aim of Task 1 is the correct classification of sentiments (N: Negative, P: Positive, NEU:
Neutral) in tweets written in Spanish and its variants. This year, the task has been sub-divided into
two subtasks1:
• Subtask-1: Monovariant. In this challenge, we have to predict the sentiment of tweets in
five Spanish variants: Spain, Peru, Costa Rica, Uruguay and Mexico. For each variant we
have three datasets: Training, Validation and Test. Moreover, even though we may use any
additional corpora or linguistic resources, we have to submit the predictions for each
country's Test dataset in a different file to the evaluation system.
• Subtask-2: Multivariant. Complementary to Subtask-1, we have been provided with
one additional Test dataset drawn from the different country datasets described above.
Again, it is possible to use any corpora or linguistic resource.</p>
      <p>Furthermore, the informal language used in tweets and the lack of context due to their limited
character count add several challenges to this competition.</p>
      <p>Finally, the metrics used by the evaluation system are Macro-Precision, Macro-Recall and
Macro-F1, i.e., the macro-averaged versions of precision, recall and F1. However, only Macro-F1
was used to rank the different submissions.</p>
    </sec>
    <sec id="sec-3">
      <title>3. System</title>
      <p>
        We propose a single system for improving polarity classification on small datasets, based on
the high-performance Language Model (LM) referred to as BERT, developed by Devlin et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
In addition, we use a clever BERT variant to produce contextual data augmentation [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
Overall, our aim is to develop a robust approach that can be applied indistinctly to different variants
of Spanish and obtains competitive results in several NLP tasks.
      </p>
      <p>
        In order to do so, we start by pre-training a general LM on a multivariant Spanish corpus
related to the target task. Subsequently, to enhance our small target dataset and prevent
overfitting, we use the LM to generate Unsupervised Data Augmentation using a novel technique
called Conditional BERT Contextual Augmentation [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Finally, this augmented dataset is
used to fine-tune a classifier on top of the previous LM. A general view of these stages is shown in
Figure 1; each stage is detailed below.
      </p>
      <sec id="sec-3-1">
        <title>3.1. Language Model</title>
        <p>
          Due to the huge impact of BERT in NLP, we decided to use it as our initial LM in its base form.
By doing so, we have fewer parameters to train and it is easier to fine-tune with fewer samples.
Following the guidelines presented by Devlin et al. [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], which describe how to train a Transformer
Encoder successfully, we use two objective tasks to pre-train our LM: Masked Language
Model and Next Sentence Prediction.
        </p>
        <p>• Masked Language Model. Mask some percentage of the input tokens at random and then
predict those masked tokens.
• Next Sentence Prediction. Given a pair of sentences, the model must predict whether
the second one follows the first.</p>
        <p>This pre-training is performed using a freely available dataset described in the spanish-corpora repository2.</p>
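        <p>As a minimal illustration of the Masked Language Model objective, the sketch below randomly masks a fraction of the tokens in a sentence. It is a simplification: the exact BERT recipe also sometimes keeps or randomly replaces the selected tokens instead of masking them.</p>
        <preformat>import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, seed=None):
    # Randomly select a fraction of the tokens; the LM is trained to
    # recover the original tokens at exactly the masked positions.
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() >= mask_prob:
            masked.append(tok)
        else:
            targets[i] = tok          # position and original token to predict
            masked.append(MASK)
    return masked, targets

print(mask_tokens("este celular es realmente bueno".split(), seed=0))</preformat>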
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Unsupervised Data Augmentation</title>
        <p>
          Based on the Conditional BERT Contextual Augmentation algorithm [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], we fine-tune the
previously pre-trained LM using the conditional MLM task on the labeled target dataset. Once the LM
is fine-tuned, we take one tweet from our training dataset and randomly mask k words, later
predicting the masked words with the fine-tuned LM. This process is repeated n times for every tweet
in our dataset, where k and n are numbers chosen heuristically.
        </p>
        <p>Finally, we add the newly formed sentences to the original training dataset and perform the
downstream task on it.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Classifier System</title>
        <p>In this final step we build a classifier, a simple fully connected layer followed by a Softmax, on
top of the initial LM, which allows us to predict the correct class. In order to prevent overfitting,
we use the enhanced dataset created by the aforementioned Unsupervised Data
Augmentation process.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Setup</title>
      <p>This section provides a detailed description of the hardware and software requirements for replicating our research
results. In addition, we describe the hyper-parameters tuned during experimentation that allow
our models to train and converge without overfitting.</p>
      <sec id="sec-4-1">
        <title>4.1. Technical Resources</title>
        <p>The experiments were executed on Jupyter Notebooks running a Python 3.7 kernel and PyTorch
1.4.0. Moreover, all models were trained on an NVIDIA RTX 2080 Ti GPU with 11 GB of GDDR6 memory.</p>
        <p>For a complete description of the dependencies, we refer to our public project repository3.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Datasets</title>
        <p>The datasets used in this work are freely available.</p>
        <sec id="sec-4-2-1">
          <title>4.2.1. Pre-Training</title>
          <p>
            The dataset used to pre-train the LM is a compilation of Large Spanish Unannotated Corpora [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ], including the Spanish Wikipedia, the Spanish portion of TED conferences and other resources.
          </p>
        </sec>
        <sec id="sec-4-2-2">
          <title>4.2.2. Target</title>
          <p>The dataset used to fine-tune the LM for the Unsupervised Data Augmentation and the final classifier is provided by the competition committee and can be accessed via the web of the challenge4.</p>
        </sec>
        <p>3https://github.com/dpalominop/atlas
4http://tass.sepln.org/2020</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Pre-processing</title>
        <sec id="sec-4-3-1">
          <title>All datasets were pre-processed regarding the following rules:</title>
          <p>1. The text was converted to lowercase and every accent mark was removed.
2. Repeated characters were replaced by single characters.
3. User references were transformed to a specific token.</p>
          <p>4. Useless spaces were removed.</p>
        </sec>
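        <p>Two details of the sketch below are our assumptions, since they are not specified above: the placeholder token used for user references, and reading "repeated characters" as runs of three or more.</p>
        <preformat>import re
import unicodedata

def preprocess(text):
    # 1. Lowercase and strip accent marks (NFD decomposition, drop combining marks).
    text = text.lower()
    text = "".join(c for c in unicodedata.normalize("NFD", text)
                   if unicodedata.category(c) != "Mn")
    # 2. Collapse runs of three or more repeated characters to a single one.
    text = re.sub(r"(.)\1{2,}", r"\1", text)
    # 3. Replace user references with a placeholder token (assumed name).
    text = re.sub(r"@\w+", "xuser", text)
    # 4. Remove useless whitespace.
    return re.sub(r"\s+", " ", text).strip()

print(preprocess("@Pedro   Holaaaa qué BUENO!!"))  # -> "xuser hola que bueno!!"</preformat>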
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Pre-Training Language Model</title>
        <p>As will be shown in Table 3, using a multilingual LM decreases the performance of the system.
In contrast, if we use a monolingual LM, results are noticeably improved. Furthermore, we
have to use a LM trained on a dataset similar to the target task, and the best option was
training one on the general corpora described in Section 4.2. The main hyper-parameters
used throughout this process are:
1. Backpropagation Through Time (BPTT): 70
2. Weight Decay (WD): 1e-2
3. Batch size (BS): 64</p>
      </sec>
      <sec id="sec-4-5">
        <title>4.5. Fine-Tuning Language Model</title>
        <p>The main hyper-parameters used throughout this LM fine-tuning process are:
1. Backpropagation Through Time (BPTT): 70
2. Weight Decay (WD): 1e-2
3. Batch size (BS): 32
4. Number of randomly masked words (k): 4
5. Repetitions of the process (n): 2</p>
      </sec>
      <sec id="sec-4-6">
        <title>4.6. Fine-Tuning Classifier</title>
        <p>Similar to the previous sub-section, the main hyper-parameters used throughout this fine-tuning
process are:
1. Backpropagation Through Time (BPTT): 70
2. Weight Decay (WD): 1e-2
3. Batch size (BS): 16</p>
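        <p>For convenience, the hyper-parameters of the three stages can be gathered in a single configuration object; this sketch is simply a restatement of sections 4.4 to 4.6 in code form:</p>
        <preformat># Hyper-parameters per training stage, as reported in sections 4.4-4.6.
CONFIG = {
    "pretrain_lm":         {"bptt": 70, "weight_decay": 1e-2, "batch_size": 64},
    "finetune_lm":         {"bptt": 70, "weight_decay": 1e-2, "batch_size": 32,
                            "masked_words_k": 4, "repetitions_n": 2},
    "finetune_classifier": {"bptt": 70, "weight_decay": 1e-2, "batch_size": 16},
}</preformat>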
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>Results for TASS 2020 Task 1 - Monovariant are shown in Table 1. Our submission (referred
to as daniel.palomino.paucar) was ranked 1st or 2nd among all competitors in every variant of
Spanish presented in the competition (Macro-F1 score).</p>
      <p>Furthermore, results for TASS 2020 Task 1 - Multivariant are shown in Table 2. Our
submission (referred to as daniel.palomino.paucar) was ranked 1st among all competitors
(Macro-F1 score).</p>
      <p>On the other hand, given the possibility of sending 3 different submissions to the competition,
we used them to test three variations of our system and performed ablation experiments
in order to better understand the competitive results obtained.</p>
      <p>The output of these experiments is shown in Table 3. We can observe a significant
positive impact when using a monolingual LM instead of a multilingual LM: in terms of the
Macro-F1 metric, there is a 10 percent difference between the two methods. Moreover, if we
include the additional step of Unsupervised Data Augmentation, the results improve further.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>We have presented a novel sentiment classification system based on BERT that includes an
additional step of Unsupervised Data Augmentation. The system has been applied to sentiment
analysis of Spanish tweets and their variants. Despite its simplicity, this approach allowed us
to be ranked 1st or 2nd on the TASS 2020 Task 1 - Monovariant and 1st on the TASS 2020
Task 1 - Multivariant.</p>
      <p>Furthermore, the ablation experiments have shown that a careful choice of the language
model can improve the results drastically. Thus, we have chosen to pre-train the language
model on a dataset similar to the target task. Moreover, the use of data augmentation allowed
us to further improve our previous results for most variants of the Spanish language. However,
the performance on the Mexican variant decreased after using this technique, probably due to
overfitting during the fine-tuning process.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This work was funded by CONCYTEC-FONDECYT under the call E041-01 [contract number
34-2018-FONDECYT-BM-IADT-SE].</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Villena-Román</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>García-Morera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Moreno-García</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Ferrer-Ureña</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Serrano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gonzalez-Cristobal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Westerski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Martínez-Cámara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>García-Cumbreras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Martín-Valdivia</surname>
          </string-name>
          , L. López,
          <source>TASS - Workshop on sentiment analysis at SEPLN</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>B.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <source>Sentiment Analysis and Opinion Mining</source>
          , Morgan and Claypool Publishers,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.</given-names>
            <surname>Bojanowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Grave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joulin</surname>
          </string-name>
          , T. Mikolov,
          <article-title>Enriching word vectors with subword information</article-title>
          ,
          <source>Transactions of the Association for Computational Linguistics</source>
          <volume>5</volume>
          (
          <year>2017</year>
          )
          <fpage>135</fpage>
          -
          <lpage>146</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , I. Sutskever,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Corrado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          , in: C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, K. Q. Weinberger (Eds.),
          <source>Advances in Neural Information Processing Systems</source>
          <volume>26</volume>
          , Curran Associates, Inc.,
          <year>2013</year>
          , pp.
          <fpage>3111</fpage>
          -
          <lpage>3119</lpage>
          . URL: http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Pennington</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Socher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          , Glove:
          <article-title>Global vectors for word representation</article-title>
          ,
          <source>in: Empirical Methods in Natural Language Processing (EMNLP)</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          . URL: http://www.aclweb.org/anthology/D14-1162.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Howard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ruder</surname>
          </string-name>
          ,
          <article-title>Universal language model fine-tuning for text classification</article-title>
          , in: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics
          , Melbourne, Australia,
          <year>2018</year>
          , pp.
          <fpage>328</fpage>
          -
          <lpage>339</lpage>
          . URL: https://www.aclweb.org/anthology/P18-1031.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Díaz-Galiano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. G.</given-names>
            <surname>Vega</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Casasola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chiruzzo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. Á. G.</given-names>
            <surname>Cumbreras</surname>
          </string-name>
          , E. Martínez-Cámara,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moctezuma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Ráez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A. S.</given-names>
            <surname>Cabezudo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. S.</given-names>
            <surname>Tellez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Graf</surname>
          </string-name>
          , S. Miranda-Jiménez, Overview of TASS 2019:
          <article-title>One more further for the global Spanish sentiment analysis corpus</article-title>
          , in: IberLEF@SEPLN,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D.</given-names>
            <surname>Palomino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ochoa-Luna</surname>
          </string-name>
          ,
          <article-title>Advanced transfer learning approach for improving spanish sentiment analysis</article-title>
          , in: L.
          <string-name>
            <surname>Martínez-Villaseñor</surname>
            ,
            <given-names>I. Batyrshin</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Marín-Hernández</surname>
          </string-name>
          (Eds.),
          <source>Advances in Soft Computing</source>
          , Springer International Publishing, Cham,
          <year>2019</year>
          , pp.
          <fpage>112</fpage>
          -
          <lpage>123</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: pre-training of deep bidirectional transformers for language understanding</article-title>
          , CoRR abs/1810.04805 (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>García-Vega</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Díaz-Galiano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>García-Cumbreras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Montejo</given-names>
            <surname>Ráez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Jiménez Zafra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Martínez-Cámara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Murillo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. Casasola</given-names>
            <surname>Murillo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chiruzzo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Moctezuma</surname>
          </string-name>
          , Sobrevilla, Overview of TASS 2020:
          <article-title>introducing emotion detection</article-title>
          ,
          <source>in: Proceedings of the Iberian Languages Evaluation Forum (IberLEF</source>
          <year>2020</year>
          ), volume
          <volume>2664</volume>
          <source>of CEUR Workshop Proceedings</source>
          , CEUR-WS, Málaga, Spain,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Plaza del Arco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Strapparava</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Urena Lopez</surname>
          </string-name>
          , M. Martin,
          <article-title>EmoEvent: A multilingual emotion corpus based on different events</article-title>
          ,
          <source>in: Proceedings of The 12th Language Resources and Evaluation Conference</source>
          , European Language Resources Association, Marseille, France,
          <year>2020</year>
          , pp.
          <fpage>1492</fpage>
          -
          <lpage>1498</lpage>
          . URL: https://www.aclweb.org/anthology/2020.lrec-1.186.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>X.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lv</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zang</surname>
          </string-name>
          , J. Han, S. Hu,
          <article-title>Conditional BERT contextual augmentation</article-title>
          , CoRR abs/1812.06705 (
          <year>2018</year>
          ). URL: http://arxiv.org/abs/1812.06705. arXiv:1812.06705.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cañete</surname>
          </string-name>
          ,
          <source>Compilation of large spanish unannotated corpora</source>
          ,
          <year>2019</year>
          . URL: https://doi.org/10.5281/zenodo.3247731. doi:10.5281/zenodo.3247731.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>