<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
<article-title>Cascade of Biased Two-class Classifiers for Multi-class Sentiment Analysis</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
<institution>Departamento de Ingeniería Informática. Universidad Católica de Temuco.</institution>
          <country country="CL">Chile</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Software and Computing Systems. University of Alicante.</institution>
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
<p>In this paper, we describe our participation in the Rest-Mex 2021 Sentiment Analysis Task. Our approach is based on an ensemble of BERT|BETO-based classifiers arranged in a cascade of binary models trained with a bias towards specific classes, with the aim of lowering the Mean Absolute Error. The resulting models were ranked in 2nd and 3rd place according to the Mean Absolute Error evaluation criterion.</p>
      </abstract>
      <kwd-group>
<kwd>Sentiment Analysis</kwd>
        <kwd>Deep Learning</kwd>
        <kwd>Transformer Models</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
<p>In the Rest-Mex 2021 Sentiment Analysis Task, systems must predict the polarity of an opinion published by a tourist about
places of Guanajuato, Mexico. The collection provided was obtained from the
tourists who shared their opinion on TripAdvisor between 2002 and 2020. Our
approach is based on Deep Learning Transformer Models, specifically BERT
and BETO, applying a particular architecture and training strategies that we
describe in the next section. The resulting models were ranked in 2nd and 3rd
place according to the Mean Absolute Error evaluation criterion.
</p>
    </sec>
    <sec id="sec-task">
      <title>Task and Data Description</title>
      <p>In this section, we describe the data provided by the organizers for this
subtask, its characterization, and the different results of our models. The corpus
consists of 7,632 opinions, where 5,784 opinions are from national tourists (from
Mexico) and 1,848 opinions come from Iberoamerican tourists. Each opinion is
classified with an integer in [1, 5], where 1 represents the most negative
polarity and 5 the most positive. For each opinion, the organizers also provided
information about nationality and gender. The organizers split the corpus
approximately 70%-30%: 70% of the data, specifically 5,194 opinions, was delivered
to the participants with complete information about each opinion, while 30% was
reserved for the final testing of the competing models. Analyzing the representation
of each of the classes, we detected a high level of imbalance, with class 5 as
the majority class, with a total of 2,688 instances, representing 51.75% of the
total, in great contrast with class 1, for which only 80 instances were
provided (1.54%). The presence of the rest of the classes is as follows: 1,595
instances for class 4, 686 instances for class 2 and 155 instances for class 3,
representing 30.71%, 13.21% and 2.98% respectively. This information can be
viewed in Fig. 1a. Our work takes as primary data only the textual information
of the opinion, without taking other features into consideration. One aspect to
take into account, given the architecture used, is the length of each opinion, since
our model is limited to 512 tokens. In Fig. 1b, we show a histogram where it can
be seen that the opinions processed meet this condition.
      </p>
    </sec>
    <sec id="sec-2">
<title>System Architecture</title>
      <p>In this section, we describe the two architectures, shown in Fig. 2, that we explored for
the sentiment analysis task.</p>
      <p>3.1 BERT|BETO-based multi-class classifiers.</p>
      <p>
        This model is a multi-class classifier that learns the five categories simultaneously.
It is based on BERT [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] as feature extractor, fine-tuned for the text classification
downstream task. We evaluated two versions of the architecture depicted in Fig.
2a. In both cases, we leveraged transfer learning from pre-trained embeddings.
The first one is the uncased version of BETO [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], which is a BERT model trained
on a Spanish-language corpus. The other pre-trained embedding we use is the
multilingual uncased version of BERT. We aim to compare a model specific to
Spanish with a multilingual one.
      </p>
      <p>The classifier comprises a dense layer with 768 hidden units and ReLU
activation, dropout with a rate of 0.2, and a dense layer with 5 units and linear
activation. For both BETO and BERT we use the base version, i.e. token
embeddings of size 768 and a maximum length of 512 tokens. As Fig. 2a shows, the
embedding of the [CLS] token was used as the representation of the whole opinion.
To address the class imbalance problem, class weights were set inversely
proportional to the number of instances in each category.</p>
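      <p>As an illustration, the classification head just described (dense 768 with ReLU, dropout 0.2, dense 5 with linear activation, applied to the 768-dimensional [CLS] embedding) can be sketched in NumPy at inference time, where dropout acts as the identity; the random initialization below is purely illustrative:</p>
      <preformat>
```python
import numpy as np

rng = np.random.default_rng(0)

# Head parameters: a 768-unit dense layer and a 5-unit linear output layer.
W1 = rng.normal(scale=0.02, size=(768, 768))
b1 = np.zeros(768)
W2 = rng.normal(scale=0.02, size=(768, 5))
b2 = np.zeros(5)

def head(cls_embedding):
    """Map a [CLS] embedding of shape (768,) to 5 class scores."""
    h = np.maximum(cls_embedding @ W1 + b1, 0.0)  # dense + ReLU; dropout is identity at inference
    return h @ W2 + b2                            # dense with linear activation

scores = head(rng.normal(size=768))               # one score per polarity class
```
      </preformat>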
      <p>
        This architecture has been evaluated by the authors of BERT [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] for the
sentiment analysis task over the Stanford Sentiment Treebank dataset [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], achieving
state-of-the-art results at the time. This makes the architecture attractive as a
benchmark for the sentiment analysis task in Spanish.
      </p>
      <p>3.2 BERT|BETO-based two-class cascade classifiers.</p>
      <p>The other model we studied is an ensemble of binary classifiers arranged in
cascade, as shown in Fig. 2b. (BETO is available at https://github.com/dccuchile/beto
and through the HuggingFace library, model id 'dccuchile/bert-base-spanish-wwm-uncased';
multilingual BERT is described at https://github.com/google-research/bert/blob/master/multilingual.md
and is available through the HuggingFace library, model id 'bert-base-multilingual-uncased'.)</p>
      <p>Fig. 2. (a) BERT-based multi-class classifier: a Transformer feature
extractor whose [CLS] token embedding feeds a classifier made of a dense layer,
dropout and a final dense layer. (b) Cascade classifier: four binary classifier
stages, where each stage either assigns its stage category or passes the text
on to the next stage.</p>
      <p>
        Cascading classifiers is the strategy leveraged by widely known frameworks such as the Viola-Jones one [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. In Sentiment Analysis, it has been used by [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] to enrich the feature set of a classifier.
      </p>
      <p>We evaluated two different ways to use this architecture to solve the
five-category classification problem. Let us denote the target category at stage i as
Ci, and its complement as Cic, for the sake of conciseness. The first setup teaches
each classifier to tell apart instances of one class from the rest. For the classifier
at stage 1, C1 = {1} while C1c = {2, 3, 4, 5}. The model at stage 2 learns to
classify C2 = {2} and C2c = {1, 3, 4, 5}, and similarly for stage 3. For the last
stage, we set C4 = {5} and C4c = {1, 2, 3, 4}.</p>
      <p>The other setup explored biasing the classifiers so that they tend to classify
the instances misclassified by upstream classifiers as the stage target category.
In this case, for stage 1 we have C1 = {1} and C1c = {2, 3, 4, 5}. For stage 2,
C2 = {1, 2} and C2c = {3, 4, 5}. Note that in this case stage 2 will also consider
category 1 as a target class. We proceeded analogously for the other stages,
except for the last one, which is configured as C4 = {5} and C4c = {1, 2, 3, 4}.</p>
      <p>To classify an instance, we present it to the classifier at stage 1. If it is
classified as the stage target category, then we are done. Otherwise, i.e. if it is
classified as C1c, the instance is presented to the classifier at the next stage.
This process is repeated at each stage until the end of the cascade.</p>
      <p>
        Each classifier is a binary version of the model described in Section 3.1, all
of them trained separately. For each of these models, we evaluated the
multilingual BERT [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and BETO [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], yielding four different approaches.
      </p>
      </sec>
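      <p>The two stage-target setups and the cascade inference procedure described above can be sketched as follows (a toy illustration: the real stage models are fine-tuned BERT|BETO binary classifiers, and the keyword-based stand-ins below are ours):</p>
      <preformat>
```python
# Stage target categories Ci for the two cascade setups described above.
# Unbiased: each stage separates one class from the rest; class 4 is the
# leftover category assigned when no stage claims the instance.
UNBIASED = {1: {1}, 2: {2}, 3: {3}, 4: {5}}
# Biased: intermediate stages also keep the upstream target classes, so
# instances missed by earlier stages can still be caught downstream.
BIASED = {1: {1}, 2: {1, 2}, 3: {1, 2, 3}, 4: {5}}

ALL_CLASSES = {1, 2, 3, 4, 5}

def complement(target):
    """Cic: the classes treated as the negative side at a given stage."""
    return ALL_CLASSES - target

def cascade_predict(text, stage_models, stage_labels, leftover=4):
    """Present the text to each binary stage in turn: the first stage whose
    model fires assigns its category; otherwise the leftover class is used."""
    for stage in sorted(stage_models):
        if stage_models[stage](text):
            return stage_labels[stage]
    return leftover

# Toy stand-ins for the four fine-tuned binary models (illustrative only).
toy_models = {1: lambda t: "awful" in t,
              2: lambda t: "bad" in t,
              3: lambda t: "ok" in t,
              4: lambda t: "great" in t}
toy_labels = {1: 1, 2: 2, 3: 3, 4: 5}

label = cascade_predict("a great place to visit", toy_models, toy_labels)
```
      </preformat>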
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <p>After the data was processed, it was divided 90%-10% into training and
validation sets. During the fine-tuning process, special attention was paid to
different evaluation criteria such as MAE and balanced accuracy.</p>
      <p>As a result of the experimentation, 4 new models were obtained. In Table 1,
we show 6 models in total, because we also include the benchmark models for BERT
and BETO respectively. In the following, we describe each of these 6 models.
Table 1 shows the models ordered from best (lowest) to worst MAE. At the bottom,
we can see the BERT Multi model, which is a BERT model trained on a multilingual
corpus; we considered that model as our benchmark. The rest of the labels of
each model can be interpreted using the following nomenclature: Multi indicates
that the model was trained with a multilingual corpus; Biased or Unbiased refers
to what has been stated in Subsection 3.2.</p>
      <p>The BETO Biased model was the one that obtained the best result,
with an MAE of 0.51, so it was selected as our primary submission; along
with it, the BETO Multi model was sent as the secondary submission, with an MAE
of 0.53. With the selection of the BETO Multi model as the secondary submission,
we wanted to validate the two cascade variants proposed in this paper. As the
final step before submitting, the two selected models were retrained on all the
available data.</p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <caption>
          <p>Results of the evaluated models on training and validation data.</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th rowspan="2">Model</th>
              <th colspan="2">MAE</th>
              <th colspan="2">RMSE</th>
              <th colspan="2">Acc.</th>
              <th colspan="2">F1</th>
              <th colspan="2">Rec.</th>
              <th colspan="2">Prec.</th>
            </tr>
            <tr>
              <th>train</th><th>val</th>
              <th>train</th><th>val</th>
              <th>train</th><th>val</th>
              <th>train</th><th>val</th>
              <th>train</th><th>val</th>
              <th>train</th><th>val</th>
            </tr>
          </thead>
          <tbody>
            <tr><td>BETO Biased (sub1)</td><td>0.03</td><td>0.51</td><td>0.06</td><td>0.65</td><td>0.95</td><td>0.44</td><td>0.92</td><td>0.40</td><td>0.95</td><td>0.44</td><td>0.90</td><td>0.40</td></tr>
            <tr><td>BETO Unbiased</td><td>0.04</td><td>0.53</td><td>0.08</td><td>0.68</td><td>0.95</td><td>0.48</td><td>0.90</td><td>0.44</td><td>0.95</td><td>0.48</td><td>0.87</td><td>0.49</td></tr>
            <tr><td>BERT Multi Biased</td><td>0.08</td><td>0.53</td><td>0.13</td><td>0.68</td><td>0.94</td><td>0.45</td><td>0.87</td><td>0.43</td><td>0.94</td><td>0.45</td><td>0.84</td><td>0.48</td></tr>
            <tr><td>BERT Multi Unbiased</td><td>0.26</td><td>0.67</td><td>0.71</td><td>1.23</td><td>0.80</td><td>0.47</td><td>0.70</td><td>0.36</td><td>0.80</td><td>0.47</td><td>0.74</td><td>0.38</td></tr>
            <tr><td>BETO Multi (sub2)</td><td>0.39</td><td>0.53</td><td>0.50</td><td>0.73</td><td>0.74</td><td>0.53</td><td>0.65</td><td>0.48</td><td>0.74</td><td>0.53</td><td>0.62</td><td>0.48</td></tr>
            <tr><td>BERT Multi</td><td>0.62</td><td>0.70</td><td>0.91</td><td>1.05</td><td>0.59</td><td>0.51</td><td>0.45</td><td>0.40</td><td>0.59</td><td>0.51</td><td>0.41</td><td>0.37</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>In this subtask, 8 teams competed, with 14 submissions in total. In the team
ranking we were second, and in the submission ranking we were in second and
third place; our best-evaluated submission turned out to be the secondary one,
with an MAE of 0.5451. It should be noted that our best submission achieved the
best result in F-measure and Precision among all participants. A summary of
the competition can be seen in Table 2.</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion and Future Work</title>
      <p>In this paper, we have described the models proposed by UCT-UA in the
Sentiment Analysis subtask at Rest-Mex 2021. We presented two models. The results
of our secondary submission were obtained from the model described in the
Results section as BETO Multi; this model achieved the second-best result in the
subtask, with an MAE of 0.5451. The results of our primary submission were
obtained from the model described in the Results section as BETO Biased; this
model achieved the third-best result in the subtask, with an MAE of 0.5613.</p>
      <table-wrap id="tab2">
        <label>Table 2</label>
        <caption>
          <p>Summary of the competition: team rank and submission rank.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Team Rank</th><th>Sub. Rank</th></tr>
          </thead>
          <tbody>
            <tr><td>1st</td><td>1</td></tr>
            <tr><td rowspan="2">2nd</td><td>2</td></tr>
            <tr><td>3</td></tr>
            <tr><td rowspan="3">3rd</td><td>4</td></tr>
            <tr><td>5</td></tr>
            <tr><td>6</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>Compared to the models using BERT Multi, the results suggest that the
monolingual embedding is a better representation. This is consistent with
results from the BERT team (https://github.com/google-research/bert/blob/master/multilingual.md),
where for high-resource languages the multilingual model may achieve worse
results than the single-language model. Moreover, this effect can be aggravated
here since the fine-tuning was done using Spanish only, thus degrading the
multilingual representation spaces spanned by the transformer.</p>
      <p>As future work, we are interested in evaluating whether the multilingual
models can benefit from TripAdvisor reviews in different languages or topics.
Also, we would like to study multi-modal approaches that can leverage information
from the title or metadata of the review to boost the results.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This research work has been partially funded by the Generalitat Valenciana
(Conselleria d'Educacio, Investigacio, Cultura i Esport) and the Spanish
Government through the projects SIIA (PROMETEO/2018/089, PROMETEU/2018/089)
and LIVING-LANG (RTI2018-094653-B-C22), and by the Vice Chancellor for
Research and Postgraduate Studies Office of the Universidad Catolica de Temuco,
VIPUCT Project No. 2020EM-PS-08 and FEQUIP 2019-INRN-03 of the
Universidad Catolica de Temuco.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Aguero-Torales</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abreu-Salas</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lopez-Herrera</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Deep learning and multilingual sentiment analysis on social media data: An overview</article-title>
          .
          <source>Applied Soft Computing</source>
          , vol
          <volume>107</volume>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Deep learning for sentiment analysis: A survey</article-title>
          .
          <source>In: Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery</source>
          . vol
          <volume>8</volume>
          , number 4 (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Gonzalez</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hurtado</surname>
            ,
            <given-names>L.F.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Pla</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>ELiRF-UPV at TASS 2019: Transformer Encoders for Twitter Sentiment Analysis in Spanish</article-title>
          .
          <source>In: Proc. of IberLEF@ SEPLN</source>
          , (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Pastorini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pereira</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zeballos</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chiruzzo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosa</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Etcheverry</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>RETUYT-InCo at TASS 2019: Sentiment Analysis in Spanish Tweets</article-title>
          .
          <source>In: Proc. of IberLEF@ SEPLN</source>
          , (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Gonzalez</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pla</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Hurtado</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>ELiRF-UPV at SemEval-2017 Task 4: sentiment analysis using deep learning</article-title>
          .
          <source>In: Proceedings of the 11th international workshop on semantic evaluation SemEval-2017</source>
          , (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Alvarez-Carmona</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aranda</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arce-Cardenas</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fajardo-Delgado</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guerrero-Rodriguez</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lopez-Monroy</surname>
            ,
            <given-names>A.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martinez-Miranda</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perez-Espinosa</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rodriguez-Gonzalez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Overview of Rest-Mex at IberLEF 2021: Recommendation System for Text Mexican Tourism</article-title>
          .
          <source>Procesamiento del Lenguaje Natural</source>
          , vol
          <volume>67</volume>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Calvo</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gambino</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Cascading classifiers for Twitter sentiment analysis with emotion lexicons</article-title>
          .
          <source>In: Proc. Int. Conf. on Intelligent Text Processing and Computational Linguistics</source>
          , pp.
          <fpage>270</fpage>
          -
          <lpage>280</lpage>
          . (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Canete</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chaperon</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fuentes</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perez</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Spanish pre-trained bert model and evaluation data</article-title>
          .
          <source>In: Proc. of PML4DC at ICLR</source>
          . (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Devlin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>M.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toutanova</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Bert: Pre-training of deep bidirectional transformers for language understanding</article-title>
          . arXiv:1810.04805 (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Socher</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perelygin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chuang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ng</surname>
            ,
            <given-names>A.Y.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Potts</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Recursive deep models for semantic compositionality over a sentiment treebank</article-title>
          .
          <source>In Proceedings of the 2013 conference on empirical methods in natural language processing</source>
          . pp.
          <fpage>1631</fpage>
          -
          <lpage>1642</lpage>
          . (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Viola</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Rapid object detection using a boosted cascade of simple features</article-title>
          .
          <source>In: Proc. of the 2001 IEEE Computer Society Conf. on Computer Vision and Pattern Recognition</source>
          . (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>