<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <title-group>
        <article-title>VRAIN at IroSvA 2019: Exploring Classical and Transfer Learning Approaches to Short Message Irony Detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Javier Iranzo-Sanchez</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Valencian Research Institute for Artificial Intelligence (VRAIN), Camino de Vera</institution>
          <addr-line>s/n, 46022 Valencia</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>0000</year>
      </pub-date>
      <volume>0002</volume>
      <fpage>322</fpage>
      <lpage>328</lpage>
      <abstract>
        <p>This paper describes VRAIN's participation in the IroSvA 2019: Irony Detection in Spanish Variants task of the Iberian Languages Evaluation Forum (IberLEF 2019). We describe the entire pre-processing, feature extraction, model selection and hyperparameter optimization pipeline behind our submissions to the shared task. A central part of our work is an in-depth comparison of the performance of different classical machine learning techniques, as well as some recent transfer learning proposals for Natural Language Processing (NLP) classification problems.</p>
      </abstract>
      <kwd-group>
        <kwd>Natural Language Processing</kwd>
        <kwd>Irony Detection</kwd>
        <kwd>Transfer Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        From a linguistic point of view, irony is a very interesting property of language.
As defined in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], irony is the ability to express some specific meaning through
terms and words that, on their own, have the completely opposite meaning.
From a computational viewpoint, on the other hand, irony can be an
important headache when performing natural language analysis tasks. For
example, in sentiment analysis, some of the most important features used to determine
text polarity are inferred from the words appearing in the text (e.g. negation,
n-grams, POS tags, etc.) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. With an ironic text, some of these features
should be smoothed in order to perform sentiment analysis correctly. That irony
can be a problem when performing sentiment analysis on a text has
been directly observed in past editions of the International Workshop on Semantic Evaluation
(SemEval [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]). In 2015 [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], two different datasets were considered, one
without sarcastic tweets and the other containing sarcastic tweets. System
performance was considerably lower on the sarcastic dataset. In fact, irony detection
was proposed as a task in 2015 [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and tackled for English-language tweets in
2018 [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
      </p>
      <p>
        This paper describes VRAIN's participation in the IroSvA 2019: Irony
Detection in Spanish Variants [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] task of the Iberian Languages Evaluation Forum
(IberLEF 2019) [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. In this task we must identify ironic texts written in three
Spanish variants (from Spain, Mexico and Cuba). For the Spain and Mexico
subtasks, we must detect ironic tweets, and for the Cuba subtask we must
detect ironic comments from a news website.
      </p>
      <p>We worked on each subtask in isolation (only the Spanish tweets from Mexico
were used for the Mexico subtask, and so on), but used the same approach and
pipeline in all three subtasks. Model selection and hyperparameter optimisation
were carried out individually for each subtask.</p>
      <p>The rest of the paper is structured as follows. In Section 2 we explain the
feature extraction process carried out for this specific task. In Section 3 we briefly
describe the system and present all the different approaches taken into account
to perform our comparison. In Section 4 we present the evaluation made in order
to compare the behaviour of the different models considered in this work. We
also compare the results obtained by our approach with all the different baselines
provided by the organisation. Finally, in Section 5 we summarise the conclusions
and the most important features of our work.</p>
    </sec>
    <sec id="sec-2">
      <title>Dataset and Feature Extraction</title>
      <p>We will first describe the structure of the competition's dataset. Table
1 contains statistics of the training and test datasets. Both the Spain (es) and Mexico
(mx) variants have 10 different topics, and the number of tweets differs for
each topic. The Cuba (cu) variant, on the other hand, has only 9 different topics, and
the number of news comments per topic also differs. Regarding the training dataset,
we have 2400 samples for each of the Spanish variants, so the corpus is balanced
across variants. The test dataset is made up of 600
samples for every Spanish variant.</p>
      <p>
        Having described the most important features of the dataset, we will now
focus on the feature extraction process. The text was tokenized using NLTK's [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]
TweetTokenizer. Additionally, we experimented with substituting all occurrences
of hashtags, URLs, user mentions and numbers with a generic token for each category,
but we finally decided against it since it decreased the model's performance.
      </p>
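      <p>The tokenization step above can be sketched as follows. The NLTK TweetTokenizer call is the one the paper uses; the substitution function and its placeholder tokens are our own illustrative sketch of the discarded variant, with an invented example tweet.</p>

```python
# Tokenize a (made-up) Spanish tweet with NLTK's TweetTokenizer, which keeps
# hashtags, mentions and URLs as single tokens.
import re
from nltk.tokenize import TweetTokenizer

tokenizer = TweetTokenizer()
tweet = "Que #maravilla de servicio, otra vez sin luz @empresa http://t.co/x 3 horas"
tokens = tokenizer.tokenize(tweet)

def substitute(tokens):
    # The variant the paper tried and rejected: replace each category with a
    # generic placeholder token (placeholder names are illustrative).
    out = []
    for t in tokens:
        if t.startswith('#'):
            out.append('HASHTAG')
        elif t.startswith('@'):
            out.append('USER')
        elif re.match(r'https?://', t):
            out.append('URL')
        elif t.isdigit():
            out.append('NUMBER')
        else:
            out.append(t)
    return out
```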
      <p>Each tweet was represented by a vector of counts of word n-grams. Using
counts directly instead of tf-idf performed better in our exploratory experiments.
The dataset contains additional information apart from the tweets themselves:
specifically, we are given the corresponding topic for each tweet. We
tried two ways of leveraging this information. In the first approach, which
we call the global-model, only one model is trained for each subtask, and a
one-hot vector encoding the topic is appended to every sample. Therefore, in
this approach, we have a single model per subtask, trained with data from all
the topics.</p>
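      <p>A minimal sketch of the global-model feature construction with scikit-learn, assuming CountVectorizer for the raw n-gram counts and OneHotEncoder for the topic vector; the tiny tweet/topic lists are invented for illustration.</p>

```python
# "Global-model" features: word n-gram counts concatenated with a one-hot
# encoding of each tweet's topic, feeding a single model per subtask.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import OneHotEncoder

tweets = ["que gran partido", "vaya nivel de debate", "otro gol en el minuto 90"]
topics = ["futbol", "politica", "futbol"]

# Raw counts (not tf-idf), unigrams and bigrams.
vectorizer = CountVectorizer(ngram_range=(1, 2))
X_text = vectorizer.fit_transform(tweets).toarray()

# One-hot encode the topic of each sample.
encoder = OneHotEncoder()
X_topic = encoder.fit_transform(np.array(topics).reshape(-1, 1)).toarray()

# Append the topic vector to every sample.
X = np.hstack([X_text, X_topic])
```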
      <p>In the second approach, which we call the topic-model, we trained one
model per topic. Thus, at training time, we trained each of the individual models
using only data from one topic, and at inference time, for each tweet,
we used the predictions of the model that had been trained on the data of
the tweet's topic. The results of both approaches are compared and evaluated in
Section 4.</p>
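      <p>The topic-model routing can be sketched as one independent classifier per topic, selected by the sample's topic at inference time. The training samples and the choice of Multinomial Naive Bayes here are illustrative, not the paper's exact configuration.</p>

```python
# "Topic-model" approach: fit one pipeline per topic, then route each test
# sample to the model trained on its topic.
from collections import defaultdict
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train = [
    ("menudo arbitro tenemos", "futbol", 1),
    ("buen partido hoy", "futbol", 0),
    ("gran gestion, si claro", "politica", 1),
    ("se aprueba la ley", "politica", 0),
]

# Group training data by topic.
by_topic = defaultdict(list)
for text, topic, label in train:
    by_topic[topic].append((text, label))

# Fit one model per topic, using only that topic's data.
models = {}
for topic, samples in by_topic.items():
    texts, labels = zip(*samples)
    models[topic] = make_pipeline(CountVectorizer(), MultinomialNB()).fit(texts, labels)

def predict(text, topic):
    # At inference time, use the model trained on the sample's topic.
    return int(models[topic].predict([text])[0])
```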
    </sec>
    <sec id="sec-3">
      <title>System Description</title>
      <p>
        We will now describe the different models we tried for irony detection. In order
to select appropriate values for the hyperparameters of each model, we carried out
5-fold cross-validation and selected the configurations that obtained the highest F1
(macro-averaged). Unless otherwise noted, methods are implemented using the
sklearn toolkit [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
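      <p>This selection procedure can be sketched with scikit-learn's GridSearchCV, scoring by macro-averaged F1 over 5 folds. The parameter grid and toy data below are illustrative, not the exact search space used in our experiments.</p>

```python
# Hyperparameter selection by 5-fold cross-validation, keeping the
# configuration with the highest macro-averaged F1.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

texts = ["si claro genial", "buen dia", "vaya sorpresa",
         "me gusta", "que bien todo", "gran idea"] * 5
labels = [1, 0, 1, 0, 1, 0] * 5

pipe = Pipeline([("vec", CountVectorizer()), ("clf", LinearSVC())])
grid = {"vec__ngram_range": [(1, 1), (1, 2)], "clf__C": [0.1, 1.0]}

# 5-fold CV, macro-averaged F1 as the model-selection criterion.
search = GridSearchCV(pipe, grid, scoring="f1_macro", cv=5)
search.fit(texts, labels)
best = search.best_params_
```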
      <p>
        Classification approaches
- Naive Bayes: The Naive Bayes approach is a well-known technique for
tackling many classification problems. A Multinomial distribution is used to
model P(x_i|c).
- Support Vector Machines: Support Vector Machines [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] are Maximum
Margin Classifiers that have been shown to obtain good results in a variety
of tasks. We use a linear kernel, which has been shown to outperform
non-linear kernels in text classification problems [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
- Gradient Tree Boosting: Gradient Tree Boosting is a boosting technique
that consists of an ensemble of tree models built sequentially from
a set of weak learners. We have used the implementation available in the
XGBoost toolkit [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
- Linear Models (fastText): fastText [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] is a toolkit implementing a set of
linear architectures for text classification. The model, based on the CBOW
architecture [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], has a word embedding matrix used to look up a representation
of each word in the text. The embeddings are summed and averaged into
a fixed-size vector, which is then fed into a softmax classifier. Additionally,
we have also trained a version with pre-trained word embeddings, using a
publicly available dataset of 200-dimensional word embeddings trained on
Spanish tweets [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
- BERT: BERT [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] is a pre-training methodology for Transformer models [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
      <p>BERT models are pre-trained on massive amounts of unsupervised text data,
and can then be used in a transfer-learning approach for other downstream
tasks. For this task, we used the pre-trained BERT-Base Multilingual
Cased model, and fine-tuned it on the IroSvA data for 10 epochs.</p>
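      <p>As a concrete instance of the first approach in the list, a Multinomial Naive Bayes classifier over n-gram counts can be sketched as follows; in scikit-learn, the fitted feature_log_prob_ attribute holds the log P(x_i|c) likelihoods referred to above. The toy tweets and labels are invented.</p>

```python
# Multinomial Naive Bayes over word-count features: the model estimates
# P(x_i | c), the per-class likelihood of each vocabulary item.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["si claro, genial todo", "me encanta esperar horas",
         "buen servicio", "todo correcto"]
labels = [1, 1, 0, 0]  # 1 = ironic, 0 = not ironic (toy labels)

vec = CountVectorizer()
X = vec.fit_transform(texts)
nb = MultinomialNB().fit(X, labels)

# feature_log_prob_: one row per class, one column per vocabulary item,
# containing log P(x_i | c).
log_likelihoods = nb.feature_log_prob_
pred = nb.predict(vec.transform(["buen servicio, si claro"]))
```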
    </sec>
    <sec id="sec-4">
      <title>Experimental Evaluation</title>
      <p>The results obtained by the different models are shown in Table 2.</p>
      <p>We can see a number of interesting results in the table. First, except in a
single case (Naive Bayes for the es variant), topic models obtain similar or worse
results than their global counterparts. Most likely due to the reduced amount
of training data, the fine-grained approach of modelling each topic individually
seems counterproductive.</p>
      <p>As for the transfer learning approaches we tried, we have not been able
to leverage the knowledge obtained from the pre-training tasks. The fastText
model using pre-trained embeddings does not improve on the results of the base
fastText model, and the BERT model obtains results similar to the Naive Bayes
model.</p>
      <p>Overall, the best results are obtained by the SVM and Gradient Boosting
models. In order to further improve the results, we constructed an ensemble
of the SVM and Gradient Boosting models, whose predictions are the average
of the individual models' predictions. This yields additional improvements in
the es and mx variants, and was the model submitted to the competition. Table
3 shows the performance of our model compared to the competition baselines.</p>
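      <p>The prediction-averaging ensemble can be sketched as follows. We stand in for the paper's SVM and XGBoost pair with two scikit-learn probabilistic models (an SVM needs probability calibration to expose class probabilities); the toy data is invented.</p>

```python
# Ensemble by averaging the two models' class-probability estimates and
# taking the argmax of the mean.
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import SVC

texts = ["si claro genial", "buen dia", "vaya sorpresa",
         "me gusta", "que maravilla", "gran idea"] * 4
labels = [1, 0, 1, 0, 1, 0] * 4

vec = CountVectorizer()
X = vec.fit_transform(texts)

# probability=True enables predict_proba on the SVM via Platt scaling.
svm = SVC(kernel="linear", probability=True).fit(X, labels)
gbt = GradientBoostingClassifier().fit(X, labels)

def ensemble_predict(X_new):
    # Average the individual models' probability estimates, then argmax.
    probs = (svm.predict_proba(X_new) + gbt.predict_proba(X_new)) / 2
    return probs.argmax(axis=1)

pred = ensemble_predict(vec.transform(["vaya sorpresa"]))
```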
      <p>
        The results obtained by our model present significant variations depending
on the task. In the case of the es task, our model outperforms all baselines, and
in the mx task, our system comes in second place behind the LDSE [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] baseline.
However, in the case of the cu task, our model is only able to beat the
majority baseline. We do not know the reasons for the significant performance drop
in the cu task between our internal experiments and the competition results,
although one possible culprit is the aforementioned domain mismatch between
the {es, mx} and cu tasks.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions</title>
      <p>This paper has described VRAIN's submission to IroSvA 2019. The different
experiments have shown that, under the current conditions, classical models
have an edge over the recent transfer-learning techniques that we tested.
We believe that the limiting factor is the lack of sufficient training data for the
fine-tuning step.</p>
      <p>
        Our submission, based on an ensemble of SVM and Gradient Tree Boosting
models, obtains good results across the board, although the performance could
be improved in the cu case. This has been achieved using non-task-specific
bag-of-n-grams features. We expect that these results could be further improved
with features specific to irony detection, such as those from [
        <xref ref-type="bibr" rid="ref10 ref16">16,10</xref>
        ].
      </p>
      <p>
        Acknowledgements: The research leading to these results has received funding
from the European Union's Horizon 2020 research and innovation programme
under grant agreement no. 761758 (X5gon) and from the Valencian Government
grant for excellence research groups PROMETEO/2018/002.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. IroSvA 2019:
          <article-title>Irony detection in Spanish variants</article-title>
          . http://www.autoritas.net/IroSvA2019/, accessed: 2019-07-05
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <article-title>Word embeddings trained with word2vec on 200 million spanish tweets using 200 dimensions</article-title>
          , http://new.spinningbytes.com/resources/wordembeddings/
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bird</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>NLTK: the natural language toolkit</article-title>
          .
          <source>In: ACL</source>
          <year>2006</year>
          ,
          <article-title>21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics</article-title>
          ,
          <source>Proceedings of the Conference</source>
          , Sydney, Australia, 17-21 July 2006 (
          <year>2006</year>
          ), http://aclweb.org/anthology/P06-4018
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guestrin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Xgboost: A scalable tree boosting system</article-title>
          .
          <source>In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          , San Francisco, CA, USA, August 13-17,
          <year>2016</year>
          . pp. 785–794 (
          <year>2016</year>
          ), https://doi.org/10.1145/2939672.2939785
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Cortes</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vapnik</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Support-vector networks</article-title>
          .
          <source>Machine Learning</source>
          <volume>20</volume>
          (
          <issue>3</issue>
          ), 273–297
          (
          <year>1995</year>
          ), https://doi.org/10.1007/BF00994018
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Devlin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>M.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toutanova</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          . arXiv preprint arXiv:1810.04805 (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Ghosh</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Veale</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shutova</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barnden</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reyes</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          : Semeval-2015 task 11:
          <article-title>Sentiment analysis of figurative language in twitter</article-title>
          .
          <source>In: Proceedings of the 9th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT</source>
          <year>2015</year>
          , Denver, Colorado, USA, June 4-5,
          <year>2015</year>
          . pp. 470–478
          (
          <year>2015</year>
          ), http://aclweb.org/anthology/S/S15/S15-2080.pdf
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Grave</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joulin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bojanowski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Bag of tricks for efficient text classification</article-title>
          .
          <source>In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics</source>
          , EACL
          <year>2017</year>
          , Valencia, Spain, April 3-7,
          <year>2017</year>
          , Volume 2: Short Papers. pp. 427–431 (
          <year>2017</year>
          ), https://aclanthology.info/papers/E17-2068/e17-2068
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Grice</surname>
            ,
            <given-names>H.P.</given-names>
          </string-name>
          , et al.:
          <article-title>Logic and conversation</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Hernández-Farías</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patti</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Irony detection in twitter: The role of affective content</article-title>
          .
          <source>ACM Trans. Internet Techn</source>
          .
          <volume>16</volume>
          (
          <issue>3</issue>
          ), 19:1–19:24 (
          <year>2016</year>
          ), https://doi.org/10.1145/2930663
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Sentiment analysis and opinion mining</article-title>
          .
          <source>Synthesis lectures on human language technologies 5(1)</source>
          , 1–167
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corrado</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dean</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>Efficient estimation of word representations in vector space</article-title>
          .
          <source>In: 1st International Conference on Learning Representations, ICLR</source>
          <year>2013</year>
          , Scottsdale, Arizona, USA, May 2-4,
          <year>2013</year>
          , Workshop Track Proceedings (
          <year>2013</year>
          ), http://arxiv.org/abs/1301.3781
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Ortega-Bueno</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , Hernández-Farías, D.I., Rosso, P., Montes-y-Gómez, M., Medina Pagola, J.E.
          :
          <article-title>Overview of the Task on Irony Detection in Spanish Variants</article-title>
          .
          <source>In: Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2019), co-located with 34th Conference of the Spanish Society for Natural Language Processing (SEPLN 2019)</source>
          . CEUR-WS.org (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Pedregosa</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varoquaux</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gramfort</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Michel</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thirion</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grisel</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blondel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prettenhofer</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weiss</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dubourg</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>VanderPlas</surname>
          </string-name>
          , J.,
          <string-name>
            <surname>Passos</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cournapeau</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brucher</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perrot</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duchesnay</surname>
          </string-name>
          , E.:
          <article-title>Scikit-learn: Machine learning in python</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>12</volume>
          , 2825–2830
          (
          <year>2011</year>
          ), http://dl.acm.org/citation.cfm?id=2078195
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Franco-Salvador</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.:</given-names>
          </string-name>
          <article-title>A low dimensionality representation for language variety identification</article-title>
          .
          <source>In: Proceedings of the 17th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing</source>
          <year>2016</year>
          ). LNCS, vol.
          <volume>9624</volume>
          , pp. 156–169
          . Springer-Verlag (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Reyes</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Veale</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>A multidimensional approach for detecting irony in twitter</article-title>
          .
          <source>Language Resources and Evaluation</source>
          <volume>47</volume>
          (
          <issue>1</issue>
          ), 239–268 (
          <year>2013</year>
          ), https://doi.org/10.1007/s10579-012-9196-x
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Rosenthal</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nakov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kiritchenko</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mohammad</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ritter</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stoyanov</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          : Semeval-2015 task 10:
          <article-title>Sentiment analysis in twitter</article-title>
          .
          <source>In: Proceedings of the 9th international workshop on semantic evaluation (SemEval</source>
          <year>2015</year>
          ). pp. 451–463
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Van Hee</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lefever</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hoste</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Semeval-2018 task 3: Irony detection in english tweets</article-title>
          .
          <source>In: Proceedings of The 12th International Workshop on Semantic Evaluation</source>
          . pp. 39–50
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Vaswani</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shazeer</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parmar</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Uszkoreit</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gomez</surname>
            ,
            <given-names>A.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaiser</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Polosukhin</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Attention is all you need</article-title>
          .
          <source>In: Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems</source>
          <year>2017</year>
          , 4-9 December
          <year>2017</year>
          , Long Beach, CA, USA. pp. 6000–6010
          (
          <year>2017</year>
          ), http://papers.nips.cc/paper/7181-attention-is-all-you-need
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>X.:</given-names>
          </string-name>
          <article-title>A re-examination of text categorization methods</article-title>
          .
          <source>In: SIGIR '99: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, August 15-19</source>
          ,
          <year>1999</year>
          , Berkeley, CA, USA. pp. 42–49
          (
          <year>1999</year>
          ), https://doi.org/10.1145/312624.312647
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>