<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>CISUC at IDPT2021: Traditional and Deep Learning for Irony Detection in Portuguese</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>CISUC, Department of Informatics Engineering, University of Coimbra</institution>
          ,
          <addr-line>Coimbra</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>These notes describe the participation of the CISUC team in the IDPT 2021 shared task. Irony detection was tackled as a text classification task, where both traditional and transformer-based (BERT) approaches were explored. The former performed reasonably well, but not everything went as planned: the results achieved by BERT were not evaluated, due to an issue with our official submissions. Nevertheless, we still discuss some of the options taken, identify important features, and present validation results on the training data.</p>
      </abstract>
      <kwd-group>
        <kwd>Irony Detection</kwd>
        <kwd>Portuguese</kwd>
        <kwd>Text Classification</kwd>
        <kwd>Transformers</kwd>
        <kwd>Logistic Regression</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Irony is a rhetorical device where interpretation should not be literal [18], because
its meaning diverges significantly from, and is often the opposite [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] of, the
intended meaning. Irony detection is a subtask of Natural Language Processing
aiming at the automatic classification of texts as ironic or not, and is extremely
relevant for tasks like Sentiment Analysis and Opinion Mining [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. But irony
detection can be challenging, even for humans, who often rely on visual clues,
like facial expression or tone [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], for recognising irony. This is especially true
when irony is expressed through text only, despite studies on identifying textual
clues for irony detection [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>Irony detection has been tackled by several Natural Language
Processing (NLP) researchers, who adopted different approaches. In 2018, there was
a SemEval task on Irony Detection in English Tweets [18] that covered the
binary classification of tweets as ironic or not. The best systems adopted a deep
learning approach, e.g., a densely connected LSTM neural network, based on pre-trained
static word embeddings, with syntactic and sentiment features [19]. But there
were also more traditional approaches, e.g., an ensemble classifier with Logistic</p>
      <p>
        Regression (LR) and a Support Vector Machine (SVM), considering pretrained
word and emoji embeddings, as well as handcrafted sentiment and word-based
features [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Since then, as it happened for other NLP tasks, pre-trained
language models such as BERT [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], or variations like RoBERTa [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], were exploited
for obtaining contextualized embeddings, which can be combined with a
classifier, e.g., a recurrent Convolutional Neural Network with an LSTM layer [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>
        This paper describes the participation of a team from the Center of
Informatics and Systems of the University of Coimbra (CISUC) in the Irony Detection in
Portuguese (IDPT) task [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], included in the 2021 edition of the Iberian Languages
Evaluation Forum (IberLEF). This was the first time we tackled irony detection,
but our interest follows previous work on text classification of Portuguese text,
specifically emotions [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and humour [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        Given that annotated datasets were made available by IDPT's organisation,
one with tweets and another with news, we tackled IDPT with a supervised
machine learning approach. Classifiers were learned from the training data and used
for classifying the test data, then submitted for evaluation. Yet, our first step
was to look at the data, in order to become familiar with this domain.
In the process, we noted some patterns and learned about the sources of the
training data, which led to a data cleaning process, described in Section 2.
Following this, we decided to explore both traditional text classification approaches
and a more recent deep learning approach. The former required us to set
some parameters, including the number of features, but it also enabled us to
analyse and learn about the most important features for irony, at least in the
provided datasets. For this approach, Section 3 provides some insights on that
process, including the addition of lexicon features. The same section
describes the deep learning approach, based on the popular transformer-based
architecture BERT [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. For both approaches, we present the results of validation
in the training data.
      </p>
      <p>Before concluding, Section 4 has a brief discussion on the official results of
the selected classifiers in IDPT. Unfortunately, due to our own mistake in
the submission process, classifications by the BERT classifiers were not properly
evaluated, which made it impossible for us to know their real performance on the
test dataset. On the other hand, the performance of the traditional approach,
based on Logistic Regression, was good enough for an approach that could be
seen as a baseline. This conclusion is mostly based on the results in the news
dataset. From the performance in the tweets dataset it is hard to draw
conclusions. Even though the majority of tweets was automatically classified as ironic,
according to the evaluation metrics, more than half were not. Apparently, this
issue was common to all participants.</p>
    </sec>
    <sec id="sec-2">
      <title>Data</title>
      <p>Our starting point was the data provided by IDPT's organisation, namely 15,213
tweets and 18,495 news documents, labelled as ironic (1) or non-ironic (0), which
we used for training our models. Test data comprised 300 unlabelled tweets and
300 news documents. For evaluation purposes, test data had to be submitted
with automatically-assigned labels.</p>
      <p>While analysing the aforementioned datasets, we immediately noticed what
could be a discrepancy between training and test data for tweets. Unlike the
test data, training data contained few to no emojis, hashtags (#), user
mentions (@), or URLs, and no line breaks. Having in mind that this could have a
negative impact on the classification task, and that some of those features could
be relevant for irony detection, we tried to understand the differences.</p>
      <p>
        During this process, we learned about the criteria adopted in the creation of
the training data, after reading some of the references provided by the
organisation [
        <xref ref-type="bibr" rid="ref15 ref5">5, 15</xref>
]. Specifically, we found the dataset created in the scope of da Silva's
BSc thesis [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], which seemed to cover most of the tweets of the training data.
However, this dataset was available1 in a slightly different format, where some of
the aforementioned missing items were either directly present in the textual content
or could be recovered from additional properties.
      </p>
      <p>One such property was the tweet ID, which enabled us to retrieve most of
the original tweets through Twitter's API. With this, we confirmed the
hashtag-based criteria adopted for automatically labelling the dataset:
– Ironic tweets were those containing the hashtags #ironia or #sarcasmo;
– Non-Ironic tweets were those containing #economia, #politica or
#educação.</p>
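      <p>This self-labelling criterion can be sketched as a small function; this is our illustration of the described rule, not da Silva's actual code, and the exact token-level hashtag matching is our assumption:</p>

```python
IRONIC_TAGS = {"#ironia", "#sarcasmo"}
NON_IRONIC_TAGS = {"#economia", "#politica", "#educação"}

def hashtag_label(tweet):
    """Return 1 for tweets with an irony hashtag, 0 for tweets with one of
    the topic hashtags, and None when no labelling hashtag is present."""
    tokens = {t.lower() for t in tweet.split()}
    if tokens & IRONIC_TAGS:
        return 1
    if tokens & NON_IRONIC_TAGS:
        return 0
    return None
```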
      <p>Based on da Silva's thesis, we made our own pre-processing of the dataset, which
included: the complete removal of all five hashtags above; and the normalisation of
user mentions and URLs, respectively replaced by @user and @link2. Table 1
illustrates this with tweets in the training dataset, provided by the
organisation, the original tweets as published on Twitter, and the result after our
pre-processing. Differences from the provided dataset were the inclusion of emojis,
the complete removal of hashtags used for non-ironic tweets (e.g., #economia),
as well as the normalisations.</p>
    </sec>
    <sec id="sec-3">
      <title>Approaches</title>
      <p>This section describes both approaches adopted in our participation in the IDPT
task, a traditional machine learning approach, which could be seen as a baseline,
and a deep learning approach based on BERT. Moreover, for each approach,
validation results are presented and, for the traditional approach, we take a look
at important features considered for detecting irony.</p>
      <sec id="sec-3-1">
        <title>1 https://github.com/fabio-ricardo/deteccao-ironia</title>
        <p>
          2 We later noticed that using the '@' character was not the best option, because some
tokenizers split it from the following word. However, the impact for Portuguese
should still be minimal, because these words are in English.
Different traditional machine learning classifiers, implemented in the Python
library scikit-learn [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], were trained and validated in different splits of the
training data. For this purpose, documents were represented by TF-IDF
vectors, also resorting to scikit-learn's TfidfTransformer. Portuguese stopwords in
the NLTK list were ignored in this process and different parameters were tested,
namely the n-gram range, maximum document frequency, minimum document
frequency, and maximum number of features. While experimenting in the
training datasets, we decided on setting:
– N-gram range to 1 (unigrams), as we saw no improvements with bigrams;
– The maximum document frequency to 0.5, meaning that tokens occurring
in more than half of the documents in the collection were ignored, for not
being discriminant enough;
– The minimum document frequency to 3, meaning that tokens occurring in
only one or two documents were ignored, for not being frequent enough.
We also tested different values for the maximum number of features.
Cross-Validation Tables 2 and 3 report on the performance of three different
classifiers in a 10-fold cross-validation, respectively in the tweets and news
training datasets, using different numbers of features (500, 1,500 and 5,000).
The three classifiers used were Logistic Regression (LR), Naive Bayes (NB), and
Random Forest (RF), all white-box, and the metrics considered were: Balanced
Accuracy (BAcc), for being the official measure of IDPT; Precision, Recall, and
F1 score (F1).
        </p>
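        <p>The setup above can be sketched with scikit-learn. This is a minimal illustration under our own assumptions: the helper name and the five-word stop-word list are stand-ins (the paper uses the NLTK Portuguese list), and TfidfVectorizer bundles token counting with the TfidfTransformer step mentioned above:</p>

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny stand-in for NLTK's Portuguese stop-word list.
STOPWORDS = ["de", "a", "o", "que", "e"]

def build_irony_classifier(max_features=1500):
    """TF-IDF + Logistic Regression pipeline with the parameters chosen
    in the paper: unigrams only, max_df=0.5, min_df=3, capped features."""
    vectorizer = TfidfVectorizer(
        ngram_range=(1, 1),        # unigrams only
        max_df=0.5,                # drop tokens in over half the documents
        min_df=3,                  # drop tokens in fewer than 3 documents
        max_features=max_features,
        stop_words=STOPWORDS,
    )
    return make_pipeline(vectorizer, LogisticRegression(max_iter=1000))
```

The same pipeline can be passed to scikit-learn's cross_val_score with scoring="balanced_accuracy" to reproduce the 10-fold validation setting.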
        <p>
          Achieved performances are interesting for a baseline. As expected,
performance is slightly higher for news, which should be more formal, than for tweets,
where several conventions are broken. But top F1 scores of 89% and 97% may
actually suggest that irony detection is not that hard, especially in formal text.
However, these performances are achieved with 5,000 features, which probably
leads to models that are overfitted to the training dataset. Therefore, having
in mind that the documents in the test datasets were extracted from a different
time period than the training data, and there could be significant vocabulary
differences (e.g., due to different trending topics), we decided to consider not more
than 1,500 features for our submission. Validation performances suggest that a
lower number of features has a bigger impact for NB and that, on the other
hand, LR is less affected. In fact, for tweets, LR performs better with 1,500 than
with 5,000 features. Adding to the simplicity of LR and to its best performance
in the news dataset, we decided to use LR in our official IDPT runs.
Lexicon-based features We further decided to explore additional features
that we thought could be useful for irony detection, namely:
– Concreteness and imageability scores, obtained from the Minho Word Pool
norms [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], where 3,800 Portuguese words have averages of such properties,
from 1 to 7, assigned by several judges;
– Sentiment and emotion features, acquired from the NRC Emotion lexicon [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ],
where such features (0 or 1) are assigned to 14,182 English words through
crowdsourcing, then translated to other languages, including Portuguese.
This resulted in ten extra features, averaged for each document: Concreteness,
Imageability, Positive, Negative, Anger, Anticipation, Disgust, Fear, Joy, Trust.
Our intuition is that these could complement the TF-IDF features, because,
indirectly, they end up covering a larger vocabulary, more focused and independent
of the training data, and may thus lead to less overfitting. Since the entries in
the previous lexicons are all lemmas, for computing these features, documents
were first lemmatized, using the Portuguese models of the Stanza [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] package.
        </p>
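        <p>A minimal sketch of how such averaged features can be computed, assuming lemmatization has already been done (the paper uses Stanza for this) and choosing, as one simple option not stated in the paper, to count out-of-lexicon lemmas as 0:</p>

```python
def lexicon_features(lemmas, lexicon):
    """Average each lexicon dimension over the lemmas of one document.

    `lexicon` maps a lemma to a dict of scores (e.g. concreteness,
    joy, ...); lemmas missing from the lexicon contribute 0.
    """
    dims = sorted({d for scores in lexicon.values() for d in scores})
    feats = {}
    for d in dims:
        vals = [lexicon.get(lemma, {}).get(d, 0.0) for lemma in lemmas]
        feats[d] = sum(vals) / len(vals) if vals else 0.0
    return feats
```

In the full pipeline, the resulting ten averages are simply appended to the TF-IDF vector of each document.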
        <p>Table 4 shows the performance of the LR classifier using only the extra
features or adding them to the 1,500 TF-IDF features. When used alone, their
impact is irrelevant for the Tweets, but they seem to make a difference for the
News. Alone, they achieve an F1 of 0.71, but when combined with TF-IDF, F1
drops by 1 point.</p>
        <p>Our option for including these features anyway is further supported by an
analysis of their importance coefficients in an LR classifier that learned from them
only, and of their values in documents of different classes, especially in the News
dataset. For instance, ironic news express slightly more joy, negativity, disgust
and anticipation, and also score higher on imageability.</p>
        <p>Feature Importance After training an LR classifier, each feature has an
importance coefficient, which can be useful for interpretation. Table 5 shows the
most important features when the previous classifier is trained on the training
datasets, as well as the number of documents where they occur and the
proportion classified as ironic. Some interesting insights can be observed. For instance,
most tweets with user mentions (normalised as '@user') are ironic, and so are
more "extreme" tweets that use words like 'adoro', 'tudo' or 'nada'. As for the
news, many relevant features for irony are names of politicians, suggesting that
they are common targets of irony, or were during the time-span the data was
collected. Other features include words that typically appear before a citation,
namely 'disse' and 'explicou'.</p>
        <p>
          Transformer-Based Approach
BERT [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] is a transformer-based model widely used in Natural Language
Processing since its release by Google. It is pretrained on two general language tasks,
masked language modelling and next sentence prediction, but can be fine-tuned
for other tasks, including text classification, which is our case.
Fine-tuning Our starting point was BERTimbau [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], i.e., BERT Base
Portuguese Cased (BERT-PT), a model with 110M parameters, pretrained by
Neuralmind exclusively for (Brazilian) Portuguese. In order to fine-tune this model
for irony classification, we used the BertForSequenceClassification class of the
Transformers library3, which adds a classification head on top of BERT.
Parameters for this model were empirically selected, namely: batch size of 16 for the
tweets and 8 for the news, due to memory limitations; and the Adam optimizer4, for
being the common option, with lr=2e-5 and eps=1e-8.
        </p>
        <p>Text Size We quickly came across a limitation on the text size, i.e., some
documents were longer than the maximum number of tokens that BERT can
handle (510 word pieces, plus the initial [CLS] and the final [SEP] tokens).
Figure 1 shows the distribution of documents according to the number of tokens in
both training sets. As expected, exceeding the limit is much more frequent in the news dataset,
as news articles tend to be longer than tweets. Still, after careful analysis and
deliberation, we assumed that the proportion of documents that exceeded the
limit of tokens was insignificant and deemed that their absence would not
produce a noticeable change in the model's overall performance. This left us with
two choices: remove the longer documents from the dataset or truncate them.
We chose the latter for several reasons. The first 510 tokens of each document
would still be relevant for irony detection and, this way, the classifier would learn
from all the data. Moreover, documents in the test dataset could also exceed this
limit and we could not simply remove them.</p>
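        <p>The truncation step can be sketched as follows, assuming the input is already a list of word pieces; in practice a BERT tokenizer handles the special tokens internally, so this is only an illustration of the limit:</p>

```python
def truncate_for_bert(tokens, max_tokens=510):
    """Keep only the first 510 word pieces so that, together with the
    [CLS] and [SEP] special tokens, the sequence fits BERT's 512 limit."""
    return ["[CLS]"] + tokens[:max_tokens] + ["[SEP]"]
```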
        <p>Validation results In order to select the aforementioned parameters, the
BERT-based classifier was validated on the training dataset. For this, we used
60% of the data for training, 10% for validation and 30% for testing. Table 6
summarises the performance of the best models of each kind.</p>
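        <p>The 60/10/30 split can be obtained with two calls to scikit-learn's train_test_split; this is a sketch, and the seed and stratification are our own choices, not stated in the paper:</p>

```python
from sklearn.model_selection import train_test_split

def split_60_10_30(texts, labels, seed=42):
    """60% train / 10% validation / 30% test, as used for validating
    the BERT-based classifier."""
    x_train, x_rest, y_train, y_rest = train_test_split(
        texts, labels, test_size=0.4, random_state=seed, stratify=labels)
    # The remaining 40% is split 10/30, i.e., 25% validation / 75% test.
    x_val, x_test, y_val, y_test = train_test_split(
        x_rest, y_rest, test_size=0.75, random_state=seed, stratify=y_rest)
    return (x_train, y_train), (x_val, y_val), (x_test, y_test)
```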
        <p>Validation performances achieved with BERT are very high in both datasets
and outperform the already high F1 of the best traditional approaches,
confirming that BERT is a very powerful model. Additional experiments were performed</p>
      </sec>
      <sec id="sec-3-2">
        <title>3 See https://huggingface.co/transformers/model_doc/bert.html 4 See https://huggingface.co/transformers/main_classes/optimizer_schedules.html</title>
        <p>with balanced versions of the datasets, obtained with undersampling, but they
did not lead to further improvements. We should, nevertheless, recall that these
results may be overfitted to the training data.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Results</title>
      <p>Despite all the experiments performed with BERT and our positive
expectations regarding their high performance, due to a mistake in our submission5,
it is impossible for us to draw any conclusion on the real performance of the
BERT-based models and on their comparison to other approaches, including our
traditional approach, at least until the labels of the test data are revealed.</p>
      <p>As for the traditional approach, official results are in Table 7. We recall that
the model used is based on LR, with 1,500 token features plus 10 lexicon-based
features. It achieved sixth position overall in the tweets dataset, but it was not
much different from other participants, nor from our BERT submissions where
the labels were shuffled. In fact, when analysing our labels, we note that the
majority of the tweets were classified as ironic, which results in a recall close to
100%. However, precision is lower than 50%, suggesting that about half of the
tweets were not ironic and should not have been classified as such.</p>
      <p>
        Until the test labels are revealed, it is not possible to make an error
analysis. However, we believe that performances in the tweet dataset were harmed
5 More precisely, our script had a switch for shuffling the data to label, which was on
when the test data was classified, meaning that the submitted labels were not in the
expected order for the official evaluation.
by the criteria adopted in the creation of the data, which we suspect to have
diverged between training and test. For instance, in da Silva's thesis [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], one of the
criteria for labelling tweets as ironic was the presence of the hashtag #ironia,
i.e., all the tweets using this hashtag were considered to be positive examples of
ironic tweets, and could be included in the training data as such. Yet, when we
search Twitter for the tweets of the IDPT test data, all of them use the previous
hashtag, even if, according to the results, more than half were not labelled as
ironic. This was probably the result of manual analysis, and should definitely be
more accurate than da Silva's criteria, which would automatically label them as
ironic. However, this also means that the provided training data was misleading
and, as we have seen, classifiers trained on such data are not apt for correctly
labelling IDPT's tweet test data.
      </p>
      <p>On the news data, our traditional approach achieved the eighth position
overall, with a BAcc that was 12 points below the best run. The name of the
team that submitted the three best runs (TeamBERT4EVER) suggests that they
used BERT, which confirms that irony detection is one more task where BERT
is currently the way to achieve the state of the art. Nevertheless, we highlight
that, in the traditional approach, a white-box model was used, which enabled us
to learn a bit more about how irony is expressed in the datasets, e.g., important
features, most of which we would not immediately associate with irony.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>We have described the participation of the CISUC team in the IDPT 2021 shared
task. Despite our issues with the BERT models, the balance is still positive. This
participation led to the application of known approaches to a new challenge, it
made us think about the relevance of irony, and taught us a little bit about the
way it is expressed in Portuguese.</p>
      <p>We tackled this challenge as a text classification task and explored both
traditional and deep learning approaches. As expected, deep learning seems to be the
best path to achieve top performances, and BERT is definitely a solid model for
attempting state-of-the-art results. Still, sometimes, learning about language
and how it works is at least as important as achieving the best performances,
and white-box models are much more accessible for this purpose. Unfortunately,
until the labels of the test data are revealed, we cannot compare the
performance of the latter with our BERT-based approaches in a real scenario, and
thus cannot analyse the trade-off. The same happens to the comparison with the
runs of other teams. In the news test set, our LR-based approach was ranked
eighth, and it will definitely be interesting to learn about the other approaches,
once the proceedings of IDPT are published.</p>
      <p>Now that we have had our first contact with this topic, there are plenty of ideas for
future work. A possible direction would be studying to what extent it is possible
to learn a general classifier of irony, not suited to a specific type of text or
time-span. We did train a BERT-based model on both training datasets (tweets
and news), but it was one of the corrupt submissions. A train-validation-test run on
the previous dataset led to a surprisingly high performance, i.e., comparable
to the performance of the type-specific models. But stronger conclusions can
only be taken once we actually run the model on the test data. Still, more than
learning from two (or more) different types of text, a general classifier would
also have to be trained on texts published in different time-spans. The latter
are relevant, because classifiers will learn from the used vocabulary and, during
specific time periods, some entities (e.g., politicians, athletes, organisations) can
be or become a preferred target of irony, thus skewing the model's evaluation of
the words associated with these entities.</p>
      <p>Another interesting direction would be a deeper analysis of the actual impact
of different features, not only tokens, but also n-grams, case (upper or lower),
punctuation, emojis and lexicon-based features, among others. Besides
possibly improving the traditional approaches, some of those features could also be
appended to the inputs of the BERT-based classifiers.</p>
      <p>
        Finally, one could exploit other available corpora for irony detection, possibly
starting with a corpus of humour [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which typically resorts to irony. More data
could also be retrieved from Twitter, possibly relying on additional heuristics
for self-labelling (e.g., specific emojis). However, given the specificities of irony,
the quality of the data is especially important. Therefore, any automatically-created
dataset should be manually revised.
      </p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>This work is partially funded by national funds through the FCT –
Foundation for Science and Technology, I.P., within the scope of the project CISUC
– UID/CEC/00326/2020 and by the European Social Fund, through the Regional
Operational Program Centro 2020.
18. Van Hee, C., Lefever, E., Hoste, V.: SemEval-2018 task 3: Irony detection in
English tweets. In: Proceedings of The 12th International Workshop on Semantic
Evaluation. pp. 39–50 (2018)
19. Wu, C., Wu, F., Wu, S., Liu, J., Yuan, Z., Huang, Y.: THU NGN at SemEval-2018
task 3: Tweet irony detection with densely connected LSTM and multi-task
learning. In: Proceedings of 12th International Workshop on Semantic Evaluation. pp.
51–56. Association for Computational Linguistics, New Orleans, Louisiana (2018)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Carvalho</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sarmento</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Silva</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Oliveira</surname>
          </string-name>
          , E.:
          <article-title>Clues for detecting irony in user-generated contents: oh</article-title>
          ...!
          <article-title>! it's "so easy" ;-)</article-title>
          .
          <source>In: Proceedings of the 1st international CIKM workshop on Topic-sentiment analysis for mass opinion</source>
          . pp.
          <volume>53</volume>
          –
          <issue>56</issue>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2. Corrêa, U.B.,
          <string-name>
            <surname>dos Santos</surname>
            ,
            <given-names>L.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Coelho</surname>
          </string-name>
          , L.,
          <string-name>
            <surname>de Freitas</surname>
            ,
            <given-names>L.A.</given-names>
          </string-name>
          :
          <article-title>Overview of the IDPT task on Irony Detection in Portuguese at IberLEF 2021</article-title>
          .
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>67</volume>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Devlin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>M.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toutanova</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          : BERT:
          <article-title>Pre-training of deep bidirectional transformers for language understanding</article-title>
          . In: Procs.
          <article-title>of 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)</article-title>
          . pp.
          <volume>4171</volume>
          –
          <fpage>4186</fpage>
          . Association for Computational Linguistics (
          <year>Jun 2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Duarte</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Macedo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gonçalo</surname>
            <given-names>Oliveira</given-names>
          </string-name>
          , H.:
          <article-title>Exploring emojis for emotion recognition in Portuguese text</article-title>
          .
          <source>In: Proceedings of 19th EPIA Conference on Artificial Intelligence</source>
          ,
          <source>EPIA</source>
          <year>2019</year>
          ,
          <string-name>
            <surname>Vila</surname>
            <given-names>Real</given-names>
          </string-name>
          ,
          <source>Portugal, September 3-6</source>
          ,
          <year>2019</year>
          , Proceedings,
          <string-name>
            <surname>Part I. LNCS</surname>
          </string-name>
          /LNAI, vol.
          <volume>11805</volume>
          , pp.
          <volume>719</volume>
          –
          <fpage>730</fpage>
          . Springer (
          <year>September 2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5. de Freitas,
          <string-name>
            <given-names>L.A.</given-names>
            ,
            <surname>Vanin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.A.</given-names>
            ,
            <surname>Hogetop</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.N.</given-names>
            ,
            <surname>Bochernitsan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.N.</given-names>
            ,
            <surname>Vieira</surname>
          </string-name>
          , R.:
          <article-title>Pathways for irony detection in tweets</article-title>
          .
          <source>In: Proceedings of the 29th Annual ACM Symposium on Applied Computing</source>
          . pp.
          <volume>628</volume>
          –
          <issue>633</issue>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Gonçalo</given-names>
            <surname>Oliveira</surname>
          </string-name>
          ,
          <string-name>
            <surname>H.</surname>
          </string-name>
          , Clemêncio,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Alves</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          :
          <article-title>Corpora and baselines for humour recognition in Portuguese</article-title>
          .
          <source>In: Proceedings of 12th International Conference on Language Resources and Evaluation</source>
          . pp.
          <fpage>1278</fpage>
          -
          <lpage>{</lpage>
          1285.
          <source>LREC</source>
          <year>2020</year>
          , ELRA, Marseille, France (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Grice</surname>
            ,
            <given-names>H.P.</given-names>
          </string-name>
          :
          <article-title>Logic and conversation</article-title>
          . In: Speech acts, pp.
          <fpage>41</fpage>
          –
          <lpage>58</lpage>
          . Brill (
          <year>1975</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Sentiment analysis and opinion mining</article-title>
          .
          <source>Synthesis Lectures on Human Language Technologies</source>
          <volume>5</volume>
          (
          <issue>1</issue>
          ),
          <fpage>1</fpage>
          –
          <lpage>167</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Mohammad</surname>
            ,
            <given-names>S.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Turney</surname>
            ,
            <given-names>P.D.</given-names>
          </string-name>
          :
          <article-title>Crowdsourcing a word-emotion association lexicon</article-title>
          .
          <source>Computational Intelligence</source>
          <volume>29</volume>
          (
          <issue>3</issue>
          ),
          <fpage>436</fpage>
          –
          <lpage>465</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Pedregosa</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Varoquaux</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gramfort</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Michel</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thirion</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grisel</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blondel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prettenhofer</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weiss</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dubourg</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vanderplas</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Passos</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cournapeau</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brucher</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perrot</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duchesnay</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Scikit-learn: Machine learning in Python</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>12</volume>
          ,
          <fpage>2825</fpage>
          –
          <lpage>2830</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Potamias</surname>
            ,
            <given-names>R.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Siolas</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stafylopatis</surname>
            ,
            <given-names>A.G.</given-names>
          </string-name>
          :
          <article-title>A transformer-based approach to irony and sarcasm detection</article-title>
          .
          <source>Neural Computing and Applications</source>
          <volume>32</volume>
          (
          <issue>23</issue>
          ),
          <fpage>17309</fpage>
          –
          <lpage>17320</lpage>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Qi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bolton</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          :
          <article-title>Stanza: A Python natural language processing toolkit for many human languages</article-title>
          .
          <source>In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Rohanian</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taslimipoor</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Evans</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mitkov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>WLV at SemEval-2018 task 3: Dissecting tweets in search of irony</article-title>
          .
          <source>In: Proceedings of 12th International Workshop on Semantic Evaluation</source>
          . pp.
          <fpage>553</fpage>
          –
          <lpage>559</lpage>
          . Association for Computational Linguistics, New Orleans, Louisiana (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Sarmento</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carvalho</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Silva</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Oliveira</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Automatic creation of a reference corpus for political opinion mining in user-generated content</article-title>
          .
          <source>In: Proceedings of the 1st international CIKM workshop on Topic-sentiment analysis for mass opinion</source>
          . pp.
          <fpage>29</fpage>
          –
          <lpage>36</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>da Silva</surname>
            ,
            <given-names>F.R.A.</given-names>
          </string-name>
          :
          <article-title>Detecção de ironia e sarcasmo em língua portuguesa: Uma abordagem utilizando deep learning</article-title>
          .
          <source>Tech. rep.</source>
          , Universidade Federal de Mato Grosso (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Soares</surname>
            ,
            <given-names>A.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Costa</surname>
            ,
            <given-names>A.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Machado</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Comesaña</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oliveira</surname>
            ,
            <given-names>H.M.</given-names>
          </string-name>
          :
          <article-title>The Minho Word Pool: Norms for imageability, concreteness, and subjective frequency for 3,800 Portuguese words</article-title>
          .
          <source>Behavior Research Methods</source>
          <volume>49</volume>
          (
          <issue>3</issue>
          ),
          <fpage>1065</fpage>
          –
          <lpage>1081</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Souza</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nogueira</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lotufo</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>BERTimbau: Pretrained BERT models for Brazilian Portuguese</article-title>
          .
          <source>In: Proceedings of the Brazilian Conference on Intelligent Systems (BRACIS 2020)</source>
          . LNCS, vol.
          <volume>12319</volume>
          , pp.
          <fpage>403</fpage>
          –
          <lpage>417</lpage>
          . Springer, Cham (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>