
SiDi-NLP-Team at IDPT2021: Irony Detection in Portuguese 2021

Almeida Neto

onio Manoel dos Santos

Bruna Almeida

Azevedo

Camila de Araujo.

Nobrega

Fernando Antonio Asevedo

Henrico Bertini

Peinado

Luis Heriberto Osinaga

Bittencourt

Marciele de Menezes

da Silva

Nataly Leopoldina Patti

Cortes

Priscila Osorio

This paper presents the submission of the SiDi-NLP team to IDPT 2021 - Irony Detection in Portuguese (IberLEF 2021). Irony detection is a challenging semantic task similar to other tasks such as Sentiment Analysis. Given these similarities, we performed experiments using algorithms that achieved state-of-the-art results on similar semantic tasks in Brazilian Portuguese with a linguistic feature representation, as well as pre-trained BERT models, applied to the two shared task datasets: the Tweet Dataset and the News Dataset. The pre-trained BERT models outperformed the other classifiers, achieving 1.00 accuracy and F1 on the Tweet Dataset, and 0.903 accuracy and 0.900 F1 on the News Dataset. We also discuss our results in light of the results obtained in the shared task.

Keywords: Irony, Sarcasm

1 Introduction

Irony Detection in Portuguese (IDPT 2021) [4] is a shared task co-located with IberLEF 2021 [11] that presents competitors with datasets for the identification of ironic documents in two different domains: tweets and news. Some authors describe the human ability to recognize ironic sentences as "relatively easy way although not always" [8]. The main goal of the task is to assign each document the label False when the whole document contains no irony and True otherwise. Due to the subjectivity of the task, there are several similarities between Irony Detection and other Natural Language Processing tasks, such as Sentiment Analysis and Hate Speech Detection. The main challenge in these scenarios is that irony clues usually relate to different pragmatic contexts, such as the period in which the text was published, external information, and the relation between writer and readers.

Since using pragmatic features in machine learning algorithms usually requires complex modeling, we treated the problem as a text classification problem based on features from Sentiment Analysis methods (Section 2.2). This choice rests on the hypothesis that ironic texts share linguistic features with opinion texts, given the nature of both tasks. Furthermore, to compare the effectiveness of these features against more recent approaches to Natural Language Processing, we also performed experiments with a pre-trained BERT (Bidirectional Encoder Representations from Transformers) model built for the Portuguese language: BERTimbau [13].

Our linguistic features did not outperform BERTimbau, at least on the test dataset released by the shared task organizers. It is worth noting that, in the official report by the organizers, the ranking of the participants changes between the two datasets. This may indicate a large difference between the data distributions of the two datasets or, probably, that the participants used different approaches that performed well in distinct contexts.

2 Experimental Setup

2.1 Corpora

The two corpora used in this work contain texts on different topics written in Brazilian Portuguese and are publicly available.

The first corpus contains 15,212 tweets extracted from the dataset used in an irony and sarcasm detection work [1]. The authors collected potentially ironic tweets containing the hashtags "#ironia" or "#sarcasmo" posted between August 10, 2014 and August 6, 2017. Non-ironic tweets were collected from random tweets about economics, politics, and education that do not contain the hashtags "#ironia" or "#sarcasmo". Additionally, the authors included tweets collected by de Freitas et al. [5] that were manually annotated by Portuguese language experts. This final dataset has many more ironic sentences than non-ironic sentences, as shown by the class distribution in Fig. 1. The dataset was stripped of words and expressions that could serve as cues for the model and interfere with learning, such as links and the "rt", "#ironia", or "#sarcastico" tags.
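As a rough illustration of the clean-up described above, the following Python sketch removes links, retweet markers, and irony hashtags from a tweet. The function name `strip_label_cues` and the exact pattern are assumptions made for illustration only; the procedure actually used by the dataset authors is not specified here.

```python
import re

# Hypothetical cue pattern. The text names links, "rt", "#ironia" and
# "#sarcastico"; "#sarcasmo" is included as an assumption because it was
# one of the hashtags used to collect the tweets.
CUE_PATTERN = re.compile(
    r"https?://\S+|\brt\b|#ironia\b|#sarcasmo\b|#sarcastico\b",
    flags=re.IGNORECASE,
)


def strip_label_cues(tweet: str) -> str:
    """Remove links, retweet markers and irony hashtags, then tidy whitespace."""
    cleaned = CUE_PATTERN.sub(" ", tweet)
    # Collapse the runs of spaces left behind by the removed cues.
    return " ".join(cleaned.split())
```

Removing these tokens matters because a classifier could otherwise learn the collection hashtag itself instead of the linguistic signs of irony.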

The second corpus contains 18,494 news articles extracted from the Sensacionalista, The Piauí Herald, and Estadão websites. The articles were labeled according to the source website: news from Sensacionalista and The Piauí Herald is sarcastic and was therefore labeled as ironic; news from Estadão was labeled as non-ironic [10]. The class distribution is shown in Fig. 2.

2.2 Machine Learning Features

There are several similarities between Irony Detection and Sentiment Analysis due to the nature of the tasks. Following this intuition, we ran baseline experiments on the datasets using a well-known text representation and the pipelines proposed in [2].

The features presented in [2] are the same ones we used for our ML methods, since they let the classifiers observe several signals that may indicate the semantic orientation of a sentence. The representation does not determine the classification by itself; it merely serves as input to the classifiers. The features are:

- Bag-of-Words (BoW): a BoW representation of the data, with the absence or presence of each word encoded as 0 and 1, respectively;
- Presence of negations: in Sentiment Analysis, a negation usually indicates the inversion of a polarity. We used a list of Brazilian Portuguese negations, such as "não" and "nunca", to keep this aspect in our experiments, even though the feature is probably not as important for Irony as it is for Sentiment Analysis;
- Emoticons: an emoticon is a string of characters that together form a pictorial representation carrying semantic relevance for the sentence. We used a list of positive and negative emoticons to map the polarity of the token in the sentence;
- Emojis: similarly to emoticons, we also represented the emojis in the documents. The main difference is that emojis are Unicode characters that smartphones and browsers render as pictures. We used a resource with the polarity of each emoji as a feature [9];
- Sentiment Lexicon: we also provided the ML methods with the count of positive and negative words in the sentence according to Sentilex [12];
- Part-of-Speech tagging: we also counted the number of nouns, adverbs, verbs, and adjectives using the PoS tagger in NLPnet [7]. This feature is especially relevant for identifying adjectives, which are more frequent in opinion sentences and less frequent in factual information.
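The feature set above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline from [2]: the word lists are tiny placeholders standing in for the negation list, the emoticon list, and Sentilex, the function name `extract_features` is hypothetical, and the emoji polarity and PoS counts are left out because they depend on external resources.

```python
from typing import Dict, List, Set

# Placeholder resources (assumptions). The paper relies on full lists,
# Sentilex [12] and an emoji sentiment resource [9] instead.
NEGATIONS: Set[str] = {"não", "nunca", "jamais", "nem"}
POS_EMOTICONS: Set[str] = {":)", ":-)", ":D"}
NEG_EMOTICONS: Set[str] = {":(", ":-(", ":'("}
POS_WORDS: Set[str] = {"bom", "ótimo", "adoro"}
NEG_WORDS: Set[str] = {"ruim", "péssimo", "odeio"}


def extract_features(tokens: List[str], vocabulary: List[str]) -> Dict[str, int]:
    """Map a tokenized document to a flat dictionary of integer features."""
    lowered = [t.lower() for t in tokens]
    present = set(lowered)

    # Binary Bag-of-Words: 1 if the vocabulary word occurs, 0 otherwise.
    features = {f"bow={word}": int(word in present) for word in vocabulary}

    # Presence (not count) of any negation word.
    features["has_negation"] = int(any(t in NEGATIONS for t in lowered))

    # Emoticons are matched on the original tokens because case matters (":D").
    features["pos_emoticons"] = sum(t in POS_EMOTICONS for t in tokens)
    features["neg_emoticons"] = sum(t in NEG_EMOTICONS for t in tokens)

    # Counts of positive and negative lexicon words.
    features["pos_words"] = sum(t in POS_WORDS for t in lowered)
    features["neg_words"] = sum(t in NEG_WORDS for t in lowered)
    return features
```

For example, `extract_features(["Não", "adoro", "segunda", ":)"], ["adoro", "sexta"])` marks "adoro" as present, flags the negation, and counts one positive emoticon and one positive lexicon word. A dictionary of this form can be turned into a fixed-length vector by any standard vectorizer before being passed to a classifier.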

We used these features as input for five ML classifiers: a Support Vector Machine (SVM), Logistic Regression (LR), a Multilayer Perceptron (MLP), a Random Forest (RF), and Naive Bayes (NB). We chose these methods based on [2] and on the results obtained in Sentiment Analysis for Brazilian Portuguese tweets. The parameters were not grid-searched; they are the same as the best configuration reported in the original work [3].

2.3 BERT and BERTimbau

BERTimbau is a pre-trained BERT model for the Portuguese language [13]. BERT [6] is a Transformer encoder architecture that learns contextual relations between words in a text to generate a language model. BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. The pre-trained BERT model can be fine-tuned with just one additional output layer to create specific models for a wide range of NLP tasks, without substantial architecture modifications [6].
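The "one additional output layer" can be written down compactly. The formulation below is the usual one for sequence classification with BERT and is given only as a sketch; the symbols (h for the final hidden state of the [CLS] token, W and b for the weights and bias of the output layer, H for the hidden size, K for the number of classes) are notation introduced here, and the bias term is an assumption that matches common implementations.

```latex
% Classification head on top of the encoder (K = 2 for ironic / non-ironic).
\[
  p(y \mid x) = \operatorname{softmax}\bigl(W\,h_{\texttt{[CLS]}} + b\bigr),
  \qquad W \in \mathbb{R}^{K \times H},\; b \in \mathbb{R}^{K}
\]
% Fine-tuning minimizes the cross-entropy of the gold label y^{*},
% updating both the head (W, b) and the pre-trained encoder weights.
\[
  \mathcal{L}(x, y^{*}) = -\log p(y^{*} \mid x)
\]
```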

To create a Portuguese version of BERT, the authors of BERTimbau used data from brWaC [14], the largest open Portuguese corpus, which contains 2.68 billion tokens from 3.53 million documents (web pages). They trained BERTimbau models in two sizes: Base (12 layers, 768 hidden dimensions, 12 attention heads, and 110M parameters) and Large (24 layers, 1024 hidden dimensions, 16 attention heads, and 330M parameters) [13].

In this work, we fine-tuned the Base version of BERTimbau to classify ironic sentences. The parameters were not grid-searched and are the same as in the original work.

3 Results

Since we observed a large variation between the two datasets, and the IDPT organizers also treated the submissions for each dataset separately, we report the results for each dataset individually.

Table 1 and Table 2 present the values of Accuracy and F-score obtained by each method in each dataset. To facilitate the comparison of the results, the best F-score for each dataset was highlighted in bold.
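For reference, both metrics can be computed from the gold labels and the predictions as in the Python sketch below. The text does not state which F-score variant was used, so the sketch assumes the binary F1 for the ironic (True) class; the function name `accuracy_and_f1` is hypothetical.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(y_true: Sequence[bool], y_pred: Sequence[bool]) -> Tuple[float, float]:
    """Return (accuracy, F1 of the True class) for two aligned label sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)        # ironic, predicted ironic
    fp = sum(1 for t, p in pairs if not t and p)    # non-ironic, predicted ironic
    fn = sum(1 for t, p in pairs if t and not p)    # ironic, predicted non-ironic
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when there are no positives at all.
    denominator = 2 * tp + fp + fn
    f1 = (2 * tp / denominator) if denominator else 0.0
    return accuracy, f1
```

Reporting both values is useful on imbalanced data such as the tweet corpus, where accuracy alone can look high even if one class is predicted poorly.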

Multilingual-BERT and BERTimbau stood out in both F-score and Accuracy on both datasets. These methods also correctly predicted all test samples from the Tweets dataset. Analyzing this dataset, we noticed that the non-ironic examples had a limited vocabulary related to economic news, which suggests that the models were biased by the training data. Since BERTimbau performed better on the News dataset, it was chosen as the model for the final predictions.

4 Conclusion

It is interesting to note that the teams participating in IDPT 2021 did not achieve equivalent results across the two competition datasets. For instance, the submissions of the team "TeamBERT4Ever" achieved the highest results on news but did not perform well on the tweets dataset. The same behavior can be observed in our submission, whose relative results on news are better than those on tweets. Likewise, the models not based on deep learning achieved better results on the news dataset than on the tweets dataset.

Acknowledgement

We acknowledge SiDi for all the support with the infrastructure for the experiments and the work environment for the development of this work.

References

1. Fabio Araujo da Silva. Detecção de ironia e sarcasmo em língua portuguesa: uma abordagem utilizando deep learning. Bachelor's thesis, Universidade Federal de Mato Grosso, February 2018.

2. Lucas V Avanço, Henrico B Brum, and Maria G V Nunes. Improving opinion classifiers by combining different methods and resources. XIII Encontro Nacional de Inteligência Artificial e Computacional (ENIAC), pages 25-36, 2016.

3. Henrico Bertini Brum and Maria das Graças Volpe Nunes. Semi-supervised sentiment annotation of large corpora. In International Conference on Computational Processing of the Portuguese Language, pages 385-395. Springer, 2018.

4. Ulisses Brisolara Corrêa, Leonardo Pereira dos Santos, Leonardo Coelho, and Larissa A. de Freitas. Overview of the IDPT Task on Irony Detection in Portuguese at IberLEF 2021. Procesamiento del Lenguaje Natural, 67, 2021.

5. Larissa Astrogildo de Freitas, Aline Aver Vanin, Denise Nauderer Hogetop, Marco Nemetz Bochernitsan, and Renata Vieira. Pathways for irony detection in tweets. In Proceedings of the 29th Annual ACM Symposium on Applied Computing, SAC '14, pages 628-633, New York, NY, USA, 2014. Association for Computing Machinery.

6. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: pre-training of deep bidirectional transformers for language understanding. CoRR, abs/1810.04805, 2018.

7. Erick R Fonseca, João Luís G Rosa, and Sandra Maria Aluísio. Evaluating word embeddings and a revised corpus for part-of-speech tagging in Portuguese. Journal of the Brazilian Computer Society, 21(1):1-14, 2015.

8. Irazú Hernández-Farías, José-Miguel Benedí, and Paolo Rosso. Applying basic features from sentiment analysis for automatic irony detection. In Iberian Conference on Pattern Recognition and Image Analysis, pages 337-344. Springer, 2015.

9. Petra Kralj Novak, Jasmina Smailović, Borut Sluban, and Igor Mozetič. Sentiment of emojis. PLoS ONE, 10(12):e0144296, 2015.

10. Gabriel Schubert Marten and Larissa Astrogildo de Freitas. The construction of a corpus for detecting irony and sarcasm in Portuguese. In Proceedings of XVII Encontro Nacional de Inteligência Artificial e Computacional (ENIAC-2020), pages 709-717, Rio Grande, Brazil, 2020.

11. Manuel Montes, Paolo Rosso, Julio Gonzalo, Ezra Aragon, Rodrigo Agerri, Miguel Angel Alvarez Carmona, Elena Alvarez Mellado, Jorge Carrillo de Albornoz, Luis Chiruzzo, Larissa Freitas, Helena Gomez Adorno, Yoan Gutierrez, Salud María Jimenez Zafra, Salvador Lima, Flor Miriam Plaza de Arco, and Mariona Taule. Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2021). CEUR Workshop Proceedings, 2021.

12. Mario J Silva, Paula Carvalho, and Luís Sarmento. Building a sentiment lexicon for social judgement mining. In International Conference on Computational Processing of the Portuguese Language, pages 218-228. Springer, 2012.

13. Fabio Souza, Rodrigo Nogueira, and Roberto Lotufo. BERTimbau: Pretrained BERT models for Brazilian Portuguese. In Ricardo Cerri and Ronaldo C. Prati, editors, Intelligent Systems, pages 403-417, Cham, 2020. Springer International Publishing.

14. Jorge A. Wagner Filho, Rodrigo Wilkens, Marco Idiart, and Aline Villavicencio. The brWaC corpus: A new open resource for Brazilian Portuguese. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, May 2018. European Language Resources Association (ELRA).