Introduction

An embedding-based approach for irony detection in Arabic tweets

Leila Moudjari

Karima Akli-Astouati

University of Science and Technology Houari Boumediene, Research in Intelligent Computing, Mathematics and Applications, RIIMA laboratory

People write on a wide range of topics, and they sometimes express disagreement through sarcasm or irony. Irony is a common way of communicating an opinion about a particular target on social media, and it therefore affects opinion analysis. In this paper, we present our results for the IDAT2019 task: Irony Detection in Arabic Tweets, for which a labeled dataset of Arabic tweets was shared. We summarize the methods, resources, and tools we used, focusing on the techniques and resources that gave the best results in our tests.

Keywords: irony detection, machine learning, feature selection, classification, CNN

Related work

To capture the irony expressed in a text, we explore embeddings, a semantic layer that can provide more information about the context. In this section, we therefore present related work on irony detection and on embedding models.

2.1 Irony detection

In recent years, several approaches have been proposed to deal with irony detection in social media ([4], [2], [5]).

Irony detection has been treated as a classification problem, where classifiers such as decision trees and support vector machines (SVM) give the best results ([3]).

For Arabic, fewer attempts to address irony have been reported in the literature. According to [11], there are no automatic approaches to detect irony in Arabic.

In [12], the authors manually analyzed the similarities and differences between ironic expressions in English and Arabic. They used data from books, articles and the Internet.

Karoui et al. [6] used multiple features such as surface, sentiment, false assertion and exaggeration to infer the context needed to detect irony in Arabic social media texts.

Such work encourages exploring sentiment analysis approaches and applying them to irony detection.

2.2 Embedding

Embeddings provide a dense representation of entities (words, characters, subwords) and their relative meanings.

Embeddings can be learned from text data and reused across machine learning algorithms. They can also be learned as part of fitting a neural network on text data.

While the point of network training is to learn good parameters, word vector representations follow the notion that similar words lie closer to each other [9].
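The notion that similar words lie close together is commonly measured with cosine similarity. The following is a minimal sketch of that measure; the three-dimensional vectors and the English words are hypothetical values picked by hand for illustration and are not taken from any trained model.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 3-dimensional embeddings, hand-picked for illustration only.
toy_embeddings = {
    "happy": [0.9, 0.1, 0.3],
    "glad": [0.8, 0.2, 0.4],
    "table": [0.1, 0.9, 0.0],
}

sim_close = cosine_similarity(toy_embeddings["happy"], toy_embeddings["glad"])
sim_far = cosine_similarity(toy_embeddings["happy"], toy_embeddings["table"])
print(sim_close > sim_far)  # related words score higher in this toy setup
```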

Many models have been proposed for learning word embeddings from raw text, among them GloVe [10] and dependency-based word embeddings [7]. We chose the well-known and widely used word2vec model, introduced by Mikolov et al. at Google in 2013 ([8], [9]).

Word2vec describes two architectures for computing continuous vector representations: skip-gram (SG) and Continuous Bag-of-Words (CBOW). The former predicts the context words from a given source word, while the latter does the opposite and predicts a word from its context window.
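To make the difference between the two architectures concrete, the sketch below builds the training examples each one would see for a short token list. It only illustrates how inputs and targets are paired, not how word2vec is trained; the function names, the window size and the English tokens are assumptions made for this example.

```python
def context_window(tokens, index, window):
    """Return the tokens within `window` positions of tokens[index], excluding it."""
    left = tokens[max(0, index - window):index]
    right = tokens[index + 1:index + 1 + window]
    return left + right

def skipgram_pairs(tokens, window=2):
    """Skip-gram: one (source word, context word) example per context word."""
    return [(tokens[i], ctx)
            for i in range(len(tokens))
            for ctx in context_window(tokens, i, window)]

def cbow_pairs(tokens, window=2):
    """CBOW: one (context words, target word) example per position."""
    return [(context_window(tokens, i, window), tokens[i])
            for i in range(len(tokens))]

tokens = ["what", "a", "great", "day"]
print(skipgram_pairs(tokens, window=1))
print(cbow_pairs(tokens, window=1))
```

With a window of 1, skip-gram yields pairs such as ("a", "what") and ("a", "great"), while CBOW yields a single example (["what", "great"], "a") for the same position.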

Embeddings have shown their strengths in other classification tasks such as sentiment analysis. We therefore decided to apply them to irony detection, where a text must be classified as ironic or non-ironic.

Our tests showed that the CBOW model performs slightly better than the SG model.

In the next section, we describe our models and the data used.

3 Methodology

Detecting irony is a difficult task, made harder by the complexity of Arabic text. In this section, we describe the methodology we followed to address this problem. We first describe the data used to train and test the models.

3.1 Data

The IDAT shared dataset [1] includes tweets about different political issues and events related to the Middle East that took place between 2011 and 2018. The tweets are written in formal language (standard Arabic) and in the dialects of several Arab regions: Egyptian, Gulf, Levantine (the dialect of Syria, Lebanon and Palestine) and Maghrebi. Table 1 shows the distribution of the tweets for the shared task of irony detection.

Baseline: In every experiment, we used a frequency Bag of Words (BoW) model as lexical features. We also performed standard text pre-processing by deleting user mentions, URLs, stop words and words containing fewer than 3 characters.

To set a baseline for our models, we used an SVM, which is widely used for classification tasks in general and for irony detection in particular.
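The pre-processing and frequency BoW steps described above can be sketched as follows. This is an illustration under stated assumptions, not our exact pipeline: the stop-word list, the regular expressions and the English toy tweets are hypothetical and chosen for readability. The `\w` pattern also matches Arabic letters in Python 3, so the same functions apply to Arabic text given an Arabic stop-word list.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the list actually used is not given in the text.
STOP_WORDS = {"the", "and", "for", "that"}

def preprocess(tweet, stop_words=STOP_WORDS, min_len=3):
    """Drop URLs, user mentions, stop words and tokens shorter than min_len."""
    tweet = re.sub(r"https?://\S+", " ", tweet)   # URLs
    tweet = re.sub(r"@\w+", " ", tweet)           # user mentions
    tokens = re.findall(r"\w+", tweet.lower())
    return [t for t in tokens if len(t) >= min_len and t not in stop_words]

def bag_of_words(token_lists):
    """Build a shared vocabulary and one frequency vector per document."""
    vocab = sorted({t for tokens in token_lists for t in tokens})
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append([counts[word] for word in vocab])
    return vocab, vectors

tweets = [
    "@user great great weather for a picnic http://example.com",
    "the weather is awful and the picnic is off",
]
vocab, vectors = bag_of_words([preprocess(t) for t in tweets])
print(vocab)    # ['awful', 'great', 'off', 'picnic', 'weather']
print(vectors)  # [[0, 2, 0, 1, 1], [1, 0, 1, 1, 1]]
```

The resulting frequency vectors are the kind of lexical features that a classifier such as an SVM takes as input.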

Convolutional neural network (CNN): Both of our proposed models implement a CNN, but each uses different parameters. We used one convolution (conv1D) layer, followed by a max pooling layer and a dropout of 20%. Before the dense layer, we used a flatten layer.
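To show how data flows through this stack, the sketch below traces the tensor shape after each layer, assuming an unpadded ("valid") convolution with stride 1 and non-overlapping pooling. All hyperparameter values (sequence length, embedding size, number of filters, kernel size, pool size) are hypothetical and chosen only for illustration; it also assumes the input is a sequence of embedding vectors.

```python
def conv1d_output_length(length, kernel_size, stride=1):
    """Output length of a 1D convolution without padding ("valid" mode)."""
    return (length - kernel_size) // stride + 1

def max_pool1d_output_length(length, pool_size):
    """Output length of non-overlapping max pooling without padding."""
    return length // pool_size

def trace_shapes(seq_len, embed_dim, filters, kernel_size, pool_size):
    """Return the (steps, channels) shape after each layer, then the flat size."""
    shapes = [("embedding", (seq_len, embed_dim))]
    steps = conv1d_output_length(seq_len, kernel_size)
    shapes.append(("conv1d", (steps, filters)))
    steps = max_pool1d_output_length(steps, pool_size)
    shapes.append(("max_pooling", (steps, filters)))
    # Dropout only zeroes activations during training; the shape is unchanged.
    shapes.append(("dropout", (steps, filters)))
    shapes.append(("flatten", (steps * filters,)))
    return shapes

# Hypothetical hyperparameters, chosen only for illustration.
for name, shape in trace_shapes(seq_len=50, embed_dim=100, filters=64,
                                kernel_size=3, pool_size=2):
    print(name, shape)
```

With these values the convolution produces 48 steps of 64 channels, pooling halves that to 24, and the flatten layer hands a vector of 1536 values to the dense layer.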

In Fig. 1 we give the general architecture of our CNN model.

Word embedding model (M1): We used the Keras Embedding layer (footnote 3), which requires the input data to be integer-encoded. The Embedding layer is initialized with random weights and learns an embedding for all of the words in the training dataset.

For this first proposed model we also tried creating pre-trained vectors, but in our tests the Keras Embedding layer gave better results.
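The integer encoding and the randomly initialized lookup described for M1 can be illustrated without any deep learning library. The sketch below is not the Keras implementation. The padding index, the initialization range, the embedding dimension and the English tokens are assumptions made for this example, and in a real layer the table rows would be updated during training.

```python
import random

PAD_ID = 0  # index reserved for padding (an assumption of this sketch)

def build_vocab(token_lists):
    """Map each distinct training token to a positive integer id."""
    vocab = {}
    for tokens in token_lists:
        for token in tokens:
            vocab.setdefault(token, len(vocab) + 1)
    return vocab

def encode(tokens, vocab, max_len):
    """Integer-encode tokens, skip unknown ones, then pad or truncate to max_len."""
    ids = [vocab[t] for t in tokens if t in vocab][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))

def init_embedding_table(vocab_size, dim, seed=0):
    """Randomly initialised table with one row per id (row 0 is padding)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
            for _ in range(vocab_size + 1)]

train = [["nice", "weather", "today"], ["nice", "joke"]]
vocab = build_vocab(train)
table = init_embedding_table(len(vocab), dim=4)

ids = encode(["nice", "joke"], vocab, max_len=3)
vectors = [table[i] for i in ids]  # the lookup an embedding layer performs
print(ids)  # [1, 4, 0]
```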

Sub-word embedding and feature selection (M2): The sub-word embedding model relies on selecting the features that yield the best results. We followed the steps below to achieve this.

First, we relied on chi2 (footnote 4) to extract relevant features (url existence, word count, negative and positive word count, punctuation count, tag existence). To extract the positive and negative words, we relied on publicly available lexicons (footnote 5).
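As an illustration of how a chi-square score ranks features, the sketch below computes the statistic for each non-negative feature against the class labels: observed per-class feature totals are compared with the totals expected if the feature were independent of the class. It is a pure-Python approximation written for this example rather than the scikit-learn routine itself, and the toy feature matrix and labels are hypothetical.

```python
def chi2_scores(X, y):
    """Chi-square statistic of each non-negative feature against the class labels."""
    n_samples = len(X)
    n_features = len(X[0])
    classes = sorted(set(y))
    class_prob = {c: sum(1 for label in y if label == c) / n_samples
                  for c in classes}
    scores = []
    for j in range(n_features):
        feature_total = sum(row[j] for row in X)
        score = 0.0
        for c in classes:
            # Observed total of feature j inside class c.
            observed = sum(row[j] for row, label in zip(X, y) if label == c)
            # Total expected if the feature were independent of the class.
            expected = class_prob[c] * feature_total
            if expected > 0:
                score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores

# Hypothetical toy data: columns are [punctuation count, url existence].
X = [[3, 1], [4, 0], [0, 1], [1, 0]]
y = [1, 1, 0, 0]  # 1 = ironic, 0 = non-ironic
scores = chi2_scores(X, y)
print(scores)  # [4.5, 0.0]
```

In this toy data the punctuation count differs strongly between the two classes and receives a high score, while the url feature is evenly spread and scores zero, so only the former would be kept.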

Fig. 2 shows the features that gave the best scores.

Second, we created our sub-words in a BoW format.

Third, we used a Keras layer to create the embedding vectors.

Lastly, we merged the features array with the sub-word embedding array for each tweet.
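The steps above can be sketched end to end as follows. Several choices here are assumptions made only for illustration: sub-words are taken to be character trigrams, "tag" is read as a hashtag, the sub-word vectors of a tweet are averaged before merging, and the mini-lexicons and embedding table are hypothetical. Note also that `string.punctuation` covers ASCII punctuation only, so Arabic marks would need to be added.

```python
import re
import string

# Hypothetical mini-lexicons standing in for the publicly available ones.
POSITIVE = {"great", "love"}
NEGATIVE = {"awful", "hate"}

def surface_features(tweet):
    """Hand-crafted features named in the text, in a fixed order."""
    words = re.findall(r"\w+", tweet.lower())
    return [
        int(bool(re.search(r"https?://", tweet))),      # url existence
        len(words),                                      # word count
        sum(w in NEGATIVE for w in words),               # negative word count
        sum(w in POSITIVE for w in words),               # positive word count
        sum(ch in string.punctuation for ch in tweet),   # punctuation count
        int("#" in tweet),                               # tag existence
    ]

def char_ngrams(word, n=3):
    """Sub-words taken as character n-grams (an assumption of this sketch)."""
    return [word[i:i + n] for i in range(len(word) - n + 1)] or [word]

def mean_subword_vector(tweet, table, dim):
    """Average the vectors of the tweet's known sub-words (zeros if none)."""
    grams = [g for w in re.findall(r"\w+", tweet.lower()) for g in char_ngrams(w)]
    known = [table[g] for g in grams if g in table]
    if not known:
        return [0.0] * dim
    return [sum(vec[k] for vec in known) / len(known) for k in range(dim)]

def merged_representation(tweet, table, dim):
    """Concatenate the feature array with the sub-word embedding array."""
    return surface_features(tweet) + mean_subword_vector(tweet, table, dim)

# Hypothetical 2-dimensional sub-word embedding table.
table = {"lov": [1.0, 0.0], "ove": [0.0, 1.0]}
tweet = "I love Mondays!!! #not"
merged = merged_representation(tweet, table, dim=2)
print(merged)  # [0, 4, 0, 1, 4, 1, 0.5, 0.5]
```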

Table 2 gives the detailed parameters for each model. The tests showed that these two models give the best results compared to the others (long short-term memory, ...). The following section details these results.

3 https://keras.io/layers/embeddings/#embedding
4 https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.chi2.html
5 https://saifmohammad.com/WebPages/ArabicSA.html

We tested several machine learning algorithms in order to evaluate and select the best-performing one. Table 3 presents the results obtained during the testing phase for each class, together with the official results, in terms of accuracy (A), macro-averaged F-score (F), precision (P) and recall (R).
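For reference, the four measures named above can be computed as in the sketch below. It assumes that precision and recall are macro-averaged in the same way as the F-score, and that the macro F-score is the mean of the per-class F-scores; the labels are hypothetical and unrelated to the reported results.

```python
def per_class_prf(y_true, y_pred, label):
    """Precision, recall and F-score of one class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return precision, recall, f_score

def evaluate(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F-score."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = [per_class_prf(y_true, y_pred, label) for label in labels]
    n = len(labels)
    return {
        "A": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "P": sum(p for p, _, _ in per_class) / n,
        "R": sum(r for _, r, _ in per_class) / n,
        "F": sum(f for _, _, f in per_class) / n,
    }

# Hypothetical labels: 1 = ironic, 0 = non-ironic.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

Macro-averaging gives both classes the same weight, which matters when one class is more frequent than the other.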

As we can see, the M2 model gave the best performance. These results were obtained with a random train-test split rather than cross-validation, and they are comparable to the ones obtained during the testing phase. Although we observed a significant decrease in the system's performance in the official test, we believe that our system can be used for tasks such as irony or sarcasm detection.
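The difference between a single random train-test split and cross-validation can be sketched as follows. The split ratio, the number of folds and the seed are hypothetical, since the text does not specify them. A single split evaluates on one random subset only, whereas k-fold cross-validation tests every sample exactly once, which usually gives a steadier estimate.

```python
import random

def random_split(n_samples, test_ratio=0.2, seed=0):
    """One random train/test split of the sample indices."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(n_samples * test_ratio)
    return indices[n_test:], indices[:n_test]

def k_fold_splits(n_samples, k=5, seed=0):
    """k train/test splits in which every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]

train_idx, test_idx = random_split(10, test_ratio=0.2)
print(len(train_idx), len(test_idx))  # 8 2

for train, test in k_fold_splits(10, k=5):
    print(sorted(test))
```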

Conclusion

In this article, we have proposed two models for identifying ironic messages, incorporating various features to capture ironic statements. Our results show good classification performance on the training dataset but lower performance on the evaluation data, with a notable decrease in the F-score. We found that most of the systems we tested consistently gave slightly higher scores for the ironic texts. We believe this is because the dataset contains more texts labeled as ironic.

In future work, we plan to study ways to improve our system for classification tasks. We also want to test part-of-speech tagging to see whether it affects the results. Which words are more ironic: adjectives or adverbs? Such features, among others, are very interesting to explore.

1. Ghanem, B., Karoui, J., Benamara, F., Moriceau, V., Rosso, P.: IDAT@FIRE2019: Overview of the track on irony detection in Arabic tweets. In: Mehta, P., Rosso, P., Majumder, P., Mitra, M. (Eds.) Working Notes of the Forum for Information Retrieval Evaluation (FIRE 2019). CEUR Workshop Proceedings. CEUR-WS.org, Kolkata, India, December 12–15 (2019)

2. Ghosh, A., Veale, T.: IronyMagnet at SemEval-2018 task 3: A siamese network for irony detection in social media. In: Proceedings of The 12th International Workshop on Semantic Evaluation. pp. 570–575 (2018)

3. Hernández-Farías, I., Benedí, J.M., Rosso, P.: Applying basic features from sentiment analysis for automatic irony detection. In: Iberian Conference on Pattern Recognition and Image Analysis. pp. 337–344. Springer (2015)

4. Jia, X., Deng, Z., Min, F., Liu, D.: Three-way decisions based feature fusion for Chinese irony detection. International Journal of Approximate Reasoning (2019)

5. Karoui, J., Farah, B., Moriceau, V., Patti, V., Bosco, C., Aussenac-Gilles, N.: Exploring the impact of pragmatic phenomena on irony detection in tweets: A multilingual corpus study. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers. pp. 262–272 (2017)

6. Karoui, J., Zitoune, F.B., Moriceau, V.: Soukhria: Towards an irony detection system for Arabic in social media. Procedia Computer Science 117, 161–168 (2017)

7. Levy, O., Goldberg, Y.: Dependency-based word embeddings. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). vol. 2, pp. 302–308 (2014)

8. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)

9. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems. pp. 3111–3119 (2013)

10. Pennington, J., Socher, R., Manning, C.: GloVe: Global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). pp. 1532–1543 (2014)

11. Rosso, P., Rangel, F., Farías, I.H., Cagnina, L., Zaghouani, W., Charfi, A.: A survey on author profiling, deception, and irony detection for the Arabic language. Language and Linguistics Compass 12(4), e12275 (2018)

12. Sigar, A.H., Taha, Z.S.: A contrastive study of ironic expressions in English and Arabic. College Of Basic Education Researches Journal 12(2), 795–815 (2012)