=Paper=
{{Paper
|id=Vol-1881/StanceCat2017_paper_7
|storemode=property
|title=Shared Task on Stance and Gender Detection in Tweets on Catalan Independence - LaSTUS System Description
|pdfUrl=https://ceur-ws.org/Vol-1881/StanceCat2017_paper_7.pdf
|volume=Vol-1881
|authors=Francesco Barbieri
|dblpUrl=https://dblp.org/rec/conf/sepln/Barbieri17
}}
==Shared Task on Stance and Gender Detection in Tweets on Catalan Independence - LaSTUS System Description==
<pdf width="1500px">https://ceur-ws.org/Vol-1881/StanceCat2017_paper_7.pdf</pdf>
<pre>
Shared Task on Stance and Gender Detection in Tweets
on Catalan Independence - LaSTUS System Description

                                 Francesco Barbieri
                          francesco.barbieri@upf.edu

                          Universitat Pompeu Fabra, Barcelona, Spain


        Resumen In this paper we describe the system LaSTUS presented in the sha-
        red task on Stance and Gender Detection in Tweets on Catalan Independence, in
        the context of IberEval 2017. We participated to the task using FastText a linear
        model, extension of the classic bag of word. We also use pre-trained embeddings
        trained on 5 million tweets posted in Spain.


1.    Introduction

     In the past few years the debate on Catalan independence has been quite discussed
in politics. The topic generated a lot of discussion as well in social media. In the shared
task “Stance and Gender Detection in Tweets on Catalan Independence”[8] the orga-
nizers proposed a task to automatically recognize if a document (a tweet) is in favor
or against the Catalan independence. Such automatic systems are very useful in practi-
ce, in order to analyze people opinion about a specific topic [7]. To successfully detect
stance, automatic systems need to identify important bits of information that may not be
present in the focus text. Moreover, this task is harder then the classic Sentiment Analy-
sis task, since understanding whether the polarity of the tweet is positive or negative is
not sufficient to understand the opinion of the author of the tweet.
     The shared task also included a gender identification challenge, in order to study
the demographic of the debate. The documents were in Spanish and Catalan. In the
next section we will describe the tasks and the dataset provided by the organizers. In
Section 3 we describe the system we used, and in Section 4 we show the results of our
system.


2.    Task and Dataset

     The shared task included two tasks (for Spanish and Catalan tweets) [8]:

 1. Stance Detection: Given a message, decide the stance taken towards the target
    Çatalan Independence”. The possible stance labels are: FAVOR, AGAINST and
    NONE.
 2. Identification of Gender: Given a message, determine its author’s gender. The
    possible gender labels are: FEMALE and MALE.
Proceedings of the Second Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2017)


             The dataset [3] used in the tasks included tweets retrieved during the regional elec-
         tions in September 2015, and the political debate was focused on a possible indepen-
         dence of Catalonia. The dataset included 8638 tweets for the stance and for the gender
         recognition tasks (4319 in Spanish and 4319 in Catalan). In Table1 we report examples
         from the dataset.


                                                        Spanish
                            Lo dije ayer y lo repito: votar algo que no sea ’Junts pel Si’ o la CUP
                      F
                            es tirar el voto a la basura. #somriureCUP
                T1          Primeros datos de participación. 34,78 %. Un 5 % más a estas horas que
                      N
                            en 2012 #27S
                      A     #27S ¡Sı́! ¡Soy ESPAÑOL!
                            Artur Mas llamando a todos sus colegas empresarios, le falta un 3 %
                     M
                            para llegar al 50 %. #27S
                T2
                            En unas plebiscitarias (votas una preguntas binaria) ¿prevalecen votos
                      F
                            (ciudadanos) o escaños? #27S
                                                        Catalan
                            Avui #si ha arribat el dia #27S serà un gran dia. Gràcies a tothom que
                      F
                            hi ha treballat tant per fer-ho possible
                T1    N     A #Sants n’hi ha que van a votar preparats #27S
                            A casa hem jugat a les votacions i ma filla diu q ha votat al
                      A
                            #presidentMas :( #epicfail #27S
                     M      Avui farem història ?????????? #27s
                T2
                     F      Bon dia Catalunya! Llibertat i democràcia. Cap a omplir les urnes! #27S

         Cuadro 1: Examples of the dataset for each language and label of the two tasks. T1 is the stance
         detection task (Favor, Neutral, Against) and and T2 is the gender identification Task (Male and
         Female).


             In addition to these tweets we also use a corpus o 5 million tweets posted in Spain
         between October 2015 and December 2016 in Spain, in order to train pre-trained vec-
         tors.


         3.     Our System

              In this section we will describe the system we presented to the shared task. In the
         first sub-section we describe the preprossing pipeline, and in the second sub-section we
         describe the FastText classifier.


         3.1.    Preprocessing

            Tweet texts were preprocessed with a modified version of the CMU Tweet Two-
         kenizer [4], where we changed several regular expressions and added a Twitter emojis


                                                                                                                        218
Proceedings of the Second Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2017)


         vocabulary to better tokenize the tweets1 . We also removed, from each tweet, all hyper-
         links, and lowercased all textual content in order to reduce noise and sparsity. We also
         replace each user mention with the token “@user”.

         3.2.     FastText
             Fastext2 [5] is a linear model for text classification. We decided to employ FastText
         as it has been shown that on specific classification tasks, it can achieve competitive
         results, comparable to complex neural classifiers (RNNs and CNNs). The best feature
         of FastText is the speed as it can be much faster than complex neural models. The
         FastText algorithm is similar to the CBOW algorithm [6], where the middle word is
         replaced by the label. Given a set of N documents, the loss that the model attempts to
         minimize is the negative log-likelihood over the labels:
                                                 n=1
                                              1 X
                                   loss = −       en log(softmax (BAxn ))
                                              N
                                                  N

         where en is the label included in the n-th tweet, represented as hot vector. A and B are
         affine transformations (weight matrices), and xn is the unit vector of the bag of features
         of the n-th document (comment). The bag of features is the average of the input words,
         represented as vectors with a look-up table.
             We initialize the look-up table with pre-trained embeddings trained with the algo-
         rithm of [2], an extension of the continuous skipgram algotithm [6], where also the sub-
         information of the words is taken in account (by representing each word with a bag of
         n-grams, i.e. the sum of the vector representation of each n-gram included in the word).
         We pre-train the vectors on 5 million tweets geo-localized in Spain (see Section 2).


         4.     Results and Discussion
             In this section we show the results of the model in the shared task and discuss them.
         In Table 2 are reported the results for the two tasks in the two languages. We show
         results of the best participant model, our model described in the previous section and
         also the ranking position of our model (comparing to other participant models).
             In Table 2 we can see that our model is somehow competitive in the Stance-ES task
         and Gender-CA where it is outperformed by the best systems of four points. In the other
         two tasks (Stance-CA and Gender-ES) our model performs quite poorly comparing to
         the best system (8 points difference). We are not aware of the models used by other
         participants and can not infer the reason of these results. We can not even say that our
         system is better in one language or in one task as our best results are in Stance-ES and
         Gender-CA.
             We believe that one of the problem of our system was the preprocessing: removing
         the user mentions (user) was not a good idea, as the user mentions could include im-
         portant insights about the stance of the tweet. Also, we need to explore whether our
          1
              http://www.ark.cs.cmu.edu/TweetNLP/
          2
              https://github.com/facebookresearch/fastText


                                                                                                                        219
Proceedings of the Second Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2017)


                                                     Stance              Gender
                                                  ES        CA         ES     CA
                               Best Model        0.49      0.49       0.69    0.49
                               Our Model         0.46      0.40       0.61    0.44
                                Ranking         4 of 10   6 of 9     4 of 5  3 of 4

         Cuadro 2: Results of the best model and our model in the two tasks and two languages. The
         Macro F1 is used in the Stance results and for the Gender task the accuracy. It is also reported the
         ranking of our system compared to other participant.


         system was overfitting the training dataset, as using systems like Bag of Words, or si-
         milar methods, can lead to model a specific topic instead of modeling the target labels
         [1].


         5.    Conclusions

             In this paper we describe the system we presented at the shared task on Stance and
         Gender Detection in Tweets on Catalan Independence. We used the FastText classifier
         with pre-trained embeddings trained on 5 million tweets. Our model performances are
         acceptable in some tasks, but in other tasks are very poor, suggesting that we need to
         improve the system. We look forward to see how other participants tackled the problem
         of Stance and Gender classification.


         Referencias

         1. Barbieri, F., Ronzano, F., Saggion, H.: How topic biases your results? a case study of sentiment
            analysis and irony detection in italian. In: Recent Advances in Natural Language Processing,
            RANLP. pp. 41–47. Bulgaria (2015)
         2. Bojanowski, P., Grave, E., Joulin, A., Mikolov, T.: Enriching word vectors with subword in-
            formation. arXiv preprint arXiv:1607.04606 (2016)
         3. C. Bosco, M. Lai, V.P.F.R.P.R.: Tweeting in the debate about catalan elections. Language
            Resources and Evaluation Conference (LREC), workshop on Emotion and Sentiment Analysis
            Workshop (ESA) (2016)
         4. Gimpel, K., Schneider, N., O’Connor, B., Das, D., Mills, D., Eisenstein, J., Heilman, M.,
            Yogatama, D., Flanigan, J., Smith, N.A.: Part-of-speech tagging for twitter: Annotation, fea-
            tures, and experiments. In: Proceedings of the 49th Annual Meeting of the Association for
            Computational Linguistics: Human Language Technologies: short papers-Volume 2. pp. 42–
            47. Association for Computational Linguistics (2011)
         5. Joulin, A., Grave, E., Bojanowski, P., Mikolov, T.: Bag of tricks for efficient text classifica-
            tion. In: Proceedings of the 2017 Conference of the European Chapter of the Association for
            Computational Linguistics. Association for Computational Linguistics, Valencia, Spain (April
            2017)
         6. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in
            vector space. arXiv preprint arXiv:1301.3781 (2013)


                                                                                                                        220
Proceedings of the Second Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2017)


         7. Mohammad, S.M., Kiritchenko, S., Sobhani, P., Zhu, X., Cherry, C.: Semeval-2016 task 6:
            Detecting stance in tweets. In: Proceedings of the International Workshop on Semantic Eva-
            luation. SemEval ’16, San Diego, California (June 2016)
         8. Taulé, M., Martı́, M.A., Rangel, F., Rosso, P., Bosco, C., Patti, V.: Overview of the task of
            Stance and Gender Detection in Tweets on Catalan Independence at IBEREVAL 2017. In:
            Notebook Papers of 2nd SEPLN Workshop on Evaluation of Human Language Technologies
            for Iberian Languages (IBEREVAL). CEUR Workshop Proceedings, CEUR-WS.org, Murcia,
            Spain (September 2017)


                                                                                                                        221

</pre>