<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Stance detection at IberEval 2017: A Biased Representation for a Biased Problem</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Diego Aineto García</string-name>
          <email>og@inf.upv.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Antonio Manuel Larriba Flor</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universitat Politècnica de València</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <fpage>204</fpage>
      <lpage>209</lpage>
      <abstract>
        <p>In this paper, we explain our approach to the task Stance and Gender Detection in Tweets on Catalan Independence, whose objective is to detect the author's stance towards the topic of Catalan independence. We present an in-depth description of the system we submitted to the task, as well as an overview of the experiments that led to this solution during the development of the system. Our best solutions are based on a biased representation which allows us to train with the whole dataset.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Stance detection is the task of identifying the favorability of a given piece of text towards a particular target. This task shares many similarities with sentiment analysis, which aims to detect whether a text is objective or subjective and, in the latter case, evaluate its positiveness or negativeness. The main difference lies in that, in stance detection, the system needs to determine whether the text is in favor of, against, or neutral towards a topic that may not even explicitly appear in the text. Other differences exist as well: an opinion might be positive in sentiment and yet against the topic.</p>
      <p>The difficulty of this task is aggravated in the case of microblogs, such as Twitter, where the texts are very short. Moreover, on these kinds of platforms it is common for people to use informal language or even shortened words and emotes. Nonetheless, stance detection has many real-world applications, such as opinion mining and author profiling, and many companies and organizations are showing increasing interest in the area.</p>
      <p>In the following sections we describe the task at hand (Section 2), explain the approach followed in the system we submitted (Section 3), and show the results obtained for the different system configurations tried out during our experiments (Section 4). Finally, we present our conclusions in Section 5.</p>
      <p>The Stance and Gender Detection in Tweets on Catalan Independence task (http://stel.ub.edu/Stance-IberEval2017/) is one of the tasks proposed at the IberEval 2017 workshop. The goal of this task is to detect the author's gender and stance with respect to the topic "Independence of Catalonia" in a collection of tweets written in Spanish and Catalan. Teams were allowed to participate in either stance detection alone or in gender and stance detection. Our team participated only in stance detection, so we omit anything regarding gender detection from this point on.</p>
      <p>The distribution of stances for both Spanish and Catalan tweets is shown in Table 1. As can be seen in the table, the distribution over the three possible stances is not uniform, a situation that usually arises in this kind of problem. However, the bias towards some stances is very accentuated in this particular problem, with 'Favor' and 'Neutral' comprising 97% of the Catalan tweets, and 'Against' and 'Neutral' comprising 92% of the Spanish tweets. The reduced size of the corpus and the skewed distribution make training an accurate system a challenging task.</p>
      <p>
        The evaluation of the systems is performed following the metric proposed
at SemEval 2016 - Task 6 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The metric is a macro average of the F1-score
for 'Favor' and the F1-score for 'Against'. Although the metric has no term of its own for the 'Neutral' class, it is still negatively affected by falsely labeling 'Neutral'
instances as 'Favor' or 'Against'.
      </p>
      <p>F<sub>avg</sub> = (F<sub>favor</sub> + F<sub>against</sub>) / 2 (1)</p>
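      <p>A minimal Python sketch of this metric (our own helper names and label strings, not the task's official scorer):</p>

```python
def f1_score(gold, pred, label):
    """F1-score of a single class, from parallel lists of gold and predicted labels."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def f_avg(gold, pred):
    """Macro average of the 'Favor' and 'Against' F1-scores (Equation 1).
    'Neutral' has no term of its own, but 'Neutral' tweets mislabeled as
    'Favor' or 'Against' still lower the precision of those two classes."""
    return (f1_score(gold, pred, "FAVOR") + f1_score(gold, pred, "AGAINST")) / 2
```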
    </sec>
    <sec id="sec-2">
      <title>System Description</title>
      <p>This section explains our approach and describes the system we submitted to the shared task. The system we developed tries to take advantage of the particularities of the problem by making use of problem-oriented preprocessing and representation.</p>
      <sec id="sec-2-1">
        <title>Preprocessing</title>
        <p>
          Our method starts with a preprocessing step that follows an approach similar to the one described in [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], but adds a few more steps to take some problem-specific particularities into account. The modifications performed on the tweets are the following:
        </p>
        <list list-type="bullet">
          <list-item><p>Convert the text to lowercase.</p></list-item>
          <list-item><p>Remove stopwords.</p></list-item>
          <list-item><p>Replace all URLs with the keyword URL.</p></list-item>
          <list-item><p>Replace all user mentions with the keyword MENTION.</p></list-item>
          <list-item><p>Replace certain hashtags found mostly in tweets labeled as 'Against', such as '#siqueespot' and '#iceta27s', with the keyword AGAINST STANCE.</p></list-item>
          <list-item><p>Replace certain hashtags found mostly in tweets labeled as 'Favor', such as '#cup' and '#guanyemjunts', with the keyword FAVOR STANCE.</p></list-item>
          <list-item><p>Replace all other hashtags with the keyword HASHTAG.</p></list-item>
          <list-item><p>Replace runs of multiple question marks with the token MULTIPLEQUESTIONMARKS.</p></list-item>
          <list-item><p>Replace runs of multiple exclamation marks with the token MULTIPLEEXCLAMATIONMARKS.</p></list-item>
          <list-item><p>Replace mixed runs of question and exclamation marks, such as '!?', with the token MIXEDMARKS.</p></list-item>
          <list-item><p>Given that the collection of tweets took place during the regional elections, many percentages can be found in the tweets, which may indicate objectivity. We therefore replace such percentages with the keyword PERCENTAGE.</p></list-item>
          <list-item><p>Remove all punctuation marks save for exclamation and question marks.</p></list-item>
        </list>
        <p>Following these steps, a tweet like 'Así están las cosas con el 60% escrutado. #eleccionescatalanas #27S http://t.co/XlLf8RqRZ8' is converted to 'asi estan cosas PERCENTAGE escrutado HASHTAG HASHTAG URL'.</p>
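        <p>The steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the stopword list and stance-hashtag sets are small hypothetical subsets, the accent folding is inferred from the paper's example ('Así' becomes 'asi'), and runs of marks are only detected as standalone tokens.</p>

```python
import re
import unicodedata

# Hypothetical resources: the paper does not publish its full stopword list
# or the complete stance-hashtag sets, so these are small illustrative subsets.
STOPWORDS = {"las", "con", "el", "la", "los", "de", "que", "y", "a", "en"}
AGAINST_HASHTAGS = {"#siqueespot", "#iceta27s"}
FAVOR_HASHTAGS = {"#cup", "#guanyemjunts"}

def strip_accents(text):
    # Accent folding, inferred from the paper's worked example.
    return "".join(c for c in unicodedata.normalize("NFD", text)
                   if unicodedata.category(c) != "Mn")

def preprocess(tweet):
    text = strip_accents(tweet.lower())
    text = re.sub(r"https?://\S+", " URL ", text)          # URLs
    text = re.sub(r"@\w+", " MENTION ", text)              # user mentions
    text = re.sub(r"\d+(\.\d+)?%", " PERCENTAGE ", text)   # percentages
    tokens = []
    for tok in text.split():
        if tok in AGAINST_HASHTAGS:
            tokens.append("AGAINST STANCE")
        elif tok in FAVOR_HASHTAGS:
            tokens.append("FAVOR STANCE")
        elif tok.startswith("#"):
            tokens.append("HASHTAG")
        elif re.fullmatch(r"\?{2,}", tok):                 # runs of '?'
            tokens.append("MULTIPLEQUESTIONMARKS")
        elif re.fullmatch(r"!{2,}", tok):                  # runs of '!'
            tokens.append("MULTIPLEEXCLAMATIONMARKS")
        elif re.fullmatch(r"[!?]{2,}", tok):               # mixed runs like '!?'
            tokens.append("MIXEDMARKS")
        else:
            # drop punctuation except '!' and '?', then drop stopwords
            tok = re.sub(r"[^\w!?]", "", tok)
            if tok and tok not in STOPWORDS:
                tokens.append(tok)
    return " ".join(tokens)
```

        <p>On the paper's example tweet, this sketch reproduces the output shown above.</p>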
      </sec>
      <sec id="sec-2-2">
        <title>Features and Representation</title>
        <p>The system uses a model based on TF-IDF vectors and two separate sets of features. The first set of features is built from unigrams extracted from the preprocessed tweet text; only the 2500 most frequent unigrams from the vocabulary extracted from the tweet texts are used. A second set of features is built from hashtags, in this case using only the 500 most frequent ones. Both sets of features are extracted separately for tweets in Catalan and tweets in Spanish.</p>
        <p>Although the sets of features are extracted separately, the feature vector for each tweet is built by concatenating the vectors of both the Catalan and the Spanish features, as shown in Figure 1. This representation aims to reflect the heavy bias found in the distribution of stances in two ways. First, a tweet in Catalan is less likely to hold an against stance, while a tweet in Spanish rarely holds an in-favor stance; by clearly differentiating two parts in the feature vector, we want to transmit this bias to the classifier. Second, because Catalan and Spanish have many words in common, the same word may be used to transmit different postures towards the topic. This also applies to hashtags, and thus it is important to differentiate whether a word or hashtag was used in one language or the other.</p>
        <p>Two different systems were submitted to the workshop, one based on a support vector machine (SVM) and one based on an artificial neural network (ANN). The system using the SVM classifier implements a one-vs-rest multiclass strategy and uses a linear kernel. The ANN-based system implements a fully-connected network with 3 hidden layers: 2000 neurons on the first layer, 1000 on the second, and 500 on the last. All layers use ReLU as the activation function (save for the output layer, which uses softmax) and a dropout of 40%.</p>
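        <p>The two-part representation can be sketched as follows. This is a minimal Python illustration using raw term counts where the paper uses TF-IDF weights; the vocabularies, language tags, and function names are our own.</p>

```python
from collections import Counter

def build_vocab(tweets, top_k):
    """Most frequent unigrams across a set of preprocessed tweets
    (the paper keeps the top 2500 word unigrams and the top 500 hashtags)."""
    counts = Counter(tok for t in tweets for tok in t.split())
    return [w for w, _ in counts.most_common(top_k)]

def vectorize(tweet, ca_vocab, es_vocab, lang):
    """Two-part feature vector: Catalan slots first, then Spanish slots.
    A tweet only fills the half matching its language; the other half stays
    zero, which is how the representation transmits the language (and hence
    stance) bias to the classifier. A word shared by both languages, such as
    'independencia', occupies a different slot depending on the language."""
    counts = Counter(tweet.split())
    ca_part = [counts[w] if lang == "ca" else 0 for w in ca_vocab]
    es_part = [counts[w] if lang == "es" else 0 for w in es_vocab]
    return ca_part + es_part
```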
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experiments</title>
      <p>During the development phase, we tried out different sets of features, representations, and parameters, and evaluated them using 5-fold cross-validation. Table 2 displays the results obtained for each experiment, measuring the F1-score for each stance and the official metric used in the task evaluation (Favg), obtained through Equation 1.</p>
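      <p>The 5-fold protocol can be sketched as a plain index split (an illustration only; the paper does not state whether the folds were stratified or shuffled):</p>

```python
def k_fold_indices(n, k=5):
    """Split indices 0..n-1 into k contiguous folds of near-equal size;
    yields (train, test) index lists, one pair per fold."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size
```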
      <p>S0 is a system trained using only unigrams as features and basic preprocessing, but implementing the representation depicted in Figure 1. We used it as the base for the other systems developed and as a baseline to evaluate them. Next, we present a brief explanation of each of the systems evaluated:</p>
      <list list-type="bullet">
        <list-item><p>S1: In an attempt to evaluate whether our 2-part feature vector added any value over a common representation, we decided to treat the Catalan and Spanish datasets as two different subtasks. As such, we trained two separate systems, each using only the dataset corresponding to the language in question. The results show that our representation achieves better results, partly because a common feature vector allows us to train with double the data.</p></list-item>
        <list-item><p>S2: This system uses an extra feature indicating whether the text contains any hashtag associated with a stance, i.e., hashtags that can only be found in tweets labeled with one particular stance. This approach did not render good results, but replacing these hashtags with a keyword during preprocessing provided a slight improvement.</p></list-item>
        <list-item><p>S3 and S4: These are the systems described in Section 3 and the ones submitted to the shared task. Both use the same feature vector, but S3 employs an SVM while S4 employs an ANN.</p></list-item>
        <list-item><p>S5: This last system is built upon S3, but adds some new features in the form of the length of the tweet in words and initial unigrams [<xref ref-type="bibr" rid="ref1">1</xref>].</p></list-item>
      </list>
      <p>As can be seen in Table 2 (bolded numbers), we obtained the best results using the support vector machine (SVM) for Catalan and the artificial neural network (ANN) for Spanish.</p>
      <p>
        These systems allowed us to place as the third team for Catalan and the fifth for Spanish in the shared task [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The results obtained over the test set can be seen in Table 3, where the S1 systems are based on the SVM and the S2 systems on the ANN. We provide the system and team rankings for each task.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>In this paper we described our submission to the task Stance and Gender Detection in Tweets on Catalan Independence at IberEval 2017. The problem considered in this task has many particularities, which we tried to take advantage of by developing a system that relies heavily on preprocessing and representation. The SVM and the ANN offered similar results; our guess is that with more training data those differences would grow, and we would have the possibility to try new architectures.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Anand</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Walker</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abbott</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tree</surname>
            ,
            <given-names>J.E.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bowmani</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Minor</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Cats rule and dogs drool!: Classifying stance in online debate</article-title>
          .
          <source>In: Proceedings of the 2Nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis</source>
          . pp.
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Krejzl</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Steinberger</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>UWB at SemEval-2016 task 6: Stance detection</article-title>
          .
          <source>In: Proceedings of the 10th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT</source>
          <year>2016</year>
          , San Diego, CA, USA, June 16-17,
          <year>2016</year>
          . pp.
          <fpage>408</fpage>
          -
          <lpage>412</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Mohammad</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kiritchenko</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sobhani</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cherry</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>SemEval-2016 task 6: Detecting stance in tweets</article-title>
          .
          <source>In: Proceedings of the 10th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT</source>
          <year>2016</year>
          , San Diego, CA, USA, June 16-17,
          <year>2016</year>
          . pp.
          <fpage>31</fpage>
          -
          <lpage>41</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Taulé</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martí</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bosco</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patti</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Overview of the task of stance and gender detection in tweets on Catalan independence at IberEval 2017</article-title>
          .
          <source>In: Notebook Papers of 2nd SEPLN Workshop on Evaluation of Human Language Technologies for Iberian Languages (IBEREVAL)</source>
          , Murcia, Spain, September 19, CEUR Workshop Proceedings (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>