=Paper=
{{Paper
|id=Vol-2765/151
|storemode=property
|title=SSNCSE-NLP @ EVALITA2020: Textual and Contextual Stance Detection from Tweets Using Machine Learning Approach (short paper)
|pdfUrl=https://ceur-ws.org/Vol-2765/paper151.pdf
|volume=Vol-2765
|authors=Bharathi B,Bhuvana J,Nitin Nikamanth Appiah Balaji
|dblpUrl=https://dblp.org/rec/conf/evalita/BJB20
}}
==SSNCSE-NLP @ EVALITA2020: Textual and Contextual Stance Detection from Tweets Using Machine Learning Approach (short paper)==
SSNCSE-NLP @ EVALITA2020: Textual and Contextual Stance Detection from Tweets Using Machine Learning Approach

B. Bharathi, J. Bhuvana, Nitin Nikamanth Appiah Balaji
Department of CSE, Sri Sivasubramaniya Nadar College of Engineering, Chennai, India
(bharathib, bhuvanaj)@ssn.edu.in, nitinnikamanth17099@cse.ssn.edu.in

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract

Opinions expressed via online social media platforms can be used to analyse the stand taken by the public on any event or topic. Recognizing the stand taken is stance detection. In this paper an automatic stance detection approach is proposed that uses both deep learning based and hand-crafted feature extraction. BERT is used as a feature extraction scheme, along with stylistic, structural, contextual and community based features extracted from the tweets, to build a machine learning based model. This work uses a multilayer perceptron to classify tweets into the stances favour, against and neutral. The dataset used is provided by the SardiStance task and consists of Italian tweets about the Sardines movement. Several variants of models were built with different feature combinations and compared against the baseline model provided by the task organisers. The models using BERT features, alone and combined with other contextual features, proved to be the best performing models and outperform the baseline.

1 Introduction

In today's era everything is in digital form and people spend more time online to stay connected. We learn about events across the world via online social media platforms such as Facebook, Twitter and Instagram. Sharing one's opinion, whether in favour of, against or neutral towards a particular topic or event, has become the norm of today's digital world.

Expressing one's stand on any matter is referred to as a stance. Recognizing that stance, i.e. stance detection, is an interesting part of Natural Language Processing that has gained a lot of traction recently. Automatic stance detection is in demand in a variety of applications such as rumour detection, analysing the political standpoint of the public, predicting election results, advertising, opinion surveys and so on.

This paper proposes a method for textual and contextual stance detection for the task hosted by SardiStance@EVALITA2020. An overview of the SardiStance@EVALITA2020 shared task is given in Cignarella et al. (2020), and the proceedings of EVALITA 2020 can be found in Basile et al. (2020). BERT is used to perform the classification of stance from the tweets. Two models have been constructed: the first classifies the stance of a tweet into the three categories favour, against and neutral; the second classifies the tweets into the same classes while additionally considering contextual information, namely the number of retweets, the number of followers, replies and quote relations.

2 Survey of Existing Stance Detections

As per the authors in Küçük and Can (2020), stance detection is related to many NLP problems, namely emotion recognition, irony detection, sentiment analysis, rumour classification, etc. In particular, stance detection is closely related to sentiment analysis of text, which is concerned with feelings such as tenderness, sadness or nostalgia, whereas stance detection needs a specific target towards which the text expresses an opinion. Stance detection is similar to perspective identification as well.
Stance detection can be done using learning based approaches, via training and testing stages along with the necessary pre-processing. These methods are categorized into machine learning based, deep learning based and ensemble based approaches. Conventional machine learning approaches require features to be extracted from the text after pre-processing operations like normalization, tokenization, etc. The deep learning approaches use pre-trained models for text classification, with word embeddings such as word2vec, GloVe, ELMo, CoVe, etc., as features (Sun et al., 2019). Bidirectional Encoder Representations from Transformers (BERT) is one of the recent pre-trained models designed by Google (Devlin et al., 2018) and is a bidirectional transformer.

In Lai et al. (2020), stance detection was done in multiple languages using stylistic, structural, affective and contextual features that are fed to Linear Regression and SVM classifiers. The authors reported that the machine learning classifiers are more efficient at classifying stance in a multilingual dataset than their deep learning counterparts.

In Aldayel and Magdy (2019), stance detection was carried out using features such as on-topic content, network interactions, user preferences and online network connections (connection networks). The extracted features are given to a standard machine learning classifier, a Support Vector Machine (SVM) with a linear kernel, to classify the stance of tweets on the targets Atheism, Climate change is a real concern, Hillary Clinton, Feminist movement and Legalization of abortion. The authors observed that textual features combined with network features helped in detecting the stance more accurately.

A fine-tuned BERT model was used for same side stance classification in Ollinger et al. (2020). The authors used both the base and the large model for binary classification and reported that the large model outperformed the other. They also observed that longer input sequences are predicted better than shorter ones, with a precision of 0.85.

Bi-directional Recurrent Neural Networks (RNNs) (Borges et al., 2019), along with other features, were used for fake news identification. A sentence encoder for the headlines and a document encoder for the content of the news were used, along with common features extracted by combining the headlines and the body of the news. The four stances detected are Agree, Disagree, Unrelated and Discusses. The authors reported that pre-training the sentence encoder enhanced the model performance.

After pre-processing steps like stemming, stop word removal, normalization and hashtag pre-processing, Ghosh et al. (2019) fed the data to five different models: 1-D CNN-based sentence classification, a Target-Specific Attention Neural Network (TAN), a Recurrent Neural Network with Long Short Term Memory (LSTM), an SVM-based SEN model, and a two-step SVM for reproducibility. Apart from the above, the authors also used a pre-trained BERT (Large-Uncased) model for stance detection. Experiments were conducted on a SemEval microblog dataset and a text dataset about health-related articles, and a voting scheme was applied for the final predictions. The authors observed that the pre-processing enhanced the performance and also reported that contextual features would help to improve stance detection further.

To detect the stance of tweets as one of favour, against and none, a new CNN named CCNN-ASA, the Condensed CNN by Attention over Self-Attention, has been designed in Zhou et al. (2019). A self-attention based convolution module to improve the representation of each word and an attention-based condensation module for text condensation are embedded. They experimented on the SemEval-2016 challenge for supervised stance detection in Twitter with the three usual stances. The works reported in Mayfield and Black (2019), Sen et al. (2018), Wei and Mao (2019) and Popat et al. (2019) are a few of the other stance detection articles.
3 Proposed System

3.1 Dataset Description

The dataset hosted by SardiStance consists of tweets in the Italian language about the Sardines movement. There are about 3,242 instances in total, of which the training set has 2,132 and the test set has 1,110. The three stances about the Sardines movement are Against, Favour and Neutral, with 1,028, 589 and 515 training instances respectively.

3.2 Model Construction

The models are built in Python and a GPU system with an NVIDIA GTX1080 was used for running the experiments. The features are extracted from the Italian tweets about the Sardines movement to construct the model, and the model is evaluated for performance using the tweets meant for testing.

Feature engineering in our work includes both explicit hand-crafted features and features obtained from a deep learning model. We have used the pre-trained deep learning model BERT to collect features: it provides a sequence of vectors of maximum length 512 which represents the extracted features. Along with that, both structural and stylistic features are extracted from the training instances of the Italian tweets.

The stylistic features considered in our proposed work are as follows: unigram is the binary representation of unigrams; char-grams is the binary representation of 2 to 5 character n-grams. The structural features extracted from the Italian tweets are num-hashtag, which uses the counts of the most frequently occurring hashtags in the tweet; punctuation marks, which considers 6 punctuation marks such as ! ? . , ; and their frequencies as numerical values; and length, which extracts the number of characters, the number of words and the average word length of each tweet.

Community based features are also used as discriminating features in our work. They capture the relationships among tweets and comments, such as the network quote community, network reply community, network retweet community and network friend community. These features are vectors of numerical attributes that represent the number of retweets, retweets with comments, number of friends, number of followers, count of lists, created-at information and the number of emojis in the Twitter bio.
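To make the feature engineering above concrete, the sketch below shows one possible way to build such features in Python. It is only a minimal sketch under assumptions the paper does not state: the BERT checkpoint (the multilingual bert-base-multilingual-cased is used here), mean pooling of the token vectors into a single tweet vector, scikit-learn's CountVectorizer for the binary character n-grams, and the example tweets are all illustrative choices, not necessarily the authors' implementation.

```python
# Minimal sketch of the feature extraction described above.
# Assumptions (not specified in the paper): BERT checkpoint, mean pooling,
# and scikit-learn for the binary char n-gram features.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.feature_extraction.text import CountVectorizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
bert = AutoModel.from_pretrained("bert-base-multilingual-cased")

def bert_features(tweets, max_length=512):
    """Mean-pool the last hidden states into one vector per tweet."""
    enc = tokenizer(tweets, padding=True, truncation=True,
                    max_length=max_length, return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**enc).last_hidden_state          # (batch, tokens, 768)
    mask = enc["attention_mask"].unsqueeze(-1)          # ignore padding tokens
    pooled = (hidden * mask).sum(1) / mask.sum(1)
    return pooled.numpy()

def chargram_features(train_tweets, tweets):
    """Binary 2-5 character n-gram features, fitted on the training tweets."""
    vec = CountVectorizer(analyzer="char_wb", ngram_range=(2, 5), binary=True)
    vec.fit(train_tweets)
    return vec.transform(tweets).toarray()

def structural_features(tweets, marks="!?.,;"):
    """Punctuation-mark frequencies plus simple length statistics."""
    rows = []
    for t in tweets:
        words = t.split()
        counts = [t.count(m) for m in marks]
        rows.append(counts + [len(t), len(words),
                              np.mean([len(w) for w in words]) if words else 0.0])
    return np.array(rows)

# Illustrative tweets only; the SardiStance data itself is not reproduced here.
tweets = ["Sono con le sardine!", "Non sono d'accordo con il movimento."]
X = np.hstack([bert_features(tweets),
               chargram_features(tweets, tweets),
               structural_features(tweets)])
print(X.shape)
```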
For textual stance detection, features such as BERT, unigram, unigram-hashtag, char-grams, num-hashtag, punctuation marks and length are extracted from the training instances. These features are given to a Multilayer Perceptron (MLP) with 128 hidden layers of 512 nodes each. The training uses K-fold cross validation with K = 5 folds to fine tune the model parameters.

For contextual stance detection, along with the features mentioned for textual stance detection, additional features of the tweet such as network quote community, network reply community, network retweet community, network friend community, user info bio, tweet info retweet and tweet info create at were also extracted from the training instances, and all are fed to an MLP classifier with 512 nodes in each of its 128 hidden layers. The second model also undergoes 5-fold cross validation to avoid overfitting and selection bias problems.

In total, 14 different models with individual textual and contextual features have been built for stance detection. Along with that, to explore the combined feature space, the above mentioned features were combined in pairs and triples to build further models: 89 models were built in which each feature is combined with one other feature and used for training the MLP classifier, and 147 variants of classifiers were constructed by combining three features together. Both classifiers are trained for 1000 iterations with ReLU as the activation function in their hidden layers and Adam, a variant of stochastic gradient descent, as the optimizer.
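The classifier configuration described above (128 hidden layers of 512 nodes, ReLU activation, Adam, 1000 iterations, 5-fold cross validation) could be set up as in the following sketch. The use of scikit-learn's MLPClassifier and StratifiedKFold, and the placeholder feature matrix and labels, are assumptions for illustration; the paper does not name the exact library used.

```python
# Sketch of the MLP classifier and 5-fold cross validation described above.
# X (feature matrix) and y (stance labels) are placeholders; the choice of
# scikit-learn's MLPClassifier/StratifiedKFold is an assumption.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))                               # placeholder features
y = rng.choice(["against", "favour", "neutral"], size=200)   # placeholder labels

# 128 hidden layers with 512 nodes each, ReLU activation, Adam optimizer,
# trained for at most 1000 iterations, as stated in the paper.
mlp = MLPClassifier(hidden_layer_sizes=(512,) * 128,
                    activation="relu",
                    solver="adam",
                    max_iter=1000,
                    random_state=0)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(mlp, X, y, cv=cv, scoring="f1_macro")
print("macro-F1 per fold:", scores, "mean:", scores.mean())
```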
4 Results and Discussion

Models are built after 5-fold cross validation with different combinations of the deep learning based BERT features and the hand-crafted structural and contextual features, to investigate the performance of the stance detection system. A few of the best cross validation results are shown in Table 1.

Models with listed features                                  F1 score
BERT                                                         0.5763
Unigram                                                      0.5509
chargrams                                                    0.5734
network quote community                                      0.5419
bert + unigram                                               0.5897
bert + unigramhashtag                                        0.5583
bert + chargrams                                             0.5721
bert + numhashtag                                            0.5773
bert + punctuationmarks                                      0.5501
bert + length                                                0.5226
bert + network quote community                               0.6212
bert + network reply community                               0.5993
bert + network retweet community                             0.6086
bert + network friend community                              0.6482
bert + user info bio                                         0.5748
bert + tweet info retweet                                    0.6086
bert + tweet info create at                                  0.5431
unigram + chargrams                                          0.5834
unigram + network quote community                            0.5965
bert + unigram + length                                      0.5813
bert + unigram + network reply community                     0.6048
bert + chargrams + network quote community                   0.5853
bert + chargrams + user info bio                             0.5834
bert + network quote community + network friend community    0.6436

Table 1: Results after 5-fold cross validation.

The validation results show that BERT works well either when it is used alone for feature extraction or when combined with other features. In particular, when we analyse the validation results we find that the community based features contribute most towards stance detection, either independently or when combined with other textual features.

The models constructed for textual and contextual stance detection are tested on the instances of the test set. Two runs were submitted for each of the two tasks, namely textual stance detection and contextual stance detection, under the name SSNCSE-NLP. The performance measures precision (P), recall (R) and F-score (F) are computed for the three stances: tweets in favour of the Sardines movement, against the movement and neutral ones.

A baseline model was built by the task organisers of SardiStance using the conventional machine learning algorithm SVM with a unigram feature, and it has been used to compare the performance of our models.

The best results obtained are reported in Table 2, with the macro average of the F1 measure along with the F1 scores for the against, favour and neutral classes. The baseline used by the task organisers, an SVM with linear kernel, obtained an average F1 of 0.5784. Run 1, which was built on the model using features extracted by the pre-trained BERT, shows an average F1 score of 0.6067, around 3% more than the baseline model, as shown in Table 2. Run 2, which used char n-grams as the extracted feature, obtained a performance close to the baseline.

Task A - Textual Stance Detection
Run       f-avg   prec a  prec f  prec n  recall a  recall f  recall n  F1 a    F1 f    F1 n
Baseline  0.5784  0.7549  0.3975  0.2589  0.6806    0.4949    0.2965    0.7158  0.4409  0.2764
1*        0.6067  0.7506  0.4245  0.2679  0.7951    0.4592    0.1744    0.7723  0.4412  0.2113
2*        0.5749  0.7798  0.3664  0.3196  0.6873    0.4898    0.3605    0.7307  0.4192  0.3388

Task B - Contextual Stance Detection
Run       f-avg   prec a  prec f  prec n  recall a  recall f  recall n  F1 a    F1 f    F1 n
Baseline  0.6284  0.7845  0.4506  0.3054  0.7507    0.5357    0.2965    0.7672  0.4895  0.3009
1*        0.6582  0.8321  0.4715  0.3508  0.7547    0.5918    0.3895    0.7915  0.5249  0.3691
2*        0.6556  0.8419  0.4574  0.3660  0.7466    0.6020    0.4128    0.7914  0.5198  0.3880

Table 2: Detection results of the SardiStance tasks on the test data (*: Runs 1 and 2 of the proposed system; a = against, f = favour, n = neutral).

Our model for Run 1 has outperformed the baseline model in terms of precision on favour and neutral tweets, and has also shown an 11% increase in recall of against tweets over the baseline model. This can be interpreted as most of the testing instances being identified as relevant tweets against the Sardines movement.

For the second task, on contextual stance detection, our models for Run 1 and Run 2 have both performed better than the corresponding baseline model, whose average F1 is 0.6284. Run 1 for this task used BERT, num-hashtag and network friend community features, whereas Run 2 was built on BERT, network quote community and network friend community features.

From this it can be inferred that the additional information about the Sardines tweets, such as the community based contextual features, has contributed towards the classification of the tweets. Metadata about the tweets has served to discriminate the stance better than the textual information of the tweets themselves.
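As a reference for how the scores in Table 2 can be computed from a set of predictions, the sketch below uses scikit-learn. It is not the official SardiStance evaluation script, and the label names and example arrays are purely illustrative; note that the reported f-avg matches the mean of the against and favour F1 scores in Table 2.

```python
# Sketch of how the per-class P/R/F1 and the averaged F1 (f-avg) in Table 2
# can be computed from predictions. Not the official SardiStance evaluation
# script; label names and example arrays are illustrative only.
from sklearn.metrics import classification_report, f1_score

y_true = ["against", "against", "favour", "neutral", "favour", "against"]
y_pred = ["against", "favour",  "favour", "against", "favour", "against"]

# Per-class precision, recall and F1 for the three stances.
print(classification_report(y_true, y_pred,
                            labels=["against", "favour", "neutral"],
                            zero_division=0))

# The f-avg column in Table 2 corresponds to the mean of the against and
# favour F1 scores (e.g. (0.7158 + 0.4409) / 2 = 0.5784 for the Task A baseline).
f_avg = f1_score(y_true, y_pred, labels=["against", "favour"],
                 average="macro", zero_division=0)
print("f-avg:", f_avg)
```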
5 Conclusion

In this paper we presented suitable models for stance detection in Italian tweets about the Sardines movement. The three stances considered in this work are in favour of the movement, against it and neutral. A multilayer perceptron is the classifier used for classifying the stance of tweets. The deep learning pre-trained model BERT has been used to extract features from the tweets, along with several stylistic, contextual and community based features: unigram, char-grams, num-hashtag, length, network quote community, network reply community, network retweet community, network friend community, user info bio, tweet info retweet and tweet info create at are a few of the attributes extracted to detect the stance. The models are trained using the dataset provided by the SardiStance task for textual and contextual stance detection. Three of our models outperformed the baseline model, which used an SVM for stance detection. A maximum increase of 5% is found in the precision of in-favour tweets over the baseline. In order to explore the full feature space, the structural, stylistic and contextual features are combined in different permutations in this work and validated for their performance. The best performing models are found to use BERT and char n-grams for textual stance detection, and combinations such as BERT with num-hashtag and network friend community, and BERT with network quote community and network friend community, for contextual stance detection.

We have observed that the most contributing features, along with the textual features, are the community based features of the tweets; this metadata serves well in discriminating the stance. More analysis of these features and their combinations can help in improving the performance of automatic stance detection systems. Since a tweet exhibits the stance a person takes on an event or topic, stance detection can also lead to a violation of that person's privacy, which also needs to be looked at.

Acknowledgments

We would like to thank the SSN management for supporting this work by sponsoring the GPU systems used for the research.

References

Abeer Aldayel and Walid Magdy. 2019. Your stance is exposed! Analysing possible factors for stance detection on social media. Proceedings of the ACM on Human-Computer Interaction, 3(CSCW):1–20.

Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro. 2020. EVALITA 2020: Overview of the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. In Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro, editors, Proceedings of Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020), Online. CEUR.org.

Luís Borges, Bruno Martins, and Pável Calado. 2019. Combining similarity features and deep representation learning for stance detection in the context of checking fake news. Journal of Data and Information Quality (JDIQ), 11(3):1–26.

Alessandra Teresa Cignarella, Mirko Lai, Cristina Bosco, Viviana Patti, and Paolo Rosso. 2020. SardiStance@EVALITA2020: Overview of the Task on Stance Detection in Italian Tweets. In Valerio Basile, Danilo Croce, Maria Di Maro, and Lucia C. Passaro, editors, Proceedings of the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2020). CEUR-WS.org.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.

Shalmoli Ghosh, Prajwal Singhania, Siddharth Singh, Koustav Rudra, and Saptarshi Ghosh. 2019. Stance detection in web and social media: A comparative study. In International Conference of the Cross-Language Evaluation Forum for European Languages, pages 75–87. Springer.

Dilek Küçük and Fazli Can. 2020. Stance detection: A survey. ACM Computing Surveys (CSUR), 53(1):1–37.
Mirko Lai, Alessandra Teresa Cignarella, Delia Irazú Hernández Farías, Cristina Bosco, Viviana Patti, and Paolo Rosso. 2020. Multilingual stance detection in social media political debates. Computer Speech & Language, page 101075.

Elijah Mayfield and Alan W. Black. 2019. Stance classification, outcome prediction, and impact assessment: NLP tasks for studying group decision-making. In Proceedings of the Third Workshop on Natural Language Processing and Computational Social Science, pages 65–77.

Stefan Ollinger, Lorik Dumani, Premtim Sahitaj, Ralph Bergmann, and Ralf Schenkel. 2020. Same side stance classification task: Facilitating argument stance classification by fine-tuning a BERT model. arXiv preprint arXiv:2004.11163.

Kashyap Popat, Subhabrata Mukherjee, Andrew Yates, and Gerhard Weikum. 2019. STANCY: Stance classification based on consistency cues. arXiv preprint arXiv:1910.06048.

Anirban Sen, Manjira Sinha, Sandya Mannarswamy, and Shourya Roy. 2018. Stance classification of multi-perspective consumer health information. In Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, pages 273–281.

Chi Sun, Xipeng Qiu, Yige Xu, and Xuanjing Huang. 2019. How to fine-tune BERT for text classification? In China National Conference on Chinese Computational Linguistics, pages 194–206. Springer.

Penghui Wei and Wenji Mao. 2019. Modeling transferable topics for cross-target stance detection. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 1173–1176.

Shengping Zhou, Junjie Lin, Lianzhi Tan, and Xin Liu. 2019. Condensed convolution neural network by attention over self-attention for stance detection in Twitter. In 2019 International Joint Conference on Neural Networks (IJCNN), pages 1–8. IEEE.