=Paper=
{{Paper
|id=Vol-3181/paper23
|storemode=property
|title=NLP Techniques for Water Quality Analysis in Social Media Content
|pdfUrl=https://ceur-ws.org/Vol-3181/paper23.pdf
|volume=Vol-3181
|authors=Muhammad Asif Ayub,Khubaib Ahmad,Kashif Ahmad,Nasir Ahmad,Ala Al-Fuqaha
|dblpUrl=https://dblp.org/rec/conf/mediaeval/AyubAAAA21
}}
==NLP Techniques for Water Quality Analysis in Social Media Content==
Muhammad Asif Ayub<sup>1</sup>, Khubaib Ahmad<sup>1</sup>, Kashif Ahmad<sup>2</sup>, Nasir Ahmad<sup>1</sup>, Ala Al-Fuqaha<sup>2</sup>

<sup>1</sup> Department of Computer Systems Engineering, University of Engineering and Technology, Peshawar, Pakistan.

<sup>2</sup> Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, Doha, Qatar.

{khubaibtakkar,asifayub836}@gmail.com, {kahmad,aalfuqaha}@hbku.edu.qa, n.ahmad@uetpeshawar.edu.pk

Copyright 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). MediaEval'21, December 13-15 2021, Online

===ABSTRACT===
This paper presents our contributions to the MediaEval 2021 task "WaterMM: Water Quality in Social Multimedia". The task aims at analyzing social media posts relevant to water quality, with a particular focus on aspects such as water color, smell, taste, and related illnesses. To this aim, a multimodal dataset containing both textual and visual information along with meta-data is provided. Considering the quality and quantity of the available content, we mainly focus on textual information by employing three different models individually and jointly in a late-fusion manner. These models include (i) Bidirectional Encoder Representations from Transformers (BERT), (ii) Robustly Optimized BERT Pre-training Approach (XLM-RoBERTa), and (iii) a custom Long Short-Term Memory (LSTM) model, obtaining overall F1-scores of 0.794, 0.717, and 0.663 on the official test set, respectively. In the fusion scheme, all the models are treated equally, and no significant improvement in performance is observed over the best-performing individual model.

===1 INTRODUCTION===
In recent years, social media has emerged as a valuable tool and platform to discuss and convey concerns over different challenges and daily life issues [1]. The literature covers a diversified list of societal, environmental, and technological topics discussed in social media outlets, such as racism and hate speech [6], public health [7], natural disasters and rehabilitation [8], and technological conspiracies [4]. More recently, there have been debates in social networks on environmental issues, especially the quality of air and drinking water in different parts of the world. The discussions generally revolve around topics such as strange color, smell, bad taste, and diseases caused by polluted water. This information could help in several ways. For instance, it can serve as valuable feedback for public authorities on the water distribution network. However, extracting information from such informal sources is very challenging. Social media posts containing water-quality-related keywords do not necessarily represent discussions on polluted water. In this regard, Machine Learning (ML) and Natural Language Processing (NLP) techniques could be employed to automatically analyze the posts and filter out irrelevant ones. In order to explore the potential of ML and NLP techniques in this challenging problem, a task named "WaterMM: Water Quality in Social Multimedia" has been introduced in the MediaEval 2021 benchmark competition [2].

This paper provides a detailed description of the methods proposed by team CSE-Innoverts for the water quality analysis task of MediaEval. The dataset provided for the task covers multi-modal information including textual, visual, and meta-data. However, images are available for very few posts. Moreover, the majority of the available images are not relevant. Thus, we mainly focus on textual information by proposing four different solutions as detailed in Section 2.

===2 PROPOSED APPROACHES===
In total, we submitted four different runs by employing three different Neural Network (NN) architectures, namely BERT [3], XLM-RoBERTa [5], and LSTM, individually and jointly in a late fusion scheme. Run 1 is based on late fusion, where we jointly employed the models by aggregating the classification scores obtained with the individual models. Figure 1 provides the block diagram of the proposed methodology for Run 1. Run 2, Run 3, and Run 4 are based on the individual models, namely BERT, XLM-RoBERTa, and LSTM, respectively. The details of the individual model-based solutions are provided below.

* BERT-based Solution (Run 2): In this solution, we rely on a pre-trained BERT model, which is fine-tuned on the development set provided by the task organizers. Before fine-tuning the model, the necessary pre-processing is performed, using TensorFlow libraries, to bring the data into the form required for training the model. Since it is a binary classification task, we used the binary cross-entropy loss function with the Adaptive Moments (Adam) optimizer.
* XLM-RoBERTa-based Solution (Run 3): In this approach, we rely on the multilingual pre-trained XLM-RoBERTa model, which is fine-tuned on the development set. As a first step, the input text is tokenized in the pre-processing phase. The pre-trained model is then fine-tuned on the pre-processed data using the Adam optimizer with a binary cross-entropy loss function.
* LSTM-based Solution (Run 4): In this approach, we rely on a custom LSTM model. The model is composed of three layers: an input layer, an LSTM layer, and an output layer. We used this model as a baseline for our experiments. However, the model obtained encouraging results on the development set and was thus utilized in the fusion scheme.

We also cleaned the data before feeding it into the models by removing URLs, account handles, emojis, and unnecessary punctuation. Moreover, in all the proposed solutions, we used an up-sampling technique to balance the dataset.

[Figure 1: Block diagram of the proposed methodology. Blocks shown: Input Text, Models, Late Fusion, Predicted_Label.]

===3 RESULTS AND ANALYSIS===
====3.1 Evaluation Metric====
For the evaluation of the proposed methods, we used four different metrics, namely (i) accuracy, (ii) micro precision, (iii) micro recall, and (iv) micro F1-score. Precision, recall, and F1-score are the official metrics, while accuracy has been used as an additional metric for the evaluation of the methods on the development set.

====3.2 Experimental Results on the Development Set====
Table 1 provides the experimental results of our proposed solutions on the development set. To this aim, a separate validation set composed of 1,810 samples is used. Run 1 represents our fusion-based solution, while Run 2, Run 3, and Run 4 represent our solutions based on the individual models, namely BERT, XLM-RoBERTa, and LSTM, respectively. On the development set, the best results are obtained with the BERT-based solution, with an overall F1-score and accuracy of 0.950 and 0.929, respectively. The lowest F1-score and accuracy are observed for XLM-RoBERTa.

{| class="wikitable"
|+ Table 1: Evaluation of our proposed solutions on the development set in terms of precision, recall, F1-score, and accuracy.
! Runs !! Precision !! Recall !! F1-Score !! Accuracy
|-
| Run 1 || 0.950 || 0.925 || 0.938 || 0.914
|-
| Run 2 || 0.949 || 0.950 || 0.950 || 0.929
|-
| Run 3 || 0.862 || 0.900 || 0.881 || 0.836
|-
| Run 4 || 0.885 || 0.947 || 0.915 || 0.885
|}

====3.3 Experimental Results on the Test Set====
Table 2 provides the official results on the test set in terms of precision, recall, and F1-score. Among the individual model-based solutions, the best results are obtained for BERT, while the lowest scores are observed for the LSTM-based solution. Interestingly, however, the fusion-based solution shows no significant improvement over the best-performing individual model-based solution. One possible reason is the influence of the low-performing models, since all the models are treated equally by simply aggregating the obtained posterior probabilities. This limitation could be addressed by using merit-based fusion, where weights are assigned to the contributing models based on their performance.

{| class="wikitable"
|+ Table 2: Evaluation of our proposed solutions on the test set in terms of micro precision, recall, and F1-score.
! Runs !! Precision !! Recall !! F1-Score
|-
| Run 1 || 0.732 || 0.866 || 0.794
|-
| Run 2 || 0.732 || 0.866 || 0.794
|-
| Run 3 || 0.606 || 0.877 || 0.717
|-
| Run 4 || 0.565 || 0.801 || 0.663
|}

===4 CONCLUSIONS AND FUTURE WORK===
The quantity and quality of the images associated with the social media posts were not good enough to contribute to the task. Thus, we focused on the textual information only by employing several NN-based solutions. In total, we proposed four different solutions: one fusion-based solution and three solutions based on individual models. In the current implementation, we used a simple fusion mechanism that aggregates the posterior probabilities obtained with each individual model. In the future, we aim to employ more sophisticated fusion schemes by assigning merit-based weights to the contributing models. We also aim to make use of the additional information available in the form of metadata in our future fusion-based solutions.

===REFERENCES===
[1] Kashif Ahmad, Konstantin Pogorelov, Michael Riegler, Nicola Conci, and Pal Halvorsen. 2019. Social media and satellites: Disaster event detection, linking and summarization. Multimedia Tools and Applications 78, 3 (2019), 2837–2875.

[2] Stelios Andreadis, Ilias Gialampoukidis, Aristeidis Bozas, Anastasia Moumtzidou, Roberto Fiorin, Francesca Lombardo, Anastasios Karakostas, Daniele Norbiato, Stefanos Vrochidis, Michele Ferri, and Ioannis Kompatsiaris. 2021. WaterMM: Water Quality in Social Multimedia Task at MediaEval 2021. In Proceedings of the MediaEval 2021 Workshop, Online.

[3] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).

[4] Abdullah Hamid, Nasrullah Shiekh, Naina Said, Kashif Ahmad, Asma Gul, Laiq Hassan, and Ala Al-Fuqaha. 2020. Fake news detection in social media using graph neural networks and NLP Techniques: A COVID-19 use-case. arXiv preprint arXiv:2012.07517 (2020).

[5] Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019).

[6] Ariadna Matamoros-Fernández and Johan Farkas. 2021. Racism, Hate Speech, and Social Media: A Systematic Review and Critique. Television & New Media 22, 2 (2021), 205–224.

[7] Salman Bin Naeem, Rubina Bhatti, and Aqsa Khan. 2021. An exploration of how fake news is taking over social media and putting public health at risk. Health Information & Libraries Journal 38, 2 (2021), 143–149.

[8] Naina Said, Kashif Ahmad, Michael Riegler, Konstantin Pogorelov, Laiq Hassan, Nasir Ahmad, and Nicola Conci. 2019. Natural disasters detection in social media and satellite imagery: a survey. Multimedia Tools and Applications 78, 22 (2019), 31267–31302.
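
The paper states that posts were cleaned by removing URLs, account handles, emojis, and unnecessary punctuation, but it does not give the exact rules. The following is a minimal sketch of such a cleaning step. The regular expressions, the choice of which punctuation to keep, and the function name `clean_post` are all assumptions made for illustration, not the authors' implementation.

```python
import re

# All patterns below are illustrative assumptions, not the authors' exact rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
HANDLE_RE = re.compile(r"@\w+")
# Rough coverage of common emoji blocks, flags, and joiner/variation characters.
EMOJI_RE = re.compile(
    "[\U0001F300-\U0001FAFF\u2600-\u27BF\U0001F1E6-\U0001F1FF\uFE0F\u200D]"
)
# "Unnecessary punctuation" is interpreted here as anything other than
# word characters, whitespace, and basic sentence punctuation.
NOISE_RE = re.compile(r"[^\w\s.,!?'-]")
SPACES_RE = re.compile(r"\s+")


def clean_post(text: str) -> str:
    """Remove URLs, account handles, emojis, and stray symbols from a post."""
    text = URL_RE.sub(" ", text)      # URLs first, since they may contain '@'
    text = HANDLE_RE.sub(" ", text)
    text = EMOJI_RE.sub(" ", text)
    text = NOISE_RE.sub(" ", text)
    return SPACES_RE.sub(" ", text).strip()


if __name__ == "__main__":
    raw = "@waterwatch Tap water smells strange!! \U0001F637 https://t.co/abc123 #water"
    print(clean_post(raw))
```

The order of the substitutions matters: URLs are stripped before handles so that an `@` inside a link is not treated as an account mention.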
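
The paper mentions that an up-sampling technique was used to balance the dataset without naming the technique. One common option is random over-sampling with replacement, sketched below under that assumption. The function name `upsample`, the seed, and the toy data are hypothetical.

```python
import random
from collections import Counter


def upsample(texts, labels, seed=0):
    """Randomly duplicate minority-class samples until all classes are equal in size.

    This is random over-sampling with replacement, used here as an assumed
    stand-in for the unspecified up-sampling technique.
    """
    rng = random.Random(seed)
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)

    target = max(len(items) for items in by_label.values())
    pairs = []
    for label, items in by_label.items():
        # Draw extra samples (with replacement) only for under-represented classes.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        pairs.extend((text, label) for text in items + extra)

    rng.shuffle(pairs)
    return [t for t, _ in pairs], [y for _, y in pairs]


if __name__ == "__main__":
    toy_texts = ["a", "b", "c", "d", "e"]
    toy_labels = [0, 0, 0, 0, 1]
    new_texts, new_labels = upsample(toy_texts, toy_labels)
    print(Counter(new_labels))
```

Over-sampling should be applied to the training portion only, so that duplicated posts do not leak into the validation set.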
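
The late fusion used for Run 1 aggregates the posterior probabilities of the three models with equal weight, and the merit-based fusion proposed as future work would weight each model by its performance. The sketch below illustrates both under stated assumptions: aggregation is taken to be a (weighted) mean, the decision threshold of 0.5 is assumed, the example probabilities are made up, and using the development-set F1-scores from Table 1 as weights is only one hypothetical way to define merit.

```python
from typing import List, Optional, Sequence


def late_fusion(
    probs_per_model: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
    threshold: float = 0.5,
) -> List[int]:
    """Fuse per-model positive-class probabilities into binary labels.

    With weights=None every model counts equally (simple aggregation);
    otherwise a weighted mean is used (merit-based fusion).
    """
    n_models = len(probs_per_model)
    if weights is None:
        weights = [1.0] * n_models
    if len(weights) != n_models:
        raise ValueError("one weight per model is required")

    total = sum(weights)
    n_posts = len(probs_per_model[0])
    labels = []
    for i in range(n_posts):
        fused = sum(w * p[i] for w, p in zip(weights, probs_per_model)) / total
        labels.append(1 if fused >= threshold else 0)
    return labels


if __name__ == "__main__":
    # Made-up probabilities for three posts from BERT, XLM-RoBERTa, and LSTM.
    bert = [0.9, 0.4, 0.2]
    xlmr = [0.6, 0.7, 0.1]
    lstm = [0.3, 0.6, 0.4]

    print(late_fusion([bert, xlmr, lstm]))
    # Hypothetical merit weights: development-set F1-scores from Table 1.
    print(late_fusion([bert, xlmr, lstm], weights=[0.950, 0.881, 0.915]))
```

Because the three development-set F1-scores are close to each other, weights derived from them stay close to the equal-weight case. A merit-based scheme would need a sharper weighting to change many decisions.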