=Paper= {{Paper |id=Vol-3181/paper22 |storemode=property |title=Deep Models for Visual Sentiment Analysis of Disaster-related Multimedia Content |pdfUrl=https://ceur-ws.org/Vol-3181/paper22.pdf |volume=Vol-3181 |authors=Khubaib Ahmad,Muhammad Asif Ayub,Kashif Ahmad,Ala Al-Fuqaha,Nasir Ahmad |dblpUrl=https://dblp.org/rec/conf/mediaeval/AhmadAAAA21 }} ==Deep Models for Visual Sentiment Analysis of Disaster-related Multimedia Content== https://ceur-ws.org/Vol-3181/paper22.pdf
    Deep Models for Visual Sentiment Analysis of Disaster-related
                        Multimedia Content
Khubaib Ahmad¹, Muhammad Asif Ayub¹, Kashif Ahmad², Ala Al-Fuqaha², Nasir Ahmad¹
¹ Department of Computer Systems Engineering, University of Engineering and Technology, Peshawar, Pakistan.
² Division of Information and Computing Technology, College of Science and Engineering, Hamad Bin Khalifa University, Qatar Foundation, Doha, Qatar.
           {khubaibtakkar,asifayub836}@gmail.com,{kahmad,aalfuqaha}@hbku.edu.qa,n.ahmad@uetpeshawar.edu.pk

ABSTRACT
This paper presents our solutions for the MediaEval 2021 task "Visual Sentiment Analysis: A Natural Disaster Use-case". The task aims to extract and classify the sentiments perceived by viewers and the emotional messages conveyed by natural disaster-related images shared on social media. It is composed of three subtasks: one single-label multi-class image classification subtask and two multi-label multi-class image classification subtasks, each covering a different set of labels. Our proposed solutions mainly rely on two state-of-the-art models, Inception-v3 and VggNet-19, pre-trained on ImageNet. Both pre-trained models are fine-tuned for each of the three subtasks using different strategies. Overall, encouraging results are obtained on all three subtasks. On the single-label classification subtask (subtask 1), we obtained weighted average F1-scores of 0.540 and 0.526 for the Inception-v3 and VggNet-19 based solutions, respectively. On the multi-label classification subtasks (subtask 2 and subtask 3), the weighted F1-scores of our Inception-v3 based solutions are 0.572 and 0.516, respectively, while those of our VggNet-19 based solutions are 0.584 and 0.495.

1 INTRODUCTION
Over the last few years, the analysis of natural disasters in social media outlets has been one of the most active areas of research. During this time, several interesting solutions exploring different aspects of natural disasters have been proposed [9]. Key aspects explored in the literature include disaster detection [8], disaster news dissemination [1], and disaster severity and damage assessment [2, 7]. Some efforts on the sentiment analysis of natural disaster-related social media posts have also been reported; however, most of the work in this regard is based on textual information [3]. More recently, Hassan et al. [5] introduced the concept of visual sentiment analysis of natural disaster-related images by proposing a deep sentiment analyzer. The topic remains very challenging, and several aspects of visual sentiment analysis of natural disaster-related visual content have yet to be explored. As part of their efforts to further explore the topic, the authors proposed the "Visual Sentiment Analysis: A Natural Disaster Use-case" task at MediaEval 2021 [6].

This paper provides the details of the solutions proposed by team CSE-Innoverts for the visual sentiment analysis task. The task is composed of three subtasks: (i) a single-label multi-class classification task with three labels, (ii) a multi-label multi-class classification task with seven labels, and (iii) a multi-label multi-class classification task with 11 labels. In the first subtask, participants need to classify an image into negative, positive, or neutral sentiment. In the second subtask, the proposed solution aims to differentiate among joy, sadness, fear, disgust, anger, surprise, and neutral. The final subtask is composed of 11 labels, including anger, anxiety, craving, empathetic pain, fear, horror, joy, relief, sadness, and surprise.

Copyright 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
MediaEval'21, December 13-15 2021, Online

2 PROPOSED APPROACH
2.1 Methodology for Single-label Classification task (subtask 1)
For the first task, we mainly rely on two different Convolutional Neural Network (CNN) architectures, namely Inception-v3 and VggNet, based on their proven performance in similar tasks [10]. Since the available dataset is not large enough to train the models from scratch, we fine-tuned existing models pre-trained on the ImageNet dataset [4]. In the literature, models pre-trained on the ImageNet and Places [11] datasets are generally fine-tuned for image classification tasks; our choice for the current implementation is based on the better performance of models pre-trained on ImageNet in similar tasks [10]. In this work, the models are fine-tuned for 50 epochs using the Adam optimizer with a learning rate of 0.0001.
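To make the training setup concrete, the snippet below gives a minimal sketch of how such a fine-tuning run can be set up in TensorFlow/Keras, assuming an Inception-v3 backbone with ImageNet weights; the input size, head definition, and data pipeline are illustrative assumptions rather than the exact code behind the submitted runs.

# Minimal sketch of the fine-tuning setup described above (TensorFlow/Keras);
# input size, head, and data pipeline are illustrative assumptions.
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import InceptionV3

NUM_CLASSES = 3  # subtask 1: negative, neutral, positive

# ImageNet-pre-trained backbone without its original classification head.
base = InceptionV3(weights="imagenet", include_top=False,
                   pooling="avg", input_shape=(299, 299, 3))

# New top layer: a 3-way softmax for single-label classification.
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(base.output)
model = models.Model(inputs=base.input, outputs=outputs)

# Fine-tune for 50 epochs with Adam and a learning rate of 0.0001.
model.compile(optimizer=optimizers.Adam(learning_rate=1e-4),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=50)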
It is important to mention that the provided dataset is imbalanced, with a large number of negative samples and fewer samples available in the neutral class. Before fine-tuning the models, we therefore applied an up-sampling technique to balance the dataset. Moreover, some data augmentation techniques are also employed to further increase the number of training samples by cropping, rotating, and flipping the image patches.
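As a rough illustration, the sketch below shows one way such balancing and augmentation steps can be realized; the random-duplication strategy, crop size, and rotation choices are our own assumptions and are not taken from the original implementation.

# Hypothetical helpers for balancing and augmenting the training set;
# the exact up-sampling and augmentation parameters are illustrative.
import random
from collections import defaultdict

import tensorflow as tf

def upsample(paths, labels):
    """Duplicate randomly chosen minority-class samples until every class
    reaches the size of the largest class."""
    by_class = defaultdict(list)
    for path, label in zip(paths, labels):
        by_class[label].append(path)
    target = max(len(items) for items in by_class.values())
    out_paths, out_labels = [], []
    for label, items in by_class.items():
        balanced = items + random.choices(items, k=target - len(items))
        out_paths.extend(balanced)
        out_labels.extend([label] * target)
    return out_paths, out_labels

def augment(image):
    """Crop, rotate, and flip an image tensor to create extra samples."""
    image = tf.image.resize(image, (320, 320))
    image = tf.image.random_crop(image, size=(299, 299, 3))
    image = tf.image.rot90(image, k=random.randint(0, 3))
    image = tf.image.random_flip_left_right(image)
    return image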
2.2 Methodology for Multi-label Classification tasks (subtask 2 and subtask 3)
We used the same strategy of fine-tuning existing pre-trained state-of-the-art models for subtask 2 and subtask 3. However, several changes are made to deal with multi-label classification. For instance, the top layers of the models are extended to support the multi-label classification tasks. Moreover, the sigmoid cross-entropy loss function is used so that every component of the CNN output vector is treated independently.
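A minimal sketch of such an extended multi-label head is given below, assuming the same Keras-style setup as for subtask 1; the label count and input size are illustrative for subtask 2, and the sigmoid cross-entropy loss corresponds to binary cross-entropy over per-label sigmoid outputs.

# Sketch of the extended top layers for the multi-label subtasks, assuming a
# VggNet-19 backbone pre-trained on ImageNet; details are illustrative.
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import VGG19

NUM_LABELS = 7  # subtask 2; set to 11 for subtask 3

base = VGG19(weights="imagenet", include_top=False,
             pooling="avg", input_shape=(224, 224, 3))

# Sigmoid outputs so that every label is predicted independently.
outputs = layers.Dense(NUM_LABELS, activation="sigmoid")(base.output)
model = models.Model(inputs=base.input, outputs=outputs)

# Sigmoid cross-entropy (binary cross-entropy in Keras) treats each component
# of the output vector as an independent binary decision.
model.compile(optimizer=optimizers.Adam(learning_rate=1e-4),
              loss="binary_crossentropy",
              metrics=["binary_accuracy"])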
Similar to subtask 1, the distribution of samples across the sentiment categories covered in subtask 2 and subtask 3 is not balanced. To this end, the same strategy of up-sampling the minority classes is used to balance the dataset, and the data augmentation techniques are also employed in these subtasks.

3 RESULTS AND ANALYSIS
3.1 Evaluation Metric
We used two different metrics to evaluate the proposed solutions. On the test set, the evaluations are carried out in terms of the weighted F1-score, which is the official evaluation metric of the task. On the development set, we used binary accuracy as an additional metric along with the weighted F1-score. For computing the scores in the multi-label classification tasks, we used the default threshold (i.e., 0.5).
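As an illustration of how these scores can be computed, the short sketch below binarizes per-label scores at the default 0.5 threshold and evaluates them with scikit-learn's weighted F1-score; the arrays are random placeholders standing in for the actual labels and model outputs.

# Hedged example of the scoring procedure; y_true / y_prob are placeholders.
import numpy as np
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(487, 7))  # multi-hot ground truth (dev-set size)
y_prob = rng.random(size=(487, 7))          # per-label sigmoid scores from the model

# Binarize at the default threshold of 0.5, then compute the weighted F1-score,
# the official metric of the task.
y_pred = (y_prob >= 0.5).astype(int)
print("Weighted F1-score:", f1_score(y_true, y_pred, average="weighted"))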
3.2 Experimental Results on the development set
Table 1 provides the experimental results of our proposed solutions on the development set. It is important to note that our validation set in these experiments is composed of 487 samples. As can be seen, overall better results are obtained on the single-label classification subtask 1, which is composed of three classes only. As we go deeper into the hierarchy of sentiment categories/classes, the performance of the algorithms decreases because the inter-class variation decreases. As far as the performance of the models is concerned, Inception-v3 shows significant improvements over VggNet-19 on subtask 1 and subtask 2, while comparable results are obtained on subtask 3.

Table 1: Evaluation of our proposed solutions on the development set in terms of binary accuracy and weighted F1-score.

                            Binary Accuracy                        Weighted F1-score
  Runs                      Subtask 1   Subtask 2   Subtask 3      Subtask 1   Subtask 2   Subtask 3
  Run 1 (Inception-v3)      0.722       0.664       0.675          0.714       0.588       0.479
  Run 2 (VggNet-19)         0.750       0.710       0.628          0.666       0.535       0.479

3.3 Experimental Results on the test set
Table 2 presents the official results of our proposed solutions on the test set. Surprisingly, overall better results are obtained on a multi-label classification task (subtask 2) for both models. On the other hand, similar to the development set, the lowest performance of both models is observed on subtask 3. As far as the performance of the models is concerned, the Inception-v3 based solution outperformed the VggNet-19 based solution on subtask 1 and subtask 3, while comparable results are obtained on subtask 2.

Table 2: Evaluation of our proposed solutions on the test set in terms of weighted F1-score.

  Runs                      Subtask 1   Subtask 2   Subtask 3
  Run 1 (Inception-v3)      0.540       0.572       0.516
  Run 2 (VggNet-19)         0.526       0.584       0.495
  Highest Score             0.771       0.627       0.583

4 CONCLUSIONS AND FUTURE WORK
The challenge is composed of three tasks: a single-label and two multi-label image classification tasks with different sets of labels. The first task covers the conventional three categories/labels generally used to represent sentiments, while the other two tasks cover sets of labels more specific to natural disasters. These three sets of labels allow different aspects of the domain to be explored, and the complexity of the task increases when going deeper into the sentiment hierarchy. For all the tasks, we rely on two state-of-the-art deep architectures, namely Inception-v3 and VggNet-19. To this end, the models pre-trained on the ImageNet dataset are fine-tuned on the development dataset.

In the current implementations, we rely on object-level information only by employing models pre-trained on the ImageNet dataset. We believe scene-level features could also contribute to the task. In the future, we aim to jointly utilize both object-level and scene-level information for better performance on all the tasks. Moreover, we aim to employ merit-based fusion schemes that consider the contribution of the individual models to the tasks.


REFERENCES
[1] Kashif Ahmad, Michael Riegler, Konstantin Pogorelov, Nicola Conci, Pål Halvorsen, and Francesco De Natale. 2017. Jord: a system for collecting information and monitoring natural disasters by linking social media with satellite imagery. In Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing. 1–6.
[2] Firoj Alam, Ferda Ofli, Muhammad Imran, Tanvirul Alam, and Umair Qazi. 2020. Deep Learning Benchmarks and Datasets for Social Media Image Classification for Disaster Response. In 2020 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM). IEEE, 151–158.
[3] Ghazaleh Beigi, Xia Hu, Ross Maciejewski, and Huan Liu. 2016. An overview of sentiment analysis in social media and its applications in disaster relief. Sentiment Analysis and Ontology Engineering (2016), 313–340.
[4] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 248–255.
[5] Syed Zohaib Hassan, Kashif Ahmad, Ala Al-Fuqaha, and Nicola Conci. 2019. Sentiment analysis from images of natural disasters. In International Conference on Image Analysis and Processing. Springer, 104–113.
[6] Syed Zohaib Hassan, Kashif Ahmad, Michael Riegler, Steven Hicks, Nicola Conci, Pal Halvorsen, and Ala Al-Fuqaha. 2021. Visual Sentiment Analysis: A Natural Disaster Use-case Task at MediaEval 2021. In Proceedings of the MediaEval 2021 Workshop, Online.
[7] Nayomi Kankanamge, Tan Yigitcanlar, Ashantha Goonetilleke, and Md Kamruzzaman. 2020. Determining disaster severity through social media analysis: Testing the methodology with South East Queensland flood tweets. International Journal of Disaster Risk Reduction 42 (2020), 101360.
[8] Naina Said, Kashif Ahmad, Nicola Conci, and Ala Al-Fuqaha. 2021. Active learning for event detection in support of disaster analysis applications. Signal, Image and Video Processing (2021), 1–8.
[9] Naina Said, Kashif Ahmad, Michael Riegler, Konstantin Pogorelov, Laiq Hassan, Nasir Ahmad, and Nicola Conci. 2019. Natural disasters detection in social media and satellite imagery: a survey. Multimedia Tools and Applications 78, 22 (2019), 31267–31302.
[10] Naina Said, Konstantin Pogorelov, Kashif Ahmad, Michael Riegler, Nasir Ahmad, Olga Ostroukhova, Pål Halvorsen, and Nicola Conci. 2018. Deep Learning Approaches for Flood Classification and Flood Aftermath Detection. In MediaEval.
[11] Bolei Zhou, Agata Lapedriza, Aditya Khosla, Aude Oliva, and Antonio Torralba. 2017. Places: A 10 million image database for scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 40, 6 (2017), 1452–1464.