=Paper= {{Paper |id=Vol-1984/Mediaeval_2017_paper_43 |storemode=property |title=Flood detection using Social Media Data and Spectral Regression based Kernel Discriminant Analysis |pdfUrl=https://ceur-ws.org/Vol-1984/Mediaeval_2017_paper_43.pdf |volume=Vol-1984 |authors=Muhammad Hanif,Muhammad Atif Tahir,Mahrukh Khan,Muhammad Rafi |dblpUrl=https://dblp.org/rec/conf/mediaeval/HanifTKR17 }} ==Flood detection using Social Media Data and Spectral Regression based Kernel Discriminant Analysis== https://ceur-ws.org/Vol-1984/Mediaeval_2017_paper_43.pdf
               Flood detection using Social Media Data and Spectral
                  Regression based Kernel Discriminant Analysis
                       Muhammad Hanif, Muhammad Atif Tahir, Mahrukh Khan, Mohammad Rafi
                                                        School of Computer Science
                            National University of Computer and Emerging Sciences, Karachi Campus, Pakistan
                                      {k153506,atif.tahir,mahrukh.khan,muhammad.rafi}@nu.edu.pk

Copyright held by the owner/author(s).
MediaEval’17, 13-15 September 2017, Dublin, Ireland

ABSTRACT
Natural disasters destroy valuable resources, and they must be
recognized quickly so that appropriate response strategies can be
designed. In recent years, social networks have become a very good
source of event-specific information. This working notes paper
addresses the Disaster Image Retrieval from Social Media (DIRSM) task,
part of MediaEval 2017. The dataset of images and their associated
metadata is taken from various social networks, including Twitter and
Flickr. An ensemble approach is adopted in which different visual and
metadata features are integrated. Kernel Discriminant Analysis using
Spectral Regression is then used as a dimensionality reduction
technique. Mean Average Precision (MAP) at various cutoffs is reported.


1    INTRODUCTION
The solution is submitted to the Multimedia Satellite Task at
MediaEval 2017, for the subtask "Disaster image retrieval from social
media" [1]. The goal of the challenge is to retrieve images that show
direct evidence of flooding, ranked in descending order of the
predicted probability.

2    APPROACH
Information is provided in two forms: visual features and textual
metadata for each image. To process the visual data, the features are
combined into an ensemble and classification is performed at a later
stage. Figure 1 shows our proposed approach. In this approach, all
visual features provided by the organizers [1], including Auto Color
Correlogram, Edge Histogram, Tamura, etc., are integrated. Kernel
Discriminant Analysis using Spectral Regression [2] is then used for
classification. Spectral methods are now established as an effective
technique for both manifold learning and dimensionality reduction [2].
Spectral Regression in combination with Kernel Discriminant Analysis
(SRKDA) has proved successful in many classification tasks, such as
multilabel classification and action recognition [3, 4]. Large matrix
decomposition becomes less complex in this method due to spectral
graph analysis. We also investigated other machine learning
techniques, such as Random Forest and Support Vector Machine. On
validation data with 10-fold cross-validation, the best results were
obtained using SRKDA, which was therefore adopted for this task.

Figure 1: Data flow diagram for visual feature processing.

    In the metadata, the most valuable features, the 'usertags', are
selected for prediction. To enhance effectiveness, stop words are
removed from the selected features. Afterwards, for each image, the
Term Frequency-Inverse Document Frequency (TF-IDF) of every user tag
is calculated and the resulting matrix of TF-IDF vectors is saved
[5, 6]. The matrix is analyzed with SRKDA to predict the confidence of
the respective class for each image, as shown in Figure 2.

Figure 2: Data flow diagram for metadata feature processing.

    Moreover, the metadata and all visual features are combined into a
single ensemble and processed through SRKDA, as shown in Figure 3.

3    RESULTS AND ANALYSIS
Flood prediction is performed with the SRKDA algorithm on three
feature sets for each image: visual features, metadata, and the
combination of both.

3.1    Parameter Tuning using Training Data
The most important parameter in our approach is the choice of kernel.
Leave-one-out cross-validation is used to select the best kernel in
KDA from among the city block, Euclidean and chi-squared kernels.
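
The kernel selection step can be illustrated with a minimal sketch.
It assumes each kernel has the form exp(-gamma * d(x, y)) for one of
the three distances named above, and it scores each kernel by
leave-one-out accuracy. The classifier used here (assign the class
with the highest mean kernel similarity) is only a simple stand-in
for SRKDA, and the function names, the toy data and the gamma value
passed in are illustrative assumptions rather than part of the
submitted system.

```python
import math


def cityblock(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def chi_squared(x, y):
    # Chi-squared distance, intended for non-negative histogram features.
    return sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)


DISTANCES = {
    "cityblock": cityblock,
    "euclidean": euclidean,
    "chi-squared": chi_squared,
}


def kernel(x, y, dist, gamma):
    # Assumed kernel form: similarity decays exponentially with distance.
    return math.exp(-gamma * dist(x, y))


def loo_accuracy(X, labels, dist, gamma):
    """Leave-one-out accuracy of a stand-in kernel classifier (not SRKDA)."""
    correct = 0
    for i, (x, true_label) in enumerate(zip(X, labels)):
        per_class = {}
        for j, (z, label) in enumerate(zip(X, labels)):
            if j == i:
                continue  # hold out sample i
            per_class.setdefault(label, []).append(kernel(x, z, dist, gamma))
        predicted = max(
            per_class, key=lambda c: sum(per_class[c]) / len(per_class[c])
        )
        correct += predicted == true_label
    return correct / len(X)


def select_kernel(X, labels, gamma):
    """Return the best-scoring kernel name and all leave-one-out scores."""
    scores = {
        name: loo_accuracy(X, labels, dist, gamma)
        for name, dist in DISTANCES.items()
    }
    return max(scores, key=scores.get), scores


# Toy two-bin histograms: label 1 = flood evidence, 0 = no evidence.
X = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.1, 0.9], [0.2, 0.8], [0.3, 0.7]]
labels = [1, 1, 1, 0, 0, 0]
best_name, scores = select_kernel(X, labels, gamma=0.05)
```

On real data the three scores would generally differ, and the same
loop can be repeated over a grid of gamma values.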

Results for the visual features are collected by tuning different
parameters of SRKDA; the best results were obtained using the city
block distance and a Gamma value of 0.05 in KDA. Average precision is
calculated at cutoffs k = 50, 100, 200, 300, 400 and 500, as shown in
Figure 4 and Table 1. Average precision was quite high: we were able
to obtain around 0.94 using the fusion of visual features and
metadata.

Table 1: Average Precision at various cutoffs using validation data
and leave one out cross validation.

         AP at various cutoffs    Visual    Meta      Fusion
                  50              0.9505    0.9617    0.9544
                 100              0.9517    0.9551    0.9539
                 200              0.9462    0.9441    0.9492
                 300              0.9424    0.9385    0.9454
                 400              0.9386    0.9314    0.942
                 500              0.9335    0.9252    0.9374

Figure 3: Data flow diagram for fusion of metadata and visual feature
processing.

Table 2: Mean over Average Precisions at different cutoffs (50, 100,
250, 480), independently evaluated by the organizers. Values are
percentages.

         Run 1    Run 2    Run 3
         80.98    71.79    80.84

Figure 5: Top 4 images retrieved by the proposed system.

3.2    Results on Test Data
The proposed system is then evaluated on test data, and the predicted
values for all test images are submitted to the organizers for
independent evaluation. Using visual features, we were able to obtain
an average precision of around 0.649 at the 480 cutoff point, while
for metadata we were able to obtain an average precision of around
0.65. The fusion of visual features and metadata gives an average
precision of around 0.646, which is surprising, as we were expecting
better results from fusion. Exploring the reasons behind the poor
performance of fusion is part of future work. Table 2 shows the mean
over average precisions at different cutoffs. The best result, around
80.98%, was obtained using visual features. Figure 5 shows the top 4
images correctly retrieved by the system. Future work aims to identify
images that are not correctly retrieved by the system and to
investigate deep learning approaches to improve the overall system.
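
For reference, the evaluation measure reported in this section can be
sketched as follows. This is a minimal illustration, assuming that
average precision at cutoff k is the mean of the precision values at
each rank where a relevant image appears within the top k; the
organizers' evaluation script may normalize differently. The function
names and the toy scores are hypothetical.

```python
def rank_by_score(scores, labels):
    """Order relevance labels by descending predicted score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


def average_precision_at_k(ranked_relevance, k):
    """Mean of precision@rank over the relevant items in the top k."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance[:k], start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_ap_over_cutoffs(ranked_relevance, cutoffs):
    """Average the AP@k values over several cutoffs."""
    values = [average_precision_at_k(ranked_relevance, k) for k in cutoffs]
    return sum(values) / len(values)


# Toy example: five images with predicted flood confidence and ground truth.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 1, 0]
ranked = rank_by_score(scores, labels)
ap_at_2 = average_precision_at_k(ranked, 2)
ap_at_4 = average_precision_at_k(ranked, 4)
mean_ap = mean_ap_over_cutoffs(ranked, (2, 4))
```

With the task's cutoffs, the last call would use (50, 100, 250, 480)
on the full ranked test list.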
    The experiments using SRKDA indicate that fusion of visual and
textual features produces better results on the validation set. On
test data, however, performance drops when visual and textual features
are fused. For metadata in particular, prediction could be improved by
using all of the provided metadata, including title and description.
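
The textual representation and its fusion with visual features can be
sketched as below. This is a minimal illustration under stated
assumptions: the stop-word list is a tiny placeholder, the IDF variant
log(N / df) + 1 is one common choice, and fusion is shown as simple
concatenation of the two feature vectors, which the paper does not
specify. All names and the toy data are hypothetical.

```python
import math
from collections import Counter

# Placeholder stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "of", "in", "and"}


def tfidf_matrix(tag_lists):
    """Return (vocabulary, rows) with one TF-IDF row per image."""
    docs = [
        [t.lower() for t in tags if t.lower() not in STOP_WORDS]
        for tags in tag_lists
    ]
    vocab = sorted({t for doc in docs for t in doc})
    n_docs = len(docs)
    # Document frequency: number of images whose tags contain the term.
    doc_freq = Counter(t for doc in docs for t in set(doc))
    idf = {t: math.log(n_docs / doc_freq[t]) + 1.0 for t in vocab}
    rows = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty tag lists
        rows.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, rows


def fuse(visual_rows, text_rows):
    """Assumed early fusion: concatenate visual and TF-IDF features."""
    return [v + t for v, t in zip(visual_rows, text_rows)]


# Toy user tags and two-dimensional visual features for three images.
tag_lists = [["flood", "river", "the"], ["flood", "street"], ["beach", "sun"]]
visual = [[0.2, 0.8], [0.3, 0.7], [0.9, 0.1]]
vocab, text_rows = tfidf_matrix(tag_lists)
fused = fuse(visual, text_rows)
```

Title and description could be added in the same way by appending
their tokens to each image's tag list before calling tfidf_matrix.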

Figure 4: Best results found for visual, metadata and fusion of
visual / metadata. Leave one out cross validation is used to select
best parameters on training data.

4    CONCLUSION
In this paper, we have presented our runs for the Disaster Image
Retrieval from Social Media (DIRSM) task. The SRKDA technique is
investigated to train and test the model using an ensemble of 6
different features. Our proposed system was able to obtain a mean
over average precisions at different cutoffs (50, 100, 250, 480) of
around 0.81.


REFERENCES
[1] Benjamin Bischke, Patrick Helber, Christian Schulze, Venkat
    Srinivasan, Andreas Dengel, and Damian Borth, 2017. "The Multimedia
    Satellite Task at MediaEval 2017: Emergency Response for Flooding
    Events", Proc. of the MediaEval 2017 Workshop, Dublin, Ireland.
[2] Deng Cai, Xiaofei He, and Jiawei Han, 2011. "Speed up kernel
    discriminant analysis", The VLDB Journal, 20(1), 21–33.
[3] Muhammad Atif Tahir, Fei Yan, Peter Koniusz, Muhammad Awais, Mark
    Barnard, Krystian Mikolajczyk, Ahmed Bouridane, and Josef Kittler,
    2013. "A robust and scalable visual category and action recognition
    system using kernel discriminant analysis with spectral
    regression", IEEE Transactions on Multimedia, 15(7).
[4] Muhammad Atif Tahir, Josef Kittler, and Ahmed Bouridane, 2016.
    "Multilabel classification using stacked spectral kernel
    discriminant analysis", Neurocomputing, 171, 127–137.
[5] Max Bramer, 2013. "Principles of Data Mining", Springer-Verlag
    London.
[6] Vineet John and Olga Vechtomova, 2017. "Sentiment Analysis on
    Financial News Headlines using Training Dataset Augmentation",
    Proceedings of the 11th International Workshop on Semantic
    Evaluation.