=Paper= {{Paper |id=Vol-1984/Mediaeval_2017_paper_2 |storemode=property |title=The Multimedia Satellite Task at MediaEval 2017 |pdfUrl=https://ceur-ws.org/Vol-1984/Mediaeval_2017_paper_2.pdf |volume=Vol-1984 |authors=Benjamin Bischke,Patrick Helber,Christian Schulze,Venkat Srinivasan,Andreas Dengel,Damian Borth |dblpUrl=https://dblp.org/rec/conf/mediaeval/BischkeHSSDB17 }} ==The Multimedia Satellite Task at MediaEval 2017== https://ceur-ws.org/Vol-1984/Mediaeval_2017_paper_2.pdf
                       The Multimedia Satellite Task at MediaEval 2017
                                                      Emergency Response for Flooding Events

                                      Benjamin Bischke1, 2 , Patrick Helber1, 2 , Christian Schulze1 ,
                                       Venkat Srinivasan3 , Andreas Dengel1, 2 , Damian Borth1
                         1 German Research Center for Artificial Intelligence (DFKI), Kaiserslautern, 67663, Germany
                                                      2 Technical University of Kaiserslautern, Germany
                                         3 Department of Computer Science, Virginia Tech, VA 24061, USA

                                   benjamin.bischke@dfki.de,patrick.helber@dfki.de,christian.schulze@dfki.de
                                        svenkat@vt.edu,andreas.dengel@dfki.de,damian.borth@dfki.de

ABSTRACT
This paper provides a description of the MediaEval 2017 Multimedia Satellite Task. The primary goal of the task is to extract and fuse content of events which are present in satellite imagery and social media. Establishing a link from satellite imagery to social multimedia can yield a comprehensive event representation which is vital for numerous applications. Focusing on natural disaster events this year, the main objective of the task is to leverage the combined event representation within the context of emergency response and environmental monitoring. In particular, the task focuses on flooding events and consists of two subtasks. The first subtask, Disaster Image Retrieval from Social Media, requires participants to retrieve images from social media which show direct evidence of a flooding event. The second subtask, Flood Detection in Satellite Images, aims to extract regions in satellite images which are affected by a flooding event. Extracted content from both subtasks can be fused by means of the geographic information. The task seeks to go beyond state-of-the-art flooding map generation towards recent deep learning approaches, while at the same time augmenting the satellite information with rich social multimedia.

1 INTRODUCTION
Recent advances in earth observation are opening up an exciting new area for the exploration of satellite image data. Programs like ESA Copernicus and NASA Landsat, as well as private companies like PlanetLabs or Digital Globe, provide access to such imagery for the first time. Large-scale datasets such as the EuroSAT-Dataset [4] or the ImageCLEFremote-Dataset [2] have emerged from these programs and encourage research in this direction to extract meaningful insights from this new data source. A proper analysis of these satellite images has the potential to change how agriculture, urbanization and environmental monitoring will be done in the future. Hand in hand with this development, the Multimedia Satellite Task at MediaEval 2017 addresses natural disaster and environmental monitoring, allowing situational awareness to be raised for such events. According to the United Nations Office for the Coordination of Humanitarian Affairs1, flooding events currently represent the most frequently observed natural disaster type on our planet. Given the significance of this natural disaster type, our Multimedia Satellite Task specifically focuses on flooding events this year.

One challenge when relying solely on remote sensing is the sparsity of satellite data over time. Due to the delayed receiving time of satellite imagery and the low temporal revisit time of a particular location by satellites, locations are often sparsely sensed, with missing information. In the context of natural disaster monitoring, where effects are often present at multiple locations at the same time, missing information represents a crucial problem, since humanitarian organizations and rescue efforts need to rely on up-to-date disaster maps.

In order to overcome this problem and provide an accurate and comprehensive view of the event, the objective of this task is to fuse satellite imagery with real-time multimedia content from social media. Our approach is motivated by previous work [1, 3, 7] which demonstrated the contextual enrichment of remote-sensed events in satellite imagery by leveraging contemporary content from social media. Our Multimedia Satellite Task constitutes a combination of satellite image processing and social media retrieval, where the particular challenges are addressed in two separate subtasks. Task participants are required to retrieve images which provide direct evidence of a flooding event from a given set of Flickr images. Beyond that, participants quantify the geospatial impact of the flooding events in the corresponding satellite images in the form of segmentation masks.

1 http://reliefweb.int/disasters

Copyright held by the owner/author(s).
MediaEval’17, 13-15 September 2017, Dublin, Ireland

2 TASK DETAILS
In the following, we define the two subtasks of our challenge.

Disaster Image Retrieval from Social Media.
The goal of the first subtask is to retrieve all images which show direct evidence of a flooding event from social media streams, independently of a particular event. The objective is to design an algorithm that, given any collection of multimedia images and their metadata (e.g., YFCC100M, Twitter, Wikipedia, news articles), is able to identify those images that are related to a flooding event. Please note that only those images which convey visual evidence of a flooding event will be considered as true positives. Specifically, we define images showing “unexpected high water levels in industrial, residential, commercial and agricultural areas” as images providing evidence of a flooding event.

The main challenges of this task lie in the proper discrimination of the water levels in different areas (e.g., images showing a lake vs. images showing a flooded street) as well as the consideration of different
types of flooding events (e.g., coastal flooding, river flooding, pluvial flooding). Participants are allowed to submit 5 runs:
       • Required run 1: using visual data only
       • Required run 2: using metadata only
       • Required run 3: using fused metadata and visual data, without resources other than those provided by the organizers
       • General runs 4, 5: everything automated is allowed, including using data from external sources (e.g., Twitter, Flickr)

Flood-Detection in Satellite Images.
The aim of the second subtask is to develop a method that is able to identify regions in satellite imagery which are affected by flooding. Participants are given a set of satellite image patches for multiple instances of flooding events, along with corresponding segmentation masks of the flooding, to train their models. For the unseen image patches, participants report a segmentation mask of the flooded area. Participants are allowed to submit 5 runs:
       • Required runs 1, 2, 3: using satellite data only
       • General runs 4, 5: everything automated is allowed, including using data from external sources

3 DATA
Disaster Image Retrieval from Social Media Dataset.
The dataset for the first subtask consists of 6,600 Flickr images. All images were extracted from the YFCC100M-Dataset [6] and are shared under Creative Commons licenses. The dataset contains one image per user to avoid a bias towards content from the same locations and from actively content-sharing users.

Images with the tags flooding, flood and floods were selected and additionally refined by human annotators according to the strength of the evidence of flooding that they depict: very strong non-evidence of a flooding (0), non-evidence of a flooding (1), direct evidence of a flooding (4), very strong direct evidence of a flooding (5), or a “don’t know” answer (3). The definition of relevance was available to the annotators in the interface during the entire process. The annotation process was not time restricted. The scores were collected from two annotators, and the final ground truth label was determined as flooding if both annotators rated the image with 4 or 5, and as non-flooding for scores of 0 or 1. To cover a broader diversity of images, we injected additional distractor images into the dataset.

For each image, image metadata from YFCC100M and visual feature descriptors are provided to participants. Visual features were extracted with the open-source LIRE library2 using default parameter settings. An overview of the provided features is given in Table 1. The dataset is separated with a ratio of 80/20 into the following two sets:
       • Development-Set contains 5,280 images, along with features and class labels (1=evidence of a flooding event, 0=no evidence)
       • Test-Set contains 1,320 images and features

Table 1: Details of provided metadata information and visual features for the DIRSM-Dataset
       Metadata: image_id, image_url, date_taken, date_uploaded, user_nsid, user_nickname, title, description, user_tags, capture_device, latitude, longitude, license_url, license_name
       Visual Features: AutoColorCorrelogram, EdgeHistogram, Color and Edge Directivity Descriptor (CEDD), ColorLayout, Fuzzy Color and Texture Histogram (FCTH), Joint Composite Descriptor (JCD), Gabor, ScalableColor, Tamura

2 LIRE, http://www.lire-project.net/

Flood-Detection in Satellite Images Dataset.
The dataset for the second subtask consists of satellite image patches which have been derived from Planet’s 4-band satellites [5]. The imagery has a ground-sample distance (GSD) of 3.7 meters and an orthorectified pixel size of 3 meters. The data was collected from eight different flooding events between 01.06.2016 and 01.05.2017. The image patches have a shape of 320 x 320 x 4 pixels and are provided in the GeoTIFF format. All image scenes have been projected in the UTM projection using the WGS84 datum (EPSG:3857). Each image patch contains four channels with Red, Green, Blue, and Near Infrared band information. Pixel values are represented in a 16-bit digital number format. The dataset is separated as follows:
       • Development-Set contains 462 image patches from six locations. For each image patch we provide a segmentation mask of the flooded area, extracted by human annotators (0=background, 1=flooded area).
       • Test-Set-1 contains unseen patches extracted from the same regions which are present in the development set.
       • Test-Set-2 contains unseen patches extracted from a different region which is not present in the development set.

4 EVALUATION
Disaster Image Retrieval from Social Media.
The official metric for evaluating the correctness of retrieved images from social media is Average Precision at k (AP@k) at various cutoffs, k = 50, 100, 200, 300, 400, 500. The metric measures the number of relevant images among the top k retrieved results and takes the rank into consideration.

Flood-Detection in Satellite Images.
In order to assess the performance of the generated segmentation masks for flooded areas in the satellite image patches, the intersection-over-union metric (Jaccard Index) is used for the official evaluation: IoU = TP / (TP + FP + FN), where TP, FP, and FN are the numbers of true positive, false positive, and false negative pixels, respectively, determined over the whole test set. The metric measures the accuracy of the pixel-wise classification.

ACKNOWLEDGMENTS
We would like to thank Planet for providing us with high-resolution satellite images for this task. Additionally, this work was supported by the BMBF project MOM (Grant 01IW15002).
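As an illustration of the ranking metric, the following sketch computes average precision at a cutoff k for a single ranked result list. This is one common AP@k formulation (precision averaged over the ranks of retrieved relevant images), not the official evaluation script, and the binary relevance flags are hypothetical input:

```python
def ap_at_k(relevant, k):
    """Average Precision at cutoff k for one ranked result list.

    relevant: 0/1 flags in ranked order (1 = the image shows direct
    evidence of flooding). Sketch of one common AP@k formulation;
    the official evaluation tooling may differ in edge cases.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevant[:k], start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0

# Hypothetical ranking where ranks 1, 3 and 4 are relevant:
score = ap_at_k([1, 0, 1, 1, 0], k=5)
```

A perfect ranking (all relevant images first) yields 1.0; pushing relevant images down the list lowers the score, which is what distinguishes AP@k from plain precision at k.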
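The two-annotator agreement rule described above can be written down directly. The helper below is a hypothetical illustration; treating disagreements and "don't know" scores as unlabeled is an assumption, since the text does not specify how such images were handled:

```python
def ground_truth_label(score_a, score_b):
    """Final DIRSM label from two annotator scores on the
    0/1/3/4/5 evidence scale described in the text.

    Returns 1 (flooding) if both annotators rated 4 or 5,
    0 (non-flooding) if both rated 0 or 1, and None otherwise
    (disagreement or a "don't know" score of 3) -- treating such
    images as unlabeled is an assumption.
    """
    if score_a in (4, 5) and score_b in (4, 5):
        return 1
    if score_a in (0, 1) and score_b in (0, 1):
        return 0
    return None
```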
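The IoU evaluation can be sketched as follows, using flat 0/1 pixel lists in place of real image arrays. The counts are accumulated over all patches before the final division, matching the "determined over the whole test set" aggregation; this is a minimal illustration, not the official scoring script:

```python
def iou(pred_masks, gt_masks):
    """Intersection over union for binary flood masks.

    pred_masks, gt_masks: parallel lists of flat 0/1 pixel lists
    (1 = flooded area). Implements IoU = TP / (TP + FP + FN) with
    the counts pooled over the whole set of patches. Returning 1.0
    when all masks are empty is an assumption.
    """
    tp = fp = fn = 0
    for pred, gt in zip(pred_masks, gt_masks):
        for p, g in zip(pred, gt):
            if p and g:
                tp += 1
            elif p:
                fp += 1
            elif g:
                fn += 1
    total = tp + fp + fn
    return tp / total if total else 1.0

# Two tiny patches, flattened to length-4 pixel lists:
score = iou([[1, 1, 0, 0], [0, 0, 0, 1]],
            [[1, 0, 1, 0], [0, 0, 0, 1]])
```

Because the counts are pooled before dividing, a patch with little or no flooded area does not dominate the score the way a per-patch average could.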


REFERENCES
 [1] Kashif Ahmad, Michael Riegler, Ans Riaz, Nicola Conci, Duc-Tien
     Dang-Nguyen, and Pål Halvorsen. 2017. The JORD System: Linking
     Sky and Social Multimedia Data to Natural Disasters. In Proceedings
     of the 2017 ACM on International Conference on Multimedia Retrieval.
     ACM, 461–465.
 [2] Helbert Arenas, Md Bayzidul Islam, and Josiane Mothe. 2017. Overview
     of the ImageCLEF 2017 Population Estimation (Remote) Task. (2017).
 [3] Benjamin Bischke, Damian Borth, Christian Schulze, and Andreas
     Dengel. 2016. Contextual enrichment of remote-sensed events with
     social media streams. In Proceedings of the 2016 ACM on Multimedia
     Conference. ACM, 1077–1081.
 [4] Patrick Helber, Benjamin Bischke, Andreas Dengel, and Damian
     Borth. 2017. EuroSAT: A Novel Dataset and Deep Learning Bench-
     mark for Land Use and Land Cover Classification. arXiv preprint
     arXiv:1709.00029 (2017).
 [5] Planet Team. 2017. Planet Application Program Interface: In Space for
     Life on Earth, San Francisco, CA. (2017).
 [6] Bart Thomee, David A. Shamma, Gerald Friedland, Benjamin Elizalde,
     Karl Ni, Douglas Poland, Damian Borth, and Li-Jia Li. 2016. YFCC100M:
     The New Data in Multimedia Research. Commun. ACM 59, 2 (Jan.
     2016), 64–73. https://doi.org/10.1145/2812802
 [7] Alan Woodley, Shlomo Geva, Richi Nayak, and Timothy Campbell.
     2016. Introducing the Sky and the Social Eye. In Working Notes Pro-
     ceedings of the MediaEval 2016 Workshop, Vol. 1739. CEUR Workshop
     Proceedings.