       MediaEval2019: Flood Detection in Time Sequence Satellite
                               Images
                                                Pallavi Jain, Bianca Schoen-Phelan, Robert Ross
                                                             Technological University Dublin
                                                                      Dublin, Ireland
                                               {pallavi.jain,bianca.schoenphelan,robert.ross}@tudublin.ie

ABSTRACT
In this work, we present a flood detection technique for time series satellite images for the City-centered satellite sequences (CCSS) task in the MediaEval 2019 competition [1]. This work utilises a three-channel feature indexing technique [13] along with a pre-trained VGG16 model for the automatic detection of floods. We also compared our results with RGB images and a modified NDWI technique by Mishra et al., 2015 [15]. The results show that the three-channel feature indexing technique performed best with VGG16 and is a promising approach for detecting floods from time series satellite images.

Copyright 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). MediaEval'19, 27-29 October 2019, Sophia Antipolis, France.

1 INTRODUCTION
Flooding is the most common natural disaster event, affecting people every year all around the world. In most cases, it directly impacts human life and damages property. In recent years, many techniques have been developed to organise rescue operations for such events in more efficient ways. Flood mapping through satellite images is one such area, where a lot of research has been conducted aiming to monitor floods and perform timely risk analysis [2, 3, 5, 18].
Sentinel-2 provides high resolution multi-spectral images with 13 bands for emergency services, which can also be useful to monitor and analyse flooding situations. Each of these bands highlights certain features such as water, land or clouds, and each band offers different reflectance and absorbance properties which can be exploited for flood detection and monitoring.

Among these bands, the visible range bands Red, Green and Blue create a true colour image. These images can map floods and standing water but often suffer from cloud or building shadows, which prevents accurate mapping. For that reason, several water index techniques have been proposed in order to reduce the effects of shadows and expose appropriate water values. The near infrared (NIR) band strongly absorbs water reflectance and reflects vegetation, a property which has made it a popular choice in the past for extracting water bodies from images. For that reason the normalised difference water index (NDWI) was introduced [14], which leverages the NIR and Green bands as shown in equation 1. NDWI maximises water features and minimises all other features. Leveraging this particular water indexing technique resulted in the development of many improvements in recent years [6, 20].

$$\mathrm{NDWI} = \frac{\mathrm{Green} - \mathrm{NIR}}{\mathrm{Green} + \mathrm{NIR}} \qquad (1)$$
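As an illustration, equation 1 can be computed with a few lines of NumPy (a minimal sketch, assuming the Green and NIR bands are already loaded as equally shaped arrays; this is not code from the original work):

```python
import numpy as np

def ndwi(green: np.ndarray, nir: np.ndarray) -> np.ndarray:
    """Normalised Difference Water Index (equation 1)."""
    green = green.astype(np.float32)
    nir = nir.astype(np.float32)
    # Small constant guards against division by zero over no-data pixels.
    return (green - nir) / (green + nir + 1e-10)
```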
NDWI struggles to separate built-up areas from water bodies, as NDWI values for built-up areas fall in the same range of reflectance values as water [20]. Considering the built-up area issue and the inability of NDWI to detect shallow water, a combination of two indices has been proposed [15]: the NDWI water index together with an index that uses the Blue and NIR bands to highlight shallow water in addition to open water bodies. Similarly, Li et al., 2017 [13] proposed a three-channel feature index for supervised learning, in which the three indices NDVI, NDWI and RE-NDWI are combined to create three-channel images, analogous to RGB [10]. All of these indexing techniques are capable of mapping water bodies. Consequently, we assume that they can also be useful in flood water mapping, which could help rescue teams and provide an improved understanding of disaster situations and affected areas. As these processes are mostly manual, automating them can be hugely helpful in order to have accurate information in a timely manner.

Lately, deep Convolutional Neural Networks (CNNs) such as AlexNet [11] and VGG16 [17] have performed very well in many domains such as speech recognition, image classification and natural language processing. Remote sensing has also become a widely popular area in which deep CNNs have shown good performance [16]. However, training CNN models with a large number of layers requires a significant amount of data, which is one of the main challenges in the domain of flood detection. At the same time, it has been shown that transfer learning with pre-trained deep CNNs can be a strong option for automating flood detection [8]. Among deep CNNs, VGG16 has previously shown strong performance in many image classification tasks such as object detection, image segmentation and scene classification [7].

Flood water is mostly shallow and is difficult to detect due to built-up areas or cloud shadows. In this work we propose that if each type of feature, such as vegetation, water or clouds, is separated efficiently, a pre-trained deep CNN can be trained on the result to automate the process of flood detection in time series satellite imagery.

2 APPROACH

2.1 Image Processing

2.1.1 Run 1. As shallow water is difficult to map in remote sensing images due to built-up areas, a combination of water index techniques has been proposed in the past [15]. In this approach, NDWI is used along with a Blue and NIR band index, as shown in equation 2.

$$\mathrm{ModNDWI} = \frac{\mathrm{Green} - \mathrm{NIR}}{\mathrm{Green} + \mathrm{NIR}} + \frac{\mathrm{Blue} - \mathrm{NIR}}{\mathrm{Blue} + \mathrm{NIR}} \qquad (2)$$
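A minimal sketch of equation 2, under the same assumptions as the NDWI example above (band arrays already loaded; illustrative only):

```python
import numpy as np

def mod_ndwi(green: np.ndarray, blue: np.ndarray, nir: np.ndarray) -> np.ndarray:
    """Modified NDWI (equation 2): NDWI plus a Blue/NIR index for shallow water."""
    green, blue, nir = (b.astype(np.float32) for b in (green, blue, nir))
    eps = 1e-10  # guard against division by zero
    return (green - nir) / (green + nir + eps) + (blue - nir) / (blue + nir + eps)
```

As described in Section 2.3, this single-channel result is later replicated across three identical channels before being passed to VGG16.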


2.1.2 Run 2. For this run we used true colour images, that is, three-channel RGB composite images built from the Red, Green and Blue bands.

2.1.3 Run 3. For this run we leveraged the three-channel index feature space approach [13]. The images are processed into NDVI [eq. 3], which uses the NIR and Red bands, NDWI [eq. 1], and Red Edge NDWI (RE-NDWI) [eq. 4], which uses the Green and red edge (RE) vegetation bands. All three are then combined to create a three-channel image like RGB. This approach highlights the individual properties of vegetation, water and clouds.

$$\mathrm{NDVI} = \frac{\mathrm{Red} - \mathrm{NIR}}{\mathrm{Red} + \mathrm{NIR}} \qquad (3)$$

$$\mathrm{RE\_NDWI} = \frac{\mathrm{Green} - \mathrm{RE}}{\mathrm{Green} + \mathrm{RE}} \qquad (4)$$
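To make the Run 3 pre-processing concrete, the following sketch (hypothetical helper names, co-registered band arrays assumed) computes the three indices and stacks them into one three-channel image:

```python
import numpy as np

def normalised_difference(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Generic (a - b) / (a + b) band ratio used by equations 1, 3 and 4."""
    a, b = a.astype(np.float32), b.astype(np.float32)
    return (a - b) / (a + b + 1e-10)

def three_channel_index(red, green, nir, red_edge) -> np.ndarray:
    """Stack NDVI, NDWI and RE-NDWI into a single 3-channel image (Run 3)."""
    ndvi = normalised_difference(red, nir)             # eq. 3
    ndwi = normalised_difference(green, nir)           # eq. 1
    re_ndwi = normalised_difference(green, red_edge)   # eq. 4
    return np.dstack([ndvi, ndwi, re_ndwi])            # shape: (H, W, 3)
```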
2.2 Model

The VGG16 network is one of the most popular deep CNNs for image classification and object detection [7, 8, 12]. It consists of 13 convolutional layers and 3 fully-connected layers. We leveraged the pre-trained VGG16 network, which is trained on the ImageNet dataset [4]. The initial layers extract only general features, while task-specific features are extracted by the later layers. We therefore froze the initial 4 convolutional blocks and left the last block trainable for our task.
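A minimal Keras sketch of this transfer-learning setup (the paper does not name its framework, so the Keras API and input size here are assumptions):

```python
from tensorflow.keras.applications import VGG16

# Load VGG16 pre-trained on ImageNet, without its fully-connected classifier head.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))

# Freeze everything up to the last convolutional block (block5_*),
# so only the final block is fine-tuned on the flood data.
for layer in base.layers:
    layer.trainable = layer.name.startswith("block5_")
```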
2.3 Experiment

The 12-band data was provided by MediaEval 2019 under the City-centered satellite sequences (CCSS) subtask of the multimedia satellite task [1]. It consists of 267 sets of sequences in the development dataset and 68 sets in the test dataset. For training and testing the model we split the development dataset into an 80% training set, a 10% validation set, and a further 10% development test set. As the data had imbalanced classes, we used stratified sampling by class to split the data into train, validation and test datasets. We also applied image augmentation to the training data by shifting, rotating and flipping the images, which gave a boost of approximately 2-4%.
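An illustrative sketch of this split and augmentation, assuming scikit-learn and Keras utilities (the exact augmentation parameters are not reported in the paper and are placeholders here):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def make_splits(images: np.ndarray, labels: np.ndarray):
    """80/10/10 train/validation/dev-test split, stratified by the flood label."""
    x_train, x_rest, y_train, y_rest = train_test_split(
        images, labels, test_size=0.2, stratify=labels, random_state=42)
    x_val, x_test, y_val, y_test = train_test_split(
        x_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=42)
    return (x_train, y_train), (x_val, y_val), (x_test, y_test)

# Augmentation for the training set only: shifts, rotations and flips.
augmenter = ImageDataGenerator(
    width_shift_range=0.1, height_shift_range=0.1,
    rotation_range=20, horizontal_flip=True, vertical_flip=True)
```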
VGG16 was originally trained on 3-channel image data such as RGB. However, Mod-NDWI produces a single-channel image, like greyscale, which we consequently converted into a 3-channel image by assuming identical values for each input channel.

For processing the time series images we used a pixel-based technique: after processing each image individually, we created the average image of each set of sequence images and fed these averages to the VGG16 model. Averaging modifies only the pixel values that change across the sequence while keeping unchanged values the same. The changed values in the average image may also be influenced by cloud coverage or atmospheric changes, but as each change is caused by a different type of feature, changes due to water should remain distinguishable.
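A minimal sketch of the pixel-wise averaging over one city sequence (the array layout is an assumption):

```python
import numpy as np

def average_sequence(sequence: np.ndarray) -> np.ndarray:
    """Pixel-wise mean over a time sequence of processed images.

    sequence: array of shape (T, H, W, 3) holding one city's T processed images.
    Returns a single (H, W, 3) average image to be fed to VGG16.
    """
    return sequence.astype(np.float32).mean(axis=0)
```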
These averaged images are then fed to the VGG16 model with the first 4 blocks frozen and the last block unfrozen for our task. The VGG16 network is followed by a flatten layer, a dense layer of 128 units, and a softmax layer. We also used dropout [19] of 0.5 to avoid overfitting and the ReLU activation function. The Adam optimiser [9] with a learning rate of 5e-6 has been used with the binary cross entropy loss function. The model is trained for approximately 30 epochs, depending on the best performance for each type of processed image.

Figure 1: Model Architecture
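Putting the pieces together, a hedged Keras sketch of the classifier described above (layer sizes, dropout, optimiser and loss follow the text; everything else, including the framework, is an assumption):

```python
from tensorflow.keras import Model, layers, optimizers
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = layer.name.startswith("block5_")  # only the last block is fine-tuned

x = layers.Flatten()(base.output)
x = layers.Dense(128, activation="relu")(x)     # dense layer of 128 units with ReLU
x = layers.Dropout(0.5)(x)                      # dropout of 0.5 against overfitting
out = layers.Dense(2, activation="softmax")(x)  # two classes: flooded / not flooded

model = Model(base.input, out)
model.compile(optimizer=optimizers.Adam(learning_rate=5e-6),
              loss="binary_crossentropy",       # labels one-hot over the two classes
              metrics=["accuracy"])
# model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=30)
```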
3 RESULTS

For the evaluation of the model we used the micro-average F1 score, as specified in the competition evaluation task [1]. Since the image data had imbalanced classes, accuracy alone can be misleading; the F1 score is therefore an appropriate evaluation metric, as it provides a balanced score of precision and recall.

The results shown in Table 1 clearly show that averaging the images can provide good performance for detecting whether a city is flooded. Additionally, the three-channel feature indexing technique outperforms true colour RGB and Mod-NDWI [15] by approximately 3% in both the development and test results.

Table 1: Development and Test Results

Run     Dev F1    Test F1
Run 1   0.963     0.897
Run 2   0.963     0.941
Run 3   1.00      0.970

4 CONCLUSION

In this work, we explored the automatic detection of floods in an area from a sequence of time series images. We used a pixel-based averaging approach on RGB, modified NDWI and three-channel feature indexed images along with the deep CNN model VGG16. The results pointed towards significant improvements in flood detection when using the three-channel feature index. Furthermore, it appears that the averaging technique is effective for detecting flooding in a city over the time period.

REFERENCES
 [1] Benjamin Bischke, Patrick Helber, Erkan Basar, Simon Brugman, Zhengyu Zhao, and Konstantin Pogorelov. 2019. The Multimedia Satellite Task at MediaEval 2019: Flood Severity Estimation. In Proc. of the MediaEval 2019 Workshop (Oct. 27-29, 2019). Sophia Antipolis, France.
 [2] Miles A Clement, CG Kilsby, and P Moore. 2018. Multi-temporal synthetic aperture radar flood mapping using change detection. Journal of Flood Risk Management 11, 2 (2018), 152–168.
 [3] Roberto Cossu, Elisabeth Schoepfer, Philippe Bally, and Luigi Fusco. 2009. Near real-time SAR-based processing to support flood monitoring. Journal of Real-Time Image Processing 4, 3 (2009), 205–218.


 [4] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei.
     2009. Imagenet: A large-scale hierarchical image database. In 2009 IEEE
     conference on computer vision and pattern recognition. IEEE, 248–255.
 [5] Dieu Anh Dinh, B Elmahrad, Patrick Leinenkugel, and Alice New-
     ton. 2019. Time series of flood mapping in the Mekong Delta using
     high resolution satellite images. In IOP Conference Series: Earth and
     Environmental Science, Vol. 266. IOP Publishing, 012011.
 [6] Gudina L Feyisa, Henrik Meilby, Rasmus Fensholt, and Simon R Proud.
     2014. Automated Water Extraction Index: A new technique for surface
     water mapping using Landsat imagery. Remote Sensing of Environment
     140 (2014), 23–35.
 [7] Gang Fu, Changjun Liu, Rong Zhou, Tao Sun, and Qijian Zhang. 2017.
     Classification for high resolution remote sensing imagery using a fully
     convolutional network. Remote Sensing 9, 5 (2017), 498.
 [8] Fan Hu, Gui-Song Xia, Jingwen Hu, and Liangpei Zhang. 2015. Trans-
     ferring deep convolutional neural networks for the scene classification
     of high-resolution remote sensing imagery. Remote Sensing 7, 11 (2015),
     14680–14707.
 [9] Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic
     optimization. arXiv preprint arXiv:1412.6980 (2014).
[10] Sascha Klemenjak, Björn Waske, Silvia Valero, and Jocelyn Chanussot.
     2012. Unsupervised river detection in RapidEye data. In 2012 IEEE
     International Geoscience and Remote Sensing Symposium. IEEE, 6860–
     6863.
[11] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Im-
     agenet classification with deep convolutional neural networks. In
     Advances in neural information processing systems. 1097–1105.
[12] Erzhu Li, Junshi Xia, Peijun Du, Cong Lin, and Alim Samat. 2017.
     Integrating multilayer features of convolutional neural networks for
     remote sensing scene classification. IEEE Transactions on Geoscience
     and Remote Sensing 55, 10 (2017), 5653–5665.
[13] Na Li, Arnaud Martin, and Rémi Estival. 2017. An automatic water de-
     tection approach based on Dempster-Shafer theory for multi-spectral
     images. In 2017 20th International Conference on Information Fusion
     (Fusion). IEEE, 1–8.
[14] Stuart K McFeeters. 1996. The use of the Normalized Difference Water
     Index (NDWI) in the delineation of open water features. International
     journal of remote sensing 17, 7 (1996), 1425–1432.
[15] Kshitij Mishra and P Prasad. 2015. Automatic extraction of water
     bodies from Landsat imagery using perceptron model. Journal of
     Computational Environmental Sciences 2015 (2015).
[16] Keiller Nogueira, Waner O Miranda, and Jefersson A Dos Santos. 2015.
     Improving spatial feature representation from aerial scenes by using
     convolutional networks. In 2015 28th SIBGRAPI Conference on Graphics,
     Patterns and Images. IEEE, 289–296.
[17] Karen Simonyan and Andrew Zisserman. 2015. Very deep convolu-
     tional networks for large-scale image recognition. In 3rd International
     Conference on Learning Representations, ICLR 2015, San Diego, CA, USA,
     May 7-9, 2015.
[18] Sergii Skakun, Nataliia Kussul, Andrii Shelestov, and Olga Kussul.
     2014. Flood hazard and flood risk assessment using a time series of
     satellite images: A case study in Namibia. Risk Analysis 34, 8 (2014),
     1521–1537.
[19] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever,
     and Ruslan Salakhutdinov. 2014. Dropout: a simple way to prevent
     neural networks from overfitting. The journal of machine learning
     research 15, 1 (2014), 1929–1958.
[20] Hanqiu Xu. 2006. Modification of normalised difference water index
     (NDWI) to enhance open water features in remotely sensed imagery.
     International journal of remote sensing 27, 14 (2006), 3025–3033.