Segmentation of Seismic Images
Ekaterina Tolstaya 1 and Anton Egorov 1
1 Aramco Research Center, Moscow, Aramco Innovations LLC, Leninskie Gory, 1 bld. 75b, Moscow, 119234, Russia

                Abstract
                In this paper we propose a method of seismic facies labeling. Given a three-dimensional
                image cube of seismic sounding data labeled by a geologist, we first train on one part of the
                cube and then propagate labels to the rest of it. We use the open-source, fully annotated 3D
                geological model of the Netherlands F3 Block. We apply a state-of-the-art deep network
                architecture, adding a 3D fully connected conditional random field (CRF) layer on top,
                which yields smoother labels on data cube cross-sections. A pseudo-labeling technique is
                used to overcome training data scarcity and to predict more reliable labels for geological
                units. Additional data augmentation further enlarges the training dataset. The results show
                superior network performance over the existing baseline model.

                Keywords
                Seismic Facies Labeling, UNet, Domain Specific Augmentation, Pseudo Labels.

1. Introduction
    Seismic sounding of the Earth gives insight into the geological structure of a formation and
allows predicting the presence of oil or gas traps inside. Seismic sounding is usually carried out at an
early stage of potential reservoir exploration, to mark prospective locations of exploration and
production wells.
    The seismic facies labeling task consists of assigning specific geological rock types to the seismic
data. When done manually, this work is tedious and time-consuming, so automating the procedure can
save geologists a lot of time and effort. A lot of work has already been invested in the automation of
seismic data labeling; the recent trend is to use modern approaches based on deep learning and
semantic segmentation.

2. Prior work
    Many research efforts are currently aimed at automating the seismic labeling task ([1], [2], [3]).
    One of the first attempts at seismic image analysis and geological feature detection (gas chimneys
and faults) with neural networks was made by [4]. The authors applied image processing techniques
to analyze seismic data represented by sections of 3D cubes containing seismic attributes: measured
time, amplitude, frequency and attenuation of reflected seismic waves. The authors of [5] applied
competitive networks to label seismic facies known from well information and to interpolate the
known facies from wells to the rest of the reservoir region. Later, many researchers continued this
work using different network architectures.
    The UNet architecture, presented by [6], showed high performance in image segmentation tasks.
The authors of [7] compared the performance of a multi-layer perceptron and a UNet for the task of
facies labeling and concluded that the UNet performs better. This architecture was also applied to
seismic labeling by [8].

GraphiCon 2021: 31st International Conference on Computer Graphics and Vision, September 27-30, 2021, Nizhny Novgorod, Russia
EMAIL: ekaterina.tolstaya@aramcoinnovations.com (E. Tolstaya); anton.egorov@aramcoinnovations.com (A. Egorov)
ORCID: 0000-0002-8893-2683 (E. Tolstaya); 0000-0001-9139-6191 (A. Egorov)
             ©️ 2021 Copyright for this paper by its authors.
             Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
             CEUR Workshop Proceedings (CEUR-WS.org)
   When no facies labels are available, unsupervised learning methods can be applied. For example,
[9] proposed to use deep convolutional autoencoders and clustering of deep-feature vectors. This
approach allows fast analysis of geological patterns.
   Recently, self-supervised learning methods have been applied to a wide range of tasks. Starting
from natural language processing, where a lot of data is already available in textual form, this
technique made it possible to train large-scale neural nets. A similar technique was also applied to
image data for learning representations, which helped to pre-train neural nets for tasks where only a
small amount of labeled data is available.
   To overcome the problem of labeled data scarcity, semi-supervised learning techniques have been
proposed in the literature, such as:
       •   transfer learning, where the model is first pre-trained on some data and then trained on the
   available labeled data [10];
       •   the weak labels technique, which allows learning from non-accurate labels [11];
       •   pseudo-labels, where a trained model predicts labels for unlabeled data, and these data are
   possibly added to the training set ([12], [13]);
       •   meta pseudo-labels, a state-of-the-art technique where two models, a teacher and a student,
   are trained in parallel: first the student learns from the teacher, and then the teacher learns from the
   student's class-conditional probabilities on the unlabeled test set.

3. Proposed solution
3.1. Data
   In this work we consider the open-source, fully annotated 3D geological model of the Netherlands
F3 Block [1]. This model is based on a study of the 3D seismic data together with 26 well logs and
contains seismic data and labeled geological structures. The training data is an image cube of
401×701×255 cells; test set #1 is a cube of 201×701×255 cells. Figure 1 shows the full seismic image
with labels, and Figure 2 shows the spatial position of the training and testing cubes.




Figure 1: Full seismic image cube (inline slices), including train and test datasets, and labels of the
image. The red dashed line shows the separation into train/test datasets.
Figure 2: Scheme of the dataset, view from the top (crossline vs. inline axes; the test set lies adjacent
to the train set along the inline direction).

   The train and test sets contain six groups of lithostratigraphic units. Table 1 below shows the
percentage of each class in the two sets:

Table 1
Percentage of each class present in the datasets
     Class              1               2              3                4          5             6
     Train           28%               12%            48.5%             7%         3%           1.5%
      Test           19%               10%             45%              7%         17%           2%

   The test data is adjacent to the training cube, so the slices closer to the boundary will be predicted
better than those further away.

3.2.    Model
   We apply the well-known UNet architecture with an EfficientNet-B1 backbone ([14], [15], [16])
and five pooling levels, including an scSE (Concurrent Spatial and Channel “Squeeze & Excitation”)
attention layer [17]. The model has ~9M trainable parameters.
    The model is two-dimensional: it takes patches of 256×256 pixels as input and predicts the labels
of the geological structures. An additional channel with a depth gradient is added to the seismic image
patches, so that the model does not mix up deep and shallow layers (Figure 3).
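   The depth channel can be built by normalizing each row's absolute depth within the cube, as in the
following minimal sketch (the function name and the exact normalization are our own assumptions,
not specified in the paper):

```python
import numpy as np

def add_depth_channel(patch, top_row, cube_depth):
    """Stack a normalized depth channel onto a 2D seismic amplitude patch.

    patch: 2D array of shape (height, width), a vertical seismic slice.
    top_row: depth index of the patch's first row within the full cube.
    cube_depth: total number of depth samples in the cube.
    Returns an array of shape (2, height, width): amplitudes + depth gradient.
    """
    h, w = patch.shape
    # Absolute depth of every row, normalized to [0, 1] over the whole cube,
    # so that the network can tell shallow patches from deep ones.
    depth = (top_row + np.arange(h)) / (cube_depth - 1)
    depth_channel = np.repeat(depth[:, None], w, axis=1)
    return np.stack([patch, depth_channel], axis=0)
```

Two patches cut from different depths of the same cube then receive different depth channels even
when their amplitude content looks similar.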




Figure 3: Patch-based training: a) two half patches, extracted from the cube along orthogonal
directions; b) whole patch; c) augmented patch; d) depth added as an additional channel.
   Augmentations include random flips and rotations. We also add a special random distortion to
simulate stretching and faults in the data (shown in Figure 4).




Figure 4: Random distortion of the image: a) curve for warping, with simulated layer stretch and
faults; b) initial image; c) warped image.
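   Such a warping augmentation can be sketched as follows, assuming a sinusoidal stretch plus a
single step discontinuity for the fault; the exact displacement curve used in the paper is not specified,
so this is an illustrative approximation:

```python
import numpy as np

def random_warp(img, max_shift=10.0, rng=None):
    """Warp a 2D patch vertically: smooth stretch plus one fault-like jump.

    A low-frequency sinusoid shifts each column up or down to simulate
    layer stretching; a step added to the shift curve at a random
    horizontal position simulates a fault. Rows are remapped by
    nearest-neighbour lookup, clipped at the patch borders.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = img.shape
    x = np.arange(w)
    # Smooth stretch component: random frequency and phase.
    shift = max_shift * np.sin(2.0 * np.pi * x / w * rng.uniform(0.5, 2.0)
                               + rng.uniform(0.0, 2.0 * np.pi))
    # Fault component: abrupt jump of all columns to the right of fault_pos.
    fault_pos = rng.integers(w // 4, 3 * w // 4)
    shift[fault_pos:] += rng.uniform(-max_shift, max_shift)
    rows = np.arange(h)[:, None] + np.round(shift).astype(int)[None, :]
    rows = np.clip(rows, 0, h - 1)
    return img[rows, x[None, :]]
```

Applying the same row remapping to the label patch keeps image and labels aligned.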

   As the test data cube is adjacent to the training cube, the prediction quality for slices far from the
training cube is usually poor compared to slices close to it. Therefore, we decided to use the
pseudo-labels approach.
   First, a model is trained. Then class labels are predicted for part of the test data and added to the
training set, and the model is re-trained on the expanded dataset, as shown in Figure 5.

Figure 5: Steps of the proposed algorithm: train UNet → make inference of class probabilities by
UNet → train 3D CRF model → predict part of labels → re-train UNet with added data.
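   The pseudo-label loop can be illustrated with a toy stand-in, where a nearest-centroid classifier
plays the role of the UNet and the CRF smoothing step is omitted; all data, functions and parameters
below are invented for illustration only:

```python
import numpy as np

def fit_centroids(X, y, n_classes):
    """'Train': one centroid per class, the mean of its samples."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict(X, centroids):
    """Label each sample with the class of its nearest centroid."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

rng = np.random.default_rng(1)
# Labeled "training cube": two well-separated toy classes in 2D.
X_train = np.concatenate([rng.normal(0.0, 0.3, (50, 2)),
                          rng.normal(3.0, 0.3, (50, 2))])
y_train = np.concatenate([np.zeros(50, int), np.ones(50, int)])
# Unlabeled part of the "test cube", drawn near the same two classes.
X_unlab = np.concatenate([rng.normal(0.2, 0.3, (30, 2)),
                          rng.normal(2.8, 0.3, (30, 2))])

# Round 1: train, then predict pseudo-labels for the unlabeled part.
centroids = fit_centroids(X_train, y_train, 2)
pseudo = predict(X_unlab, centroids)

# Round 2: re-train on the expanded dataset (labeled + pseudo-labeled).
X_all = np.concatenate([X_train, X_unlab])
y_all = np.concatenate([y_train, pseudo])
centroids = fit_centroids(X_all, y_all, 2)
```

The second round sees more data than the first, which is the mechanism the paper exploits with the
UNet and the CRF-smoothed pseudo-labels.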

   A 3D CRF step is added to enforce label continuity [18]: inline label slices often have many
artifacts near class boundaries, where the model is not confident and assigns probabilities close to 0.5
to two different classes. The CRF layer smooths these discontinuities and provides cleaner
pseudo-labels for model re-training. We also tried an idea from [19] and [20], where the authors
propose to use an LSTM block in the bottleneck of the UNet architecture, but did not obtain an
improvement.
   The loss function is a combination of cross-entropy ($L_{CE}$), dice ($L_{dice}$) and total
variation ($L_{TV}$) losses with corresponding weights:

                                  $L = w_{CE} L_{CE} + w_{dice} L_{dice} + w_{TV} L_{TV}$                                     (1)

   Cross-entropy penalizes wrong predictions; it is a smooth, differentiable function, which is easy to
optimize:

                              $L_{CE} = -\frac{1}{N} \sum_{c=1}^{C} w_c \sum_{i=1}^{N} t_{i,c} \log(p_{i,c})$,                         (2)

where $C$ is the number of classes, $N$ is the number of observations (e.g. pixels in the image),
$p_{i,c}$ are the output class probabilities and $t_{i,c}$ is the target distribution for observation $i$
and class $c$. All loss functions use class weights $w_c$, which are inversely proportional to the class
sizes, with their sum equal to 1.
   Dice loss usually performs better for imbalanced classes, which is often the case for segmentation
tasks, where the background (non-object) area is larger than the area occupied by the object. Dice loss
was proposed in [21], where it was shown to work better than logistic loss with class weights. We
apply the multi-class dice loss function $L_{dice}$ as follows:

                                 $L_{dice} = \frac{1}{CN} \sum_{c=1}^{C} w_c \sum_{i=1}^{N} \left(1 - \frac{2 p_{i,c} t_{i,c}}{p_{i,c} + t_{i,c} + \varepsilon}\right)$,                                 (3)

where $\varepsilon$ is a small number that prevents division by zero.
  The TV loss is computed as

                                 $L_{TV} = \frac{1}{N} \sum_{c=1}^{C} w_c \sum_{i=1}^{N} \left(|\mathbf{D}_1(p_{i,c})| + |\mathbf{D}_2(p_{i,c})|\right)$,                    (4)

where $\mathbf{D}_1$ and $\mathbf{D}_2$ are finite-difference operators along the first and second
spatial dimensions. This term encourages the predicted probabilities to be more ‘blocky’, which
potentially leads to smoother predicted labels.
   The weights for the loss terms were chosen empirically: $w_{CE} = 0.25$, $w_{dice} = 0.65$ and $w_{TV} = 0.1$.
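   Equations (1)-(4) can be sketched in NumPy as follows, with class weights inversely proportional
to the class frequencies of Table 1; the function signature and the zero-padded finite differences are
our own assumptions:

```python
import numpy as np

def combined_loss(p, t, w_class, w_ce=0.25, w_dice=0.65, w_tv=0.1, eps=1e-6):
    """Weighted sum of cross-entropy, dice and total-variation losses.

    p, t: arrays of shape (C, H, W) -- predicted class probabilities and
    one-hot targets; w_class: per-class weights summing to 1.
    """
    C, H, W = p.shape
    N = H * W
    wc = w_class[:, None, None]
    # Eq. (2): class-weighted cross-entropy.
    ce = -(wc * t * np.log(p + eps)).sum() / N
    # Eq. (3): multi-class dice loss.
    dice = (wc * (1.0 - 2.0 * p * t / (p + t + eps))).sum() / (C * N)
    # Eq. (4): total variation of the probability maps along both spatial
    # axes (forward differences, zero at the first row/column).
    d1 = np.abs(np.diff(p, axis=1, prepend=p[:, :1, :]))
    d2 = np.abs(np.diff(p, axis=2, prepend=p[:, :, :1]))
    tv = (wc * (d1 + d2)).sum() / N
    # Eq. (1): weighted combination.
    return w_ce * ce + w_dice * dice + w_tv * tv

# Class weights inversely proportional to class frequencies (Table 1),
# normalized so that their sum equals 1.
counts = np.array([28.0, 12.0, 48.5, 7.0, 3.0, 1.5])
class_weights = (1.0 / counts) / (1.0 / counts).sum()
```

In training, the same expressions would be written with differentiable tensor operations; the NumPy
form only makes the term-by-term structure of the loss explicit.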

3.3.    Results
   The results show an improvement after applying the pseudo-labeling technique. The test dataset
was divided into two parts; after the first training round, the labels of the first half of the test cube
were predicted, and this half of the test set was added to the training set. Then the model was
re-trained. The metrics are shown in Table 2 below.

Table 2
Class-wise metrics
      Class              1                 2              3              4              5         6
    Precision          0.99              0.85            0.99           0.81           0.97      0.72
      Recall           0.96              0.98            0.96           0.91           0.92      0.98
    F1-score           0.98              0.91            0.98           0.86           0.94      0.83
    Accuracy           0.96              0.98            0.96           0.91           0.92      0.98

    Our scores outperform those reported for the “baseline” model in [1] as well as those in [22]
(mean class accuracies of 0.804 and 0.78, respectively, compared with our mean class accuracy of
0.952).
    Figure 6 illustrates the labels of in-line slice 200. The re-trained model is able to overcome the
prediction imperfections of the first training round and predict smooth layers. Panel d) shows the
maximum class probability, which can be treated as the network's confidence. Predictions at class
boundaries are less confident than those inside the layers.
    We also tried a 2.5D (or pseudo-3D) approach, stacking several vertical slices (3 or 5) from the
image cube instead of a single slice, but it did not give any meaningful improvement over the
single-slice input. This conclusion agrees with [23].
    Table 3 summarizes the different approaches.

Table 3
Accuracy for different algorithm variations
    Algorithm       No augment       Flip / rotate     Random          3 slices    5 slices     Pseudo
                                                        warp                                     labels
 Mean accuracy          0.880               0.913       0.919          0.896           0.897     0.952
Figure 6: Prediction results: a) predicted in-line slice 200, after the first training round; b) predicted
in-line slice 200 after adding pseudo-labels to the training set and re-training the model; c) true labels
of in-line slice 200; d) maximum of class probability.

4. Discussion
   In this work we applied the pseudo-labeling technique to the seismic facies segmentation task. We
show that adding predictions for slices close to the training cube can improve the network's overall
performance on the test set. We also applied a TV loss component that forces the predictions to be
smooth, and a special type of non-linear warping of patches to increase the diversity of the training
data. The results show superior performance over previously reported techniques.

5. References
[1] Y. Alaudah, P. Michałowicz, M. Alfarraj, G. AlRegib, A machine-learning benchmark for facies
    classification. Interpretation, 7(3), (2019), SE175-SE187. doi:10.1190/INT-2018-0249.1
[2] L. Huang, X. Dong, T.E. Clee, A scalable deep learning platform for identifying geologic features
    from seismic attributes. The Leading Edge, 36(3), (2017) 249-256. doi:10.1190/tle36030249.1
[3] Y. Alaudah, G. AlRegib, Weakly-supervised labeling of seismic volumes using reference
    exemplars, in: Proceedings of the 2016 IEEE International Conference on Image Processing,
    pp. 4373-4377. doi:10.1109/icip.2016.7533186
[4] P. Meldahl, R. Heggland, B. Bril, P. de Groot, Identifying faults and gas chimneys using
    multiattributes and neural networks. The Leading Edge, 20(5), (2001) 474-482.
    doi:10.1190/1.1438976
[5] M. M. Saggaf, M. N. Toksöz, and M. I. Marhoon, Seismic facies classification and identification
    by competitive neural networks. Geophysics, 68(6), (2003) 1984-1999. doi:10.1190/1.1635052
[6] O. Ronneberger, P. Fischer, T. Brox, U-Net: Convolutional Networks for Biomedical Image
    Segmentation. arXiv:1505.04597 (2015). URL:https://arxiv.org/abs/1505.04597
[7] H. Di, Z. Wang, G. AlRegib, Why using CNN for seismic interpretation? An investigation, in: SEG
    Technical Program Expanded Abstracts 2018, pp. 2216-2220. doi:10.1190/segam2018-2997155.1
[8] M. Alfarhan, M. Deriche, A. Maalej, Robust concurrent detection of salt domes and faults in
    seismic surveys using an improved UNet architecture. IEEE Access. (2020).
    doi:10.1109/access.2020.3043973
[9] V. Puzyrev, C. Elders, Unsupervised seismic facies classification using deep convolutional
     autoencoder. arXiv preprint arXiv:2008.01995 (2020). URL:https://arxiv.org/abs/2008.01995
[10] D. Chevitarese, D. Szwarcman, R. M. D. Silva, E. V. Brazil, Transfer learning applied to seismic
     images classification. AAPG Annual Convention and Exhibition (2018).
     URL:https://www.searchanddiscovery.com/documents/2018/42285chevitarese/ndx_chevitarese.p
     df
[11] Y. Alaudah, S. Gao, G. AlRegib, Learning to label seismic structures with deconvolution networks
     and weak labels, in: SEG Technical Program Expanded Abstracts 2018, pp. 2121-2125.
     doi:10.1190/segam2018-2997865.1
[12] A. Saleem, J. Choi, D. Yoon, J. Byun, Facies classification using semi-supervised deep learning
     with pseudo-labeling strategy, in: SEG Technical Program Expanded Abstracts 2019, pp. 3171-
     3175. doi:10.1190/segam2019-3216086.1
[13] Y. Babakhin, A. Sanakoyeu, H. Kitamura, Semi-supervised segmentation of salt bodies in
     seismic images using an ensemble of convolutional neural networks, in: Pattern Recognition
     2019, pp. 218–231. doi:10.1007/978-3-030-33676-9_15
[14] M. Tan, Q. Le, Efficientnet: Rethinking model scaling for convolutional neural networks,
     in: Proceedings of International Conference on Machine Learning 2019, pp. 6105-6114
[15] B. Baheti, S. Innani, S. Gajre, S. Talbar, Eff-unet: A novel architecture for semantic segmentation
     in unstructured environment, in: Proceedings of the IEEE/CVF Conference on Computer Vision
     and Pattern Recognition Workshops 2020, pp. 358-359. doi:10.1109/cvprw50498.2020.00187
[16] L. D. Huynh, N. Boutry, A U-Net++ With pre-trained EfficientNet backbone for segmentation of
     diseases and artifacts in endoscopy images and videos, in: EndoCV@ ISBI 2020, pp. 13-17.
     URL:http://ceur-ws.org/Vol-2595/endoCV2020_paper_id_11.pdf
[17] A. G. Roy, N. Navab, C. Wachinger, Concurrent spatial and channel ‘squeeze & excitation’ in
     fully convolutional networks, in: Proceedings of the International Conference on Medical Image
     Computing and Computer-Assisted Intervention 2018, pp. 421-429. doi:10.1007/978-3-030-00928-1_48
[18] L. C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A. L.Yuille, Semantic image segmentation
     with deep convolutional nets and fully connected CRFs. arXiv preprint arXiv:1412.7062 (2014).
     URL:https://arxiv.org/abs/1412.7062
[19] A. A. Novikov, D. Major, M. Wimmer, D. Lenis, K. Bühler, Deep sequential segmentation of
     organs in volumetric medical scans. IEEE transactions on medical imaging, 38(5), (2018) 1207-
     1215. doi:10.1109/tmi.2018.2881678
[20] A. Pfeuffer, K. Schulz, K. Dietmayer, Semantic segmentation of video sequences with
     convolutional lstms, in: Proceedings of IV IEEE Intelligent Vehicles Symposium 2019, pp. 1441-
     1447. doi:10.1109/ivs.2019.8813852
[21] F. Milletari, N. Navab, S. A. Ahmadi, V-net: Fully convolutional neural networks for volumetric
     medical image segmentation, in: Proceedings of 4th international conference on 3D vision 2016,
     pp. 565-571. doi:10.1109/3dv.2016.79
[22] M. Q. Nasim, T. Maiti, A. Shrivastava, T. Singh, J. Mei, Seismic facies analysis: a deep domain
     adaptation approach. arXiv preprint arXiv:2011.10510 (2020).
     URL:https://arxiv.org/abs/2011.10510
[23] M. H. Vu, G. Grimbergen, T. Nyholm, T. Löfstedt, Evaluation of multislice inputs to convolutional
     neural networks for medical image segmentation. Medical Physics, 47(12), (2020) 6216–6231.
     doi:10.1002/mp.14391