=Paper=
{{Paper
|id=Vol-3027/paper57
|storemode=property
|title=Segmentation of Seismic Images
|pdfUrl=https://ceur-ws.org/Vol-3027/paper57.pdf
|volume=Vol-3027
|authors=Ekaterina Tolstaya,Anton Egorov
}}
==Segmentation of Seismic Images==
Ekaterina Tolstaya and Anton Egorov

Aramco Research Center, Moscow, Aramco Innovations LLC, Leninskie Gory, 1 bld. 75b, Moscow, 119234, Russia

Abstract

In this paper we propose a method of seismic facies labeling. Given a three-dimensional image cube of seismic sounding data labeled by a geologist, we first train on part of the cube and then propagate labels to the rest of it. We use the open-source, fully annotated 3D geological model of the Netherlands F3 Block. We apply a state-of-the-art deep network architecture, adding a 3D fully connected conditional random field (CRF) layer on top, which yields smoother labels on cross-sections of the data cube. A pseudo-labeling technique is used to overcome training-data scarcity and to predict more reliable labels for geological units. Additional data augmentation further enlarges the training dataset. The results show superior network performance over the existing baseline model.

Keywords: Seismic Facies Labeling, UNet, Domain-Specific Augmentation, Pseudo Labels.

1. Introduction

Seismic sounding of the Earth gives insight into the geological structure of a formation and allows predicting the presence of oil or gas traps inside it. Seismic sounding is usually carried out at an early stage of potential reservoir exploration, to mark prospective locations for exploration and production wells. The seismic facies labeling task consists of assigning specific geological rock types to the seismic data. Done manually, this work is tedious and time-consuming, so automating the procedure can save geologists a lot of time and effort. Much work has already been invested in automating seismic data labeling; the recent trend is to use modern approaches based on deep learning and semantic segmentation.

2. Prior work

Many research efforts are now aimed at automating the seismic labeling task ([1], [2], [3]).
One of the first attempts at seismic image analysis and detection of geological features (gas chimneys and faults) with neural networks was made by [4]. The authors applied image processing techniques to seismic data represented by sections of 3D cubes containing seismic attributes: measured time, amplitude, frequency and attenuation of reflected seismic waves. The authors of [5] applied competitive networks to label seismic facies known from well information and to interpolate those facies from the wells to the rest of the reservoir region. Many researchers later continued this work with different network architectures. The UNet architecture, introduced by [6], showed higher performance in image segmentation tasks. The authors of [7] compared a multi-layer perceptron and a UNet for the facies labeling task and concluded that the UNet performs better. The architecture was also applied to seismic labeling by [8].

GraphiCon 2021: 31st International Conference on Computer Graphics and Vision, September 27-30, 2021, Nizhny Novgorod, Russia
EMAIL: ekaterina.tolstaya@aramcoinnovations.com (E. Tolstaya); anton.egorov@aramcoinnovations.com (A. Egorov)
ORCID: 0000-0002-8893-2683 (E. Tolstaya); 0000-0001-9139-6191 (A. Egorov)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)

When no facies labels are available, unsupervised learning methods can be applied. For example, [9] proposed deep convolutional autoencoders with clustering of deep-feature vectors; this approach allows fast analysis of geological patterns. Recently, self-supervised learning methods have been applied to a wide range of tasks. Starting from natural language processing, where large amounts of data are already available in textual form, this technique has made it possible to train large-scale neural nets.
A similar technique was also applied to image data for representation learning, which helps pre-train neural nets for tasks where only a small amount of labeled data is available. To overcome labeled-data scarcity, several semi-supervised learning techniques have been proposed in the literature: transfer learning, where the model is first pre-trained on some data and then trained on the available labeled data [10]; weak labels, which allow learning from inaccurate labels [11]; pseudo-labels, where a trained model predicts labels for unlabeled data that may then be added to the training set ([12], [13]); and meta pseudo labels, a state-of-the-art technique where a teacher and a student model are trained in parallel: the student learns from the teacher, and the teacher then learns from the student's outputs via class-conditional probabilities on the unlabeled test set.

3. Proposed solution

3.1. Data

In this work we consider the open-source, fully annotated 3D geological model of the Netherlands F3 Block [1]. The model is based on a study of the 3D seismic data together with 26 well logs, and contains seismic data with labeled geological structures. The training data is an image cube of 401×701×255 cells; test set #1 is a cube of 201×701×255 cells. The figures below illustrate the data: Figure 1 shows the full seismic image with labels, and Figure 2 shows the spatial position of the training and testing cubes.

Figure 1: Full seismic image cube, including train and test datasets, and labels of the image. A red dashed line shows the separation into train/test datasets.

Figure 2: Scheme of the dataset, viewed from the top: the test set lies adjacent to the train set.

The train and test sets contain six groups of lithostratigraphic units.
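The train/test geometry described above can be sketched as a split of the volume along one axis. A minimal NumPy sketch, assuming the cube is indexed as (inline, crossline, depth) and the split runs along the first axis; the array here uses toy crossline/depth sizes (the real ones are 701 and 255):

```python
import numpy as np

# Toy stand-in for the annotated F3 volume, indexed as
# (inline, crossline, depth); real crossline/depth sizes are 701 and 255.
cube = np.zeros((602, 8, 8), dtype=np.float32)

# The first 401 inlines form the training cube; the remaining
# 201 adjacent inlines form test set #1.
train_cube, test_cube = cube[:401], cube[401:]
```

Because the two sub-cubes share a boundary, test slices near that boundary resemble the training data much more than slices far from it, which is what motivates the pseudo-labeling scheme below.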
Table 1 below gives the percentage of each class present in the sets:

Table 1: Percentage of each class in the sets

Class | 1   | 2   | 3     | 4  | 5   | 6
Train | 28% | 12% | 48.5% | 7% | 3%  | 1.5%
Test  | 19% | 10% | 45%   | 7% | 17% | 2%

The test data is adjacent to the training cube, so slices closer to the boundary will be predicted better than those further away.

3.2. Model

We apply the well-known UNet architecture with EfficientNet-B1 as the backbone ([14], [15], [16]) and five pooling levels, including an scSE (Concurrent Spatial and Channel "Squeeze & Excitation") attention layer [17]. The model has ~9M trainable parameters. The model is two-dimensional: it takes patches of 256×256 pixels as input and predicts the labels of the geological structures. An additional channel with a depth gradient is added to the seismic image patches, so that the model does not confuse deep and shallow layers (Figure 3).

Figure 3: Patch-based training: a) two half patches, extracted from the cube along orthogonal directions; b) whole patch; c) augmented patch; d) depth added as an additional channel.

Augmentations include random flips and rotations. We also add a special random distortion to simulate stretching and faults in the data (Figure 4).

Figure 4: Random distortion of the image: a) curve for warping, with simulated layer stretch and faults; b) initial image; c) warped image.

As the test cube is adjacent to the training cube, prediction quality for slices far from the training cube is usually poor compared to nearby slices. We therefore adopted the pseudo-label approach: first, a model is trained; then labels are predicted for part of the test data and added to the training set; finally, the model is re-trained on the expanded dataset, as shown in Figure 5.
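The train / predict / re-train loop just described can be sketched as follows. `train_model` and `predict` are trivial illustrative stubs, not the authors' UNet code, and the CRF smoothing of the pseudo-labels is omitted here:

```python
import numpy as np

def train_model(images, labels):
    """Stub for UNet training: remembers the per-pixel majority label."""
    vals, counts = np.unique(labels, return_counts=True)
    return vals[np.argmax(counts)]

def predict(model, images):
    """Stub for inference: predicts one label for every pixel."""
    return np.full(images.shape, model)

# Round 1: train on the labeled training cube only.
train_x = np.zeros((4, 8, 8))
train_y = np.ones((4, 8, 8), dtype=int)
test_x = np.zeros((2, 8, 8))
model = train_model(train_x, train_y)

# Predict pseudo-labels for the near half of the test cube
# (in the paper these are first smoothed by the 3D CRF).
pseudo_y = predict(model, test_x[:1])

# Round 2: re-train on the expanded dataset.
model = train_model(np.concatenate([train_x, test_x[:1]]),
                    np.concatenate([train_y, pseudo_y]))
```

Only the half of the test cube nearest to the training cube is pseudo-labeled, since those predictions are the most reliable.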
Figure 5: Steps of the proposed algorithm: train the UNet; make inference of class probabilities with the UNet; train the 3D CRF model; predict part of the labels; re-train the UNet with the added data.

The 3D CRF step is added to enforce label continuity [18], because inline label slices often have many artifacts near class boundaries, where the model is not confident enough and assigns probabilities close to 0.5 to two different classes. The CRF layer smooths these discontinuities and provides cleaner pseudo-labels for model re-training. We also tried an idea from [19] and [20], where the authors place an LSTM block in the bottleneck of the UNet architecture, but without success.

The loss function is a combination of cross-entropy ($L_{CE}$), dice ($L_{dice}$) and total variation ($L_{TV}$) losses with corresponding weights:

$$L = w_{CE} L_{CE} + w_{dice} L_{dice} + w_{TV} L_{TV} \quad (1)$$

Cross-entropy penalizes wrong predictions; it is a smooth, differentiable function that is easy to optimize:

$$L_{CE} = -\frac{1}{N} \sum_{c=1}^{C} w_c \sum_{i=1}^{N} t_{i,c} \log(p_{i,c}), \quad (2)$$

where $C$ is the number of classes, $N$ is the number of observations (e.g., pixels in the image), $p_{i,c}$ is the output class probability and $t_{i,c}$ the target distribution for observation $i$ and class $c$. All loss terms use class weights $w_c$, which are inversely proportional to the class sizes, with their sum equal to 1.

Dice loss usually performs better for imbalanced classes, which is often the case in segmentation tasks, where the background (non-object) area is larger than the area occupied by the object. Dice loss was proposed by [21], where it was stated that it works better than logistic loss with class weights. We apply the multi-class dice loss $L_{dice}$ as follows:

$$L_{dice} = \frac{1}{CN} \sum_{c=1}^{C} w_c \sum_{i=1}^{N} \left(1 - \frac{2\, p_{i,c}\, t_{i,c}}{p_{i,c} + t_{i,c} + \varepsilon}\right), \quad (3)$$

where $\varepsilon$ is a small additional term that prevents division by zero.
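Equations (2) and (3), together with the total-variation term $L_{TV}$ that Eq. (1) combines them with, can be written directly in NumPy. A sketch under our own layout assumptions (pixels flattened to an $(N, C)$ array for Eqs. (2)–(3); per-class maps as $(C, H, W)$ for $L_{TV}$); all names and toy inputs are illustrative:

```python
import numpy as np

def weighted_ce_dice(p, t, w, eps=1e-6):
    """Class-weighted cross-entropy (Eq. 2) and dice loss (Eq. 3).

    p : (N, C) predicted class probabilities
    t : (N, C) one-hot target distribution
    w : (C,) class weights, inversely proportional to class size, summing to 1
    """
    n, c = p.shape
    # Eq. (2): L_CE = -1/N * sum_c w_c * sum_i t_ic * log(p_ic)
    ce = -(w * (t * np.log(p + eps)).sum(axis=0)).sum() / n
    # Eq. (3): L_dice = 1/(CN) * sum_c w_c * sum_i (1 - 2 p t / (p + t + eps))
    dice = (w * (1.0 - 2.0 * p * t / (p + t + eps)).sum(axis=0)).sum() / (c * n)
    return ce, dice

def tv_loss(prob, w):
    """Class-weighted total-variation term L_TV for one image.

    prob : (C, H, W) per-class probability maps; w : (C,) class weights
    """
    n = prob.shape[1] * prob.shape[2]
    d1 = np.abs(np.diff(prob, axis=1)).sum(axis=(1, 2))  # |D1(p)|
    d2 = np.abs(np.diff(prob, axis=2)).sum(axis=(1, 2))  # |D2(p)|
    return (w * (d1 + d2)).sum() / n

# Eq. (1): L = 0.25*L_CE + 0.65*L_dice + 0.10*L_TV (the paper's weights).

# Toy check: two pixels, two classes, perfect one-hot predictions.
p = np.array([[1.0, 0.0], [0.0, 1.0]])
t = p.copy()
w = np.array([0.5, 0.5])
ce, dice = weighted_ce_dice(p, t, w)  # ce is ~0 for a perfect prediction

# A 2x2 probability map with one hard horizontal class boundary.
prob = np.zeros((1, 2, 2))
prob[0, 1, :] = 1.0
tv = tv_loss(prob, np.array([1.0]))
```

Note that, taken literally, the sum in Eq. (3) runs over all cells, including those where both $p_{i,c}$ and $t_{i,c}$ are zero (each contributing 1), so the dice term does not reach exactly zero even for a perfect prediction.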
The TV loss is computed as

$$L_{TV} = \frac{1}{N} \sum_{c=1}^{C} w_c \sum_{i=1}^{N} \left( \left|\mathbf{D}_1(p_{i,c})\right| + \left|\mathbf{D}_2(p_{i,c})\right| \right), \quad (4)$$

where $\mathbf{D}_1$ and $\mathbf{D}_2$ are finite-difference operators along the first and second spatial dimensions. This term pushes the predicted probabilities to be more 'blocky', which potentially leads to smoother predicted labels. The weights for the loss terms were chosen empirically: $w_{CE} = 0.25$, $w_{dice} = 0.65$ and $w_{TV} = 0.1$.

3.3. Results

The results show improvement after applying the pseudo-labeling technique. The test dataset was divided into two parts; after the first training round, labels for the first half of the test cube were predicted, and that half was added to the training set. The model was then re-trained. The metrics are shown in Table 2 below.

Table 2: Class-wise metrics

Class     | 1    | 2    | 3    | 4    | 5    | 6
Precision | 0.99 | 0.85 | 0.99 | 0.81 | 0.97 | 0.72
Recall    | 0.96 | 0.98 | 0.96 | 0.91 | 0.92 | 0.98
F1-score  | 0.98 | 0.91 | 0.98 | 0.86 | 0.94 | 0.83
Accuracy  | 0.96 | 0.98 | 0.96 | 0.91 | 0.92 | 0.98

These scores outperform those reported for the "baseline" model of [1] as well as those of [22] (mean class accuracies of 0.804 and 0.78 respectively, compared with our mean class accuracy of 0.952). Figure 6 illustrates the labels of inline slice 200. The re-trained model overcomes the prediction imperfections of the first training round and predicts smooth layers. Panel d) shows the maximum class probability, which can be treated as network confidence; predictions at class boundaries are less confident than those inside the layers. We also tried a 2.5D (or pseudo-3D) approach, taking not a single vertical slice from the image cube but a stack of several slices (3 or 5); it gave no meaningful improvement over single-slice input, which agrees with the conclusion of [23]. Table 3 summarizes the different approaches.
Table 3: Mean accuracy for different algorithm variations

Variation     | No augment | Flip/rotate | Random warp | 3 slices | 5 slices | Pseudo labels
Mean accuracy | 0.880      | 0.913       | 0.919       | 0.896    | 0.897    | 0.952

Figure 6: Prediction results: a) predicted inline slice 200 after the first training round; b) predicted inline slice 200 after adding pseudo-labels to the training set and re-training the model; c) true labels of inline slice 200; d) maximum of class probability.

4. Discussion

In this work we applied the pseudo-labeling technique to the seismic facies segmentation task. We show that adding predictions for slices close to the training cube improves the network's overall performance on the test set. We also applied a TV loss component that forces the predictions to be smooth, and a special non-linear warping of patches that increases the diversity of the training data. The results show superior performance over previously reported techniques.

5. References

[1] Y. Alaudah, P. Michałowicz, M. Alfarraj, G. AlRegib, A machine-learning benchmark for facies classification. Interpretation, 7(3), (2019) SE175-SE187. doi:10.1190/INT-2018-0249.1
[2] L. Huang, X. Dong, T. E. Clee, A scalable deep learning platform for identifying geologic features from seismic attributes. The Leading Edge, 36(3), (2017) 249-256. doi:10.1190/tle36030249.1
[3] Y. Alaudah, G. AlRegib, Weakly-supervised labeling of seismic volumes using reference exemplars, in: Proceedings of the 2016 IEEE International Conference on Image Processing, 2016, pp. 4373-4377. doi:10.1109/icip.2016.7533186
[4] P. Meldahl, R. Heggland, B. Bril, P. de Groot, Identifying faults and gas chimneys using multiattributes and neural networks. The Leading Edge, 20(5), (2001) 474-482. doi:10.1190/1.1438976
[5] M. M. Saggaf, M. N. Toksöz, M. I. Marhoon, Seismic facies classification and identification by competitive neural networks. Geophysics, 68(6), (2003) 1984-1999. doi:10.1190/1.1635052
[6] O. Ronneberger, P. Fischer, T. Brox, U-Net: Convolutional networks for biomedical image segmentation. arXiv:1505.04597 (2015). URL: https://arxiv.org/abs/1505.04597
[7] H. Di, Z. Wang, G. AlRegib, Why using CNN for seismic interpretation? An investigation, in: SEG Technical Program Expanded Abstracts 2018, pp. 2216-2220. doi:10.1190/segam2018-2997155.1
[8] M. Alfarhan, M. Deriche, A. Maalej, Robust concurrent detection of salt domes and faults in seismic surveys using an improved UNet architecture. IEEE Access (2020). doi:10.1109/access.2020.3043973
[9] V. Puzyrev, C. Elders, Unsupervised seismic facies classification using deep convolutional autoencoder. arXiv preprint arXiv:2008.01995 (2020). URL: https://arxiv.org/abs/2008.01995
[10] D. Chevitarese, D. Szwarcman, R. M. D. Silva, E. V. Brazil, Transfer learning applied to seismic images classification. AAPG Annual Convention and Exhibition (2018). URL: https://www.searchanddiscovery.com/documents/2018/42285chevitarese/ndx_chevitarese.pdf
[11] Y. Alaudah, S. Gao, G. AlRegib, Learning to label seismic structures with deconvolution networks and weak labels, in: SEG Technical Program Expanded Abstracts 2018, pp. 2121-2125. doi:10.1190/segam2018-2997865.1
[12] A. Saleem, J. Choi, D. Yoon, J. Byun, Facies classification using semi-supervised deep learning with pseudo-labeling strategy, in: SEG Technical Program Expanded Abstracts 2019, pp. 3171-3175. doi:10.1190/segam2019-3216086.1
[13] Y. Babakhin, A. Sanakoyeu, H. Kitamura, Semi-supervised segmentation of salt bodies in seismic images using an ensemble of convolutional neural networks, in: Pattern Recognition 2019, pp. 218-231. doi:10.1007/978-3-030-33676-9_15
[14] M. Tan, Q. Le, EfficientNet: Rethinking model scaling for convolutional neural networks, in: Proceedings of the International Conference on Machine Learning 2019, pp. 6105-6114.
[15] B. Baheti, S. Innani, S. Gajre, S. Talbar, Eff-UNet: A novel architecture for semantic segmentation in unstructured environment, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops 2020, pp. 358-359. doi:10.1109/cvprw50498.2020.00187
[16] L. D. Huynh, N. Boutry, A U-Net++ with pre-trained EfficientNet backbone for segmentation of diseases and artifacts in endoscopy images and videos, in: EndoCV@ISBI 2020, pp. 13-17. URL: http://ceur-ws.org/Vol-2595/endoCV2020_paper_id_11.pdf
[17] A. G. Roy, N. Navab, C. Wachinger, Concurrent spatial and channel 'squeeze & excitation' in fully convolutional networks, in: Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention 2018, pp. 421-429. doi:10.1007/978-3-030-00928-1_48
[18] L. C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A. L. Yuille, Semantic image segmentation with deep convolutional nets and fully connected CRFs. arXiv preprint arXiv:1412.7062 (2014). URL: https://arxiv.org/abs/1412.7062
[19] A. A. Novikov, D. Major, M. Wimmer, D. Lenis, K. Bühler, Deep sequential segmentation of organs in volumetric medical scans. IEEE Transactions on Medical Imaging, 38(5), (2018) 1207-1215. doi:10.1109/tmi.2018.2881678
[20] A. Pfeuffer, K. Schulz, K. Dietmayer, Semantic segmentation of video sequences with convolutional LSTMs, in: Proceedings of the IEEE Intelligent Vehicles Symposium 2019, pp. 1441-1447. doi:10.1109/ivs.2019.8813852
[21] F. Milletari, N. Navab, S. A. Ahmadi, V-Net: Fully convolutional neural networks for volumetric medical image segmentation, in: Proceedings of the 4th International Conference on 3D Vision 2016, pp. 565-571. doi:10.1109/3dv.2016.79
[22] M. Q. Nasim, T. Maiti, A. Shrivastava, T. Singh, J. Mei, Seismic facies analysis: a deep domain adaptation approach. arXiv preprint arXiv:2011.10510 (2020). URL: https://arxiv.org/abs/2011.10510
[23] M. H. Vu, G. Grimbergen, T. Nyholm, T. Löfstedt, Evaluation of multislice inputs to convolutional neural networks for medical image segmentation. Medical Physics, 47(12), (2020) 6216-6231. doi:10.1002/mp.14391