
NEURAL NETWORK APPROACH TO THE PROBLEM OF IMAGE SEGMENTATION FOR MORPHOLOGICAL STUDIES

Alexey V. Stadnik (2), Oxana I. Streltsova (2, strel@jinr.ru), Dmitry V. Podgainy (2), Inna A. Kolesnikova (1), Yuri A. Butenko (2), Andrey V. Nechaevskiy (2), Anastasia I. Anikina (2), Timur V. Gudiev (3), Alexey I. Streltsov (4)

(1) Laboratory of Radiation Biology, JINR, 6 Joliot-Curie St., Dubna, 141980, Russia
(2) Meshcheryakov Laboratory of Information Technologies, JINR, 6 Joliot-Curie St., Dubna, 141980, Russia
(3) Regional Scientific and Educational Mathematical Center, North Ossetian State University after K. L. Khetagurov, 44-46 Vatutina, Vladikavkaz, 362025, North Ossetia - Alania, Russia
(4) SAP SE, Walldorf, 69190, Germany

2021


The report presents results on the development of the algorithmic block of the Information System (IS) for radiobiological studies, created within a joint project of MLIT and LRB JINR. The focus is the segmentation problem in morphological research on the effect of ionizing radiation on biological objects. Within the project, the morphological analysis of histological preparations is automated by implementing algorithms based on a neural network approach and computer vision methods. The results will be used in the development of the algorithmic block of the BIOHLIT information system.

Keywords: morphological analysis, image segmentation, neural networks, automated image processing

1. Introduction

Morphological data are photographic images of histological preparations of serial brain sections of laboratory animals. The morphological analysis of nervous tissue comprises image segmentation, the classification of cells by type of disorder, and the assessment of structural changes in the tissue. Different cell types in this context correspond to different degrees of damage to a brain neuron. As part of an experiment, scientists usually obtain data on the behavioral reactions of laboratory animals in test setups, as well as the results of a histological examination of the collected biological material. The overall goal of such a complex analysis is to study morphofunctional changes in the central nervous system.

The automation of the process of analyzing photographic images of histological preparations is important, since it allows one to speed up the acquisition of data, to increase their volume and reduce the likelihood of classification errors arising from the human factor. For this purpose, an information system for radiobiological studies, BIOHLIT [1], is being developed.

The main goals of the work are to automate the analysis and labeling of images and to extract the maximum amount of data for research groups. To this end, a web-based service with a client-server architecture has been implemented [3] on the HybriLIT platform [2].

2. Training sample

The data for training and building a working neural network model consist of the original image and the corresponding image markup, which can be produced in various ways. Figure 1 illustrates expert markup with markers applied to the image, and a variant of the markup in which objects are specified by a mask with a different color per class. The latter is equivalent to the set of binary masks required in a multiclass segmentation task.

For a neural network, the segmentation problem is formulated on a set of images with masks. A separate mask is formed for each type of disorder.
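The equivalence between a color-coded mask and a set of binary masks can be sketched as follows. This is a minimal NumPy illustration, not the markup tooling used in the project: the color table `CLASS_COLORS`, the class names and the function `color_mask_to_binary_masks` are hypothetical.

```python
import numpy as np

# Hypothetical colour code: one RGB colour per disorder type (illustrative only).
CLASS_COLORS = {
    "type_a": (255, 0, 0),
    "type_b": (0, 255, 0),
    "type_c": (0, 0, 255),
}


def color_mask_to_binary_masks(color_mask, class_colors):
    """Split an H x W x 3 colour-coded mask into a C x H x W stack of 0/1 masks."""
    masks = []
    for color in class_colors.values():
        # A pixel belongs to the class when all three channels match its colour.
        reference = np.array(color, dtype=color_mask.dtype)
        matches = np.all(color_mask == reference, axis=-1)
        masks.append(matches.astype(np.uint8))
    return np.stack(masks, axis=0)


# Toy 2 x 3 markup: black is background, coloured pixels are marked objects.
toy = np.zeros((2, 3, 3), dtype=np.uint8)
toy[0, 0] = (255, 0, 0)
toy[0, 1] = (0, 255, 0)
toy[1, 2] = (0, 255, 0)

binary = color_mask_to_binary_masks(toy, CLASS_COLORS)
print(binary.shape)              # one channel per class
print(binary.sum(axis=(1, 2)))   # number of marked pixels per class
```

Each channel of `binary` is then the target mask for one type of disorder.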

Options for creating a training sample are:

- expert markup;
- (semi-)automated markup using adaptive threshold filtering, segmentation and contour search.

Expert markup is the most accurate, but also the most time-consuming, way of obtaining a training sample. However, classical image processing techniques, such as edge search, make it possible to extract additional information about the data and to perform a preliminary analysis. In some cases, this can be used to create markup. Figure 2 shows the results of such an analysis for some of the images; statistics on cell size can be extracted in the same way.
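A classical preliminary analysis of this kind can be sketched in a few lines. The example below is an assumption-laden illustration rather than the pipeline used in this work. It uses plain NumPy in place of a library routine such as OpenCV's `cv2.adaptiveThreshold`, and the window size `block`, the `offset` and the synthetic test image are invented for the example. It thresholds each pixel against its local mean and then measures the area of every connected object.

```python
import numpy as np


def adaptive_mean_threshold(gray, block=15, offset=10):
    """Mark a pixel as foreground (1) when it is darker than the mean of its
    block x block neighbourhood by more than `offset` (block must be odd)."""
    pad = block // 2
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    # Integral image: every local window sum is obtained from four lookups.
    integral = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    h, w = gray.shape
    window_sum = (integral[block:block + h, block:block + w]
                  - integral[:h, block:block + w]
                  - integral[block:block + h, :w]
                  + integral[:h, :w])
    local_mean = window_sum / (block * block)
    return (gray < local_mean - offset).astype(np.uint8)


def component_areas(binary):
    """Return the pixel areas of all 4-connected foreground components."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    areas = []
    for sy in range(h):
        for sx in range(w):
            if not binary[sy, sx] or seen[sy, sx]:
                continue
            # Flood fill from the first unseen pixel of a new component.
            stack, area = [(sy, sx)], 0
            seen[sy, sx] = True
            while stack:
                y, x = stack.pop()
                area += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny, nx] and not seen[ny, nx]):
                        seen[ny, nx] = True
                        stack.append((ny, nx))
            areas.append(area)
    return areas


# Synthetic image: bright background with two dark square "cells".
image = np.full((40, 40), 200, dtype=np.uint8)
image[5:10, 5:10] = 50      # 5 x 5 object
image[25:33, 20:28] = 50    # 8 x 8 object

foreground = adaptive_mean_threshold(image)
areas = component_areas(foreground)
print(sorted(areas))
```

The list of areas is the kind of size statistic mentioned above, and the thresholded image can serve as a first draft of the markup.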

3. Segmentation using a neural network approach

As a solution, the U-net neural network architecture [4] was chosen, since it is well suited for image segmentation problems.

The key features of the architecture are:
- an information path similar to that of an autoencoder (a compressing and an expanding path), which creates a "bottleneck" for the information flow and yields a high-quality generalization of the data;
- transfer of information across the layers of the network (similar to ResNet [5]), which increases the spatial resolution and reduces gradient attenuation during model training.
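The effect of the two features can be seen on plain arrays. The following is a minimal NumPy sketch, not the network used in this work: it has no learned convolutions, and the function names `downsample`, `upsample` and `encoder_decoder` are illustrative assumptions.

```python
import numpy as np


def downsample(x):
    """2 x 2 average pooling on a C x H x W array (H and W must be even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))


def upsample(x):
    """Nearest-neighbour upsampling by a factor of 2 along H and W."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def encoder_decoder(x, use_skip):
    """One compress/expand level, optionally with a skip connection."""
    bottleneck = downsample(x)        # compressing path: half the resolution
    expanded = upsample(bottleneck)   # expanding path: back to the input size
    if use_skip:
        # U-net style: encoder features are concatenated along the channel
        # axis, so full-resolution detail stays available to the decoder.
        return np.concatenate([x, expanded], axis=0)
    return expanded                   # plain autoencoder path: detail is blurred


# A single bright pixel stands in for a small detail such as a nucleus.
img = np.zeros((1, 4, 4))
img[0, 1, 1] = 1.0

plain = encoder_decoder(img, use_skip=False)
skipped = encoder_decoder(img, use_skip=True)
print(plain.shape, plain.max())
print(skipped.shape, skipped.max())
```

With the skip connection the output keeps a channel in which the bright pixel is intact, whereas the pure compress/expand path smears it over a 2 x 2 block.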

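For the loss function, the well-proven Dice function (Dice loss) was chosen for the segmentation problem. It expresses the fraction of overlap between the predicted region A and the target region B:

$$\mathrm{Dice}(A, B) = \frac{2\,|A \cap B|}{|A| + |B|} \qquad (1)$$

The value is 1 for a perfect match and 0 when the regions do not overlap, so the quantity minimized during training is conventionally 1 - Dice(A, B).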
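The overlap measure in (1) is straightforward to evaluate on binary masks. The sketch below assumes the common convention that the loss is one minus the overlap fraction, which the text does not spell out. The smoothing term `eps` and the function names are illustrative assumptions, not taken from the authors' code.

```python
import numpy as np


def dice_coefficient(pred, target, eps=1e-7):
    """Overlap fraction 2|A n B| / (|A| + |B|) for two masks of equal shape."""
    pred = pred.astype(np.float64)
    target = target.astype(np.float64)
    intersection = (pred * target).sum()
    # eps avoids division by zero when both masks are empty.
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


def dice_loss(pred, target):
    """Loss to minimize: 0 for a perfect match, close to 1 for no overlap."""
    return 1.0 - dice_coefficient(pred, target)


pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])

# Two pixels overlap, each mask has three pixels: 2 * 2 / (3 + 3) = 2/3.
print(dice_coefficient(pred, target))
print(dice_loss(pred, target))
```

The same functions accept soft predictions in the range [0, 1], which is how such a loss is typically applied to network outputs.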

The training of neural network models and work with data were carried out on the HybriLIT heterogeneous cluster, which is part of the Multifunctional Information and Computing Complex (MICC) of the Meshcheryakov Laboratory of Information Technologies of JINR, Dubna. The platform consists of the “Govorun” supercomputer and the HybriLIT training and testing polygon. We studied neural network models of different capacities, but all of them began to overfit after 80-90 epochs. At the same time, lower-capacity networks reached better results than higher-capacity ones. A graph illustrating overfitting for the best case is shown in [6].
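The epoch at which a model starts to overfit can be located from the validation loss curve. The sketch below is not the authors' procedure. The function `find_best_epoch`, the `patience` parameter and the synthetic curve are assumptions made for illustration, with the turning point placed in the 80-90 epoch range only to mirror the behaviour described above.

```python
def find_best_epoch(val_losses, patience=10):
    """Return (best_epoch, stop_epoch) for a list of per-epoch validation losses.

    best_epoch is the epoch with the lowest loss seen before stopping;
    stop_epoch is the epoch at which `patience` epochs have passed without
    improvement, or the last epoch if that never happens.
    """
    best_loss, best_epoch = float("inf"), -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(val_losses) - 1


# Synthetic curve (not measured data): improves up to epoch 85, then degrades.
curve = ([1.0 - 0.01 * e for e in range(86)]
         + [0.15 + 0.005 * k for k in range(1, 40)])

best, stop = find_best_epoch(curve, patience=10)
print(best, stop)
```

Keeping the weights from `best` rather than from the final epoch is a simple guard against the degradation seen once overfitting sets in.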

In general, the result was unsatisfactory. Overfitting may be caused by any one of the following factors or by several of them together:
- insufficient amount of data;
- systematic error in the data;
- unsuccessful network architecture;
- insufficient data completeness;
- unbalanced data (by class).
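Of the factors listed, class imbalance is the easiest to check directly from the masks. The following sketch uses invented toy data and hypothetical function names. It measures the share of marked pixels per class and derives inverse-frequency weights, one common heuristic for compensating imbalance. It is not a step reported in this work.

```python
import numpy as np


def class_pixel_fractions(masks):
    """masks: C x H x W stack of 0/1 masks; returns each class's share of marked pixels."""
    counts = masks.reshape(masks.shape[0], -1).sum(axis=1).astype(np.float64)
    total = counts.sum()
    return counts / total if total > 0 else counts


def inverse_frequency_weights(fractions, eps=1e-6):
    """Weights proportional to 1 / frequency, normalized to sum to 1."""
    weights = 1.0 / (fractions + eps)
    return weights / weights.sum()


# Toy stack with 90, 9 and 1 marked pixels for three classes.
masks = np.zeros((3, 10, 10), dtype=np.uint8)
masks[0].flat[:90] = 1
masks[1].flat[:9] = 1
masks[2].flat[:1] = 1

fractions = class_pixel_fractions(masks)
weights = inverse_frequency_weights(fractions)
print(fractions)
print(weights)
```

A strongly skewed `fractions` vector indicates that rare classes contribute little to an unweighted loss.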

For one example from the test set, the result of classification by the trained model is shown in Figure 4. It shows that training drove the network mainly toward semantic segmentation, i.e. toward separating objects from the image background.

4. Semantic segmentation

The analysis showed that U-net tried to select all segments in the image. Therefore, it seems reasonable to divide the original problem into two corresponding stages:
- segmentation;
- type classification.

The neuron type classification stage can be carried out either with the U-net architecture, supplied with the additional information already obtained at the segmentation stage, or by making a decision segment by segment for each object.

To form a training sample for the semantic segmentation problem, markup was generated automatically using the classical approach: a contour search on the image after an adaptive threshold transformation. As part of this work, we also studied how stable the solution of the segmentation problem is with respect to systematic errors in the data markup.

For this, two samples were formed:
- a contour mask, produced by the contour search in the OpenCV library;
- a convex contour mask.
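The difference between the two kinds of mask can be illustrated on a single object. The sketch below is a NumPy-only stand-in, not the OpenCV-based code used for the samples. OpenCV offers `cv2.findContours` and `cv2.convexHull` for the same purpose, and the function names and the L-shaped toy object here are assumptions. It fills the convex hull of an object's pixels.

```python
import numpy as np


def convex_hull(points):
    """Andrew's monotone chain; points are (x, y) tuples, hull is returned in order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def convex_mask(mask):
    """Fill the convex hull of all foreground pixels of a single-object mask."""
    ys, xs = np.nonzero(mask)
    hull = convex_hull(list(zip(xs.tolist(), ys.tolist())))
    filled = np.zeros_like(mask)
    if len(hull) < 3:
        # Degenerate case (point or straight line): the hull is the object itself.
        filled[ys, xs] = 1
        return filled
    gy, gx = np.mgrid[0:mask.shape[0], 0:mask.shape[1]]
    inside = np.ones(mask.shape, dtype=bool)
    for (x0, y0), (x1, y1) in zip(hull, hull[1:] + hull[:1]):
        # Keep pixels whose cross product with every hull edge is non-negative.
        inside &= (x1 - x0) * (gy - y0) - (y1 - y0) * (gx - x0) >= 0
    filled[inside] = 1
    return filled


# L-shaped object: its contour mask is concave, its convex mask is a triangle.
contour_mask = np.zeros((5, 5), dtype=np.uint8)
contour_mask[:, 0] = 1
contour_mask[4, :] = 1

hull_mask = convex_mask(contour_mask)
print(contour_mask.sum(), hull_mask.sum())
```

The convex mask always contains the contour mask and adds the pixels in its concavities, which is the kind of systematic difference between the two samples whose effect on training is being tested.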

Small-capacity networks were chosen as neural network architectures for segmentation:
- a simple convolutional network without pooling layers, i.e. without reducing the spatial resolution;
- an autoencoder, i.e. a compressive mapping, which is essentially U-net without concatenations (forwarding of information across levels).

An autoencoder achieves greater generality of the model, removes background remnants and selects segments better, but loses spatial resolution of the image, i.e. small details (such as the nuclei of neurons).

Training both models on the two sets gave almost identical classification results, which allows us to conclude that markup errors of this kind have an insignificant effect in this scheme. Overall, the autoencoder showed the best result (Fig. 6).

5. Conclusion

An autoencoder model that performs semantic segmentation of the images has been obtained, and the resulting solution has been shown to be stable with respect to systematic errors in the markup.

Further work is aimed at solving the problem of classifying brain neurons by type and at increasing the sample, including with data from open sources [7].

References

[2] HybriLIT heterogeneous cluster, http://hlit.jinr.ru/
[3] Yu. Butenko, D. Marov, A. Nechaevskiy, D. Podgainy. Development of a Service for Conducting Radiobiological Studies on the HybriLIT Platform // Proceedings of the Workshop on Information Systems for the Radiation Biology Tasks, Dubna, Russia, June 18, 2020, pp. 26-33. http://ceur-ws.org/Vol-2743/26-33-paper-4.pdf
[4] O. Ronneberger, P. Fischer, T. Brox. U-Net: Convolutional Networks for Biomedical Image Segmentation // Computer Science Department and BIOSS Centre for Biological Signalling Studies, University of Freiburg, Germany, arXiv:1505.04597v1 [cs.CV], 18 May 2015.
[5] K. He, X. Zhang, S. Ren, J. Sun. Deep Residual Learning for Image Recognition // arXiv:1512.0338, 10 December 2015.
[6] A. Stadnik, A. Streltsov, O. Streltsova. Algorithms Based on Neural Network Approach for Image Segmentation in Research Morphofunctional Changes in the Central Nervous System // Proceedings of the Workshop on Information Systems for the Radiation Biology Tasks, Dubna, Russia, June 18, 2020, pp. 39-47. http://ceur-ws.org/Vol-2743/39-47-paper-6.pdf