=Paper=
{{Paper
|id=Vol-1787/245-249-paper-41
|storemode=property
|title=Distributed system for detection of biological contaminants
|pdfUrl=https://ceur-ws.org/Vol-1787/245-249-paper-41.pdf
|volume=Vol-1787
|authors=Valery Grishkin,Konstantin Smirnov,Nikolai Stepenko
}}
==Distributed system for detection of biological contaminants==
V. M. Grishkin (a), K. V. Smirnov (b), N. A. Stepenko (c)

Saint Petersburg State University, 7/9 Universitetskaya nab., Saint Petersburg, 199034, Russia

E-mail: (a) valery-grishkin@yandex.ru, (b) kvsmirnov@list.ru, (c) nick_st@mail.ru

The paper proposes a distributed system for detecting the types of biological contaminants present on the surfaces of objects. The system implements a biofouling detection method based on image processing. It processes a series of object images obtained in the visible and near-infrared spectral ranges. One image in the series is marked as the base image. All images of the series are converted to a common shooting point and a common angle. The object of interest is detected in the base image, and then the background is removed from all images. To recognize the type of biological contaminant, we use a pre-trained classifier based on the support vector machine method.

The proposed detection method is naturally data-parallel: each image in the series, except the base image, can be processed independently. It is therefore easy to implement on a computing cluster. The central host of the cluster runs the non-parallel branches of the image-processing algorithm, namely interactive segmentation, the search for key points in the base image, and classifier training. The central host also distributes data among the cluster nodes and synchronizes them. The other nodes of the cluster process the remaining images of the series. They search for key points, convert all images to a common shooting point, remove the background, and identify the types of biofouling. After processing, a pollution map is formed for each image.
This map is then sent to the storage at the central node of the cluster.

Keywords: parallel image processing, pattern recognition, biofouling identification.

The work was supported by Saint Petersburg State University, research grant 9.37.157.2014.

© 2016 Valery M. Grishkin, Konstantin M. Smirnov, Nikolai A. Stepenko

===Introduction===
Biofouling is the unwanted accumulation of biological substances on surfaces that are constantly exposed to aggressive environments. The traditional approach to detecting biological destructors is to take samples from the surface for subsequent laboratory analysis. This approach is laborious and expensive, and it requires expert consultation. At the same time, any biological object absorbs or reflects light in different bands of the spectrum, and different kinds of biological objects have different spectral characteristics. This makes it possible to identify the kind of a biological object from its spectral characteristics. In this article we propose a distributed system for biofouling detection that processes object images obtained in the visible and near-infrared spectral bands and allows large image collections to be processed in parallel.

===The method for detection of biofouling===
The method is based on implicit measurement of the spectral characteristics of biological and non-biological substances found on an object's surface, followed by recognition of the substance types [Grishkin, Kovshov, …, 2015]. The estimated ratios of spectral characteristics are used to build a vector of informative features. We suggest using a feature vector V consisting of 9 components. To recognize the types of biofouling, a standard classifier is trained on several training sets of images of real objects. The objects were photographed in the visible and near-infrared bands of the spectrum. Cross-validation was used during training.

To evaluate the feature vectors correctly, all object images must be converted to a common shooting point, and the background must be removed from each image. In the first stage of preprocessing, we pick a base image from the collection. Each image in the collection is converted to the shooting point and angle of the base image. To achieve that, we estimate a homography matrix for each image in the collection. We first look for key points in the base image and in the current image. Then, for each key point of the current image, a matching point is found in the base image. From these point sets, the algorithm searches for suitable values of the homography parameters and applies the resulting transformation to the current image of the series. Key points are detected with the SURF algorithm [Bay, Ess, …, 2008]. Matches between the key points of the base and current images are found using the RANSAC method [Fischler, Bolles, 2003].

In the second stage of preprocessing, the background is removed. For that purpose, all images of the collection are segmented to isolate the object of interest. The object is first found in the base image using an interactive segmentation method (we use GrabCut [Rother, Kolmogorov, …, 2004]). If the segmentation is visually confirmed to be successful, the background is removed from the base image and a binary mask of the base image is built. Then a logical AND is applied between every converted image of the series and the mask of the base image. This yields a new series of images that contain only the segmented object, without background [Grishkin, Zhabko, …, 2014].

===Distributed implementation of the method===
The proposed method has clear data parallelism. Each image in the series, except for the base image, can be processed independently of all others.
The method also has two non-parallel branches. One of them processes the training set of images and trains the classifier. The other processes the base image. The data obtained by these branches are used during the parallel processing of all images in the series.

The classifier is trained on a pre-selected training set of images of objects with known types of biofouling on their surfaces. The training set consists of object images taken in the visible and near-infrared bands of the spectrum. During preprocessing, each infrared image is converted to the shooting point and angle of the corresponding visible-band image.

Marking up the preprocessed training set requires an expert. The expert analyzes the converted visible-band image and marks a set of small areas corresponding to particular kinds of biological destructors, together with areas of the object not affected by biofouling. For each marked area, averaged values of the feature vector V are computed and stored in a training set of feature vectors. After all training sets have been processed, we obtain a labeled training series of feature vectors that is later used to train the classifier. In our work we use a support vector machine (SVM) [Steinwart, Christmann, 2008] with an RBF kernel as the classifier.

The other non-parallel branch of the algorithm processes the base image only. It interactively segments the base image to find the object and creates a binary mask of the segmented object. This branch also searches for key points in the base image. The key points and the mask are then used during the parallel processing of the images in the series.

The non-parallel branches of the algorithm are shown in Fig. 1 and Fig. 2. Both branches are executed on a local workstation that is not part of the computing cluster.

Fig. 1. First non-parallel branch of algorithm

Fig. 2. Second non-parallel branch of algorithm

Within the parallel branch of the algorithm, each pair of images (one visible-band and one infrared) is processed by one cluster node. The processing algorithm run by each cluster node is shown in Fig. 3. When processing a given series of images, each cluster node loads the set of key points of the base image, the mask of the segmented object, and the parameters of the trained classifier. Key points are found in each image of the pair. Using these key points and the key points of the base image, we estimate a homography matrix and use it to convert the current image. The converted image is then segmented using the object mask. The results of this processing are images of the object in the visible and infrared bands, converted to a common shooting point and with the background removed. For each pixel that belongs to the object, we evaluate a feature vector, which is fed to the trained classifier to recognize the biofouling type. The final result is a map of biofouling on the object's surface.

Fig. 3. Parallel branch of algorithm

===An implementation of the system===
Some components of the distributed system run on a computing cluster; others run on a local computer. The local computer hosts the non-parallel branches of the image-processing algorithm, namely interactive segmentation and the search for key points in the base image. The results produced by the non-parallel branches are loaded onto the central host of the cluster. The computing cluster runs the parallel branches of the algorithm using standard parallel computing libraries such as MPI [Jin, Jespersen, …, 2011]. The central host of the cluster distributes data among the cluster nodes, synchronizes the nodes, and stores the results. The other nodes of the cluster process all images in the series except the base image.
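The division of labour between the central host and the worker nodes can be sketched as follows. This is an illustration under stated assumptions, not the system described in the paper: a thread pool from the Python standard library stands in for the cluster nodes and for MPI, the image pairs are assumed to be already aligned to the base shooting point, and `toy_classify` is a hypothetical rule that replaces the trained SVM. All names (`process_pair`, `run_series`, `shared`) are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def process_pair(pair, shared):
    """Work done by one node for one (visible, infrared) image pair.

    `shared` holds the data every node loads once per series: the base-image
    mask and a classifier callable.
    """
    visible, infrared = pair["visible"], pair["infrared"]
    mask, classify = shared["mask"], shared["classify"]
    fouling_map = []
    for y, mask_row in enumerate(mask):
        row = []
        for x, inside in enumerate(mask_row):
            # Background pixels get no label; object pixels are classified.
            row.append(classify(visible[y][x], infrared[y][x]) if inside else None)
        fouling_map.append(row)
    return pair["name"], fouling_map


def run_series(pairs, shared, nodes=8):
    """Central-host role: hand out pairs, wait for all workers, store the maps."""
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        results = pool.map(lambda p: process_pair(p, shared), pairs)
        return dict(results)


def toy_classify(vis, nir):
    """Hypothetical stand-in for the trained classifier."""
    return "fouling" if nir > vis else "material"


shared = {"mask": [[1, 0], [1, 1]], "classify": toy_classify}
pairs = [
    {"name": "img1", "visible": [[5, 5], [5, 5]], "infrared": [[9, 9], [1, 9]]},
    {"name": "img2", "visible": [[5, 5], [5, 5]], "infrared": [[1, 1], [1, 1]]},
]
storage = run_series(pairs, shared, nodes=2)
```

Here `storage` plays the role of the result store on the central node: it maps each image name to its biofouling map. Because the workers only read `shared`, no synchronization is needed beyond waiting for all pairs to finish.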
These nodes search for key points, convert all images to a common shooting point, remove the background, and identify the types of biofouling. After processing, a pollution map is formed for each image. This map is sent to the storage located at the central node of the cluster.

===Experimental results===
To train the classifier, we used a series of 95 images of biological destructors obtained by photographing stone monuments of cultural heritage. Before the experiments, an expert classified this series into 9 classes of the most dangerous biological destructors found on monuments and 6 kinds of materials the monuments are made of.

After training, the system correctly classified 92% of biological destructors and building materials on a control set of 630 images. An example of a monument's source image and the corresponding map of the distribution of biological destructors is shown in Fig. 4. Running the distributed algorithm on a computing cluster of eight nodes reduced processing time by a factor of 7.5 compared to serial processing on a local workstation, which corresponds to a parallel efficiency of about 94%.

Fig. 4. Source image of monument with natural background and map of biofouling. Legend: monument material 25%, fungi 46%, colonies of moss 11%, lichens 18%.

===Conclusion===
In this paper we propose a distributed system for processing a series of object images photographed in a natural environment. The system converts all the images in a series to a common shooting point and angle, segments the image of the object itself, and removes the background. The system recognizes the types of biological pollutants found on the object's surface and builds pollution distribution maps for all images in the series. Experimental results show the effectiveness of the proposed recognition system in the assessment of biofouling. Parallel processing substantially reduces processing time for large series of images.

===References===
Grishkin V. M., Kovshov A. M., Schigorec S. B., Vlasov D. Yu., Zhabko A. P., Iakushkin O. O. A system for the recognition of biofouling on the surface of the monuments of cultural heritage // 2015 International Conference on "Stability and Control Processes" in Memory of V. I. Zubov (SCP 2015), Proceedings. 2015. P. 630–633. doi:10.1109/SCP.2015.7342244

Rother C., Kolmogorov V., Blake A. GrabCut: Interactive Foreground Extraction using Iterated Graph Cuts // ACM Transactions on Graphics (SIGGRAPH'04). 2004.

Bay H., Ess A., Tuytelaars T., Van Gool L. SURF: Speeded Up Robust Features // Computer Vision and Image Understanding (CVIU). 2008. Vol. 110, No. 3. P. 346–359.

Fischler M., Bolles R. Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography // Comm. of the ACM 24. 2003. P. 381–395.

Steinwart I., Christmann A. Support Vector Machines. New York, USA: Springer-Verlag. 2008. 610 p.

Jin H., Jespersen D., Mehrotra P., Biswas R., Huang L., Chapman B. High performance computing using MPI and OpenMP on multi-core parallel systems // Par. Computing. 2011. Vol. 37, Issue 9. P. 562–575.

Grishkin V. M., Zhabko A. P., Vlasov D. Yu., Schigorec S. B. Multiple segmentation of the image series // Int. Conf. Computer Technologies in Physical and Engineering Applications (ICCTPEA), 2014. doi:10.1109/ICCTPEA.2014.6893275