=Paper=
{{Paper
|id=Vol-2353/paper11
|storemode=property
|title=A Joint Application of Fuzzy Logic Approximation and a Deep Learning Neural Network to Build Fish Concentration Maps Based on Sonar Data
|pdfUrl=https://ceur-ws.org/Vol-2353/paper11.pdf
|volume=Vol-2353
|authors=Dmitry Glukhov,Rykhard Bohush,Juho Mäkiö,Tatjana Hlukhava
|dblpUrl=https://dblp.org/rec/conf/cmis/GlukhovBMH19
}}
==A Joint Application of Fuzzy Logic Approximation and a Deep Learning Neural Network to Build Fish Concentration Maps Based on Sonar Data==
Dmitry Glukhov¹ [0000-0003-4983-2919], Rykhard Bohush¹ [0000-0002-6609-5810],
Juho Mäkiö² [0000-0001-9987-7600] and Tatjana Hlukhava¹
¹ Polotsk State University, Blokhina st., 29, Novopolotsk, Republic of Belarus, 211440
{d.gluhov, r.bogush, t.gluhova}@psu.by
² University of Applied Sciences Emden/Leer, Constantiaplatz 4, Emden, Germany, D-26723
juho.maekioe@hs-emden-leer.de
Abstract. This paper proposes an effective method for obtaining a topographic lake
map with fish concentration from the results of intelligent sonar data processing.
A special implementation of fuzzy logic is used to approximate the sonar data. The
mathematical apparatus of fuzzy logic allows the approximator to be flexibly
adjusted to the conditions of the problem being solved when working with
high-dimensional data. An algorithm for obtaining fish concentration maps from the
results of intelligent processing of the sonar data is also proposed. The algorithm
consists of the following steps: separation of the input frame into overlapping
blocks, processing of the blocks with the convolutional neural network YOLO v2, and
merging of the extracted bounding boxes around one object. Experimental results for
fish detection and fish concentration map building are presented.
Keywords: sonar data; fish concentration; maps of lakes; fuzzy logic; convolutional
neural networks
1 Introduction
Modern tools for detecting underwater objects using ultrasound (sonars) have become
widespread in solving various applied problems. A highly specialized class of sonars
designed to study the relief of the lake bottom and to search for fish is called
echo sounders.
Currently, there is a wide range of sonars from different suppliers. The best-known
sonars are produced by Lowrance, Raymarine and Humminbird. Moreover, most modern
sonars have a GPS module. Such devices are called chartplotters. Chartplotters tie
echogram data to X, Y coordinates of the Mercator projection (WGS-84/UTM coordinate
system). A peculiarity of echograms is that GPS data is updated much less frequently
than ultrasonic sounding data, so an individual act of acoustic sounding cannot be
georeferenced on its own.
Modern echogram formats contain the Mercator projection coordinates, which change
stepwise after each GPS data update. The most common echogram formats are SLG and
SL2, developed by Lowrance. The SL2 format is used for multi-beam sonars equipped
with the DownScan bottom scan function and the StructureScan structural scan
(455 kHz or 800 kHz beam) and able to probe with the Primary and Secondary beams
(at frequencies of 83 kHz and 200 kHz) simultaneously.
Several software packages are designed for processing sonar data: ReefMaster by
ReefMaster Software Ltd., DrDepth (this project was purchased by Humminbird, which
created the program AutoCharts on its basis), Surfer, ArcGis, GlobalMapper and
others. In addition to their high cost, most geographic information systems (GIS)
ignore the acoustic echo information and analyze only the water depth data. This
approach does not allow the creation of a fish concentration map, a vegetation map,
a large fish habitat map, or maps of other water body characteristics that are
indirectly extracted from the echolocation data.
Traditionally, various methods of spatial interpolation are used to construct a
topographic map of the bottom from a discrete set of measurements. Geostatistical
estimation methods, such as kriging, require a large amount of computation but yield
interpolations that are optimal in a certain sense (kriging minimizes the estimation
variance among linear unbiased estimators).
When processing sonar data, it is important to note that the data is fragmentary,
limited, and often inadequate for obtaining reliable statistical estimates.
Uncertainty of this kind is an additional argument in favor of soft computing. If we
regard the unknown parameter as continuous, we can draw a parallel between inferring
the value of the unknown parameter and approximating a function.
The idea of applying a fuzzy logic approximator to construct a bottom topographic
map follows from an analogy. The set of depth point measurements can be considered a
system of knowledge about the properties and structure of the water body. Each
acoustic sounding can be described in terms of formal logic.
Applying state-of-the-art image recognition tools based on deep learning neural
networks makes it possible to use the echo sounder for new applied problems, such as
tourism, ecology, nature protection, and search tasks.
In [1] the authors propose sliding window filters with contour detection to extract
low-level features and fish contours from echo images. This approach cannot adapt to
the various shapes of fish schools and bottom artifacts, because the filter kernels
are not specialized for complicated forms.
In [2] an approach is presented that uses sliding window filtering to extract
objects from an echo image. First, a median filter is used to remove noise; then a
low-pass filter with an adaptive threshold separates tracks containing fish from the
background noise level; finally, a perimeter filter removes small regions with echo
pulses caused by stochastic noise and bottom structure. The described method can
give false negative results for fish schools of complicated form.
The algorithm presented in [3] uses a convolutional neural network (CNN) to extract
information about fish localization in an echo image. This algorithm works on sonar
images of a moving agent obtained by forward-looking sonar. The authors used the CNN
YOLO for binary classification. However, for high-resolution images and small fish
this algorithm can give many false negative results. In addition, schools of fish
can be missed, because they were not taken into account when training the CNN.
Traditionally, different kinds of interpolation methods are used to build a surface
topology map from a number of discrete measurements. Geostatistical estimation
methods [4], such as kriging, have large computational costs, but they can achieve
optimal interpolation. When processing echo data, it is important to take into
account that the data are limited and fragmentary, so a proper evaluation of
statistical characteristics from these values is hard. GPS data is updated less
often than the echo data, which is why georeferencing is performed for groups of
echo sounding points. These groups are not evenly distributed over the water body,
which is why we use fuzzy logic for the calculations.
We propose a novel approach to generating fish concentration maps from sonar data
using a CNN, which can adapt to different environmental conditions. The presented
approach to detecting fish or other objects in sonar images is based on the
following steps: 1) separation of the input image into overlapping blocks;
2) processing of the blocks with the CNN YOLO v2; and 3) merging of the extracted
bounding boxes around one object. After fish detection, to construct maps of the
distribution of features over the lake, we propose a novel method for approximating
the GPS-referenced CNN results based on an original implementation of fuzzy logic.
2 Fish detection using CNN
Deep machine learning systems provide strong performance in object detection and
classification challenges. Object detection systems have to deliver the following:
accuracy, precise extraction of regions of interest (RoIs) in images, classification
with minimal deviation, and speed. Typical image processing systems (optical
character recognition systems, video fire detection systems and others) usually
include the following steps: preprocessing, feature extraction, classification, and
context processing [5, 6]. Machine learning systems that simulate the human brain
can solve detection and classification problems as well as or even better than the
human brain, and they do so faster. Currently, CNNs are increasingly used for image
processing in various practical areas. Unlike traditional networks, CNNs have a
reduced number of parameters and, instead of processing the whole image, can process
only an extracted feature map, which takes the image topology into account and is
stable under affine transformations.
We analyzed well-known neural network architectures such as AlexNet [7], Faster
R-CNN [8], GoogLeNet [9], ResNet [10] and others. These networks process a feature
map extracted from the whole image, which can make calculation faster than
processing the whole image directly. According to [11], YOLO is the fastest object
detection system and outperforms Faster R-CNN in speed. In [12] the authors propose
YOLO v2 and YOLO9000, modifications of YOLO. Better detection and classification
were achieved by: 1) batch normalization from [13]; 2) a high-resolution classifier;
3) convolution with anchor boxes; 4) the use of k-means; 5) direct location
prediction; 6) RPN usage; 7) fine-grained features; 8) multi-scale training; and
9) the novel classification model Darknet-19.
The CNN Darknet-19 has 19 convolutional and five max-pooling layers. It can
distinguish 9000 classes. At training time, instead of fixing the input image size,
the network is changed every few iterations: every ten batches YOLO v2 randomly
chooses a new image dimension. Since the model downsamples by a factor of 32, the
dimension is drawn from the following multiples of 32: {320, 352, … , 608}. The
network is resized to that dimension and training continues.
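The multi-scale schedule described above can be illustrated with a short sketch. It only shows how the input size might be drawn; the function names, the starting size of 416 and the fixed random seed are assumptions made for the example and are not taken from the paper or from the Darknet code.

```python
import random


def multiscale_sizes(low=320, high=608, stride=32):
    """Candidate square input sizes: every multiple of `stride` in [low, high]."""
    return list(range(low, high + 1, stride))


def pick_size(batch_index, current, rng, every=10):
    """Draw a new random size every `every` batches, otherwise keep `current`."""
    if batch_index % every == 0:
        return rng.choice(multiscale_sizes())
    return current


rng = random.Random(0)  # seeded only so that the sketch is repeatable
size = 416              # hypothetical starting resolution
schedule = []
for batch in range(30):
    size = pick_size(batch, size, rng)
    schedule.append(size)
```

Every entry of `schedule` is a multiple of 32, and the size stays constant within each group of ten batches.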
Since the sonar moves along a complex trajectory with a varying speed while scanning
the lake, the echogram must be normalized. For this purpose, an algorithm was
developed that converts the echogram to metric coordinates along the length of the
sonar track. Through the corresponding stretching/compression of the echogram, all
objects of the acoustic echo are represented on a single scale (Fig. 1).
Fig. 1. Example for echogram normalization process
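The paper does not give the details of the normalization algorithm. The sketch below shows one possible way to bring an echogram to a uniform metric scale, assuming that each ping (echogram column) already has a cumulative track distance in metres and that nearest-ping selection is acceptable. The function name, its parameters and the nearest-neighbour rule are assumptions of this example.

```python
from bisect import bisect_left


def normalize_echogram(pings, distances, step):
    """Resample pings onto a uniform grid of `step` metres along the track.

    pings     -- list of echogram columns, one per sounding
    distances -- cumulative track length (m) of each ping, non-decreasing
    step      -- spacing of the output columns in metres
    """
    if not pings or len(pings) != len(distances):
        raise ValueError("pings and distances must be non-empty and equal in length")
    out = []
    n_out = int(distances[-1] // step) + 1
    for k in range(n_out):
        d = k * step
        j = bisect_left(distances, d)      # first ping at or beyond distance d
        if j == len(distances):
            j -= 1
        elif j > 0 and d - distances[j - 1] <= distances[j] - d:
            j -= 1                         # the previous ping is at least as close
        out.append(pings[j])
    return out


# Slow movement over the first metre (three pings), then a fast 2 m jump.
columns = normalize_echogram(["a", "b", "c", "d"], [0.0, 0.5, 1.0, 3.0], step=1.0)
```

In this toy input the densely sampled first metre is compressed and the sparsely sampled stretch between 1 m and 3 m is filled by repeating the nearest ping, so `columns` holds one column per metre.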
The input images are scaled before CNN processing, which means that tiny objects
(fish) can be missed. To solve this problem, we process patches of the echo image so
that tiny objects are detected precisely, and subsequently concatenate the output
results with post-processing.
We propose an effective algorithm for fish detection in sonar images based on the
following steps: separation of the input frame into overlapping blocks, processing
of the blocks with the CNN YOLO v2, and merging of the extracted bounding boxes
around one object.
The input image $I$ of size $H \times W$ is divided into overlapping blocks
$C_{i,j}$ of size $c_h \times c_w$, $i \in [0, H/c_h - 1]$, $j \in [0, W/c_w - 1]$.
The overlap size can vary with the input frame resolution and the relative size of
the smallest objects.
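A minimal sketch of the block separation step follows. It assumes a fixed overlap in pixels along each axis and places the last block flush with the image border; both choices, as well as the function names, are assumptions of the example and not details given in the paper.

```python
def block_origins(length, block, overlap):
    """Start offsets of blocks of size `block` that cover [0, length)."""
    if not 0 <= overlap < block:
        raise ValueError("overlap must satisfy 0 <= overlap < block")
    if length <= block:
        return [0]
    stride = block - overlap
    origins = list(range(0, length - block, stride))
    origins.append(length - block)  # last block is flush with the border
    return origins


def split_into_blocks(height, width, ch, cw, overlap_h, overlap_w):
    """Return (top, left, bottom, right) for every overlapping block."""
    return [
        (y, x, min(y + ch, height), min(x + cw, width))
        for y in block_origins(height, ch, overlap_h)
        for x in block_origins(width, cw, overlap_w)
    ]


blocks = split_into_blocks(100, 100, 40, 40, 10, 10)
```

For a 100×100 frame with 40×40 blocks and a 10-pixel overlap this gives a 3×3 grid of blocks starting at offsets 0, 30 and 60 along each axis.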
Each block is passed to the CNN YOLO v2 [12], in which objects are localized using
sequences of convolutional filters. YOLO v2 uses convolution with anchor boxes, as
in Faster R-CNN, and runs k-means (k=5) clustering to obtain good priors for the
predicted objects. After YOLO v2 processing, each block $C_{i,j}$ has bounding boxes
described by the top left corner coordinates $B_{i,j}(x_1, y_1)$, the bottom right
corner coordinates $B_{i,j}(x_2, y_2)$, the object class, and a probability value.
In the next step, block post-processing, we search for neighboring RoIs whose
combined overlapped region lies closer than 20% from the block edge. If such RoIs
are found, we calculate the IoU (Intersection over Union), which describes the
overlap of two regions:
$$IoU = \frac{B_1 \cap B_2}{B_1 \cup B_2}, \quad (1)$$
where $B_1$ and $B_2$ are the areas of the regions.
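The IoU test and a simple merge can be sketched as follows. Boxes are assumed to be already shifted into whole-frame coordinates as (x1, y1, x2, y2). The threshold of 0.5 and the rule of replacing two boxes by their common bounding rectangle are assumptions of the example, since the text does not state them.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge(a, b):
    """Smallest box that contains both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_overlapping(boxes, threshold=0.5):
    """Greedily merge boxes whose IoU reaches `threshold` (hypothetical value)."""
    merged = []
    for box in boxes:
        for k, kept in enumerate(merged):
            if iou(box, kept) >= threshold:
                merged[k] = merge(box, kept)
                break
        else:
            merged.append(box)
    return merged


result = merge_overlapping([(0, 0, 10, 10), (1, 0, 11, 10), (50, 50, 60, 60)])
```

Here the first two boxes describe the same object seen from two neighboring blocks and are merged, while the third box is kept unchanged.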
3 Maps building based on fuzzy logic
The idea of applying a “fuzzy logic approximator” to build a bottom topographic map
or a feature map is based on the following analogy: the set of separate depth
measurements can be treated as a knowledge-based system containing information about
the features and structure of the water body. An echo sounding is described in
formal logic as:
IF coordinates X, Y AND time t THEN depth D,
water temperature T and other parameters.
In this work, the universal adaptive approximation is a specific fuzzy logic
implementation built on the mathematical tools developed in [14].
If $L(p, p_i)$ is the operator measuring the distance between two points, then, by
analogy with a production system, knowledge about a water body fragment propagates
to neighboring fragments according to a unimodal function that has its maximum at
the specified point. We propose the following membership function of the distance
from the nodal approximation point $p_i$ to the point $p$:
$$\mu(p, p_i) = \frac{1}{L(p, p_i)^{n}} \;\Big|\; L(p, p_i) \le L_{limit}, \quad (2)$$
When defuzzification is performed by the COG (Center Of Gravity) method, the desired
feature value at an unknown point $p$ is calculated as:
$$p.z = \frac{\sum_i \mu(p, p_i)\, p_i.z}{\sum_i \mu(p, p_i)}, \quad (3)$$
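A minimal sketch of the center of gravity output in (3) is given below, assuming that the membership function in (2) is the inverse n-th power of the distance, cut off at the limit distance, and that the distance operator is Euclidean. The Euclidean distance, the default n = 2 and the function names are assumptions of the example.

```python
import math


def membership(dist, n=2, l_limit=math.inf):
    """Inverse-power membership, zero beyond the limit distance."""
    if dist > l_limit:
        return 0.0
    return 1.0 / dist ** n


def approximate(p, nodes, n=2, l_limit=math.inf):
    """Center of gravity estimate of z at p=(x, y) from nodes [(x, y, z), ...].

    Returns None when no node lies within l_limit.
    """
    num = den = 0.0
    for x, y, z in nodes:
        d = math.hypot(p[0] - x, p[1] - y)
        if d == 0.0:
            return z  # p coincides with a node, so its value is used directly
        mu = membership(d, n, l_limit)
        num += mu * z
        den += mu
    return num / den if den > 0 else None


nodes = [(0.0, 0.0, 2.0), (2.0, 0.0, 4.0)]  # two soundings: depths 2 m and 4 m
midpoint_depth = approximate((1.0, 0.0), nodes)
```

At the midpoint both nodes have equal membership, so the estimate is the mean of the two depths; closer to one node the estimate is pulled toward that node's depth.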
In addition, we present modifications of the defuzzification methods that eliminate
the disproportionate influence of nearby points by means of space discretization:
the group of rules falling into a single discretization interval is replaced by the
one rule with the maximum membership function value at point p.
Fuzzy logic approximation, in contrast to traditional approximation methods, can
take several predicates into account and build complex conditions.
For example, we can formulate an approximation condition that uses not only the
depth but also information about the structure of the lake bottom, in order to
correct abrupt depth changes caused by bottom objects (artifacts):
IF coordinates X, Y AND bottom structure without artifacts, THEN depth D, water
temperature T, and other parameters.
An interesting way to eliminate the influence of uneven node locations is angular
discretization.
We introduce the operator $angle(p, p_i)$, which returns the number of the circle
sector into which the angle between the unknown point $p$ and the approximation node
$p_i$ falls. The nearest point for each sector is defined as follows:
$$PA_{angle(p, p_i)} = \{\, p_i \mid \forall p_j \in P,\ p_j \ne p_i,\ angle(p, p_j) = angle(p, p_i):\ \mu(p, p_i) \ge \mu(p, p_j),\ L(p, p_i) \le L_{limit} \,\}, \quad (4)$$
The value of the unknown parameter, for example the depth $z$, for an 8-sector split
is:
$$p.z = \frac{\sum_{k=1}^{8} \mu(p, PA_k)\, PA_k.z}{\sum_{k=1}^{8} \mu(p, PA_k)}. \quad (5)$$
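The angular discretization can be sketched as follows: within each of the 8 sectors around the query point only the node with the largest membership (that is, the nearest one) is kept, and the center of gravity is taken over these representatives. The Euclidean distance, the inverse-power membership with n = 2 and the sector numbering that starts at the angle −π are assumptions of the example.

```python
import math


def sector(p, node, sectors=8):
    """Index of the circle sector (0..sectors-1) of the direction p -> node."""
    ang = math.atan2(node[1] - p[1], node[0] - p[0])        # in [-pi, pi]
    k = int((ang + math.pi) / (2.0 * math.pi) * sectors)
    return min(k, sectors - 1)                               # ang == pi goes last


def approximate_by_sectors(p, nodes, n=2, l_limit=math.inf, sectors=8):
    """Center of gravity over the nearest node of each sector around p."""
    nearest = {}  # sector index -> (distance, z) of its nearest node
    for x, y, z in nodes:
        d = math.hypot(p[0] - x, p[1] - y)
        if d == 0.0:
            return z
        if d > l_limit:
            continue
        k = sector(p, (x, y), sectors)
        if k not in nearest or d < nearest[k][0]:
            nearest[k] = (d, z)
    if not nearest:
        return None
    num = sum(z / d ** n for d, z in nearest.values())
    den = sum(1.0 / d ** n for d, _ in nearest.values())
    return num / den


# Three soundings clustered to the east (depth 10) and a single one to the west (depth 20).
nodes = [(1.0, 0.1, 10.0), (1.1, 0.1, 10.0), (1.2, 0.1, 10.0), (-1.0, 0.1, 20.0)]
depth = approximate_by_sectors((0.0, 0.0), nodes)
```

With a plain center of gravity the eastern cluster would dominate the estimate; with one representative per sector the two sides contribute equally because their nearest nodes are equally distant.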
If the angular discretization interval tends to 0, the output of the center of
gravity method takes the form:
$$p.z = \frac{\int_{-\pi}^{\pi} \varphi(p, \alpha)\, z(\alpha)\, d\alpha}{\int_{-\pi}^{\pi} \varphi(p, \alpha)\, d\alpha}, \quad (6)$$
where $\varphi(p, \alpha)$ is the piecewise-linear interpolation in polar
coordinates of the values of the maximal membership functions of the nearest points
in the direction $\alpha$, and $z(\alpha)$ is the piecewise-linear interpolation in
polar coordinates of the depth values of the points with the maximum membership
function value.
The key difference between fuzzy logic approximation and traditional approximation
methods is the possibility of taking several predicates into account. For example,
we formulate an approximation condition that includes both the depth information and
the bottom structure information in order to eliminate the effect of depth jumps
caused by bottom artifacts.
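In this case the predicate values need to be normalized, and we propose the
following modification of the membership function:
$$\mu(p, p_i) = 1 - \left(\frac{L(p, p_i)}{L_{limit}}\right)^{n} \;\Big|\; L(p, p_i) \le L_{limit}. \quad (7)$$
Defining the depth irregularity at point $p$ as $R(p)$, we normalize this value and
estimate the “bottom without artifacts” model as:
$$\mu_R(p) = 1 - \left(\frac{R(p) - R_{min}}{R_{max} - R_{min}}\right)^{m}. \quad (8)$$
Then:
$$p.z = \frac{\int \max \min\left(\mu(p, p_i),\ \mu_R(p)\right) z\, dz}{\int \max \min\left(\mu(p, p_i),\ \mu_R(p)\right) dz}. \quad (9)$$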
As a result, we obtain a classical max-min logical inference for the unknown
parameter value and a bottom approximation with low artifact influence.
4 Experimental Results
For YOLO v2 we built our own training set of about 80 000 objects. We marked ground
truth bounding boxes around RoIs manually using the VOTT (Visual Object Tagging
Tool) software [15]. VOTT produces ground truth coordinates and converts them into
the YOLO format. Using this program, we additionally created annotation files. We
predicted five classes of objects: “fish”, “grass”, “school of fish”, “predator”,
and “bottom fish”. Fig. 2 depicts ground truth boxes in VOTT.
Fig. 2. Ground truth bounding boxes in VOTT: a, b, c) “fish”; d, e) “predator”;
f, g) “school of fish”; h, i) “grass”; k, l, m, n) “bottom fish”
All images were taken on the Western Dvina river and on lakes in the Republic of
Belarus, with a maximum depth of 12 meters for the river and 38 meters for the
lakes. A double-beam (200 kHz and 450 kHz) Lowrance HOOK 4 echo sounder was used.
Fig. 3 depicts the resulting classification after YOLO v2 processing.
Fig. 3. Fish detection and classification results
The presented algorithm has an accuracy of 72.1% and a low percentage of false
positive results when fish are present. However, as shown in Fig. 4, our approach
cannot properly distinguish the classes “grass” and “school of fish”, especially
when their shapes are similar.
Fig. 4. Example for incorrect classification
The accuracy of the approximation increases with the number and density of
approximation nodes. It is therefore important that the sonar passes over all the
most complex and characteristic sections of the bottom.
A special type of track is the lake contour, imported in KMZ format from well-known
GIS tools such as Google Earth. The contour is a track with zero-depth points
(Fig. 5). Similar contours are used to model the contours of islands. Incorrect
points of the sonar tracks can be deleted, as can some points of the contours in
order to model the open contour of a river bed. The same approach is used to
construct the approximation for other features (Fig. 6).
Fig. 5. Example for map building taking into account the contour of the lake
Fig. 6. Example for map of fish concentration
5 Conclusion
A method is presented for obtaining topographic maps of lakes, maps of fish
concentration and a map of predator locations from the results of intelligent sonar
data processing. The presented algorithm detects the classes “fish”, “grass”,
“school of fish”, “predator”, and “bottom fish” in sonar images. The algorithm
includes the following steps: separation of the input frame into overlapping blocks,
processing of the blocks with the CNN YOLO v2, merging of the extracted bounding
boxes around one object, and fish concentration map building. To construct maps of
the distribution of features over the lake, we propose a novel method for
approximating the GPS-referenced CNN results based on an original implementation of
fuzzy logic. Our method has an accuracy of 72.1% and a low percentage of false
positive results when fish are present. To increase the accuracy, the dataset for
CNN training needs to be expanded significantly.
References
1. Balk H., Lindem T. Improved fish detection probability in data from split-beam sonar.
Aquatic Living Resources. 13(5): 297–303(2000) doi: 10.1016/S0990-7440(00)01079-2
2. Helge B., Torfinn L.: Improved fish detection probability in data form split-beam sonar:
https://slides.tips/improved-fish-detection-probability-in-data-form-split-beam-sonar.html
3. Kim J., Yu, SC.: Convolutional neural network-based real-time rov detection using
forward-looking sonar image. Autonomous Underwater Vehicles (AUV), IEEE/OES. pp.
396–400. (2016) doi: 10.1109/AUV.2016.7778702
4. Krivoruchko, K.: Spatial Statistical Data Analysis for GIS Users. Redlands, Esri Press,
(2011)
5. Demant, C., Garnica, C., Streicher-Abel, B. : Industrial Image Processing: Visual Quality
Control in Manufacturing. Heidelberg, Springer (2013)
6. Shiping, Y., Zhican, B., Huafeng, C., Bohush, R. and Ablameyko, S.: An effective algo-
rithm to detect both smoke and flame using color and wavelet analysis. Pattern Recogni-
tion and Image Analysis. 27(1):131-138 (2017) doi: 10.1134/S1054661817010138
7. Krizhevsky, A., Sutskever, I. and Hinton, G. E.: ImageNet classification with deep convo-
lutional neural networks. Proceedings of the 25th International Conference on Neural In-
formation Processing Systems (NIPS'12), vol. 1, pp. 1097-1105 (2012)
8. Ren, Sh., He, K., Girshick, R., Sun, J.: Faster R-CNN: Towards Real-Time Object De-
tection with Region Proposal Networks. IEEE Transactions on Pattern Analysis and Ma-
chine Intelligence. 39(6): 1137 - 1149 (2017) doi: 10.1109/TPAMI.2016.2577031
9. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna,Z.: Rethinking the inception ar-
chitecture for computer vision. Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 27-30 June 2016, pp. 2818–2826 (2016) doi:10.1109/CVPR.2016.308
10. He, K., Zhang, X., Ren, Sh., Sun, J.: Deep Residual Learning for Image Recognition. Pro-
ceedings of IEEE Conference on Computer Vision and Pattern Recognition, 27-30 June
2016, pp. 770–778 (2016) doi: 10.1109/CVPR.2016.90
11. Redmon, J., Divvala, S. K., Girshick, R. B., Farhadi, A.:You Only Look Once, Unified,
Real-Time Object Detection. Proceedings of IEEE Conference on Computer Vision and
Pattern Recognition, 27-30 June 2016, pp. 779–788 (2016) doi: 10.1109/CVPR.2016.91
12. Redmon, J., Farhadi, A.:YOLO9000: Better, Faster, Stronger. Proceedings of IEEE Con-
ference on Computer Vision and Pattern Recognition, 21-26 July 2017, pp. 6517–6525
(2017) doi:10.1109/CVPR.2017.690
13. Ioffe, S., Szegedy, Ch.: Batch normalization: accelerating deep network training by reduc-
ing internal covariate shift. Proceedings of the 32nd International Conference on Machine
Learning Microtome Publishing, 6 -11 July 2015., pp. 448–456 (2015)
14. Glukhov, D.:Dynamic expert system by fuzzy inference rules to automations an examina-
tion of complex objects. Budownictwo i Inzynieria, Srodowiska, pp. 105–109 (1998)
15. Visual Object Tagging Tool: An electron app for building end to end Object Detection
Models from Images and Videos: https://github.com/Microsoft/VoTT