    Optical-Electronic System of Automatic Detection and
    High-Precision Tracking of Aerial Objects in Real-Time
            Igor Shostko [0000-0002-5612-3080], Andriy Tevyashev [0000-0001-5261-9874],
              Yuliia Kulia [0000-0001-6541-7913], Anton Koliadin [0000-0001-5552-5080]

     Kharkiv National University of Radio Electronics, Kharkiv, 14 Nauky Ave., UKRAINE
                  ihor.shostko@nure.ua, tad45ua@gmail.com,
               yuliia.kulia@nure.ua, anton.koliadin@nure.ua



       Abstract. The article presents the results of developing a digital video
       processing technology in the visible and infrared frequency bands for the
       automatic detection and precision tracking of aerial objects in real time. An
       algorithm and software for the automatic detection and precision tracking of
       aerial objects in real time have been developed. The algorithm was tested and
       its performance evaluated by measuring the time spent processing each frame
       in a sequence. Testing showed that when the algorithm is executed on a
       Field-Programmable Gate Array (FPGA), the frame processing time does not
       depend on the object configuration, frame filling, or background
       characteristics. At a frame size of 1920x1080, execution of the algorithm on
       the FPGA is more than 20 times faster than execution on a personal
       computer (PC).



       Keywords: optical-electronic system, digital video processing, detection and
       tracking of moving objects.



1     Introduction

Optical-electronic systems (OES) with automatic detection and tracking of moving
objects are used to solve various problems in:
    – machine vision;
    – video surveillance;
    – weapons control.
    The development of high-tech equipment using OES places ever-increasing
demands on the detection and tracking of observed objects and on the delivery of
measurement information about their motion parameters. At present, the applications
most demanding with respect to video stream processing delay are air defense and
weapons control systems. Therefore, there is a need to develop OES capable of
automatic detection and precision real-time tracking of aerial objects.




  Copyright © 2020 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).
2     Relevance of research

The automatic optical-electronic system solves the following tasks in the process of
detecting and tracking a target:
    – improving image quality (reducing visible noise in the image, improving
image contrast, improving sharpness);
    – adapting the camera to poor working conditions (low light, fog and other
atmospheric influences);
    – detecting all moving objects in the video stream;
    – searching for a moving object based on a priori information about its type,
shape, speed and nature of movement;
    – measuring the geometric parameters of the objects of observation;
    – automatically identifying the object type by comparison with a standard in
shape, color, surface texture and speed of movement;
    – forming special points of the tracked object: the most informative or
vulnerable points, the center of mass, etc.;
    – accurately tracking the object by the points selected on its surface;
    – predicting the trajectories of the tracked objects;
    – re-capturing the object in case its tracking fails.
    Despite the progress achieved in solving these problems separately [1 - 9], there is
currently no comprehensive solution to the problem of automatic detection in the
optical range of moving objects and their high-precision real-time tracking. Therefore,
the task of developing a technology for detecting and tracking moving objects in the
air environment in the visible and infrared frequencies in real-time is urgent.
    Object of study - the process of digital processing of video stream in real-time for
detection and high-precision tracking of moving objects in the air environment.
    The purpose of the work is the development of digital video processing
technology in the visible and infrared frequency bands for automatic detection and
precision tracking of aerial objects in real-time.
    Research Methods - Mathematical, computer-aided modeling and experimental
study of digital video processing for real-time high-precision detection of aerial
objects.
    Detection and precision tracking of aerial objects consists of an ordered sequence
of procedures for obtaining, processing and analyzing video information, and for
making and implementing decisions for precision tracking of an object under high a
priori uncertainty about the behavior of the surveillance object and the environment.
The technology is implemented as a hierarchically ordered mathematical, algorithmic
and software complex for digital processing of the video streams coming from the
television and thermal imaging channels. The software package provides the
following functions: image acquisition; image pre-filtering; detection of moving
objects in the video stream; optical flow calculation; formation of object support
points; high-precision tracking of moving objects in the air environment; and search
for missing objects and recapture of lost tracked objects.
    Video streams from CCTV cameras are processed in real time using a Xilinx
Zynq UltraScale+ MPSoC ZCU104 board and a personal computer.
3    Literature review

The development of high-tech weapons and military equipment is increasingly
demanding the detection, tracking and delivery of measurement information on target
movement parameters. Currently there are both high-precision stationary measuring
complexes and fully automated optical-electronic and television measuring complexes
(USA - Raptor system manufactured by IEC Infrared Systems; CZECH REPUBLIC -
Sirius Sea system; NORWAY - Multi-Sensor DC system; FRANCE - Thales and its
Margot 8000 system, SAFRAN and its TEOS; UK - Instro's Advanced Stabilized
Sensor Platform (ASSP) system, SelexEs Nerio-URL system; RUSSIA - Frigate
Station; AUSTRALIA - ETS Imaging's ATS 3000 system; TURKEY - Seaeye-CUPRA
station). Today, with its pivoting platform Mantis, Instro is one of the world leaders.
High-precision measurement in these systems is ensured by state-of-the-art
technology and a modern component base, combined with the high dynamic
performance of electromechanical tracking systems and computer processing of the
measurement results.
   An analytical review of optical-electronic systems leads to the conclusion that
many systems exist for object detection and tracking, but they lack the hardware for
trajectory measurements of object motion in real time and in automatic mode.
Therefore, it is currently urgent to develop a new automated optical-electronic
measuring system and special algorithms for its operation, aimed at working in
automatic mode and in real time.


4     Materials and methods

OES is a complex of software and hardware modules:
    – optoelectronic module;
    – support-rotary device;
    – digital image processing module;
    – software for real-time detection and precision tracking of aerial objects.
    Optoelectronic module includes digital cameras of the tracking system in the
visible wavelength range and in the infrared range. Each digital camera is an optical
module with a lens, a light-sensitive sensor, an exposure and aperture control unit, as
well as a control interface.
    The OES can operate with preliminary target designation received from external
means or autonomously. After target designation, the target is automatically captured
for auto-tracking. The target is tracked in the visible or infrared wavelength range,
providing round-the-clock use of the OES. The visual tracking record is stored on
digital media.
    A structural diagram of the OES is shown in Fig. 1. The block diagram of the
algorithm of functioning of the OES is shown in Fig. 2.
Fig. 1. Block diagram of the OES


Fig. 2. Flowchart of the OES functioning algorithm
5     Experiment
5.1    Development of prototype OES
The prototype OES for detecting and tracking moving objects in the air environment
was developed using the licensed SolidWorks package, a computer-aided design
system that automates the design and technological preparation stages of production
in the Microsoft Windows environment.
    SolidWorks's three-dimensional solid and surface parametric design was used in
the development of the OES support-rotary device. This made it possible to create
volumetric structural parts and assemblies as three-dimensional electronic models,
from which two-dimensional drawings and specifications were produced in
accordance with the design documentation requirements.
    Three-dimensional modeling of the support-rotary device gave a number of
significant advantages over traditional two-dimensional design: elimination of
product assembly errors at the design stage, and creation, from the electronic model
of a part, of the control program for machining the fork of the support-rotary device
on a numerically controlled machine. Three-dimensional parts of the support-rotary
device were obtained by combining three-dimensional primitives. The consistent
extension of 3D objects eventually yielded a support-rotary device that meets all
technical and technological requirements of the OES. Strength studies of the
support-rotary device design were carried out with the SOLIDWORKS Simulation
package, which accelerated the design of the support-rotary device with guaranteed
properties.
    On the basis of the developed design documentation, a prototype OES for
detecting and tracking moving objects in the air environment was made at the test
plant of Kharkiv National University of Radio Electronics (Fig. 3).
    The OES prototype consists of: a support-rotary device; an optical-electronic
module comprising a television camera, a thermal imager and a laser rangefinder;
and a digital image processing module, based on a PC and a Xilinx Zynq UltraScale+
MPSoC ZCU104 board, designed to detect and track aerial objects.




Fig. 3. Production of the optical-electronic system
5.2    Digital image processing module
Digital image processing module (DIPM) provides:
     – receiving video from one of the cameras, primary image processing, and
real-time detection, capture and tracking of targets based on video stream analysis;
     – overlaying graphical information and tracking parameters on the target
image;
     – calculating the trajectory of the tracked target;
     – forming control commands for the support-rotary device.
     The following basic features and limitations, related to the nature of the tasks
being solved, were taken into account in developing the image analysis methods and
processing algorithms of the DIPM:
     1) a priori information about the characteristics of the observed objects and
background is often missing or includes only approximate object sizes;
     2) image processing and analysis must be performed in real time;
     3) capture and tracking of the target must take place autonomously or with
minimal operator involvement.
     Basic approaches to improving vision in complex surveillance conditions are
based on various methods of linear and nonlinear spatial-temporal filtering and on
algorithms for estimating the parameters of geometric image transformations.
     Among the approaches to detecting and measuring the coordinates of objects
that have proven themselves, four main classes of methods can be distinguished.
     1. Benchmark-based methods can be used to measure the coordinates of moving
and fixed objects observed against a homogeneous or heterogeneous background,
with small signal-to-noise ratios. To use this method, a database of portraits
(spatial-frequency characteristics) of typical targets must first be developed.
     2. Statistical segmentation methods are intended to isolate moving and fixed
objects observed against a relatively homogeneous background. They are based on
a priori information about the difference between the statistical properties of the
object and the background. When a thermal imaging camera is used, if the
temperature of the target or its fragments, such as an engine, significantly exceeds the
temperature of the surrounding background elements, then adjusting the contrast and
brightness of the image can suppress the background. This method is chosen as the
main one for night operation, when tracking an object with a thermal imaging
camera.
     3. Methods for object selection using spatial filtering. These methods are based
on linear and non-linear spatial filtering of the object in the image by color, by design
features, or by the presence of graphic or text characters applied to the body. This
class of methods is most effective at highlighting moving and fixed objects against a
clear or cloudy sky when the camera is moving.
     4. Methods based on detecting dynamic changes are focused on the problem of
selecting moving objects observed against a homogeneous or heterogeneous
background. Such algorithms work by detecting changes that occur over time in the
observed sequence of images. In the case of a moving camera, the image shift caused
by camera motion is first compensated. This method is effective when the tracked
object moves against the sky or a static background surface. Therefore, it is chosen as
the main one when working in the daytime on moving targets.
    Among the promising concepts for the development of information technologies
for image processing and analysis in optical-electronic systems, the following can be
distinguished:
    1) using the combined information received from different image registration
channels (television, thermal imaging, radar); this increases the reliability of
detecting, isolating and estimating the parameters of objects when working at long
distances, in conditions of low visibility and against various masking means;
    2) the use of structural methods for detecting and evaluating the parameters of
objects based on the detection and analysis of visual primitives (nodal points,
segments, arcs) corresponding to the observed object; this class of methods is most
effective for detecting and measuring the coordinates of objects observed under
conditions of image deformation;
    3) analysis of the background on which the objects are observed, and automatic
selection of the algorithm for detecting and measuring the coordinates of the objects,
which is most effective in the current observation conditions; this makes it possible to
increase the degree of autonomy of the optical-electronic system, which eliminates in
many cases the need for human operator intervention.
    A significant feature of the presented concept is a stage of analysis of the target
environment, which results in the decision on which algorithm to use for tracking
and measuring the coordinates of objects.
    At the heart of the DIPM are a processor and an FPGA. The processor manages
the system and the FPGA performs the calculations. These devices are combined
because modern processors alone lack the required performance. System performance
is determined not only by the speed of compressing a video frame, but also by the
speed of pre- and post-processing (scaling, scan conversion, filtering, color
conversion, etc.). These procedures take longer than the compression itself.
    The project uses the Xilinx Zynq UltraScale+ MPSoC ZCU104 board. The board
provides camera image capture and hardware processing, which minimizes delays
and increases the speed of the DIPM. The Xilinx Zynq UltraScale+ MPSoC ZCU104
combines an FPGA with a quad-core 64-bit ARM Cortex-A53 processor (up to
1.5 GHz) with hardware virtualization, AMP and TrustZone technology. The
co-processor of this system is a dual-core Cortex-R5 real-time processor (up to
600 MHz). For hardware acceleration of video processing, the system has an ARM
Mali-400 graphics accelerator (clock speed up to 667 MHz), an H.264/H.265 video
codec, and support for DisplayPort, MIPI and HDMI. The board has high-speed
USB 3.0, SATA 3.1, Gigabit Ethernet and SD/SDIO peripherals. Power management
is handled by a separate PMU subsystem, which controls and manages power
throughout the system. The reVISION platform for the ZCU104 supports software
development in the SDSoC programming environment with OpenCV libraries,
allowing effective real-time streaming video algorithms to be developed.
    The effectiveness of combining the processor and the FPGA on the same
platform can be demonstrated by the problem of image stabilization. To align two
frames, a parameter called the sum of absolute differences (SAD) is calculated. If the
core of the system contains only the processor, the SAD calculation can take up to
65% of the processor time. Considering that this is not the only task the processor
performs, it turns out to be fully loaded. Offloading to the FPGA relieves the
processor: the FPGA computes the SAD about 10 times faster, freeing computing
capacity for other tasks. However, when tasks are transferred from the processor to
the FPGA, the two must exchange data. Therefore, the choice between the processor
and the FPGA follows this principle: if the calculations require hundreds of millions
of accumulation operations per second, they are carried out on the processor; if more
than a billion per second, on the FPGA. Tasks such as median filtering or selecting
the features of a tracked object are best performed on the FPGA.
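    To make the metric concrete, a minimal C++ sketch of the SAD computation for
two 8-bit grayscale frames is given below. The function name and loop structure are
ours, for illustration only; in the OES this computation is offloaded to the FPGA
fabric.

#include <opencv2/core.hpp>
#include <cstdint>
#include <cstdlib>

// Sum of absolute differences (SAD) between two equally sized
// 8-bit grayscale frames; the operation offloaded to the FPGA.
uint64_t sumAbsDiff(const cv::Mat& a, const cv::Mat& b) {
    CV_Assert(a.size() == b.size() &&
              a.type() == CV_8UC1 && b.type() == CV_8UC1);
    uint64_t acc = 0;
    for (int y = 0; y < a.rows; ++y) {
        const uint8_t* pa = a.ptr<uint8_t>(y);
        const uint8_t* pb = b.ptr<uint8_t>(y);
        for (int x = 0; x < a.cols; ++x)
            acc += static_cast<uint64_t>(std::abs(int(pa[x]) - int(pb[x])));
    }
    return acc;
}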


5.3    Description of algorithms for identifying, capturing and tracking targets

The algorithm is based on modern methods of finding and tracking moving objects
in a video stream obtained from a video camera or an infrared camera.
    The algorithm can operate in fully automatic target capture and tracking mode; in
semi-automatic mode, where the target is specified by an external targeting system or
by the operator and tracking is automatic; and in fully manual mode, where the
operator directly controls the actuators with the joystick.
    The algorithm consists of individual blocks.
    The main blocks of the algorithm:
    1) image acquisition block;
    2) image pre-filtering block;
    3) target capture block;
    4) missing object search block;
    5) block for locating support points on the target;
    6) target tracking block;
    7) block for filtering the tracked points;
    8) trajectory prediction block;
    9) tracking error analysis block.
    A block diagram of the algorithms for identifying, capturing and tracking targets
is shown in Fig. 4.
Fig. 4. Flowchart of the target detection, capture and tracking algorithms
     1. Start of the algorithm. Image acquisition block.
     The algorithm starts by choosing the source of the video stream: an infrared
camera, a visible-range camera, or both cameras together.
     The image acquisition block manages the cameras and converts the image to an
easy-to-process format. The camera controls depend on the particular camera model.
The main adjustable parameters are optical zoom, focus, frame exposure time, matrix
sensitivity (ISO), resolution and frame rate. Some of the settings can be adjusted
automatically by the camera without external commands.
     An important parameter for the operation of the algorithm is the frame
resolution: both the speed of the algorithm and the image quality depend on it. Most
pre-processing, object search and frame analysis operations have complexity directly
proportional to the number of pixels in the frame. According to preliminary tests of
the software prototype, image processing ran at 150-300 frames per second at
640x480 resolution and at 40-100 frames per second at 1280x720 resolution.
     Increasing the resolution allows more details to be distinguished on the target,
which provides additional information for algorithms that analyze the shape and
texture of the object. Using a color camera adds further information about objects and
allows algorithms that are sensitive to the target's color histogram. But due to
physical limitations, increasing the resolution of the matrix or using a color camera
reduces the physical size of the pixels on the matrix, which increases noise and
reduces image quality (especially in low light). Therefore, it is advisable to use the
main camera in monochrome, with a large matrix size and a small resolution
(640x480), at a frame rate of 30-60 per second.
     2. Image pre-filtering block.
     Standard image filtering operations such as linear pixel averaging, median
filtering, Gaussian blur, mathematical morphology, anisotropic pixel diffusion, the
Wiener spatial filter, and others are used during image pre-processing. Filters are
selected to match the capture and tracking methods described below.
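     As an illustration, a minimal C++/OpenCV sketch of one possible pre-filtering
chain is given below; the particular combination and parameter values are
assumptions for illustration, not the filter chain of the OES.

#include <opencv2/imgproc.hpp>

// Hypothetical pre-filtering stage combining two of the filters named
// above; the actual chain is chosen per capture and tracking method.
cv::Mat preFilter(const cv::Mat& gray) {
    cv::Mat smoothed, denoised;
    cv::GaussianBlur(gray, smoothed, cv::Size(5, 5), 1.5); // Gaussian blur
    cv::medianBlur(smoothed, denoised, 3);                 // median filtering
    return denoised;
}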
     3. Target capture block.
     A. Manual capture (semi-automatic mode). The target is set manually by the
operator. In this case, the auto-capture detector is applied to the operator-selected
region of interest (ROI).
     B. Fully automatic mode. The choice of capture method in automatic mode
depends on the time of day, the weather conditions, and the parameters of the object
being tracked.
     Moving object detector (detection of dynamic changes). For the primary capture
of moving targets, a cascade motion detector consisting of sequential image
processing operations is used. It is characterized by high speed, versatility with
respect to the appearance of objects, and the ability to add or remove operations from
the image processing sequence during operation. At this stage, it is preferable that the
platform does not move, so as not to interfere with capture; if this is not possible, the
image shift is first compensated by maximizing the correlation between two frames
using the fast Fourier transform and phase correlation.
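     A minimal C++/OpenCV sketch of this shift estimation is given below; the
function name is ours, and the use of cv::phaseCorrelate is one standard way to
realize the FFT-based correlation described above.

#include <opencv2/imgproc.hpp>

// Estimate the global camera shift between two frames via FFT-based
// phase correlation; the shift is then used to compensate platform
// motion before frame differencing.
cv::Point2d cameraShift(const cv::Mat& prevGray, const cv::Mat& currGray) {
    cv::Mat p, c;
    prevGray.convertTo(p, CV_32F);  // phaseCorrelate requires float input
    currGray.convertTo(c, CV_32F);
    return cv::phaseCorrelate(p, c);
}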
     Detection of moving objects is performed as a result of the following operations
(a sketch of this cascade is given after the list):
     1) calculation of the inter-frame difference;
     2) the morphological operation of erosion;
     3) the morphological operation of dilation;
     4) the morphological operation of selecting object boundaries;
     5) iterative traversal of connected regions.
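     Below is a minimal C++/OpenCV sketch of this cascade; the threshold value and
kernel size are illustrative assumptions, not the OES settings.

#include <opencv2/imgproc.hpp>
#include <vector>

// Cascade motion detector: inter-frame difference, erosion, dilation,
// then boundary extraction over connected regions via contours.
std::vector<std::vector<cv::Point>>
detectMoving(const cv::Mat& prevGray, const cv::Mat& currGray) {
    cv::Mat diff, mask;
    cv::absdiff(currGray, prevGray, diff);               // 1) inter-frame difference
    cv::threshold(diff, mask, 25, 255, cv::THRESH_BINARY);
    cv::Mat k = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(3, 3));
    cv::erode(mask, mask, k);                            // 2) erosion
    cv::dilate(mask, mask, k);                           // 3) dilation
    std::vector<std::vector<cv::Point>> contours;        // 4), 5) boundaries of
    cv::findContours(mask, contours, cv::RETR_EXTERNAL,  // connected regions
                     cv::CHAIN_APPROX_SIMPLE);
    return contours;
}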
     Contrast object detector. To capture still objects, statistical object segmentation
based on the contrast between object and background is used instead of a motion
detector, or an object search is performed for certain features, such as color and shape
(spatial filtering). That is, instead of comparing the inter-frame difference with a
threshold, the color values or pixel intensities are compared with given thresholds.
The subsequent morphology operations and iterative traversal of the regions remain
the same.
     Color object detector (when a color camera is used). The structure of the color
object detector is:
     1) highlighting areas of a particular color;
     2) the morphological operation of erosion;
     3) the morphological operation of dilation;
     4) the morphological operation of selecting object boundaries;
     5) iterative traversal of connected regions.
     Areas of a particular color are highlighted by converting the image to the HSV
color space, in which each pixel is described by hue, saturation and value (intensity).
Thus, we select areas whose hue lies within a specified range and whose intensity is
above a threshold.
     Instead of selecting objects of a particular color, all objects that differ from the
color of a homogeneous background can be selected.
     Items 2 through 5 remain the same as in the contrast object detector. A sketch of
the color selection step is given below.
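     A minimal C++/OpenCV sketch of the color selection step; the hue, saturation
and value bounds are illustrative assumptions.

#include <opencv2/imgproc.hpp>

// Select areas whose hue lies in a given range with sufficient
// saturation and value, per the HSV scheme described above.
cv::Mat colorMask(const cv::Mat& bgr) {
    cv::Mat hsv, mask;
    cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV);
    cv::inRange(hsv,
                cv::Scalar(100, 80, 60),    // lower H, S, V bounds
                cv::Scalar(130, 255, 255),  // upper H, S, V bounds
                mask);
    return mask;  // morphology and region traversal follow, as in items 2-5
}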
     Detector of objects with a given contour shape. To search for an object by shape,
the shape template is first vectorized by calculating its statistical moments.
     As pre-processing of the input image, a Gaussian filter is used to reduce the
gradient at the boundaries.
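     One common way to realize such moment-based matching is the Hu-moment
distance, as in the C++/OpenCV sketch below; the function name and threshold are
ours, for illustration.

#include <opencv2/imgproc.hpp>
#include <vector>

// Compare a candidate contour against a vectorized shape template
// using the Hu-moment distance.
bool matchesTemplate(const std::vector<cv::Point>& candidate,
                     const std::vector<cv::Point>& shapeTemplate,
                     double maxDist = 0.1) {
    double d = cv::matchShapes(candidate, shapeTemplate,
                               cv::CONTOURS_MATCH_I1, 0.0);
    return d < maxDist;  // smaller distance means more similar shapes
}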
     Object detector based on a template base. To capture objects of a certain type,
the method of analyzing histograms of oriented gradients (HOG) is used, together
with the Viola-Jones cascade classifier, which uses Haar features as descriptors of
characteristic points. These methods require their own database of objects of all
types of interest that are to be sought in the image. For classification, the object
should be larger than 10x10 pixels; otherwise the probability of errors is high, so for
long-range targets it is not always possible to determine the object type or to search
for objects of a particular type.
     Capturing objects at long range. Only the contrast object search method, the
color object search method and, partially, the moving object search method can be
used to detect long-range aerial targets that appear as a point or a few pixels. The
object should have high contrast with the background, and the background should be
highly uniform.
     Additional features can be used to extend the range of capture conditions for a
1x1 pixel target. One such feature is the contrail (inversion trace), which has much
larger linear dimensions; its shape and direction unmistakably point to a moving
object and make it possible to determine its type. The contrail is searched for by
finding objects of a given linear shape.
     4. Missing object search block.
     If a tracked object is lost, it is searched for in the predicted areas, using known
information about the object's previous appearance and image comparison
algorithms based on special point descriptors (SIFT, SURF, ORB), boundary shape
(ShapeMatch), or cross-correlation, depending on which characteristic features the
object had before it was lost.
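     As an illustration of the descriptor-based option, a minimal C++/OpenCV sketch
of re-identifying a lost object by matching ORB descriptors is given below; the
function name and match-count criterion are ours.

#include <opencv2/features2d.hpp>
#include <vector>

// Re-identify a lost object inside a predicted search region by
// matching ORB descriptors of its last known appearance.
int countOrbMatches(const cv::Mat& lastAppearance,
                    const cv::Mat& searchRegion) {
    cv::Ptr<cv::ORB> orb = cv::ORB::create();
    std::vector<cv::KeyPoint> k1, k2;
    cv::Mat d1, d2;
    orb->detectAndCompute(lastAppearance, cv::noArray(), k1, d1);
    orb->detectAndCompute(searchRegion, cv::noArray(), k2, d2);
    if (d1.empty() || d2.empty()) return 0;
    cv::BFMatcher matcher(cv::NORM_HAMMING, /*crossCheck=*/true);
    std::vector<cv::DMatch> matches;
    matcher.match(d1, d2, matches);
    return static_cast<int>(matches.size());  // many matches: object found
}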
     5. Object tracking (block for detecting and tracking characteristic points on the
target).
     The tracking block is designed to identify and track feature points in a series of
video frames. This block is implemented on the Xilinx Zynq UltraScale+ MPSoC
ZCU104. A Harris corner detector is used to detect the characteristic points.
     Corners are areas of the image with a large difference in pixel intensity in all
directions. To find a corner, it is necessary to calculate the difference E(u, v) for
shifts in all directions (1):

          E(u, v) = \sum_{x,y} w(x, y) [I(x + u, y + v) - I(x, y)]^2        (1)

     where I(x, y) is the intensity at a point;
     I(x + u, y + v) is the intensity at the shifted point;
     w(x, y) is the window function.
     The window function is either a rectangular window or a Gaussian window that
weights the differences under displacement.
     To identify the corners, the function E(u, v) must be maximized.
     The result of processing the image with the Harris function is a grayscale image
in which the corners are marked by dots.
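     A minimal C++/OpenCV software sketch of this step is given below; the
parameter values are typical defaults, not the OES settings (in the OES the detector
runs as the hardware function xf::cornerHarris).

#include <opencv2/imgproc.hpp>

// Compute the Harris response corresponding to (1) for each pixel;
// thresholding the response marks the corner points.
cv::Mat harrisCorners(const cv::Mat& gray) {
    cv::Mat response, corners;
    cv::cornerHarris(gray, response, /*blockSize=*/2, /*ksize=*/3, /*k=*/0.04);
    cv::threshold(response, corners, 0.01, 255, cv::THRESH_BINARY);
    return corners;  // grayscale image where corners are marked by dots
}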
     A modified version of Lucas-Kanade optical flow is used to track the target. The
main part of the algorithm takes the current and subsequent frames as input and
outputs a list of characteristic points to be tracked. The current image is the first
frame of the set in which the algorithm detects and tracks feature points. The number
of frames in the set over which the characteristic points are to be tracked is specified
as an input.
     The block for detecting and tracking characteristic points on targets uses five
hardware functions of the xfOpenCV library, which are combined into one new
tracking function:
     xf::cornerHarris,
     xf::cornersImgToList,
     xf::cornerUpdate,
     xf::pyrDown,
     xf::densePyrOpticalFlow.
     The interconnection of the hardware functions is shown in Fig. 5.
Fig. 5. Tracking feature points using sparse optical flow

     The tracking function takes the characteristic points from the Harris corner
detector and the dense optical flow vectors from the dense pyramidal optical flow
function, and outputs updated coordinates of the characteristic points, tracking the
input corners using the dense flow vectors and thus simulating the behavior of sparse
optical flow. This hardware function is clocked at 300 MHz for 10,000 characteristic
points of a 720p image, adding minimal delay to the pipeline.
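     For reference, a software analogue of this pipeline using standard OpenCV
(Harris-based detection followed by sparse pyramidal Lucas-Kanade) is sketched
below; note that the hardware version above instead derives the sparse tracks from a
dense flow field, and the parameter values here are illustrative.

#include <opencv2/imgproc.hpp>
#include <opencv2/video/tracking.hpp>
#include <vector>

// Detect feature points (Harris criterion) and track them into the
// next frame with pyramidal Lucas-Kanade; software analogue of the
// xfOpenCV hardware pipeline.
void trackPoints(const cv::Mat& prevGray, const cv::Mat& currGray,
                 std::vector<cv::Point2f>& pts) {
    if (pts.empty())
        cv::goodFeaturesToTrack(prevGray, pts, /*maxCorners=*/10000,
                                /*qualityLevel=*/0.01, /*minDistance=*/5,
                                cv::noArray(), /*blockSize=*/3,
                                /*useHarrisDetector=*/true, /*k=*/0.04);
    std::vector<cv::Point2f> next;
    std::vector<uchar> status;
    std::vector<float> err;
    cv::calcOpticalFlowPyrLK(prevGray, currGray, pts, next, status, err);
    std::vector<cv::Point2f> kept;  // keep only successfully tracked points
    for (size_t i = 0; i < next.size(); ++i)
        if (status[i]) kept.push_back(next[i]);
    pts.swap(kept);
}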
     6. Block for filtering the tracked points.
     The algorithm uses a screening operation to remove unnecessary and erroneous
points. Unnecessary and erroneous points appear in the image as a result of
background motion. The algorithm eliminates the n extreme points farthest from the
center of mass (the center of the object), and also rejects points whose motion differs
significantly from the movement of most points of the object. The resulting image
can be considered a binary mask in which only the non-zero pixels are of interest.
     To find the bounding frame around the points in the image, we check for each
non-zero pixel whether its position lies outside the current rectangle; if so, we update
the rectangle. After overlaying the detected characteristic points and the bounding
frame on the original image (Fig. 6), we obtain in each frame of the video sequence
a frame that follows the object (Fig. 7). A sketch of this step is given below.
Fig. 6. Overlay of detected feature points and bounding box on the original frame image




Fig. 7. Tracking the aerial target (from the video)
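     A simplified C++/OpenCV sketch of the outlier rejection and bounding-frame
fitting; the rejection rule (distance from the center of mass) and its threshold are
illustrative assumptions.

#include <opencv2/imgproc.hpp>
#include <vector>

// Reject points far from the bulk of the object, then fit the
// bounding frame that follows the object from frame to frame.
cv::Rect boundObject(std::vector<cv::Point2f>& pts, float maxDev = 2.0f) {
    if (pts.empty()) return cv::Rect();
    cv::Point2f c(0.f, 0.f);                    // center of mass
    for (const cv::Point2f& p : pts) c += p;
    c *= 1.0f / static_cast<float>(pts.size());
    float meanDist = 0.f;
    for (const cv::Point2f& p : pts) meanDist += (float)cv::norm(p - c);
    meanDist /= static_cast<float>(pts.size());
    std::vector<cv::Point2f> kept;              // drop extreme outliers
    for (const cv::Point2f& p : pts)
        if (cv::norm(p - c) <= maxDev * meanDist) kept.push_back(p);
    pts.swap(kept);
    if (pts.empty()) return cv::Rect();
    return cv::boundingRect(pts);               // frame around the object
}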

     7. Analysis of objects at intersections with other objects and detection of object
disappearance from view (the block for approximating the trajectory over past
moments and the block for distinguishing intersections of targets with other targets
and with terrain features; analysis of complex maneuvers).
     The obtained image of moving objects is compared with the previous ones; if
any object is lost from the field of view, it is followed along the predicted trajectory,
and in the following frames objects are searched for in the forecast areas. All known
information about the object's previous appearance is used, together with image
comparison algorithms based on special point descriptors (SIFT, SURF, ORB),
boundary shape (ShapeMatch), or cross-correlation, depending on which
characteristic features the object had before it was lost.
     8. Trajectory prediction block.
     To predict the trajectory of the object, a Kalman filter is used, updated at each
frame with the coordinates of the targets that are not marked as lost.
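     A generic C++/OpenCV sketch of such a predictor (a constant-velocity model
over x, y) is given below; the noise settings and time step are illustrative
assumptions, and the actual filter model of the OES is not specified here.

#include <opencv2/video/tracking.hpp>

// Constant-velocity Kalman filter over (x, y, vx, vy), corrected each
// frame with the measured target coordinates when the target is not lost.
cv::KalmanFilter makeTrackerKF(float dt = 1.0f / 30.0f) {
    cv::KalmanFilter kf(4, 2, 0, CV_32F);
    kf.transitionMatrix = (cv::Mat_<float>(4, 4) <<
        1, 0, dt, 0,
        0, 1, 0, dt,
        0, 0, 1,  0,
        0, 0, 0,  1);
    cv::setIdentity(kf.measurementMatrix);                      // measure x, y
    cv::setIdentity(kf.processNoiseCov, cv::Scalar::all(1e-3));
    cv::setIdentity(kf.measurementNoiseCov, cv::Scalar::all(1e-1));
    return kf;  // initialize kf.statePost with the first measurement
}
// Per frame: cv::Mat predicted = kf.predict();
//            if (!lost) kf.correct((cv::Mat_<float>(2, 1) << mx, my));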
     9. Tracking error analysis block.
     The analysis of the reasons why the tracking error exceeds a given threshold is
performed using the analytic hierarchy process and establishes one of the possible
reasons for the exceedance:
     – a change in the nature of the object's movement (change of speed,
maneuvering, hovering, etc.);
     – a change in video surveillance conditions (change of brightness or contrast,
partial or full occlusion of the moving object, etc.).
     After the causes are established, the algorithm returns to one of its previous
stages. This process continues until the tracking error is below the specified
threshold.



6    Results
Testing of the aerial object tracking algorithm and assessment of its performance
were carried out in the laboratory on static and dynamic scenes using a simulated
moving aerial object placed against a background that allows different visibility
(contrast) conditions of the tracked object to be simulated. In the experiment, the
movement of the object was recorded with a video camera, the video stream was
processed in real time on the FPGA, and the processed video image was displayed
on the monitor screen. As a result of the video stream processing, the coordinates of
the object are calculated and the object is tracked and held in the frame.
    The algorithm performance was evaluated by measuring the time spent
processing each frame in the sequence. Testing showed that when the algorithm runs
on the FPGA, the frame processing time does not depend on the object
configuration, frame filling, or background characteristics.
    The results of the performance study of the developed algorithm, comparing the
implementation of the Harris function on a personal computer (PC) (CPU: i7 4500u,
memory: 8 GB DDR3) and on the FPGA, are shown in Table 1.

Table 1. Results of the comparison

        Video frame resolution   Processing time using a PC, ms   Processing time using FPGA, ms
        640x480                             17.5                             0.98
        1280x480                            68.2                             4.3
        1920x1080                          272.3                            13.6
        1920x1920                          613.5                            46.7
    Table 1 shows that the FPGA-based approach delivers much better performance
than the CPU. For example, at a frame size of 1920x1080, execution of the algorithm
on the FPGA is more than 20 times faster than execution on the PC.
    Xilinx Zynq series boards are well suited to video processing. Benefits of the
Zynq platform:
    • bandwidth 12 times higher than that of alternative SoCs currently on the market;
    • up to 6x more images/s/watt compared with embedded GPUs and typical SoCs;
    • support for combinations of video sensors.
    The algorithm discussed in this article can be implemented in software on the
Xilinx Zynq-7000 and ZU+ MPSoC platforms, including the ZCU102 and ZCU104.


7       Conclusions
1. A prototype OES for the detection and tracking of moving targets in the air
environment has been developed. The new automated OES1 and the special
algorithms for its operation are aimed at working in automatic mode and in real
time, so the detection and tracking of characteristic points on the target is performed
on the FPGA. We recommend implementing the target point detection and tracking
unit in software on the Xilinx Zynq-7000 and ZU+ MPSoC platforms, including the
ZCU102 and ZCU104 boards.
    2. A digital video processing technology in the visible and infrared frequency
bands has been developed for the automatic detection and precision tracking of
aerial objects in real time. The main components of the flowchart of the algorithms
for identifying, capturing and tracking targets have been identified. Software
implementation of all digital video processing functions on the FPGA is not
possible, so at this time a compromise solution using a personal computer (CPU: i7
4500u, memory: 8 GB DDR3) and the ZCU104 was proposed.
    3. A study of the performance of the developed algorithm was carried out,
comparing the implementation of the Harris function on a personal computer and on
the FPGA. At a frame size of 1920x1080, execution of the algorithm on the FPGA is
more than 20 times faster than execution on the PC.


References

1. Yane, B.: Cifrovaya obrabotka izobrazhenij (Digital image processing). Tehnosfera. 584 p.
   (2007) ISBN 978-5-94836-122-2


    1 The work was carried out with the assistance of the V.V. Popovsky Department
of Infocommunication Engineering, the Department of Applied Mathematics, and
the test plant of Kharkiv National University of Radio Electronics.
2. Lukyanica, A.A., Shishkin, A.G.: Cifrovaya obrabotka izobrazhenij (Digital image
   processing). Aj-Es-Es Press. 518 p (2009)
3. Murphy, K.P.: Models for Generic Visual Object Detection. Technical report, Department
   of Computer Science. University of British Columbia. Vancouver. Canada, p 8 (2005)
4. Viola, P., Jones, M.: Robust Real-Time Object Detection. Intl. J. Computer Vision,
   Vol. 57(2), pp. 137–154 (2004)
5. Bulychev, Yu.G., Manin, A.P.: Matematicheskie aspekty opredeleniya dvizheniya
   letatelnyh apparatov (Mathematical aspects of aircraft motion determination).
   Mashinostroenie, 256 p. (2000)
6. Bay, H., Ess, A., Tuytelaars, T., Van Gool, L.: SURF: Speeded Up Robust Features.
   Computer Vision and Image Understanding (CVIU). Vol. 110, N 3, p. 346–359 (2008)
7. Titov, I.O., Emelyanov, G.M.: Sistema kompyuternogo zreniya dvizhushegosya
   vozdushnogo obekta (Computer vision system of a moving air object). Kompyuternaya
   optika, Vol. 35, No. 4, pp. 491–494 (2011)
8. Danelljan, M., Khan, F.S., Felsberg, M., van de Weijer, J.: Adaptive Color Attributes for
   Real-Time Visual Tracking. Conference on computer vision and pattern recognition,
   pp. 1090–1097. (2014) doi:10.1109/CVPR.2014.143
9. Henriques, J.F., Caseiro, R., Martins, P., Batista, J.: High-Speed Tracking with Kernelized
   Correlation Filters. IEEE Transactions on Pattern Analysis and Machine Intelligence,
   Vol. 37(3), pp. 583–596 (2015) doi:10.1109/TPAMI.2014.2345390