Optical-Electronic System of Automatic Detection and High-Precision Tracking of Aerial Objects in Real Time

Igor Shostko [0000-0002-5612-3080], Andriy Tevyashev [0000-0001-5261-9874], Yuliia Kulia [0000-0001-6541-7913], Anton Koliadin [0000-0001-5552-5080]

Kharkiv National University of Radio Electronics, 14 Nauky Ave., Kharkiv, UKRAINE
ihor.shostko@nure.ua, tad45ua@gmail.com, yuliia.kulia@nure.ua, anton.koliadin@nure.ua

Abstract. The article presents the results of developing a digital video processing technology in the visible and infrared bands for the automatic detection and precision tracking of aerial objects in real time. An algorithm and software for automatic detection and precision tracking of aerial objects in real time were developed. The algorithm was tested and its performance evaluated by measuring the time spent processing each frame in the sequence. Testing showed that when the algorithm is executed on a Field-Programmable Gate Array (FPGA), the per-frame processing time does not depend on the object configuration, frame content, or background characteristics. At a frame size of 1920x1080, the algorithm executes more than 20 times faster on the FPGA than on a personal computer (PC).

Keywords: optical-electronic system, digital video processing, detection and tracking of moving objects.

1 Introduction

Optical-electronic systems (OES) with automatic detection and tracking of moving objects are used to solve various problems in:
– machine vision;
– video surveillance;
– weapons control.

The development of high-tech equipment using OES places ever-increasing demands on the detection and tracking of observed objects and on the delivery of measurement information about their motion parameters. At present, the applications most demanding with respect to video stream processing delay are air defense and weapons control systems. There is therefore a need to develop OES capable of automatic detection and precision real-time tracking of aerial objects.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

2 Relevance of research

In the process of detecting and tracking a target, an automatic optical-electronic system solves the following tasks:
– improving image quality (reducing visible noise, improving contrast and sharpness);
– adapting the camera to poor operating conditions (low light, fog, and other atmospheric influences);
– detecting all moving objects in the video stream;
– searching for a moving object based on a priori information about its type, shape, speed, and character of motion;
– measuring geometric parameters of the observed objects;
– automatic identification of the object type by comparison with a reference in shape, color, surface texture, and speed of motion;
– selecting special points on the tracked object: the most informative or vulnerable points, the center of mass, etc.;
– accurate tracking of the object by the selected points on its surface;
– predicting the trajectories of tracked objects;
– re-capturing an object whose tracking has failed.

Despite the progress achieved in solving these problems separately [1-9], there is currently no comprehensive solution to the problem of automatic detection of moving objects in the optical range and their high-precision real-time tracking.
Therefore, the task of developing a technology for detecting and tracking moving objects in the air environment in the visible and infrared bands in real time is urgent.

The object of study is the process of real-time digital processing of a video stream for detection and high-precision tracking of moving objects in the air environment.

The purpose of the work is the development of a digital video processing technology in the visible and infrared bands for automatic detection and precision tracking of aerial objects in real time.

Research methods: mathematical and computer modeling, and experimental study of digital video processing for real-time high-precision detection of aerial objects.

Detection and precision tracking of aerial objects consists of an ordered sequence of procedures for obtaining, processing, and analyzing video information, and for making and implementing decisions for precision tracking of an object under high a priori uncertainty about the behavior of the observed object and the environment. The technology is implemented as a mathematical, algorithmic, and hierarchically ordered software complex for digital processing of the video streams coming from the television and thermal imaging channels. The software package provides the following functions: image acquisition; image pre-filtering; detection of moving objects in the video stream; optical flow calculation; formation of object support points; high-precision tracking of moving objects in the air environment; and search for and re-capture of lost objects that were being tracked. Video streams from the cameras are processed in real time using a Xilinx Zynq UltraScale+ MPSoC ZCU104 board and a personal computer.

3 Literature review

The development of high-tech weapons and military equipment places ever-increasing demands on the detection and tracking of targets and on the delivery of measurement information on target motion parameters. Currently there are both high-precision stationary measuring complexes and fully automated optical-electronic and television measuring complexes (USA: the Raptor system manufactured by IEC Infrared Systems; Czech Republic: the Sirius Sea system; Norway: the Multi-Sensor DC system; France: Thales with its Margot 8000 system and SAFRAN with its TEOS; UK: Instro's Advanced Stabilized Sensor Platform (ASSP) and the Selex ES Nerio-ULR system; Russia: the Frigate station; Australia: ETS Imaging's ATS 3000 system; Turkey: the Seaeye-CUPRA station). Today, with its pivoting platform Mantis, Instro is one of the world leaders. High-precision measurement in these systems is ensured by state-of-the-art technology and component base, combined with the high dynamic performance of electromechanical tracking systems and computer processing of the measurement results.

An analytical review of optical-electronic systems leads to the conclusion that, although many systems exist for object detection and tracking, they lack the hardware for real-time trajectory measurements of object motion. It is therefore urgent to develop a new automated optical-electronic measuring system, and special algorithms for its operation, aimed at working automatically and in real time.

4 Materials and methods

The OES is a complex of software and hardware modules:
– optoelectronic module;
– support-rotary device;
– digital image processing module;
– software for real-time detection and precision tracking of aerial objects.
The optoelectronic module includes the digital cameras of the tracking system in the visible and infrared wavelength ranges. Each digital camera is an optical module with a lens, a light-sensitive sensor, an exposure and aperture control unit, and a control interface.

The OES can operate with preliminary target designation from external means or autonomously. On target designation, the target is automatically captured for auto-tracking. The target is tracked in the visible or infrared wavelength range, providing round-the-clock use of the OES. The visual tracking is recorded on digital storage. A structural diagram of the OES is shown in Fig. 1; a block diagram of the OES operating algorithm is shown in Fig. 2.

Fig. 1. Block diagram of the OES

Fig. 2. Flowchart of the OES operating algorithm

5 Experiment

5.1 Development of the OES prototype

The prototype OES for detecting and tracking moving objects in the air environment was developed using the licensed SolidWorks package, a computer-aided design system for automating the design and technological preparation of production in the Microsoft Windows environment. SolidWorks's three-dimensional solid and surface parametric design was used in developing the OES support-rotary device: it made it possible to create volumetric structural parts and assemblies as three-dimensional electronic models, from which two-dimensional drawings and specifications were produced in accordance with the design documentation requirements.

Three-dimensional modeling of the support-rotary device gave a number of significant advantages over traditional two-dimensional design: elimination of product assembly errors at the design stage, and creation, from the electronic model of a part, of the control program for machining the fork of the support-rotary device on a CNC machine. The three-dimensional parts of the support-rotary device were obtained by combining three-dimensional primitives. Consistent extension of the 3D objects eventually yielded a support-rotary device that meets all technical and technological requirements of the OES. Strength studies of the support-rotary device design were carried out with the SOLIDWORKS Simulation package, which accelerated the design of the device with guaranteed properties.

On the basis of the developed design documentation, a prototype of the OES for detecting and tracking moving objects in the air environment was made at the research plant of Kharkiv National University of Radio Electronics (Fig. 3). The OES prototype consists of: the support-rotary device; the optical-electronic module comprising a television camera, a thermal imager, and a laser rangefinder; and the digital image processing module for detecting and tracking aerial objects, based on a PC and a Xilinx Zynq UltraScale+ MPSoC ZCU104 board.

Fig. 3. Production of the optical-electronic system
5.2 Digital image processing module

The digital image processing module (DIPM) provides:
– reception of video from one of the cameras, primary image processing, and real-time detection, capture, and tracking of targets based on video stream analysis;
– overlay of graphic target information and tracking parameters on the image;
– calculation of the trajectory of the tracked target;
– formation of control commands for the support-rotary device.

The following basic features and limitations, related to the nature of the tasks being solved, were taken into account in developing the image analysis methods and processing algorithms of the DIPM:
1) a priori information about the characteristics of the observed objects and background is often missing or includes only approximate object sizes;
2) image processing and analysis must be performed in real time;
3) target capture and tracking must take place autonomously or with minimal operator involvement.

Basic approaches to improving vision in complex surveillance conditions are based on various methods of linear and nonlinear spatio-temporal filtering and on algorithms for estimating the parameters of geometric image transformations. Among the proven approaches to detecting objects and measuring their coordinates, there are four main classes of methods.

1. Benchmark-based methods can be used to measure the coordinates of moving and stationary objects observed against a homogeneous or heterogeneous background at small signal-to-noise ratios. To use this method, a database of portraits (spatial-frequency characteristics) of typical targets must first be developed.

2. Statistical segmentation methods are intended to select moving and stationary objects observed against a relatively homogeneous background. They are based on a priori information about the difference between the statistical properties of the object and the background. When using a thermal imaging camera, if the temperature of the target or of its fragments, such as an engine, significantly exceeds the temperature of the surrounding background elements, adjusting the contrast and brightness of the image can suppress the background. This method was chosen as the main one for night operation, when an object is tracked with the thermal imaging camera.

3. Object selection methods using spatial filtering. These methods are based on linear and nonlinear spatial filtering of the object in the image by color, by design features, or by the presence of graphic or text characters applied to the body. This class of methods is most effective at selecting moving and stationary objects against a clear or cloudy sky when the camera is moving.

4. Methods based on detecting dynamic changes are aimed at selecting moving objects observed against a homogeneous or heterogeneous background. Such algorithms detect the changes that occur over time in the observed sequence of images. In the case of a moving camera, the picture shift caused by camera motion is compensated first. This method is effective when the tracked object moves against the sky or a static background surface; it was therefore chosen as the main one for daytime operation on moving targets.
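As an illustration of method class 4, the sketch below implements a minimal frame-differencing detector with the OpenCV C++ API (the library family the system's software stack builds on); the threshold, kernel size, and minimum blob area are illustrative assumptions rather than the system's tuned values.

```cpp
// Sketch of dynamic-change detection: inter-frame difference,
// morphological cleanup, and connected-region extraction.
// Thresholds and kernel sizes below are illustrative assumptions.
#include <opencv2/opencv.hpp>
#include <vector>

std::vector<cv::Rect> detectMovingObjects(const cv::Mat& prevGray,
                                          const cv::Mat& currGray) {
    cv::Mat diff, mask;
    cv::absdiff(currGray, prevGray, diff);                   // inter-frame difference
    cv::threshold(diff, mask, 25, 255, cv::THRESH_BINARY);   // assumed threshold
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, {3, 3});
    cv::erode(mask, mask, kernel);                           // suppress pixel noise
    cv::dilate(mask, mask, kernel, cv::Point(-1, -1), 2);    // merge fragments
    std::vector<std::vector<cv::Point>> contours;            // object boundaries
    cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    std::vector<cv::Rect> objects;
    for (const auto& c : contours)
        if (cv::contourArea(c) > 20.0)                       // reject tiny blobs
            objects.push_back(cv::boundingRect(c));
    return objects;
}
```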
Among the promising concepts for the development of information technologies for image processing and analysis in optical-electronic systems, the following can be distinguished:
1) fusing the information received by different image registration channels (television, thermal imaging, radar); this increases the reliability of detecting, selecting, and estimating the parameters of objects at long distances, under low visibility, and when various means of camouflage are used;
2) the use of structural methods for detecting objects and estimating their parameters based on the detection and analysis of visual primitives (nodal points, segments, arcs) corresponding to the observed object; this class of methods is most effective for detecting and measuring the coordinates of objects observed under image deformation;
3) analysis of the background against which the objects are observed, with automatic selection of the detection and coordinate measurement algorithm most effective under the current observation conditions; this increases the autonomy of the optical-electronic system and in many cases eliminates the need for human operator intervention.

A significant feature of this concept is the presence of a target environment analysis stage, which results in the decision on which algorithm to use for tracking and measuring the coordinates of objects.

At the heart of the DIPM are a processor and an FPGA fabric. The processor manages the system and the FPGA performs the calculations. This combination is dictated by the fact that modern processors alone lack the necessary performance. System performance is determined not only by the speed of compressing a video frame, but also by the speed of pre- and post-processing (scaling, scan conversion, filtering, color conversion, etc.). These procedures take longer than the actual compression.

The project uses the Xilinx Zynq UltraScale+ MPSoC ZCU104 board. The board provides camera image capture and hardware processing, which minimizes delays and increases the speed of the DIPM. The central module of the ZCU104 integrates an FPGA with a quad-core 64-bit ARM Cortex-A53 processor (up to 1.5 GHz) with hardware virtualization, AMP, and TrustZone technology. The co-processor is a dual-core Cortex-R5 real-time processor (up to 600 MHz). For hardware acceleration of video processing, the system has an ARM Mali-400 graphics accelerator (clock speed up to 667 MHz), an H.264/H.265 video codec, and support for DisplayPort, MIPI, and HDMI. The board has high-speed USB 3.0, SATA 3.1, Gigabit Ethernet, and SD/SDIO peripherals. Power management is carried out by a separate PMU subsystem, which controls and manages power throughout the system. The reVISION platform for the ZCU104 supports software development in the SDSoC programming environment with OpenCV libraries, allowing effective real-time streaming video algorithms to be developed.

The effectiveness of combining the processor and the FPGA fabric on the same platform can be demonstrated by the problem of image stabilization. To align two frames, a parameter called the sum of absolute differences (SAD) is calculated. If the core of the system contains only the processor, the SAD calculation can take up to 65% of the processor time. Considering that this is not the only task the processor performs, it turns out to be fully loaded. Applying the FPGA relieves the processor: the fabric computes the SAD 10 times faster, freeing computing capacity for other tasks. However, when tasks are transferred from the processor to the FPGA, the two must exchange data. Therefore, the choice between the processor and the FPGA is made on the following principle: if the calculations require hundreds of millions of accumulations per second, they are carried out on the processor; if such operations exceed a billion per second, on the FPGA. Tasks such as median filtering or selecting the features of a tracked object are best performed on the FPGA.
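The SAD metric is simple to state in code. The sketch below is a plain CPU reference in OpenCV C++; the FPGA version described above would be a pipelined HLS kernel, which is not reproduced here.

```cpp
// Reference (CPU) computation of the sum of absolute differences (SAD)
// used to score the alignment of two frames during stabilization.
// On the FPGA this loop becomes a fully pipelined hardware kernel.
#include <opencv2/opencv.hpp>
#include <cstdint>

uint64_t sumAbsoluteDifference(const cv::Mat& a, const cv::Mat& b) {
    CV_Assert(a.size() == b.size() && a.type() == CV_8UC1 && b.type() == CV_8UC1);
    cv::Mat diff;
    cv::absdiff(a, b, diff);                            // |a(x,y) - b(x,y)| per pixel
    return static_cast<uint64_t>(cv::sum(diff)[0]);     // accumulate over the frame
}
```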
5.3 Description of the algorithms for detecting, capturing, and tracking targets

The algorithm is based on modern methods of finding and tracking moving objects in a video stream obtained from a video camera or an infrared camera. The algorithm can operate in fully automatic capture and tracking mode; in semi-automatic mode, when the target is designated by an external targeting system or by the operator and tracking is automatic; and in fully manual mode, when the operator directly controls the actuators with the joystick.

The algorithm consists of individual blocks. The main blocks are:
1) image reception block;
2) image pre-filtering block;
3) target capture block;
4) missing object search block;
5) block for locating support points on the target;
6) target tracking block;
7) tracked-point filtering block;
8) trajectory prediction block;
9) tracking error analysis block.

A block diagram of the algorithms for detecting, capturing, and tracking targets is shown in Fig. 4.

Fig. 4. Flowchart of the target detection, capture, and tracking algorithms

1. Image reception block (start of the algorithm). The algorithm starts by choosing the source of the video stream: the infrared camera, the visible range camera, or both cameras together. The image reception block manages the cameras and converts the image to an easy-to-process format. The camera controls depend on the particular camera model. The main adjustable parameters are optical zoom, focus, frame exposure time, sensor sensitivity (ISO), resolution, and frame rate. Some of the settings can be adjusted automatically by the camera without external commands.

An important parameter for the operation of the algorithm is the frame resolution: both the speed of the algorithm and the quality of the image depend on it. Most pre-processing, object search, and frame analysis operations have complexity directly proportional to the number of pixels in the frame. According to preliminary tests of the software prototype, image processing ran at 150-300 frames per second at 640x480 resolution and at 40-100 frames per second at 1280x720. Increasing the resolution allows more details to be distinguished on the target, which gives the algorithms additional information for analyzing the shape and texture of the object. Using a color camera adds further information about objects and allows algorithms that are sensitive to the target's color histogram. However, due to physical limitations, increasing the resolution of the sensor or using a color camera reduces the physical pixel size on the sensor, which increases noise and reduces image quality (especially in low light). It is therefore advisable to use the main camera in monochrome, with a large sensor size and a small resolution (640x480) at a frame rate of 30-60 frames per second.

2. Image pre-filtering block. Standard image filtering operations such as linear pixel averaging, median filtering, Gaussian blur, mathematical morphology, anisotropic diffusion, and spatial Wiener filtering are used during image pre-processing. Filters are selected to match the capture and tracking methods that follow; a minimal example is sketched below.
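As a minimal sketch of the pre-filtering block, the fragment below chains two of the listed filters, median filtering and Gaussian blur, in OpenCV C++; the kernel sizes and sigma are assumed values, not the system's selected parameters.

```cpp
// Sketch of the pre-filtering stage: median filtering to remove impulse
// noise, followed by a light Gaussian blur. Kernel sizes are illustrative.
#include <opencv2/opencv.hpp>

cv::Mat prefilterFrame(const cv::Mat& gray) {
    cv::Mat denoised, smoothed;
    cv::medianBlur(gray, denoised, 3);                  // 3x3 median filter
    cv::GaussianBlur(denoised, smoothed, {5, 5}, 1.0);  // sigma = 1.0 (assumed)
    return smoothed;
}
```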
3. Target capture block.

A. Manual capture (semi-automatic mode). The target is designated manually by the operator. In this case, the auto-capture detector is applied to the operator-selected area, the so-called region of interest (ROI).

B. Fully automatic mode. The choice of capture method in automatic mode depends on the time of day, weather conditions, and the parameters of the object to be tracked.

Moving object detector (selection of dynamic changes). For the primary capture of moving targets, a cascade motion detector consisting of sequential image processing operations is used. It is characterized by high speed, versatility with respect to object appearance, and the ability to add or remove operations from the processing sequence during operation. At this stage it is preferable that the platform does not move, so as not to interfere with capture; if this is not possible, the image shift is first compensated by maximizing the correlation between the two frames using the fast Fourier transform and phase correlation. Detection of moving objects is performed by the following operations:
1) calculation of the inter-frame difference;
2) morphological erosion;
3) morphological dilation;
4) morphological selection of object boundaries;
5) iterative traversal of connected regions.

Contrast object detector. To capture stationary objects, statistical segmentation based on the contrast between object and background is used instead of the motion detector, or an object search is performed for certain features such as color and shape (spatial filtering). That is, instead of comparing the inter-frame difference with a threshold, the color values or pixel intensities are compared with given thresholds. The subsequent morphology operations and the iterative traversal of regions remain the same.

Color object detector (when a color camera is used; a sketch is given below). The structure of the detector is:
1) selection of areas of a particular color;
2) morphological erosion;
3) morphological dilation;
4) morphological selection of object boundaries;
5) iterative traversal of connected regions.

Areas of a particular color are selected by converting the image to the HSV color space, in which each pixel is described by hue and saturation. Areas are selected whose hue lies within a specified range and whose intensity is above a threshold. Instead of selecting objects of a particular color, all objects that differ from the color of a homogeneous background can be selected. Steps 2 through 5 remain the same as in the contrast object detector.
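A compact sketch of the color detector structure above, in OpenCV C++; the HSV bounds are placeholders for an assumed target color, and steps 2 through 5 reuse the same morphology and connected-region pass as the motion detector.

```cpp
// Sketch of the color object detector: HSV conversion, in-range color
// selection, then the same morphology + connected-region pass as above.
// The HSV bounds below are placeholders for an assumed target color.
#include <opencv2/opencv.hpp>
#include <vector>

std::vector<cv::Rect> detectByColor(const cv::Mat& bgrFrame) {
    cv::Mat hsv, mask;
    cv::cvtColor(bgrFrame, hsv, cv::COLOR_BGR2HSV);
    cv::inRange(hsv, cv::Scalar(100, 80, 80),       // lower H, S, V bound
                     cv::Scalar(130, 255, 255),     // upper bound (blue-ish, assumed)
                mask);
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_ELLIPSE, {5, 5});
    cv::morphologyEx(mask, mask, cv::MORPH_OPEN, kernel);   // erosion + dilation
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    std::vector<cv::Rect> regions;
    for (const auto& c : contours) regions.push_back(cv::boundingRect(c));
    return regions;
}
```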
Detector of objects with a given contour shape. To search for an object by shape, a shape template is first taken and vectorized by calculating its statistical moments. As pre-processing of the initial image, a Gaussian filter is used to reduce the gradient at the boundaries.

Object detector based on a template base. To capture objects of a certain type, the method of analyzing histograms of oriented gradients (in which the number of gradient directions is specified) and the cascade Viola-Jones classifier, which uses Haar features as descriptors of characteristic points, are used. These methods require their own database of objects of all types of interest that are to be sought in the image. For classification, objects should be larger than 10x10 pixels; otherwise the probability of errors is high, so it is not always possible to determine the type of a long-range target or to search for objects of a particular type at long range.

Capture of objects at long range. Only the contrast object search method, the color object search method, and partially the moving object search method can be used to detect long-range aerial targets that appear as a point or a few pixels. The object should have high contrast with the background, and the background should be highly uniform. Additional features can be used to extend the range of capture conditions for a 1x1 pixel target. One such feature is the contrail, which has much larger linear dimensions; its shape and direction unmistakably point to a moving object and make it possible to determine its type. The search for a contrail is done by finding objects of a given linear shape.

4. Missing object search block. If a tracked object is lost, it is searched for in the predicted areas, using the known information about the object's previous appearance and image comparison algorithms based on characteristic point descriptors (SIFT, SURF, ORB), boundary shape (ShapeMatch), or cross-correlation, depending on which features were most characteristic of the object before it was lost.

5. Object tracking (block for detecting and tracking characteristic points on the target). The tracking block is designed to detect and track feature points in a series of video frames. This block is implemented on the Xilinx Zynq UltraScale+ MPSoC ZCU104.

A Harris corner detector is used to detect characteristic points. Corners are areas of the image with a large difference in pixel intensity in all directions. To find a corner, the difference E(u, v) under a shift in all directions is calculated (1):

E(u, v) = \sum_{x,y} w(x, y) \, [I(x + u, y + v) - I(x, y)]^2,   (1)

where I(x, y) is the intensity at a point; I(x + u, y + v) is the intensity at the shifted point; w(x, y) is the window function. The window function is either a rectangular window or a Gaussian window that weights the difference under displacement. To identify corners, the function E(u, v) must be maximized. The result of processing an image with the Harris function is a grayscale image in which the corners are indicated by dots.

A modified version of the Lucas-Kanade optical flow is used to track the target. The main part of the algorithm takes the current and subsequent frames as input and outputs a list of characteristic points to be tracked. The current image is the first frame of the set in which the algorithm detects and tracks feature points. The number of frames in the set in which the characteristic points are to be tracked is specified as an input parameter.
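This detector-tracker pair has a direct software analogue in standard OpenCV, which the hardware chain described next emulates. The sketch below detects Harris corners and propagates them with the pyramidal Lucas-Kanade tracker; all parameter values (corner count, quality level, window size) are illustrative assumptions.

```cpp
// Software analogue of the Harris + Lucas-Kanade tracking chain:
// detect corners on the first frame, then track them into the next frame.
// Parameter values (corner count, quality, window size) are illustrative.
#include <opencv2/opencv.hpp>
#include <vector>

void trackFeaturePoints(const cv::Mat& prevGray, const cv::Mat& nextGray,
                        std::vector<cv::Point2f>& tracked) {
    std::vector<cv::Point2f> corners;
    cv::goodFeaturesToTrack(prevGray, corners, /*maxCorners=*/500,
                            /*qualityLevel=*/0.01, /*minDistance=*/7,
                            cv::noArray(), /*blockSize=*/3,
                            /*useHarrisDetector=*/true, /*k=*/0.04);
    std::vector<uchar> status;
    std::vector<float> err;
    cv::calcOpticalFlowPyrLK(prevGray, nextGray, corners, tracked,
                             status, err, cv::Size(21, 21), /*maxLevel=*/3);
    // Keep only the points the pyramidal Lucas-Kanade tracker could follow.
    std::vector<cv::Point2f> kept;
    for (size_t i = 0; i < tracked.size(); ++i)
        if (status[i]) kept.push_back(tracked[i]);
    tracked.swap(kept);
}
```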
The block for detecting and tracking characteristic points on targets uses five hardware functions of the xfOpenCV library, which are combined into one new tracking function: xf::cornerHarris, xf::cornersImgToList, xf::cornerUpdate, xf::pyrDown, and xf::densePyrOpticalFlow. The interconnection of the hardware functions is shown in Fig. 5.

Fig. 5. Tracking feature points using a sparse optical flow

The tracking function takes the characteristic points from the Harris corner detector and the dense optical flow vectors from the dense pyramidal optical flow function, and outputs updated coordinates of the characteristic points, tracking the input corners using the dense flow vectors and thus simulating the behavior of a sparse optical flow. This hardware function is clocked at 300 MHz for 10,000 characteristic points at 720p resolution, adding minimal delay to the pipeline.

6. Tracked-point filtering block. The algorithm uses a screening operation to remove unnecessary and erroneous points, which appear in the image as a result of background motion. The n extreme points farthest from the center of mass (the center of the object) are eliminated, and points whose motion differs significantly from the motion of the majority of the object's points are also rejected. The resulting image can be considered a binary mask in which only the non-zero pixels are of interest. To find the bounding box around the points in the image, each non-zero pixel is checked to see whether its position lies outside the current rectangle; if so, the location of the rectangle is updated. After overlaying the detected characteristic points and the bounding box on the original image (Fig. 6), each frame of the video sequence contains a box that follows the object (Fig. 7).

Fig. 6. Overlay of the detected feature points and bounding box on the original frame

Fig. 7. Tracking an aerial target (video frame)

7. Analysis of objects at intersections with other objects, and detection of an object's disappearance from view (the block for approximating the past trajectory and the block for distinguishing intersections of targets with other targets and with relief features; analysis of complex maneuvers). The obtained image of moving objects is compared with the previous ones; if any object is lost from the field of view, it is followed along the predicted trajectory, and in subsequent frames objects are searched for in the forecast areas. All the known information about the object's previous appearance is used, together with image comparison algorithms based on characteristic point descriptors (SIFT, SURF, ORB), boundary shape (ShapeMatch), or cross-correlation, depending on which features were most characteristic of the object before its loss.

8. Trajectory prediction block. To predict the trajectory of an object, a Kalman filter is used, into which the coordinates of the targets not marked as lost are fed in each frame.

9. Tracking error analysis block. The analysis of the reasons why the tracking error exceeds a given threshold is performed on the basis of the hierarchy analysis method and establishes one of the possible causes of the exceedance:
– a change in the character of the object's motion (change of speed, maneuvering, hovering, etc.);
– a change in the video surveillance conditions (change of brightness or contrast, partial or full occlusion of the moving object, etc.).

After the causes are established, the algorithm returns to one of its previous stages. This process continues until the tracking error is less than the specified threshold.
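As an illustration of the descriptor-based re-search used in blocks 4 and 7, the sketch below matches ORB descriptors of the last good view of the object against a predicted search region in OpenCV C++; the acceptance threshold is an assumed heuristic, not the system's criterion.

```cpp
// Sketch of descriptor-based re-acquisition: ORB keypoints from the last
// good view of the object are matched against a predicted search region.
// The match-count acceptance test is an illustrative heuristic.
#include <opencv2/opencv.hpp>
#include <vector>

bool reacquireObject(const cv::Mat& lastObjectView, const cv::Mat& searchRegion) {
    cv::Ptr<cv::ORB> orb = cv::ORB::create(500);
    std::vector<cv::KeyPoint> kp1, kp2;
    cv::Mat d1, d2;
    orb->detectAndCompute(lastObjectView, cv::noArray(), kp1, d1);
    orb->detectAndCompute(searchRegion,  cv::noArray(), kp2, d2);
    if (d1.empty() || d2.empty()) return false;
    cv::BFMatcher matcher(cv::NORM_HAMMING, /*crossCheck=*/true);
    std::vector<cv::DMatch> matches;
    matcher.match(d1, d2, matches);
    // Accept re-capture when enough descriptors agree (assumed threshold).
    return matches.size() >= 15;
}
```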
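A minimal sketch of the trajectory prediction block: a constant-velocity Kalman filter over the target's image coordinates, in OpenCV C++. The state layout and noise covariances are assumptions for illustration.

```cpp
// Sketch of trajectory prediction: a constant-velocity Kalman filter.
// State is (x, y, vx, vy); noise covariances are illustrative values.
#include <opencv2/opencv.hpp>

cv::KalmanFilter makeTrajectoryFilter() {
    cv::KalmanFilter kf(/*dynamParams=*/4, /*measureParams=*/2);
    kf.transitionMatrix = (cv::Mat_<float>(4, 4) <<
        1, 0, 1, 0,          // x  += vx (one frame step)
        0, 1, 0, 1,          // y  += vy
        0, 0, 1, 0,
        0, 0, 0, 1);
    cv::setIdentity(kf.measurementMatrix);                     // measure (x, y)
    cv::setIdentity(kf.processNoiseCov,     cv::Scalar::all(1e-3));
    cv::setIdentity(kf.measurementNoiseCov, cv::Scalar::all(1e-1));
    return kf;
}

// Per frame: predict, then correct with the measured target position.
cv::Point2f stepFilter(cv::KalmanFilter& kf, const cv::Point2f& measured) {
    kf.predict();
    cv::Mat m = (cv::Mat_<float>(2, 1) << measured.x, measured.y);
    cv::Mat est = kf.correct(m);
    return { est.at<float>(0), est.at<float>(1) };
}
```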
6 Results

Testing of the aerial object tracking algorithm and assessment of its performance were carried out in the laboratory on static and dynamic scenes, using a simulated moving aerial object placed against a background that allowed different visibility conditions (contrast) of the tracked object to be simulated. In the experiment, the movement of the object was recorded with a video camera, the real-time video stream was processed on the FPGA, and the processed video image was displayed on the monitor. As a result of the video stream processing, the coordinates of the object are calculated and the object is tracked and held in the frame.

The algorithm's performance was evaluated by measuring the time spent processing each frame in the sequence. Testing showed that when the algorithm runs on the FPGA, the per-frame processing time does not depend on the object configuration, frame content, or background characteristics. The results of the performance study and a comparison of the Harris function computation on a personal computer (PC) (CPU: i7-4500U, memory: 8 GB DDR3) and on the FPGA are shown in Table 1.

Table 1. The results of the comparison

Video frame resolution | Processing time using a PC, ms | Processing time using FPGA, ms
640x480   | 17.5  | 0.98
1280x480  | 68.2  | 4.3
1920x1080 | 272.3 | 13.6
1920x1920 | 613.5 | 46.7

Table 1 shows that the FPGA-based approach performs much better than the CPU. For example, at a frame size of 1920x1080 the algorithm executes more than 20 times faster on the FPGA than on the PC. Zynq-series Xilinx boards are well suited to video processing. Benefits of the Zynq platform:
• bandwidth 12 times higher than the alternative SoCs currently on the market;
• up to 6x more images/s/watt compared to embedded GPUs and typical SoCs;
• support for combinations of video sensors.

The software implementation of the algorithm discussed in this article can be deployed on the Xilinx Zynq-7000 and Zynq UltraScale+ MPSoC platforms, including the ZCU102 and ZCU104 boards.
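For reference, per-frame processing times of the kind reported in Table 1 can be collected with a simple wall-clock harness; the sketch below shows the measurement pattern in portable C++ and is not the authors' exact benchmark.

```cpp
// Sketch of per-frame timing measurement for a table like Table 1:
// average wall-clock time of one processing pass per frame.
#include <chrono>
#include <functional>
#include <opencv2/opencv.hpp>

double averageFrameTimeMs(cv::VideoCapture& cap,
                          const std::function<void(const cv::Mat&)>& process) {
    cv::Mat frame;
    double totalMs = 0.0;
    int frames = 0;
    while (cap.read(frame)) {
        auto t0 = std::chrono::steady_clock::now();
        process(frame);                                   // one processing pass
        auto t1 = std::chrono::steady_clock::now();
        totalMs += std::chrono::duration<double, std::milli>(t1 - t0).count();
        ++frames;
    }
    return frames ? totalMs / frames : 0.0;
}
```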
7 Conclusions

1. A prototype OES for detecting and tracking moving targets in the air environment has been developed. The new automated OES¹ and the special algorithms for its operation are aimed at working automatically and in real time, so the detection and tracking of characteristic points on the target is performed on the FPGA. We recommend implementing the target point detection and tracking block in software on the Xilinx Zynq-7000 and Zynq UltraScale+ MPSoC platforms, including the ZCU102 and ZCU104 boards.

2. A digital video processing technology in the visible and infrared bands has been developed for the automatic detection and precision tracking of aerial objects in real time. The main components of the flowchart of the algorithms for detecting, capturing, and tracking targets have been identified. Software implementation of all digital video processing functions on the FPGA alone is not possible, so at this time a compromise solution was proposed, using a personal computer (CPU: i7-4500U, memory: 8 GB DDR3) together with the ZCU104.

3. A study of the performance of the developed algorithm was carried out, comparing the computation of the Harris function on a personal computer and on the FPGA. At a frame size of 1920x1080, the algorithm executes more than 20 times faster on the FPGA than on the PC.

¹ The work was carried out with the assistance of the V.V. Popovsky Department of Infocommunication Engineering, the Department of Applied Mathematics, and the test plant of Kharkiv National University of Radio Electronics.

References

1. Yane, B.: Cifrovaya obrabotka izobrazhenij (Digital Image Processing). Tehnosfera, 584 p. (2007) ISBN 978-5-94836-122-2
2. Lukyanica, A.A., Shishkin, A.G.: Cifrovaya obrabotka izobrazhenij (Digital Image Processing). Aj-Es-Es Press, 518 p. (2009)
3. Murphy, K.P.: Models for Generic Visual Object Detection. Technical report, Department of Computer Science, University of British Columbia, Vancouver, Canada, 8 p. (2005)
4. Viola, P., Jones, M.: Robust Real-Time Object Detection. Intl. J. Computer Vision, Vol. 57(2), pp. 137-154 (2004)
5. Bulychev, Yu.G., Manin, A.P.: Matematicheskie aspekty opredeleniya dvizheniya letatelnyh apparatov (Mathematical Aspects of Aircraft Motion Determination). Mashinostroenie, 256 p. (2000)
6. Bay, H., Ess, A., Tuytelaars, T., Van Gool, L.: SURF: Speeded Up Robust Features. Computer Vision and Image Understanding (CVIU), Vol. 110, No. 3, pp. 346-359 (2008)
7. Titov, I.O., Emelyanov, G.M.: Sistema kompyuternogo zreniya dvizhushegosya vozdushnogo obekta (Computer Vision System for a Moving Air Object). Kompyuternaya optika, Vol. 35, No. 4, pp. 491-494 (2011)
8. Danelljan, M., Khan, F.S., Felsberg, M., van de Weijer, J.: Adaptive Color Attributes for Real-Time Visual Tracking. Conference on Computer Vision and Pattern Recognition, pp. 1090-1097 (2014) doi:10.1109/CVPR.2014.143
9. Henriques, J.F., Caseiro, R., Martins, P., Batista, J.: High-Speed Tracking with Kernelized Correlation Filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 37, No. 3, pp. 583-596 (2015) doi:10.1109/TPAMI.2014.2345390