Methodology of the Formation of Sports Matches Statistical Information Using Neural Networks

Olena Sorokivska 1, Iaroslav Lytvynenko 1, Oleksandr Sorokivskyi 2, Halyna Kozbur 1, Iryna Strutynska 1

1 Ternopil Ivan Puluj National Technical University, 56, Ruska Street, Ternopil, 46001, Ukraine
2 IT-Company Amazinum, Ternopil, 46001, Ukraine

Abstract
The article develops a methodology for the formation and processing of statistical information about played sports matches using neural networks. To find the players and the ball in the video, the authors used the YOLOv5 model, which is fast and accurate and therefore well suited to the task. To determine the position of the players on the pitch, pix2pix neural networks were used. They were trained on images of players and their positions on the pitch, allowing them to accurately identify players' positions in new images. The SportsReID technology was used to re-identify players; it makes it possible to distinguish one player from another based on appearance and movement. The creation of an API in Python using Flask for obtaining statistical information about players and their actions on the pitch is also described. This will allow coaches and analysts to receive valuable information for improving team strategy and tactics. The elaborated automated method of generating statistical data has great application potential in the industry of football match analysis. The results can be used for further research in the area of automating statistics formation, as well as for improving work in other areas related to sports. It is also worth noting that the artificial intelligence technologies used in the development of such an intellectualized methodology can significantly facilitate and speed up the analysis of video recordings, which increases the efficiency of the work and reduces the cost in time and money.

Keywords
Machine learning, deep neural networks, football, computer vision, homography, YOLO.

Proceedings ITTAP'2023: 3rd International Workshop on Information Technologies: Theoretical and Applied Problems, November 22–24, 2023, Ternopil, Ukraine, Opole, Poland
EMAIL: soroka220996@gmail.com (A. 1); iaroslav.lytvynenko@gmail.com (A. 2); Gosasha401@gmail.com (A. 3); kozbur.galina@gmail.com (A. 4); strutynskairy@gmail.com (A. 5).
ORCID: 0000-0001-8549-2910 (A. 1); 0000-0001-7311-4103 (A. 2); 0009-0006-6477-5878 (A. 3); 0000-0003-32970776-2910 (A. 4); 0000-0001-5667-6569 (A. 5).
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

1. Introduction

The development of a methodology for the formation of statistical information obtained from the analysis of video recordings of football matches is a highly relevant research topic, especially in the context of modern technologies and the worldwide popularity of football. Such a technique makes it possible to provide an accurate and objective analysis of the game, identify the strengths and weaknesses of the teams, and help coaches prepare for the next matches. In addition, the collection and analysis of statistical data from video recordings can be useful for organizing competitions and improving their level, increasing the interest of spectators in the game, and developing the football industry as a whole. It is also worth noting that the artificial intelligence technologies used in the development of such an intellectualized methodology can significantly facilitate and speed up the analysis of video recordings, which increases the efficiency of the work and reduces the cost in time and money.
The purpose of the article is to develop a methodology for the formation of statistical data on the personal indicators of players based on the processing of video recordings of past football matches. The main goal of the study is to provide an efficient and automated process for obtaining statistics about players from video recordings of matches. The developed automated method of generating statistics from recordings of football matches makes it possible to optimize and improve the existing processes in a coach's work, reduce the number of errors and accelerate the development of the team. The results can be used for further research in the field of automating the formation of statistics, as well as to improve work in other areas related to sports.

2. Related Works

In an early work by Assfalg, J., Bertini, M., Colombo, C., Del Bimbo, A. and Nunziati, W. [1], a method of finding a homography based on the position of the pitch and the correspondence of lines was considered. The drawback of this technique is that it targets only the goal area and the corner parts of the pitch. The whole work is based on the annotation of events, namely a goal kick, a corner kick and a free kick taken by the goalkeeper. The work also developed an algorithm for determining the type of shot based on the location of the players on the pitch. Players are found by highlighting them by color, and events are classified based on the placement of players on the pitch.

Among works that study statistics useful for teams, one of the most popular is the work of Perin C., Vuillemot R. and Fekete J. [2], which describes the importance and methods of qualitative analysis in football. The authors suggest using special visualizations for corner kicks, long runs, pass clusters, shot distribution and others. The statistics and visualizations in the work are explored in depth, but the drawback is that it lacks information about the players and their position on the pitch; this information is available only for labeled data.

A recent paper by Stein, M., Janetzko, H., Lamprecht, A., Breitkreutz, T., Zimmermann, P., Goldlucke, B. and Keim, D. A. [3] deals with the full cycle of obtaining and visualizing statistical data from video recordings. To find players on the pitch, the authors use a color-based segmentation method. To determine the position of players on the pitch, the SIFT method [4] is used, the essence of which is to find key points between images and obtain vectors from them for combining the images. The work states that the first two minutes of a match are enough to capture all the camera angles and stitch the images together, which makes it possible to obtain a complete schema of the pitch. Once the schema is obtained, a predefined homography matrix is used to obtain the changes for the next frame. In general, the accuracy of these methods is not specified; in addition, the authors themselves point out that the methods of finding players on the pitch are not accurate enough. Also, the system does not include a function for automatically finding the ball on the pitch, which limits the calculation of statistical information.
In modern research, S. Afzal, S. Ghani, M. M. Hittawe and S. F. Rashid [5] explore the current state of the art at the intersection of visualization and visual analytics with image and video data analysis. The authors classify visualization articles based on various taxonomies used in visualization and visual analytics research, review these articles in terms of task requirements, tools, datasets and application areas, and discuss ideas based on the results of their survey. Basic scientific research is also being conducted to evaluate the control zone in badminton doubles games using information from drones [6], as well as to augment basketball videos with embedded gaze-moderated visualizations [7]. M. B. Jurca [8] constructed a robust pipeline for the sports analysis community in order to extract useful information from broadcast football matches, proposing a fast and efficient solution based on computer vision and machine learning methods and algorithms. S. Rahimi, A. Moore and P. A. Whigham [9], based on the agent's conceptual space-time model and reasoning behavior, developed guidelines for the design of a realizable vector-agent model; they applied sensitivity-variability analysis to measure the performance of different configurations of system components with respect to new movement patterns. Z. Chen, J. Beyer, H. Pfister, Q. Yang, H. Xia, X. Xie and Y. Wu [10] focused their attention on augmenting sports videos with natural language. However, none of the mentioned articles gives a clear and sufficiently comprehensive answer to the question of how to ensure an effective automated process of obtaining statistical data about players from video recordings of matches.

3. Proposed Methodology

One of the subtasks in the methodology of statistics formation is determining the position of players on the pitch. The main source of information is the frames of the video, and solving this subproblem is impossible without taking additional variables into account. We consider several basic methods for solving this problem.

3.1 Determination of the homography matrix by a manual method

This method involves the manual adjustment of one or more cameras. In order to determine the position of the players on the pitch, one needs to know the boundaries of the pitch, the distance from the base of the camera to the pitch, the distance from the top of the camera to the pitch, and the homography matrix. A homography matrix is a transformation matrix used to represent three-dimensional objects in two-dimensional space or to perform other similar tasks in graphics processing. Homography is a mathematical concept that describes the relationship between two projective spaces, which can be realized as images of three-dimensional objects on a plane. The homography matrix is usually determined by calibrating the camera and fitting it to the points that represent the image on the plane. Knowing the homography matrix, one can perform image transformation operations such as scaling, rotation and translation. In a more general sense, the homography matrix is used to describe transformations between spaces of any number of dimensions and is not limited to the field of graphics processing. Therefore, in this approach, the matrix is adjusted manually: one wide-format camera or several cameras are placed in a fixed position, key points are selected from the 3D space and the 2D space, and the homography matrix is found, as sketched below.
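As an illustration of this manual step, the following sketch estimates a homography from four manually selected point correspondences between a broadcast frame and a metric pitch template using OpenCV, and then projects a detected player's foot point onto the pitch plane. The pixel and pitch coordinates here are hypothetical placeholders, not values from the study.

import numpy as np
import cv2

# Hypothetical correspondences: pixel coordinates of pitch landmarks in the frame
# and their positions on a metric pitch template (meters, origin in one corner).
frame_pts = np.array([[102, 488], [1180, 455], [955, 120], [310, 135]], dtype=np.float32)
pitch_pts = np.array([[0, 0], [52.5, 0], [52.5, 34], [0, 34]], dtype=np.float32)

# Estimate the 3x3 homography that maps frame pixels to pitch coordinates.
H, _ = cv2.findHomography(frame_pts, pitch_pts, method=cv2.RANSAC)

# Project a detected player's foot point (pixel coordinates) onto the pitch plane.
foot_px = np.array([[[640.0, 400.0]]], dtype=np.float32)   # shape (1, 1, 2)
foot_pitch = cv2.perspectiveTransform(foot_px, H)[0, 0]
print("Player position on the pitch (m):", foot_pitch)

With more than four correspondences, the RANSAC option also suppresses badly selected points, which is useful when the landmarks are clicked by hand.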
In the case of a single camera, it must be positioned very precisely, because an error of a few millimeters can turn into an error of several meters, since a football pitch is large (about 100 meters long). After finding the homography matrix for the initial position of the camera, the matrix has to be updated using information about the camera rotation angles and zoom.

3.2 Determination of the homography matrix by an automatic method

The automatic method is based on determining the camera parameters from a gradient analysis of image edges. The normal camera calibration process must use points in the image that can be difficult to determine and may require additional training. At the same time, the edges of the image, which can be determined easily, are highly sensitive to changes in camera parameters, which made them an object of research. The approach uses the gradient of the image edges to define the camera parameters without the need to use points. The camera parameters are determined by finding the optimal mapping between the corresponding edges in two images using the gradients of these edges. The research was conducted on different types of images, and it was demonstrated that the method can be effective for determining camera parameters, especially in situations where points in the image are difficult to define but the edges of the image can be found easily. The main result of the study is the use of image edge gradients to determine camera parameters, which may be useful for developing more efficient and accurate camera calibration methods in the future. An example of the result of this approach is shown in Figure 1.

Figure 1: An example of the result of the gradient approach

Another approach to automatically determining the homography matrix is to develop a neural network that determines the positions of key points in the image, from which the homography is found. For this, a model architecture similar to U-Net is used, consisting of an encoder and a decoder. U-Net is a deep neural network architecture for semantic image segmentation, that is, for dividing an image into subregions and assigning a class to each subregion. It is widely used in biomedical imaging, in particular in the segmentation of cells, tissues, organs and pathological changes. The U-Net architecture consists of two main parts: an encoder and a decoder. The encoder consists of several convolutional layers and pooling layers, which reduce the size of the image and increase the number of channels; this part of the network performs feature extraction, which is then used by the decoder. The decoder consists of transposed convolutional layers and concatenations with the corresponding encoder outputs. The decoder gradually increases the image size and decreases the number of channels to obtain a segmentation map, and the concatenations help transfer local information from the encoder to the decoder, allowing for a more detailed segmentation map. In general, the encoder extracts a hierarchy of features that can be used for image segmentation, and the decoder reproduces the segmentation map from these features. For this task, the training images were labeled using 91 key points, and the usual calculation of weights for the filters in the convolutional model was replaced by dynamic filter generation. An illustrative sketch of such an encoder-decoder keypoint model is given below.
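The following sketch shows, under simplifying assumptions, what such an encoder-decoder keypoint model can look like in PyTorch. It is not the architecture of the cited approach: the depth, the channel counts and the dynamic filter generation step are simplified away, and the network simply predicts one heatmap per pitch keypoint.

import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Two 3x3 convolutions with ReLU, as in a typical U-Net stage.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class KeypointUNet(nn.Module):
    """Minimal U-Net-style model that predicts one heatmap per pitch keypoint."""
    def __init__(self, num_keypoints=91):
        super().__init__()
        self.enc1 = conv_block(3, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)
        self.head = nn.Conv2d(32, num_keypoints, 1)

    def forward(self, x):
        e1 = self.enc1(x)                                    # encoder: feature extraction
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))  # decoder with skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)                                 # one heatmap per keypoint

model = KeypointUNet()
heatmaps = model(torch.randn(1, 3, 256, 256))
print(heatmaps.shape)  # torch.Size([1, 91, 256, 256])

In a full pipeline, the predicted heatmaps would be converted into point coordinates (for example, by taking the arg-max of each channel) and fed into the homography estimation.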
For the dynamic filter generation, each of the 91 points was encoded into a vector and used to train the decoder model. The IoU metric was used to measure accuracy; on the World Cup dataset the algorithm achieved an accuracy of 98%. An example of the result of the algorithm is shown in Figure 2.

Figure 2: An example of the result of the algorithm using a neural network

Another approach is to determine the homography matrix using synthetic data. This approach uses a dual GAN model to obtain an image with the pitch contours. GANs (Generative Adversarial Networks) are neural network models that consist of two deep neural networks: a generator and a discriminator. The generator creates new images, while the discriminator tries to distinguish them from real ones. The model is trained in an adversarial process between the generator and the discriminator: the generator tries to create images that cannot be distinguished from real ones, and the discriminator tries to distinguish between real and synthesized images. In the learning process, the generator and the discriminator interact and learn from each other, improving their quality and ability to produce realistic images. The generator and the discriminator are competing models that work together to provide GAN training. The generator takes random noise or a vector as input and uses it to generate new images. The discriminator takes an image as input and determines whether it is real or generated. The most important element of a GAN is the loss function, which measures the error rate of the generator and the discriminator. The loss function must be configured so that the discriminator can distinguish between real and generated images and the generator produces images that are close to the real ones. During GAN training, the generator changes its parameters to reduce the discriminator's error and improve the quality of its images.

After obtaining information from the GAN model, HOG transformations are used to describe the image, and the best match is then selected from a database containing similar images and their corresponding homography matrices. HOG (Histogram of Oriented Gradients) is a method for determining features in images, which is widely used in computer vision and image processing. HOG is based on the idea that the shape of objects can be determined from the orientations of pixel gradients in an image. The resulting vectors are considered image features that can be used to recognize objects in the image. An example of the algorithm is shown in Figure 3.

Figure 3: Example of homography definition using GAN models

However, the figure shows that the model does not always predict the position of the pitch as accurately as possible; maximizing the accuracy of this task is beyond the scope of this work. A minimal sketch of the HOG-based retrieval step is given below.
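The retrieval step can be sketched as follows, under simple assumptions: a HOG descriptor is computed for the query image produced by the GAN and for every reference image in a database with known homography matrices, and the homography of the nearest reference is returned. The descriptor parameters and the database structure are illustrative; the cited approach may use a different configuration or a more efficient search structure.

import numpy as np
import cv2
from skimage.feature import hog

def hog_descriptor(image_bgr):
    # Describe the pitch-line image with a histogram of oriented gradients.
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, (256, 128))
    return hog(gray, orientations=9, pixels_per_cell=(16, 16), cells_per_block=(2, 2))

def retrieve_homography(query_img, database):
    # database: list of (descriptor, 3x3 homography matrix) pairs built offline
    # from reference views with known camera poses.
    q = hog_descriptor(query_img)
    distances = [np.linalg.norm(q - d) for d, _ in database]
    return database[int(np.argmin(distances))][1]

In practice, such a database would be built offline from synthetically rendered pitch views, which is consistent with the synthetic-data idea described above.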
3.3 Analysis of approaches to player re-identification

Re-identification of players on the pitch is an important task for analyzing football matches and understanding how players move and interact on the pitch. Computer vision can be used to automatically identify players based on their appearance. A system called Torchreid is commonly used to re-identify people. Torchreid is open-source software that provides a framework for developing and experimenting with image-based re-identification algorithms. It supports numerous datasets, including Market1501, DukeMTMC-reID, CUHK03 and others, and contains tools for data processing, model building, result validation and visualization. The main features of Torchreid are:
1. Support for various neural network architectures for re-identification, including ResNet, DenseNet, NASNet, EfficientNet, etc.
2. Support for various loss functions, including cross-entropy loss, triplet loss, quadruplet loss, circle loss, etc.
3. Support for various methods of reducing data dimensionality, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), t-SNE, etc.
4. The ability to build hybrid models that use images and video simultaneously to improve re-identification results.
5. Support for various methods of data collection, including manual annotation, automatic annotation using computer vision algorithms, etc.
6. The ability to use pre-trained models to achieve better results on new data.
7. Support for various metrics for evaluating re-identification results, including the Cumulative Matching Characteristics (CMC) curve and mean average precision (mAP).

The SportsReID re-identification technology for finding football players is based on Torchreid. It is a system for re-identifying players in match videos using computer vision. The system uses deep learning technologies, including neural networks, to determine the identity of players in videos. To do this, the video is first processed: frames with images of the players are extracted from it, and algorithms are used to determine features such as body shape, clothes, shoes and other details of the appearance. The obtained features are compared with data from a database containing images of players and their identification data. The search is performed using image comparison algorithms such as histograms of oriented gradients (HOG), deep neural networks and others. SportsReID is used in large team sports events where it is necessary to track the movements of many players at the same time. The system can help in training teams, analyzing the game and identifying the best players in order to improve performance.

4. Results

4.1 Highlighting the main requirements for the methodology of the formation of statistical information about held sports matches on the basis of video recordings

In the process of the research, the authors started from the requirement that the method for generating statistical information should be able to find the players and the ball in the video, determine their position on the pitch and calculate variable data that coaches can later use to improve the team's work. It should also be possible to integrate the methodology into any application. For successful integration into applications, it was decided to create an API for accessing the technique, which can be used to obtain information about the movement of players and the calculated statistical data. The statistical data that the technique should provide for each player are:
• Time of ball possession;
• Distance traveled;
• Number of passes;
• Number of interceptions;
• Time spent on own side;
• Time spent on the opponent's side;
• Average speed.
These statistics give a complete picture of a player's work during the game and reveal the key advantages and disadvantages of particular strategies.

4.2 Finding the players and the ball

A dataset with annotated players and a ball was used to train the detection model. The players and the ball were highlighted by rectangles, and the coordinates of the rectangles were recorded in a special format for training the model; a hypothetical example of such an annotation record is given below.
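As an illustration of such an annotation format, YOLO-style training labels are commonly stored as one text file per image with one line per object, containing a class index and a bounding box in normalized coordinates. The helper function and the box values below are hypothetical and only show the conversion; the exact format of the dataset used in this work may differ in details.

# Convert a pixel bounding box to a YOLO-format label line:
# "<class_id> <x_center> <y_center> <width> <height>", all normalized to [0, 1].
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    xc = (x_min + x_max) / 2 / img_w
    yc = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# Hypothetical player box (class 0) and ball box (class 1) in a 1280x720 frame.
print(to_yolo_line(0, 340, 210, 395, 330, 1280, 720))
print(to_yolo_line(1, 642, 415, 660, 433, 1280, 720))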
As a result, 1000 pictures were selected for training the model, containing on average 14.5 players per frame. An example from the dataset is shown in Figure 4.

Figure 4: An example of a dataset for finding players and the ball

In order to choose the best model for the detection task, two models, YOLO and R-CNN, were compared according to the following parameters:
• Architecture: YOLO is a single-stage architecture, i.e., it detects objects in the entire image at once, while R-CNN is a multi-stage architecture, i.e., it first extracts regions containing objects and then detects them.
• Speed: YOLO is generally faster than R-CNN because it performs computations on the entire image at once, without requiring additional computations on regions that do not contain objects.
• Accuracy: R-CNN usually has higher object detection accuracy because it can use more sophisticated methods to identify regions of images that contain objects. However, due to the more complex architecture, it works slower than YOLO.
• Computational resource requirements: YOLO generally requires fewer resources than R-CNN because it has fewer layers and operations. This makes it more popular for use on resource-constrained devices such as mobile phones.
• Video processing: YOLO usually works better with video because it can identify objects in each frame at once, whereas R-CNN requires additional time to process each frame.
In addition, there are many ready-made solutions for training YOLO models that simplify this task, so YOLO was chosen as the main model for finding objects. At the time of development, the two most popular YOLO models were v4 and v5. Both versions of the model have advantages and disadvantages:
• Architecture: YOLOv4 has a more complex architecture compared to YOLOv5, which has a simpler and more efficient structure.
• Speed: YOLOv5 is generally faster than YOLOv4 because it has fewer computations and layers.
• Accuracy: YOLOv4 generally has higher object detection accuracy because it uses more sophisticated training methods such as Scaled-YOLOv4 and YOLOv4-P5. However, due to the more complex architecture, it works slower than YOLOv5.
• Image processing: YOLOv5 has better object detection accuracy on small images because it uses high-resolution image processing techniques, whereas YOLOv4 tends to overfit on small images.
• Computational resource requirements: YOLOv5 generally requires fewer resources than YOLOv4 because it has fewer layers and operations. This makes it more popular for use on resource-constrained devices such as mobile phones.
YOLOv5 training requires a special data structure; an example of such a structure is shown in Figure 5.

Figure 5: An example of a data structure for model training

The comparative result of model training is shown in Figure 6.

Figure 6: The result of the model of finding players and the ball

The upper image shows what the video looks like before processing with the model, and the lower image shows what it looks like after. A minimal sketch of running such a detector on a single frame is given below.
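For illustration, the sketch below runs a trained YOLOv5 detector on a single frame through the model-loading interface of the ultralytics/yolov5 repository. The weights file name and the class-index mapping are hypothetical placeholders; this is not the exact inference code of the described system.

import torch
import cv2

# Load custom-trained YOLOv5 weights through the repository's torch.hub entry point
# (assumes the hypothetical weights file players_ball.pt exists locally).
model = torch.hub.load('ultralytics/yolov5', 'custom', path='players_ball.pt')

frame = cv2.imread('match_frame.jpg')          # BGR frame from the match video
results = model(frame[..., ::-1])              # the model expects RGB input

# results.xyxy[0] is an (N, 6) tensor: x1, y1, x2, y2, confidence, class index.
for x1, y1, x2, y2, conf, cls in results.xyxy[0].tolist():
    label = 'ball' if int(cls) == 1 else 'player'   # hypothetical class mapping
    print(f"{label}: ({x1:.0f}, {y1:.0f}) - ({x2:.0f}, {y2:.0f}), conf={conf:.2f}")

The foot points of the player boxes (for example, the bottom centre of each rectangle) are what is later projected onto the pitch with the homography matrix.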
4.3 Determining the position of the players and the ball on the pitch

To determine the position of the players on the pitch, the homography matrix must be found first. For this purpose, the algorithm trained on synthetic data was chosen, because its accuracy is sufficient for the given task. An example of the operation of the algorithm is shown in Figure 7.

Figure 7: An example of the operation of the homography determination algorithm

The Two-GAN model consists of two pix2pix models: one is responsible for segmenting the grass of the pitch, and the other is responsible for finding the lines of the pitch. Pix2pix is a deep learning model used to generate images using conditional GANs (generative adversarial networks). It is capable of generating high-quality images that match the input data. The pix2pix model uses pairs of images, input and output, for training. For example, the input can be a black-and-white image and the output a color image; the model is trained to find the relationship between the input and output images using conditional GANs. An example of the operation of such a model is shown in Figure 8.

Figure 8: An example of the pix2pix model

In the training process, pix2pix generates images from stochastic noise, which are then passed to a discriminator that recognizes whether the image is plausible. Separately, the image is passed to the input of the generator, which creates a new output image based on the input image and the learned dependencies. The pix2pix model is very flexible, as it can be used for many image generation tasks, such as creating photorealistic images from artificial descriptions, transforming image styles, generating a city map from a satellite image, and more. However, for the pix2pix model to work successfully, a large amount of training data and high-powered computing resources are needed, especially when large images are used. An example of the trained model combined with the player and ball detection model is shown in Figure 9.

Figure 9: The result of combining the player finding model and the homography matrix definition model

As can be seen from the images, the algorithm performs quite accurately, considering that only one camera is used.

4.4 Re-identification of players

To re-identify players, SportsReID technology is used, which is specifically trained to re-identify players in different frames. SportsReID contains several different types of models; their comparative characteristics are shown in Table 1.

Table 1
Comparative characteristics of models

Name             Size      Resolution   mAP    rank-1
ResNet50-fc512   24.6M     256x128      81.8   76.1
OSNet_x1_0       2.2M      256x128      83.4   78.0
DeiT-Tiny/16     5.5M      224x224      82.2   76.2
DeiT-S/16        21.7M     224x224      84.3   79.4
ViT-B/16         57.7M     224x224      86.0   81.5
ViT-L/16*        303.6M    224x224      89.8   86.7

The models are compared using the mAP and rank-1 metrics. mAP (mean average precision) and rank-1 are metrics used to evaluate the effectiveness of computer vision algorithms in the tasks of recognizing objects or people in images or videos. mAP measures the average precision of object recognition in an image and takes into account both the accuracy of the found objects and their number. Usually, to calculate mAP, a threshold for marking an object as found or not found is set, the recognition precision is calculated and it is checked whether the correct number of objects was found; the final result is the average precision value over the thresholds. Rank-1 measures the accuracy of recognizing people in an image or video: it indicates what percentage of people were recognized correctly when compared with the database. Ranking is performed using different algorithms, for example the Euclidean distance or the cosine similarity between feature vectors. Typically, these metrics are used to compare the performance of different algorithms and models for object or person recognition, helping researchers figure out which algorithm is most effective for a particular task. A minimal sketch of computing rank-1 accuracy from such feature vectors is given below.
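To make the rank-1 definition concrete, the sketch below compares query embeddings with a labeled gallery using cosine similarity and reports the fraction of queries whose nearest gallery entry has the correct identity. The embeddings are random placeholders standing in for the vectors produced by a re-identification model.

import numpy as np

def rank1_accuracy(query_feats, query_ids, gallery_feats, gallery_ids):
    # Cosine similarity between every query and every gallery embedding.
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = q @ g.T
    best = gallery_ids[np.argmax(sims, axis=1)]   # identity of the closest gallery entry
    return float(np.mean(best == query_ids))

# Placeholder data: 20 gallery embeddings of 10 identities (512-D) and 5 noisy queries.
rng = np.random.default_rng(0)
gallery_feats = rng.normal(size=(20, 512))
gallery_ids = np.repeat(np.arange(10), 2)
idx = np.array([0, 4, 8, 12, 16])                 # pick 5 gallery entries as query sources
query_feats = gallery_feats[idx] + 0.05 * rng.normal(size=(5, 512))
query_ids = gallery_ids[idx]
print("rank-1:", rank1_accuracy(query_feats, query_ids, gallery_feats, gallery_ids))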
According to the table, the most suitable option is the model with 2.2M parameters (OSNet_x1_0) and an image size of 256x128. Two frames 100 frames apart were selected to test the re-identification performance. For better accuracy, 10 neighboring frames were selected, and object detection with tracking via DeepSort was used. After unique identities were obtained from DeepSort, all detected football players were cropped and converted into vectors using SportsReID. The distance between all vectors was then calculated using the cosine distance formula. An example of how the cosine distance works is shown in Figure 10.

Figure 10: An example of cosine distance operation

After the distances are found, the best candidate among all of them is selected, and it is checked whether the distance is less than a certain threshold. If so, the candidate is kept and the process continues. The result of finding candidates is shown in Figure 11.

Figure 11: Candidates and their counterparts

The players being searched for are shown at the top and their candidates below. In total, 9 correspondences between the frames can be found. In general, the accuracy of the technique is sufficient for the initial tasks, but it needs to be refined in the future.

4.5 Calculation of statistical data

For a football player, the importance of different metrics may depend on the role he plays on the pitch.
Time of ball possession. This metric indicates how long a player keeps the ball at his feet. It is important for players who are responsible for building up the team's attack: the longer a player holds the ball, the more time he has to make the right decision and pass the ball to his partners. To calculate this metric, information about the position of the ball and the player in the frame is used. If the ball is close to the player (within 30 pixels) or the last coordinates of the ball coincide with the coordinates of the player's feet, the time of possession is credited to that player. The frame rate of the video is also taken into account for a more accurate calculation of the metric.
Distance traveled. This metric indicates how many meters the player covered during the match. It is important for players who are responsible for covering a large area of the pitch: forwards, defenders and midfielders must move from their side to the opposing side and back to help their team in attack and defence. Data for this metric is collected only when the player is visible in the frame. Throughout the video, the homography matrix is calculated and compared with the player's movement, which gives the approximate number of meters he has traveled.
Number of passes. This metric indicates how many times a player has passed the ball to his partners. It is important for players who are responsible for organizing the team's attacks: midfield and attacking players usually need to be good passers, as their passes can lead to goals or other scoring opportunities. To calculate this metric, the positions of the players and the ball are used. If the ball was close to a player (within 30 pixels), or the trajectory of the ball started at the feet of one player and ended in the near zone or at the feet of another player of the same team, such an event is counted as a pass. A simplified sketch of these possession and pass rules is given below.
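The sketch below shows, in simplified form, how such rules can be applied to per-frame tracking output. The data structures, the use of a 30-pixel radius and the team labels are illustrative assumptions rather than the exact implementation of the methodology.

import math

PROXIMITY_PX = 30   # distance at which the ball is considered to be at a player's feet

def closest_player(ball_xy, players):
    # players: list of dicts {"id": ..., "team": ..., "feet": (x, y)} for one frame
    dists = [(math.dist(ball_xy, p["feet"]), p) for p in players]
    d, p = min(dists, key=lambda t: t[0])
    return p if d <= PROXIMITY_PX else None

def possession_and_passes(frames, fps):
    # frames: list of (ball_xy, players) tuples in chronological order
    possession = {}          # player id -> seconds of ball possession
    passes = {}              # player id -> number of completed passes
    last_owner = None
    for ball_xy, players in frames:
        owner = closest_player(ball_xy, players)
        if owner is not None:
            possession[owner["id"]] = possession.get(owner["id"], 0.0) + 1.0 / fps
            if last_owner is not None and owner["id"] != last_owner["id"]:
                if owner["team"] == last_owner["team"]:
                    passes[last_owner["id"]] = passes.get(last_owner["id"], 0) + 1
            last_owner = owner
    return possession, passes

An interception could be counted in the same loop when the new owner belongs to the opposite team, as described next.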
Number of interceptions. This metric indicates how many times a player stopped an opponent's attack by intercepting the ball. It is important for players who are responsible for the team's defense: defensive and midfield players usually need to be good interceptors, as their ability to stop opposing attacks can be critical to a team's success. If the ball was close to a player (within 30 pixels), or the trajectory of the ball started at the feet of one player and ended in the near zone or at the feet of a player of the opposite team, such an event is counted as an interception.
Time spent on own side and time spent on the opponent's side. These metrics indicate how much time a player has spent on his own side and on the opponent's side. This is important for all players, as it helps to understand where and how a player spends his time on the pitch and how this can be used to help the team succeed. To calculate these metrics, information about the position on the pitch is used, which is obtained with the homography matrix.
Overall, these metrics help players and coaches analyze and improve the performance of the players and the team as a whole, and if a player improves in one of these metrics, it can have an immediate impact on the team's success.

4.6 Application Programming Interface (API)

An API (Application Programming Interface) is a set of protocols, tools and standards used to develop software and to ensure interaction between various software components. An API defines how different software components should interact with each other and what actions and operations can be performed by these components. An example of the information returned by the API is shown in Figure 12.

Figure 12: Example of information returned from the API

The application API is implemented in Python using Flask, a lightweight framework for building web applications in Python. In this case the API implements the following functions:
• Receiving video: the API accepts a video that needs to be decoded for further processing.
• Video processing: the video is decoded and processed using the machine learning models to derive the metrics and player data. These metrics include time of possession, distance covered, number of passes, number of interceptions and other important characteristics.
• Returning results: the last step of the API is to return the results of video processing as a JSON object.
The overall API architecture includes components such as a router, the machine learning models, a database and others. A minimal sketch of such an endpoint is given below.
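The following sketch shows what a minimal version of such an endpoint could look like in Flask. The route name, the processing function and the returned fields are hypothetical placeholders; the actual API of the system may differ.

from flask import Flask, request, jsonify

app = Flask(__name__)

def process_match_video(path):
    # Placeholder for the full pipeline: detection, homography estimation,
    # re-identification and metric calculation would run here.
    return {"players": [{"id": 7, "possession_s": 312.4, "distance_m": 10250,
                         "passes": 48, "interceptions": 5}]}

@app.route("/api/v1/statistics", methods=["POST"])
def statistics():
    # Receive the uploaded match video, store it and return the computed metrics as JSON.
    video = request.files["video"]
    path = "/tmp/upload.mp4"
    video.save(path)
    return jsonify(process_match_video(path))

if __name__ == "__main__":
    app.run(port=5000)

A client can then POST a match video to the hypothetical /api/v1/statistics route and receive the per-player metrics as a JSON object, which corresponds to the returning-results step described above.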
5. Conclusions

In this article, a methodology for the formation and processing of statistical information about sports matches based on the use of neural networks was developed. The YOLOv5 model was used to locate the players and the ball in the image; it is fast and accurate, so it is well suited to the task. To determine the position of the players on the pitch, pix2pix neural networks were used. They were trained on images of players and their positions on the pitch, allowing them to accurately identify players' positions in new images. SportsReID technology was used to re-identify players; it makes it possible to distinguish one player from another based on their appearance and movements. The creation of an API in Python using Flask for obtaining statistical information about players and their actions on the pitch was also described. This allows coaches and analysts to obtain valuable information for improving team strategy and tactics. Further research may focus on improving the accuracy and speed of the developed methodology and on enriching the set of statistics. The scientific novelty of the obtained results lies in the fact that the developed automated method of generating statistical data has great potential for application in the industry of football match analysis. The results can be used for further research in the field of automating the formation of statistics, and they can also be used to improve work in other areas related to sports.

6. References

[1] J. Assfalg, M. Bertini, C. Colombo, A. Del Bimbo, W. Nunziati, Semantic annotation of soccer videos: automatic highlights identification, Computer Vision and Image Understanding (2003) 285–305. doi: 10.1016/j.cviu.2003.06.004.
[2] Ch. Perin, R. Vuillemot, J.-D. Fekete, SoccerStories: A Kick-off for Visual Soccer Analysis, IEEE Transactions on Visualization and Computer Graphics 19(12) (2013) 2506–2515. doi: 10.1109/TVCG.2013.192.
[3] M. Stein, H. Janetzko, A. Lamprecht, T. Breitkreutz, P. Zimmermann, B. Goldlucke, T. Schreck, G. Andrienko, M. Grossniklaus, D. A. Keim, Bring it to the Pitch: Combining Video and Movement Data to Enhance Team Sport Analysis, IEEE Transactions on Visualization and Computer Graphics (2018) 13–22. doi: 10.1109/TVCG.2017.2745181.
[4] D. G. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision (2004). URL: ijcv04.pdf (ubc.ca).
[5] S. Afzal, S. Ghani, M. M. Hittawe, S. F. Rashid, Visualization and Visual Analytics Approaches for Image and Video Datasets: A Survey, ACM Transactions on Interactive Intelligent Systems 13 (2023). doi: 10.1145/3576935.
[6] N. Ding, W. Jin, K. Takeda, Y. Bei, K. Fujii, Estimation of control area in badminton doubles with pose information from top and back view drone videos, Multimedia Tools and Applications (2023). doi: 10.1007/s11042-023-16362-1.
[7] Z. Chen, Q. Yang, J. Shan, T. Lin, J. Beyer, H. Xia, H. Pfister, iBall: Augmenting Basketball Videos with Gaze-moderated Embedded Visualizations, Human-Computer Interaction (2023). doi: 10.48550/arXiv.2303.03476.
[8] M. B. Jurca, A modern approach for positional football analysis using computer vision, in: Proceedings of the 2022 IEEE 18th International Conference on Intelligent Computer Communication and Processing (ICCP), Sept. 2022. doi: 10.1109/ICCP56966.2022.10053962.
[9] S. Rahimi, A. Moore, P. A. Whigham, A vector-agent approach to (spatiotemporal) movement modelling and reasoning, Scientific Reports (2022). doi: 10.1038/s41598-022-22056-9.
[10] Z. Chen, J. Beyer, H. Pfister, Q. Yang, H. Xia, X. Xie, Y. Wu, Sporthesia: Augmenting Sports Videos Using Natural Language, IEEE Transactions on Visualization and Computer Graphics, Oct. 2022. doi: 10.1109/TVCG.2022.3209497.
[11] J. Wang, K. Hu, Zh. Zhou, J. Ma, Tac-Trainer: A Visual Analytics System for IoT-based Racket Sports Training, IEEE Transactions on Visualization and Computer Graphics, Oct. 2022. doi: 10.1109/TVCG.2022.3209352.
[12] X. Xie, Y. Wu, D. Deng, Y. Wu, OBTracker: Visual Analytics of Off-ball Movements in Basketball, IEEE Transactions on Visualization and Computer Graphics, Sept. 2022. doi: 10.1109/TVCG.2022.3209373.
[13] B. Jackson, T. Y. Lau, D. Schroeder, K. C. Toussaint Jr., D. F. Keefe, A lightweight tangible 3D interface for interactive visualization of thin fiber structures, IEEE Transactions on Visualization and Computer Graphics, Dec. 2013. doi: 10.1109/TVCG.2013.121.
[14] K. Moreland, A survey of visualization pipelines, IEEE Transactions on Visualization and Computer Graphics, Mar. 2013. doi: 10.1109/TVCG.2012.133.
[15] S. Rahimi, A. B. Moore, P. A. Whigham, A vector-agent approach to (spatiotemporal) movement modelling and reasoning, Scientific Reports (2022). doi: 10.1038/s41598-022-22056-9.
[16] J. Beernaerts, B. De Baets, M. Lenoir, N. Van de Weghe, Qualitative Team Formation Analysis in Football: A Case Study of the 2018 FIFA World Cup, Frontiers in Psychology (2022). doi: 10.3389/fpsyg.2022.863216.
[17] A. Benito Santos, R. Theron, A. Losada, J. E. Sampaio, C. Lago-Peñas, Data-Driven Visual Performance Analysis in Soccer: An Exploratory Prototype, Frontiers in Psychology (2018). doi: 10.3389/fpsyg.2018.02416.
[18] C. D. Stolper, M. Kahng, Z. Lin, F. Foerster, A. Goel, J. Stasko, D. H. Chau, GLO-STIX: Graph-Level Operations for Specifying Techniques and Interactive eXploration, IEEE Transactions on Visualization and Computer Graphics, Dec. 2014. doi: 10.1109/TVCG.2014.2346444.
[19] F. Lord, D. B. Pyne, M. Welvaert, J. K. Mara, Capture, analyse, visualise: An exemplar of performance analysis in practice in field hockey, PLoS One (2022). doi: 10.1371/journal.pone.0268171.
[20] P. Isenberg, P. Dragicevic, W. Willett, A. Bezerianos, J.-D. Fekete, Hybrid-image visualization for large viewing environments, IEEE Transactions on Visualization and Computer Graphics, Dec. 2013. doi: 10.1109/TVCG.2013.163.
[21] X. Hu, L. Bradel, D. Maiti, L. House, C. North, S. Leman, Semantics of directly manipulating spatializations, IEEE Transactions on Visualization and Computer Graphics, Dec. 2013. doi: 10.1109/TVCG.2013.188.
[22] M. Krzywinski, I. Birol, S. J. M. Jones, M. A. Marra, Hive plots: rational approach to visualizing networks, Briefings in Bioinformatics 13(5) (2012) 627–644. doi: 10.1093/bib/bbr069.
[23] I. Konovalenko, P. Maruschak, V. Brevus, Steel surface defect detection using an ensemble of deep residual neural networks, Journal of Computing and Information Science in Engineering 22(1) (2022) 014501. doi: 10.1115/1.4051435.
[24] I. V. Lytvynenko, P. O. Maruschak, S. A. Lupenko, Yu. I. Hats, A. Menou, S. V. Panin, Software for segmentation, statistical analysis and modeling of surface ordered structures, AIP Conference Proceedings 1785, 030012 (2016). doi: 10.1063/1.4967033.
[25] G. Shymchuk, I. Lytvynenko, R. Hromyak, S. Lytvynenko, V. Hotovych, Gas Consumption Forecasting Using Machine Learning Methods and Taking into Account Climatic Indicators, CEUR Workshop Proceedings 3468 (2023) 156–163.