Prediction of Football Games Results

Roman Nestoruk 1 and Grzegorz Słowiński 2

1 Sollers Consulting Sp. z o.o., ul. Koszykowa 54, Warsaw 00-675, Poland
2 University of Technology and Economics, Engineering Department, ul. Jagiellońska 82f, Warsaw 03-301, Poland

Abstract
Three machine learning models were created using datasets of 50, 100 and 200 games. All the models were built with deep learning (DL) and machine learning (ML) techniques, with the goal of showing that even ML algorithms can be used to predict football game results. The dataset consists of real game results collected from the most recognizable tournaments: the English Premier League, Italian Serie A, German Bundesliga, Spanish La Liga and French Ligue 1. The target values of the work are the exact game score (average accuracy after the last wave of testing: 11.6%) and the game result (average accuracy after the last wave of testing: 39%).

Keywords
machine learning, football games prediction, deep learning

1. Introduction
Most people think of football as an unpredictable and sometimes illogical game, but we live in the 21st century, where technology has become one of the biggest parts of our lives. We use virtual assistants, image and voice recognition and autopilots, and we are approaching the era of self-driving cars. The brain behind all these developments is Artificial Intelligence, built on neural networks. We believe these technologies can help achieve the main goal of this work: showing that even football, where every match consists of thousands of different moments, can be predicted by Artificial Intelligence better than by a benchmark.

2. Used Tools and technology
Football statistics are not available as data files or through an API, so a scraping algorithm is needed.
To avoid adding extra languages to the existing stack, the scraping algorithm was written in Python using the Selenium WebDriver framework and the BeautifulSoup4 library. TensorFlow and Keras were used for the machine learning processes, and the CSV library for storing data.

3. Data for training and validation
Possession and shots are among the most widely recognized football statistics, but for this algorithm some additional data are also useful:
• Average game mark: shows the performance of the team during the season.
• Average number of goals per game: the number of goals scored by the team in question divided by the number of games played.
• Average possession: the average percentage of ball possession during games.
• Pass accuracy: the number of successfully completed passes divided by the total number of passes made by the team.
• Shots per game: anyone connected to football knows that goals are mainly the result of shots.
• Average player marks for the most likely starting line-up: show the performance of every single player during the season.

© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)

Figure 1: Table with player’s mark

4. Model creation
For this experiment, a model with 3 dense layers is used. As shown in Figure 2, the model consists of:
• Hidden layer 1: consists of 56 units, with ReLU activation.
• Hidden layer 2: consists of 28 units, with ReLU activation.
• Output layer: consists of just 2 units, with linear activation.

Figure 2: Model summary

Figure 3: ReLU activation function graph [Source: 8]

In the first two layers, ReLU activation sets all negative values to zero, which suits the task, as a team cannot score -1 goals.

4.1. Data preparation
Considering that a team almost never scores more than 10 goals in a football game, the expected result was transformed to a value in the 0-1 range by dividing it by 10.
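The scaling step can be sketched in Python, the language used for the rest of the toolchain. This is a minimal sketch only: the names `MAX_GOALS`, `scale_score` and `unscale_score` are illustrative assumptions and do not come from the original code.

```python
from typing import Tuple

# Assumed upper bound on the goals one team scores, as stated in the text.
MAX_GOALS = 10


def scale_score(home_goals: int, away_goals: int) -> Tuple[float, float]:
    """Map an integer score to the 0-1 range used as the training target."""
    return home_goals / MAX_GOALS, away_goals / MAX_GOALS


def unscale_score(home: float, away: float) -> Tuple[float, float]:
    """Map a model output back to the goal scale (not yet rounded)."""
    return home * MAX_GOALS, away * MAX_GOALS


if __name__ == "__main__":
    target = scale_score(2, 0)
    print(target, unscale_score(*target))
```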
For example: actual score 1:3, score after transformation 0.1:0.3. To better validate the result, an extra 10% of the data was used for testing and validation of the model.

5. Models structure and usage
As a result of the experiment, 3 models were created:
• Model 1: trained on 50 examples of games with no unexpected result and validated on 5 extra examples.
• Model 2: trained on 100 examples of games with no unexpected result and validated on 10 extra examples.
• Model 3: trained on 200 examples and validated on 20 extra examples.

To summarize how effective the models are in day-to-day game prediction, statistics for the upcoming days' games were taken as input data for the models.

Figure 4: Example of information used for the result prediction

To simplify validation, the models' predictions are stored in a table with the following format:
• Team playing home as t1
• Team playing away as t2
• Number of goals predicted by the first model for the home team as m1t1
• Number of goals predicted by the first model for the away team as m1t2
• Number of goals predicted by the second model for the home team as m2t1
• Number of goals predicted by the second model for the away team as m2t2
• Number of goals predicted by the last model for the home team as m3t1
• Number of goals predicted by the last model for the away team as m3t2

Figure 5: Data stored into the “results” variable.

6. Models evaluation
After all the matches we were interested in had finished, we could start comparing our predictions with the actual results. To simplify result verification, the output of the DL model has to be converted to an integer; for that purpose, the values were multiplied by 10, reversing the scaling from Section 4.1. The regular rules of rounding were used for this conversion:
• Values equal to or greater than an integer and a half are rounded up to the nearest higher integer. For example: 1.5252 -> 2, 2.9842 -> 3, 0.5 -> 1.
• Values less than an integer and a half are rounded down to the nearest lower integer.
For example: 0.4999 -> 0, 1.1 -> 1, 2.332 -> 2.

Following these rules, a result like 0.5 vs 0.49 is considered 1 vs 0, but a result of 1.49 vs 0.5 is considered 1 vs 1.

The best-known baseline for prediction is random guessing. The probability of randomly guessing the result of any football game is 1 divided by the number of possible outcomes, 3 (home win, draw or away win), that is 33.3%. The probability of predicting the exact score of the game is more complicated, because all possible score combinations have to be considered. The chance of a team scoring more than 4 goals is too small to be considered. So, to calculate the chance of predicting the exact score, we count the combinations of 5 elements (scores from 0 to 4) in 2 places (for the 2 playing teams, home and away) using Formula 6 with n = 5 and r = 2. The calculation gives 15, so the chance of predicting the exact score of the game is 1 in 15, or 6.7%.

$$C(n + r - 1, r) = \frac{(n + r - 1)!}{r!\,(n - 1)!}$$

Formula 6: Combination with repetition

6.1. Model 1 evaluation
As mentioned in Section 5, model 1 was trained on the smallest amount of real data: 50 games.
• The average percentage of correctly predicted exact scores is 10.3%, around 1.5 times the mathematical chance of predicting it. Of course, 0 predicted games for 02-07-20 does not look promising, but there was only a very small number of games to predict. On the following days this share increased and was more than 2 times greater than the mathematical probability.
• The average percentage for the predicted winner or draw category is much higher. As Figure 8 shows, the first day failed again. On the following days the results are 4 and 3 times higher than on 02-07-20, but this time the average is below the mathematical probability.
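The evaluation procedure described in this chapter (rounding, classifying the outcome, and comparing against the two random-guess baselines) can be sketched as follows. This is a sketch under stated assumptions, not the original evaluation code: the function names `round_half_up`, `outcome` and `accuracy` are hypothetical, and the sample predictions are made up for illustration. Python's built-in `round` is avoided on purpose, because it rounds halves to the nearest even integer (`round(0.5)` gives 0), which would not match the rule that 0.5 becomes 1.

```python
import math


def round_half_up(value: float) -> int:
    """Round a non-negative goal count so that x.5 always goes up."""
    return math.floor(value + 0.5)


def outcome(home: int, away: int) -> str:
    """Classify a score as a home win, a draw or an away win."""
    if home > away:
        return "home"
    if home < away:
        return "away"
    return "draw"


def accuracy(predicted, actual):
    """Return (exact-score accuracy, winner-or-draw accuracy).

    predicted: list of (home, away) floats already multiplied back by 10.
    actual: list of (home, away) integer final scores.
    """
    exact = 0
    result = 0
    for (p_home, p_away), (a_home, a_away) in zip(predicted, actual):
        r_home, r_away = round_half_up(p_home), round_half_up(p_away)
        exact += (r_home, r_away) == (a_home, a_away)
        result += outcome(r_home, r_away) == outcome(a_home, a_away)
    return exact / len(actual), result / len(actual)


# Random-guess baselines as computed in the text.
RESULT_BASELINE = 1 / 3                        # home win, draw, away win
EXACT_BASELINE = 1 / math.comb(5 + 2 - 1, 2)   # Formula 6 with n = 5, r = 2

# Made-up predictions and final scores, for illustration only.
predicted = [(0.5, 0.49), (1.49, 0.5), (2.2, 1.0)]
actual = [(1, 0), (1, 1), (3, 1)]
exact_acc, result_acc = accuracy(predicted, actual)
print(f"exact score: {exact_acc:.1%} (baseline {EXACT_BASELINE:.1%})")
print(f"winner or draw: {result_acc:.1%} (baseline {RESULT_BASELINE:.1%})")
```

Formula 6 counts unordered pairs of scores, which is how the 6.7% baseline is obtained. If home and away goals are treated as an ordered pair, the same five values give 5 × 5 = 25 possible scorelines and a baseline of 4%.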
Figure 7: Statistic of prediction from model 1

Figure 8: Diagram for the statistic of prediction from model 1

6.2. Model 2 evaluation
Model 2 was trained on a larger amount of real data: 100 games.
• This time we can see good progress in the average percentage of predicted games, mainly because of the games predicted on 02-07-20. The average percentage for the exact score category was much more stable and is about 2 times the mathematical probability.
• For the predicted winner or draw category we again see a big difference in the first row, but almost identical results on days two and three. The average stands at 44.2%, which is about 33% better than the mathematical probability of predicting a game.

Figure 9: Statistic of prediction from model 2

Figure 10: Diagram for the statistic of prediction from model 2

6.3. Model 3 evaluation
Model 3 was trained on the largest amount of real data: 200 games.
• The last model under evaluation shows a very interesting result. The average percentage of predicted exact scores is again above the mathematical chance, at around 10%, and is much more stable than for the first model.
• In the predicted winner or draw category we can see a big difference in percentages: the first day was predicted remarkably well, with 6 out of 8 games correct. This is more than two times higher than the random-guess probability. But on the next two days this percentage drops to an extremely low level.

Figure 11: Statistic of prediction from model 3

Figure 12: Diagram for statistic of prediction from model 3

6.4. Models comparison
For the comparison of the models, we use the predictions they made for real games.
To discuss the main advantages and disadvantages of the models, we concentrate on only 3 dates: 02-07-20, 11-07-20 and 12-07-20. All of them are very useful for finding problems that can be addressed when training a future model.

6.4.1. First day of experiment
At first glance, the models' predictions for the first day are somewhat disappointing. Two of them did very well and predicted more than 60% of the games, but one completely failed the experiment. To find the reason, we should look at the games we were trying to predict (Figure 13).

Figure 13: Games that took place on the date of the experiment (02-07-20)

Most of these games were very important because they were played between neighbours in the league table. The difference in the data was very small, but in almost all the games, except Roma vs Udinese, the team with slightly better statistics won. After this analysis, we can assume that model 1 had a very small amount of data of varying quality, which leads to a wider range of possible results. For example, imagine a ball falling to the ground, where we need to predict the height to which the ball will rise after the bounce. If we have seen too few examples, with different results, we will reason like this: the ball might have low pressure and rise to only 40% of the initial height, or it might be made of a different material and bounce to 60% or even higher. A neural network tries to take as much of the data into account as it can, so some examples from the dataset can only lead to mistakes in prediction.

Now we can look at model 3. For the first day it has an amazing result, so why is there such a big drop on the following days? The problem is very similar to the one above.
Because we have a large amount of data saying, for example, "the ball almost always bounces to 50% of the initial height", the model will ignore any additional information and always make the same kind of prediction. This problem is usually called overtraining.

Let us finally discuss model 2. Here we have a medium amount of data, so after training the model "thinks": usually the ball bounces to 50% of the initial height, but let us also consider the materials the ball was made from, and so on. This is the reason why this model does not fail badly on any of the game days.

6.4.2. Second day of experiment
After reviewing day 2, we can see how the overtraining problem affects forecasting. According to our statistics, the best prediction for this day was made by the first model. To draw the right conclusion about this day, we should look at the games and their results (Figure 14) from a football analytics point of view. Some of them finished with surprising results that were hard to predict even for specialists in analytics, such as the draw between the English champion and a team from the second half of the table.

Figure 14: Games that took place on the date of the experiment (11-07-20)

The first model showed the best result, 12 out of 25 or 48% correct predictions in the "Predicted winner or draw" category, while the second model also worked well, with 40% of games predicted. The worst result was shown by the third model, 7 out of 25 or 28%, which is still acceptable if it happens rarely.

6.4.3. Third day of experiment
All of the models have similar results, but they predicted different games. Because there were 20 games with different kinds of teams, we can see a very good percentage of exact predictions, at 10-15%. But this variety of teams also has a big impact on the number of predicted winners/draws: all of the models are at 25-30%.
For this day, as for the previous one, the worst result was produced by model 3.

Figure 15: Games that took place on the date of the experiment (12-07-20)

7. Summary
The main idea of this work is to create a new kind of working model for football result prediction. While others try to predict the winner of the game, we decided to try to predict the number of goals each team should score. After these experiments, we can conclude that the amount of data used for training is not the main factor in successful predictions. Success in this field depends mainly on the quality of the data and a little bit of luck. To obtain these 3 neural networks, we made more than 500 attempts to fit them, because for a sport like football the models should have some understanding of the usefulness of every component, and we cannot always get the same result even with the same statistics.

8. References
[1] Maureen Caudill, "Neural Network Primer: Part I", February 1989
[2] Francois Chollet, Deep Learning with Python, 10 January 2018
[3] Andreas C. Müller, Sarah Guido, Introduction to Machine Learning with Python: A Guide for Data Scientists, 2015
[4] Sebastian Raschka, Vahid Mirjalili, Python Machine Learning: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow, 2 December 2019
[5] Yann LeCun, Gökhan BakIr, Thomas Hofmann, Bernhard Schölkopf, Alexander J. Smola, Predicting Structured Data (Neural Information Processing series), July 2007
[6] David J. Livingstone, Artificial Neural Networks, 2009
[7] https://medium.com/@toprak.mhmt/activation-functions-for-deep-learning-13d8b9b20e Access date: 11.07.2021