© 2021 Copyright for this paper by its authors. Use permitted under
Creative Commons License Attribution 4.0 International (CC BY 4.0)
Prediction of Football Games Results
Roman Nestoruk 1 and Grzegorz Słowiński 2
1 Sollers Consulting Sp. z o.o., ul. Koszykowa 54, Warsaw 00-675, Poland
2 University of Technology and Economics, Engineering Department, ul. Jagiellońska 82f, Warsaw 03-301, Poland
Abstract
Three machine learning models are created using datasets of 50, 100 and 200 games. All the models are
built using deep learning (DL) and machine learning (ML) techniques, with the goal of showing that ML
algorithms can be used to predict football game results. The data set consists of real game results
collected from the most recognizable tournaments: the English Premier League, Italian Serie A, German
Bundesliga, Spanish La Liga and French Ligue 1. The targets of the work are prediction of the exact game
score (average accuracy obtained after the last wave of testing: 11.6%) and prediction of the game result
(average accuracy obtained after the last wave of testing: 39%).
Keywords
machine learning, football games prediction, deep learning
1. Introduction
Most people think of football as an unpredictable and sometimes illogical game, but we live in the
21st century, where technology has become one of the biggest parts of our lives. We use virtual
assistants, image and voice recognition and autopilots, and we are close to the era of self-driving
cars. The brain behind all these developments is Artificial Intelligence, with neural networks inside.
We think these technologies are very helpful for achieving the main target of this work: showing that
even football, where every match consists of thousands of different moments, can be predicted by
Artificial Intelligence better than by a benchmark.
2. Tools and Technology Used
Because football statistics are not available as data files or through an API, a scraping algorithm
is needed. To avoid adding extra languages to the existing stack, the scraping algorithm was written
in Python, using the Selenium WebDriver framework and the BeautifulSoup4 library. TensorFlow and Keras
were used for the machine learning processes, and the CSV library for storing data.
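The storage step can be illustrated with a minimal sketch using Python's standard csv module. The column names, the helper functions write_team_stats and read_team_stats, and the sample values are all hypothetical; the paper does not specify its CSV schema, and the scraping step itself is not shown here.

```python
import csv
import io

# Hypothetical column names; the paper does not list its CSV schema.
FIELDNAMES = ["team", "goals_per_game", "possession", "pass_accuracy", "shots_per_game"]


def write_team_stats(rows, file_obj):
    """Write a list of per-team stat dictionaries as CSV with a header row."""
    writer = csv.DictWriter(file_obj, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


def read_team_stats(file_obj):
    """Read the CSV back, converting every numeric column to float."""
    stats = []
    for row in csv.DictReader(file_obj):
        parsed = {"team": row["team"]}
        for key in FIELDNAMES[1:]:
            parsed[key] = float(row[key])
        stats.append(parsed)
    return stats


# Made-up values standing in for the output of the scraper.
sample_rows = [
    {"team": "Team A", "goals_per_game": 2.1, "possession": 58.3,
     "pass_accuracy": 86.4, "shots_per_game": 15.2},
    {"team": "Team B", "goals_per_game": 1.3, "possession": 47.9,
     "pass_accuracy": 79.8, "shots_per_game": 11.0},
]

# In-memory buffer used in place of a real file opened with newline="".
buffer = io.StringIO()
write_team_stats(sample_rows, buffer)
buffer.seek(0)
loaded = read_team_stats(buffer)
```

Writing to a real file would only require replacing the buffer with open(path, "w", newline=""), which is the form the csv module documentation recommends.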
3. Data for training and validation
Possession and shots are among the most widely recognized statistics in football, but for this
algorithm some additional data are also useful:
• Average game mark: Shows the performance of the team during the season.
• Average number of goals per game: The number of goals scored by the team under consideration,
divided by the number of games played.
• Average possession: Average percentage of possession of the ball during the games.
• Pass accuracy: Computed by dividing the number of successfully completed passes by the number of
all passes attempted by the team.
• Shots per game: Anyone connected to football knows that goals are mainly the result of
shots.
• Average player mark for the most probable starting line-up: Shows the performance of every single
player during the season.
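The ratio-based features in the list above can be sketched as follows. This is only an illustration: the SeasonTotals class, its field names and the sample totals are hypothetical, and expressing pass accuracy as a percentage is an assumption, since the paper only states that it is a quotient.

```python
from dataclasses import dataclass


@dataclass
class SeasonTotals:
    """Raw season totals for one team (hypothetical field names)."""
    games_played: int
    goals_scored: int
    passes_completed: int
    passes_attempted: int
    shots: int


def goals_per_game(totals):
    """Goals scored divided by the number of games played."""
    return totals.goals_scored / totals.games_played


def pass_accuracy(totals):
    """Completed passes divided by all passes, as a percentage (assumption)."""
    return 100.0 * totals.passes_completed / totals.passes_attempted


def shots_per_game(totals):
    """Shots divided by the number of games played."""
    return totals.shots / totals.games_played


def feature_vector(totals):
    """Collect the ratio-based features into one list for a model input."""
    return [goals_per_game(totals), pass_accuracy(totals), shots_per_game(totals)]


# Made-up totals for a 38-game season.
totals = SeasonTotals(
    games_played=38,
    goals_scored=76,
    passes_completed=17000,
    passes_attempted=20000,
    shots=570,
)
features = feature_vector(totals)  # [2.0, 85.0, 15.0]
```

The marks and the average possession are omitted from the sketch because the paper takes them as already-averaged values rather than deriving them from totals.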