\documentclass[10pt,a4paper]{article}

% Packages
\usepackage{fancyhdr} % For header and footer
\usepackage{multicol} % Allows multicols in tables
\usepackage{tabularx} % Intelligent column widths
\usepackage{tabulary} % Used in header and footer
\usepackage{hhline} % Border under tables
\usepackage{graphicx} % For images
\usepackage{xcolor} % For hex colours
%\usepackage[utf8x]{inputenc} % For unicode character support
\usepackage[T1]{fontenc} % Without this we get weird character replacements
\usepackage{colortbl} % For coloured tables
\usepackage{setspace} % For line height
\usepackage{lastpage} % Needed for total page number
\usepackage{seqsplit} % Splits long words.
%\usepackage{opensans} % Can't make this work so far. Shame. Would be lovely.
\usepackage[normalem]{ulem} % For underlining links
% Most of the following are not required for the majority
% of cheat sheets but are needed for some symbol support.
\usepackage{amsmath} % Symbols
\usepackage{MnSymbol} % Symbols
\usepackage{wasysym} % Symbols
%\usepackage[english,german,french,spanish,italian]{babel} % Languages

% Document Info
\author{90Quantile}
\pdfinfo{
  /Title (football-advanced-metrics.pdf)
  /Creator (Cheatography)
  /Author (90Quantile)
  /Subject (Football advanced metrics Cheat Sheet)
}

% Lengths and widths
\addtolength{\textwidth}{6cm}
\addtolength{\textheight}{-1cm}
\addtolength{\hoffset}{-3cm}
\addtolength{\voffset}{-2cm}
\setlength{\tabcolsep}{0.2cm} % Space between columns
\setlength{\headsep}{-12pt} % Reduce space between header and content
\setlength{\headheight}{85pt} % If less, LaTeX automatically increases it
\renewcommand{\footrulewidth}{0pt} % Remove footer line
\renewcommand{\headrulewidth}{0pt} % Remove header line
\renewcommand{\seqinsert}{\ifmmode\allowbreak\else\-\fi} % Hyphens in seqsplit
% These two commands together give roughly
% the right line height in the tables
\renewcommand{\arraystretch}{1.3}
\onehalfspacing

% Commands
\newcommand{\SetRowColor}[1]{\noalign{\gdef\RowColorName{#1}}\rowcolor{\RowColorName}} % Shortcut for row colour
\newcommand{\mymulticolumn}[3]{\multicolumn{#1}{>{\columncolor{\RowColorName}}#2}{#3}} % For coloured multi-cols
\newcolumntype{x}[1]{>{\raggedright}p{#1}} % New column types for ragged-right paragraph columns
\newcommand{\tn}{\tabularnewline} % Required as custom column type in use

% Font and Colours
\definecolor{HeadBackground}{HTML}{333333}
\definecolor{FootBackground}{HTML}{666666}
\definecolor{TextColor}{HTML}{333333}
\definecolor{DarkBackground}{HTML}{E2CBAC}
\definecolor{LightBackground}{HTML}{FBF8F4}
\renewcommand{\familydefault}{\sfdefault}
\color{TextColor}

% Header and Footer
\pagestyle{fancy}
\fancyhead{} % Set header to blank
\fancyfoot{} % Set footer to blank
\fancyhead[L]{
\noindent
\begin{multicols}{3}
\begin{tabulary}{5.8cm}{C}
    \SetRowColor{DarkBackground}
    \vspace{-7pt}
    {\parbox{\dimexpr\textwidth-2\fboxsep\relax}{\noindent
        \hspace*{-6pt}\includegraphics[width=5.8cm]{/web/www.cheatography.com/public/images/cheatography_logo.pdf}}
    }
\end{tabulary}
\columnbreak
\begin{tabulary}{11cm}{L}
    \vspace{-2pt}\large{\bf{\textcolor{DarkBackground}{\textrm{Football advanced metrics Cheat Sheet}}}} \\
    \normalsize{by \textcolor{DarkBackground}{90Quantile} via \textcolor{DarkBackground}{\uline{cheatography.com/195166/cs/40827/}}}
\end{tabulary}
\end{multicols}}
\fancyfoot[L]{
\footnotesize
\noindent
\begin{multicols}{3}
\begin{tabulary}{5.8cm}{LL}
  \SetRowColor{FootBackground}
  \mymulticolumn{2}{p{5.377cm}}{\bf\textcolor{white}{Cheatographer}} \\
  \vspace{-2pt}90Quantile \\
  \uline{cheatography.com/90quantile} \\
\end{tabulary}
\vfill
\columnbreak
\begin{tabulary}{5.8cm}{L}
  \SetRowColor{FootBackground}
  \mymulticolumn{1}{p{5.377cm}}{\bf\textcolor{white}{Cheat Sheet}} \\
  \vspace{-2pt}Published 8th November, 2024.\\
  Updated 8th November, 2024.\\
  Page {\thepage} of \pageref{LastPage}.
\end{tabulary}
\vfill
\columnbreak
\begin{tabulary}{5.8cm}{L}
  \SetRowColor{FootBackground}
  \mymulticolumn{1}{p{5.377cm}}{\bf\textcolor{white}{Sponsor}} \\
  \SetRowColor{white}
  \vspace{-5pt}
  %\includegraphics[width=48px,height=48px]{dave.jpeg}
  Measure your website readability!\\
  www.readability-score.com
\end{tabulary}
\end{multicols}}

\begin{document}
\raggedright
\raggedcolumns

% Set font size to small. Switch to any value
% from this page to resize cheat sheet text:
% www.emerson.emory.edu/services/latex/latex_169.html
\footnotesize % Small font.

\begin{multicols*}{2}

\begin{tabularx}{8.4cm}{X}
\SetRowColor{DarkBackground}
\mymulticolumn{1}{x{8.4cm}}{\bf\textcolor{white}{Common metrics}} \tn
\SetRowColor{white}
\mymulticolumn{1}{x{8.4cm}}{{\bf{Expected Goals (xG)}} \newline % Row Count 1 (+ 1)
Probability that a shot is converted into a goal, mostly based on the distance to goal and the shooting angle. xG models can be enriched with the type of action (set piece or open play), the type of pass received, the number of defenders in the cone in front of the goal, the ball height and other contextual information. \newline % Row Count 8 (+ 7)
{\bf{Expected Assists (xA)}} - {\emph{Common definition}} \newline % Row Count 9 (+ 1)
Estimates the probability that a pass results in a goal, considering factors such as pass type, pass distance, assist location, and the nature of the attacking move. (See an example of why it could be misleading: https://i.vimeocdn.com/video/656727145-8d4b65a442299e610f29b043de47361c5fe2dba7e76a1360041f4dbaf260cc89-d?mw=800\&mh=450\&q=70) \newline % Row Count 17 (+ 8)
{\bf{Expected Assisted Goals (xAG):}} \newline % Row Count 18 (+ 1)
Credits the passer with the xG of the shot that follows the pass. It considers all passes that lead to a scoring chance, regardless of whether the chance is ultimately converted into a goal.
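\newline % Worked example (made-up numbers, not real data)
As a rough illustration with made-up numbers, and assuming xAG simply sums the xG of the shots a player sets up: three passes leading to shots worth 0.05, 0.20 and 0.40 xG give $0.05 + 0.20 + 0.40 = 0.65$ xAG, whether or not any of those shots is scored.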
\newline % Row Count 22 (+ 4)
{\bf{Possession State Value Models:}} \newline % Row Count 23 (+ 1)
• {\bf{Expected Threat (xT)}}: \newline % Row Count 24 (+ 1)
Divides the pitch into a 16x12 grid and assigns each cell the probability that an action started there results in a goal within the next N actions. An action can be a shot or a ball move (i.e. a pass or a carry). \newline % Row Count 29 (+ 5)
xT is calculated by summing two terms: \newline % Row Count 30 (+ 1)
} \tn
\end{tabularx}
\par\addvspace{1.3em}

\vfill
\columnbreak
\begin{tabularx}{8.4cm}{X}
\SetRowColor{DarkBackground}
\mymulticolumn{1}{x{8.4cm}}{\bf\textcolor{white}{Common metrics (cont)}} \tn
\SetRowColor{white}
\mymulticolumn{1}{x{8.4cm}}{1. The probability of shooting from the cell times the goals/shots rate from that cell \newline % Row Count 2 (+ 2)
2. The probability of moving the ball from the cell times the sum, over all destination cells, of the probability of moving the ball to that cell (from the transition matrix) times the xT of that cell. \newline % Row Count 6 (+ 4)
The formula is iterative, so it needs a starting state: xT equal to 0 in every cell. \newline % Row Count 8 (+ 2)
Performing N=5 iterations is usually enough for convergence, where N is the number of actions considered after the one being evaluated. \newline % Row Count 11 (+ 3)
(See the math explained by Karun Singh: https://karun.in/blog/expected-threat.html) \newline % Row Count 14 (+ 3)
• {\bf{Valuing Actions by Estimating Probabilities (VAEP)}}: \newline % Row Count 16 (+ 2)
Values all actions performed by players: not just passes and carries, but shots and defensive actions too. It also considers the impact an action has on the team's chances of conceding, not just the impact on its chances of scoring.
Considering the game state before and after the action, it is calculated as the first of the following differences minus the second: \newline % Row Count 24 (+ 8)
- The scoring probability after the action minus the scoring probability before the action \newline % Row Count 26 (+ 2)
- The probability of conceding a goal after the action minus the same probability before the action. \newline % Row Count 28 (+ 2)
{\bf{Expected Pass (xPass)}} \newline % Row Count 29 (+ 1)
The likelihood of a pass being completed. It factors in distance, angle, pressure, body part, pattern of play (open play or set piece) and possibly other contextual information. \newline % Row Count 33 (+ 4)
} \tn
\end{tabularx}
\par\addvspace{1.3em}

\vfill
\columnbreak
\begin{tabularx}{8.4cm}{X}
\SetRowColor{DarkBackground}
\mymulticolumn{1}{x{8.4cm}}{\bf\textcolor{white}{Common metrics (cont)}} \tn
\SetRowColor{white}
\mymulticolumn{1}{x{8.4cm}}{{\bf{Expected Goals on Target (xGoT)}} \newline % Row Count 1 (+ 1)
A post-shot goal probability, meaning that it takes into account where the ball finished in the goal mouth. The model has only three variables: \newline % Row Count 4 (+ 3)
- xG of the shot on target, which encodes the positional and contextual information \newline % Row Count 6 (+ 2)
- x coordinate of the ball destination in the goal mouth \newline % Row Count 8 (+ 2)
- y coordinate of the ball destination in the goal mouth. \newline % Row Count 10 (+ 2)
{\bf{Goals prevented and Shooting Goals Added (SGA)}} \newline % Row Count 12 (+ 2)
Stemming from xGoT: \newline % Row Count 13 (+ 1)
- Goals prevented measures the ability of a goalkeeper to save shots by calculating the difference between xGoT and goals allowed \newline % Row Count 16 (+ 3)
- $\text{SGA} = (\text{xGoT} - \text{xG}) / \text{xG}$ measures the quality of a player's shots by calculating the relative difference between xGoT and xG.
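\newline % Worked example (made-up numbers, not real data)
As a rough illustration with made-up numbers, and assuming both formulas are applied to season totals: a goalkeeper who faces shots worth 12.0 xGoT and allows 10 goals has $12.0 - 10 = 2.0$ goals prevented, while a striker whose shots total 4.0 xG and 5.0 xGoT has $\text{SGA} = (5.0 - 4.0) / 4.0 = 0.25$.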
\newline % Row Count 19 (+ 3)
{\bf{Expected Points (xPts)}} - {\emph{Common definition}} \newline % Row Count 20 (+ 1)
Quantifies match outcomes based on the total Expected Goals (xG) of each team. The match is simulated many times, and the probabilities of winning, drawing and losing for both teams are taken as the fractions of victories, draws and defeats over the simulations. Expected points are then three times the win probability plus the draw probability. \newline % Row Count 26 (+ 6)
{\bf{Field tilt}} \newline % Row Count 27 (+ 1)
The share of possession in the final third, in terms of touches and passes. \newline % Row Count 29 (+ 2)
{\bf{Passes allowed Per Defensive Action (PPDA)}} \newline % Row Count 30 (+ 1)
} \tn
\end{tabularx}
\par\addvspace{1.3em}

\vfill
\columnbreak
\begin{tabularx}{8.4cm}{X}
\SetRowColor{DarkBackground}
\mymulticolumn{1}{x{8.4cm}}{\bf\textcolor{white}{Common metrics (cont)}} \tn
\SetRowColor{white}
\mymulticolumn{1}{x{8.4cm}}{Measures the pressure that the defending team puts on the opposition when the latter is in possession of the ball in the attacking third. It is calculated by dividing the number of opponents' passes by the number of defensive actions of the defending team in that zone of the pitch. A lower PPDA indicates more intense pressing.% Row Count 6 (+ 6)
} \tn
\hhline{>{\arrayrulecolor{DarkBackground}}-}
\end{tabularx}
\par\addvspace{1.3em}

\begin{tabularx}{8.4cm}{X}
\SetRowColor{DarkBackground}
\mymulticolumn{1}{x{8.4cm}}{\bf\textcolor{white}{Proprietary metrics}} \tn
\SetRowColor{white}
\mymulticolumn{1}{x{8.4cm}}{{\bf{Expected Assists (xA)}} - {\emph{Soccerment}} \newline % Row Count 1 (+ 1)
Applied to any pass, it quantifies the likelihood of that pass leading to a goal. It takes into account factors such as the pass location, the position of the receiving player, and the historical probability of similar passes resulting in goals. xA helps evaluate a player's creative contribution to goal-scoring opportunities.
\newline % Row Count 8 (+ 7)
{\bf{Expected Offensive Value Added (xOVA)}} - {\emph{Soccerment}} \newline % Row Count 10 (+ 2)
Measures the offensive value that a player adds relative to the value received from their teammates. \newline % Row Count 12 (+ 2)
The formula is: $\text{xOVA} = (\text{non-penalty xG} + \text{xA}) - \text{xA received}$ \newline % Row Count 14 (+ 2)
{\bf{Possession State Value Models:}} \newline % Row Count 15 (+ 1)
• {\bf{Possession Value (PV)}} - {\emph{Stats Perform}} \newline % Row Count 16 (+ 1)
It is trained on goals scored and uses a time-based approach, measuring the probability that the team in possession will score during the next 10 seconds of play. \newline % Row Count 20 (+ 4)
PV paved the way for a metric built on top of it called Match Momentum (https://theanalyst.com/2021/11/what-is-match-momentum/). \newline % Row Count 24 (+ 4)
• {\bf{On-Ball Value (OBV)}} - {\emph{StatsBomb}} \newline % Row Count 25 (+ 1)
It is trained on StatsBomb xG and evaluates every action in terms of both Goals For and Goals Against, to measure the risk/reward of each action. It does not include possession-history features, such as details of previous events in the possession, to avoid team strength bias. \newline % Row Count 31 (+ 6)
} \tn
\end{tabularx}
\par\addvspace{1.3em}

\vfill
\columnbreak
\begin{tabularx}{8.4cm}{X}
\SetRowColor{DarkBackground}
\mymulticolumn{1}{x{8.4cm}}{\bf\textcolor{white}{Proprietary metrics (cont)}} \tn
\SetRowColor{white}
\mymulticolumn{1}{x{8.4cm}}{{\bf{Expected Points (xPts)}} - {\emph{Soccerment}} \newline % Row Count 1 (+ 1)
The common definition suffers from the issue of {\emph{probabilities of non-independent events}} when multiple shots occur in the same action. This flaw is addressed by simulating matches based not on shots (and their xG) but on possessions.
Indeed, the possession xG is calculated as $p(\text{goal}) = 1 - p(\text{no goal})$, with $p(\text{no goal}) = (1 - \text{xG}_1)(1 - \text{xG}_2)\cdots$ over the shots in the possession. \newline % Row Count 9 (+ 8)
{\bf{Buildup Disruption Percentage (BDP)}} - {\emph{Soccerment \& Antonio Gagliardi}} \newline % Row Count 11 (+ 2)
Tells how successful the pressing is in disrupting the opponents' buildup phase. \newline % Row Count 13 (+ 2)
For each match, the opponent team's pass completion rate is compared with that team's average rate, giving a percentage difference. Then, from the point of view of the team whose BDP is measured, these differences are averaged, weighted by each opponent's average pass accuracy, and the sign is flipped. \newline % Row Count 21 (+ 8)
{\bf{Gegenpressing Intensity (GPI)}} - {\emph{Soccerment \& Antonio Gagliardi}} \newline % Row Count 23 (+ 2)
The fraction of times a team immediately attempts to regain the ball in its offensive half after losing possession in the attacking 40\% of the pitch. The tally takes into account defensive actions performed in the attacking half in the six seconds following a change of possession that occurred in the final 40\% of the pitch, as well as misplaced opponent passes that start in the opponents' half and are recovered in the other half. \newline % Row Count 32 (+ 9)
} \tn
\end{tabularx}
\par\addvspace{1.3em}

\vfill
\columnbreak
\begin{tabularx}{8.4cm}{X}
\SetRowColor{DarkBackground}
\mymulticolumn{1}{x{8.4cm}}{\bf\textcolor{white}{Proprietary metrics (cont)}} \tn
\SetRowColor{white}
\mymulticolumn{1}{x{8.4cm}}{{\bf{One-twos}} - {\emph{Soccerment}} \newline % Row Count 1 (+ 1)
Open-play completed passes followed by another completed pass by the same team that is received by the player who made the first pass, then filtered using a progression threshold and a temporal threshold.
Consider only the exchanges where the progression between the start coordinates of the first pass and the end coordinates of the second pass brings the initiating player closer to either the center of the goal or the goal line by at least 25\%. In addition, no more than four seconds may elapse between the first and the second pass. Finally, discard all events where the carry distance between the end of the first pass and the start of the second is longer than five meters. \newline % Row Count 15 (+ 14)
{\bf{Aerial Elo Rating Optimization (AERO)}} - {\emph{Soccerment}} \newline % Row Count 17 (+ 2)
Measures aerial skill using the Elo rating algorithm. It takes into account the skill-level matchup of each individual duel and is split into Offensive and Defensive AERO. \newline % Row Count 21 (+ 4)
Starting from a common Elo of 1000, after each aerial duel the score of both players involved is updated by $K \cdot (W/L - P(W))$, where $W/L$ is a dummy equal to 1 if the player wins and 0 if they lose, $P(W)$ is the probability of winning the duel and $K$ is a scaling constant usually equal to 32. \newline % Row Count 27 (+ 6)
The probability of winning a duel, for a player rated $\text{ELO}_a$ facing an opponent rated $\text{ELO}_b$, is calculated as: \newline % Row Count 29 (+ 2)
$P(W) = 1 / \left(1 + 10^{(\text{ELO}_b - \text{ELO}_a) / 400}\right)$% Row Count 30 (+ 1)
} \tn
\hhline{>{\arrayrulecolor{DarkBackground}}-}
\end{tabularx}
\par\addvspace{1.3em}

\begin{tabularx}{8.4cm}{X}
\SetRowColor{DarkBackground}
\mymulticolumn{1}{x{8.4cm}}{\bf\textcolor{white}{Normalizations}} \tn
\SetRowColor{white}
\mymulticolumn{1}{x{8.4cm}}{{\bf{Normalization P90 minutes}} \newline % Row Count 1 (+ 1)
Normalization per 90 minutes played, to compare players with different playing time or teams with a different number of matches played. \newline % Row Count 4 (+ 3)
{\bf{Normalization P100 touches}} \newline % Row Count 5 (+ 1)
Normalization per 100 ball touches, where a touch means an on-ball event rather than a physical touch of the ball. \newline % Row Count 7 (+ 2)
Useful for metrics like loose balls or key passes.
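\newline % Worked example (made-up numbers, not real data)
As a rough illustration with made-up numbers, a player with 6 key passes in 1350 minutes and 900 touches records $6 \times 90 / 1350 = 0.40$ key passes P90 and $6 \times 100 / 900 \approx 0.67$ key passes P100 touches.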
\newline % Row Count 9 (+ 2)
{\bf{Normalization per Possession}} \newline % Row Count 10 (+ 1)
Normalization based on the team's possession percentage, to account for the extra time available to create attempts. \newline % Row Count 13 (+ 3)
{\bf{Team strength adjustment}} \newline % Row Count 14 (+ 1)
Based on a team strength indicator (like Elo, http://clubelo.com/), adjusts the metrics of two or more teams to factor in their relative strength. \newline % Row Count 18 (+ 4)
*A very difficult but interesting improvement would be an Elo system for leagues. \newline % Row Count 20 (+ 2)
{\bf{Standardization (Z-score)}} \newline % Row Count 21 (+ 1)
Transforms data to have a mean of 0 and a standard deviation of 1, so that players or teams can be compared in terms of standard deviations from the mean. \newline % Row Count 24 (+ 3)
{\bf{Time decay}} \newline % Row Count 25 (+ 1)
Normalization technique that assigns less weight to metrics from matches further back in time.% Row Count 27 (+ 2)
} \tn
\hhline{>{\arrayrulecolor{DarkBackground}}-}
\end{tabularx}
\par\addvspace{1.3em}

% That's all folks
\end{multicols*}

\end{document}