Sportify’s approach to Expected Goal Value

In the simplest terms, expected goal value (xG) is a measure of scoring chance quality. We take the details of a scoring chance (such as its location, how it came about and which player it fell to) and compare them with those of hundreds of thousands of other chances, returning the likelihood that a chance with those particular details ends up in the back of the net. Expected goal models for soccer have been calculated and refined since at least 2013. It is a fairly new “statistic”, but it is founded on long-established mathematical theory.
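To make the idea of comparing a chance against historical chances concrete, here is a minimal sketch in Python. It is not the Sportify model: it simply groups past shots that share the same details and reports the fraction that were scored. The zone and situation labels, the `HISTORY` sample and the `build_xg_table` helper are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical, made-up historical shots: (zone, situation, scored).
# A real model would draw on hundreds of thousands of shots and far more detail.
HISTORY = [
    ("six_yard_box", "rebound", True),
    ("six_yard_box", "rebound", False),
    ("six_yard_box", "rebound", True),
    ("six_yard_box", "rebound", False),
    ("outside_box", "open_play", False),
    ("outside_box", "open_play", False),
    ("outside_box", "open_play", False),
    ("outside_box", "open_play", True),
    ("outside_box", "open_play", False),
]


def build_xg_table(history):
    """Map each (zone, situation) pair to the share of matching shots that scored."""
    counts = defaultdict(lambda: [0, 0])  # key -> [goals, shots]
    for zone, situation, scored in history:
        entry = counts[(zone, situation)]
        entry[0] += int(scored)
        entry[1] += 1
    return {key: goals / shots for key, (goals, shots) in counts.items()}


XG_TABLE = build_xg_table(HISTORY)
```

Looking up `XG_TABLE[("six_yard_box", "rebound")]` then returns the historical conversion rate for chances with those details. Production models typically replace this lookup table with a fitted statistical model so that rare combinations of details still get sensible values.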

 

Sportify, a platform built by the data science team at Alt/S for soccer analysts, has built its models from 8,000 games across 8+ leagues and over 200,000 shot data points, and the sample grows each time a professional game is played. One important differentiator is that, while many engines feed a lot of data into a single expected goal value, Sportify calculates expected goal value at different levels, so what goes into the value depends on which level we are referring to.

 

The first value calculated, which applies to any shot, is what we call ‘chance quality’, or cQxG. It uses only information that is available before the point of contact of a shot: for example, where on the field the chance occurs and how it has come about. This value does not include anything related to the actual contact with the ball; it is the value of the chance itself and nothing more.

 

The next value, ‘shot quality’ (sQxG), is the expected value of the shot rather than simply of the chance. sQxG builds on chance quality but also looks at post-contact information: for example, did the player strike the ball particularly well, and is it headed towards the top corner or rolling slowly towards the center of the goal line? sQxG is the likelihood that the shot, knowing both where it’s taken and where it’s going, will go in. Given this distinction, all shots that are going off target have an sQxG of 0, as there is no chance of scoring from a shot that crosses the end line wide or high of the goalmouth.
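The split between the two levels can be sketched in Python as follows. This is an illustration under stated assumptions, not the Sportify implementation: the `Shot` fields, the logistic form and every coefficient are hypothetical. The only things taken from the description above are that chance quality reads pre-contact information only, that shot quality adds post-contact information, and that off-target shots get an sQxG of 0.

```python
import math
from dataclasses import dataclass


@dataclass
class Shot:
    # Pre-contact information (hypothetical fields)
    distance_m: float   # distance from goal in metres
    is_rebound: bool    # whether the chance followed a rebound
    # Post-contact information (hypothetical fields)
    on_target: bool     # whether the shot is heading inside the goalmouth
    placement: float    # 0.0 = centre of the goal, 1.0 = right in the corner


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def _pre_contact_score(shot):
    # Illustrative coefficients only; they are not fitted to any data.
    return 0.5 - 0.15 * shot.distance_m + (0.6 if shot.is_rebound else 0.0)


def chance_quality(shot):
    """cQxG sketch: uses pre-contact fields only."""
    return _sigmoid(_pre_contact_score(shot))


def shot_quality(shot):
    """sQxG sketch: builds on the chance and adds post-contact placement."""
    if not shot.on_target:
        return 0.0  # an off-target shot cannot score
    return _sigmoid(_pre_contact_score(shot) + 1.5 * (shot.placement - 0.5))
```

Under these assumptions, two shots from the same spot and situation always share the same chance quality, while their shot quality differs according to where the ball is heading.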

 

While these values are based on hundreds of thousands of historical shots, they do have limitations, especially as a tool for assessing individual shots. Take, for example, this shot by Danny Welbeck against Monaco (go to minute 21:10):

Arsenal Vs Monaco 1-3 Highlights 24-02-2015 by Eye Highlights on Dailymotion.

 

Welbeck is squarely in front of the goal and at close range, and the chance follows a rebound: all indicators of a high-quality chance. The problem is that Elderson, Wallace, Danijel Subašić and, importantly, Theo Walcott are all partially blocking Welbeck’s path to goal, and without that information the model is missing important pieces of the picture. Though there are limitations on a shot-by-shot basis, when extended to the game level and the season level these measures are far more indicative of performance than simply tallying shots, shots on target or any traditional box score stat.
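One common way to extend per-shot values to the game level is to add them up, which gives the number of goals a team would be expected to score from those shots. The sketch below shows that sum, plus the probability of scoring at least once. The function names are hypothetical, the second calculation assumes the shots are independent of one another (a simplification), and nothing here is confirmed as the way Sportify aggregates its values.

```python
import math


def expected_goals(shot_values):
    """Sum per-shot xG values into an expected goal total for a game."""
    return sum(shot_values)


def prob_at_least_one_goal(shot_values):
    """Chance of scoring at least once, assuming the shots are independent."""
    return 1.0 - math.prod(1.0 - p for p in shot_values)
```

For a team whose three shots in a game are valued at 0.5, 0.2 and 0.1, `expected_goals` gives roughly 0.8 and `prob_at_least_one_goal` gives roughly 0.64. Summing over many shots also averages out the shot-by-shot blind spots described above.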