Expected Goals: Rethinking the cons

Earlier this week, Richard Whittall wrote about Expected Goals models, their pros and cons, over at Paste. We wanted to add a few points to the discussion that would perhaps make him and the soccer community reevaluate the cons:

Con 1: They don’t work well with really good teams/players

The problem is illustrated in the chart below:

[Chart: C0]
Data provided by Opta


Elite players like M/S/N will consistently outscore their expected goals. But there’s a way to adjust the model so that it’s useful for those players as well.

[Chart: C2]
Data provided by Opta

We at Sportify calculate expected goals at multiple levels. The first chart above is what we call C0 – the pure opportunity score. The second chart is C2 – which is C0 normalized for the context of the shot: who’s taking the shot, the defense and goalkeeper the shooter is facing, etc. This puts each scoring opportunity in perspective, and lets us see how even the best players are doing relative to their individual expectation.

Con 2: They don’t mean much in single games

Yes, a team can potentially create dozens of clear-cut chances but get shut out if they don’t have the finishing ability to match their creativity. But more often than not, the team that creates the best chances does win. In a sample of over 4,500 games not ending in a draw, the winning team had the higher total expected goal value 3,368 times – almost 75% of the time.
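For readers who want to run the same check on their own data, here is a minimal sketch in Python. The function name, the tuple layout, and the five sample games are all hypothetical illustrations, not our actual data or pipeline; the only figures taken from this post are 3,368 and 4,500. Since the sample was "over 4,500" games, treating it as exactly 4,500 gives an upper bound on the share.

```python
from typing import Iterable, Tuple

# Hypothetical record layout: (home_goals, away_goals, home_xg, away_xg)
Game = Tuple[int, int, float, float]


def winner_had_higher_xg_share(games: Iterable[Game]) -> float:
    """Share of non-drawn games in which the winner also had the higher total xG."""
    decisive = 0  # games that did not end in a draw
    agree = 0     # decisive games where the winner led on xG
    for home_goals, away_goals, home_xg, away_xg in games:
        if home_goals == away_goals:
            continue  # draws are excluded, as in the sample described above
        decisive += 1
        home_won = home_goals > away_goals
        # Equal xG totals do not count as "higher" for either side.
        if home_xg != away_xg and (home_xg > away_xg) == home_won:
            agree += 1
    if decisive == 0:
        raise ValueError("no decisive games in the sample")
    return agree / decisive


# Made-up games, for illustration only.
sample_games = [
    (2, 0, 1.8, 0.6),  # home wins, home higher xG
    (1, 1, 0.9, 1.1),  # draw, skipped
    (0, 1, 1.5, 0.7),  # away wins despite lower xG
    (3, 2, 2.1, 1.9),  # home wins, home higher xG
    (1, 2, 0.8, 1.4),  # away wins, away higher xG
]
sample_share = winner_had_higher_xg_share(sample_games)

# The figures quoted above, assuming exactly 4,500 decisive games.
article_share = 3368 / 4500

print(f"sample: {sample_share:.1%}, article figures: {article_share:.1%}")
```

On the made-up sample, three of the four decisive games go to the side with the higher xG. The quoted figures work out to just under 75%.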

Con 3: xG is potential bait for Goodhart’s Law

Look, if someone is using xG as a silver bullet to gauge performance, sure, Goodhart’s Law applies. Soccer is a complex sport with 11 pieces trying to fit together to succeed both offensively and defensively; if there’s any single metric that’s the holy grail for winning games, expected goals isn’t it. xG models do provide a level of analysis beyond an increasingly stale box score, though. We’ve used them to develop a system that breaks down offense/defense/goalkeeping/possession – each using multiple statistically relevant metrics (xG-related ones among them). And we point readers back to #4 on the good side of Whittall’s list – “4. They’re a Building Block for Other Cool Stats Work”.