Welcome to our sports data lab: Sportify

With the digital revolution and “big data” percolating through our daily lives, every industry looks to statistical insights to improve productivity and performance. Sports are no exception.

Advanced analytics has become a necessity at elite sports organizations, from scouting new and opposition talent to evaluating outcomes and, increasingly, to communicating better with their customers – the fans.

To meet the growing demand for more information, an incredible amount of data is collected, and much of it is available to fans. But how much of that information enhances the game-watching experience? How much of it helps identify trends, anticipate potential outcomes, or decide which player a fan should sign to her fantasy team?

Using advanced data analytics and complex modeling techniques, we have determined with great certainty which numbers in the piles of data increasingly available in the world’s favorite sport, soccer, are significant. At the foundation of our analysis is the idea that not all shots are created equal, which brings us to expected goal value.

What is expected goal value?

In the simplest terms, expected goal value is a measurement of shot quality. In slightly less simple terms, expected goal value is a function of player and team efficiency, location and a host of other factors measuring a shot’s likelihood of turning into a goal. Expected goal models for soccer have been calculated and refined since 2013. It is a fairly new “statistic”, but it is founded in long-established mathematical theory.

Sportify, a team of specialized data and soccer analysts, has calculated its model from 8,000 games and more than 200,000 shot data points, adding to the sample each time a game is played on the professional stage. A lot of data goes into calculating each expected goal value. Among many other data points, models typically take into account the type, location and buildup of a scoring attempt.

By assigning each shot a value between 0 and 1, where zero equals no chance of scoring and one equals a certain goal, we can compare shots more uniformly. Take Cristiano Ronaldo, for example. This past week in Champions League, he scored a goal for Real Madrid away at AS Roma. In that game, based on the multiple factors used to calculate expected goal value, the Sportify model estimated that Ronaldo created an expectation of .305 goals. This figure, compared to his production – one goal – tells us the Portuguese exceeded the expectation.

That particular goal had an expected goal value of .038, much lower than the value he created during the entirety of the game. This tells us that Ronaldo made the most of a small opportunity and it also tells us that a goal under those circumstances will likely not recur.
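
For readers who like to see the arithmetic, here is a minimal sketch of the Ronaldo comparison in Python. It is not the Sportify model: it uses only the three figures quoted above, and the function names are hypothetical, chosen for illustration. The "attempts per goal" figure rests on a simplifying assumption of ours, namely that each attempt of the same quality is an independent chance with the same probability of going in.

```python
# Figures quoted in the text for Ronaldo's match away at AS Roma.
MATCH_XG = 0.305      # total expected goal value he created in the game
GOALS = 1             # goals he actually scored
GOAL_SHOT_XG = 0.038  # expected goal value of the shot that went in


def goals_above_expectation(goals: int, xg: float) -> float:
    """Actual goals minus expected goals; positive means over-performance."""
    return goals - xg


def expected_attempts_per_goal(xg: float) -> float:
    """Average number of attempts of this quality needed for one goal.

    Assumption (ours, not the model's): attempts are independent and each
    has the same scoring probability `xg`.
    """
    if not 0 < xg <= 1:
        raise ValueError("xg must be greater than 0 and at most 1")
    return 1 / xg


overperformance = goals_above_expectation(GOALS, MATCH_XG)
attempts = expected_attempts_per_goal(GOAL_SHOT_XG)

print(f"Goals above expectation: {overperformance:.3f}")
print(f"Average attempts per goal at this shot quality: {attempts:.1f}")
```

Under those assumptions, Ronaldo finished about 0.695 goals above expectation, and a shot of that quality would go in roughly once in every 26 attempts, which is why a goal like it is unlikely to recur often.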

This analysis is informative game-by-game, but over time, it becomes even more valuable and potentially influential to teams, scouts and fans, as we examine the established patterns and tendencies of players and their teams. We can examine and answer important questions like: Are Cristiano Ronaldo’s teammates on his level?