From following various soccer blogs, it seems that defensive stats aren't as polished and explored as offensive ones. I'm curious what stats beyond things such as tackles/interceptions are looked at. For example, Maldini has the reputation of being one of the top defenders ever, but is also known for his quote "If I have to tackle, then I have already made a mistake" (paraphrased). His tackling stats seem to support that style of play in that he made fewer challenges than most. Perhaps he made tons of interceptions, and in that sense, never had to tackle? I'm unable to find a good source of his playing stats.
Anyways, what sort of things do you look at for defensive players? When I look at things such as WhoScored's statistical team of the season, it includes players such as Mustafi, who generally has a negative reputation for his play. I suspect he is rated so highly because the rating metric used by WhoScored overvalues the offensive contributions of defenders, whereas pundits are more likely to look at a defender's defensive contributions. Are there any 'advanced stats' for defenders, beyond the basic measured stats of challenges, interceptions, etc., that you and/or the industry look at?
Defensive metrics are very difficult for a couple of reasons.
The first problem is the data. The soccer viewing public is largely familiar with event-level data, typically provided by (my previous employer) Opta. They've done a great job normalizing soccer statistics on the cultural level, but the information they've collected at scale isn't that useful for creating good defensive metrics.
Other companies have sensed an opportunity here and have started providing more detailed data around things such as defensive pressure. Suddenly, you can contextualize each offensive event with the level of defensive pressure applied to it. I think this will be a game changer, but we're in early days there.
Other companies provide player-tracking solutions that give you real-time position of all players on the field. This is great because you have a "complete" picture of the game, but it requires a lot of work to build more sophisticated spatial/geometric models.
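To give a flavor of the simplest kind of geometric measure you can build from tracking data, here is a minimal sketch that uses the distance to the nearest defender as a crude proxy for pressure on the ball carrier. The function name, the coordinate convention (x, y in meters) and the positions are all made up for illustration; real pressure models are considerably more involved.

```python
import math

def nearest_defender_distance(ball_carrier, defenders):
    """Distance from the ball carrier to the closest defender.

    ball_carrier: (x, y) position, assumed here to be in meters.
    defenders: list of (x, y) positions for the defending team.
    """
    if not defenders:
        raise ValueError("need at least one defender position")
    return min(math.dist(ball_carrier, d) for d in defenders)

# Hypothetical single frame of tracking data (illustrative values only).
carrier = (50.0, 30.0)
defenders = [(53.0, 34.0), (60.0, 30.0), (40.0, 10.0)]

distance = nearest_defender_distance(carrier, defenders)
print(f"nearest defender is {distance:.1f} m away")
```

A smaller distance suggests more pressure, but this ignores direction of movement, speed and cover shadows, which is where the real modelling work comes in.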
There's also the "Howard Effect", coined largely as a basketball term, but it's similar to the Maldini example you provided. Some defenders are so good that they don't have to be "active" defenders. That's something which is really difficult to adequately control for.
Thanks! Also, what are some sources for stats that would be available for free online? I've looked at some 538 stuff, some StatsBomb stuff, WhoScored, and football-data.
There are similar things in other sports, like American football. You don't identify an amazing cornerback just by counting how many passes he broke up or intercepted: you also look at how often the player he was covering was targeted at all, and compare that to the receiver's baseline. When a receiver who is normally amazing barely gets passed to, and the plays are not very fruitful when he does, the credit goes to the defender. When it comes to easy-to-digest statistics, that's handled by counting passing attempts towards the defender's area.
You can do the exact same thing in soccer, if you have the data: You can assign responsibilities to players, just through computer vision. If Messi gets 3 touches in an attacking position when Sergio Ramos is defending him, you can credit that to Ramos, and compare that to Messi's touches vs the average Barcelona opponent.
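As a rough sketch of that comparison, assuming you already have touch counts per match: measure how far an attacker's touches against a given defender fall below his usual level. The function name and all numbers below are invented for illustration, not real match data.

```python
from statistics import mean

def touch_suppression(touches_vs_defender, baseline_touches):
    """Fraction by which an attacker's touches fell below his baseline.

    touches_vs_defender: attacking touches in the match against this defender.
    baseline_touches: the attacker's attacking touches in other matches.
    Returns 0.0 for no change, positive when the attacker was suppressed.
    """
    baseline = mean(baseline_touches)
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - touches_vs_defender) / baseline

# Hypothetical counts: the attacker usually gets about 10 attacking touches,
# but only 3 in the match against the defender of interest.
baseline_games = [9, 11, 10, 10]
suppression = touch_suppression(3, baseline_games)
print(f"touches down {suppression:.0%} versus baseline")
```

A single match is very noisy, so in practice you would want to aggregate over many matches and opponents before crediting the defender.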
What Maldini meant by that quote was that as a defender, you should be in the right position so that the dangerous pass is never made, or the attacker doesn't have the chance to run past you. Tackling is sort of a last-ditch effort to stop the other player, and comes with risks of conceding fouls/penalties.
Hmm yes, my interest is that this sort of defender may be hard to identify purely from event-driven stats, as those stats mostly track actions performed (tackles succeeded/attempted, interceptions made). A player such as Maldini wouldn't have shown up on lists sorted by tackles made, although he may have shown up on tackle success % or interceptions (not sure, because I can't find a good source of data for his playing stats). Despite this, he has the reputation of being one of the best defenders to play the game. I'm just curious about what sort of metrics could identify a player such as Maldini.
Do you have any recommendations for how a hobbyist can get more into the field? Someone who has a technical background, but works in a different field unrelated to sports at all. Any reading material to get familiar with some of the stats and metrics that are used in the industry?
There aren't a ton of great public resources out there. For the most part, anyone who's writing smart analysis in the public space gets hired away by a club. It's exactly what happened in other sports.
But, I would read "The Numbers Game: Why Everything You Know About Soccer Is Wrong" and read the backlogs of the StatsBomb blog. That will get you up to speed pretty quickly.
It doesn't really explain what 'Premier League analyst' is. Is this someone who works for a particular club? Or writes for some publication? Or walks the earth like Caine of Kung Fu, offering analysis to those in need?
I'm the other analyst mentioned in the article, but my position is probably quite similar.
I work with a team's coaching staff and management to help them make more data-driven decisions. This ranges from topics around opposition analysis to player recruitment.
It seems more interesting to hear from an insider under those circumstances. For starters, an MLS analyst might actually be able and willing to chitchat with you. MLS is an odd place compared to the top European leagues - there's no threat of relegation, no scrambling to earn Champions or Europa League spots and, comparatively, also no money. There's the business with the 'designated players'. The dominant sport in this particular analyst's market is ice hockey. Etc.
I stumbled into it largely by coincidence and luck. As you could imagine, there aren't many people who study computer science and were also relatively high-level soccer players. I'm very fortunate to occupy the very narrow intersection of that obscure Venn diagram.
I stumbled across some high resolution soccer data during college and began writing a blog that became popular in soccer analytics circles. The company that produced the data that I was scraping eventually hired me to their data science team. I spent a few years with them before I was recruited to my current team.
I have never worked in gambling but there are a few people in my sort of role that have that sort of background. It requires a pretty similar set of skills. I am not sure that I could beat the market by a large enough margin to make it worthwhile. It's quite efficient. But for the most part, I'm more interested in understanding the underlying mechanics of tactics than I am interested in predicting the result of games.
I'm the analyst mentioned in the article. Happy to answer any questions about our weird little industry.