Monday, May 23, 2016

How much of Leicester City's championship was luck?

How much of Leicester City's run to the Premier League championship was just luck? I was curious to get a better gut feel for how random it might have been, so I wrote a simulation. 

Specifics of the simulation are in small font below. The most important shortcoming, I think, is that I kept teams symmetrical, instead of creating a few high-spending "superteams" like the ones that actually exist in the Premier League (Chelsea, Manchester United, Arsenal, etc.). Maybe I'll revisit that in a future post, but I'll just go with it as is for now.


Details: For each simulated season, I created 20 random teams. I started each of them with a goal-scoring and goal-allowing talent of 1.35 goals per game (about what the observed figure was for 2015-16). Then, I gave each a random offensive and defensive talent adjustment, each with mean zero and SD of about 0.42 goals per game. For each season, I corrected the adjustments to sum to zero overall. I played each game of the season assuming the two teams' adjustments were additive, and used Poisson random variables for goals for and against. I didn't adjust for home field advantage. 
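
For the curious, here is a minimal Python sketch of a simulation along those lines. It is not the code behind the numbers in this post, just one way the description above could be implemented. The function names (`poisson`, `simulate_season`), the normal distribution for the talent adjustments, the 0.05 floor on scoring rates, and the goal-difference tiebreaker are all assumptions of the sketch.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_season(rng, n_teams=20, base=1.35, sd=0.42):
    """Simulate one double round-robin season.

    Returns (gd_talent, points, goal_diff, champion_index).
    """
    # Talent adjustments in goals per game (normal draws are an assumption),
    # recentred so that each set sums to zero across the league.
    off = [rng.gauss(0.0, sd) for _ in range(n_teams)]
    dfn = [rng.gauss(0.0, sd) for _ in range(n_teams)]  # positive = concedes more
    off_mean, dfn_mean = sum(off) / n_teams, sum(dfn) / n_teams
    off = [x - off_mean for x in off]
    dfn = [x - dfn_mean for x in dfn]

    def rate(i, j):
        # Expected goals for team i against team j: additive adjustments.
        # The 0.05 floor is a guard added here so the Poisson rate stays positive.
        return max(0.05, base + off[i] + dfn[j])

    points = [0] * n_teams
    goal_diff = [0] * n_teams
    gd_talent = [0.0] * n_teams
    for i in range(n_teams):
        for j in range(n_teams):
            if i == j:
                continue
            # Each ordered pair (i, j) is one game, so every pairing meets
            # twice and each team plays 38 games. No home-field advantage.
            gd_talent[i] += rate(i, j) - rate(j, i)
            gd_talent[j] += rate(j, i) - rate(i, j)
            gi, gj = poisson(rng, rate(i, j)), poisson(rng, rate(j, i))
            goal_diff[i] += gi - gj
            goal_diff[j] += gj - gi
            if gi > gj:
                points[i] += 3
            elif gj > gi:
                points[j] += 3
            else:
                points[i] += 1
                points[j] += 1

    # Champion: most points, ties broken by goal difference (then arbitrarily).
    champion = max(range(n_teams), key=lambda t: (points[t], goal_diff[t]))
    return gd_talent, points, goal_diff, champion


rng = random.Random(2016)
gd_talent, points, goal_diff, champion = simulate_season(rng)
print(f"champion: GD talent {gd_talent[champion]:+.1f}, "
      f"{points[champion]} points, GD {goal_diff[champion]:+d}")
```

Repeating this for many seasons and tallying champions by rounded `gd_talent` is the kind of bookkeeping that produces a talent-versus-titles table, though pure Python will be slow at half a million seasons.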


At the beginning of the season, Leicester City was a 5000:1 longshot. What kind of team, in my simulation, actually showed true odds of 5000:1? We can narrow it down to teams with a goal differential (GD) talent of -4 to -9 for the season. In 500,000 random seasons, here's how many times those teams won:

talent    teams  titles  1 in N
    -9   166135      20    8307
    -8   168954      25    6758
    -7   171272      26    6587
    -6   173327      22    7879
    -5   175017      53    3302
    -4   177305      61    2907
 total  1032010     207    4986

In 500,000 seasons of the simulation, 1,032,010 teams had a GD talent between -4 and -9. Only 207 of them won a championship, for odds of 4,985:1 against, which is close to the 5000:1 we're looking for. 

Even half a million simulated seasons isn't enough for randomness to even out, which is why the odds don't decrease smoothly as the teams get better. I'll maybe just go with a point estimate of -8. In other words, for Leicester City to be a 5000:1 shot to win the league, their talent would have to be such that you'd expect them to be outscored by 8 goals over the course of the 38-game season. Maybe it might be 7 goals instead of 8, but probably not 6 and probably not 9.  (I guess I could run the simulation again to be more sure.)
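
To see how the pooled figure comes out of the table, and how noisy the individual rows are, here's a quick Python check. The relative error of roughly 1/sqrt(titles) treats each row's title count as approximately Poisson, which is an assumption of this sketch and not something taken from the simulation itself.

```python
import math

# (teams, titles) by GD talent, copied from the table above.
table = {-9: (166135, 20), -8: (168954, 25), -7: (171272, 26),
         -6: (173327, 22), -5: (175017, 53), -4: (177305, 61)}

for talent, (teams, titles) in table.items():
    one_in = teams / titles
    rel_se = 1 / math.sqrt(titles)   # approximate relative error of a small count
    print(f"{talent:+d}: 1 in {one_in:,.0f}  (about +/-{rel_se:.0%})")

total_teams = sum(teams for teams, _ in table.values())
total_titles = sum(titles for _, titles in table.values())
pooled_one_in = total_teams / total_titles
print(f"pooled: 1 in {pooled_one_in:,.0f}")
```

With only 20 to 61 titles per row, each row's estimate is uncertain by something like 13 to 22 percent under that approximation, which is enough to explain the bumps.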


Leicester City actually wound up outscoring their opponents by 32 goals last year. Could that be luck? What's the chance that a team that should be -8 would actually wind up at +32? That's a 40 goal difference -- Leicester City would have had to be lucky by more than a goal a game.

The SD of goal differential is pretty easy to figure out, if you assume goals are Poisson. Last season, the average game had 1.35 goals for each team. In a Poisson distribution, the variance equals the mean, so, for a single game, the variance of goal differential is 2.70. For the season, multiply that by 38 to get 102.6. For the SD, take the square root of that, which is about 10.1. Let's just call it 10.

So, a 40-goal surprise is about four SDs from zero. Roughly speaking, that's about a 1 in 30,000 shot.
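
The arithmetic above is easy to reproduce. The sketch below uses a normal approximation for the tail, which is an assumption (the difference of two Poisson totals is only approximately normal), and the variable names are just illustrative.

```python
import math

goals_per_team_per_game = 1.35
games = 38

var_per_game = 2 * goals_per_team_per_game      # Poisson: variance = mean, two teams
sd_season = math.sqrt(var_per_game * games)     # about 10.1 goals

surprise = 32 - (-8)                            # observed GD minus GD talent
z = surprise / 10                               # using the rounded SD of 10
tail = 0.5 * math.erfc(z / math.sqrt(2))        # one-sided normal tail, P(Z > z)

print(f"season SD: {sd_season:.1f}")
print(f"z = {z:.1f}, about 1 in {1 / tail:,.0f}")
```

That comes out a little over 1 in 31,000, in line with the rough 1 in 30,000.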


If we were surprised that Leicester City won the championship, we should be even *more* surprised that they went +32. In fact, we should be around six times more surprised!

Why are the "+32" odds so much worse than the "championship" odds? Because, on those rare occasions when a simulated -8 team wins the championship, it usually does it with much less than a +32 performance. Maybe it goes +20 but gets "Pythagorean luck" and wins lots of close games. Maybe it goes +17 but the other teams have bad luck and it squeaks in.

If you assume that a team that actually scores +32 in a season has, say, a 3-in-10 chance of winning the championship, then the odds of both things happening -- a -8 talent team going +32 observed, and that being enough to win -- is 1 in 100,000. Well, maybe a bit less, because the two events aren't completely independent.


The oddsmakers have priced Leicester City at around 25:1 for next season. That's a decent first guess for what they should have been this year.

Except ... in retrospect, Leicester should probably have been even better than 25:1 this season (you'd expect them to decline in talent next year -- they have an older-than-average team, they may lose players in the off-season, and other teams should catch on to their strategy). On the other hand, MGL says oddsmakers overcompensate for unexpected random events that don't look random. 

Those two things kind of cancel out. But, commenter Eduardo Sauceda points out that bookmakers build a substantial profit margin into a 20-way bet, so let's lower last season's "true" odds to 35:1, as an estimate.

According to the simulation, for a team to legitimately be a 35:1 shot, its expected goal differential for the season would have to be around +16.

Taking all this at face value, we'd have to conclude:

1. The bookies and public thought Leicester City was a -8 talent, when, in reality, it was a +16 talent. So, they underestimated the club by 24 goals.

2. Leicester City outperformed their +16 talent by 16 goals.

3. And, while I'm here ... the simulation says a team with a +32 GD averages 74 points in the standings. Leicester wound up at 81 points. So, maybe they were +7 points in Pythagorean luck.  


One thing you notice, from all this, is how difficult it is to set good odds on longshots, when you can't estimate true talent well enough.

Suppose you analyze a team as best you can, and you conclude that they should be a league average team, based on everything you know about their players and manager. (I'm going to call them a ".500 team," which ignores, for now, the Premier League scoring asymmetry of three points for a win and one point for a draw.)

You run a simulation, and you find that a .500 team wins the championship once every 940 simulated seasons. If the simulation is perfect, can you just go and set odds of 939:1, plus vigorish?

Not really. Because you haven't accounted for the fact that you might be wrong that Everton is a .500 team. Maybe they're a .450 team, or a .600 team, and you just didn't see it. 

But, isn't there a symmetry there? You may be wrong that they're exactly average in talent, but if your analysts' estimate is unbiased, aren't they just as likely to be -8 as they are to be +8? So, doesn't it all even out?

No, it doesn't. Because even if the error in estimating talent is symmetrical, the resulting error in odds is not. 

By the simulation, a team with .500 talent is about 1 in 940 to win the championship. But, what if half the time you incorrectly estimate them at -8, and half the time you incorrectly estimate them at +8?

By my simulation, a team with -8 GD talent is 1 in 6,758 to win. A team with +8 talent is 1 in 157. The average of those two is not 1 in 940, but, rather, 1 in 307. 

If you're that wildly inaccurate in your talent evaluations, you're going to be offering 939:1 odds on what is really only a 307:1 longshot. Even if you're conservative, going, say, 600:1 instead of 939:1, you're still going to get burned.
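
Put as a calculation: the averaging has to be done on the probabilities, not on the talent. A short Python illustration, using only the three figures quoted above (the variable names are illustrative):

```python
p_minus8 = 1 / 6758     # title chance for a -8 GD talent team
p_plus8 = 1 / 157       # title chance for a +8 GD talent team
p_even = 1 / 940        # title chance for a team that really is average

# Half the time the team is really -8, half the time it is really +8.
p_mixed = 0.5 * p_minus8 + 0.5 * p_plus8

print(f"if truly average: 1 in {1 / p_even:.0f}")
print(f"50/50 mix of -8 and +8: 1 in {1 / p_mixed:.0f}")
```

The +8 case dominates the average because, in these figures, the title probability rises much faster than linearly with talent.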

This doesn't happen as much with favorites. In my simulation, a +30 team was 1 in 5.4. The average of a +22 team and a +38 team is 1 in 4.7. Not as big a difference. Sure, it's probably still enough difference to cost the bookmakers money, but I bet the market in favorites is competitive enough that they've probably figured out other methods to correct for this and get the odds right.


Anyway, the example I used had the bookies being off by exactly 8 goals. Is that reasonable? I have no idea what the SD of "talent error" is for bookmakers (or bettors' consensus). Could it be as high as 8 goals? 

For the record, the calculation of SD(talent) for 2014-15 (the season before Leicester's win), using the "Tango method," goes like this:

SD(observed) = 22.3 goals
SD(luck)     = 10   goals
SD(talent)   = 19.9 goals
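
That last line is just a subtraction of variances: observed variance is talent variance plus luck variance, assuming talent and luck are independent. In Python (variable names are illustrative):

```python
import math

sd_observed = 22.3   # SD of teams' actual season goal differentials
sd_luck = 10.0       # SD of goal differential from Poisson luck alone

# var(observed) = var(talent) + var(luck), so solve for talent.
sd_talent = math.sqrt(sd_observed**2 - sd_luck**2)
print(f"SD(talent) = {sd_talent:.1f} goals")
```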

For a few other seasons I checked:

2015-16  SD(talent) = 19.9
2013-14  SD(talent) = 27.8
1998-99  SD(talent) = 18.3 

In MLB, the SD of talent is about 9 wins. How well, on average, could you evaluate a baseball team's talent for the coming season? Maybe, within 3 wins, on average? That's a third of an SD.

In the Premier League, a third of an SD is about 6 goals. But evaluation is harder in soccer than in baseball, because there are strategic considerations, and team interactions make individual talent harder to separate out. So, let's up it to 9 goals. Offsetting that, the public consensus for talent -- as judged by market prices of players -- reduces uncertainty a bit. So, let's arbitrarily bring it back down to 8 goals. 

That means ... well, two SDs is 16 goals. Two SDs is roughly a 1-in-20 event, and there are 20 teams, so in an average year, the public overrates or underrates one team's talent by 16 goals. That seems high -- 16 goals is about 10 points in the standings. But, remember -- that's just talent! If luck (with an SD of 10 goals) compounds the error in the talent estimate instead of offsetting it, you could occasionally see teams vary from their preseason forecast by as many as 36 or 46 goals.
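
One way to read those last two numbers, as a sketch: a 2-SD talent miss stacked on a 2-SD or 3-SD run of luck that compounds it. That reading is an assumption of this sketch, as is treating talent error and luck as independent in the last step.

```python
import math

sd_talent_error = 8    # guessed SD of the public's talent estimate, in goals
sd_luck = 10           # SD of season goal differential from luck

talent_miss = 2 * sd_talent_error                 # 16 goals
miss_with_2sd_luck = talent_miss + 2 * sd_luck    # 36 goals
miss_with_3sd_luck = talent_miss + 3 * sd_luck    # 46 goals
print(miss_with_2sd_luck, miss_with_3sd_luck)

# If the two are independent, the SD of (actual GD - forecast GD)
# combines in quadrature.
sd_forecast_miss = math.sqrt(sd_talent_error**2 + sd_luck**2)
print(f"SD of forecast miss: {sd_forecast_miss:.1f} goals")
```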

Does that happen? What's the right number? Anyone have an idea? At the very least, we now know it's possible once in a while, to be off by a lot. In this case, it looked like everyone underestimated the Foxes by (maybe) 24 goals in talent.


In light of all this, bookmaker William Hill announced that, next year, they will not be offering any odds longer than 1,000:1. When I first read that, I thought, what's the point? If they had offered a thousand to one on Leicester City, they still would have lost a lot of money, if the true odds were 35:1.

But ... now I get it. Maybe they're saying something like this: "A Premier League team with middle-of-the-road talent -- one that you'd expect to score about as many goals as it allows -- has about a 1 in 1,000 chance of winning the championship. We're not confident enough that we can say, of any bad team, that they can't change their style of play to become average, or that they haven't improved to average over the off-season, or that they've been a .500 team all along but we've just been fooled by randomness. So, we're never again going to set odds based on an evaluation that a team's talent is significantly worse than average, because the cost of a mistake is just too high."

That makes a certain kind of sense. And the logic makes me wonder: were the odds on extreme longshots always strongly biased in bettors' favor, but nobody realized it until now?

(My previous post on Leicester City is here.)



At Monday, May 23, 2016 11:04:00 PM, Anonymous Tangotiger said...

The uncertainty in the estimate of the talent is simply the difference between the Vegas odds and the observed wins, minus random variation.

So in MLB, the RMSE between Vegas odds and observed wins is around 8.5 to 9 wins.

One sd of random variation is 6.4.

The difference is one SD = 5.5 to 6.0 wins. That's the uncertainty of your true talent.

Do the same for soccer.

At Tuesday, May 24, 2016 12:41:00 AM, Blogger Phil Birnbaum said...

Right, I did that for baseball a couple of years ago, checked a bunch of predictions to see who was overconfident about their ability.

But where do I find over/under predictions for the EPL? I googled but couldn't find any over/under season betting markets. Just odds to win the championship or top 4.

At Tuesday, May 24, 2016 3:06:00 AM, Blogger Phil Birnbaum said...

I could estimate the oddsmakers' guesses by using the simulation to convert the odds back into talent. That'll probably be close enough...

At Friday, May 27, 2016 11:41:00 PM, Anonymous Anonymous said...

Did William Hill mention how much they raked in on the other 19 teams? I don't think it is out of the question that they actually made money on futures this year.

At Saturday, May 28, 2016 3:38:00 PM, Anonymous Anonymous said...

For over/under total points/season try here:

scroll down.

At Monday, May 30, 2016 1:36:00 PM, Blogger MR C said...

Phil, when you did the simulations what percent of the games ended in ties? The reason I am asking is that when I have used poisson in a similar way to what you describe I end up with around 21% to 22% of games ending in draws but in reality it is closer to 26% to 27% most seasons in the EPL. Not sure if it changes your conclusions much but I am curious. thanks

At Monday, May 30, 2016 2:15:00 PM, Blogger Phil Birnbaum said...

MR C: in the simulation, it was about 22%. Interesting that there are more draws than expected in the EPL ... in the NHL, there are more regulation ties than expected because more total points (between teams) are awarded for a tie, but in the EPL fewer points are awarded for a draw. Either EPL teams are very risk averse, or the Poisson model doesn't hold nearly well enough for other reasons.

At Monday, May 30, 2016 6:04:00 PM, Anonymous Tangotiger said...

Can you apply the Poisson model that assumes there's only 1 period in hockey and one half in soccer? The reason I say that is that it's possible that early on, the goals scored/allowed are independent of each other, as each team won't change its style of play much.

But, when it is still tied/late, behaviour changes.

At Monday, May 30, 2016 10:06:00 PM, Blogger MR C said...

I have not checked at the half level ... will see if I have the data to do that for the EPL. It is almost certainly end game behavior. I have lost some of my files but I recall that 0-0 ties and 1-1 ties occurred more frequently than expected but total goals did follow poisson; it was the split between the two teams that did not which is why ties were under-predicted by poisson even though total goals did work well.

At Monday, May 30, 2016 10:41:00 PM, Blogger MR C said...

Ok, found some data for half time. I checked ~5200 games from the EPL. 41.6% were tied at half time. I ran poisson and found that there would be 42.4% ties... BUT.

1) I accounted for home field advantage
2) I did not account for differences in team quality (all teams assumed average)

so the 42.4% overstates the number of ties since team talent varies. I am going to try to adjust for team quality if I can connect a couple of my files.

At Monday, May 30, 2016 11:38:00 PM, Blogger MR C said...

I did an adjustment for the quality of the teams based on the betting lines; I got 41.4% ties using poisson. So confirms what Tango suggested and I expected. It is end game tactics and behavior.

