Sabermetric Research: May 2016

Sunday, May 29, 2016

Leicester City and EPL talent evaluation

Last post, I wondered how accurate bookmakers are at evaluating English Premier League (EPL) team talent. In the comments, Tom Tango suggested a method to figure that out, but one that requires knowing the bookies' actual "over/under" forecasts. I couldn't find those numbers, but bloggers James Grayson and Simon Gleave came to my rescue with three seasons' worth of numbers, and another commenter provided a public link for 2015-16.

(BTW, Simon's blog contains an interesting excerpt from "Soccermatics," a forthcoming book on the sabermetrics of soccer. Worth checking out.)

So, armed with numbers from James and Simon, what can we figure out?

1. SD of actual team points

Here is the SD of team standings points for the previous three seasons:

2015-16: 15.4
2014-15: 16.3
2013-14: 19.2
-------------
Average 17.1

The league was quite unbalanced in 2013-14 compared to the other two seasons. That could have been because the teams were unbalanced in talent, or because the good teams got more lucky than normal and the bad teams more unlucky than normal. For this post, I'm just going to use the average of the three seasons (by taking the root mean square of the three numbers), which is 17.1.
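For anyone who wants to check the averaging, here is a minimal sketch of the root-mean-square step in Python. The function name `rms` is my own, and the inputs are the rounded season figures listed above, so the result comes out a hair under the 17.1 quoted (which presumably was computed from unrounded data).

```python
import math

def rms(values):
    """Root mean square: average the squares (variances), then take the square root."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# Rounded SDs of standings points for 2015-16, 2014-15, 2013-14.
season_sds = [15.4, 16.3, 19.2]
average_sd = rms(season_sds)
print(round(average_sd, 2))
```

Averaging the variances rather than the SDs themselves is what keeps this consistent with the Pythagorean subtraction used later.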

2. Theoretical SD of team luck

I ran a simulation to figure the SD of points just due to luck -- in other words, if you ran a random season over and over, for the same team, how much would it vary in standings points even if its talent were constant? I wound up with a figure around 8 points. It depends on the quality of the team, but 8 is pretty close in almost all cases.
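The simulation itself isn't shown here, but a rough sketch of the idea might look like the following. The win and draw probabilities are illustrative guesses for a roughly average team, not figures from the post, and every game is assumed to be independent with the same probabilities. The closed-form check and the Monte Carlo run should agree closely.

```python
import math
import random

# Assumed single-game probabilities for a roughly average team (illustrative only).
P_WIN, P_DRAW = 0.37, 0.26
GAMES = 38

def analytic_luck_sd(p_win, p_draw, games=GAMES):
    """SD of season points when each game is an independent 3/1/0 outcome."""
    mean = 3 * p_win + 1 * p_draw
    second_moment = 9 * p_win + 1 * p_draw
    return math.sqrt(games * (second_moment - mean ** 2))

def simulated_luck_sd(p_win, p_draw, seasons=20000, games=GAMES, seed=1):
    """Replay the same team's season many times; return the SD of its points."""
    rng = random.Random(seed)
    totals = []
    for _ in range(seasons):
        pts = 0
        for _ in range(games):
            r = rng.random()
            if r < p_win:
                pts += 3
            elif r < p_win + p_draw:
                pts += 1
        totals.append(pts)
    mean = sum(totals) / seasons
    return math.sqrt(sum((t - mean) ** 2 for t in totals) / seasons)

analytic_sd = analytic_luck_sd(P_WIN, P_DRAW)
sim_sd = simulated_luck_sd(P_WIN, P_DRAW)
print(round(analytic_sd, 2), round(sim_sd, 2))
```

Under these assumed probabilities, both numbers land close to the 8 points used in the rest of the post.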

3. Theoretical SD of team talent

By the pythagorean relationship between talent and luck, we get

SD(observed) = 17.1
SD(luck) = 8.0
-------------------
SD(talent) = 15.1
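The subtraction above is just "variances add," assuming talent and luck are independent. As a small sketch (the function name `sd_talent` is mine):

```python
import math

def sd_talent(sd_observed, sd_luck):
    """Solve var(observed) = var(talent) + var(luck) for SD(talent)."""
    return math.sqrt(sd_observed ** 2 - sd_luck ** 2)

talent_sd = sd_talent(17.1, 8.0)
print(round(talent_sd, 1))  # 15.1
```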

4. SD of bookmaker predictions

If teams vary in talent with an SD about 15 points, then, if the bookmakers were able to evaluate talent perfectly, their estimates would also have an SD of 15 points. But, of course, nobody is able to evaluate that well. For one thing, talent depends on injuries, which haven't happened yet. For another thing, talent changes over time, as players get older and better or older and worse. And, of course, talent depends on the strategies chosen by the manager, and by the players on the pitch in real time.

So, we'd expect the bookies' predictions to have a narrower spread than 15 points. They don't:

16.99 -- 2015-16 Pinnacle (opening)
14.83 -- 2015-16 Pinnacle (closing)
16.17 -- 2015-16 Sporting Index (opening)
15.81 -- 2015-16 Spreadexsports (opening)
17.30 -- 2014-15 Pinnacle (opening)
17.37 -- 2014-15 Pinnacle (closing)
16.95 -- 2014-15 Sporting Index (opening)
15.80 -- 2014-15 Sporting Index (closing)
15.91 -- 2013-14 Pinnacle

Only one of the nine sets of predictions is narrower than the expectation of team talent, and even that one, barely. This surprised me. In the baseball case, the sports books projected a talent spread that was significantly more conservative than the actual spread.

Either the EPL bookmakers are overoptimistic, or the last three Premier League seasons had less luck than the expected 8.0 points.

5. Bookmaker accuracy

If the bookmakers were perfectly accurate in their talent estimates, we'd expect their 20 estimates to wind up being off by an SD of around 8 points, because that's the amount of unpredictable performance luck in a team-season.

In 2014-15, that's roughly what happened:

7.85 -- 2014-15 Pinnacle (opening)
6.37 -- 2014-15 Pinnacle (closing)
6.90 -- 2014-15 Sporting Index (opening)
7.75 -- 2014-15 Sporting Index (closing)

Actually, every one of the bookmakers' lines was more accurate than 8 points! In effect, in 2014-15, the bookmakers exceeded the bounds of human possibility -- they predicted better than the speed of light. What must have happened is: in 2014-15, teams just happened to be less lucky or unlucky than usual, playing almost exactly to their talent.

But the predictions for 2015-16 were way off:

15.17 -- 2015-16 Pinnacle (opening)
14.96 -- 2015-16 Pinnacle (closing)
15.13 -- 2015-16 Sporting Index (opening)
14.96 -- 2015-16 Spreadexsports (opening)

And 2013-14 was in between:

9.77 -- 2013-14 Pinnacle

Again, I'll just go with the overall SD of the three seasons, which works out to about 11 points.

15 points -- 2015-16
7 points -- 2014-15
10 points -- 2013-14
--------------------
11 points -- average

Actually, 11 points is pretty reasonable, considering 8 is the "speed of light" best possible long-term performance.

6. Bookmaker talent inaccuracy

If 11 points is the typical error in estimating performance, what's the average error in estimating talent? That's an easy calculation, by Pythagoras:

11 points -- observed error
8 points -- luck error
----------------------------
8 points -- talent error

That 8 points for talent should really be 7.5, but I'm arbitrarily rounding it up to create a rule of thumb that "talent error = luck error".

7. Bookmaker bias

In step 4, it looked like the bookmakers were overconfident, and predicting a wider spread of talent than actually existed. In other words, it looked like they were trying to predict luck.

If they did that, it would have to mean they were overestimating the good teams, and underestimating the bad teams. That's the only way to get a wider spread.

But, in 2013-14, it was the opposite! The correlation between the bookies' prediction and the eventual error was -0.07. (The "error" includes the sign, of course. The argument isn't that the bookies are more wrong for good teams and bad teams, it's that they're more likely to be wrong in a particular direction.)

In other words, even though Pinnacle seemed to be trying to predict team luck, it worked out for them!

Which means one of these things happened:

1. Pinnacle got really lucky, and their guesses for which teams would have good luck actually worked out;

2. We're wrong in thinking Pinnacle was overconfident by that much. In other words, the spread of talent is wider than we thought it was. Remember, 2013-14 was more unbalanced than the other two seasons we looked at.

I think it's some of each. The SD(talent) estimate for 2013-14 came out to 17.5 points. In that light, Pinnacle's 15.9-point SD isn't *that* overconfident.

... In 2014-15, on the other hand, Pinnacle *did* overestimate the spread. The better the closing line on the team, the less extreme it performed, with a correlation of +0.35. Sporting Index, with their more conservative line, correlated only at +0.11.

Part of the reason the correlations are so high is because that was the year random luck balanced out so much more than usual. If teams were moving all over the place in the standings for random reasons, that would tend to hide the bookmakers' tendency to rate the teams too extreme.

... Finally, we come to 2015-16. Now, we see what looks like very strong evidence of overconfidence. For the Pinnacle closing line, the correlation between estimate and overestimate is +.46. The other bookmakers are even higher, at +.52 and +.50.

Much of that comes from two teams. First, and most obviously, Leicester City, predicted at 40.5 points but actually winding up at 81. Second, Chelsea, forecasted the best team in the league at 83 points, but finishing with only 50.

These don't really seem to fit the narrative of "the bookies know who the good and bad teams are, but just tend to overestimate their goodness and badness." But, they kind of do fit the narrative. Favorites like Chelsea are occasionally going to have bad years, so you're going to have an occasional high error. But, that error will be even higher if you overestimated them in the first place.
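The correlations in this section are between each team's forecast and its signed error (forecast minus actual). Here is a minimal sketch of that calculation. The five-team league is made up purely for illustration; these are not the bookmakers' lines.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical forecasts and final points (made-up numbers).
forecast = [80, 65, 52, 45, 35]
actual = [70, 68, 50, 49, 40]

# Signed error: positive means the team was overestimated.
error = [f - a for f, a in zip(forecast, actual)]
r = pearson(forecast, error)
print(round(r, 2))
```

A positive result means the teams forecast highest tended to be overestimated and the teams forecast lowest tended to be underestimated, which is the signature of a spread that's too wide.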

-------

OK, there's the seven sets of numbers we got from James' and Simon's data. What can we conclude?

Well, the question I wanted to answer was: how much are the bookmakers typically off in estimating team talent? Our answer, from #6: about 8 points.

But ... I'm not that confident. These are three weird seasons. Last year, we have Leicester City and their 5000:1 odds. The season before, we have "better than speed of light" predictions, meaning luck cancelled out. And, two years ago, as we saw in #1, we had a lot more great and awful teams than the other two seasons, which suggests that 2013-14 might be an outlier as well.

I'd sure like to have more seasons of data, maybe a decade or so, to get firmer numbers. For now, we'll stick with 8 points as our estimate.

An eight-point standard error means that, typically, one team per season will be mispriced by 16 points or more. That's not necessarily exploitable by bettors. For one thing, bookmaker prices match public perception, so it's hard to be the one genius among millions who sees the exact one team that's mispriced. For another thing, some of what I'm calling "talent" is luck of a different kind, in terms of injuries or players learning or collapsing.

We still have the case that Leicester City was off by around 40 points. That's 5 SDs if you think it was all talent. It's also 5 SDs if you think it was all luck.

The "maximum likelihood," then, if you don't know anything about the team, would be if it were 2.5 SDs of each. The odds of that happening are about 1 in 13,000 (1 in 26,000 for each direction).
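That figure can be checked with a quick sketch, assuming both the talent error and the luck are normally distributed and independent of each other (the helper name `upper_tail` is mine):

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Both components come in 2.5 SD high at the same time.
p_one_direction = upper_tail(2.5) ** 2
p_either_direction = 2 * p_one_direction

print(round(1 / p_one_direction))
print(round(1 / p_either_direction))
```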

My best guess, though, is to trust the bookmakers' current odds of about 30:1 as an estimate of what Leicester City should have been. How do we translate 30:1 into expected points? As it turns out, Liverpool was 28:1, with an over/under of 66. So let's use 66 points as our true talent estimate.

Under that assumption, Leicester City beat their talent by 15 points of luck (81 minus 66), or a bit less than 2 SD. And their assumed true talent of 66 points beat the bookmakers' estimate of 40 by 26 points, which is 3.25 SD.

That seems much more plausible to me.

Because ... I think it's reasonable to think that luck errors are normally distributed. But I don't think we have any reason to believe that human errors, in estimating team talent, also follow a normal distribution. It seems to me that Leicester City could be a black swan, one that just confounded the normal way bettors and fans thought about performance. They may have been a Babe Ruth jumping into the league -- someone who saw you could win games by breaking the assumptions that led to the typical distribution of home runs.

So, when we see that Leicester was 3.25 SD above the public's estimate of their true talent ... I'm not willing to go with the usual probability of a 3.25 SD longshot (around 1 in 1700). I don't know what the true probability is, but given the "Moneyball" narrative and the team's unusual strategy, I'd suspect those kinds of errors are more common than the normal distribution would predict.

Even if you disagree ... well, with 20 teams, a 1 in 1700 shot comes along every 85 years. It doesn't seem too unreasonable to assume we just saw the inevitable "hundred year storm" of miscalculation.
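Here's a sketch of the arithmetic behind the "1 in 1700" and "every 85 years" figures. It takes the normal distribution at face value (the very assumption being questioned above), and uses the 8-point talent error and the rounded 40-point forecast from the text.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

TALENT_ERROR_SD = 8.0   # points, from step 6
TEAMS_PER_SEASON = 20

z = (66 - 40) / TALENT_ERROR_SD          # assumed talent minus forecast, in SDs
one_in = 1 / upper_tail(z)               # how rare a miss this size is, per team
seasons_between = one_in / TEAMS_PER_SEASON

print(z, round(one_in), round(seasons_between))
```

The exact tail comes out slightly rarer than the rounded 1 in 1700, so the waiting time is a touch longer than 85 seasons, but it's the same ballpark.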

And, either way, the on-field luck you have to assume -- 15 points -- is less than two standard deviations, which isn't that unusual at all.

So that's my best guess at how you can reasonably get Leicester City to 81 points.

Labels: betting, distribution of talent, Leicester City, luck, predictions, Premier League, soccer, talent

Monday, May 23, 2016

How much of Leicester City's championship was luck?

How much of Leicester City's run to the Premier League championship was just luck? I was curious to get a better gut feel for how random it might have been, so I wrote a simulation.

Specifics of the simulation are in small font below. The most important shortcoming, I think, was that I kept teams symmetrical, instead of creating a few high-spending "superteams" like actually exist in the Premier League (Chelsea, Manchester United, Arsenal, etc.). Maybe I'll revisit that in a future post, but I'll just go with it as is for now.

------

Details: For each simulated season, I created 20 random teams. I started each of them with a goal-scoring and goal-allowing talent of 1.35 goals per game (about what the observed figure was for 2015-16). Then, I gave each a random offensive and defensive talent adjustment, each with mean zero and SD of about 0.42 goals per game. For each season, I corrected the adjustments to sum to zero overall. I played each game of the season assuming the two teams' adjustments were additive, and used Poisson random variables for goals for and against. I didn't adjust for home field advantage.
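For readers who want to experiment, here is a rough sketch of one simulated season along the lines described above. It is not the code used for the post. The floor on the scoring rate (`MIN_RATE`) is my own assumption to keep Poisson rates positive, and the sign conventions for the adjustments are a guess at what "additive" means.

```python
import math
import random

BASE_RATE = 1.35   # goals per team per game, from the post
TALENT_SD = 0.42   # SD of each offensive/defensive adjustment, from the post
MIN_RATE = 0.05    # assumption: floor so a Poisson rate never goes negative

def poisson(rng, lam):
    """Knuth's multiplication method; fine for small rates like these."""
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def centered(rng, n, sd):
    """n normal draws, shifted so they sum to zero across the league."""
    draws = [rng.gauss(0, sd) for _ in range(n)]
    mean = sum(draws) / n
    return [d - mean for d in draws]

def simulate_season(rng, n_teams=20):
    """Return (points, goal_diff) lists for one double round-robin season."""
    offense = centered(rng, n_teams, TALENT_SD)  # positive = scores more
    defense = centered(rng, n_teams, TALENT_SD)  # positive = concedes more
    points = [0] * n_teams
    goal_diff = [0] * n_teams
    for a in range(n_teams):
        for b in range(n_teams):
            if a == b:
                continue  # each ordered pair is one match, so 38 per team
            goals_a = poisson(rng, max(MIN_RATE, BASE_RATE + offense[a] + defense[b]))
            goals_b = poisson(rng, max(MIN_RATE, BASE_RATE + offense[b] + defense[a]))
            goal_diff[a] += goals_a - goals_b
            goal_diff[b] += goals_b - goals_a
            if goals_a > goals_b:
                points[a] += 3
            elif goals_b > goals_a:
                points[b] += 3
            else:
                points[a] += 1
                points[b] += 1
    return points, goal_diff

points, goal_diff = simulate_season(random.Random(2016))
print(sorted(points, reverse=True))
```

Running this many times, and tallying how often teams at each talent level finish first, would give a table like the one in the next section. As in the post, there is no home field advantage.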

------

At the beginning of the season, Leicester City was a 5000:1 longshot. What kind of team, in my simulation, actually showed true odds of 5000:1? We can narrow it down to teams with a goal differential (GD) talent of -4 to -9 for the season. In 500,000 random seasons, here's how many times those teams won:

talent   # teams   champs   odds (1 in N)
-----------------------------------------
  -9      166135      20        8307
  -8      168954      25        6758
  -7      171272      26        6587
  -6      173327      22        7879
  -5      175017      53        3302
  -4      177305      61        2907
-----------------------------------------
total    1032010     207        4986

In 500,000 seasons of the simulation, 1,032,010 teams had a GD talent between -4 and -9. Only 207 of them won a championship, for odds of 4,985:1 against, which is close to the 5000:1 we're looking for.
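The odds column in the table is just team-seasons divided by championships. A small sketch that reproduces it from the counts (the helper name `one_in` is mine, and it rounds halves up so the -6 row matches):

```python
import math

# (GD talent, team-seasons, championships), copied from the table above.
rows = [(-9, 166135, 20), (-8, 168954, 25), (-7, 171272, 26),
        (-6, 173327, 22), (-5, 175017, 53), (-4, 177305, 61)]

def one_in(teams, titles):
    """Express a frequency as '1 in N', rounding halves up."""
    return math.floor(teams / titles + 0.5)

for talent, teams, titles in rows:
    print(talent, one_in(teams, titles))

total_teams = sum(r[1] for r in rows)
total_titles = sum(r[2] for r in rows)
pooled = one_in(total_teams, total_titles)
print("pooled", pooled)
```

With only 20 to 60 championships in each row, the individual ratios are noisy, which is why they don't fall smoothly as talent improves.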

Even half a million simulated seasons isn't enough for randomness to even out, which is why the odds don't decrease smoothly as the teams get better. I'll maybe just go with a point estimate of -8. In other words, for Leicester City to be a 5000:1 shot to win the league, their talent would have to be such that you'd expect them to be outscored by 8 goals over the course of the 38-game season. Maybe it might be 7 goals instead of 8, but probably not 6 and probably not 9. (I guess I could run the simulation again to be more sure.)

------

Leicester City actually wound up outscoring their opponents by 32 goals last year. Could that be luck? What's the chance that a team that should be -8 would actually wind up at +32? That's a 40 goal difference -- Leicester City would have had to be lucky by more than a goal a game.

The SD of goal differential is pretty easy to figure out, if you assume goals are Poisson. Last season, the average game had 1.35 goals for each team. In a Poisson distribution, the variance equals the mean, so, for a single game, the variance of goal differential is 2.70. For the season, multiply that by 38 to get 102.6. For the SD, take the square root of that, which is about 10.1. Let's just call it 10.

So, a 40-goal surprise is about four SDs from zero. Roughly speaking, that's about a 1 in 30,000 shot.
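The last two paragraphs can be sketched as follows, assuming (as above) that each team's goals are Poisson and independent, and that the season total is close enough to normal to use a normal tail:

```python
import math

GOALS_PER_TEAM = 1.35
GAMES = 38

# Poisson variance equals the mean, and the two teams' variances add.
season_sd = math.sqrt(GAMES * 2 * GOALS_PER_TEAM)

def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

surprise = 32 - (-8)        # observed GD minus assumed talent
z_rounded = surprise / 10   # using the rounded SD of 10, as in the text
one_in = 1 / upper_tail(z_rounded)

print(round(season_sd, 1))  # 10.1
print(round(one_in))
```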

------

If we were surprised that Leicester City won the championship, we should be even *more* surprised that they went +32. In fact, we should be around six times more surprised!

Why are the "+32" odds so much worse than the "championship" odds? Because, on those rare occasions when a simulated -8 team wins the championship, it usually does it with much less than a +32 performance. Maybe it goes +20 but gets "Pythagorean luck" and wins lots of close games. Maybe it goes +17 but the other teams have bad luck and it squeaks in.

If you assume that a team that actually scores +32 in a season has, say, a 3-in-10 chance of winning the championship, then the odds of both things happening -- a -8 talent team going +32 observed, and that being enough to win -- is 1 in 100,000. Well, maybe a bit less, because the two events aren't completely independent.

------

The oddsmakers have priced Leicester City at around 25:1 for next season. That's a decent first guess for what they should have been this year.

Except ... in retrospect, Leicester should probably have been even better than 25:1 this season (you'd expect them to decline in talent next year -- they have an older-than-average team, they may lose players in the off-season, and other teams should catch on to their strategy). On the other hand, MGL says oddsmakers overcompensate for unexpected random events that don't look random.

Those two things kind of cancel out. But, commenter Eduardo Sauceda points out that bookmakers build a substantial profit margin into a 20-way bet, so let's lower last season's "true" odds to 35:1, as an estimate.

According to the simulation, for a team to legitimately be a 35:1 shot, its expected goal differential for the season would have to be around +16.

Taking all this at face value, we'd have to conclude:

1. The bookies and public thought Leicester City was a -8 talent, when, in reality, it was a +16 talent. So, they underestimated the club by 24 goals.

2. Leicester City outperformed their +16 talent by 16 goals.

3. And, while I'm here ... the simulation says a team with a +32 GD averages 74 points in the standings. Leicester wound up at 81 points. So, maybe they were +7 points in Pythagorean luck.

------

One thing you notice, from all this, is how difficult it is to set good odds on longshots, when you can't estimate true talent well enough.

Suppose you analyze a team as best you can, and you conclude that they should be a league average team, based on everything you know about their players and manager. (I'm going to call them a ".500 team," which ignores, for now, the Premier League scoring asymmetry of three points for a win and one point for a draw.)

You run a simulation, and you find that a .500 team wins the championship once every 940 simulated seasons. If the simulation is perfect, can you just go and set odds of 939:1, plus vigorish?

Not really. Because you haven't accounted for the fact that you might be wrong that the team (Everton, say) is a .500 team. Maybe they're a .450 team, or a .600 team, and you just didn't see it.

But, isn't there a symmetry there? You may be wrong that they're exactly average in talent, but if your analysts' estimate is unbiased, aren't they just as likely to be -8 as they are to be +8? So, doesn't it all even out?

No, it doesn't. Because even if the error in estimating talent is symmetrical, the resulting error in odds is not.

By the simulation, a team with .500 talent is about 1 in 940 to win the championship. But, what if half the time you incorrectly estimate them at -8, and half the time you incorrectly estimate them at +8?

By my simulation, a team with -8 GD talent is 1 in 6,758 to win. A team with +8 talent is 1 in 157. The average of those two is not 1 in 940, but, rather, 1 in 307.

If you're that wildly inaccurate in your talent evaluations, you're going to be offering 939:1 odds on what is really only a 307:1 longshot. Even if you're conservative, going, say, 600:1 instead of 939:1, you're still going to get burned.
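The asymmetry is easy to check: you have to average the probabilities, not the odds. A tiny sketch (the function name `average_odds` is mine), using the two simulation results quoted above:

```python
def average_odds(one_in_values):
    """Average the probabilities (not the odds) and convert back to '1 in N'."""
    probs = [1 / n for n in one_in_values]
    return 1 / (sum(probs) / len(probs))

blended = average_odds([6758, 157])   # -8 talent and +8 talent, equally likely
print(round(blended))  # 307
```

The +8 case dominates the average because its probability is more than forty times larger than the -8 case, so a symmetric error in talent shortens the true odds a lot.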

This doesn't happen as much with favorites. In my simulation, a +30 team was 1 in 5.4. The average of a +22 team and a +38 team is 1 in 4.7. Not as big a difference. Sure, it's probably still enough difference to cost the bookmakers money, but I bet the market in favorites is competitive enough that they've probably figured out other methods to correct for this and get the odds right.

-------

Anyway, the example I used had the bookies being off by exactly 8 goals. Is that reasonable? I have no idea what the SD of "talent error" is for bookmakers (or bettors' consensus). Could it be as high as 8 goals?

For the record, the calculation of SD(talent) for 2014-15 (the season before Leicester's win), using the "Tango method," goes like this:

SD(observed) = 22.3 goals
SD(luck) = 10 goals
-------------------------
SD(talent) = 19.9 goals

For a few other seasons I checked:

2015-16 SD(talent) = 19.9
2013-14 SD(talent) = 27.8
1998-99 SD(talent) = 18.3

In MLB, the SD of talent is about 9 wins. How well, on average, could you evaluate a baseball team's talent for the coming season? Maybe, within 3 wins, on average? That's a third of an SD.

In the Premier League, a third of an SD is about 6 goals. But evaluation is harder in soccer than in baseball, because there are strategic considerations, and team interactions make individual talent harder to separate out. So, let's up it to 9 goals. Offsetting that, the public consensus for talent -- as judged by market prices of players -- reduces uncertainty a bit. So, let's arbitrarily bring it back down to 8 goals.

That means ... well, two SDs is 16 goals, so in an average year, the public overrates or underrates one team's talent by 16 goals. That seems high -- 16 goals is about 10 points in the standings. But, remember -- that's just talent! If luck (with an SD of 10 goals) pushes the team even further from the mistaken estimate, by two or three SDs, you could occasionally see teams vary from their preseason forecast by as many as 36 or 46 goals.

Does that happen? What's the right number? Anyone have an idea? At the very least, we now know it's possible, once in a while, to be off by a lot. In this case, it looked like everyone underestimated the Foxes by (maybe) 24 goals in talent.

-------

In light of all this, bookmaker William Hill announced that, next year, they will not be offering any odds longer than 1,000:1. When I first read that, I thought, what's the point? If they had offered a thousand to one on Leicester City, they still would have lost a lot of money, if the true odds were 35:1.

But ... now I get it. Maybe they're saying something like this: "A Premier League team with middle-of-the-road talent -- one that you'd expect to score about as many goals as it allows -- has about a 1 in 1,000 chance of winning the championship. We're not confident enough that we can say, of any bad team, that they can't change their style of play to become average, or that they haven't improved to average over the off-season, or that they've been a .500 team all along but we've just been fooled by randomness. So, we're never again going to set odds based on an evaluation that a team's talent is significantly worse than average, because the cost of a mistake is just too high."

That makes a certain kind of sense. And the logic makes me wonder: were the odds on extreme longshots always strongly biased in bettors' favor, but nobody realized it until now?

(My previous post on Leicester City is here.)

Labels: Leicester City, longshots, Premier League, soccer

Thursday, May 12, 2016

How did Leicester City do it?

At the start of the season, you could get 5000:1 odds on Leicester City F.C. winning the 2015-16 English Premier League Championship. Of course, Leicester did win, in what one writer called "the unlikeliest feat in sports history".

A friend wrote me that Leicester City is often said to have "defied the odds" to finish on top. That's a metaphor, of course; odds aren't something you can literally "defy," like a bad law or your supervisor's instructions. What does the metaphor mean? To me, it implies that the odds actually *were* 5000:1, that the team did actually hit the longshot outcome, that they were the "1" instead of one of the "5000".

Let's suppose, for whatever reason, I offer you 5000:1 odds that a fair coin will land heads. You bet $10, the coin does land heads, and I pay you $50,000. Did you really "defy" the 5000:1 odds? That doesn't sound right. At best, you "defied" odds of 1:1.

So, the question is: at the start of the year, was Leicester City's expectation really a 1 in 5,001 chance? Were the Foxes really that bad a team, in terms of talent? The evidence suggests not.

After 17 of the season's 38 matches, Leicester City sat at the top of the table (I think "top of the table" is English for "first in the standings"), two points up on second-place Arsenal, and fourth overall in goal differential (+13, behind teams at +17, +14, and +14).

Suppose Leicester were truly a bad team, and had just had a run of good luck. In that case, what would their chance be of hanging on to win the championship? Still pretty poor, right? They're only two points up, with several superior teams right on their tail.

But, the bookmakers had them solidly in the mix. Here are the revised odds on December 21, after 17 matches:

                        Odds    Pts
-----------------------------------
 1. Leicester           10:1     38
 2. Arsenal            10:11     36
 3. Manchester City     15:8     32
 4. Tottenham           20:1     29
 5. Manchester United   18:1     29
-----------------------------------
 9. Liverpool           22:1     24
-----------------------------------
15. Chelsea             66:1     18

Clearly, Leicester City is still considered a lower-quality team than its rivals. Despite Arsenal trailing by two points, bookmakers still give it almost six times as much chance of winning as Leicester (10:11 implies about 52 percent, against about 9 percent for 10:1).

But ... the odds against Leicester hanging on were only 10:1. If the Foxes were really as gawd-awful a team as was thought at the beginning, their odds would be much worse.

After every Premier League season, the bottom three teams are "relegated" to the lower-tier Football League Championship, while that lower league's best three teams are promoted to replace them. At the beginning of the season, Leicester City was thought to have a 25 percent chance of relegation -- you could get 3:1 odds that they'd be in the bottom three. If they were still thought to be that bad, wouldn't they be much worse than 10:1 to win it all?

For a mirror-image comparison, look at Chelsea, one of the league's elite teams. After those first 17 matches, Chelsea sat 20 points behind Leicester, in fifteenth place out of twenty teams. Despite the poor start, they were still given a 1-in-67 chance (66:1 against) of coming all the way back to take first place. That's because Chelsea was understood to be a very skilled team -- they were the 13:8 favorite when the season started.
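Since the argument leans on turning odds into chances, here is a small sketch of the conversion used throughout (3:1 against is 1 in 4, 66:1 is 1 in 67, and so on). It ignores the bookmaker's margin, so the real implied chances are a bit lower than these. The function name and the "A:B" string format are my own choices.

```python
from fractions import Fraction

def implied_probability(odds):
    """Convert 'A:B' odds against into a probability, ignoring the bookmaker's margin."""
    against, in_favor = (int(x) for x in odds.split(":"))
    return Fraction(in_favor, against + in_favor)

quotes = [("Leicester, preseason title", "5000:1"),
          ("Leicester, preseason relegation", "3:1"),
          ("Leicester, Dec 21 title", "10:1"),
          ("Arsenal, Dec 21 title", "10:11"),
          ("Chelsea, Dec 21 title", "66:1")]

for label, odds in quotes:
    p = implied_probability(odds)
    print(f"{label}: {odds} -> {p} ({float(p):.4f})")
```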

Now: if Leicester were as bad as Chelsea were good, you'd think the chance of them dropping to relegation would be about the same as the odds of Chelsea rising to the top. Right? It's kind of symmetrical. Not completely, because Leicester is at the top and Chelsea is only *near* the bottom. But, that's mitigated by Chelsea needing to be *the* top team, and Leicester City needing only to be in the bottom three.

Not perfectly symmetrical, but reasonable.

But: the symmetry doesn't extend to those mid-season odds. A Chelsea comeback was pegged at 66:1. But a Leicester collapse was 3500:1.

Clearly, by December 21, the betting market evaluated that Leicester was a pretty good team. Not a great team, but a good team. For more evidence of that, they're pegged at around 25:1 to repeat next year. The bookmakers clearly don't think Leicester City was a bad team that just got very, very, very lucky.

So, I'd argue that Leicester didn't "defy the odds." They were just a better team than 5000:1 from the beginning.

--------

Well, that might not be strictly true. Maybe as the season started, they were an awful team, but they got better quickly. Maybe in week 2, they signed the soccer equivalent of Babe Ruth and Wayne Gretzky and Peyton Manning and Michael Jordan and Pele. (Sorry, I don't know anything about soccer ... "Pele" was the best I could do.) Or maybe the coach figured out a strategy to take a bunch of mediocre players and make them great (or make the team able to win despite the players' mediocrity).

But, it seems more likely to me that the team was just good from the beginning, and the oddsmakers got it wrong. Well, more importantly, the community of betting soccer fans got it wrong. Because, you'd think, if even a few sharp bettors figured out that Leicester was a pretty good team, they would have moved the odds -- not just by betting on a championship, but probably on available side bets too.

So, what happened? That's a huge question. I think this is the biggest betting market inefficiency I've ever seen. It's not like the 1969 Mets, who probably *did* just get lucky ... or the 1980 "Miracle on Ice" team, who, when you watch the game, you can tell were very much inferior but very much lucky. This is a case where a legitimately very good team got evaluated as very bad -- by *everyone*, even the best sabermetric soccer types and bettors.

Of course, Leicester City isn't actually the most talented team in the Premier League -- at least, not according to betting markets. For next season, the Foxes' 25:1 odds rank only seventh -- the favorite, Manchester City, is at 3:2. Ignoring vigorish, the oddsmakers think Man City has ten times the chance that Leicester does.

Assuming that Leicester should have been that same 25:1 this season instead of 5000:1 ... well, that still seems like an exploitable opportunity that, in theory, never should have happened in an efficient betting market.

So, what did happen? How did everyone miss that Leicester City was a good, if not great, team? That's not a question about oddsmaking. It's a question about soccer. What happened? What made Leicester so good, that nobody had seen coming? How did they build such a successful team on their very low budget? Is there a "Moneyball" secret?

I've seen some articles that broke down some stats, about passing and shots on goal and such. But this is a situation where we need more than that. It's like, if an expansion MLB team goes 105-57 with castoff players, it doesn't help much to say, "well, they won because they had a high OPS and their pitchers struck out a lot of guys." The question is -- how did they get those replacement-level players to have that high OPS and strike out a lot of guys? Was it something they saw in those players? Was it coaching? Was it sign stealing?

-----

OK, after wondering about all this, I figured, hey, the internet exists, maybe I should do some research. (Which consisted of Googling, and getting advice from my friends.)

And, yes, there seems to be an explanation. Apparently there is indeed a bit of a Moneyball story here. My friend John steered me to an article by Leicester City fan John Micklethwait (who this year neglected to make his annual 20 pound bet on Leicester, and missed out on what would have been a 100,000 pound win).

Leicester City made three major acquisitions in the off-season, all with "Moneyball" overtones of underappreciated players. First, N’Golo Kanté, who was number one in France last year in the (apparently overlooked) statistic of interceptions made. Second, Jamie Vardy, who is known for speed (which Leicester used to strategic advantage, as we will see). And, third, Riyad Mahrez, who is known for "a rare ability to dribble past people."*

(* Correction: two of the three were actually signed by Leicester in earlier seasons. See note at end of post.)

They got those guys really cheap.

Then, they analyzed video, and came up with this:

"Leicester players even seem to foul scientifically, slowing down their opponents by taking turns to obstruct them, so that few of the Leicester players get booked or sent off."

And, finally, the most interesting strategic twist: the rapid counterattack.

"This in itself is another innovation. All teams have always counterattacked, but few have based their game so completely around it. In most matches, the team that keeps control of the ball more scores more goals. Teams like Barcelona and Arsenal are famous for never letting their opponents touch it. Not Leicester. Last weekend, Swansea had possession 62 percent of the time, but they still lost 4-0. Leicester’s tactic is to let their opponents have the ball, wait until they make a mistake and then attack at remarkable speed: Hence all those quick players and the unusual disciplined approach."

Well, I love that theory! Because, it's what I argued the Toronto Maple Leafs might have been doing a couple of years ago, when they made the playoffs despite possession stats near the bottom of the league.

Not only do I like that theory the best, but I also believe it's the most plausible. Because, when the Foxes bought those three players, it was public knowledge. It was no secret that Kanté is a great tackler, and Vardy is crazy fast, and Mahrez has dribbling skills (videos are easily found on Google). So, the odds should have accurately reflected those acquisitions.

On the other hand, the counterattacking strategy? Probably, nobody knew manager Claudio Ranieri was going to try that tactic until the season was underway. Even then, it would take a while before it became apparent how well it worked. So, that *could* explain why even the most knowledgeable soccer experts didn't see it coming at all.

For what it's worth, here's an article about Leicester's counterattacking strategy, with accompanying video of some of their quick transition goals. And, something my friend Bob wrote me, from observation:

"They used the counterattack as their primary mode of offense. They had several wins early in the year with possession below 30%. Manager Claudio Ranieri would frequently position one or two players close to midfield on opponents' corner kicks and free kicks in order to better exploit his team's speed advantage. As the season progressed, teams adapted to this and Leicester's possession totals increased. Leicester adapted by tightening up its defense, winning a string of 1-0 games (4 out of 5 in one stretch)."

-------

From all this, here's my wild-ass bottom line, the working hypothesis that my imperfect Bayesian brain is pulling out of its metaphorical butt:

1. Leicester City improved over the off-season by pulling a Moneyball, by acquiring underappreciated players at bargain prices.

2. They implemented a novel strategy emphasizing speedy counterattacking, and it worked, but became less effective as the opposition recognized it and learned to adapt to it.

3. They did play unexpectedly well, in the traditional sense, apart from the strategy.

4. That unexpectedly good play might have been playing over their heads. They were significantly luckier than their talent, judging by the odds during and after the season.**

(**Any championship team is, in retrospect, likely to have played better than its talent, but I'm arguing in this case for even more luck than for a usual champion.)

It's kind of a vague hypothesis, I know, a little bit of everything. But my best guess is ... it *was* a little bit of everything. Because: (a) can you really turn a bad team into a champion with just three players? (b) the odds insist Leicester was luckier than its talent; and (c) even if you discount the opinions of observers, the stats show that Leicester did repeatedly win with very low possession time.

-------

So, what does that mean for next year?

The improvement in players (#1) will remain, if the club doesn't sell them off. And, of course, we know luck doesn't persist (#3 and #4).

That leaves #2, the counterattack. Will the strategy continue to work, or will the opposition adapt to it enough that the advantage will dwindle? Normally, I'd just check the betting market, but I'm not sure what the odds are telling us. As mentioned, Leicester is only the seventh favorite to win next year, at 25:1. Those are much shorter odds than 5000:1, for sure. But how much of the difference is from the skill of their new players, and how much is from an expectation that a less traditionally-skilled team can still win by implementing a disruptive counterattack strategy?
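To put the two prices on the same scale, here is a minimal sketch that converts "N:1 against" odds into an implied probability. It assumes the quoted odds are taken at face value, with no adjustment for the bookmaker's margin, so the results are rough upper bounds rather than true market estimates. The function name `implied_probability` is just for illustration.

```python
def implied_probability(odds_against: float) -> float:
    """Convert odds of N:1 against into an implied win probability.

    Assumption: no correction for the bookmaker's margin (overround).
    """
    return 1.0 / (odds_against + 1.0)


# The two title prices quoted in this post.
last_year = implied_probability(5000)  # 5000:1 before 2015-16
next_year = implied_probability(25)    # 25:1 for next season

print(f"5000:1 implies {last_year:.4%}")
print(f"25:1 implies   {next_year:.2%}")
print(f"The market's estimate rose by a factor of about {next_year / last_year:.0f}")
```

Taken at face value, that's a move from about 0.02 percent to just under 4 percent. The conversion can't say how much of that jump is players and how much is strategy, which is the open question.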

-------

In any case, I think the big story here isn't that Leicester City beat 5000:1 odds. I think the big story here is how Leicester City found a "Moneyball" way to beat the system on a low budget.

I'd argue that this is the "real" Moneyball story, the one we were theoretically waiting for.

The original story, about the 2002 Oakland A's, isn't that impressive to me. Sure, the 2002 Oakland A's won 103 games on a low budget, but we kind of know how they did it. Yes, they used sabermetrics, but those gains were marginal. Most of their advantage was having a supply of excellent, pre-free-agent players who came cheap, as well as a large dose of luck. (I don't have public luck estimates handy for 2002, but I once figured the A's were lucky by 12 wins that year. Seven of those were from beating their Pythagorean Projection.)
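For readers who haven't seen it, the Pythagorean projection estimates a team's wins from its runs scored and runs allowed, and "beating" it means winning more games than that estimate. Here is a minimal sketch, assuming the classic exponent of 2 and a 162-game season. The run totals and win total below are made up for illustration; they are not the A's actual 2002 figures, and this is not necessarily the exact version I used for my estimate.

```python
def pythagorean_wins(runs_scored: float, runs_allowed: float,
                     games: int = 162, exponent: float = 2.0) -> float:
    """Expected wins from runs scored and allowed (Pythagorean projection).

    Assumption: the classic exponent of 2; other versions use values near 1.83.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


# Hypothetical team, NOT the 2002 A's: illustrative numbers only.
expected = pythagorean_wins(runs_scored=750, runs_allowed=650)
actual_wins = 98  # hypothetical

print(f"Projected wins: {expected:.1f}")
print(f"Wins above projection: {actual_wins - expected:+.1f}")
```

The gap between actual and projected wins is the part usually credited to luck in the timing of runs, which is why it counts toward a team's luck estimate.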

This Leicester City story is different. This is a legitimately bad team, picking up three "free agents" who were legitimately undervalued and overlooked, and then implementing a system that effectively overcame the skill advantage of some of the best and most expensive football talent on the planet.

Even if only half of Leicester City's improvement was Moneyball, and the other half was luck ... well, even then, the Foxes of Leicester City created millions of pounds worth of wins out of basically nothing.

Could that really be what happened?

-----
UPDATE: James Yorke on Twitter has pointed out that two of the three players I mentioned have been with Leicester more than one season. Jamie Vardy was actually signed in 2012, and Mahrez in 2014. Only Kanté was new for 2015-16.

Mr. Yorke also points out that other, lesser players were signed over the past few years, too.

So, Kanté was the main signing in the most recent off-season. This suggests that most of Leicester's success came from #2 through #4, with less of it coming from #1 (which was mostly Kanté).

Labels: football, football -- but English football that's actually soccer and not NFL or CFL or aussie rules, Leicester City, luck, Moneyball, Premier League, soccer
