Sunday, June 01, 2014

Team fatigue and the NBA playoffs

(UPDATE: after I posted this, I discovered that my results weren't as strong as I originally thought.  I've edited to emphasize that.)

Is fatigue a factor affecting NBA playoff success?  Yes, it's a huge factor, a FiveThirtyEight article concluded last month.

Looking at the past 11 post-seasons, Nate Silver's study found that the more games it took a team to win its first-round series, the worse it did compared to expectations in the second round. That was after carefully accounting for team talent and home court advantage.

Teams that won the first round in four games beat second round expectations by 3 points per game. Teams that took seven games underperformed by 5.7 points. Here's their graph that illustrates it best:

But ... I think part of what the study found is a statistical artifact. I think there is a mathematical reason you will *always* get a certain amount of that kind of effect, in any sport, regardless of scheduling and fatigue.


To keep things simple, I'm going to use wins for my examples, instead of points. Also, I'm going to ignore home court advantage. (The argument would still work otherwise; it would just make the math too complicated.)

So: let's suppose you have a best-of-seven series, and the favorite (Team "F") has an independent .667 probability of winning each game. So, the probability that F wins Game 1 is .667. The probability that F wins Game 2 is .667. And so on, so that the probability that F wins Game 7 (if necessary) is also .667.

What is team F's expected winning percentage for the series?

It's not .667. 

And that's the key, understanding why it's not .667. 


Suppose that you have a large number of Team Fs, and, true to form, they win exactly the expected number of games. That means their overall winning percentage is .667. 

But what's the average winning percentage of their *series*?

It would be the same .667, if every series goes like this:

4-2    .667
4-2    .667
4-2    .667
avg    .667

It would also be .667 if all series were four straight:

4-0   1.000
4-0   1.000
0-4    .000
avg    .667

But, for most other combinations, and certainly the ones a favorite tends to produce, the average comes out *more* than .667. For instance, this:

4-3    .571
4-1    .800
4-2    .667
avg    .679

Team F still won .667 of its games -- going 12-6 -- but the average of the three series is .679. Here's another one:

4-2    .667
1-4    .200
4-0   1.000
4-0   1.000
3-4    .429
4-0   1.000
4-2    .667
avg    .709

This time, Team F's series average is .709, even though it still wins games at the expected .667 clip (24-12). 
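If you want to check those two examples yourself, here's a quick sketch in Python. The function name `pooled_and_series_avg` is just something I made up for illustration; it takes a list of (wins, losses) pairs and returns the winning percentage computed both ways.

```python
def pooled_and_series_avg(series):
    """series: list of (wins, losses) tuples for Team F."""
    wins = sum(w for w, l in series)
    games = sum(w + l for w, l in series)
    pooled = wins / games                       # every game weighted equally
    per_series = sum(w / (w + l) for w, l in series) / len(series)  # every series weighted equally
    return pooled, per_series

three = [(4, 3), (4, 1), (4, 2)]
seven = [(4, 2), (1, 4), (4, 0), (4, 0), (3, 4), (4, 0), (4, 2)]
for name, s in [("three series", three), ("seven series", seven)]:
    pooled, per_series = pooled_and_series_avg(s)
    print(f"{name}: by games {pooled:.3f}, by series {per_series:.3f}")
```

Both sets come out to .667 by games, but .679 and .709 by series.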

What's going on?  Well, when you look at the overall winning percentage by summing games, you're treating all *games* equally. But when you look at the overall percentage by averaging the individual series, you're treating all *series* equally.

But all series aren't equal. They have different numbers of games. If you're giving a four-game series the same weight as a seven-game series, you're weighting each of the "four-game" games higher than each of the "seven-game" games. (75% higher, in fact.)

The shorter the series, the more you're overweighting individual games. 

Now, here's the key: the shorter the series, the higher the expected winning percentage. Why is that?  Because of the heavy favorite.

If a series goes 7 games, either the favorite went 4-3 (.571) or 3-4 (.429). Obviously, 4-3 is more likely. In our example, if you do the arithmetic, it comes out exactly twice as likely. That means for series that go 7 games, the favorite's expected winning percentage is .524. (That's the average of two .571s and one .429.)

If a series goes 4 games, it could be 4-0 (1.000) or 0-4 (.000). Again, 4-0 is more likely. But this time it's much, much more likely -- *sixteen* times as likely. So, in sweeps, the favorite's expected winning percentage is .941. (That's the average of sixteen 1.000s and one .000.)

The difference makes sense intuitively. The underdog might have a decent chance to beat the favorite in a close series, but to beat them four games straight?  Not likely.
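Here's a rough sketch of that arithmetic, for anyone who wants to try other series lengths or other favorites. It assumes, as above, independent games with the same probability each time; the function name `length_conditional_pct` is my own invention.

```python
from math import comb

def length_conditional_pct(p, n_games, target=4):
    """Favorite's expected winning pct, given the series lasts n_games."""
    q = 1 - p
    k = n_games - target                  # games won by the team that loses the series
    ways = comb(n_games - 1, k)           # the series winner must win the last game
    fav_wins = ways * p**target * q**k    # favorite takes the series, target-k
    dog_wins = ways * q**target * p**k    # underdog takes the series, target-k
    total = fav_wins + dog_wins
    # favorite's pct is target/n when it wins the series, k/n when it loses
    return (fav_wins * target + dog_wins * k) / (n_games * total)

p = 2 / 3
for n in (4, 5, 6, 7):
    print(n, round(length_conditional_pct(p, n), 3))
```

For the .667 favorite, the four-game line comes out to .941 and the seven-game line to .524, matching the numbers above.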


So:

1. shorter series are overweighted
2. shorter series have higher winning percentages

Therefore:

3. higher winning percentages are overweighted

And, that's why the expected winning percentage for a series is higher than the expected winning percentage for games.


Here's a second way to look at it. Suppose I give you five wins and five losses, and ask you to split them into series, any size you like, so that you win the highest percentage of series. Here's what you'll do:

Series 1: W
Series 2: W
Series 3: W
Series 4: W
Series 5: W
Series 6: LLLLL

You went 5-5 in games, but 5-1 in series. The average winning percentage in games is .500. The average winning percentage in series is .833. (Five series of 1.000, and one series of .000).

They don't match. And, the reason they don't match is that you stuck the losses together, so all the losses can only "ruin" one series. That is: there's a high variance of losses between series.

It's the same in the NBA. The losses aren't always evenly distributed from one series to another. If all the series go 4-2, everything matches. If there's any deviation from that, even randomly, the losses are distributed with higher variance, which means the losses cluster, which means they "ruin" fewer series, which means you get a higher winning percentage.


Here's a third way to prove it: just do the math. Here are all the possible series outcomes, with the binomial probability of occurrence, and the resulting winning percentage:

result  prob   pct
4-0    .198  1.000
4-1    .263   .800
4-2    .219   .667
4-3    .146   .571
3-4    .073   .429
2-4    .055   .333
1-4    .033   .200
0-4    .012   .000

Out of 1000 series, Team F will go 4-0 in 198 of them, obtaining a 1.000 winning percentage. They'll go 4-1 in 263 of them, obtaining an .800 winning percentage. And so on.

If you average out the 1,000 results, you get .694. That is: when a team goes into a series with a .667 probability of winning each game, its expected winning percentage *for the series* is .694.
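The table and the .694 can be reproduced with a few lines of Python. This is only a sketch under the same assumptions (independent games, no home court), and `series_outcomes` is a name I made up. The rows print in a different order than the table, but the numbers should be the same.

```python
from math import comb

def series_outcomes(p, target=4):
    """Yield (fav_wins, fav_losses, probability) for a first-to-`target` series."""
    q = 1 - p
    for k in range(target):                 # k = games won by the series loser
        ways = comb(target - 1 + k, k)      # orderings with the clinching win last
        yield target, k, ways * p**target * q**k   # favorite wins the series
        yield k, target, ways * q**target * p**k   # favorite loses the series

p = 2 / 3
expected_series_pct = 0.0
for w, l, prob in series_outcomes(p):
    print(f"{w}-{l}   {prob:.3f}   {w / (w + l):.3f}")
    expected_series_pct += prob * w / (w + l)
print(f"expected series pct: {expected_series_pct:.3f}")
```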


The difference between the expected series percentage (.694) and the expected game percentage (.667) is about .028, once you carry the extra decimals (.6944 minus .6667). Let's call that the "discrepancy".

When the favorite is only .500, the discrepancy is zero, since it's no longer true that a short series has a higher winning percentage than a long series.

When the favorite is close to 1.000, the discrepancy is also zero, since every series is the exact same 4-0. 

Since the discrepancy goes from zero to .028 and back to zero, there must be a peak somewhere in between. I ran a simulation, and, not surprisingly, the peak turns out to be almost exactly halfway between, at about .750. Here are some winning percentages (for the favorite) and the resulting discrepancy:

wpct discrepancy
.500  .000
.550  .010
.600  .020
.650  .027
.667  .028
.700  .030
.750  .032
.800  .030
.850  .025

So, roughly speaking, the closer the favorite is to .750, the higher the discrepancy. In practical terms, the stronger the favorites, the more they will appear to outperform.
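You don't actually need a simulation for this; the discrepancy can be computed exactly from the series probabilities. Here's a sketch (the function name `discrepancy` and the grid search are my own choices, not anything from the FiveThirtyEight study). Because it's exact rather than simulated, a value here and there may differ from the table above in the third decimal.

```python
from math import comb

def discrepancy(p, target=4):
    """Expected series winning pct minus per-game winning pct, for the favorite."""
    q = 1 - p
    exp_pct = 0.0
    for k in range(target):                 # k = games won by the series loser
        ways = comb(target - 1 + k, k)
        n = target + k
        exp_pct += ways * p**target * q**k * target / n   # favorite wins series
        exp_pct += ways * q**target * p**k * k / n        # favorite loses series
    return exp_pct - p

for p in (0.5, 0.55, 0.6, 0.65, 2 / 3, 0.7, 0.75, 0.8, 0.85):
    print(f"{p:.3f}  {discrepancy(p):.3f}")

# crude grid search for the peak, in steps of .001
grid = [i / 1000 for i in range(500, 1001)]
peak = max(grid, key=discrepancy)
print("peak near", peak)
```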


Can that explain the FiveThirtyEight results?

Recall what the FiveThirtyEight study found: that teams who won the first round in four games did better in the second round relative to expectations. That is: they had a higher discrepancy.

That fits. Teams that won the first round 4-0 are likely to be stronger teams than teams that won 4-3. In fact, teams that won 4-3 might even be underdogs. That would explain why the 4-0 teams came out "too high," and the 4-3 teams came out "too low".

The FiveThirtyEight study used point differential, rather than wins. Let me try to rejig the example to be in points, so we can compare results. 

Assume that when a .667 team wins, it wins by 10 points on average. When it loses, it's by 5 points. That works out to +5 points per game overall (the average of +10, +10, and -5). Those +5 points correspond to a .667 talent. (The rule of thumb is around 30 points per win, so 5 points above average results in .167 wins above average.)

When the favorite goes 4-0, it winds up at +10 points per game. When it goes 4-1, it's +7 points per game (average of four +10s and one -5). And so on. Here's the chart:

result  prob  pts/g
4-0    .198    +10
4-1    .263    +7
4-2    .219    +5
4-3    .146    +3.6
3-4    .073    +1.4
2-4    .055     0
1-4    .033    -2
0-4    .012    -5
average        +5.4 

Under our assumptions, strong favorites would show 0.4 points better than their talent in a best-of-seven series.

(UPDATE: I originally said 5.4 points better, because I forgot to subtract off the 5 points we started with.  Oops!)

How does that compare to the real study? FiveThirtyEight found that teams going 4-0 in the first round overperformed by 3 points ... and teams going 4-3 in the first round underperformed by 5.7 points.

Hmmm ... not very close!  

Maybe it's our back-of-the-envelope assumptions: (a) that 4-0 teams would be favored by +5 in the second round, and (b) that the +10/-5 breakdown for wins and losses is reasonable. 

What if we assume it's +12 when they win, and -9 when they lose?  Then, it shows 0.6 points better -- not much difference.
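Here's a sketch of the points version of the calculation, so you can plug in your own margins. It assumes, as I did above, that every win is by exactly the same margin and every loss by exactly the same margin, which is obviously a simplification. The name `expected_margin` is made up for this example.

```python
from math import comb

def expected_margin(p, win_margin, loss_margin, target=4):
    """Average, over series, of the favorite's points per game."""
    q = 1 - p
    total = 0.0
    for k in range(target):                 # k = games won by the series loser
        ways = comb(target - 1 + k, k)
        n = target + k
        # favorite wins the series: `target` wins, k losses
        total += ways * p**target * q**k * (target * win_margin - k * loss_margin) / n
        # favorite loses the series: k wins, `target` losses
        total += ways * q**target * p**k * (k * win_margin - target * loss_margin) / n
    return total

p = 2 / 3
for win_margin, loss_margin in [(10, 5), (12, 9)]:
    talent = p * win_margin - (1 - p) * loss_margin    # per-game expectation
    series_avg = expected_margin(p, win_margin, loss_margin)
    print(f"+{win_margin}/-{loss_margin}: talent {talent:+.1f}, "
          f"series average {series_avg:+.1f}, gap {series_avg - talent:+.1f}")
```

With +10/-5 the gap is about 0.4 points, and with +12/-9 it's about 0.6.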

I thought the effect would be bigger.  Maybe not.  


In any case, it's easy to back out this effect to see if any "fatigue" factor remains.  All you have to do is: for that chart, where there's one circle for every series ... just change it so there's one circle for every *game*. I'm betting that if you do that, the sloped line becomes closer to horizontal.  But, it appears, maybe not that much closer.


Here's something else that's probably part of the effect. 

The study based the second round expectations on regular season point differentials.  But, a team's first round performance gives you additional information about the team.

Suppose a team goes 4-0, with an average +10 point differential per game. If you combine that with a regular season +5 rating, you get a new rating of 5.23 points.  

In footnote #4, the article points out that a playoff game gives you two to three times as much information about future performance as a regular season game. If we use "twice," the +5 rating becomes 5.44.  If we use "three times," it becomes 5.64.
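Those updated ratings are just weighted averages. The sketch below assumes an 82-game regular season with each regular season game getting a weight of 1, which is the assumption that reproduces the numbers above; the function name and parameters are mine, not the article's.

```python
def blended_rating(season_rating, playoff_margin, playoff_games,
                   playoff_weight=1.0, season_games=82):
    """Games-weighted average of regular season rating and playoff margin."""
    season_w = season_games                     # assumed: 82 games, weight 1 each
    playoff_w = playoff_games * playoff_weight  # each playoff game counts extra
    return (season_rating * season_w + playoff_margin * playoff_w) / (season_w + playoff_w)

# a +5 team that sweeps 4-0 by an average of 10 points per game
for weight in (1, 2, 3):
    print(weight, round(blended_rating(5, 10, 4, weight), 2))
```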

So, we have: +0.4 points for the "weighting by series" anomaly, and another 0.5 points or so for the added information the 4-0 gives us.  We're up to +0.9 of the observed +3 points.

Still, that leaves 2.1 points for which fatigue is still the only explanation on the table. 


(More to come.)
