Wednesday, August 12, 2009

Testing Tim Donaghy's allegations of NBA playoff manipulation

Here's another academic study trying to find bias in the NBA. From the latest issue of JQAS (the Journal of Quantitative Analysis in Sports), it's called "Testing For Bias and Manipulation in the National Basketball Association Playoffs," by Timothy Zimmer and Todd H. Kuethe.

Last year, Tim Donaghy, the NBA referee convicted of betting on games, suggested that there is a conspiracy between the referees and the NBA. The league, Donaghy alleged, wants large-market teams to advance in the playoffs, and it wants series to go the maximum number of games to maximize excitement and TV revenue. He accused certain referees, "company men," of calling the critical games differently to try to achieve the league's desired result.

In this study, Zimmer and Kuethe attempt to look at the evidence for Donaghy's charge. Is there a "big city" bias in playoff series? And do the underdog teams have an increased chance of winning games when they are behind in the series?

The authors ran a regression, trying to predict margin of victory based on (a) what game in the series it was, (b) the difference in conference seed position between the two teams (so that the #2 team playing the #7 team would be a 5-seed difference), (c) which team was at home, and (d) a couple of other factors that proved unimportant.

They ran the regression only on the first three rounds of the playoffs; they ignored the finals, due to concerns that "seeding" doesn't really make sense when the teams are from different conferences. The regression covered 2003 to 2008; I have no idea why they chose to use only six seasons, when there's so much more data available and they could have got much more reliable results.

Anyway, the results were that the stronger team's margin of victory is roughly:

-4.94 points
plus 1.55 points * the seed difference
plus 10.19 points if they're at home
plus 0.02 points for every extra 100,000 population
minus 2.67 points if it's game 1
minus 1.48 points if it's game 2
minus 4.68 points if it's game 3
minus 0.00 points if it's game 4 (the reference category)
minus 4.04 points if it's game 5
minus 0.25 points if it's game 6
minus 1.59 points if it's game 7
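
Plugging in numbers makes the fitted equation concrete. Here's a minimal sketch that evaluates it using the coefficients above; the matchup in the example is entirely hypothetical:

```python
# Coefficients from the regression table above.
INTERCEPT = -4.94
PER_SEED_DIFF = 1.55
HOME = 10.19
PER_100K_POP = 0.02
GAME = {1: -2.67, 2: -1.48, 3: -4.68, 4: 0.00, 5: -4.04, 6: -0.25, 7: -1.59}

def predicted_margin(seed_diff, favorite_at_home, pop_diff_100k, game):
    """Predicted margin of victory for the stronger (better-seeded) team."""
    return (INTERCEPT
            + PER_SEED_DIFF * seed_diff
            + (HOME if favorite_at_home else 0.0)
            + PER_100K_POP * pop_diff_100k
            + GAME[game])

# Hypothetical matchup: #2 seed vs. #7 seed (seed difference 5), favorite at
# home, favorite's market bigger by 2 million people (20 units of 100,000).
# Game 3: -4.94 + 7.75 + 10.19 + 0.40 - 4.68 = 8.72
print(round(predicted_margin(5, True, 20, 3), 2))
```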

The seed and the home-court advantage were significant, as you would expect. But so was the population difference (about 2 SDs), and the Game 3 effect (2.6 SDs).

The authors conclude that there is some evidence for Donaghy's claims; the large-market teams have an advantage, and the significantly increased performance of the underdog in Game 3 shows something funny is going on there.

I don't think that's right, in either case.

First, for population: Zimmer and Kuethe figured the difference in team quality only by standings position; so the #1 team facing the #8 team was scored as 7, no matter how good those teams actually were. But isn't it possible that when a #1 team is a big-market team, they're a better team than a typical #1 team from a smaller market? It seems likely to me. The authors acknowledge that big cities might have better teams than small cities, but they argue that

"If large-market teams attract better players, either through pay or lifestyle, the regular-season winning percentage will reflect this disparity in pay."

Yes, but not all of it -- and the authors don't use winning percentage anyway; they use standings position. When the highest-paid resident of your street is a CEO, he's probably going to make more money than if the highest-paid resident of your street is just a professor. Looking at the ranking captures a lot of the information about salary, or team quality, but not all of it. I'd bet that what's being measured by the regression is just that leftover: that #1 teams from large markets are just better than #1 teams from small markets.

I haven't done any work to prove that. But still, I wonder why the authors chose standings rank instead of season wins, when wins is just as easy to collect and would likely be more accurate.
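
A toy simulation (entirely my own construction, not the paper's data) shows how that leftover can appear. Suppose team quality is random skill plus a small boost for big-market teams, and teams are then seeded by quality:

```python
import random
import statistics

random.seed(1)
# Toy model, my own assumption: each "conference" has 8 playoff teams;
# a team's quality is random skill plus a small big-market boost.
CONFERENCES = 5000
MARKET_BOOST = 0.5

big = {r: [] for r in range(8)}
small = {r: [] for r in range(8)}
for _ in range(CONFERENCES):
    teams = []
    for _ in range(8):
        is_big = random.random() < 0.5
        quality = random.gauss(0.0, 1.0) + (MARKET_BOOST if is_big else 0.0)
        teams.append((quality, is_big))
    teams.sort(reverse=True)  # seed by quality, best team first
    for rank, (quality, is_big) in enumerate(teams):
        (big if is_big else small)[rank].append(quality)

# Even at the SAME seed position, big-market teams are better on average:
for rank in (0, 3, 7):
    print(rank + 1,
          round(statistics.mean(big[rank]), 2),
          round(statistics.mean(small[rank]), 2))
```

In this toy world, a regression that controls only for seed rank would attribute the leftover quality to market size, which is exactly the pattern the paper reports.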

Also, I'm not sure if it's reasonable to calculate the standard error the way the authors did, as if every observation is independent. Suppose one large-market team has an unlucky season, finishing (say) fifth in its conference when its talent was really good enough for second. If that team goes tearing through three playoff rounds, winning all three as the apparent underdog, those results are certainly not independent. And so the SE of the "population" coefficient is understated; since it was only 2 standard deviations from zero anyway, it's very likely that it's no longer significant once you adjust for the fact that teams with "inaccurate" seedings would be more likely to appear in subsequent rounds.

So, I don't think the population results mean much. What about the significant Game 3 result?

First, the result of a 4.68 point differential isn't relative to every other game; it's only relative to Game 4. That is, the underdog performed 4.68 points better in Game 3 than in Game 4. Is that consistent with referee bias? I suppose it could be; when the favorite goes 2-0, the referees try to have them lose Game 3, for a longer series. But why not have them lose Game 4 instead? If the idea is to prolong the series without affecting who wins, going from 3-0 to 3-1 is much safer than from 2-0 to 2-1.

But you can't predict the NBA's methods of cheating, I suppose, so let's assume that they do shoot for a Game 3 underdog win. Still, the favorite is going to win some of those games anyway, and go 3-0. Wouldn't the NBA want to see the underdog win Game 4, then? You'd think so: but Game 4 actually shows the *best* performance by the favorite; every other game coefficient is negative, meaning the favorite loses points in those games relative to the fourth game. So why would that be? Why would the underdog do best in Game 3, but worst in Game 4, if the NBA is trying to orchestrate a longer series? That doesn't make a lot of sense to me.

Second: If you take the results as presented, Game 3 is the only one of the six games that shows statistically significant results. But it's only 2.6 SDs away. That's significant at almost exactly the 1% level (assuming a two-tailed test, 2.6 SDs on either end of the curve). The chance that at least one of six variables would show that kind of 1% significance is ... about 6%. So, really, unless you had good reason to suspect Game 3 in the first place, the result isn't really significant enough by the typical 5% standard for these sorts of things.
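
Those two figures are easy to verify with a quick back-of-envelope check, using only the standard library:

```python
import math

# Two-tailed p-value for a coefficient 2.6 SDs from zero,
# using the normal CDF written in terms of erf.
p = 2 * (1 - 0.5 * (1 + math.erf(2.6 / math.sqrt(2))))
print(round(p, 4))  # roughly 0.009, i.e. about the 1% level

# Chance that at least one of six game coefficients clears that bar by
# luck alone (treating them, roughly, as independent):
print(round(1 - (1 - p) ** 6, 3))  # roughly 0.055, i.e. about 6%
```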

It's even less significant when you look a little deeper. Game 3 is only significant when compared to Game 4: and Game 4 just happens to be the most extreme observation in the other direction! So you're looking at the difference of the two extremes out of seven.

The chance that the *most positive* of seven normal variables will be more than 2 SDs (of itself) away from the *least positive* of seven normal variables is pretty high. There are actually 21 pairs of the seven variables; if every pair has a 1% chance of showing a result, then, even though the 21 pairs aren't independent, on average you'll find 0.21 apparently significant results per study. That is: if you ran the experiment 100 times, with different sets of data, you'd expect to find about 21 significant results in total. It's therefore not all that surprising -- and certainly not statistically significant at any reasonable level -- when this study finds exactly 1.
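
A small Monte Carlo sketch of that expected count (my own setup: seven game effects drawn as independent unit normals, with a pair counted as "significant" when it differs by 2.6 SDs of a pairwise difference, whose SD is the square root of 2):

```python
import itertools
import random

random.seed(0)
TRIALS = 200_000
# 2.6 SDs of the difference of two independent unit normals (SD = sqrt(2)).
THRESHOLD = 2.6 * 2 ** 0.5

significant_pairs = 0
for _ in range(TRIALS):
    effects = [random.gauss(0.0, 1.0) for _ in range(7)]
    significant_pairs += sum(
        1 for a, b in itertools.combinations(effects, 2)
        if abs(a - b) > THRESHOLD)

print(significant_pairs / TRIALS)  # averages about 0.2 significant pairs per run
```

The count comes out around 0.2 significant-looking pairs per simulated "study," matching the 21-pairs-times-1% back-of-envelope figure.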

It *looks* significant, sure, but that's because the authors of the study happened, luckily, to have randomly chosen Game 4 as their reference point. Had they chosen, say, Game 1, they would have found no significant-looking effect at all.

So, in summary:

-- the population effect is probably (in my judgment) due to the study not adjusting for the fact that big-market teams are better than small-market teams;

-- even if that turns out not to be the case, the effect found is probably not significant anyway, due to underestimation of the SE;

-- the Game 3 effect is significant ONLY when compared to Game 4;

-- there are 21 possible significant game vs. game effects, so the fact that exactly one of those 21 was found to be 2.6 SDs away from zero is not a very low-probability event;

-- the observation that Game 3 most favors the underdog and Game 4 most favors the favorite does not, on its face, appear to be very consistent with Donaghy's conspiracy theory.

So I don't think there's much here at all. Of course, the authors only analyzed six seasons, so there's lots more data if someone wants to investigate further.

UPDATE: had the numbers wrong in the table above. Now fixed.



At Wednesday, August 12, 2009 1:01:00 PM, Anonymous Ryan J. Parker said...

Enjoyed the post Phil. I'd definitely be interested in any future posts on properly estimating the SE (instead of underestimating it). :)

At Wednesday, August 12, 2009 1:19:00 PM, Blogger Andy said...

I ran a regression on the same playoff data, with the following differences:
- Used win difference instead of seed difference.
- Used variables to measure whether the game was potentially a deciding game (except game 7) for either the high seed or low seed.

The only ones that were significant were win difference and home advantage.

At Wednesday, August 12, 2009 3:30:00 PM, Blogger Phil Birnbaum said...

Ryan: I'm not sure how to estimate it ... it would be difficult and you'd have to make some assumptions. I'd maybe try to get independence instead, perhaps by taking one round for 12 years instead of 3 rounds for 6 years. The Lakers aren't truly independent from 2003 to 2004, but much more so than from the 1st round of 2003 to the 2nd round of 2003.

Andy: excellent! I would have thought that would be what you'd find ...

