NCAA point-shaving study convincingly debunked
There have been a couple of Justin Wolfers papers in the news recently. There was the “racial bias in the NBA” study a few months ago (co-authored with Joseph Price), which I reviewed here. And, last May, there was an article on NCAA point shaving, which I’ll talk about now.
Wolfers analyzed the outcomes and betting lines for over 70,000 NCAA Division I games. He found that, overall, the favorite covered the spread almost exactly half the time (50.01%), exactly as you would expect. But heavy favorites (defined as a spread of more than 12 points) covered only 48.37% of the time.
He then found that, for heavily favored teams, the results weren’t symmetrical. That is, teams favored by, say, 4.5 points would win by 0-4 almost exactly as often as by 5-8. But for large spreads, like 15.5 points, teams won by 0-15 a lot more often than by 16-31.
The difference is about 6% of those games, which we can call the “Wolfers Discrepancy” for his dataset.
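To make the definition concrete, here is a minimal sketch of how that discrepancy could be computed from a list of games. The function name, the (spread, margin) data layout, the exact bin boundaries, and the toy games are all my own assumptions for illustration, not anything taken from Wolfers' paper or his data.

```python
def wolfers_discrepancy(games, min_spread=12.0):
    """Share of heavy-favorite games won without covering, minus the share
    landing in the mirror-image range above the spread.

    games: iterable of (spread, margin) pairs, both from the favorite's
    point of view (hypothetical layout, assumed for this sketch).
    """
    heavy = [(s, m) for s, m in games if s >= min_spread]
    if not heavy:
        raise ValueError("no games at or above min_spread")
    # Favorite won, but by less than the spread.
    won_no_cover = sum(1 for s, m in heavy if 0 < m < s)
    # Comparison range: covered, but by less than twice the spread.
    comparison = sum(1 for s, m in heavy if s < m < 2 * s)
    return (won_no_cover - comparison) / len(heavy)


# Made-up games, purely to show the mechanics.
toy_games = [(14.5, 7), (14.5, 10), (14.5, 20), (13.0, 30), (15.5, -3), (6.0, 2)]
print(wolfers_discrepancy(toy_games))
```

In the toy list, five games qualify as heavy favorites; two fall short of the spread and one lands in the comparison range, so the function returns (2 - 1) / 5.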
Wolfers concludes that the 6% figure is evidence of cheating – that point shaving caused 3% of potential covers to become non-covers. (Moving 3% of games from one side of the spread to the other opens a gap of about 6 percentage points between the two sides.) Since only half of games can be shaved (games where the favorite is leading), that means that the proportion of corrupt games was double that. Conclusion: the fix was in for about 6% of 12-point-plus-spread games.
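The arithmetic behind that conclusion can be sketched in a few lines. This assumes, as the symmetry argument does, that the two ranges would be equal without shaving, and it plugs in the 46.2% and 40.7% figures quoted in this post. The function and parameter names are mine.

```python
def implied_corruption(won_no_cover, comparison, shaveable_share=0.5):
    """Back out the implied share of fixed games from the two bin shares.

    Assumes the bins would be equal absent shaving, so each shaved game
    leaves the comparison bin and enters the won-but-did-not-cover bin.
    """
    gap = won_no_cover - comparison
    shifted = gap / 2                    # one shifted game changes the gap by two
    corrupt = shifted / shaveable_share  # only some games offer a chance to shave
    return gap, shifted, corrupt


gap, shifted, corrupt = implied_corruption(0.462, 0.407)
print(f"gap={gap:.3f}  shifted={shifted:.4f}  implied corrupt share={corrupt:.3f}")
```

With those inputs the gap is 5.5 percentage points, which is where the "about 6%" (and the "about 3%" of shifted games) comes from.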
This study, and the 6% figure, got a lot of publicity.
“Among teams favored to win by 12 points or more, 46.2% of teams won but did not cover, while 40.7% were in the comparison range of outcomes.”
Not receiving anywhere near as much publicity is a rebuttal paper by Dan Bernhardt and Steven Heston. And that’s too bad, because it’s a great study, and it thoroughly debunks Wolfers’ conclusions. These guys nail it.
Bernhardt and Heston argue that there’s no reason to expect the symmetry that Wolfers assumes. For instance,
"Consider a 14 point favorite that is up by only seven points with five minutes to play. To maximize its chances of winning, the favorite will “manage the clock,” holding the ball to reduce the number of opportunities that the other team has to score ... In contrast, the same favorite up by 21 is sure to win, and has no need to manage the clock, raising the expected increment in winning margin, generating an asymmetric distribution in winning margins."
That’s very plausible, and the authors back it up with some clever analysis. They argue that if the Wolfers Discrepancy is caused by cheating, then it should be much higher than 6% in games that are more likely to be corrupt, and less than 6% in games that are less likely to be corrupt. And so they check – four different ways.
First, they note that a fixed game will attract heavy betting on the underdog, and all that money will move the betting line to reduce the spread. So they split games into two groups: games where the line moved the “wrong” way, towards the favorite, and all other games. It turns out that the Wolfers Discrepancy is almost exactly the same between the two groups, suggesting that it’s a natural part of the scoring distribution rather than an artifact of corruption.
Second, they note that if game fixing happens, it happens in games where the final score is very close to the spread. So, for a 14-point spread, instead of comparing finals of 1-14 points to finals of 15-28 points, they compare 8-14 to 15-21, to narrow the range. Now, the number of corrupt games in these smaller samples should be equal to the number in the bigger sample. But the denominator, the number of total games, is smaller. Therefore, the percentage of apparently corrupt games – the Wolfers Discrepancy -- should rise.
It didn’t. In fact, it *dropped*, by more than two-thirds.
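Here is a rough sketch of why the point-shaving story predicts a rise in that second test. The game counts are invented for illustration (they are not from either paper), and I take the denominator to be the games inside the window, following the description above.

```python
def discrepancy_in_window(per_side_clean, shaved):
    """Discrepancy inside a window around the spread.

    per_side_clean: games on each side of the spread with no shaving
                    (hypothetical count; symmetric by assumption).
    shaved: fixed games that moved from just above the spread to just below.
    """
    below = per_side_clean + shaved
    above = per_side_clean - shaved
    return (below - above) / (below + above)


# Same 30 hypothetical shaved games, counted in a wide and a narrow window.
wide = discrepancy_in_window(per_side_clean=500, shaved=30)    # 60 / 1000
narrow = discrepancy_in_window(per_side_clean=200, shaved=30)  # 60 / 400
print(f"wide window: {wide:.1%}, narrow window: {narrow:.1%}")
```

If shaving were driving the effect, the narrow window should look like the second number: same excess, smaller denominator. A discrepancy that shrinks instead is hard to square with that story.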
As a third test, they check games where the line did move towards the underdog, games where corruption is plausible. You’d expect, in those games, that the more the line moved, the more likely the game is fixed, and so the larger the Wolfers Discrepancy for those games.
The result -- nothing statistically significant:
5.61% -- line moved towards the favorite
6.64% -- line didn’t move
7.17% -- line moved towards the underdog by 0 - 0.5 points
8.43% -- line moved towards the underdog by 0 - 1 point
6.83% -- line moved towards the underdog by 0 - 1.5 points.
Finally, they analyzed games where there was no betting line, which usually happens when there isn’t enough interest in the game for the sports books to bother. (For those games, Bernhardt and Heston had to estimate a spread using team strength ratings.)
If there is no betting, there’s no incentive to shave points. And so, if the Wolfers Discrepancy is really caused by corruption, it should be zero for those games.
But it wasn’t zero. In fact, it was 6% -- almost exactly the same as Wolfers found in his own study!
Truly outstanding work by Bernhardt and Heston. They took a statistical effect that Wolfers claimed showed corruption, and proved, four different ways, that the effect exists even when corruption is unlikely. In addition, they provided a plausible explanation of what the effect might be.
The NCAA should buy these guys a beer.
Hat tip: Zubin Jelveh