Sunday, September 22, 2013

Selective sampling could explain point-shaving "evidence"

Remember, a few years ago, when a couple of studies came out that claimed to have found evidence for point shaving in NCAA basketball?  There was one by Jonathan Gibbs (which I reviewed here), and another by Justin Wolfers (.pdf).  I also reviewed a study, from Dan Bernhardt and Steven Heston, that disagreed.  

Here's a picture stolen from Wolfers' study that illustrates his evidence.



It's the distribution of winning margins, relative to the point spread.  The top is teams that were favored by 12 points or less, and the bottom is teams that were favored by 12.5 points or more.  The top one is roughly as expected, but the bottom one is shifted to the left of zero.  That means heavy favorites do worse than expected, based on the betting line.  And, heavy favorites have the most incentive to shave points, because they can do so while still winning the game.

After quantifying the leftward shift, Wolfers argues,
"These data suggest that point shaving may be quite widespread, with an indicative, albeit rough, estimate suggesting that around 6 percent of strong favorites have been willing to manipulate their performance."

But ... I think it's all just an artifact of selective sampling.

Bookmakers aren't always trying to be perfectly accurate in their handicapping.  They may have to shade the line to get the betting equal on both sides, in order to minimize their risk.

It seems plausible to me that the shading is more likely to be in the direction consistent with the results -- making the favorites less attractive to bet.  Heavy favorites are better teams, and better teams have more fans and followers who, presumably, would want to bet that side.

I don't know whether that's actually true, but it doesn't need to be.  Even if the shading is just as likely to go towards the underdog side as the favorite side, we'd still get a selective-sampling effect.

Suppose the bookies always shade the line by half a point, in a random direction.  And, suppose we do what Wolfers did, and look at games where a team is favored by 12 points or more.  

What happens?  Well, that sample includes every team with a "true talent" of 12 points or more -- with one exception.  It doesn't include 12-point teams where the bookies shaded down (for whom they set the line at 11.5).

However, the sample DOES include the set of teams the bookies shaded *up* -- 11.5-point teams the bookies rated at 12.

Therefore, in the entire sample of favorites, you're looking at more "shaded up" lines than "shaded down" lines.  That means the favorites, overall, are a little bit worse than the line suggests.  And that's why they cover less than half the time.  

You don't need point shaving for this to happen.  You just need the bookies' lines to be sufficiently inaccurate.  That's true even if the inaccuracy is on purpose, and -- most importantly -- even if the inaccuracy is as likely to go one way as the other.
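To see the asymmetry concretely, here's a toy tally under the half-point-shading assumption above.  The talent levels and game counts are made up (with more games at the lower talents, as you'd expect); the code just counts how many games surviving the 12-point cutoff carry a line shaded above, versus below, the team's true talent.

```python
# Toy illustration of the selection effect: bookies shade every line by
# exactly half a point, up or down at random, and we keep lines of 12+.
# Talent levels and game counts below are hypothetical.

talent_pool = [(11.5, 1000), (12.0, 900), (12.5, 800), (13.0, 700)]

shaded_up = shaded_down = 0
for talent, n_games in talent_pool:
    # half the games at each talent level get shaded up, half down
    for line in (talent + 0.5, talent - 0.5):
        if line >= 12:                      # the "heavy favorite" cutoff
            if line > talent:
                shaded_up += n_games // 2
            else:
                shaded_down += n_games // 2

print(shaded_up, shaded_down)   # prints 1700 750 -- far more shaded-up lines survive
```

With these made-up numbers, more than twice as many "shaded up" lines as "shaded down" lines make it past the cutoff, so the surviving favorites are, on average, a little worse than their lines say.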

------

To get a feel for the size of the anomaly, I ran a simulation.  I created random games with true-talent spreads of 8 points to 20 points.  I ran 200,000 of the 8-point games, scaling down linearly to 20,000 of the 20-point games.  

For each game, I shaded the line with a random error, mean zero and standard deviation of 2 points.  I rounded the resulting line to the nearest half point.  Then, I threw out all games where the adjusted line was less than 12 points. 

(Oops! I realized afterwards that Wolfers used 12.5 points as the cutoff, where I used 12 ... but I didn't bother redoing my study.)  

I simulated the remaining games as 100 possessions each team, two-point field goals only.
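Here's a rough sketch of that kind of simulation.  The scoring model is my own guess -- the post above doesn't say exactly how the 100-possession, two-point-field-goal games were scored, so I split the shooting percentages so the expected margin equals the true spread -- and I've used one-point talent steps and scaled the game counts down so it runs quickly.  The exact totals won't match the ones below, but the cover rate should come out under .500 for the same reason.

```python
import random

random.seed(0)

POSSESSIONS = 100
SCALE = 100        # game counts scaled down from the post's numbers so this runs fast

cover = fail = push = 0

for spread in range(8, 21):                               # true-talent spreads, 8..20 points
    n_games = (200_000 - (spread - 8) * 15_000) // SCALE  # 200k at 8 points down to 20k at 20

    # guessed scoring model: every possession is a two-point attempt, with
    # success rates split so the expected margin equals the true spread
    p_fav = 0.5 + spread / 400
    p_dog = 0.5 - spread / 400

    for _ in range(n_games):
        # bookie's line = true spread + random error (sd 2), rounded to the nearest half point
        line = round((spread + random.gauss(0, 2)) * 2) / 2
        if line < 12:
            continue                                      # keep only the heavy favorites

        fav_pts = 2 * sum(random.random() < p_fav for _ in range(POSSESSIONS))
        dog_pts = 2 * sum(random.random() < p_dog for _ in range(POSSESSIONS))
        margin = fav_pts - dog_pts

        if margin > line:
            cover += 1
        elif margin < line:
            fail += 1
        else:
            push += 1

print(cover, fail, push, round(cover / (cover + fail), 3))   # cover rate comes in below .500
```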

The results were consistent with what Wolfers found.  Excluding pushes, my favorites went 325,578-355,909 against the spread, which is a .478 winning percentage.  Wolfers' real-life sample was .483.

--------

So, there you go.  It's not proof that selective sampling is the explanation, but it sounds a lot more plausible than widespread point shaving.  Especially in light of the other evidence:

-- If you look again at the graph of results, it looks like the entire curve moves left.  That's not what you'd expect if there were point shaving -- in that case, you'd expect an extraordinarily large number of "near misses" piled up just short of the spread, not a shift of the whole curve.

-- As the Bernhardt/Heston study showed, the effect was the same for games that weren't heavily bet; that is, cases where you'd expect point shaving to be much less likely.

-- And, here's something interesting.  In their rebuttal, Bernhardt and Heston estimated point spreads for games that had no betting line, and found a similar left shift.  Wolfers criticized that, and I agreed, since you can't really know what the betting line would be. 

However: that part of the Bernhardt/Heston study perfectly illustrates this selective sampling point!  That's because, whatever method they used to estimate the betting line, it's probably not perfect, and probably has random errors!  So, even though that experiment isn't a legitimate comparison to the original Wolfers study, it IS a legitimate illustration of the selective sampling effect.

---------

So, after I did all this work, I found that what I did isn't actually original.  Someone else had come up with this explanation first, a few years earlier.

In 2009, Neal Johnson published a paper in the Journal of Sports Economics called "NCAA Point Shaving as an Artifact of the Regression Effect and the Lack of Tie Games."  

Johnson identified the selective sampling issue, which he refers to as the "regression effect."  (They're different ways to look at the same situation.)  Using actual NCAA data, he comes up with the result that, in order to get the same effect that Wolfers found, the bookmakers' errors would have to have had a standard deviation of 1.35 points.  
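For intuition on why a modest error SD is enough: if the posted line equals true talent plus independent noise, the expected talent behind any given line regresses toward the league average.  Here's a back-of-the-envelope version of that calculation, under a jointly-normal assumption (which, per the quibble below, isn't quite right) and with a made-up SD for the spread of true talent -- only the 7.92 mean and the 1.35 error SD come from Johnson.

```python
# Back-of-the-envelope regression effect, assuming (hypothetically) that
# true spreads and bookmaker errors are jointly normal and independent.
mean_spread = 7.92      # observed mean spread Johnson reports
sd_talent   = 5.0       # hypothetical SD of true talent across games
sd_error    = 1.35      # Johnson's estimated bookmaker error SD

# standard regression-to-the-mean shrinkage factor
shrink = sd_talent**2 / (sd_talent**2 + sd_error**2)

line = 14.0             # a heavy favorite's posted line (made-up example)
expected_talent = mean_spread + shrink * (line - mean_spread)
print(round(expected_talent, 2))   # about 13.6: the line overstates the favorite's edge
```

With these hypothetical numbers, a 14-point line corresponds to roughly a 13.6-point team -- a gap of less than half a point, but enough to push the favorites' cover rate below .500.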

I'd quibble with that study on a couple of small points.  First, Johnson assumed that the absolute value of the spread was normally distributed around the observed mean of 7.92 points.  That's not the case -- you'd expect it to be the right side of a normal distribution, since you're taking absolute values.  The assumption of normality, I think, means that the 1.35 points is an overestimate of the amount of inaccuracy needed to produce the effect.

Second, Johnson assumes the discrepancies are actual errors on the part of the bookmakers, rather than deliberate line shadings.  He may be right, but I'm not so sure.  It looks like there's an easy winning strategy for NCAA basketball -- just bet on mismatched underdogs, and you'll win 51 to 53 percent of the time.  That seems like something the bookies would have noticed, and corrected, if they wanted to, just by regressing the betting lines to the mean.

Those are minor points, though.  I wish I had seen Johnson's paper before I did all this, because it would have saved me a lot of trouble ... and because I think he nailed it.

