Wednesday, April 26, 2017

Guy Molyneux and Joshua Miller debate the hot hand

Here's a good "hot hand" debate between Guy Molyneux and Joshua Miller, over at Andrew Gelman's blog.

A bit of background, if you like, before you go there.


In 1985, Thomas Gilovich, Robert Vallone, and Amos Tversky published a study refuting the "hot hand" hypothesis, which is the assumption that after a player has recently performed exceptionally well, he is likely to be "hot" and continue to perform exceptionally well.

The Gilovich [et al] study showed three results:

1. NBA players were actually *worse* after recent field goal successes than after recent failures;

2. NBA players showed no significant correlation between their first free throw and second free throw; and

3. In an experiment set up by Gilovich, which involved long series of repeated shots by college basketball players, there was no significant improvement after a series of hits.


Then, in 2015-2016, Joshua Miller and Adam Sanjurjo found a flaw in Gilovich's reasoning. 

The most intuitive way to describe the flaw is this:

Gilovich assumed that if a player shot (say) 50 percent over the full sequence of 100 shots, you'd expect him to shoot 50 percent after a hit, and 50 percent after a miss.

But this is clearly incorrect. If a player hit 50 out of 100 and made his (or her) first shot, what's left is 49 hits in the remaining 99 shots. So, over those remaining shots, you wouldn't expect 50%, but only about 49.5%. And, similarly, you'd expect about 50.5% after a miss.

By assuming 50%, the Gilovich study set the benchmark too high, and would call a player cold or neutral when he was actually neutral or hot.
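To make that arithmetic concrete, here is a minimal sketch. The function name is made up for illustration, and it covers only the simple "condition on the first shot" version of the argument, not the full Miller and Sanjurjo correction.

```python
from fractions import Fraction

def expected_rate_after_first(hits, shots, first_was_hit):
    """Hit rate over the remaining shots, given the total and the first shot's outcome.

    Hypothetical helper: it just removes the first shot from the totals.
    """
    remaining_hits = hits - 1 if first_was_hit else hits
    return Fraction(remaining_hits, shots - 1)

# A 50-for-100 shooter, as in the example above.
after_hit = expected_rate_after_first(50, 100, first_was_hit=True)    # 49/99
after_miss = expected_rate_after_first(50, 100, first_was_hit=False)  # 50/99

print(f"after a hit:  {float(after_hit):.3f}")
print(f"after a miss: {float(after_miss):.3f}")
```

The gap between the two benchmarks is only about one percentage point here, but it always runs in the same direction: against finding a hot hand.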

(That's a special case of the flaw Miller and Sanjurjo found; it covers only the "after one hit" case. For what happens after a streak of two or more consecutive hits, it's more complicated. Coincidentally, the flaw is identical to one in a similar problem that Steven Landsburg posted, which I wrote about back in 2010. See my post here, or check out the Miller paper linked to above.)
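For the streak case, one way to see the bias is by brute force. The sketch below (the function name is an illustrative assumption, not anything from the paper) lists every equally likely hit/miss sequence for a 50 percent shooter with no hot hand at all, computes each sequence's hit rate on shots that follow a streak of hits, and averages those rates across sequences. That average is what you'd be comparing against if you analyzed each shooter's sequence separately.

```python
from fractions import Fraction
from itertools import product

def mean_rate_after_streak(n_shots, streak_len):
    """Average per-sequence hit rate on shots that follow `streak_len` straight hits.

    Every sequence of n_shots hits (1) and misses (0) is treated as equally
    likely, i.e. a 50% shooter with independent shots. Sequences that never
    offer a shot after such a streak are skipped, as they would be in a study.
    """
    rates = []
    for seq in product((0, 1), repeat=n_shots):
        followups = [
            seq[i]
            for i in range(streak_len, n_shots)
            if all(seq[i - streak_len:i])  # the previous streak_len shots all hit
        ]
        if followups:
            rates.append(Fraction(sum(followups), len(followups)))
    return sum(rates, Fraction(0)) / len(rates)

print(mean_rate_after_streak(4, 1))          # 17/42, about 40%, not 50%
print(float(mean_rate_after_streak(10, 3)))  # still below 0.5
```

Even though every shot is a coin flip, the average "after a hit" rate over four-shot sequences comes out to 17/42. So a shooter with no hot hand should be expected to look a little cold by this measure.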


The Miller [and Sanjurjo] paper corrected the flaw, and found that in Gilovich's experiment, there was indeed a hot hand, and a large one. In the Gilovich paper, shooters and observers were allowed to bet on whether the next shot would be made. The hit rate was actually seven percentage points higher when they decided to bet high, compared to when they decided to bet low (for example, 60 percent compared to 53 percent).

That suggests that the true hot hand effect must be higher than that -- because, if seven percentage points was what the participants observed in advance, who knows what they didn't observe? Maybe they only started betting when a streak got long, so they missed out on the part of the "hot hand" effect at the beginning of the streak.

However, there was no evidence of a hot hand in the other two parts of the Gilovich paper. In one part, players seemed to shoot field goals *worse* after a hit than after a miss -- but, corrected for the flaw, it seems (to my eye) that the effect is around zero. And the "second free throw after the first" analysis doesn't feature the flaw, so those results stand.


In addition, in a separate paper, Miller and Sanjurjo analyzed the results of the NBA's three-point contest, and found a hot hand there, too. I wrote about that in two posts in 2015. 


From that, Miller argues that the hot hand *does* exist, that we now have evidence for it, and that we need to take it seriously. In his view, it's not a cognitive error to believe the hot hand represents something real, rather than just random occurrences in random sequences.

Moreover, he argues that teams and players might actually benefit from taking a "hot hand" into account when formulating strategy -- not in any specific way he identifies, but in the sense that, in theory, there could be a benefit to be found somewhere.

He also uses an "absence of evidence is not evidence of absence"-type argument, pointing out that if all you have is binary data, of hits and misses, there could be a substantial hot hand effect in real life, but one that you'd be unable to find in the data unless you had a very large sample. I consider that argument a parallel to Bill James' "Underestimating the Fog" argument for clutch hitting -- that the methods we're using are too weak to find it even if it were there.
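A rough way to get a feel for the sample-size point is the textbook normal-approximation formula for comparing two proportions. Everything in this sketch is an assumption made for illustration: the function name, the 50 and 53 percent hit rates, and the 5 percent significance and 80 percent power conventions. It also pretends we know exactly which shots were taken "hot," which real data wouldn't tell us, so if anything it understates the problem.

```python
from math import ceil
from statistics import NormalDist

def shots_needed_per_group(p_cold, p_hot, alpha=0.05, power=0.80):
    """Approximate shots needed in EACH group (hot and not hot) to detect the gap.

    Standard two-sided, two-proportion z-test approximation. The inputs are
    hypothetical hit rates, not estimates from any study.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # extra margin for the desired power
    variance = p_cold * (1 - p_cold) + p_hot * (1 - p_hot)
    return ceil((z_alpha + z_power) ** 2 * variance / (p_hot - p_cold) ** 2)

# Hypothetical three-point hot hand: 50% normally, 53% when hot.
print(shots_needed_per_group(0.50, 0.53))
# A seven-point gap needs far fewer shots.
print(shots_needed_per_group(0.50, 0.57))
```

Under those assumptions, a three-point effect needs thousands of shots in each group before it reliably shows up, which is the flavor of the "Fog" argument.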


And that's where Guy comes in. 

Here's that link again. Be sure to check the comments ... most of the real debate resides there, where Miller and Guy engage each other's arguments directly.
