Sabermetric Research: The hot hand debate vs. the clutch hitting debate

In the "hot hand" debate between Guy Molyneux and Joshua Miller I posted about last time, I continue to accept Guy's position, that "the hot hand has a negligible impact on competitive sports outcomes."

Josh's counterargument is that some evidence for a hot hand has emerged, and it's big. That's true: after correcting for the error in the Gilovich paper, Miller and co-author Adam Sanjurjo did find evidence for a hot hand in the shooting data of Gilovich's experiment. They also found a significant hot hand in the NBA's three-point shooting contest.

I still don't believe that those necessarily suggest a similar hot hand "in the wild" (as Guy puts it), especially considering that to my knowledge, none has been found in actual games.

As Guy says,

"Personally, I find it easy to believe that humans may get into (and out of) a rhythm for some extremely repetitive tasks – like shooting a large number of 3-point baskets. Perhaps this kind of “muscle memory” momentum exists, and is revealed in controlled experiments."

-------

Of course, I keep an open mind: maybe players *do* get "hot" in real game situations, and maybe we'll eventually see evidence for it.

But ... that evidence will be hard to find. As I have written before, and as Josh acknowledges himself, it's hard to pinpoint when a "hot hand" actually occurs, because streaks happen randomly without the player actually being "hot."

I think I've used this example in the past: suppose you have a player who's a 50 percent shooter when he's normal, but who turns into a 60 percent shooter when he's "hot," which is one-tenth of the time. His overall rate is 51 percent.

Suppose that player makes three consecutive shots. Does that mean he's in his "hot" state? Not necessarily. Even when he's "normal," he's going to have times where he makes three consecutive shots just by random luck. And since he's "normal" nine times as often as he's "hot," the normal streaks will outweigh the hot streaks.

Specifically, only 16 percent of three-hit streaks will come when the player is hot. In other words, five out of six streaks are false positives.

(Normally, he makes three consecutive shots one time in 8. Hot, he makes three consecutive shots one time in 4.63. In 100 sequences, he'll be "normal" 90 times, for an average 11.25 streaks. In his 10 "hot" times, he'll make 2.16 streaks. That's about a 5:1 ratio, so the hot streaks are 2.16 out of 13.41.)

Averaging the real hotness with the fake hotness, the player will shoot 51.6 percent after a streak. But his overall rate is 51.0 percent. It takes a huge sample size to notice the difference between 51 percent and 51.6 percent.

Even if you do notice a difference, does it really make an impact on game decisions? Are you really going to give the player the ball more because his expectation is 0.6 percentage points higher, for an indeterminate amount of time?

-------

And that's my main disagreement with Josh's argument. I do acknowledge his finding that there's evidence of a "muscle memory" hot hand, and it does seem reasonable to think that if there's a hot hand in one circumstance, there's probably one in real games. After all, *part* of basketball is muscle memory ... maybe it fades when you don't take shots in quick succession, but it still seems plausible that maybe, some days you're more calibrated than others. If your muscles and brain are slightly different each day, or even each quarter, it's easy to imagine that some days, the mean of your instinctive shooting motion is right on the money, but, other days, it's a bit short.

But the argument isn't really about the *existence* of a hot hand -- it's about the *size* of the hot hand, whether it makes a real difference in games. And I think Guy is right that the effect has to be negligible. Because, even if you have a very large change in talent, from 50 percent to 60 percent -- and a significant frequency of "hotness", 10 percent of the time -- you still only wind up with a 0.6 percentage point increase in expectation after a streak of three hits.
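If you want to check the arithmetic in the 50/60 example, here's a rough sketch in Python. It assumes the toy model exactly as described: the player's state is fixed for a whole sequence, shots are independent within it, and he's hot one-tenth of the time. The function names are mine, and `shots_needed` is only a back-of-the-envelope normal approximation that treats the 51 percent baseline as known.

```python
# Toy two-state "hot hand" model from the example in the post.
# Assumptions: the state (normal or hot) holds for a whole sequence,
# and shots within a sequence are independent.

def streak_posterior(p_normal, p_hot, hot_share, streak_len):
    """Return (P(hot | streak just made), expected hit rate on the next shot)."""
    hot_weight = hot_share * p_hot ** streak_len
    normal_weight = (1 - hot_share) * p_normal ** streak_len
    p_hot_given_streak = hot_weight / (hot_weight + normal_weight)
    next_shot = p_hot_given_streak * p_hot + (1 - p_hot_given_streak) * p_normal
    return p_hot_given_streak, next_shot


def shots_needed(p_base, p_after, z=2.0):
    """Rough number of post-streak shots before the gap reaches z standard errors."""
    gap = p_after - p_base
    return round(z ** 2 * p_after * (1 - p_after) / gap ** 2)


overall = 0.9 * 0.5 + 0.1 * 0.6  # blended shooting rate across both states

for streak_len in (3, 4, 5, 6):
    share_hot, after_streak = streak_posterior(0.5, 0.6, 0.1, streak_len)
    print(
        f"streak of {streak_len}: "
        f"{share_hot:.1%} of streaks are 'hot', "
        f"next-shot rate {after_streak:.1%} vs. {overall:.1%} overall, "
        f"about {shots_needed(overall, after_streak):,} post-streak shots to see it"
    )
```

Changing `streak_len` shows how much (or how little) a longer streak tells you, and changing `p_hot` or `hot_share` lets you try other guesses about how big and how frequent the "hot" state is.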

You could argue that, well, maybe 50 to 60 percent understates the true effect ... and you could get a stronger signal by looking at longer streaks.

That's true. But, to me, that argument actually *hurts* the case for the hot hand. Because, with so much data available, and so many examples of long streaks, a signal of high-enough strength should have been found by now, no?

-------

This debate, it seems to me, echoes the clutch hitting debate almost perfectly.

For years, we framed the state of the evidence as "clutch hitting doesn't exist," because we couldn't find any evidence of signal in the noise. Then, a decade ago, Bill James published his famous "Underestimating the Fog" essay, in which he argued (and I agree) that you can't prove a negative, and that the "fog" is so thick that there could, in fact, be a true clutch hitting talent that we have been unable to notice.

That's true -- clutch hitting talent may, in fact, exist. But ... while we can't prove it doesn't exist, we CAN prove that if it does exist, it's very small. My study (.pdf) showed the spread (SD) among hitters would have to be less than 10 points of batting average (.010). "The Book" found it to be even smaller, .008 of wOBA (a metric that includes all offensive components, but is scaled to look like on-base percentage).

In my experience, a sizable part of the fan community seizes on the "clutch hitting could be real" finding, but ignores the "clutch hitting can't be any more than tiny" finding.

The implicit logic goes something like,

1. Bill James thinks clutch hitting exists!
2. My favorite player came through in the clutch a lot more than normal!
3. Therefore, my favorite player is a clutch hitter who's much better than normal when it counts!

But that doesn't follow. Most strong clutch hitting performances will happen because of luck. Your great clutch hitting performance is probably a false positive. Sure, a strong clutch performance is more likely to happen given that a player is truly clutch, but, even then, with an SD of 10 points, there's no way your .250 hitter who hit .320 in the clutch is anything near a .320 clutch hitter. If you did the math, maybe you'd find that you should expect him to be .253, or something.
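Here's a rough sketch of what "doing the math" might look like, using a standard regression-to-the-mean (shrinkage) estimate. The 10-point talent SD comes from the study mentioned above; everything else is my assumption for illustration: 100 clutch at-bats, binomial luck, and clutch talent centered on the hitter's usual average. The function name is hypothetical.

```python
# Regression-to-the-mean sketch for an observed clutch batting average.
# Assumptions (for illustration only): 100 clutch at-bats, binomial luck,
# and true clutch talent centered on the hitter's usual average.

def expected_clutch_average(usual_avg, observed_clutch_avg, clutch_at_bats, talent_sd):
    """Shrink the observed clutch average toward the hitter's usual average."""
    talent_var = talent_sd ** 2                              # spread of real clutch talent
    luck_var = usual_avg * (1 - usual_avg) / clutch_at_bats  # binomial noise in the sample
    shrink = talent_var / (talent_var + luck_var)            # share of the gap that is real
    return usual_avg + shrink * (observed_clutch_avg - usual_avg)


estimate = expected_clutch_average(
    usual_avg=0.250,
    observed_clutch_avg=0.320,
    clutch_at_bats=100,   # assumed sample size, not from the post
    talent_sd=0.010,      # 10 points of batting average
)
print(f"expected clutch talent: {estimate:.4f}")
```

With more clutch at-bats the estimate moves further from .250, but with an SD this small, it takes an enormous sample before it gets anywhere near the observed .320.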

Well, it's the same here, with the hot hand:

1. Miller and Sanjurjo found a real hot hand!
2. Therefore, hot hand is not a myth!
3. My favorite player just hit his last five three-point attempts!
4. Therefore, my player is hot and they should give him the ball more!

Same bad logic. Most streaks happen because of luck. The streak you just saw is probably a false positive. Sure, streaks will happen given that a player truly has a hot hand, but, even then, given how small the effect must be, there's no way your usual 50-percent-guy is anything near superstar level when hot. If you had the evidence and did the math, maybe you'd find that you should expect him to be 52 percent, or something.

-------

For some reason, fans do care about whether clutch hitting and the hot hand actually happen, but *don't* care how big the effect is. I bet psychologists have a cognitive fallacy for this, the "Zero Shades of Grey" fallacy or the "Give Them an Inch" fallacy or the "God Exists Therefore My Religion is Correct" fallacy or something, where people are unwilling to believe something into existence -- but, once given license to believe, are willing to assign it whatever properties their intuition comes up with.

So until someone shows us evidence of an observable, strong hot hand in real games, I would have to agree with Guy:

"... fans’ belief in the hot hand (in real games) is a cognitive error."

The error is not in believing the hot hand exists, but in believing the hot hand is big enough to matter.

Science may say there's a strong likelihood that intelligent life exists on other planets -- but it's still a cognitive error to believe every unexplained light in the sky is an alien flying saucer.

Labels: clutch, hot hand, streakiness

3 Comments:

At Thursday, July 13, 2017 6:09:00 PM, Cyril Morong said...: For either clutch or hot hand, is it something that the manager or coach can use to make decisions? (supposing they are real) If some guy makes 5 shots in a row, do you decide to have him take the last shot to win the game instead of Michael Jordan? Are these researchers making any recommendations? It seems like it would be hard to incorporate any of this into the decision making in a game. Someone would have to be telling the coach who is "hot" and then he has to have some estimate of that guy's current "true" FG% and compare it to the guy who is typically your best shooter or scorer.

At Wednesday, September 20, 2017 1:06:00 PM, Joshua Miller said...: Hi Phil

I think that overall, we agree more than we disagree. So please take my mild push-back below in context, it is really around the edges.

I think we can agree that there are many mechanisms which can induce a *temporary* shift in a player's probability of success (in the wild). Most are impossible to measure, so it's certainly fair to keep an open mind.

We can speculate on many mechanisms, not all of which are hot hand, but all can be big:

1. short-term muscle memory: what you talk about.

2. long-term muscle memory: variation in the ability to retrieve the muscle memory of how to shoot from a particular location.

3. shooting in-rhythm: i.e. when the offense allows the player to set and anticipate the pass, so he is prepared to shoot (i.e. retrieve long-term muscle memory)

4. unconscious adjustment of shooting mechanics: easy to repeat on a given day (b/c of muscle memory), but doesn't carry over across days unless perhaps the coach notices and intervenes.

5. variation in ability to concentrate/focus on aiming: we know this is a thing, and shooting is an aiming task. Amphetamines, endogenous or exogenous, can affect it.

6. degree of physiological arousal: this is the classic inverted U-curve on performance.

There is some evidence for this in basketball.

Goldman and Rao show it affects free throw shooting; it would be much harder to tease out with the noise in games. This speaks to choke and clutch by the way.

7... we can go on and on.

The ex-ante plausibility of a meaningfully large hot hand is pretty high, and that's before we even get to what is asserted by actual practitioners who have a more granular read of the data than we do, and can observe things we can't measure [I believe practitioners can be very biased, this is more about us having humility with regard to what we can say].

So now the evidence against the hot hand being something should be pretty strong to be so sure it's small.

continued...

At Wednesday, September 20, 2017 1:27:00 PM, Joshua Miller said...: part 2/2
you say:

I still don't believe that those necessarily suggest a similar hot hand "in the wild" (as Guy puts it), especially considering that to my knowledge, none has been found in actual games.

you later say:

You could argue that, well, maybe 50 to 60 percent understates the true effect ... and you could get a stronger signal by looking at longer streaks.

That's true. But, to me, that argument actually *hurts* the case for the hot hand. Because, with so much data available, and so many examples of long streaks, a signal of high-enough strength should have been found by now, no?

I continue to be baffled why Guy and you are influenced so much by the "failure" to detect the hot hand in these studies. Every one of these studies is replete with measurement error, specification error, and omitted variable bias. Further, they have to average across all players, non-shooters included. A simple simulation with a real hot hand shows that you shouldn't move your priors with 1 study like this, or even 100 studies like this. You could have a huge hot hand and you'd never find it with this approach.

You talk about the classification (measurement) error, i.e. mixing hot shots with non-hot shots. We have been making a similar point since our 2014 Cold Hand paper, which we discussed with Gelman 3 years ago. I don't really get the point though because all it is saying is that you shouldn't make decisions based on undiagnostic binary measures of a player's underlying state, because these shot outcomes measure a player's probability of success with error. Yes, that's the point. Coaches and players argue that they can see something else, that they aren't just using shot outcome information. It's a skill to see the hot hand. That is what Bill Russell argued in his 1969 retirement letter.

I think there is a simple test of whether the current game evidence argues compellingly against the existence of hot hand shooting. Imagine you have an open-minded, data-friendly practitioner who can follow the arguments here, and believes in the hot hand like Bill Russell (Player/Coach!). Could you convince him to abandon his belief?

This reminds me of the Oregon experimental study of Medicare where so many people had the take-away that preventative medicine doesn't work. Such a silly argument when the plausibility that it works is high, and there are so many plausible reasons why you would not expect to find it on average across all people that had coverage.

It does not remind me of how communists (and some socialists) continue to believe central planning would work, and don't accept the Soviet Union’s failure as evidence against it (well, maybe for fans, because, like the communists, they don't look at the evidence). Economic theory and our knowledge of human nature should make our priors pretty low.

you end with two points:

(1)
The error is not in believing the hot hand exists, but in believing the hot hand is big enough to matter.

I don't see support for this point, as far as I can tell.

(2)
Science may say there's a strong likelihood that intelligent life exists on other planets -- but it's still a cognitive error to believe every unexplained light in the sky is an alien flying saucer.

Here we can agree. And fans, announcers, and newspaper writers are most guilty of this. This is the thing that should annoy everybody.

Wednesday, May 17, 2017
