Hot Hand III
After my last two posts on the hot hand, I had an e-mail conversation with Dan Stone, the author of the original study. He pointed out, correctly, that I had used his most conservative example, which is why I found such a low hot hand expectation based on outcomes. He has a point ... in fairness, I should have shown results for his other models. I'll do that now.
Let me start by recapping Dan's conservative model.
-----
Suppose that a player's chance of making his next shot (I'll assume it's a basketball free throw) is correlated with the chance he had of making his previous shot (regardless of whether or not he actually made that previous shot). Specifically, suppose that to figure out the probability for the current shot, you follow this two-step process:
1. Take the probability for his previous shot, and regress it 90% of the way to the mean of 75%. That is, "keep" 10% of the previous probability, and replace the other 90% with the mean. (Dan calls the 10% figure "rho".)
2. Take that new probability, and figure the largest amount you can add or subtract while keeping the probability between 0% and 100%. Call that number X -- so, for instance, if the new probability is 77%, X is 23%. Choose a random number, uniformly, from -X to +X. Multiply it by 25%, and add the result to the new probability. (Dan calls the 25% figure "alpha".)
So, for instance, suppose your previous probability was 65%. Regress that 90% to the mean, bringing you to 74%. Then, choose a random number between -26% and +26%. Suppose you pick -12%. Multiply that by the alpha of 25%, giving -3%. Add the -3% to the 74%. That gives you 71%, which is your new probability for the current shot.
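Here is a minimal Python sketch of those two steps, written from the description above rather than from Dan's paper. The function names (regress_to_mean, max_shift, apply_shock, next_probability) are made up for illustration, and the last two lines just replay the 65% example.

```python
import random

MEAN = 0.75  # the player's long-run free-throw probability in the example

def regress_to_mean(prev_p, rho=0.10, mean=MEAN):
    # Step 1: keep a fraction rho of the previous probability, regress the rest.
    return rho * prev_p + (1.0 - rho) * mean

def max_shift(p):
    # X: the largest amount you can add or subtract while staying in [0, 1].
    return min(p, 1.0 - p)

def apply_shock(regressed, draw, alpha=0.25):
    # Step 2: scale the uniform draw by alpha and add it on.
    return regressed + alpha * draw

def next_probability(prev_p, rho=0.10, alpha=0.25, rng=random):
    # Both steps together, with the draw taken uniformly from -X to +X.
    regressed = regress_to_mean(prev_p, rho)
    x = max_shift(regressed)
    return apply_shock(regressed, rng.uniform(-x, x), alpha)

# The worked example: 65% regresses to 74%, then a draw of -12% scaled by alpha.
example = apply_shock(regress_to_mean(0.65), -0.12)
print(round(example, 4))  # 0.71
```

Splitting the draw out from the arithmetic is only a convenience, so the example can be checked with a fixed draw instead of a random one.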
-----
I simulated that model over 100,000 shots, and got
-- after a hit, your chance of a hit is 75.049 pct.
-- after a miss, the chance of a hit is 74.875 pct.
My conclusion was, there's not much of a hot hand there, if knowing the previous outcome gains you so little in predictive value.
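For anyone who wants to try this at home, here is a rough sketch of that kind of simulation in Python. It follows the two steps as described in this post and is not the code behind the numbers above. The function name simulate, the seed, and starting the player at the 75% mean are all assumptions made for illustration.

```python
import random

def simulate(n_shots=100_000, rho=0.10, alpha=0.25, mean=0.75, seed=0):
    """Return (hit rate after a hit, hit rate after a miss)."""
    rng = random.Random(seed)
    p = mean          # assumption: the player starts at his long-run mean
    prev_hit = None   # the first shot has no previous outcome
    counts = {True: [0, 0], False: [0, 0]}  # previous outcome -> [hits, attempts]
    for _ in range(n_shots):
        # Step 1: regress the previous probability toward the mean.
        regressed = rho * p + (1.0 - rho) * mean
        # Step 2: add alpha times a uniform draw from -X to +X.
        x = min(regressed, 1.0 - regressed)
        p = regressed + alpha * rng.uniform(-x, x)
        # Take the shot with the new probability.
        hit = rng.random() < p
        if prev_hit is not None:
            counts[prev_hit][0] += hit
            counts[prev_hit][1] += 1
        prev_hit = hit
    after_hit = counts[True][0] / counts[True][1]
    after_miss = counts[False][0] / counts[False][1]
    return after_hit, after_miss

after_hit, after_miss = simulate()
print(f"after a hit:  {100 * after_hit:.3f} pct")
print(f"after a miss: {100 * after_miss:.3f} pct")
```

Exact figures depend on the seed and on details of the setup, so don't expect this sketch to match the decimals above. At 100,000 shots there are only about 25,000 misses to condition on, which puts the sampling noise on the after-miss rate at roughly a quarter of a percentage point.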
-----
But, what if we try Dan's less conservative models? We'll get a bigger difference. I'll move "rho" away from 10%, and "alpha" away from 25%, and show you some of the new results.
Alpha still at 25%, but Rho now at 50% (meaning 50% regression to the mean):
-- after a hit, your chance of a hit is 75.03 pct.
-- after a miss, the chance of a hit is 74.81 pct.
We've improved to a difference of .22 percentage points, up from about .17.
Now, let's raise Rho to 90%, which means we only regress 10% to the mean:
-- after a hit, your chance of a hit is 75.20 pct.
-- after a miss, the chance of a hit is 74.27 pct.
Now, the difference is quite large, relatively speaking: almost a whole percentage point (.93).
-----
Dan also included a model where alpha is higher, at 0.5. That means we add half the random number, instead of just one-quarter. That should make the hot hand effect even more prominent, because we'll have more extreme values, which means a miss is more likely to be the result of a low probability.
So, here's Alpha at 50%, and Rho also at 50%:
-- after a hit, your chance of a hit is 75.23 pct.
-- after a miss, the chance of a hit is 74.27 pct.
And, here's Alpha at 50%, and Rho at 90% (meaning, only 10% regression to the mean):
-- after a hit, your chance of a hit is 75.93 pct.
-- after a miss, the chance of a hit is 72.06 pct.
As I said, and again in fairness, this is a much larger effect than the one I reported on. However ... well, to me, it's still a pretty small "hot hand" effect in outcome terms. Even knowing that probabilities vary a lot -- sometimes Kobe Bryant steps to the line having only a 50% chance of making his shot -- the difference between following a hit and following a miss is still only four percentage points. The best a fan can do is to say, "hey, the guy missed his last shot, so he's probably cold -- he's a 72 percent shooter instead of a 76 percent shooter."
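A short derivation helps show why the outcome-based gap stays small even when the underlying probabilities move around a lot. This is a sketch under stated assumptions, not something taken from Dan's paper: p_{t-1} and p_t are the true probabilities for two consecutive shots, H and M stand for hit and miss, each shot's outcome depends only on its own probability, and \mu is the average of p_{t-1}.

```latex
\begin{align*}
P(H_t \mid H_{t-1}) &= \frac{E[p_{t-1}\,p_t]}{\mu}, \qquad
P(H_t \mid M_{t-1}) = \frac{E[(1-p_{t-1})\,p_t]}{1-\mu}, \\
P(H_t \mid H_{t-1}) - P(H_t \mid M_{t-1})
  &= \frac{\operatorname{Cov}(p_{t-1},\,p_t)}{\mu\,(1-\mu)}.
\end{align*}
```

With a mean of 75%, the denominator is 0.1875, so the gap is a bit more than five times the covariance between consecutive probabilities. That covariance only gets big when the swings are both wide and persistent, which is why raising alpha and rho each widens the gap.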
-----
However, Dan points out that we might be able to do better than that by looking at more than just the outcome. We can actually try to estimate what the probability was, based on how good a shot it was. So, on average, the difference may be 72 percent versus 76 percent, but, if Kobe really made a bad miss, as opposed to a close call, we can perhaps be more certain that Kobe had a low probability to start, so his current probability should be low too -- lower than the 72% we'd guess if all we knew was that he missed.
That makes sense, if you can indeed make an inference about the probability based on the qualitative characteristics of the actual performance. You probably can, a bit, but I'm skeptical that it's enough to significantly improve predictions.
And, I'm also skeptical that Kobe's probability can vary so much from shot to shot; normally, most researchers assume a constant probability, not a varying one.
But, Dan may prove me wrong, on both counts, in a subsequent paper. I'll keep an open mind.
Labels: hot hand, statistics