Sabermetric Research: Is there evidence for a "hot hand" effect?

You're in the desert, exactly 10 miles south of home. Instead of walking straight home, you decide to walk east one mile first. How far away from home are you when you're done?

The answer: about 10.05 miles. Your one-mile detour only added 0.05 miles to your return trip: one-half of one percent.

That's just a simple application of the Pythagorean theorem. Draw a right triangle with sides 10 miles and 1 mile, and the hypotenuse will be 10.05 miles long.

------

That triangle thing is meant to explain why it might be difficult to find evidence for or against streakiness in baseball hitting expectations.

There's a belief that there are times when a player has a "hot hand", in the sense that he's somehow "dialed in" and is likely to perform better than his usual. And, there might be periods when he's "cold," and should be expected to perform worse.

Torii Hunter hit .370 in April, 2013. Did he have a hot hand, or was he just lucky? That's the question that we want to answer. Maybe not specifically about Torii Hunter, but, in general ... when a player has a hot streak or a cold streak, is there something behind that?

The difficulty answering that is that there's a lot of luck involved, and it's hard to separate out the luck from the "hot hand" (if any).

If you assume each AB is independent of the previous one, then, over a month's worth -- say, 100 AB, which was Torii Hunter's total -- the SD of batting average, by luck alone, will be about 43 points. That's massive ... it means that one time out of 20, a player will be at least 86 points off expected in a given month.
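Here's a quick sketch of where the 43 points comes from, using the usual binomial formula. The .250 talent level is my assumption for illustration (the figure moves only a little for other typical averages), and the function name is made up.

```python
import math

def batting_average_sd(true_avg: float, at_bats: int) -> float:
    """Binomial SD of observed batting average over independent at-bats."""
    return math.sqrt(true_avg * (1.0 - true_avg) / at_bats)

# Assumed talent level of .250 (hypothetical; not stated in the text).
sd_points = batting_average_sd(0.250, 100) * 1000   # in "points" of average
two_sd_points = 2 * sd_points                        # the "one time in 20" band

print(f"SD: {sd_points:.1f} points, 2 SD: {two_sd_points:.1f} points")
```

Two SDs is the conventional "one time out of 20" cutoff for a normal distribution, which is where the 86 comes from.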

Now, suppose there's also a real "hot hand" effect, that's 1/10 the size of the luck effect. Now, the SD of batting average, by luck and hot hands combined, will be one half of one percent higher, just like in the Pythagorean triangle example. That's almost nothing -- .0002 in batting average.

Effectively, this level of hot hand is invisible, if it exists.

This finding might be good news to those who believe that players are often "on" or "off" (Morgan Ensberg is one of those believers). But ... there are still things we *can* say about hot-hand effects, and those things won't be compatible with some of the ways people talk about streakiness.

-------

First, let's try to make an intelligent guess on how much hot-handedness can remain invisible. As we saw, 10 percent of luck is impossible to see because the SD difference is only 0.5 percent. If hot hand variation were 20 percent of luck, the SD would go up by two percent. If hot hands were 30 percent of luck, the SD would go up by around 4.4 percent. If it were 40 percent of luck, the SD would go up by around 7.7 percent.
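Those percentages are just the desert triangle again: independent sources of variation add in quadrature, so the combined SD is the hypotenuse. A minimal sketch, assuming the hot hand effect is independent of luck (the function name is mine):

```python
import math

def sd_increase(ratio: float) -> float:
    """Fractional rise in total SD when an independent effect whose SD is
    `ratio` times the luck SD is added to luck."""
    return math.hypot(1.0, ratio) - 1.0

# The desert walk: 10 miles south, 1 mile east.
print(f"distance home: {math.hypot(10, 1):.2f} miles")

for ratio in (0.10, 0.20, 0.30, 0.40):
    print(f"hot hand at {ratio:.0%} of luck -> SD up {sd_increase(ratio):.1%}")
```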

Actually, let me quote Bill James (subscription required), from his recent essay on streakiness that inspired me to write this:

" ... how much causal variation is it reasonable to think might be completely hidden in a mix of causal and random variation?

"Well, if it was 50-50, it would be extremely easy ... [i]f you add a second level of variation equal to the first, it will create obviously non-random patterns.

"If it 70-30—in other words, if the causal variation was roughly half the size of the random variation—that, again, would be easy to distinguish from pure random variation. Even if it was 90-10, we should be able to distinguish between that and pure random variation. If it was 99-1, maybe we would have a hard time telling one from the other."

I think Bill is a little optimistic on the 90-10 ... you're still talking one-half of one percent. On 70-30, though, I think he's right. That would be around a 9 percent increase in the standard deviation, which I think we'd be able to find without too much difficulty.

For the sake of this argument, let's say that the limit we can observe is ... I dunno, let's say, 75-25. That would mean the SD of hot hands was 33 percent of the SD of luck, which means the overall SD goes up by around 5.3 percent. I think we could find that if we looked. I think.

I may be wrong ... substitute your own number if you prefer.

------

It might be legitimate for a hot-hand advocate to say to us, "well, you don't know for sure that there's no talent streakiness, because you admit that your methods can't pick it up."

Well, maybe. But if you make that argument, don't forget to also limit yourself to hot hand effects that are 33 percent of luck. That's around 14 points (again, that's batting average over 100 AB).

That means only one player in 20 will truly be more than 28 points hot or cold, relative to his expected batting average, in any given month. Only around one player in 400 will be more than 42 points "hot" or "cold" for the month.

Are you prepared to live with that? When you find ten players hitting 100 points off their expected for the month of August, you'd have to say, "Well, it's virtually impossible that *any* of them were *that* hot ... that's 7 standard deviations, which never happens. They may have been a bit hot, but they were mostly lucky."
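For anyone who wants to check those odds, here's a rough sketch using the normal distribution. It assumes hot hand talent really is normally distributed with an SD of 14 points, which is the hypothetical limit from above, not a measured value.

```python
import math

HOT_HAND_SD = 14.0  # points; the hypothesized limit, not a measured value

def two_sided_tail(z: float) -> float:
    """P(|Z| > z) for a standard normal Z."""
    return math.erfc(z / math.sqrt(2.0))

for points in (28, 42, 100):
    z = points / HOT_HAND_SD
    p = two_sided_tail(z)
    print(f"{points} points = {z:.1f} SD, about 1 in {1 / p:,.0f}")
```

Two and three SDs come out close to the round "1 in 20" and "1 in 400" figures, and 100 points is so far out in the tail that it effectively never happens.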

Seriously, are you prepared to say that?

There's more. With a 75-25 mix, the correlation between "hot hand" and performance would be a bit over 0.3. That means, if you find a player hitting 2 SD better than expected, his "hot hand" expectation is +0.6 SDs. That's around 8 points.

Seriously. When a career .250 player hits .340 in May, you'd have to be saying, "wow, I bet that player has a hot hand. I bet in May, he was really a .258 hitter!!!"

My guess is, most "hot hand" enthusiasts wouldn't bite that bullet. They don't just want "hot hands" to exist, they want them to be a big part of the game. When they see a player on a hot streak, they want to believe that player is *significantly better*, that he's a force to be reckoned with. "He's 8 points hot" is just not a very good narrative.

But that's pretty much what it amounts to. If 75-25 is the right ratio, then only around TEN PERCENT of a player's observed "hotness" would come from a hot hand. The other 90 percent would still be luck.
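Here's a sketch of the arithmetic behind the 0.3 correlation and the ten percent. It assumes luck and hot hand are independent and normal, with the hot hand SD set to one third of the 43-point luck SD; the variable and function names are mine.

```python
import math

LUCK_SD = 43.0           # points over 100 AB
HOT_SD = LUCK_SD / 3.0   # 75-25 mix: hot hand SD is one third of luck SD

total_sd = math.hypot(LUCK_SD, HOT_SD)   # SD of observed deviations
correlation = HOT_SD / total_sd          # corr(hot hand, observed deviation)
variance_share = correlation ** 2        # fraction of observed variance that is hot hand

def expected_hot_hand(observed_points: float) -> float:
    """Best guess at the hot hand component given the observed deviation
    (regression to the mean under the normal model)."""
    return variance_share * observed_points

print(f"correlation: {correlation:.3f}")
print(f"share of observed hotness that is real: {variance_share:.0%}")
print(f"a +90 point month implies about {expected_hot_hand(90):.0f} points of hot hand")
```

Carried through without rounding, the .250 hitter who hits .340 comes out about 9 points hot rather than 8, which doesn't change the story.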

-------

Now, one assumption in all those calculations is that "hot hands" are random and normally distributed. Followers of streakiness would probably argue that's not the case. Intuitively, it would seem that most players are just "themselves" most of the time, but, occasionally, someone will get exceptionally hot and really be 100 points better for a brief period.

For instance, maybe hothandedness manifests itself in outliers. Maybe 38 out of 40 players are just normal. But, the 39th player is cold and drops 60 points, while the 40th is hot, and gains those 60 points. That still works out to the SD of around 14 points that we hypothesized as our limit.

If that's the case, then, the "check the SD method" would still fail to find anything.

But ... now, we'd have other methods that would actually work. We could count outliers. If one player in 40 gains .060 every month, we should see a lot more exceptional months than we would otherwise.

So, if that's how you think hot hands work, let us know. Those kinds of hot hands *won't* be invisible.
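To make the outlier count concrete, here's a rough sketch of the 38-1-1 model. It assumes luck is normal with a 43-point SD and that a hot or cold player's talent shifts by exactly 60 points; the thresholds of +100 and +130 are arbitrary ones I picked for illustration.

```python
import math

LUCK_SD = 43.0          # points over 100 AB
SHIFT = 60.0            # size of the hot or cold talent shift
P_HOT = P_COLD = 1 / 40

def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# SD of the talent shifts alone: 38 players at 0, one at -60, one at +60.
shift_sd = math.sqrt((P_HOT + P_COLD) * SHIFT ** 2)

def p_month_above(points: float, outliers: bool) -> float:
    """Chance a month lands more than `points` above expectation."""
    base = upper_tail(points / LUCK_SD)
    if not outliers:
        return base
    hot = upper_tail((points - SHIFT) / LUCK_SD)    # luck has less to make up
    cold = upper_tail((points + SHIFT) / LUCK_SD)   # luck has more to make up
    return (1 - P_HOT - P_COLD) * base + P_HOT * hot + P_COLD * cold

print(f"SD of talent shifts: {shift_sd:.1f} points")
for points in (100, 130):
    luck_only = p_month_above(points, outliers=False)
    with_hot = p_month_above(points, outliers=True)
    print(f"+{points}: luck only {luck_only:.4f}, "
          f"with outliers {with_hot:.4f}, ratio {with_hot / luck_only:.2f}")
```

Under these assumptions the excess grows the further out you look: roughly 40 percent more +100 months, and about twice as many +130 months. That's the kind of gap a count of extreme months should be able to pick up.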

------

And even if they exist at that level, it won't tell us much.

Suppose a player has a month where he actually hits .060 better than normal. Can you now say, "Ha! See, that player had a hot hand!"

Nope. Because, by the model, the chance of a player having a hot hand is 1 in 40. But, the chance of a player having a +60 point month, just by luck, is, I think, something like 1 in 12.

A quick naive calculation is that, even in this extreme case, there's a 3-in-4 chance the player was just lucky! (It's probably actually higher than that, if you do the proper calculation.)
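Here's one way the "proper calculation" might go, as a sketch rather than the definitive version. It assumes the same 38-1-1 model, normal luck with a 43-point SD, and conditions on seeing a month of at least +60 (conditioning on exactly +60 would give a somewhat different number).

```python
import math

LUCK_SD = 43.0
SHIFT = 60.0
P_HOT = P_COLD = 1 / 40

def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Naive version: weigh the chance of a lucky +60 month against the 1-in-40 prior.
p_luck_month = upper_tail(SHIFT / LUCK_SD)   # about 1 in 12
naive_lucky = p_luck_month / (p_luck_month + P_HOT)

# Bayes version: likelihood of a month at least +60 under each talent state.
like_normal = upper_tail(SHIFT / LUCK_SD)     # luck must supply all +60
like_hot = upper_tail(0.0)                    # talent supplies +60, luck only >= 0
like_cold = upper_tail(2 * SHIFT / LUCK_SD)   # luck must supply +120
evidence = ((1 - P_HOT - P_COLD) * like_normal
            + P_HOT * like_hot + P_COLD * like_cold)
bayes_lucky = 1 - P_HOT * like_hot / evidence   # chance he was not really hot

print(f"naive: {naive_lucky:.0%} lucky, Bayes: {bayes_lucky:.0%} lucky")
```

The Bayes figure lands in the mid-80s, higher than 3-in-4, because a genuinely hot player only clears +60 about half the time.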

------

So here's what I think we have:

1. Moderate variations in talent might be impossible to disprove by the standard method. For monthly "hot hands", that might be 14 points of batting average, as a guess.

2. If those exist, they will rarely be higher than, say, 30 or 40 points. So, when a .270 player hits .380 in August, it *can't* all be a hot hand. It'll still be mostly luck.

3. If hot hands manifest themselves in outliers, we should have no problem finding them. The idea that players will regularly get "scorching hot" in talent ... well, I bet we could debunk that pretty easily.

4. Even if hot hands do exist, unusual performances will be more driven by luck than by talent variations. There will never be a case where you can look at a hot month, and have good reason to believe it wasn't just luck.

(Hat tip and more discussion: Tango)

Labels: baseball, Bill James, hot hand, streakiness

2 Comments:

At Tuesday, July 16, 2013 2:32:00 PM, Anonymous said...: I am not following this at all. If hot hands exist, why would they be distinguishable from the normal standard deviation? The normal 46-point batting average SD already accounts for any effects of streakiness - you couldn't tell the difference between luck and hot hands by looking at standard deviation, because both are normally distributed. In fact, it can be argued that streakiness is just a more sophisticated way to describe luck. Now if you really want to determine whether streaks exist, here are two studies you can do: 1) player's batting average the at bat after an out vs player's average the at bat after a hit, controlling for quality of pitcher. 2) velocity of balls coming off the bat, controlled for quality of pitcher, the at bat after an out or a hit. In basketball it is even easier as you can look at shooting percentages and free throws.
At Wednesday, August 28, 2013 9:03:00 PM, Don Coffin said...: There's an article in the Journal of Sports Economics (August 2013), "Misses in "Hot Hand" Research," by Jeremy Arkes, which argues that evidence for "hot hands" is much stronger than usually believed (by researchers). I have a great deal of trouble with the article, in part because I think the argument, and the statistical analysis assume the conclusion. Unfortunately, I can't find an ungated version of the paper...


Tuesday, July 16, 2013
