Clutch hitting: a new study from Pete Palmer and Dick Cramer
Of the many excellent presentations at last weekend's SABR convention in Cleveland, one of my favorites was the study by Pete Palmer and Dick Cramer, on clutch hitting. I have to admit that the subject has been done to death (notably by Palmer and Cramer themselves). And there are probably a lot of people like Chris Jaffe, who is "sooooooo very tired of clutch hitting studies."
So this study could be accused of beating a dead horse – other studies, I think, have already convincingly shown that clutch talent doesn't exist – but, on the other hand, on a controversial issue like clutch, you can never have too much evidence.
More important, the highly-regarded "The Book" (along with a previous study by co-author Andy Dolphin) does find some evidence for clutch. So the debate isn't completely settled.
That's why I think this study does add valuable evidence to the pile.
Anyway, many thanks to Pete and Dick, who have allowed me to post their presentation slides, and two writeups of their findings.
Let me start with a recap of my three favorite classic clutch studies, before getting to the new one.
(I will also point out that "clutch hitter" doesn't mean a player who hits well in the clutch – it means a hitter who performs *better* in clutch situations than normal, relative to the rest of the league.)
Dick Cramer, 1977
First, there was Dick Cramer's groundbreaking study from 1977. Dick looked at all players in the 1969 and 1970 seasons. He figured the amount by which they increased their team's win probabilities over the season, and compared that to what you'd expect from a raw measure of run performance based on their batting line. The difference was their observed clutchness; a clutch player would have created more wins than his raw batting statistics suggest, because his hits would have come when they were more important.
Comparing 1969 clutchness to 1970 clutchness, Dick found an r-squared of .038 for National League players, and .055 for American League players. Dick's conclusion was that, with such a small correlation, clutch hitting was not shown to exist.
As I write this, it occurs to me that these aren't actually that small – the r is roughly +/- 0.2 in both cases. The study doesn't actually say if the correlation was positive or negative (Dick, if you're reading this, which was it?). Of course, if it had been a negative correlation, that would be stronger evidence against clutch talent.
It's this study, I think, that Bill James criticizes in his famous "Underestimating the Fog" essay (.pdf). Bill argues that Dick didn't actually prove clutch hitting doesn't exist. That's probably true, but it's pretty good evidence that, if it does exist, it's weak. Assuming the correlation is positive, it means that even a player who hit 100 points better in the clutch in 1969 would be expected to hit only 20 points better in the clutch in 1970.
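Here's a quick check of that arithmetic. It's only a sketch: the function names are mine, and the regression step assumes the spread of clutchness is about the same in both seasons, so that the expected repeat is just r times the observed edge.

```python
import math

def r_from_r_squared(r_squared):
    """Magnitude of the correlation; the sign cannot be recovered from r-squared."""
    return math.sqrt(r_squared)

def expected_repeat(observed_edge, r):
    """Expected edge next season, regressing the observed edge toward zero by r.

    Assumes equal variance in both seasons (my assumption, not the study's).
    """
    return r * observed_edge

nl_r = r_from_r_squared(0.038)   # National League, 1969 vs. 1970
al_r = r_from_r_squared(0.055)   # American League, 1969 vs. 1970
print(round(nl_r, 3), round(al_r, 3))

# A hitter 100 points better in the clutch one year, with r taken as 0.2
print(expected_repeat(100, 0.2))
```

Both leagues come out with an r of a little under or a little over 0.2, which is where the "20 points out of 100" figure comes from.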
Pete Palmer, 1990
In the March 1990 issue of By the Numbers (.pdf, see page 6), Pete Palmer tackled the question a different way. He noted that even if there were no such thing as clutch talent, some players would *appear* to be clutch just because of dumb luck. He then figured what the distribution should be if it were all just luck, and compared it to the actual distribution.
If the two were the same, that would be evidence that clutch hitting is nothing more than random chance. If the two were different, that would show that clutch talent actually exists, over and above the random effect.
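Consider the analogy of coin flips. A fair coin would land eight consecutive heads 1 time in 256. But a "clutch" coin, with a .600 heads average, would land eight consecutive heads about 4.3 times in 256 – more than four times as often. Even if only 10% of coins were clutch, eight-head streaks would turn up about a third more often overall than chance alone predicts.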
So clutch hitting talent would certainly show itself if it existed in any significant quantity.
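The coin arithmetic is easy to verify. This is a minimal sketch assuming independent flips; the function name and the 90/10 mix are just my illustration of the analogy. By my calculation, a .600 coin on its own lands eight straight heads about 4.3 times in 256, and a population that is 90% fair coins and 10% .600 coins does it about 1.33 times in 256.

```python
def p_streak(p_heads, flips=8):
    """Probability that every one of `flips` independent tosses comes up heads."""
    return p_heads ** flips

fair = p_streak(0.5)      # an ordinary coin
clutch = p_streak(0.6)    # a "clutch" coin with a .600 heads average

# Population where 10% of coins are clutch and 90% are fair
mixed = 0.9 * fair + 0.1 * clutch

for label, p in [("fair", fair), ("clutch", clutch), ("10% clutch mix", mixed)]:
    print(label, round(p * 256, 2), "per 256")
```

The point of the analogy survives: extreme results are where a minority of talented coins shows up most clearly, so the tails of the distribution are the right place to look.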
But when Pete looked at the distribution of how players hit in the clutch, he found it was perfectly consistent with a normal distribution. For instance, out of 330 random numbers from a normal distribution, you'd expect about one of those to be more than 3 SD above or below the mean. In real life, there was indeed exactly one – Tim Raines (.352 clutch, .296 non-clutch).
If clutch hitting were indeed a real skill, there would be a lot more than just one player 3 SD from the mean.
Because Pete found no more extreme results than would be expected by chance, his conclusion was that clutch hitting didn't appear to exist.
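The "about one in 330" figure can be checked from the normal distribution directly. This is a small sketch using only the standard library; the function names are mine, and it assumes the clutch differences really are normally distributed.

```python
import math

def two_sided_tail(z):
    """P(|Z| > z) for a standard normal variable."""
    return math.erfc(z / math.sqrt(2))

def expected_outliers(n_players, z):
    """Expected number of players more than z SD from the mean, by luck alone."""
    return n_players * two_sided_tail(z)

print(round(expected_outliers(330, 3), 2))   # beyond 3 SD
print(round(expected_outliers(330, 2), 1))   # beyond 2 SD
```

For 330 players that works out to a bit under one player beyond 3 SD and about 15 beyond 2 SD.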
Tom Ruane, 2005
In this exhaustive review of a few decades' worth of Retrosheet data, Tom Ruane looks at all players' clutch hitting stats, runs a random simulation as if they were all non-clutch hitters, and finds the distributions match almost exactly. (The relevant section can be found by going to the study and searching for "Is The Data Random?")
His analysis is very similar to Pete's 1990 work, but with a much larger database.
Cramer and Palmer, 2008
Finally, we come to Cramer and Palmer's new study. It's a bit of a cross between Dick's 1977 study and Tom's study – it looks at 50 years' worth of Retrosheet data, but uses the "win probabilities" method.
And there are several sub-studies within it.
The first study calculated clutch performance for each of ten levels of leverage – highest clutch, with the game most on the line, all the way down to lowest clutch, with little chance of changing the outcome (like in a 15-0 ninth inning). Then, it calculated performance for 10 different random subsamples of the games (based on the date).
Comparing the two distributions, it turned out that the distribution of "clutchiness" was almost exactly the same as the distribution of "datiness". Since datiness is random, this suggests that clutchiness is no less random.
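To illustrate the logic of that comparison (not Pete and Dick's actual method or data), here's a toy simulation. Everything in it is an assumption of mine: players have the same success probability in every plate appearance, leverage buckets are assigned at random, and the "date" buckets are just a cycling label. With no clutch talent built in, the spread across leverage buckets should look like the spread across date buckets.

```python
import random
import statistics

def bucket_spread(outcomes, labels, n_buckets=10):
    """SD across buckets of a player's success rate within each bucket."""
    hits = [0] * n_buckets
    tries = [0] * n_buckets
    for outcome, label in zip(outcomes, labels):
        hits[label] += outcome
        tries[label] += 1
    rates = [h / t for h, t in zip(hits, tries) if t > 0]
    return statistics.pstdev(rates)

def simulate(n_players=200, n_pa=1000, p_success=0.33, seed=1):
    rng = random.Random(seed)
    leverage_spreads, date_spreads = [], []
    for _ in range(n_players):
        # Same success probability every time: no clutch talent by construction.
        outcomes = [1 if rng.random() < p_success else 0 for _ in range(n_pa)]
        leverage = [rng.randrange(10) for _ in range(n_pa)]  # stand-in for leverage level
        date = [i % 10 for i in range(n_pa)]                 # stand-in for a date-based split
        leverage_spreads.append(bucket_spread(outcomes, leverage))
        date_spreads.append(bucket_spread(outcomes, date))
    return statistics.mean(leverage_spreads), statistics.mean(date_spreads)

clutchiness, datiness = simulate()
print(round(clutchiness, 4), round(datiness, 4))
```

If real players had meaningful clutch talent, the leverage-based spread in real data would come out noticeably wider than the date-based one. The study found it didn't.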
The other substudies were:
-- looking at only the 10% highest-leverage situations, there were almost exactly as many players 2 SD and 3 SD away from the mean as if clutch were random;
-- looking at clutch performance for the 897 players with at least 3000 PA in the last 50 years, the SD was about 3 runs of clutch per 500 PA. A random simulation gave 2.5 runs. Pete and Dick write that "it may be that real life variation could be a little different from the simulated value, but the two are pretty close." My take is that the 0.5 runs is fairly significant – it means the SD of clutch would be about 1.66 runs (the square root of (3 squared minus 2.5 squared)). Still, that means the top 2.5% of players would only be about 3 runs better than average.
-- rerunning Dick's year-to-year correlation experiment gave an r of .002, which is very, very close to zero, both theoretically and practically.
-- finally, for rookies first entering the league, there was no improvement from their first at-bat (when they would presumably be very nervous) to their 100th at-bat (when they should be less nervous). While this doesn't speak to the clutch issue directly, it does serve as more evidence that players' performance doesn't seem to be affected by their personal stress level.
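Going back to the 3,000-PA substudy above, the 1.66 figure comes from subtracting variances rather than SDs. Here's that calculation as a short sketch; the function name is mine, and it assumes talent and luck are independent, so that observed variance is talent variance plus luck variance.

```python
import math

def talent_sd(observed_sd, luck_sd):
    """SD of true talent, assuming observed variance = talent variance + luck variance."""
    return math.sqrt(observed_sd ** 2 - luck_sd ** 2)

sd = talent_sd(3.0, 2.5)   # runs of clutch per 500 PA: observed vs. simulated
print(round(sd, 2))        # 1.66

# Under a normal distribution, the top 2.5% of players sit about 2 SD above average
print(round(2 * sd, 1))    # 3.3
```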
There are lots of other studies on clutch hitting that I haven't mentioned here; Cy Morong keeps an updated list of them. As I mentioned, Andy Dolphin did find evidence of significant clutch talent. "The Book," which Dolphin co-authored, found evidence of clutch talent with an SD equivalent to about 8 points of OBP. Those are the only studies I remember seeing that actually found something non-zero.
I'd be interested in seeing what Dolphin (and co-authors Tom Tango and Mitchel Lichtman) think about Pete and Dick's recent work.
While I'm on the subject of clutch, when Bill's "Underestimating the Fog" came out a couple of years ago, I responded with a study of my own. Bill disagreed with what I did, and we had a bit of a discussion on the SABR message board. I'm linking to it here, because I don't think it's online anywhere else.
-- Here's Bill's original "Underestimating the Fog" essay (pdf).
-- Here's my response in "By the Numbers" (pdf, see page 7).
-- Here's Bill's response to me, called "Mapping the Fog" (pdf).
-- And here's my response to Bill's response (pdf).