Do batters really hit .463 when gunning for a .300 average?
A player enters his final plate appearance of the season batting .299 or .300. Presumably, he wants to finish at .300. What happens?
He hits very, very well. According to the New York Times, describing a soon-to-be-published academic study,
"[The academic authors] found that the 127 hitters at .299 or .300 batted a whopping .463 in that final [plate appearance], demonstrating a motivation to succeed well beyond normal (and in what was usually an otherwise meaningless game)."
.463! Holy crap!
But ... do you see what might be going on? Selective sampling. The "final plate appearance of the season" situation is not known beforehand. It could just be that if that player gets a hit, and passes .300, he's removed from the game. So the "final appearance" sample will be biased in favor of players who got a hit.
Suppose you play a game with dice. Every roll is an at-bat. Every 1 or 2 is a hit. The expected batting average is .333. You plan to simulate a player's season of 500 AB. However, as soon as his batting average passes .300, you stop dead, and that's the end of his season.
What will you find? Over 99 percent of the time, the last AB will be a hit. (The only time it won't is if you go all 500 AB without ever passing .300.) Those dice "players" will hit over .990 in their last AB, the one where they first achieved .300. It's obviously not because the die has a motivation to succeed.
That is: it's not that the situation of being able to pass .300 makes their last AB productive. It's probably that being productive while passing .300 *makes the AB their last*. At least that's what I think is going on.
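For anyone who wants to try the dice game, here is a minimal Python sketch of it. The function names, the seed, and the number of simulated seasons are my own choices for illustration. It also assumes that reaching exactly .300 counts as "passing" .300, which the description above leaves open.

```python
import random

def simulate_season(rng, max_ab=500, p_hit=1/3, stop_early=True):
    """Play one dice 'season' and report whether the final AB was a hit.

    With stop_early=True the season ends as soon as the running
    average reaches .300 (assumption: exactly .300 counts).
    """
    hits = 0
    last_was_hit = False
    for ab in range(1, max_ab + 1):
        last_was_hit = rng.random() < p_hit  # a roll of 1 or 2 on the die
        hits += last_was_hit
        # Integer comparison avoids float rounding: hits / ab >= .300
        if stop_early and 10 * hits >= 3 * ab:
            break
    return last_was_hit

def last_ab_average(n_seasons=20000, seed=1, stop_early=True):
    """Batting average in the final AB, taken over many simulated seasons."""
    rng = random.Random(seed)
    hits = sum(simulate_season(rng, stop_early=stop_early)
               for _ in range(n_seasons))
    return hits / n_seasons

if __name__ == "__main__":
    print(f"Stop at .300:  {last_ab_average(stop_early=True):.3f}")
    print(f"Play all 500:  {last_ab_average(stop_early=False):.3f}")
```

The only difference between the two runs is the stopping rule. The die is the same in both, so any gap in the "final AB" average comes entirely from how the last AB gets selected.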
In fairness, there is a bit of evidence that there may be a motivational component too. First, when hitting .299, none of the 61 players involved walked in their final plate appearance. 0 walks for 61 does suggest those players were specifically gunning for .300.
Also, the study included batters hitting both .299 and .300. Those hitters already at .300 obviously weren't sat out, at least not right away. There were 66 players already at .300 ... those guys must have hit pretty well in order to keep the overall average at .463. (If they hadn't, presumably the study would have talked only about the .299 hitters.)
If it were indeed all a result of benchings, how many benchings would it take? Regressing to the mean a bit, suppose the 127 hitters in the sample had talent of about .290. The difference between .290 and .463 in 127 AB is the difference between 37 hits and 59 hits. That means 22 hits need to be explained by benchings. How many benchings would that take? More than 22 (because some of the benched guys would have got a hit in their next AB anyway, and stayed at .300). Maybe we can take 31 as an estimate ... if those 31 weren't benched, 9 would have got a hit next time up, staying at .300, and 22 would have made an out, dropping back below .300.
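The arithmetic behind that estimate can be written out as a short Python sketch. The function name is hypothetical, and the model is the simple one used above: every benched hitter would otherwise have taken one more AB at his .290 talent level.

```python
def benchings_needed(n_ab, observed_avg, talent_avg):
    """Estimate how many benchings would explain the excess hits.

    Each benching turns what would have been an out, with probability
    (1 - talent_avg), into a 'final AB' hit, so it accounts for only
    (1 - talent_avg) of an excess hit on average.
    """
    excess_hits = n_ab * (observed_avg - talent_avg)
    return excess_hits / (1.0 - talent_avg)

if __name__ == "__main__":
    # 127 final AB, .463 observed, .290 assumed talent
    print(round(benchings_needed(127, 0.463, 0.290)))
```

With the numbers from the paragraph above, this comes out at about 31 benchings.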
Or, what's the farthest we can go without statistical significance? Two SDs of batting average in 127 AB is about .080, or 10 hits. If, by random chance, the hitters were 2 SD better than normal in those situations, then there are only 12 hits left to explain, and only about 17 benchings are required.
(You can probably go even a bit lower if you take into account that some of the 127 PA were walks, meaning the .463 is based on fewer than 127 AB. UPDATE: David Pinto finds that it could have been 57-for-123.)
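Here is a rough Python sketch of the two-SD calculation, using the binomial standard deviation for a .290 hitter. The function names are made up for illustration, and the .290 talent level is the same assumption as before.

```python
import math

def sd_batting_average(n_ab, talent_avg):
    """Binomial SD of a batting average over n_ab at-bats."""
    return math.sqrt(talent_avg * (1.0 - talent_avg) / n_ab)

def benchings_after_luck(n_ab, observed_avg, talent_avg, n_sd=2.0):
    """Benchings still needed after crediting n_sd SDs of good luck."""
    lucky_hits = n_sd * sd_batting_average(n_ab, talent_avg) * n_ab
    excess_hits = n_ab * (observed_avg - talent_avg) - lucky_hits
    # Each benching explains (1 - talent_avg) of an excess hit on average.
    return max(excess_hits, 0.0) / (1.0 - talent_avg)

if __name__ == "__main__":
    print(f"2 SD of average: {2 * sd_batting_average(127, 0.290):.3f}")
    print(f"Benchings:       {benchings_after_luck(127, 0.463, 0.290):.1f}")
```

Calling `benchings_after_luck(123, 57 / 123, 0.290)` repeats the calculation on the 57-for-123 version of the sample.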
Is 31 benchings out of 127 batters reasonable? Is 17 benchings reasonable?
I don't know, but benchings can't be that rare. In 1980, I remember Bobby Mattick sitting Alvis Woods after he got to .300 in the last game (I can't find a reference, but the box score confirms my memory, for what that's worth). And, yesterday's NYT article actually mentions a more recent case (but doesn't seem to realize the implication for the study's findings):
"Five years ago, in a meaningless 162nd game against the Yankees, [David] Ortiz entered batting .299 for the season; he struck out in the first inning to drop to .298 and walked in the third, knowing he still had a few more chances to swing for .300.
One inning later, Ortiz singled to reach .300. He batted one more time in the sixth — he walked, refusing to swing at anything that might result in an out — and was, because of the statistical awareness of Manager Terry Francona, replaced on the bases to make sure that .300 season average would last forever."
So I'm skeptical. I guess we have to wait for the study, by Devin Pope and Uri Simonsohn, to come out to find out what's really going on. The Times says it's been accepted for publication in "Psychological Science", but it's not yet available.
Or, if anyone wants to reproduce the study, and check to see how many .300 hitters ended their seasons a couple of AB earlier than expected ...
UPDATE: commenter David N. kindly posted a link to the actual study.
If you look at the study, the authors actually show evidence that pinch-hitting is the cause! However, they didn't get the significance of that data.
In their "last scheduled plate appearance of the season", the average batter was pinch hit for 7 percent of the time.
But batters with a .298 or .299 average were pinch hit for only 4.1 percent of the time. Batters with .300 or .301 were pinch hit for 19.7 percent of the time.
And, most importantly, batters hitting exactly .300 were pinch hit for 34.3 percent of the time!
That basically confirms that the authors' results are likely to be the result of cherry-picking. If you're hitting .299, you get a chance to get a hit in your last AB to jump to .300. But if you're already hitting .300, you often don't get a chance to drop back to .299, getting an out in your last AB.
You know how when you're looking for something you lost, you always find it in the last place you look? Well, the same thing applies here. When you're looking for .300, you find it with a hit in the last AB you take.
UPDATE: Over at "The Book," commenter Guy points out that I'm overstating the case a little bit. In computing the batting average, the study did ignore players who were pinch-hit for in their last game. However, it did *not* seem to ignore players who were pinch-run for, or replaced defensively before their next plate appearance. So the selective sampling issue remains.