A game theory study on pitch selection
Commenter "Eddy" was kind enough to send me this link to what looks like a press release on a new study by Kenneth Kovash and Steve Levitt (of Freakonomics fame). The link is to a summary only; to get the actual study, I had to pay $5.
The study is in two parts: one baseball, and one football. I'll talk about the baseball results here; I'll send the football portion of the study to Brian Burke (of "Advanced NFL Stats") in case he wants to review it himself. I hope that's allowed under fair use and I don't have to pay another $5.
In the baseball half, the authors claim that pitchers throw too many fastballs. They would do better -- much better, in fact -- if they threw other kinds of pitches more often.
How can you tell, using game theory, whether fastballs are being overused? Simple: you just check the outcomes. If opposition hitters bat for an OPS of .850 when you throw a first-pitch fastball, but they OPS (can you use OPS as a verb?) .800 when you don't, then obviously you should cut back on the fastballs. At first glance, it might look like you should get rid of them entirely, because, that way, you could shave .050 off the opposition's OPS. But it's not that simple: as soon as the opposition realizes that you're not throwing fastballs, they'll be able to predict your pitches more accurately, and they'll wind up OPSing higher than .800 -- probably even higher than the original .850. Game theory can't tell you the right proportion, at least not without making assumptions that would probably be wrong. But it *can* tell you that you should adjust your strategy until the OPS-after-fastball is exactly equal to the OPS-after-non-fastball.
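To make the equilibrium idea concrete, here's a little Python sketch. Every number in it is invented for illustration (none of this is from the study): I'm just assuming hitters do better against a pitch the more often they expect it, with made-up linear responses.

```python
# Toy mixed-strategy sketch -- every number here is invented.
# Assumption: the more often you throw fastballs (rate p), the better
# hitters do against them, and the worse they do against everything else.

def ops_vs_fastball(p):
    return 0.750 + 0.200 * p   # hypothetical linear response

def ops_vs_nonfastball(p):
    return 0.900 - 0.150 * p   # hypothetical linear response

# The pitcher's optimal mix makes the hitter indifferent:
#   0.750 + 0.200p = 0.900 - 0.150p
p_star = (0.900 - 0.750) / (0.200 + 0.150)

print(round(p_star, 3))                      # optimal fastball rate, about 0.429
print(round(ops_vs_fastball(p_star), 3))     # same OPS either way
print(round(ops_vs_nonfastball(p_star), 3))
```

At any other fastball rate, one of the two OPS figures is higher, and the pitcher could lower the opposition's overall OPS by shifting pitches away from it -- which is exactly the adjustment described above.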
If that's what the Kovash/Levitt study did, it would be great. But it didn't. Instead, it did something that doesn't make sense, and makes almost all its conclusions invalid.
What did it do? It considered outcomes only for pitches that ended the "at bat". (The authors say "at bat", but I think they mean "plate appearance". I'll also use "at bat" to mean "plate appearance" for consistency with the paper.)
That's a huge selective sampling issue. It means that when a pitch on a 3-0 count is a ball, you count it; when it's put in play, you count it; but when it's a strike, you don't include it. That doesn't work. I can make up some data to show you why. Suppose:
-- Fastballs are 50% put in play, for an OPS of 1.000
-- Fastballs are 50% strikes, for an OPS of .800 after the 3-1 count.
-- Non-fastballs are 25% put in play, for an OPS of .900
-- Non-fastballs are 25% strikes, for an OPS of .800 after the 3-1 count
-- Non-fastballs are 50% balls, for an OPS of 1.000.
That summarizes to:
0.900 OPS for fastball
0.925 OPS for non-fastball
Clearly, you should throw a fastball, right?
But if you consider only the last pitch of the at-bat, you have to ignore those 3-1 counts. Then you get:
1.000 OPS for fastball
0.967 OPS for non-fastball
And it looks like you should throw *fewer* fastballs, not more. And that's wrong.
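Here's that made-up example worked out in a few lines of Python, so you can watch the ranking flip when you keep only the PA-ending pitches:

```python
# The 3-0 example from the text. Each entry: (probability, OPS, ends_pa).
fastball = [
    (0.50, 1.000, True),    # put in play
    (0.50, 0.800, False),   # strike -> 3-1 count, at-bat continues
]
non_fastball = [
    (0.25, 0.900, True),    # put in play
    (0.25, 0.800, False),   # strike -> 3-1 count
    (0.50, 1.000, True),    # ball -> walk
]

def true_ops(outcomes):
    """Expected OPS counting every pitch."""
    return sum(p * ops for p, ops, _ in outcomes)

def last_pitch_ops(outcomes):
    """Kovash/Levitt style: count only pitches that end the at-bat."""
    kept = sum(p for p, _, ends in outcomes if ends)
    return sum(p * ops for p, ops, ends in outcomes if ends) / kept

print(round(true_ops(fastball), 3), round(true_ops(non_fastball), 3))
print(round(last_pitch_ops(fastball), 3), round(last_pitch_ops(non_fastball), 3))
```

Counting every pitch, the fastball is the better choice for the pitcher (.900 against .925); counting only the last pitch of the at-bat, it looks like the worse one. That reversal is the selective-sampling problem.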
This kind of thing is exactly what Kovash and Levitt have done. They think they've shown that the fastball is a worse pitch than the non-fastball. But what they've *really* shown is that the fastball looks worse than the non-fastball only if you ignore the fact that, when a pitch doesn't end the at-bat, the fastball is more likely to move the count in the pitcher's favor.
So I don't think their main regression result, the one in Table 4, holds water, and I don't think there's a way for the reader to work around it. If the authors reran that regression considering the outcome of every pitch, not just the last pitch of the at-bat, that would fix the problem. I'm not sure why they chose not to do that.
Still, there are some other aspects of the study that are interesting.
In Table 2, the authors show results for every count separately. On 3-2, every pitch is the last pitch of the AB (except for foul balls, which the authors did include in the study, but which don't affect the results). Therefore, the change in count isn't a consideration, and we can take the results at close to face value.
So what happens? There is indeed a big difference between fastballs and non-fastballs:
.769 OPS after a 3-2 fastball
.651 OPS after a 3-2 non-fastball.
This would certainly lead to a conclusion that pitchers are throwing too many 3-2 fastballs, and the results stunned me: I didn't expect this big a difference. But then it occurred to me: most of the OPS on 3-2 is walks. And walks are undervalued in OPS. If a 3-2 fastball results in more balls in play, but the 3-2 curveball (or whatever) results in more walks, the actual run values might be more even. That is: pitchers know that walks are "worse" than OPS says they are, so they're willing to tolerate a higher OPS for fastballs if it contains fewer walks. That seems quite reasonable.
Suppose walks form half of OBP for fastballs, but 60% of OBP for curveballs. That's a difference of about .100 in OPS coming from walks. If that .100 should "really" count as something like .140, the non-fastball figure effectively gains .040, closing the gap from 120 points down to 80.
That adjustment is still not enough to explain the entire gap between fastballs and non-fastballs, but it's certainly part of it. In studies like this, where you're looking for very small discrepancies, and you have non-traditional proportions of offensive events, you need to use something more accurate than OPS.
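To illustrate, here's a Python sketch using approximate linear-weights run values (around .33 runs for a walk, .47 for a single, and so on -- the exact figures vary by source and season). The two 3-2 outcome distributions are invented, but they show how profiles 75-plus points of OPS apart can be nearly identical in runs:

```python
# Approximate linear-weights run values (league-average-ish, per event).
RUN_VALUE   = {"BB": 0.33, "1B": 0.47, "2B": 0.78, "HR": 1.40, "OUT": -0.27}
TOTAL_BASES = {"BB": 0,    "1B": 1,    "2B": 2,    "HR": 4,    "OUT": 0}
ON_BASE     = {"BB": 1,    "1B": 1,    "2B": 1,    "HR": 1,    "OUT": 0}
IS_AB       = {"BB": 0,    "1B": 1,    "2B": 1,    "HR": 1,    "OUT": 1}

def summarize(dist):
    """dist maps event -> probability per PA. Returns (OPS, runs per PA)."""
    ab   = sum(p * IS_AB[e] for e, p in dist.items())
    obp  = sum(p * ON_BASE[e] for e, p in dist.items())
    slg  = sum(p * TOTAL_BASES[e] for e, p in dist.items()) / ab
    runs = sum(p * RUN_VALUE[e] for e, p in dist.items())
    return obp + slg, runs

# Invented 3-2 profiles: the curveball draws more walks, the fastball
# allows more balls in play.
curveball = {"BB": 0.26, "1B": 0.10, "2B": 0.04, "HR": 0.02, "OUT": 0.58}
fastball  = {"BB": 0.10, "1B": 0.17, "2B": 0.07, "HR": 0.03, "OUT": 0.63}

for name, dist in [("fastball", fastball), ("curveball", curveball)]:
    ops, runs = summarize(dist)
    print(name, round(ops, 3), round(runs, 3))
```

In this (invented) case the fastball comes out some 75 points of OPS worse for the pitcher, but only a few thousandths of a run per PA worse -- so a big OPS gap on 3-2 doesn't, by itself, prove pitchers are choosing badly.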
But here's something that makes me worry, and I wonder if there's a problem with the authors' database. Consider the overall OPS values, from the authors' Table 1, for ABs ending on each type of pitch.
That data puts the average OPS at .709 (fastballs being twice as likely as non-fastballs). But the overall major-league OPS for the years of the study (2002-2006) was around .750. Why the discrepancy? The authors do say they left out about 6% of pitches, mostly "unknown", but with a few knuckleballs and screwballs. But there's no way 6% of the data could bring a .750 OPS down to .709. So I'm thinking something's wrong here.
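The arithmetic makes the point: for the league-wide figure to be .750 when the included 94% of pitches average .709, the excluded 6% would need an absurdly high OPS.

```python
# If 94% of pitches average a .709 OPS, what must the other 6% average
# for the overall figure to reach .750?
observed, league, kept = 0.709, 0.750, 0.94

required = (league - kept * observed) / (1 - kept)
print(round(required, 3))   # about 1.39
```

No plausible collection of knuckleballs, screwballs, and unclassified pitches is getting hit for a 1.39 OPS, so the exclusions can't explain the gap.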
There's no such problem with Table 2, which is broken down by count instead of pitch type. That table does average out to about .750.
UPDATE: in the comments, Guy reports that if you calculate SLG with a denominator of PA instead of AB, the numbers appear to work out OK. So the authors probably just miscalculated.
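Guy's explanation fits the size of the discrepancy. Here's a sketch with league-average-ish numbers (my assumptions, not the paper's): putting PA instead of AB in the SLG denominator deflates OPS by roughly the missing 40 points.

```python
# League-average-ish assumptions (not from the paper):
ab_share = 0.91     # fraction of plate appearances that are at-bats
tb_per_ab = 0.42    # total bases per at-bat, i.e. a .420 SLG
obp = 0.33

slg_correct = tb_per_ab             # TB / AB, the proper definition
slg_wrong = tb_per_ab * ab_share    # the same total bases divided by PA

print(round(obp + slg_correct, 3))  # about .750
print(round(obp + slg_wrong, 3))    # about .712
```

That's almost exactly the .750-to-.709 gap, which supports Guy's diagnosis.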
Finally, the authors argue that pitchers aren't randomizing enough. According to game theory, there should be no correlation between your choice of this pitch, and your choice of the next pitch. If you have a correlation, because you're choosing not to randomize properly, the opposition can pick up on that, guess pitches with more confidence, and take advantage.
Kovash and Levitt found that pitchers have negative correlation: after a fastball, they're more likely to throw a non-fastball, and vice-versa. They conclude that teams are not playing the optimal strategy, and it's costing them runs.
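Here's a quick simulation of why that matters (all parameters invented): a pitcher who throws 60% fastballs overall, but alternates too deliberately, gives away information to a hitter who simply conditions on the previous pitch.

```python
import random

random.seed(1)

def next_pitch(prev):
    # Negative serial correlation: less likely to repeat a fastball.
    p_fastball = 0.50 if prev == "FB" else 0.75
    return "FB" if random.random() < p_fastball else "OFF"

pitches = ["FB"]
for _ in range(100_000):
    pitches.append(next_pitch(pitches[-1]))

overall = pitches.count("FB") / len(pitches)
after_fb = [b for a, b in zip(pitches, pitches[1:]) if a == "FB"]
conditional = after_fb.count("FB") / len(after_fb)

print(round(overall, 3))      # base rate, around 0.60
print(round(conditional, 3))  # after a fastball, closer to 0.50
```

Under proper randomization those two numbers would be equal; the gap between them is exactly the extra predictability a hitter could exploit.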
However: couldn't there be another factor making it beneficial to do that? It's conventional wisdom that, after seeing a fastball, it's harder to hit a breaking pitch, because your brain is still "tuned" to the trajectory of the fastball. If that's true -- and I think every pitcher and broadcaster would think it is, to some extent -- that would easily explain how the negative correlation observed in the study could actually be the optimal strategy. But the authors don't mention it at all.
So I don't think we learn much from this paper, but there's a tidbit I found interesting. Apparently Kovash and Levitt have access to MLB bigwigs, and did a little survey:
"Executives of Major League Baseball teams with whom we spoke estimated that there would be a .150 gap in OPS between a batter who knew for certain a fastball was coming versus the same batter who mistakenly thought that there was a 100 percent chance the next pitch would *not* be a fastball, but in fact was surprised and faced a fastball."
That's kind of interesting. I have no idea how accurate the estimate is ... anybody seen any other research on this topic?