Can batters decrease strikeouts by reducing their power?
Here is a baseball study called "The Tradeoff Between Home Runs and Contact Rate."
Lawrence W. Cinamon ran a regression on home run rate (HR/AB), predicted by independent variables for park, season, and contact rate (non-K/AB). He used two different models, with two different statistical corrections: the first for autocorrelation, and the second for autocorrelation and heteroskedasticity. (I leave it to the statisticians reading this to comment on the corrections, which I don't understand.)
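(I don't know the details of Cinamon's setup, but here's a rough sketch, in Python with statsmodels and entirely made-up data and column names, of what a regression of that shape might look like. HAC "Newey-West" errors are one standard way to correct for autocorrelation and heteroskedasticity at the same time; whether that's what Cinamon actually used, I can't say.)

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical stand-in data: one row per player-season.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "contact_rate": rng.uniform(0.70, 0.90, n),      # non-K/AB
    "park": rng.choice(["A", "B", "C"], n),
    "season": rng.choice([2001, 2002, 2003], n),
})
# Made-up relationship: more contact, fewer homers.
df["hr_rate"] = 0.10 - 0.08 * df["contact_rate"] + rng.normal(0, 0.01, n)

# HR/AB predicted by park, season, and contact rate, with HAC errors
# covering both autocorrelation and heteroskedasticity.
model = smf.ols("hr_rate ~ contact_rate + C(park) + C(season)", data=df)
result = model.fit(cov_type="HAC", cov_kwds={"maxlags": 1})
print(result.params["contact_rate"])   # negative, by construction here
```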
For the first model, Cinamon finds that home run rate increases by .041 percentage points for every one-percentage-point increase in strikeout rate. For the second model, it's .14 percentage points.
So, using the second model, increasing strikeouts from 20% of 500 AB to 21% of 500 AB should increase the number of home runs by .7 of a home run. Roughly, every seven strikeouts equals one home run.
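A quick sanity check of that arithmetic, using the 500 AB and .14-point figures above:

```python
ab = 500       # at-bats in a season
slope = 0.14   # second model: HR/AB rises .14 points per point of K/AB
delta_k = 0.01 # strikeout rate goes from 20% to 21%

extra_strikeouts = delta_k * ab          # 5.0 more strikeouts
extra_home_runs = slope * delta_k * ab   # .0014 * 500 = 0.7 more home runs

print(extra_home_runs)                      # 0.7
print(extra_strikeouts / extra_home_runs)   # ~7.1 strikeouts per home run
```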
First, if you look at it a slightly different way, the effect is even larger. Another way to think of home run power is per ball in play: HR/BIP, rather than HR/AB. That is, the chance of hitting a home run *given that you don't strike out* increases by more than .14 points. This is consistent with Voros McCracken's model (also used by Jim Albert), where a player's ability is the combination of four skills: (a) walking; (b) striking out when not walking; (c) hitting HRs when not walking or striking out; and (d) getting a hit when not walking, striking out, or hitting a home run.
That is: suppose Dave Kingman could hit a home run every time he put the ball in play – but he struck out 95% of the time. In 500 AB, he'd have 25 hits, and all 25 would be HR. We'd still call him a power hitter, despite his .050 batting average.
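Here's the per-ball-in-play arithmetic, with a made-up 4% baseline HR/AB for illustration; the last line is the Kingman extreme:

```python
# HR per ball in play (here, HR per non-strikeout AB).
def hr_per_bip(hr_rate, k_rate):
    return hr_rate / (1 - k_rate)

before = hr_per_bip(0.040, 0.20)            # 0.0500
after = hr_per_bip(0.040 + 0.0014, 0.21)    # 0.0524

print(after - before)  # ~.24 points, larger than the .14-point HR/AB effect

# The Dave Kingman extreme: every ball in play is a homer, 95% strikeouts.
print(hr_per_bip(0.05, 0.95))  # 1.0 -- all power, .050 batting average
```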
Second, Cinamon had a quadratic variable for years of experience, but dropped it because power peaked at 22 years, and "only three players in the sample have years numbered 22 or higher." But that's no reason to drop the variable. (I'm sure you could find diseases that peak at age 100, even though not many people live to be 100.) Cinamon says the result is due to "survivorship bias" – that home runs cause long careers, not the other way around. But both are true. Players do hit for more power as they get older, as Bill James told us many years ago.
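(For the record, the peak of a quadratic term falls straight out of the coefficients. A sketch with made-up numbers that happen to peak at 22:)

```python
# A fitted quadratic in experience, b1*exp + b2*exp**2, peaks where the
# derivative is zero: exp* = -b1 / (2*b2). These coefficients are made up;
# Cinamon's paper would supply the real ones.
b1, b2 = 0.0044, -0.0001
peak = -b1 / (2 * b2)
print(peak)   # 22.0 years of experience
```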
In any case, even if there is survivorship bias, I don't see why that should invalidate the results. Cinamon says that "long ball causes long careers." But isn't that *more* reason to adjust for age?
What Cinamon does instead is limit his study to the first five seasons of players' careers. Since the tradeoff between HR and K is very likely to change as a player ages, the study's results now apply only to younger players.
Third, are we sure what the study observes is really a tradeoff? Suppose players are independently either high- or low-strikeout, and independently either high- or low-power. And suppose there's no tradeoff possible – players are what they are, and can't change. And everyone makes the majors except the bad group: the high-K, low-power players.
In that case, you'd get exactly the results you see here: HR hitters (a combination of low-K and high-K) have more strikeouts than singles hitters (who are low-K only). But there would be no tradeoff, just the illusion of one.
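That's easy to simulate. Here's a toy version in Python: four player types with independent strikeout and power traits, no tradeoff anywhere, and the high-K/low-power group filtered out. All the rates and group sizes are made up. The regression slope of HR rate on K rate still comes out positive:

```python
import random

random.seed(1)

# (K rate, HR per ball in play) for four independent player types.
# No tradeoff exists: the rates are fixed properties. Numbers are made up.
types = {
    ("low-K",  "high-power"): (0.12, 0.05),
    ("high-K", "high-power"): (0.24, 0.05),
    ("low-K",  "low-power"):  (0.12, 0.01),
    ("high-K", "low-power"):  (0.24, 0.01),  # the bad group
}

players = []
for labels, (k_rate, hr_rate) in types.items():
    if labels == ("high-K", "low-power"):
        continue  # selection: the bad group never makes the majors
    for _ in range(1000):
        ab = 500
        ks = sum(random.random() < k_rate for _ in range(ab))
        hrs = sum(random.random() < hr_rate for _ in range(ab - ks))
        players.append((ks / ab, hrs / ab))

# OLS slope of HR rate on K rate across the surviving players.
n = len(players)
mean_k = sum(k for k, _ in players) / n
mean_h = sum(h for _, h in players) / n
slope = (sum((k - mean_k) * (h - mean_h) for k, h in players)
         / sum((k - mean_k) ** 2 for k, _ in players))

print(f"slope: {slope:.3f}")  # positive, despite zero tradeoff for any player
```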
In that model, the effect is 100% selection bias, and 0% tradeoff. Cinamon assumes the effect is 0% selection bias, and 100% tradeoff. What's the real answer? Obviously, it's somewhere in the middle. But where? And how could we design a study to tell?
(Hat tip: Marginal Revolution.)