Tuesday, April 03, 2007

Is expansion responsible for the home run explosion?

Economist J. C. Bradbury, author of the recent (and enjoyable) book "The Baseball Economist," weighs in today on the New York Times op-ed page. Bradbury argues that recent historic highs in home runs and strikeouts are not because of steroids, but because of expansion.

His argument goes something like this:

-- Stephen Jay Gould showed, in his book "Full House," that the lower the standard deviation of a statistic, the better the players are, overall, at the skill it measures.
-- The standard deviation of performance is higher lately for both pitchers and hitters, which suggests a dilution of talent.
-- Dilution of pitching talent gives elite home run hitters lots of inferior pitchers to tee off on (and similarly for pitcher strikeouts).
-- The dilution of talent is most likely caused by expansion.
-- And so, expansion is the most likely explanation for the recent explosion in home runs.

I don't agree. Well, I agree partially – I think expansion is certainly responsible for part of the increase, but only a small part. There are three reasons, which I'll state here, then explain:

-- expansion can be shown to have only a small effect on batting statistics;
-- only home runs have exploded, while other stats have stayed much the same; and
-- population effects mitigate the effects of expansion.

First, expansion will cause only a small increase in home runs, not a large one. Suppose the major leagues expand from 28 to 30 teams. And suppose Joe Slugger hit 25 home runs before. How many will he hit now?

Well, 28/30 of the pitchers he faces in the expansion year will be the same ones who would have had jobs if expansion hadn't occurred. So Joe will hit 25 times 28/30 home runs against those pitchers, or 23.3 home runs.

That leaves 2/30 of the pitchers, who have jobs only because of expansion. Suppose they give up 50% more home runs than the average established pitcher. (That's probably high – if the average ERA is 4.50, 50% more than that average is 6.75. Expansion pitchers aren't *that* bad.) So against the expansion pitchers, Joe will hit this many homers: 25, times 2/30, times 1.50, which is exactly 2.5 home runs.

In the pre-expansion year, Joe hit 25 home runs. Post expansion, he'll hit 25.8 home runs. The difference is 0.8. That's about 3% more.

Three percent more home runs can't explain the recent explosion. Without the 1998 expansion, the calculation shows that Barry Bonds would have hit only 70.7 home runs in 2001, instead of 73. McGwire's record would have been 68 instead of 70. Sosa would have been 64 instead of 66.
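
The arithmetic above can be sketched in a few lines of Python. The function name and its parameters are my own labels, and the sketch assumes (as the example does) that a hitter faces established and expansion-only pitchers in proportion to the number of teams.

```python
def post_expansion_total(pre_total, old_teams, new_teams, expansion_factor):
    """Expected season total after expansion.

    Assumes the hitter faces established and expansion-only pitchers
    in proportion to the number of teams.
    """
    established_share = old_teams / new_teams   # 28/30 of the pitchers
    expansion_share = 1 - established_share     # the other 2/30
    return pre_total * (established_share + expansion_share * expansion_factor)


joe_after = post_expansion_total(25, old_teams=28, new_teams=30, expansion_factor=1.5)
inflation = joe_after / 25

print(f"Joe Slugger: {joe_after:.1f} HR ({inflation - 1:.1%} more)")

# Run the same factor backwards to deflate the actual record seasons.
for name, homers in [("Bonds 2001", 73), ("McGwire 1998", 70), ("Sosa 1998", 66)]:
    print(f"{name}: {homers / inflation:.1f} without expansion")
```

The unrounded results come out close to the figures in the text, which work from the rounded 25.8.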

Bradbury writes that home runs per game are up 30 percent in the last decade. The last decade included only the one expansion. That could explain a 3% increase, not one ten times as large.


Second: why have only home runs and strikeouts risen? If the expansion hypothesis is correct, all hitting stats should have increased. Gould took special pains in his book to talk about .400 hitters, and how the recent increase in talent made it all but impossible for batters to hit .400 these days. So why haven't batting averages gone up too? Actually, I don't know that they haven't, although we haven't seen any monster batting averages in at least a couple of decades. But what about, say, triples, or ERAs?

A quick check: in 1996, there were 855 triples hit, or about 31 per team. In 2003 (the last year covered in my copy of Total Baseball), there were 934, or about 31 per team. Why hasn't dilution affected triples?
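
Here is that quick check as a small Python sketch. The team counts (28 in 1996, 30 in 2003) follow from the 1998 expansion discussed above; the variable names are mine.

```python
# (total triples, number of teams) for each season quoted above
triples = {1996: (855, 28), 2003: (934, 30)}

per_team = {year: total / teams for year, (total, teams) in triples.items()}
for year, rate in per_team.items():
    print(f"{year}: {rate:.1f} triples per team")
```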


Third, there's the effect of population. There are more major-league players now than ever, but the pool of people from which they are chosen has also grown. On page 99 of his book, Bradbury notes that, in 1970, there was one major leaguer for every 338,687 US residents. In 2000, it was one per 375,229. Going by these raw numbers, it looks like the quality of baseball should be higher now than it was then – not lower. Why weren't records being broken in 1970 the way they're being broken now?

In fairness, Bradbury acknowledges this argument and addresses it. For one thing, he says, the increase in population is, in substantial part, due to the population living to an older age. That portion of the increase obviously doesn't increase the pool of potential baseball players, and so the raw numbers are misleading. For another thing, he argues that there are so many more sporting opportunities now, relative to the past, that the increase in talent might nonetheless wind up being spread thin because of all the additional sports that draw the talent away. Put another way, although baseball roster spots may have increased only 25% since 1970, roster spots *in all sports* may have increased by much more than that.

And so, Bradbury argues, we shouldn't use population statistics as a proxy for talent – we should use only the standard deviation of observed performance.

It's a reasonable argument, and I think Bradbury might have a point that we can't rely on raw population statistics to compare 2007 to, say, 1950. But what about more recent times? The most recent expansions took place a little over a decade ago, and opened up about 15% more roster spots. Between 1990 and 2000, the population increased about 13%, and the majors started scouting increasing numbers of players from outside the US, so the population increase roughly matches the effects of expansion, even (likely) after taking into account the aging population.
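
As a rough check on "roughly matches," here is the comparison in Python. The team counts of 26 and 30 (before the 1993 expansion and after the 1998 one) are my reading of "about 15% more roster spots"; the 13% is the population figure above. Foreign scouting and aging are left out, since I have no numbers for them.

```python
teams_before, teams_after = 26, 30   # assumption: spans the 1993 and 1998 expansions
population_growth = 0.13             # 1990 to 2000, from the text

roster_growth = teams_after / teams_before - 1
# Change in the US population available per roster spot
pool_per_job = (1 + population_growth) / (1 + roster_growth) - 1

print(f"roster spots: {roster_growth:+.1%}")
print(f"US population per roster spot: {pool_per_job:+.1%}")
```

Under these assumptions the domestic pool per job shrinks by only about 2%, before counting any international players.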

So can we really argue that other sports sucked away substantial amounts of baseball talent specifically in the last decade? Are there lots of American soccer superstars who would have otherwise become major-league pitchers? Have the NFL and NBA expanded so much, or increased salaries so much more than MLB, that would-be baseball players are defecting to football and basketball? And is the effect so large that it would increase home runs by 30% in one decade?

It doesn't seem plausible to me.


So if it's not expansion, what *has* caused the modern-day upsurge in power and strikeouts? Here's my hypothesis. (I don't have any evidence to support it, but, hey, what the heck.)

Here's what I think is happening: players realize that it's hard to hit major-league pitching. And they realize that top pitchers are getting better and better, and so harder and harder to hit against.

But they've also figured this out: if they get bigger and stronger, their stats will get better even if they don't do anything different at the plate. If they work out over the winter, and maybe even dabble in steroids, they can do exactly what they did before, but some of what used to be warning-track fly balls will now become home runs.

Put another way: it's hard to improve your hitting by changing your grip, or your batting stance, or the way you react to a breaking ball. But it's easy to improve your hitting by bulking up. So that's what players do.

And that means that a larger percentage of major-league players wind up being power hitters. In the past, the guy who hit .250 with only moderate power might get beat out of a job by the .290 slap hitter. Now, he works out over the winter, gets strong enough to turn 6 fly balls into home runs, and that makes him a .262 hitter with 20 home runs instead of 14. Now, it's the slap hitter who's out of a job.
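
To see where .262 comes from, here is a minimal sketch. The 500 at-bats is my assumption (the example doesn't state a number); everything else is from the paragraph above.

```python
at_bats = 500                    # assumption: not stated in the text
hits_before = 0.250 * at_bats    # 125 hits
homers_before = 14
extra_homers = 6                 # fly-ball outs that now clear the fence

# Each converted fly ball adds one hit and one home run.
avg_after = (hits_before + extra_homers) / at_bats
homers_after = homers_before + extra_homers

print(f"{avg_after:.3f} with {homers_after} HR")
```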

More power hitters means more strikeouts. Even if Joe Pitcher doesn't get better at all, he finds that he's not facing contact hitters who strike out every 12 AB – he's facing power hitters who strike out every 6 AB. Bingo, more strikeouts. That would be true even if the overall quality of the batters he faces didn’t change – as long as their power profile changes.
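
A small sketch of that mix effect, using the one-strikeout-per-12-AB and one-per-6-AB rates above. The 30% and 50% shares of power hitters are hypothetical numbers I picked for illustration, not measured values.

```python
def strikeouts_per_ab(power_share, power_ab_per_k=6, contact_ab_per_k=12):
    """Strikeouts per at-bat against a mix of power and contact hitters."""
    contact_share = 1 - power_share
    return power_share / power_ab_per_k + contact_share / contact_ab_per_k


# Hypothetical mixes: 30% power hitters before, 50% after.
before = strikeouts_per_ab(0.30)
after = strikeouts_per_ab(0.50)
print(f"{before:.3f} -> {after:.3f} K per AB ({after / before - 1:+.0%})")
```

The pitcher's own skill never appears in the function; only the mix of hitters changes.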

Here's how you could figure that out: find every batter from 20 years ago who was above average in runs created per game. Repeat for players from last year. Compare the two groups. If last year's group struck out more, and also hit more home runs, that would be confirmation that the increase in strikeouts didn't come solely because the pool of talent got diluted.
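
A sketch of how that comparison might look in Python. The record layout (the keys 'rc_per_game', 'ab', 'so', 'hr') is hypothetical, the cutoff is simplified to the plain mean of the players listed, and the rows are made-up placeholders that only show the shape of the input. Real season data would have to replace them.

```python
from statistics import mean


def rates_for_above_average(batters):
    """SO and HR per at-bat for batters above the group's mean RC per game.

    Each batter is a dict with the hypothetical keys
    'rc_per_game', 'ab', 'so' and 'hr'.
    """
    cutoff = mean(b["rc_per_game"] for b in batters)
    good = [b for b in batters if b["rc_per_game"] > cutoff]
    total_ab = sum(b["ab"] for b in good)
    return {
        "so_per_ab": sum(b["so"] for b in good) / total_ab,
        "hr_per_ab": sum(b["hr"] for b in good) / total_ab,
    }


# Made-up placeholder rows, NOT real players or seasons.
season_then = [
    {"rc_per_game": 6.0, "ab": 500, "so": 50, "hr": 15},
    {"rc_per_game": 5.0, "ab": 500, "so": 60, "hr": 10},
    {"rc_per_game": 3.0, "ab": 500, "so": 90, "hr": 5},
]
season_now = [
    {"rc_per_game": 6.5, "ab": 500, "so": 100, "hr": 30},
    {"rc_per_game": 5.5, "ab": 500, "so": 90, "hr": 25},
    {"rc_per_game": 3.5, "ab": 500, "so": 80, "hr": 8},
]

then = rates_for_above_average(season_then)
now = rates_for_above_average(season_now)
more_strikeouts = now["so_per_ab"] > then["so_per_ab"]
more_homers = now["hr_per_ab"] > then["hr_per_ab"]
print(then, now, more_strikeouts and more_homers)
```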



At Tuesday, April 03, 2007 11:06:00 AM, Blogger Brian Burke said...

This comment has been removed by the author.

At Tuesday, April 03, 2007 11:09:00 AM, Blogger Phil Birnbaum said...

You're right ... that's a good point.

At Tuesday, April 03, 2007 11:59:00 AM, Anonymous Anonymous said...

Expansion had virtually nothing to do with the post-1993 offensive explosion. There are many reasons we can be sure expansion isn't the explanation, and Phil covered a few of them.

I'm surprised by JCB's conclusion that variance in pitcher performance has increased since 1993, which seems to be what led him astray. Dan Fox at BPro has shown that variance in hitter performance, including slugging, has continued to decline in recent years (though slowly). I'd be surprised if the COV for pitchers has increased, much less being at an "all time high," if you correctly adjust for the greatly increased number of relievers with a small # of IP. Phil: have you read this chapter in the book? Can you explain his methodology?

At Tuesday, April 03, 2007 12:06:00 PM, Blogger Phil Birnbaum said...

Hi, Guy,

JC doesn't go into any detail about his methodology. He's got a graph of COV over the years, and ERA does show some evidence of a rise from about .3 in 1980 to .32 now.

I like your idea that the increase in COV is due to more pitchers with fewer innings ... but I don't know if JC adjusted for this or not.

At Tuesday, April 03, 2007 12:28:00 PM, Anonymous Anonymous said...

An increase from .3 to .32 isn't very impressive. Even if he looks only at starters, or pitchers with more than X IP, I'm sure the average number of IP has shrunk considerably since 1980 -- there were 15 pitchers with >250 IP in 1980, zero in 2006. This change alone would probably raise the COV much more than .02, since each pitcher has a smaller sample size and more random error. Perhaps he controlled for that, but I'm guessing he didn't. And of course, you really have to take CO out of this kind of analysis (or do park adjustment) as Coors skews things considerably.
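
The sample-size point can be sketched with a simple variance decomposition: observed spread equals true-talent spread plus random noise that shrinks as innings grow. Every number below (league ERA, talent SD, per-nine-innings noise SD) is a made-up assumption chosen for illustration, not an estimate from real data.

```python
from math import sqrt


def observed_cov(innings, league_era=4.50, talent_sd=0.60, noise_sd_per_9=3.0):
    """Coefficient of variation of observed ERA for pitchers with equal innings.

    Assumes observed ERA = true talent + independent noise, where the noise
    SD shrinks with the square root of the number of nine-inning blocks.
    All default values are hypothetical.
    """
    noise_sd = noise_sd_per_9 / sqrt(innings / 9)
    observed_sd = sqrt(talent_sd**2 + noise_sd**2)
    return observed_sd / league_era


for ip in (250, 200, 150):
    print(f"{ip} IP: COV = {observed_cov(ip):.3f}")
```

With identical true talent, the observed COV rises as innings fall, which is the direction of the effect described above.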

At Tuesday, April 03, 2007 8:34:00 PM, Anonymous Anonymous said...

isn't it easier to throw harder by bulking up?

At Tuesday, April 03, 2007 11:24:00 PM, Blogger Tangotiger said...

I would guess the new parks are hitter-friendly, and there have been a lot of them introduced in the last 15 years.

Coors was also added, which is also a boon.

The lowering of triples is probably a combination of players being selected for power not speed (speed is not as valuable in high-offense), and the artificial turf going away.

The "expansion theory" makes more sense when you look at the league expanding from 16 to 20, or 20 to 24... but, 28 to 30? That's hardly a blip. If the bubble player is 80% the average MLB player, then having 28 teams of 100% plus 2 teams of 80% gives you 30 teams of 98.7%. No one's going to notice.

At Tuesday, April 03, 2007 11:46:00 PM, Blogger Tangotiger said...

I like what highboskage says here:

It was a one-shot deal, and the culprit was likely the ball.

At Wednesday, April 04, 2007 2:20:00 AM, Blogger Greg Spira said...

Phil - As I explained over on btf, there are a lot of ways to study the statistical effects of expansion, and all of them make it very clear that expansion does not increase overall offense (except, of course, in the sense that more games equals more total runs). But I don't really think your explanation works well to explain a large jump in offense over the course of one or two seasons.

And the reality is, there hasn't been much of a long term change in offense - if you go back to 1961 and 1962, the 2 years before baseball changed the rules to help pitching, there were 9 runs scored per game. In 2005, there were 9.2 runs per game. There have been + changes of .4 or more since 1960 in 1969 (when they lowered the mound), 1973 (intro of dh), 1977, 1987, and 1997. There have been - changes of .4 or more since 1960 in 1963 (mound raised, zone enlarged), 1968, 1971, 1978, and 1988. (nice new year-to-year table in this year's ESPN Baseball Encyclopedia)

The explanations that best explain the statistical evidence for the increases in 1977, 1987 and 1993-4 are 1) changes in the ball 2) changes in the strike zone and/or 3) random variation. The most unique thing about the boost in offense in 1993-4 is not that it happened, but that there was no subsequent retreat. Unfortunately, there's never been any strong evidence that the ball has been changed, and nobody seemed to notice a particularly significant short-term change in the way the strike zone was called. So are these changes orchestrated or not? I lean towards yes; I still believe that whatever happened in 1987 and 1993 was similar - those years are, I think, not coincidentally, the only two years Dale Sveum hit double digit home runs - but I can't prove it.

At Wednesday, April 04, 2007 9:40:00 AM, Blogger Phil Birnbaum said...

Hi, Greg,

You're right ... my explanation doesn't work for sudden changes, just for long-term changes. It certainly doesn't explain what happened suddenly in 1993 and 1994.

Oh, one point: the increase in *offense* is different from the increase in *home runs*.

It's possible that a 30% increase in home runs (and a 15% increase in strikeouts) leads to exactly the increase in offense observed; I'll have to check that. But they're two separate issues.

And, if the change occurred in the "last decade," that's after 1993-94. So what happened was a change in HR *without* much of a change in runs scored, right?

I'm away from my desk today, will look into this further when I get home.

At Wednesday, April 04, 2007 12:11:00 PM, Blogger Greg Spira said...

Phil - You are right that changes in home runs and runs don't always run together, but in this case most of the change in home runs ran parallel to the change in runs:

Home Runs Per Game Per Team

1962: .93
1963: .84
1968: .61
1969: .80
1970: .88
1971: .84
1976: .58
1977: .87
1986: .91
1987: 1.06
1988: .76
1992: .72
1993: .89
1994: 1.03
1997: 1.02
1998: 1.04
1999: 1.14
2000: 1.17
2001: 1.12
2002: 1.04
2003: 1.07
2004: 1.12
2005: 1.03
2006: 1.11

Starting in 1999, there's been some additional fluctuation upwards, but overall, the recent numbers still look pretty similar to the 1987 and mid-90s numbers.

1987: 1.06
1994-1998 average: 1.04
2002-2006 average: 1.06

At Wednesday, April 04, 2007 9:26:00 PM, Blogger Phil Birnbaum said...

Thanks, Greg.

It turns out I misinterpreted J.C.'s comments -- I assumed a 30% increase over the last decade *to today*, but he wrote that there was a 30% increase in the decade up to *the expansion era*.

I should have read more carefully.

