Thursday, July 31, 2008

Batters improve when young -- but it looks like pitchers don't

In preparation for my upcoming presentation on aging patterns in baseball, I ran a little study. I found all players of a specific age (25, say) and compared their performance at 25 to their performance at 26. I took the arithmetic differences and averaged them, weighting each player by the smaller of his batting-out totals in the two seasons. The study covered season pairs from 1948-49 through 2006-07.
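
A minimal sketch of that weighting, assuming hypothetical per-player records. The field names and sample numbers are made up for illustration and are not the study's data, and reporting the summed weights as the group's outs total is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class SeasonPair:
    """One player's consecutive seasons (hypothetical fields, not the study's data)."""
    rate_first: float   # e.g. league-adjusted RC27 at age N
    outs_first: int     # batting outs at age N
    rate_second: float  # same rate stat at age N+1
    outs_second: int    # batting outs at age N+1


def weighted_mean_change(pairs):
    """Average year-to-year change, weighting each player by the smaller outs total."""
    total_weight = 0
    weighted_sum = 0.0
    for p in pairs:
        weight = min(p.outs_first, p.outs_second)
        weighted_sum += weight * (p.rate_second - p.rate_first)
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no batting outs in any matched pair")
    return weighted_sum / total_weight, total_weight


# Made-up sample: three 25-year-olds and their age-26 seasons.
sample = [
    SeasonPair(5.0, 400, 5.5, 380),
    SeasonPair(4.0, 100, 3.0, 300),
    SeasonPair(6.0, 20, 9.0, 450),
]
change, outs = weighted_mean_change(sample)
print(f"{change:+.2f} ... {outs}")
```

Using the smaller of the two totals keeps a player with a tiny sample in either season (the third one here) from dominating the average.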

Here are the results for hitters, by age. The numbers are the change in "runs created per 27 outs," adjusted for league offense. The large number to the right is the number of batting outs in the group. I've left out the extreme ages that had fewer than 200 batting outs.

18-19 ... +0.75 ... 1079
19-20 ... +0.81 ... 6303
20-21 ... +0.31 ... 24915
21-22 ... +0.14 ... 73503
22-23 ... +0.28 ... 144312
23-24 ... +0.09 ... 238990
24-25 ... +0.13 ... 332481
25-26 ... +0.05 ... 408934
26-27 ... –0.12 ... 439207
27-28 ... –0.08 ... 432988
28-29 ... –0.09 ... 408142
29-30 ... –0.20 ... 377018
30-31 ... –0.12 ... 330300
31-32 ... –0.25 ... 276881
32-33 ... –0.24 ... 226988
33-34 ... –0.25 ... 176198
34-35 ... –0.40 ... 137175
35-36 ... –0.32 ... 95257
36-37 ... –0.33 ... 66625
37-38 ... –0.60 ... 40682
38-39 ... –0.26 ... 27436
39-40 ... –0.49 ... 15966
40-41 ... –0.73 ... 9027
41-42 ... –0.41 ... 4848
42-43 ... –0.89 ... 1739
43-44 ... –0.24 ... 917
44-45 ... –0.94 ... 412
45-46 ... –0.48 ... 251

The results are just a little bit different from conventional wisdom ... it's normally accepted that the peak of performance is at age 27, but this study seems to show it's 26 (but 27 isn't actually much different).

Other than that, it's exactly as you'd expect – decelerating improvement up to a certain age, near flatness for a few years, and accelerating decline after that. I should probably draw this as a graph, and as a chart of cumulative performance, but I'll leave it like this for now.

However, for pitching, the results are not so neat. Here's the chart – the numbers are the change in component ERA (so a positive number means the pitchers got worse), and outs pitched (thirds of innings):

18-19 ... +0.03 ... 1441
19-20 ... -0.07 ... 9272
20-21 ... +0.11 ... 32909
21-22 ... +0.13 ... 83914
22-23 ... -0.01 ... 162500
23-24 ... +0.15 ... 256394
24-25 ... +0.05 ... 339126
25-26 ... +0.08 ... 390041
26-27 ... +0.20 ... 396202
27-28 ... +0.16 ... 391072
28-29 ... +0.18 ... 351132
29-30 ... +0.18 ... 323144
30-31 ... +0.20 ... 274521
31-32 ... +0.26 ... 233170
32-33 ... +0.13 ... 193837
33-34 ... +0.24 ... 156676
34-35 ... +0.23 ... 122400
35-36 ... +0.22 ... 92301
36-37 ... +0.34 ... 66984
37-38 ... +0.13 ... 45787
38-39 ... +0.16 ... 34301
39-40 ... +0.19 ... 25753
40-41 ... +0.48 ... 17021
41-42 ... +0.08 ... 11922
42-43 ... +0.24 ... 5906
43-44 ... +0.61 ... 4369
44-45 ... +0.30 ... 2790
45-46 ... +0.58 ... 2002
46-47 ... +0.97 ... 865
47-48 ... +0.53 ... 385

Only in two cases – 19-20 and 22-23 – do pitchers, as a group, actually improve. At all other ages, pitchers get worse from one year to the next.

However, the younger pitchers do seem to decline less than the older pitchers, as you'd expect. If you were to subtract 0.16 from every component ERA change in the table, every age up to 25-26 would be an improvement, and 19 of the 22 after that would be declines.
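
That shift can be checked against the rounded figures in the pitching table above. A small sketch, working in hundredths of a run so the arithmetic stays exact; with the rounded values, two of the older groups land exactly on zero, so the count of declines depends on how those ties are treated. Counting them as declines reproduces the 19 of 22; presumably the unrounded values break the ties, though that is an assumption.

```python
# Changes in component ERA from the pitching table, in hundredths of a run,
# listed in age order from 18-19 through 47-48.
changes = [
    3, -7, 11, 13, -1, 15, 5, 8,                       # 18-19 .. 25-26
    20, 16, 18, 18, 20, 26, 13, 24, 23, 22, 34,        # 26-27 .. 36-37
    13, 16, 19, 48, 8, 24, 61, 30, 58, 97, 53,         # 37-38 .. 47-48
]
SHIFT = 16  # the 0.16 subtracted from every change

adjusted = [c - SHIFT for c in changes]
young, older = adjusted[:8], adjusted[8:]

all_young_improve = all(c < 0 for c in young)  # negative = lower ERA = better
declines = sum(c > 0 for c in older)
ties = sum(c == 0 for c in older)
improvements = sum(c < 0 for c in older)

print("every age up to 25-26 improves:", all_young_improve)
print(f"older groups: {declines} declines, {ties} ties, {improvements} improvements")
```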

Still, I don’t understand why pitching should decline almost every year. A few possibilities:

1. I made a mistake in the calculations.

2. Unlike batters, pitchers at any age run the risk of a complete loss of effectiveness. So the small decline at age 20, for instance, might be a combination of 90% of pitchers improving by 0.20 runs per game and 10% of pitchers declining by more than 2.00 runs per game.

3. There is asymmetry in the measurements. In hitting, a bad decline might be from 5 runs per game to 3. In pitching, a bad decline might be from 4 runs per game to 8. So it could be that the aging pattern is the same, but a bad pitching season could be a very large number, which skews the results.

4. It could be that because pitching is close to a one-dimensional physical skill, pitchers are, in one sense, at their peak when young, and their entire career is a decline. This is somewhat supported by the fact that there are more young pitchers than young hitters, at least as measured by outs.
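
The arithmetic behind possibilities 2 and 3 can be sketched quickly. The 90%/10% split and the 0.20 and 2.00 figures come from the example in possibility 2; the ten-pitcher group in the second part is made up purely to illustrate the skew.

```python
from statistics import mean, median

# Possibility 2: blend of the two hypothetical groups
# (positive change in ERA = worse, negative = better).
share_improving, improvement = 0.90, -0.20
share_collapsing, collapse = 0.10, +2.00
group_change = share_improving * improvement + share_collapsing * collapse
print(f"group change: {group_change:+.2f}")

# Possibility 3: one blow-up season drags the mean but not the median.
# Nine made-up pitchers improve by 0.20; one goes from 4.00 to 8.00.
changes = [-0.20] * 9 + [4.00]
group_mean = mean(changes)
group_median = median(changes)
print(f"mean {group_mean:+.2f}, median {group_median:+.2f}")
```

In the first case the group shows a slight decline (+0.02) even though nine pitchers in ten got better. In the second, the mean says the group got worse while the median says the typical pitcher improved.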

What do you think? I'm at a loss to explain what's going on.

16 Comments:

At Thursday, July 31, 2008 11:05:00 AM, Blogger studes said...

Phil, I don't know if you have the 2008 THT Annual, but David Gassko came up with very similar results in his article "Do Managers Matter?"

 
At Thursday, July 31, 2008 11:07:00 AM, Blogger Phil Birnbaum said...

I do have it, and I remember skimming that article. Will check it out right now. Thanks.

 
At Thursday, July 31, 2008 11:11:00 AM, Blogger Phil Birnbaum said...

Got it! Thanks!

 
At Thursday, July 31, 2008 11:56:00 AM, Blogger David Gassko said...

Phil,

You absolutely need to regress the numbers in the first year of a matched pair to the mean (i.e., if you're looking at the change in a player's stats from 25 to 26, you regress his numbers at 25 to the mean); otherwise, you end up with huge selection bias effects. See here for more: http://www.tangotiger.net/adjacentPitching.html

 
At Thursday, July 31, 2008 12:03:00 PM, Blogger Nate Hebel said...

I would lean heavily toward explanation #4. Pitching seems more of a raw event than batting. I would also add that I would expect workload and injuries to play a role indirectly too. (Something like: pitcher performance is more affected by nagging injuries than batter performance is, and pitchers are injured more often as well?)

 
At Thursday, July 31, 2008 12:08:00 PM, Blogger Phil Birnbaum said...

David: you're right, I should have thought of that. Thanks, will rethink.

 
At Thursday, July 31, 2008 12:34:00 PM, Blogger Tangotiger said...

Right, regression is a huge issue, as David links to.

For batters, you should weight by PA, not outs.

You can also look at this for comparison purposes:
http://tangotiger.net/agepatterns.txt

 
At Thursday, July 31, 2008 12:48:00 PM, Blogger Phil Birnbaum said...

If you're using RC27, shouldn't you weight by outs? You weight by the appropriate denominator for the stat, don't you?

For instance, if a guy goes 1 for 1 and has an RC27 of infinity, you weight by 0 outs. Otherwise, you get infinity for the whole group.

 
At Thursday, July 31, 2008 12:53:00 PM, Blogger Phil Birnbaum said...

Even if you regress to the mean, you're going to have problems with dropouts. Two guys, age 37, both with league-average skill 4.00. One of them gets lucky and goes 5.00. The other gets unlucky, goes 3.00, and gets released.

Even if you regress to the mean, you won't regress the 5.00 all the way to 4.00. Maybe you'll regress him to 4.50. So next year, it looks like he dropped from 4.50 to 4.00, and it looks like aging is worse than it is.

Any good way to solve the dropout problem? I don't see any obvious ways, but maybe I'm missing something.
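
The dropout effect described above can be seen in a small simulation. This is a sketch under made-up assumptions (identical true skill for everyone, normally distributed luck, a fixed release cutoff, and regressing halfway to the mean), not a model of real rosters.

```python
import random
from statistics import mean

random.seed(2008)

TRUE_SKILL = 4.00      # every player has the same true talent; no aging at all
LUCK_SD = 1.00         # assumed season-to-season random variation
RELEASE_BELOW = 3.50   # assumed cutoff: worse first seasons get no second season
REGRESSION = 0.5       # assumed fraction of the gap to the mean regressed away

raw_changes, regressed_changes = [], []
for _ in range(20000):
    year1 = random.gauss(TRUE_SKILL, LUCK_SD)
    if year1 < RELEASE_BELOW:
        continue  # released: never appears in a matched pair
    year2 = random.gauss(TRUE_SKILL, LUCK_SD)
    regressed_year1 = TRUE_SKILL + (1 - REGRESSION) * (year1 - TRUE_SKILL)
    raw_changes.append(year2 - year1)
    regressed_changes.append(year2 - regressed_year1)

raw_mean = mean(raw_changes)
regressed_mean = mean(regressed_changes)
print(f"survivors: {len(raw_changes)}")
print(f"apparent change, unregressed: {raw_mean:+.2f}")
print(f"apparent change, regressed:   {regressed_mean:+.2f}")
```

Under these assumptions both averages come out negative even though nobody's true skill changed. Partial regression shrinks the apparent decline but does not remove it.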

 
At Thursday, July 31, 2008 2:14:00 PM, Blogger Tangotiger said...

Ok, I see your issue. You shouldn't use RC/out then. But, in the end, in your case, it won't matter much if at all.

Yes, you will always have the selection bias issue. Here are more to consider:
http://tangotiger.net/aging.html
Guy's last year would be a big issue.

And here:
http://www.tangotiger.net/AgingSelection.html
PA (opps) are not handed out at random.

We can talk about the problems for an hour at least!

 
At Thursday, July 31, 2008 2:20:00 PM, Blogger Phil Birnbaum said...

Tango,

No, I don't think it makes much difference what stat you used. I used RC27 because it was convenient.

It looks like my presentation will be about the *difficulties* of coming up with aging equations. I'll use your study that regressed to the mean as perhaps the closest we can come to an understanding.

 
At Thursday, July 31, 2008 2:26:00 PM, Blogger Phil Birnbaum said...

Here's another thought.

Suppose there are certain types of hitter who peak at 38. You'll never know, because they'd be so bad at 22 that they'd never get drafted. They'd wind up selling insurance for a living.

Suppose there are certain types of hitters who peak at 23. Their peak is very likely low, so they never make the majors.

So peaks are probably very different for different types of players. (We already know that power hitters age more gracefully than speedsters, right? Or am I misremembering?)

However, it does seem that the BEST players peak at 27. Why? Because more MVPs, batting titles, etc., are won at 27 than other ages (Bill James, 1982 Abstract).

 
At Thursday, July 31, 2008 4:56:00 PM, Blogger Tom Timmerman said...

Has anyone attempted to disentangle the effects of age and experience? In other words, do players peak as a function of age or as a function of experience?

 
At Friday, August 01, 2008 11:32:00 AM, Blogger Tangotiger said...

I looked at "experience" once. Stick with age. Not to mention how do you count 150 IP in the minor leagues? If you only look at MLB data, that counts as 0.

Phil:
http://tangotiger.net/agepatterns.txt

That tells you that speed peaks in the early 20s, and walks peak in the late 30s.

 
At Friday, August 01, 2008 11:35:00 AM, Blogger Tangotiger said...

Phil, also look at this thread, and most of the comments, particularly those starting at post 28:
http://www.insidethebook.com/ee/index.php/site/comments/another_aging_study/

 
At Sunday, August 03, 2008 12:16:00 AM, Anonymous Anonymous said...

Pitchers don't appear to improve, since the most consistent predictor of a good pitcher is strikeout rate - it has a higher correlation year-to-year than other pitching stats, and is a major factor in how good a pitcher is.
Strikeout rates by pitchers do not improve when they get older. I remember seeing that they peak rather early.

Hitters peak later, as strength (power) increases as a person ages.

 
