Sabermetric Research: Pitchers with lucky and unlucky W-L records

Thursday, June 21, 2007

Pitchers with lucky and unlucky W-L records

Another nice article in SABR's "Baseball Research Journal 35" figures out which were the luckiest and unluckiest starting pitchers in terms of career wins. It's called "Still Searching for Clutch Pitchers," by Bill Deane ("with statistics by Pete Palmer").

Deane and Palmer figured it this way: first, they counted all the runs the pitcher gave up in all his starts. Then they figured how many runs his teams scored for him in the games he started, and pro-rated that figure to his innings pitched. They subtracted one from the other to come up with a run differential. Finally, they took a .500 record (with the same number of decisions) and added or subtracted wins based on that run differential. The result is the pitcher's expected W-L record.
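Here is a minimal sketch of that bookkeeping in Python. The function name, the parameter names, and the sample pitcher are all made up for illustration, and the flat 10 runs per win is only a placeholder for whatever conversion the study actually applied.

```python
def expected_record(runs_allowed, support_per_9, innings, decisions,
                    runs_per_win=10.0):
    """Expected (wins, losses), starting from .500 and adjusting by run differential."""
    # Pro-rate the team's run support to the innings the pitcher actually threw.
    support = support_per_9 * innings / 9.0
    run_diff = support - runs_allowed
    # Convert the run differential into wins above or below .500.
    wins_above_500 = run_diff / runs_per_win
    wins = decisions / 2.0 + wins_above_500
    return wins, decisions - wins


# Hypothetical pitcher: 3000 innings, 1300 runs allowed,
# 4.5 runs of support per nine innings, 340 decisions.
wins, losses = expected_record(1300, 4.5, 3000, 340)
print(f"expected record: {wins:.1f}-{losses:.1f}")
```

The "luck" figure in the lists that follow would then be actual wins minus these expected wins.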

Here are the luckiest pitchers according to Deane and Palmer. Remember, "luck" here does not include any above- or below-normal run support, which has been taken out of the picture.

+20.6 wins: Mickey Welch (307-210 actual, 286-231 projected)
+17.8 wins: Greg Maddux (333-203 actual)
+16.8 wins: Bob Welch (211-146 actual)
+15.2 wins: Clark Griffith (237-146 actual)
+15.1 wins: Christy Mathewson (373-188 actual)
+14.0 wins: Roger Clemens (348-178 actual)

And the unluckiest:

-24.3 wins: Red Ruffing (273-225 actual, 297-201 projected)
-14.7 wins: Jim McCormick (265-214 actual)
-13.7 wins: Dizzy Trout (170-161 actual)
-13.4 wins: Bob Shawkey (195-150 actual)
-13.1 wins: Walter Johnson (417-279 actual)
-12.1 wins: Bert Blyleven (287-250 actual)

The authors note that, over all the pitchers, the distribution of the discrepancies between actual and expected, in terms of standard deviations, is exactly as you would expect from the normal distribution. And so they conclude that the difference between expected wins and actual wins is presumably due to luck.

But they don't explicitly say that they're using the standard deviation of the binomial distribution (that is, the SD of coin tosses). From the article, all we can conclude is that the distribution of the discrepancies is normal, not that it exactly matches what would have resulted if the discrepancies were pure luck. (For instance, you might find that about 2.5% of people have SAT scores more than 2 SDs above the mean, just as expected, but that doesn't mean that SAT scores are just luck.)
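To make the distinction concrete, here is a rough sketch of the coin-toss check. It assumes each decision is an independent toss weighted by the pitcher's expected winning percentage; that assumption, and the function name, are mine and not the article's. If the discrepancies were nothing but luck, the values this produces across all pitchers should have a spread close to 1 SD, which is a stronger claim than "the histogram looks normal."

```python
import math


def luck_in_sds(actual_wins, expected_wins, decisions):
    """Discrepancy in units of the binomial (coin-toss) standard deviation."""
    p = expected_wins / decisions              # expected winning percentage
    sd = math.sqrt(decisions * p * (1.0 - p))  # binomial SD of the win total
    return (actual_wins - expected_wins) / sd


# Mickey Welch: 307-210 actual, listed at +20.6 wins of luck.
decisions = 307 + 210
z = luck_in_sds(307, 307 - 20.6, decisions)
print(f"Welch is {z:.2f} binomial SDs above expectation")
```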

A couple of other minor quibbles:

How many runs does it take to turn a loss into a win? Palmer uses the formula

Runs per win = 10 * sqrt(average runs per inning for both teams)
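As a quick sketch in Python, using the square-root form of the formula (the only form that reproduces the figures quoted further down, where 4.5 + 4.5 runs per nine innings gives 10 runs per win). The function name and the per-nine-innings inputs are my own choices.

```python
import math


def runs_per_win(rs_per_9, ra_per_9):
    """Palmer-style conversion: 10 * sqrt(runs per inning, both teams combined)."""
    runs_per_inning = (rs_per_9 + ra_per_9) / 9.0
    return 10.0 * math.sqrt(runs_per_inning)


print(round(runs_per_win(4.5, 4.5), 2))  # league-average scoring on both sides
print(round(runs_per_win(4.5, 3.6), 2))  # lower-scoring environment
```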

The article bases "runs per inning for both teams" on the combined scoring of the team and its opponents *when that pitcher starts*. I don't think that's right: you have to base it on the *league* scoring. That's because the runs saved are off the league runs, so you have to start at the average.

Put it this way: suppose Roger Clemens gives up 3.5 runs a start, and the league average is 4.5. The league average of 4.5 means 10 runs per win. Clemens saved 1.0 runs *off the 4.5 level*, so you have to use the 4.5 level (not 3.5) to figure the runs per win.

Actually, the best way might be to integrate the function, or at least sum a few small slices. For instance:

The first 0.1 runs saved are at the rate of 1 win per 10 runs (using 4.5 + 4.5).
The next 0.1 runs saved are at the rate of 1 win per 9.94 runs (using 4.4 + 4.5).
The next 0.1 runs saved are at the rate of 1 win per 9.89 runs (using 4.3 + 4.5).
...
The last 0.1 runs saved are at the rate of 1 win per 9.49 runs (using 3.6 + 4.5).

The average of all the slices is the number you should use. But the article uses 9.49 runs per win for *all* the runs saved, not just the last slice, and therefore it's probably overestimating the "luck" by just a little bit. And, in fact, I've seen arguments that Palmer's formula doesn't actually work all that well anyway.
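The slicing idea can be sketched in a few lines of Python. This is one way to do it, not the article's method: it credits each 0.1-run slice at the runs-per-win rate in effect at the start of that slice, then divides total runs saved by total wins to get an effective rate. The function names and the slice width are assumptions.

```python
import math


def runs_per_win(rs_per_9, ra_per_9):
    """10 * sqrt(runs per inning, both teams combined)."""
    return 10.0 * math.sqrt((rs_per_9 + ra_per_9) / 9.0)


def sliced_wins(league_r9, pitcher_r9, step=0.1):
    """Wins per game from saving runs, summed one small slice at a time."""
    n_slices = round((league_r9 - pitcher_r9) / step)
    wins = 0.0
    for i in range(n_slices):
        level = league_r9 - i * step  # runs allowed at the start of this slice
        wins += step / runs_per_win(league_r9, level)
    return wins


runs_saved = 4.5 - 3.5
wins = sliced_wins(4.5, 3.5)
effective_rpw = runs_saved / wins
print(f"effective runs per win: {effective_rpw:.2f}")
```

The effective rate lands between the 10.0 of the first slice and the 9.49 of the last one, which is the sense in which using the lowest figure for every run saved overstates the wins a little.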

But in any case, I don't think it's enough to affect the results much, which is why I call it a quibble.

One more minor point: if ten runs make up a win for a team, does it necessarily follow that ten runs make up a win for a starter? Maybe due to leverage differences, it takes 12 runs for a starter win, but only 8 runs for a reliever win. Again, I'm not sure that's true, and even if it is true, it's probably minor anyway.

The authors do check the overall results, and find that the average "luck" is one win more than expected. "This could be," they write, "because those who are 'lucky' in the win column are more likely to get 200 decisions [which was the study cutoff]."

Sounds right to me. Overall, I think the results of this study are probably pretty accurate.

Labels: baseball

10 Comments:

At Friday, June 22, 2007 8:37:00 AM, Tangotiger said...: I don't know about starters/relievers, but the runs to win conversion is based on the run environment for that game. If Greg Maddux is facing Pedro Martinez, the runs per win value will be very low.

The conversion should be based on the run distribution. You can get the Tango Distribution from my site.
At Friday, June 22, 2007 8:44:00 AM, Phil Birnbaum said...: True, but the study didn't do that on a game-by-game basis. It basically summed the runs allowed and the runs scored, over the entire career, and used that in the RPW formula.
At Friday, June 22, 2007 10:13:00 AM, Tangotiger said...: You'd still need to use the participants of the game in question. If the expected number of runs is 8 total for the game, it really doesn't matter if the rest of the league is 8 runs or 20 runs.
At Friday, June 22, 2007 10:26:00 AM, Phil Birnbaum said...: True. If you have Maddux vs. Clemens, and each gives up 3.5 runs per game, then it takes fewer runs per win in those games. Agree with you there.

But let's say it's Clemens against an average pitcher. And you're trying to credit Clemens with the number of wins he created by giving up one fewer run. Don't you have to start with what an average pitcher would have done? After all, you're crediting Clemens with his performance *as compared to an average pitcher*.

Otherwise, aren't you giving too much credit to the pitcher? You're crediting him with the saved runs, and also with lowering the scoring so that his saved runs are worth more per run. Isn't that double-counting?
At Monday, June 25, 2007 3:00:00 PM, Anonymous said...: In the Johnny Allen example, the author mentions that Allen's teams averaged 5.78 per game, based off of his 241 starts and his teams' 1,393 runs scored in those starts. But based off of this information this doesn't translate to a 5.78 per *nine-inning game* average, as extra-inning games haven't been accounted for. (Allen's runs allowed of course does calculate out to a 4.26 runs per 9 inning game).

Does this upward bias seem negligible? It would seem to clash when used in the formula: 10*sqrt(runs scored + runs allowed per nine innings).
At Monday, June 25, 2007 4:19:00 PM, Phil Birnbaum said...: Kevin: it doesn't take into account 8.5 inning games either. I think Bill James (or someone else) used to use 9.0 innings per game, implying that the extra-inning games are cancelled out by the 8.5 inning home team wins.
At Monday, June 25, 2007 4:58:00 PM, Anonymous said...: I think using the actual RA for the pitcher is correct, not double-counting. But the RPW converter does tend to underestimate the value of runs saved in high-scoring environments (i.e. the RPW is too high), while overestimating the value of runs saved in low-scoring games. So part of Clemens' and Maddux's "luck" is probably playing in a high-scoring era.

Also, how did they handle games pitched in relief? It sounds like they weren't included in the run differential analysis. However, some of these pitchers certainly got decisions in relief games, which would affect their total number of decisions and perhaps give them a worse W-L record (assuming they entered more games with their team leading than trailing). This wouldn't affect modern pitchers, but Ruffing (for example) had 88 relief games.
At Monday, June 25, 2007 5:14:00 PM, Phil Birnbaum said...: Here's an argument why using the total runs per game *including* the pitcher concerned isn't right.

Suppose Joe Superstar comes along. He never, ever gives up a run. He pitches 100 complete games and goes 100-0.

When Joe pitches, how many runs does it take to turn a loss into a win? Infinity.

But in Joe's starts, he turned 50 losses into 50 wins. He also turned 500 runs against (assuming 5 runs per game on average) into 0 runs against. 500 runs, 50 wins, is 10 runs per win -- not infinity.

By the way, the question "how many runs does it take to turn a loss into a win when team A gives up 5 runs per game and team B gives up 3 runs" hasn't been answered, as far as I know. Has someone solved this?
At Tuesday, June 26, 2007 10:21:00 AM, Tangotiger said...: The parameters seem to be a bit unstable.

As a precondition, do we agree that once the environment changes, you have a new environment? That if you have 5 RS and 3 RA, and then you have an extra 1 RA, that how you perceive that 4th RA would be different from the 1st, 2nd and 3rd?

Also, what is it that PythagenPat is not solving for us? As a reminder:

RPG = RS + RA
exponent = RPG^0.28
WLratio = (RS/RA)^exponent
win% = WLratio/(WLratio+1)
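(An illustration added to the thread: the PythagenPat steps above translate directly into Python, and a small numerical derivative gives one possible reading of "runs per win" in an uneven environment such as 5 RS and 3 RA. The 0.28 power comes from the comment; the function names and the central-difference approach are assumptions, a sketch rather than a settled answer to the question raised earlier.)

```python
def pythagenpat_win_pct(rs, ra, power=0.28):
    """Win% from runs scored and allowed per game, per the steps above."""
    rpg = rs + ra
    exponent = rpg ** power
    wl_ratio = (rs / ra) ** exponent
    return wl_ratio / (wl_ratio + 1.0)


def marginal_runs_per_win(rs, ra, delta=1e-4):
    """Runs saved per extra win at this point, via a central difference on RA."""
    gain = pythagenpat_win_pct(rs, ra - delta) - pythagenpat_win_pct(rs, ra + delta)
    slope = gain / (2.0 * delta)  # win% gained per run of RA saved
    return 1.0 / slope


even = marginal_runs_per_win(4.5, 4.5)
lopsided = marginal_runs_per_win(5.0, 3.0)
print(f"4.5 RS / 4.5 RA: {even:.2f} runs per win")
print(f"5.0 RS / 3.0 RA: {lopsided:.2f} runs per win")
```

Because the derivative is taken at a single point, it describes only the next small slice of runs saved, which is the same "new environment" issue discussed here.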
At Tuesday, June 26, 2007 10:25:00 AM, Phil Birnbaum said...: Let me think about that PythagenPat thing a bit.

I do agree that once the environment changes, you have a new environment. That's why I suggested doing the calculation in slices. Or maybe I'm misunderstanding.
