Monday, March 03, 2008

Bill James on Bert Blyleven

Should Bert Blyleven be in the Hall of Fame? The main reason he's not is that he didn't win 300 games; his record was 287-250. However, even his critics will acknowledge that Blyleven's other stats are certainly HOF quality – 685 starts, 3701 strikeouts, and a 3.31 ERA.

So the question becomes: is Blyleven's W-L record his "fault"?

On his pay website, Bill James analyzes Blyleven's record quite thoroughly and entertainingly (using Retrosheet data). The Blyleven study is actually available in a free preview – go here and click on "Blyleven."

Bill argues, quite reasonably, that there are two reasons Blyleven might have lost a few wins off his record:

1. His teams might have given him poor run support;
2. He might have failed to "match the effort" of his teammates.

Number two means that, even though Blyleven pitched well, he might have saved his best outings for when it still wasn't good enough; giving up three runs when his team only scored two, for instance. If true, that would have cost him a bunch of wins, and, in some eyes, would be enough to keep him out of the Hall.

Bill starts by looking at run support. Throughout the essay, he compares Blyleven to six other similar pitchers. Those others, like Blyleven, had long careers and ERAs ranging from 3.22 to 3.45.

It turns out that Blyleven had poor run support compared to those other guys (runs of support per start):

4.39 Jenkins
4.38 Kaat
4.37 Carlton
4.24 John
4.22 Niekro
4.19 Blyleven
4.14 Sutton

From here, let me tell you what I would have done. Then, I'll show you what Bill did, which is much more thorough.

Blyleven had 685 starts, and got about a tenth of a run less support than average. That's about 70 runs. That means that run support cost him about 7 wins, still not enough for 300. Of course, if he had had Fergie Jenkins' support, that would be 14 wins, which does take him over the 300 mark.
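The arithmetic above can be sketched in a few lines of Python. Two things here are assumptions rather than anything stated in the post: that "average" means the mean of the other six pitchers in the table, and that ten runs are worth roughly one win (the usual rule of thumb, which is what turns 70 runs into 7 wins). The names `support`, `RUNS_PER_WIN`, and `wins_lost` are made up for illustration.

```python
# Run support per start, copied from the table above.
support = {
    "Jenkins": 4.39, "Kaat": 4.38, "Carlton": 4.37, "John": 4.24,
    "Niekro": 4.22, "Blyleven": 4.19, "Sutton": 4.14,
}
STARTS = 685
RUNS_PER_WIN = 10  # assumption: rule-of-thumb conversion, about 10 runs per win


def wins_lost(gap_per_start, starts=STARTS, runs_per_win=RUNS_PER_WIN):
    """Convert a per-start run-support gap into career wins."""
    return gap_per_start * starts / runs_per_win


# Assumption: "average" is the mean of the six comparison pitchers.
others = [runs for name, runs in support.items() if name != "Blyleven"]
avg_others = sum(others) / len(others)

gap_vs_average = avg_others - support["Blyleven"]
gap_vs_jenkins = support["Jenkins"] - support["Blyleven"]

print(f"vs. average: {wins_lost(gap_vs_average):.1f} wins")
print(f"vs. Jenkins: {wins_lost(gap_vs_jenkins):.1f} wins")
```

Under those assumptions the two gaps come out near 7 and 14 wins, matching the estimates in the paragraph above.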

As for timing of runs in games, you can use Pythagoras for that. Blyleven's ERA was 3.31; including unearned runs, his "RA" was 3.65.

A team that scores 4.19 runs per game while giving up 3.65 should have a winning percentage of .563 (using exponent 1.83). Blyleven had 537 decisions, so he should have gone 302-235: 15 games better than his actual record.
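For anyone who wants to reproduce that number, here is a minimal sketch of the Pythagorean calculation, using the run figures and the 1.83 exponent quoted above. The function name `pythagorean_pct` is just a label chosen for this example.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=1.83):
    """Pythagorean expected winning percentage."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


pct = pythagorean_pct(4.19, 3.65)

decisions = 287 + 250  # Blyleven's actual W-L record
expected_wins = round(pct * decisions)
expected_losses = decisions - expected_wins

print(f"{pct:.3f}  {expected_wins}-{expected_losses}")
```

This applies a team-level formula to one pitcher's decisions, which is the same shortcut the paragraph above takes; it is a rough estimate, not a precise model of how decisions get assigned.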

However, Blyleven pitched 7.25 innings per start, not nine. Since Bert was an above-average pitcher, the bullpen would have cost a few extra runs. Assuming Blyleven's relievers would have given up (say) 4.25 runs per 9 innings, that would have been about 80 additional runs over Blyleven's 3.65. That's 8 wins. So Blyleven was really only 7 wins worse than he "should have" been due to run timing.
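The bullpen adjustment works out as below. This is only a sketch of the reasoning in the paragraph: it assumes every game runs exactly nine innings, it uses the post's guessed 4.25 for the relievers, and it again assumes about ten runs per win. All the constant names are invented for the example.

```python
STARTS = 685
INNINGS_PER_START = 7.25
BLYLEVEN_RA9 = 3.65
BULLPEN_RA9 = 4.25   # the "(say)" guess from the post, not a measured figure
RUNS_PER_WIN = 10    # assumption: rule-of-thumb conversion
PYTHAGOREAN_GAP = 15  # wins short of the 302-235 Pythagorean record

# Innings left to the bullpen, assuming every game goes exactly nine innings.
bullpen_innings = (9 - INNINGS_PER_START) * STARTS

# Extra runs allowed because relievers, not Blyleven, pitched those innings.
extra_runs = bullpen_innings / 9 * (BULLPEN_RA9 - BLYLEVEN_RA9)
bullpen_wins = extra_runs / RUNS_PER_WIN

timing_wins = PYTHAGOREAN_GAP - bullpen_wins

print(f"{extra_runs:.0f} extra runs, {bullpen_wins:.1f} wins, "
      f"{timing_wins:.1f} wins left over for timing")
```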

So I'd conclude: run timing cost Blyleven 7 wins, and run support another 7.

Now, here's what Bill did. Actually, this is his main method; he has a couple of other methods, and some interesting observations (When given three runs of support, Don Sutton was 52-33 – Blyleven was only 29-48 !!!). You should definitely read Bill's study in its entirety, because I'm only going to describe one of his methods here.

Instead of resorting to Pythagoras, Bill looked at those six comparison pitchers, and figured out the records of their teams when they scored 0 runs of support, 1 run, 2 runs, and so on. He then counted how many times each of those scores happened in a Blyleven start, and computed an "expected" number of wins.

The expected record was 371.5-313.5. The actual record of Blyleven's teams was 364-321. That's 7.5 wins, almost exactly what the Pythagoras method found.
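In code, the method described above amounts to a weighted sum: for each level of run support, multiply the number of Blyleven starts at that level by the comparison group's team winning percentage at the same level. The sketch below shows the shape of the calculation only. The `toy_*` numbers are HYPOTHETICAL placeholders, not figures from Bill's study, and the function name is invented for the example.

```python
def expected_wins(win_pct_by_runs, starts_by_runs):
    """Sum over run-support levels: starts at that level times the
    comparison group's team winning percentage at that level."""
    return sum(n * win_pct_by_runs[runs] for runs, n in starts_by_runs.items())


# HYPOTHETICAL toy inputs, for illustration only; NOT from Bill's study.
toy_win_pct = {0: 0.0, 1: 0.10, 2: 0.25, 3: 0.50}  # comparison teams' win pct
toy_starts = {0: 2, 1: 4, 2: 4, 3: 10}             # starts at each support level

toy_expected = expected_wins(toy_win_pct, toy_starts)
print(f"toy expected wins: {toy_expected:.1f}")

# The real totals quoted above: expected 371.5 team wins, actual 364.
shortfall = 371.5 - 364
print(f"shortfall: {shortfall} wins")
```

The appeal of doing it this way is that no exponent or formula has to be taken on faith: every input is an observed won-lost record.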

I like Bill's method better than the Pythagorean one because it doesn't just satisfy sabermetricians, but it's able to convince non-sabermetricians as well. To a columnist who is openly hostile to sabermetrics, Bill's method is one that can't be dismissed out of hand. At least not as easily.

And, by the way, for those of you who want to hold Blyleven responsible for his 7 missed "timing" wins, Bill writes,

"Suppose that Blyleven has a seven-game stretch during which he wins games 13-0 and 5-2, but then loses 3-2, 4-3, 3-2, 7-4 and 3-2. Those are the actual scores of Blyleven’s games from May 3 to June 4, 1977.
Blyleven was supported by 4.43 runs per game during that stretch and allowed 3.14, but he lost five of the seven games.

"One can look at that and say that Blyleven failed to match his efforts to the runs he had to work with—but why is that all Blyleven’s fault? Isn’t it equally true that his offense failed to match their efforts to Bert’s better games? It seems to me that it is.

"So why do we hold Blyleven wholly responsible for this? Wouldn’t it be equally logical, at least, to say that this was half Blyleven’s fault, and half his team’s fault?"

I never thought of it that way before, but, yeah, Bill, you're right.
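The figures in the quoted stretch can be checked directly from the seven scores. This small sketch assumes the losses are written winner-first, so that Blyleven's team is the side with fewer runs in each of them; the variable names are made up for the example.

```python
# (runs scored by Blyleven's team, runs scored by the opponent),
# assuming losing scores in the quote are written winner-first.
games = [(13, 0), (5, 2), (2, 3), (3, 4), (2, 3), (4, 7), (2, 3)]

support = sum(scored for scored, _ in games) / len(games)
allowed = sum(against for _, against in games) / len(games)
wins = sum(scored > against for scored, against in games)

print(f"support {support:.2f}, allowed {allowed:.2f}, "
      f"record {wins}-{len(games) - wins}")
```

A team that outscores its opponents 31 to 22 over seven games would normally expect a winning record, which is why the 2-5 result makes such a good example of bad timing.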

At Tuesday, March 04, 2008 6:13:00 AM, Blogger David Barry said...

"this was half Blyleven’s fault, and half his team’s fault"

I'm not sure I agree with this (though I'm not a baseball nut, so shoot me down if I'm wrong).

Batters always give 100%. They only have to concentrate for a couple of minutes. Their good days and bad days come down to luck.

Pitchers can ease up. I don't know if they do, but perhaps they give 100% when there's a lead of less than four runs, but only 95% otherwise. So the pitcher can try to match himself to the offence, whereas the offence can't try to match itself to the pitching.

At Tuesday, March 04, 2008 8:27:00 AM, Blogger Phil Birnbaum said...

Right, that's a point. Part of the problem, though, is that the pitcher doesn't know how much offense he's going to get. If Blyleven knew he was going to get only two runs today, he could pitch harder. But, how does he know that? I suppose by the seventh inning he'd have a pretty good idea, but isn't it too late by then?

At Tuesday, March 04, 2008 8:39:00 AM, Blogger David Barry said...

Oops, that's quite true. Still there might be a small effect for some later innings, which would add up over a career. Enough to split the blame 50.5-49.5 between Blyleven and his offence, or something like that.

At Tuesday, March 04, 2008 9:24:00 AM, Blogger Chris said...

I once did a really in-depth study of pitcher run support from 1960-1979. Take every league in those years, determine how often teams won when they scored 0 runs, 1 run, 2 runs, 3 runs, etc., up to 10+ runs.

Once you have all those percentages, look at each pitcher's run support. Take the number of times his team scored 0 runs for him & multiply that by the chances of winning when scoring 0 runs (in other words, X times 0, since a team that scores no runs can't win). Then do the same for 1 run, 2 runs, and so on up the ladder.

There was a park adjustment I threw in, but frankly it was a bit rough & I don't remember it off the top of my head.

Result: Based on this approach, Bert Blyleven won fewer games than he should've; he underachieved more than any other pitcher in the 1970s that I looked up. (Jim Perry was the biggest overachiever, if you're curious.)

Chris J.

At Tuesday, March 04, 2008 10:07:00 AM, Anonymous Anonymous said...

"Result: Based on this approach, Bert Blyleven won fewer games than he should've; he underacheived more than any other pitcher in the 1970s that I looked up."

That's actually an interpretation, not a "result." The result (I assume) was that Blyleven's TEAMS won fewer games when Blyleven pitched than the RS and RA would predict, and his teams underachieved. We have absolutely no reason to assign more than half the responsibility for this bad timing to Blyleven, as opposed to his hitters (assuming one wants to assign responsibility at all). So we're left assigning Blyleven a roughly 3.5 win penalty, which should have zero effect on anyone's assessment of his career.

BTW, this essay was first published in The Hardball Times Annual (2006 maybe?).

At Tuesday, March 04, 2008 10:10:00 AM, Blogger Phil Birnbaum said...

Guy, the Bill James essay was published in the Hardball Times Annual? Or Chris's study?

At Tuesday, March 04, 2008 1:23:00 PM, Anonymous Anonymous said...

Phil: Sorry, I meant the James study. Can't remember if it was the 2006 or 2007 book.

BTW, Blyleven pitched exactly the same when score was within 1 run as he did at other times. I don't think there's much evidence he pitched poorly in the clutch.

While this was a fine (if overlong) piece by James, I have to say that I think his preoccupation with clutch performance in recent years is an unproductive distraction. For example, I don't know if you saw his piece on Biggio in Slate last week -- suggesting that Biggio was a poor clutch performer who put up good numbers by exploiting weak pitching -- but I thought it was quite weak. Interested in your thoughts.....

At Tuesday, March 04, 2008 8:32:00 PM, Blogger Cyril Morong said...

Since the issue of Blyleven's clutch pitching came up, I thought I would check some numbers at Retrosheet.

I looked at his close and late data. I have him with a career AVG-OBP-SLG in all situations of .248-.300-.365. In Close and Late (CL) situations, he has .259-.317-.368. I did not include IBBs in OBP. He does not do as well in the CL cases, but a starter could be tired then. But the CL numbers don't look like he was choking either. I don't know in general how well starters did, especially from his era, in CL situations. Maybe they all did a little worse. So I compared him to Jack Morris. I have Morris in all situations with .246-.306-.378, and in CL situations I have him with .238-.287-.355. So it looks like Blyleven was not as much a clutch pitcher as Morris.

At Wednesday, March 05, 2008 10:36:00 AM, Blogger Tangotiger said...

I posted this a few years ago:

That's how Blyleven did at various Leverage Index classes, split between his 70s and 80s careers.

That's how Morris and Blyleven did in their careers.

I should have excluded the IBB, but the data is all there for you to do with as you wish.

At Wednesday, March 05, 2008 11:16:00 AM, Blogger Phil Birnbaum said...

Cy and Tango,

Doesn't the situational data affect *how many runs* Blyleven gave up, as opposed to Bill's issue, which is the distribution of runs within games?

Put another way, his clutchness should affect his ERA given his hits and walks allowed, but not necessarily his W-L record given his ERA.

Even so, he sure was anti-clutch in those early years, wasn't he? Wow.

At Wednesday, March 05, 2008 12:40:00 PM, Anonymous Anonymous said...

I think we need to see Tango's same splits for all starting pitchers in those years to fairly interpret Blyleven's performance. Remember that high-LI invariably means Blyleven is facing hitters for the 3rd or 4th time in the game, a big disadvantage. May also be facing above-average hitters (certainly no pitchers) and/or a platoon disadvantage. Conversely, the very low LI situations are mainly where Blyleven has a big lead, meaning weak opponents and/or a good day for him, and hitters' earlier ABs in the game.

I don't doubt that he was less "clutch" in the early years (though may just be random fluctuation). But it's important to put this in context.

At Wednesday, March 05, 2008 1:22:00 PM, Blogger Phil Birnbaum said...

You're right. I was comparing him to Jack Morris, who is the only other player in Tango's spreadsheet. But we really should compare him to the norm, and we don't know what the norm is.

At Thursday, March 06, 2008 10:02:00 PM, Blogger Cyril Morong said...

Since I only looked at close and late, I think it is more than just a runs issue. It is about when he gave up those runs. I am not saying he was especially anti-clutch, but everything else being equal, the guy who is better close and late will win more. I have not yet looked at what Tango did, which probably gets more at the issue.

At Thursday, March 06, 2008 10:18:00 PM, Blogger Phil Birnbaum said...

Cy: Yeah, I think you're right, some of the effect would show up in wins as well as ERA.

"High leverage" comprises both (a) runners on base, and (b) close games, so some of the effect would be more runs, and some would be more losses.

For "Close and late," it depends whether the definition takes account of runners on base. The more it does, the more ERA will be affected as well as wins. If it doesn't, there still might be a little bit of a runs effect, because if Blyleven is worse in those situations, the hits are more concentrated there, and that leads to more runs than if the hits were scattered evenly. But that effect might be small.

At Friday, March 07, 2008 10:25:00 AM, Anonymous Anonymous said...

There's a huge selection bias in Bill James' comparison of Blyleven with only HOF pitchers in seeing how they did when teams scored no runs, 1 run, etc.
