Sunday, February 10, 2008

"Clemens Report" criticism misses the point

A couple of weeks ago, Hendricks Sports Management (HSM), Roger Clemens' agents, put together a document purporting to show that Clemens' late-career effectiveness was not unusual compared to certain other great pitchers with long careers. While the report doesn't mention steroids at all, its intent is clear: to show that you can't infer any illegal behavior on Clemens' part simply from the fact that he remained effective late in his career.

An article in today's New York Times, by Eric Bradlow, Shane Jensen, Justin Wolfers, and Adi Wyner (BJWW), tries to debunk the HSM "Roger Clemens Report." In my opinion, it fails.

BJWW criticize the Clemens Report mainly on the grounds that, if you want to see whether Clemens' career trajectory is unusual, you should compare him to *all* "durable" pitchers, not just the three pitchers (Randy Johnson, Curt Schilling, Nolan Ryan) that Clemens' defenders chose.

So they found the 31 pitchers since 1968 with at least 15 seasons of 10 starts and 3000 IP over their careers. They plotted Clemens' career trajectory against the average of the group of 31. Here's the chart (am I allowed to show it here under fair use laws? Hope so.)

Clemens is markedly different. The average pitcher shows a U-shaped curve: an improvement up to about age 31, then a decline to the end of his career. Clemens, on the other hand, shows a straight line with a slight decline (for ERA), and an *inverted* U-shaped curve for WHIP: getting worse up to about age 37, then improving after that.

Therefore, the authors say, Clemens really IS unusual. His "statisticians-for-hire" agents are guilty of selection bias. "A careful analysis, and a better informed public, are the best defense against such smoke and mirrors."

Well, I don’t agree. I think BJWW should also have done a more careful analysis, and thought about their conclusions a bit more.

First: is this group of 31 pitchers (which, by the way, BJWW don't list) really the best control group to use? It has been well known among sabermetricians, since Bill James discovered it back in the 1980s, that power pitchers have much longer career expectations than control pitchers. Comparing Clemens to a mix of power and control pitchers would bias the comparison against him.

In their article, BJWW conclude that the graphs show Clemens to be "unusual" compared to the other pitchers. Well, of course he's unusual compared to most pitchers: he is an extreme power pitcher, of a type that was shown, over 20 years ago, to have significantly longer careers than others! The Times authors think they have evidence that Clemens is on steroids, but what they've probably found is just evidence that Clemens is a power pitcher!

And this is the *less* important criticism of the Times article.

The second, and absolutely the most important, point is that the authors are attacking a straw man. Clemens' agents are NOT saying that his career is *usual* – they are saying his career is *not unprecedented by a non-steroid user*. There's a big difference there, and it's not one of statistics or regressions or comparisons – it's one of common logic.

The public was saying, "look – Clemens' longevity is unusual – therefore he's probably taking steroids." HSM is replying, "Clemens' career is unusual, but not THAT unusual. Indeed, here are three pitchers with similar career trajectories, and nobody is saying *they* took steroids."

That's a convincing reply. To rebut it, it's not enough to show that Clemens' career is even farther from the average than HSM said – because even if that's true, it's irrelevant. The HSM argument doesn't depend on the average – it depends on the extremes. What HSM is saying is, "look, you have to understand, there is a certain type of pitcher, very atypical, who has this kind of career. It's not an outlier, it's not that rare, Clemens fits right into that group, and it has nothing to do with steroids."

Look at it this way: suppose that five years ago, your neighbor Clem, down the street, comes into some money and builds a big extension on his house and buys a Ferrari. People think he robbed a bank or something. Subpoenaed to appear before a congressional investigation, he denies that he stole the money.

But the public still thinks Clem is a thief. Clem hires a lawyer to rebut the charges. The lawyer says, look, Clem won the lottery in 2003, that's how he got rich. There's no theft at all. In fact, here are three other well-regarded rich guys who also won the lottery – Ryan, Schilling, and Johnson. They're rich too, and nobody thinks THEY stole anything! See, it's quite possible to get rich without robbing a bank, so lay off my client!

Then, four reporters, in a New York Times investigative article, say, well, why the heck should we compare Clem to only these three guys, cherry-picked by Clem's lawyer? We should compare him to *everyone* who ever made a million dollars! They do, and find that, of everyone who made a million dollars in 2003, most were CEOs, and made similar amounts in 2004, 2005, and 2006. But Clem didn't make anything in those years – his career earnings trajectory is very different from the average million-dollar earner's. See? We *should* be suspicious that Clem robbed a bank! His agents are full of crap!

Well, that argument is obviously silly -- but it's exactly the argument the Times authors make.

Even if the statistical analysis is correct, it simply doesn't matter whether Clem's earnings differ from those of CEOs. What matters is whether other people have won the lottery, and whether it's reasonable to think that Clem did too.

The relevant baseball question is not "how far is Roger Clemens from the norm?" The question is: "If a player is as far from the norm as Roger Clemens, what is the chance that he took steroids?"

And the answer is: if you acknowledge that Schilling, Ryan, and Johnson have career trajectories roughly similar to Clemens', and you believe that none of them took steroids, then, from the statistical evidence alone, your first estimate of the probability Clemens cheated should be approximately *zero*.



At Sunday, February 10, 2008 7:03:00 PM, Blogger Phil Birnbaum said...

One more way of looking at this:

Schilling, Ryan and Johnson all qualify under the criterion of 15 ten-start seasons and 3000 IP overall. It is reasonable to assume, then, that all of them were on the Times authors' list of 31 comparison pitchers.

If 3 of 31 were reasonably comparable to Clemens, that's at most a 90% confidence level (assuming none of the other 28 were close, which is the most conservative assumption). Also, how many of the 31 were power pitchers and therefore properly comparable? If it's only, say, 24, then that becomes an 82% significance level.
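The "at most 90%" figure is simple arithmetic: if 3 of the 31 comparison pitchers look like Clemens, then roughly 10% of the group does, so about 90% does not. Here is a minimal sketch of that calculation; the function name is hypothetical, and it assumes the three comparable pitchers are counted out of the full group.

```python
def share_not_comparable(comparable: int, group_size: int) -> float:
    """Fraction of the comparison group that does NOT resemble the pitcher in question."""
    if group_size <= 0 or not 0 <= comparable <= group_size:
        raise ValueError("need 0 <= comparable <= group_size and group_size > 0")
    return 1 - comparable / group_size


# 3 Clemens-like pitchers out of the 31 in the Times group
print(round(share_not_comparable(3, 31), 3))  # 0.903
```

Shrinking the group to power pitchers only lowers this number further, since the same three pitchers make up a larger share of a smaller group.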

At Monday, February 11, 2008 11:56:00 AM, Blogger Don Coffin said...

My immediate reaction to the NYT article is to say that without knowing the distribution of performance, simply looking at the average performance is meaningless. (Which is in agreement with your own comment.) If the mean ERA for all 31 pitchers at age 40 is (I'm making this up) 4.25, with a standard deviation of 0.65, then the range within two standard deviations of the mean (roughly 95% of pitchers, if ERA is about normally distributed) is between 2.95 and 5.55. Big whoop.
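With the made-up figures above (mean 4.25, standard deviation 0.65), the 2.95 to 5.55 range is the mean plus or minus two standard deviations. A minimal sketch, assuming ERA at a given age is roughly normally distributed (an assumption added here, not something the article establishes), compares that range with a central 90% range. The function names are hypothetical.

```python
from statistics import NormalDist


def k_sd_range(mean: float, sd: float, k: float) -> tuple[float, float]:
    """Range extending k standard deviations on either side of the mean."""
    return mean - k * sd, mean + k * sd


def central_range(mean: float, sd: float, coverage: float) -> tuple[float, float]:
    """Central range holding `coverage` of a normal distribution (e.g. 0.90)."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)  # two-sided critical value
    return mean - z * sd, mean + z * sd


# Made-up numbers from the comment above
low2, high2 = k_sd_range(4.25, 0.65, 2)
low90, high90 = central_range(4.25, 0.65, 0.90)

print(f"two-SD range:      {low2:.2f} to {high2:.2f}")
print(f"central 90% range: {low90:.2f} to {high90:.2f}")
```

Either way the range is wide, which is the point: an individual pitcher can sit well away from the group average without being an outlier.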

At Friday, July 11, 2008 10:25:00 AM, Anonymous Anonymous said...

I can't believe there are intelligent people arguing about this...Clemens took steroids. So did Bonds, Marion Jones, etc etc etc. You don't need stats - just look at pictures of them from earlier in their career to now. You want more proof?? Just wait & see how many of them drop dead before they hit 55. My guess is that we'll have a rash of ex-supposedly superstar athletes dropping dead between 2015 & 2025. I will not shed any tears...

At Friday, July 11, 2008 1:02:00 PM, Anonymous Anonymous said...

The argument is whether the statistical evidence *alone* is convincing enough to state a case that Clemens is an outlier and thus implicated for steroids. Have to agree that without seeing the standard deviation of the populations, no way to take it seriously.

There are other stats which are more convincing. Head diameter, helmet size, chest size, uniform size, etc.

