## Monday, August 27, 2007

### Managers and the Pythagorean Projection

A reader of this blog recently sent me a couple of old academic articles arguing about managers and Pythagoras.

In 1994, Ira Horowitz wrote an article (subscription required for all these academic articles) called “On the Manager as Principal Clerk.” I’m not exactly sure what the title means (I actually haven’t read this one, but it’s referenced by the two I did read), but it attempts to evaluate major league managers by whether, and by how much, their teams beat their Pythagorean projection.

I’ve never been convinced that beating Pythagoras is a managerial skill, and I’ve always thought it was mostly luck and relief pitching. And, as far as I can tell, Horowitz didn’t really produce any evidence to address the question. Instead, he quotes the projection, in this form:

(W/L) = (R/OR)^2

which is correct. But then, inexplicably, he tries to model each individual manager by a regression of the form

(W/L) = a(R/OR) + b(R/OR)^2

Why would you include a linear term when the proven relationship doesn’t include one? Maybe Horowitz explains in the paper, but I can’t think of any reason for doing this. Maybe it’s standard that if you suspect a quadratic relationship, you include the linear term -- but in this case, we know there’s no linear term.

Anyway, Horowitz “rates” each manager by the sum a+b. The thinking is that this sum represents the win-loss ratio the manager would squeeze out of his team if it scored exactly as many runs as it allowed (set R/OR to 1 in the regression and you get a+b). That is, the paper makes the assumption that a+b represents the skill of the manager, and is constant across all RS and RA for any possible team.
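To make the a+b reading concrete, here is a rough Python sketch. The function names and the coefficient values are my own, made up for illustration; they are not estimates from Horowitz's paper. Note that the regression predicts a win-loss ratio, which has to be converted to get a winning percentage.

```python
def pythagorean_ratio(runs_ratio):
    """W/L predicted by the Pythagorean projection: (R/OR)^2."""
    return runs_ratio ** 2


def horowitz_ratio(runs_ratio, a, b):
    """W/L under the fitted form a*(R/OR) + b*(R/OR)^2."""
    return a * runs_ratio + b * runs_ratio ** 2


def ratio_to_pct(wl_ratio):
    """Convert a win-loss ratio W/L into a winning percentage W/(W+L)."""
    return wl_ratio / (1.0 + wl_ratio)


# Hypothetical coefficients for one manager (illustration only).
a, b = 0.25, 0.85

# With runs scored equal to runs allowed, R/OR = 1 and the fit returns a + b.
even_team_ratio = horowitz_ratio(1.0, a, b)
print(even_team_ratio, ratio_to_pct(even_team_ratio))

# Pure Pythagoras gives W/L = 1, i.e. a .500 team, at the same point.
print(pythagorean_ratio(1.0), ratio_to_pct(pythagorean_ratio(1.0)))
```

So a manager with a+b above 1 is being credited with a better-than-.500 record from a team that scores exactly as much as it allows.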

That doesn’t really make sense, and, in a 1997 rebuttal paper, “A Note on the Pythagorean Theorem of Baseball Production,” John Ruggiero, Lawrence Hadley, Gerry Ruggiero, and Scott Knowles provide a neat explanation of why.

They note that for any two managers, with two different “a+b” parameters, there is a value of R/OR for which the two managers predict equal results. On one side of that equilibrium number, manager X appears to be better, while on the other side, manager Y appears to be better. This contradicts Horowitz’s assumption that “a+b” provides a well-ordered ranking of individual skill.
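The crossover is easy to work out: setting a1·x + b1·x² equal to a2·x + b2·x² and dividing by x gives x = (a1 − a2)/(b2 − b1). Here is a small Python sketch of that algebra. The two sets of coefficients are invented so the curves cross at a convenient point; they are not the fitted values for any real manager.

```python
def predicted_ratio(runs_ratio, a, b):
    """W/L under the fitted form a*(R/OR) + b*(R/OR)^2."""
    return a * runs_ratio + b * runs_ratio ** 2


def crossover_runs_ratio(a1, b1, a2, b2):
    """Positive R/OR at which two managers' predicted W/L are equal, else None."""
    if b1 == b2:
        return None  # same quadratic term: the curves never cross for R/OR > 0
    x = (a1 - a2) / (b2 - b1)
    return x if x > 0 else None


# Hypothetical managers X and Y (illustration only).
ax, bx = 0.40, 0.60   # a + b = 1.00
ay, by = 0.10, 0.85   # a + b = 0.95

x_star = crossover_runs_ratio(ax, bx, ay, by)
print("crossover at R/OR =", x_star)

# Compare the two managers on either side of the crossover.
for runs_ratio in (1.0, 1.5):
    print(runs_ratio,
          predicted_ratio(runs_ratio, ax, bx),
          predicted_ratio(runs_ratio, ay, by))
```

With these made-up numbers, X has the higher a+b, yet Y has the higher predicted win-loss ratio once the runs ratio gets past the crossover, which is exactly the kind of flip the rebuttal is complaining about.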

As the rebuttal puts it,

“... the results indicate that [Al] Lopez has a higher predicted win-loss ratio than [Earl] Weaver for any runs ratio greater than 1.2 while Weaver has a higher predicted win-loss ratio for any runs ratio less than 1.2. In other words, Horowitz’s index indicates that Lopez is the better manager as long as a team is expected to outscore their opponents by 20%; otherwise Weaver is the better manager. We believe this is illogical.”

And it is indeed illogical. Unfortunately, so are some of these authors’ own criticisms. I’ll mention just one. The rebuttal paper comes up with this identity:

W - L = R - OR - E

where “E” is “excess runs” – the sum of all runs, for and against, that create a win or loss by more than the single run necessary to win the game.
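For anyone who wants to check the identity, here is a quick Python sketch. It assumes a sign convention that makes the arithmetic work: excess runs in a win count as positive, and excess runs in a loss count as negative. The game scores are invented for illustration.

```python
def tally(games):
    """Return (W, L, R, OR, E) for a list of (runs_for, runs_against) scores.

    E is the signed excess: margin beyond one run, positive in wins and
    negative in losses (an assumed convention). Ties are not handled.
    """
    wins = losses = runs = opp_runs = excess = 0
    for runs_for, runs_against in games:
        runs += runs_for
        opp_runs += runs_against
        margin = runs_for - runs_against
        if margin > 0:
            wins += 1
            excess += margin - 1      # runs beyond the one needed to win
        else:
            losses += 1
            excess += margin + 1      # negative: opponent's surplus runs
    return wins, losses, runs, opp_runs, excess


# Hypothetical five-game stretch: three one-run wins, two blowout losses.
games = [(5, 4), (2, 10), (3, 2), (1, 9), (4, 3)]
w, l, r, opp, e = tally(games)
print(w - l, r - opp - e)  # both sides of the identity
```

In this made-up stretch the team is outscored 28 to 15 but goes 3-2, and E comes out to -14, so both sides of the identity equal 1. All the identity does is move the blowout runs into their own term.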

The identity is, I guess, true, but what good is it? The authors don’t say, except to argue that because of their identity, the Pythagorean Projection is wrong because “the functional form of the equation is misspecified, and E is omitted.” I have no idea what they really mean, but it sounds like those silly arguments you hear that it’s impossible that OPS can be any good because you shouldn’t add two numbers with different denominators.

Based on that logic, the paper also argues that there is no mechanism by which managers can beat Pythagoras: “Apparently, the belief is that an efficient manager will forgo runs in a ball game once a lead is obtained, and then use these forgone runs during future games when his team is behind.”

Well, no, there are other, more realistic interpretations. One reasonable idea is that managers will use their best pitchers when it matters most, thus having a better winning percentage in one-run games, which leads to outdoing Pythagoras. The authors improperly reject this idea, too, on the grounds that they can come up with counterexamples where this doesn’t happen.

These, and many of the authors’ other comments, suggest that they don’t really understand the issues behind Pythagoras at all, and the main contribution of their paper is the explanation of why Horowitz’s measure doesn’t work.

Finally, Horowitz responds in an article called “Pythagoras’s Petulant Persecutors.” I think he rebuts the rebuttal correctly, but in academic and economics terms. Even so, the three papers don’t really tell us anything useful about anything.

Chris Jaffe’s recent article on the 2007 Diamondbacks, though, does. As has been noted in many different places, the D-Backs are in first place in their division despite giving up significantly more runs than they have scored. As of right now, they are 12 games above their Pythagorean projection.
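For reference, "games above projection" is just actual wins minus the wins Pythagoras expects. Here is a short Python sketch of the calculation, using the exponent-2 form quoted earlier. The win, loss, and run totals below are round placeholder numbers, not the Diamondbacks' actual figures.

```python
def pythagorean_wins(runs, opp_runs, games):
    """Expected wins from W/L = (R/OR)^2, i.e. win pct = R^2 / (R^2 + OR^2)."""
    pct = runs ** 2 / (runs ** 2 + opp_runs ** 2)
    return pct * games


def games_above_projection(wins, losses, runs, opp_runs):
    """Actual wins minus Pythagorean expected wins over the same games."""
    return wins - pythagorean_wins(runs, opp_runs, wins + losses)


# Placeholder team: 55-45 record while being outscored 500 to 450.
gap = games_above_projection(wins=55, losses=45, runs=450, opp_runs=500)
print(round(gap, 1))
```

A team like this placeholder one projects to a losing record, so a winning record puts it about ten games over.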

Is that just luck? Is Arizona likely to regress to the mean and fall back out of contention? Jaffe says no – they will continue to win. The reason: they aren’t outperforming because of luck, but because their mop-up men stink, with a combined ERA above 7.00. Since those pitchers are often being used when the game is completely out of hand anyway, those allowed runs are less important than others. The result: the Diamondbacks lose a lot of blowouts, and therefore appear to be “lucky” in beating Pythagoras.

Jaffe credits the manager for this; he’s got crappy relievers, and has succeeded in saving them for situations where it doesn’t matter. That’s one way manager skill can beat Pythagoras: know which pitchers are better and which are worse, and save the worse ones for situations where it doesn’t matter how bad they are.

Alternatively, maybe those relievers have just been unlucky; after all, no manager will keep a pitcher on the roster if he believes the pitcher’s real skill is in the 7.00 range. If that’s the case, then, as Jaffe says, it won’t be the W-L record that reverts to the mean; it’ll be the runs allowed.

Anyway, it seems to me that the discrepancy is still due to luck – at least in the sense that the Diamondbacks were lucky that the excess runs came when it didn’t matter. That (good) luck is offset by (bad) luck, in that the long relievers are giving up a lot more runs than they should. Combining those two leads to Chris’s conclusion that the D-Backs really are as good as their actual record.

All this leaves the question of managers and Pythagoras still open. In a study last year, Jaffe found that some managers were able to consistently beat Pythagoras over a long career. At the time, I wondered if that meant discrepancies were actually something other than luck. This could be one reason: allowing blowouts to get out of hand when it doesn’t matter anyway. It doesn’t seem to me like that should be enough, but it’s worth looking into.


At Monday, August 27, 2007 12:06:00 PM,  John said...

On the Diamondbacks, Chris is right that one of the reasons they outperform their pythag is the Jekyll and Hyde bullpen.

Is this pythag outperformance really an outcome of manager strategy? Are we really saying that most managers are *that* stupid they cannot work out that they should use good relievers in close games and bad relievers in blowouts? No. Sure, Melvin needs to take some credit for good use of the pen, but I'd argue that 85% of the pythag outperformance is due to the make-up of the pen, which is due either to a canny GM or, more likely, serendipity.

At Wednesday, August 29, 2007 7:14:00 AM,  Brian Burke said...

One suggestion to test if the Pythagorean error is due to manager skill is to determine the error for a very large set of managers and then examine the distribution. I would expect it to be normal. Then, based on Pythagorean results from teams from the same sample, create a simulation that generates Pythagorean errors based purely on random luck.

Assuming the sample size is large enough, the comparison of the two distributions would tell you what effect managers can consistently have on Pythagorean error.

For example if the distribution for the simulated error is tall and narrow, and the distribution of the actual error is wide and flat, then we have evidence that managers tend to have a consistent effect on beating or losing to the Pythagorean expectation.

The amount by which the distributions differ would be the amount by which managerial skill affects a team's Pythagorean error.

At Monday, September 03, 2007 12:53:00 AM,  Chris J. said...

'course, looking back I overstated my case. Sure there are good reasons for why the D-backs are over their projections, but criminey, they're wildly over. Must be some luck there.