## Friday, March 05, 2010

### Improving on Pythagoras

A recent press release from Iowa State University promotes a new study by physics professor Kerry Whisnant, who has found an extension to baseball's Pythagorean Projection that makes it more accurate.

The Pythagorean formula ("Pythagoras") says that you can predict team winning percentage as

wpct = RS^2 / (RS^2 + RA^2)

Where RS means "runs scored," RA means "runs allowed", and "^2" means "to the power of 2." The formula was discovered by Bill James some 30 years ago.
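For anyone who wants to play with the numbers, here is a minimal sketch of the formula in Python. The function name and the 800/700 team are made up for illustration; only the formula itself comes from the post.

```python
def pythagorean_wpct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical team: 800 runs scored, 700 allowed, over 162 games.
wpct = pythagorean_wpct(800, 700)
print(round(wpct, 3), round(wpct * 162, 1))  # 0.566 91.8
```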

But there are cases where the formula doesn't give you the whole story. Your actual win total doesn't just depend on runs scored and allowed -- it also depends on the consistency of each. If your scoring is more consistent than average, you should outperform Pythagoras in terms of wins. For instance, if you score exactly the same number of runs as you allow, you should wind up a .500 team. But if you win more blowouts than average -- by scores of 15-2 and 16-6, for instance -- you'll finish at less than .500, because you've "wasted" your runs when you didn't need them. And if you *lose* more blowouts than average, you'll win *more* than 50 percent of your games, because your opponents are "wasting" their runs.
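A toy illustration of the "wasted runs" idea, using ten invented game scores (they are not real data): the team scores exactly as many runs as it allows, so Pythagoras says .500, but two blowout wins soak up so many runs that the actual record is 4-6.

```python
# Hypothetical (runs_scored, runs_allowed) for ten games, invented for illustration.
games = [(15, 2), (16, 6), (4, 3), (3, 2),                   # four wins, two of them blowouts
         (2, 6), (1, 5), (3, 7), (2, 6), (1, 6), (3, 7)]     # six ordinary losses

rs = sum(scored for scored, _ in games)
ra = sum(allowed for _, allowed in games)
wins = sum(scored > allowed for scored, allowed in games)

pyth = rs**2 / (rs**2 + ra**2)
print(rs, ra)                     # 50 50
print(pyth, wins / len(games))    # 0.5 0.4
```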

Whisnant generalized the "blowout" argument by using the standard deviation of runs scored and allowed instead. The SD is a measure of how spread out the run totals are, so that a team with lots of blowouts will have a higher SD than a team with fewer. But then he went one step further -- he noted that, if you have two teams with the same runs scored, the team with the higher slugging percentage will have a lower SD. Why? Because home runs more consistently contribute to scoring than do singles. Generally, a home run is worth about three singles, but the effects are more certain. If the only offense you get in a game is five home runs, you're going to score exactly five runs. But if you hit 15 singles, you're going to score anywhere from 0 to 15 runs, depending on how they're bunched and how often your outs advance your baserunners.

Whisnant found that if you include the effects of SLG, you can come up with a formula that gives you a better estimate than just "vanilla" Pythagoras. (The full formula is available at the above links.)

So how big is the effect? If you have two teams with exactly the same runs scored and allowed, you would normally expect them to win the same number of games. But not if they have different SLGs. For every .080 by which one team exceeds the other in slugging percentage, it should win one extra game. Since a run is one-tenth of a win, that means every .008 is worth a run. And, since there are nine batters, every .072 of excess slugging by a single, individual batter makes him worth the equivalent of an extra run. If you have two players, both creating X runs as best you can measure them, but one's SLG is .400 and the other's is .472, then the second player should be thought of as if he's worth one run more than the other guy.
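The chain of conversions above, spelled out as a quick sketch. The constant and function names are mine; the .080-per-win figure and the ten-runs-per-win rule of thumb are the ones used in the paragraph above.

```python
SLG_PER_WIN = 0.080   # team slugging edge worth one win, as quoted above
RUNS_PER_WIN = 10     # rule of thumb: a run is one-tenth of a win
BATTERS = 9           # lineup spots sharing the team's slugging

slg_per_run_team = SLG_PER_WIN / RUNS_PER_WIN      # about .008 of team SLG per run
slg_per_run_player = slg_per_run_team * BATTERS    # about .072 of one batter's SLG per run


def extra_pythagorean_runs(slg_a, slg_b):
    """Runs' worth of edge for batter A over batter B from the SLG gap alone."""
    return (slg_a - slg_b) / slg_per_run_player


print(round(extra_pythagorean_runs(0.472, 0.400), 2))  # 1.0
```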

If you use the new formula, Whisnant found you can reduce the error of the Pythagorean formula roughly by half.

I like the study a lot. It's been known for a long time that *how* you score your runs affects how many games you win, but this is the first time, to my knowledge, that anyone has tried to quantify the effect in terms of batting-line statistics.

----

But ... I'm not 100% convinced about the exact numerical results.

Whisnant calibrated his formula on samples of head-to-head matchups between teams over a season. That's about 10-13 games. But a Pythagorean formula that evaluates winning percentages over 12 games (say) is not necessarily as accurate as one that works on 162 games.

Consider a sample of one game only. The traditional Pythagorean formula (with exponent 2) isn't the best fit: it will say, for instance, that a team that scores four runs and allows two should win 80 percent of the time. But, of course, that team will really win 100 percent of the time. For one-game samples, the best exponent is *infinity*. (Or, if you don't mind rounding, an exponent of 15 will be more than good enough.)
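To see how fast the one-game estimate approaches certainty as the exponent grows, here is a small sketch for that same 4-2 game. The intermediate exponent of 5 is just an arbitrary extra point.

```python
def pythagorean_wpct(runs_scored, runs_allowed, exponent):
    """RS^x / (RS^x + RA^x) for an arbitrary exponent x."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# One-game "season": scored 4, allowed 2.
for exponent in (2, 5, 15):
    print(exponent, round(pythagorean_wpct(4, 2, exponent), 5))
# 2 0.8
# 5 0.9697
# 15 0.99997
```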

That is: Pythagoras works best when the score of every game is independent of the total scores. That's never true, because the score of the game *contributes* to the total. But, for 162 games, it's *almost* true -- one game out of 162 is barely anything. But for one game, it's absolutely not true at all -- not only is the one game not independent of the total, it actually *is* the total!

So what about for 12 games? It's probably closer to the 162-game case than the 1-game case, but I suspect it might still be somewhat off. Whisnant gave his formula coefficients to three decimal places; my guess is that if he recalibrated to 162-game seasons, the third decimal place would definitely change, and probably the second decimal too.

Regardless, even with the existing formula, you could check, by using it for actual team seasons, and comparing its accuracy to Pythagoras on those. I bet you'd find that Whisnant's average error (or average square error) was a lot better, but that it would be very, very slightly biased, as a bit too extreme.

That might be a quibble ... but, combined with the fact that even the best simulations may not be accurate to the third decimal place ... well, I think testing with real MLB data is a must-do.

----

Also, while I think the new formula is theoretically very interesting, I'm not sure it has any practical use to a GM. Remember, to get the equivalent of one extra run, when you're evaluating two equal players, you have to take the one with a slugging percentage .072 higher. What does that mean in real-life terms? To keep things simple, I'll look only at singles and home runs.

Replacing an out with a single is worth about .75 runs. Replacing an out with a home run is worth about 1.75 runs. If a player has 500 AB in a season, .072 in slugging represents 36 extra bases.

A bit of math will show that if you want to keep run production the same, while increasing slugging by .072, you have to turn 49 singles into 21 home runs. So if you start with a .290 hitter with 20 home runs, to gain the equivalent of one single Pythagorean run, you have to trade him for a .234 hitter with 41 home runs.
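The "bit of math" is two equations in two unknowns: the runs given up in lost singles must equal the runs gained in added home runs, and the net change in total bases must be 36. A rough sketch follows, with variable names of my own choosing. Solved exactly, it comes out to about 50 singles and 22 home runs; the 49-and-21 pair above is a nearby whole-number combination that balances the runs exactly (49 × .75 = 21 × 1.75) and adds 35 bases rather than 36.

```python
AB = 500
RUNS_SINGLE_OVER_OUT = 0.75   # value of a single relative to an out
RUNS_HR_OVER_OUT = 1.75       # value of a home run relative to an out
extra_bases = 0.072 * AB      # 36 total bases

# Run-neutral:  singles_lost * 0.75 == hr_added * 1.75
# Slugging:     4 * hr_added - singles_lost == extra_bases
singles_per_hr = RUNS_HR_OVER_OUT / RUNS_SINGLE_OVER_OUT   # 2.33 singles per HR
hr_added = extra_bases / (4 - singles_per_hr)
singles_lost = singles_per_hr * hr_added
print(round(hr_added, 1), round(singles_lost, 1))  # 21.6 50.4


def traded_line(avg, hr, singles_lost, hr_added, ab=AB):
    """Batting average and HR total after swapping singles for home runs."""
    hits = round(avg * ab) - singles_lost + hr_added
    return hits / ab, hr + hr_added


print(traded_line(0.290, 20, 49, 21))  # (0.234, 41)
```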

That's a big difference. And it's not that easy to do: there aren't all that many .234 hitters with 41 home runs, and it might not be worth pursuing one just to gain one-tenth of a win.

More importantly, the run estimation techniques we have simply aren't good enough to be accurate within anything close to a single run. Linear Weights data tells us the relative value of a single and a home run *on average*. But no team is average. A hit creates more runs the more men you already have on base. If you're the Yankees, scoring 5.6 runs a game, instead of the league average 4.8 ... well, is the .234 hitter with more power *really* worth the same as the .290 hitter with less power? My gut says that he'll be worth somewhat less -- even though he'll have more men on base to drive home, his extra outs are more costly on a better team. And I suspect that, on a better team, the value of a single increases more than the value of a home run -- the single has both more men to drive in, and better hitters coming up to drive him home. The HR has more men on base, but gains no benefit from the batters following him.

Moreover, doesn't it make sense that a team scores the most runs when it has some optimum combination of singles hitters and power? A team of nine Mark McGwires would score a lot fewer runs than Linear Weights would suggest, because LW has a built-in assumption that the McGwires would have a typical number of runners on base. If you hire nine McGwires to save nine-tenths of a win in Pythagoras, you're going to lose a lot more than nine-tenths of a win in runs scored, even if Linear Weights tells you otherwise.

Generally, no matter what the runs formulas said, I wouldn't be sure that I knew how to evaluate player productivity to the point where I could be sure to gain that minimal one-run Pythagorean advantage without doing at least one run's worth of damage in the attempt.

-----

On the subject of Pythagoras in general ... there seems to be an implied argument that when a team finishes ahead of its Pythagorean projection, it's a positive thing, and when it falls short, it's a negative thing. For instance, a commenter to Whisnant's original article said that you could "squeeze a couple of extra wins" out of a certain number of runs, which suggests it's something to shoot for.

I don't think that's the way to look at it. It seems to imply that God sends a team a certain number of runs, and it's up to the players (or even the manager) to distribute those runs appropriately. And so a team that beats its Pythagorean projection by a win or two has been more successful than if they won fewer games than expected.

It doesn't seem to me that that should be true at all. It's important to win ballgames, and you do that by scoring runs, but the efficiency with which you happened to convert runs to wins is not something that really matters.

Look at it this way. Suppose you're leading 2-1 late in the game, and then you score three insurance runs in the top of the ninth. Your closer strikes out the side, and you wind up winning 5-1. Were those three runs a good thing? Of course they were -- they increased your probability of winning significantly (I could look it up, but maybe it's a tenth of a win or something).

But those runs actually made you *do worse* in terms of your Pythagorean projection. Without the top of the ninth, you would have won the game with a one-run advantage. Now, you win the game with a four-run advantage. You're actually *less efficient* in terms of Pythagoras -- you've "wasted" three runs. But so what? Those runs were a good thing: they insured the victory.

If you hadn't scored in the ninth inning, your season might have (for instance) resulted in you scoring 800 runs, allowing 800 runs, and finishing 81-81. Now, with those three runs, you scored 803 runs, allowed 800, and still finished 81-81. Why is the second scenario worse than the first?
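Putting numbers on that hypothetical season, with exponent 2 and a 162-game schedule assumed: the three insurance runs raise the projection by about three-tenths of a win, so the same 81-81 team now "underperforms" Pythagoras by that much.

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2):
    """Projected wins from RS^x / (RS^x + RA^x) over a season of `games`."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


actual_wins = 81
before = pythagorean_wins(800, 800)   # without the ninth-inning runs
after = pythagorean_wins(803, 800)    # with them
print(round(before, 1), round(after, 1))   # 81.0 81.3
print(round(actual_wins - after, 1))       # -0.3
```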

It's not. I think that, since you have more control over runs than you have over Pythagoras (which is still mostly random, despite Whisnant's findings here), you should evaluate your team by runs. It's not "runs are fixed and let's distribute them efficiently." It's, "let's score as many runs as we can and hope they're efficiently distributed, even though we have less control over that."

Take an analogy to taxes. All things considered, we'd all like to pay a lower percentage of our income in taxes -- we'd like our take-home pay to be "efficient" compared to our nominal salary. But would you rather make \$100K and pay \$40K in taxes (for a 40% average tax rate), or make \$40K and pay \$10K in taxes (for a 25% tax rate)? Concentrating on the *rate* is the wrong thing to do: you'll be cheating yourself out of \$30,000 in cash. That's because the situation is NOT that you have a fixed amount of income and want to minimize taxes -- it's that the income AFFECTS the taxes, and in ways that you can't control much.

The ninth-inning analogy: if you score three runs, they'll be taxed at a higher rate than the rest of your runs (in the sense that, since you already had the lead, they won't contribute much to victory). But the rate is LESS than 100%. You're much better off accepting those runs, even if most of their value is taxed away. Having a higher "run" tax rate may sound like a bad thing when you isolate it from the rest of reality. But when you realize that a higher tax rate means that you've scored more runs and perhaps won more games ... it's no longer a bad thing.

But if you don't agree with me, and you think beating your Pythagoras is still something to shoot for ... well, when you're behind 3-1 in the ninth, let the other team score an extra 10 runs or so. You'll still lose the game, but, boy, will your Pythagorean projection make you look efficient!
