Sabermetric Research: Run estimator accuracy

This paper from the Retrosheet research page introduces a statistic that’s a “dramatic improvement” over OPS.

Gary M. Hardegree’s article (and his stat) is called “Base-Advance Average.” Here’s what you do: you consider how many total bases were available to be advanced; then you count how many of those bases the batter actually caused to be advanced.

So suppose there’s a runner on second. There are six total potential bases – two for the runner advancing home, and four for the hitter advancing home. If the batter then doubles, four of the six bases were advanced, and so the Base-Advance Average (I’ll call it BAA) for that plate appearance would be 4/6.
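For concreteness, here is a minimal Python sketch of that single-plate-appearance calculation. The function names and the way runners are encoded (a list of occupied bases) are illustrative assumptions, not taken from Hardegree’s paper, and the caller supplies the number of bases actually advanced.

```python
from fractions import Fraction

# Bases remaining to reach home: the batter (position 0) needs 4,
# a runner on first needs 3, on second 2, on third 1.
BASES_TO_HOME = {0: 4, 1: 3, 2: 2, 3: 1}

def bases_available(runners):
    """Total bases that could be advanced: the batter's four plus
    whatever each runner still needs to score.
    `runners` is a list of occupied bases, e.g. [2] for a man on second."""
    return BASES_TO_HOME[0] + sum(BASES_TO_HOME[base] for base in runners)

def base_advance_average(runners, bases_advanced):
    """BAA for one plate appearance: bases advanced / bases available."""
    return Fraction(bases_advanced, bases_available(runners))

# Runner on second, batter doubles: the runner gains 2 bases, the batter 2.
example = base_advance_average(runners=[2], bases_advanced=4)
print(bases_available([2]))  # 6
print(example)               # 2/3, i.e. 4/6 in lowest terms
```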

Hardegree figures that 84% of games were won by the team with the higher OPS. But 95.5% were won by the team with the higher BAA. Therefore, BAA is substantially better than OPS.

Which, of course, it is. But that’s because it uses a lot more information than other stats.

OPS is based on only the “baseball card” stats – the player’s basic batting line. BAA requires much more information than that – the number of bases available and the number advanced. The increase in accuracy is therefore not surprising: by considering situational information, you can get as accurate a stat as you want.

Indeed, we could improve BAA further by adding all kinds of variables to it. For instance, by somehow including the number of outs, we could get even better accuracy – weight the on-base portion of BAA higher if there’s nobody out, and the bring-runners-around portion higher if there are two outs. Or add an adjustment based on who’s coming to bat next.

To take this to an extreme, there was a statistic floated around a few years ago, I think on the SABR mailing list, that predicted runs almost perfectly. If I recall correctly, it started like BAA, by counting all bases advanced and subtracting out bases lost via caught stealings and such. But it went one step further – it subtracted out “bases left on” at the end of an inning – so if the last out happened with a runner on third, the stat would subtract three bases.

What was left was exceedingly accurate – because, if you divided it by four, it was almost exactly identical to runs scored, by definition! Take bases advanced, subtract out bases given up, and you’re left only with bases belonging to runners who scored. The method was like the old joke about figuring out how many sheep are in a field – count the legs and divide by four.
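The bookkeeping behind that identity can be sketched in a few lines of Python. This is a toy model built on assumptions: the `Runner` record and the sample inning are hypothetical, and the original statistic (recalled above from memory) may have handled details differently. In this simplified version the result matches runs exactly rather than “almost exactly.”

```python
from dataclasses import dataclass

@dataclass
class Runner:
    bases_advanced: int  # 0 (out at the plate) through 4 (came all the way home)
    scored: bool         # False if put out on the bases or left on base

def net_bases(runners):
    """Bases advanced, minus bases given up by runners who never scored."""
    advanced = sum(r.bases_advanced for r in runners)
    given_up = sum(r.bases_advanced for r in runners if not r.scored)
    return advanced - given_up

def estimated_runs(runners):
    """Count the legs and divide by four."""
    return net_bases(runners) / 4

# Hypothetical inning: one batter comes around to score, one is stranded
# on third, one walks and is caught stealing, and two strike out.
inning = [
    Runner(4, True),   # scored
    Runner(3, False),  # left on third: subtract three bases
    Runner(1, False),  # caught stealing after reaching first
    Runner(0, False),  # strikeout
    Runner(0, False),  # strikeout
]
print(estimated_runs(inning))  # 1.0, the one run that actually scored
```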

So inventing a statistic that’s substantially better than OPS isn’t very difficult, so long as you’re willing to throw some play-by-play information into the formula.

Why, then, do we stick with OPS and Runs Created and such, when we can so easily be more accurate using BAA and other such stats? Two reasons.

First, OPS and BAA answer different explicit questions. OPS asks, “how can we determine offensive performance using only standard hitting stats?” BAA asks, “how can we determine offensive performance using detailed information on how many bases were advanced?”

The first question is probably more important, if only because we don’t really have play-by-play data for lots of players, such as Babe Ruth or some minor league callup.

Second, and much more significantly, the point of OPS is to take evidence of a player’s skill and translate that into a measure of his value. What’s better evidence of skill and talent – batting line, or BAA?

Suppose Player A bats always with the bases empty, and player B hits only with a runner on third. Each gets a single one out of every four times, and strikes out the rest.

A will wind up with a BA of .250. B will wind up with a BA of .250.

But:

A will wind up with a BAA of .063. B will wind up with a BAA of .100.
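The arithmetic behind those two figures, as a short Python sketch. It assumes (as the .100 figure implies) that B’s single always scores the runner from third, and that every out advances zero bases; the function name is illustrative.

```python
from fractions import Fraction

def season_baa(hit_rate, bases_gained_per_hit, bases_available):
    """BAA when every plate appearance has the same base situation
    and every non-hit advances nobody."""
    return hit_rate * Fraction(bases_gained_per_hit, bases_available)

single_rate = Fraction(1, 4)  # one single every four times up

# Player A, bases empty: 4 bases available, a single gains 1.
baa_a = season_baa(single_rate, 1, 4)

# Player B, runner on third: 4 + 1 = 5 bases available,
# a single gains 2 (batter to first, runner home).
baa_b = season_baa(single_rate, 2, 5)

print(baa_a, float(baa_a))  # 1/16 0.0625 (the .063 above)
print(baa_b, float(baa_b))  # 1/10 0.1
```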

The two measures give different estimates of ability. One implicitly argues that BA is the true measure of talent, and B appearing to have better BAA skill than A is an illusion caused by the situation. The other implicitly argues that BAA is the true measure of talent, and B and A appearing equal in BA talent is an illusion caused by the situation. [Does this remind you of the grue-bleen paradox?]

This gives us two different theories of what happens if both batters now hit with the bases empty:

1. If the true skill is BA, A and B will both continue to hit .250 (which means that both will now have a BAA of .063).

2. If the true skill is BAA, A will continue to have a BAA of .063 (which means he will now hit .250), and B will continue to have a BAA of .100 (which means he will now hit .400).
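The batting averages in theory 2 come from running the BAA calculation backwards. A minimal sketch, assuming singles are the only hits and that with the bases empty a single gains 1 of 4 available bases (the function name is hypothetical):

```python
from fractions import Fraction

def implied_single_rate(baa, bases_gained_per_single=1, bases_available=4):
    """Batting average a hitter needs, with the bases empty,
    to sustain a given BAA on singles alone."""
    return baa * Fraction(bases_available, bases_gained_per_single)

print(float(implied_single_rate(Fraction(1, 16))))  # 0.25 for Player A
print(float(implied_single_rate(Fraction(1, 10))))  # 0.4 for Player B
```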

Which will actually happen? Could anyone really argue number 2 in good faith?

I may be wrong, but I would argue that we’ve probably come close to the limit of accuracy for batting statistics that use a basic batting line. The same batting line can and will lead to a different number of runs scored, if the events happen in different orders (such as if the hits are scattered, or all come in the same inning). No statistic can do better than this natural limit of accuracy, and I suspect that the traditional batting line stats (like Runs Created, Linear Weights, and Base Runs) are coming pretty close.
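To illustrate the ordering effect, here is a deliberately crude Python sketch. It assumes station-to-station baserunning (every single moves each runner up exactly one base) and only two event types, so it is an illustration of the principle rather than a realistic run model; the names and sample innings are made up for the example.

```python
def runs_in_inning(events):
    """Score one inning given a string of events: 'S' = single, 'O' = out.
    Assumes every single advances each runner exactly one base."""
    runners = []  # base (1-3) currently held by each runner
    runs = 0
    outs = 0
    for event in events:
        if event == "O":
            outs += 1
            if outs == 3:
                break  # inning over; anyone still on base is stranded
        elif event == "S":
            runners = [base + 1 for base in runners] + [1]
            runs += sum(1 for base in runners if base >= 4)
            runners = [base for base in runners if base < 4]
        else:
            raise ValueError(f"unknown event: {event!r}")
    return runs

def runs_in_game(innings):
    return sum(runs_in_inning(inning) for inning in innings)

# Same batting line (4 singles, 6 outs), different order.
clustered = ["SSSSOOO", "OOO"]
scattered = ["SSOOO", "SSOOO"]

print(runs_in_game(clustered))  # 1
print(runs_in_game(scattered))  # 0
```

Any estimator that sees only the batting line has to give these two sequences the same number, so it cannot be right about both.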

If that’s true, then any new, more accurate stat would have to use situational information for its improvement. And, given the evidence that clutch hitting is essentially random, those new stats would basically be adjusting for dice rolls. Which isn’t terribly useful.

5 Comments:

At Wednesday, September 06, 2006 4:20:00 PM, Anonymous said...: A similar idea appeared in Esquire magazine about 70 years ago. Go to

http://www.baseballthinkfactory.org/btf/pages/essays/rickey/hoke.htm
At Wednesday, September 06, 2006 4:30:00 PM, Phil Birnbaum said...: Note that the end of the above link might be hidden on the screen ... if you cut and paste, make sure you get all of it including the hidden part.

At Thursday, September 07, 2006 12:43:00 AM, Anonymous said...: Just starting to read more sabermetric blogs; I am a hardcore Padre fan, read www.ducksnorts.com; anyway, I agree with the gist of this; unless a batter can adjust their skill based on the situation (or maintain their skills while others deteriorate), OPS (or, as I find, a version with a slight bias towards OBP vs. SLG instead of the 50/50 in OPS) is a measure of the batter's skill. OPS answers "what is the best measure of a hitter" correlated to runs scored. OPS (or 60% OBP, 40% SLG, according to my calculations) is that measure.
At Thursday, September 07, 2006 10:21:00 AM, Tangotiger said...: Not to get overly accurate, but 1.8*OBP+SLG is what you want. You can tell because if you do the "plus 1" method, (add a walk and a PA, or add a single, AB, and PA, etc), those proportions match reality.
At Friday, September 08, 2006 2:19:00 AM, Dan Agonistes said...: Yeah, the idea about counting the number of available bases seems to come up from time to time. Also saw it a couple years ago

http://danagonistes.blogspot.com/2004/11/actuaries-and-sabermetrics.html

Wednesday, September 06, 2006
