Wednesday, April 22, 2009

Has Runs Created stopped working?

Does Runs Created not work any more?

The reason I ask is that, if you take a look at the 2008 AL and NL pages on Baseball Reference, you'll see that RC overestimated actual runs for 28 of the 30 teams. The average discrepancy was a huge +58 runs in the AL, and +19 runs in the NL.

To emphasize: that's not the average after removing the signs, that's the average *including* the signs. If half the teams had been +58 and the other half had been –58, the average would have been zero. It wasn't.
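To make the distinction concrete, here is a minimal Python sketch. The two lists are made-up discrepancies (RC minus actual runs), not the real 2008 team figures, and the function names are just for illustration. A signed average near zero means the errors cancel; a signed average far from zero means the estimator leans one way.

```python
from statistics import mean

# Hypothetical RC-minus-actual-runs discrepancies for six teams (NOT real data).
balanced = [58, -58, 58, -58, 58, -58]  # large errors, but they cancel out
biased = [58, 40, 12, 65, -5, 30]       # errors fall mostly on the high side


def signed_mean(errors):
    """Average error with signs kept; far from zero suggests systematic bias."""
    return mean(errors)


def mean_abs(errors):
    """Average error size with signs removed; measures accuracy, not bias."""
    return mean(abs(e) for e in errors)


for name, errors in [("balanced", balanced), ("biased", biased)]:
    print(f"{name}: signed {signed_mean(errors):+.1f}, absolute {mean_abs(errors):.1f}")
```

Both lists have large absolute errors, but only the second has a signed average well above zero, which is the situation described here.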

So what I'm saying is, Runs Created now appears to be biased too high.

This has been happening since the mid 90s. Here is the average team discrepancy by season:

1985 –4
1986 –1
1987 +2
1988 –5
1989 –5
1990 +4
1991 –7
1992 +7
1993 +0
1994 +8
1995 +7
1996 +7
1997 +19
1998 +15
1999 +19
2000 +19
2001 +18
2002 +19
2003 +19
2004 +27
2005 +25
2006 +27
2007 +24
2008 +26
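
One rough way to see the step-ups is to average the list above over a few spans of years. In the Python sketch below, the season figures are copied from the list, but the era boundaries are an assumption picked by eye; other groupings are just as defensible.

```python
from statistics import mean

# Average team discrepancy (RC minus actual runs) by season, copied from the list above.
discrepancy = {
    1985: -4, 1986: -1, 1987: 2, 1988: -5, 1989: -5, 1990: 4,
    1991: -7, 1992: 7, 1993: 0, 1994: 8, 1995: 7, 1996: 7,
    1997: 19, 1998: 15, 1999: 19, 2000: 19, 2001: 18, 2002: 19,
    2003: 19, 2004: 27, 2005: 25, 2006: 27, 2007: 24, 2008: 26,
}

# ASSUMPTION: era boundaries chosen by eye, not taken from any source.
eras = [(1985, 1993), (1994, 1996), (1997, 2003), (2004, 2008)]


def era_means(data, eras):
    """Return the mean seasonal discrepancy for each (start, end) span, inclusive."""
    return {
        (start, end): mean(data[year] for year in range(start, end + 1))
        for start, end in eras
    }


for (start, end), avg in era_means(discrepancy, eras).items():
    print(f"{start}-{end}: {avg:+.1f}")
```

Grouped this way, the averages climb from roughly –1 before 1994 to about +7, then about +18, then about +26.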

Now, we know that Runs Created is biased too high in higher run environments, so that might be part of it. But it's not all of it. In the three seasons 1994 to 1996, there were 4.92, 4.84, and 5.03 runs per game respectively, and the discrepancies were only +8, +7, and +7. But in 2005 there were only 4.59 runs per game, and in 2008, only 4.65 runs per game, yet the discrepancies were +25 and +26.

Could it be that the pattern of offensive events is different from what it used to be (more walks per single, or something like that), and Runs Created doesn't work well when that happens?

By the way, I tried Base Runs, using the first version found on page 18 here (.pdf) with X=.535; the results weren't as extreme, but they were similar.

Anyone know what's going on? Is this a well-known problem and I just missed it?

P.S. For the record, I think I'm using the "technical" version of Runs Created found on this Wikipedia page.
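
For anyone who wants to check the numbers, here is a minimal Python sketch of that formula, assuming the "technical" version is RC = (H + BB – CS + HBP – GIDP) × (TB + .26(BB – IBB + HBP) + .52(SH + SF + SB)) / (AB + BB + HBP + SH + SF). The function name, the sample team line, and the actual-runs figure are all made up for illustration; verify the coefficients against the Wikipedia page before relying on them.

```python
def runs_created_technical(H, BB, CS, HBP, GIDP, TB, IBB, SH, SF, SB, AB):
    """Runs Created, 'technical' version (ASSUMED form; check against the source)."""
    on_base = H + BB - CS + HBP - GIDP                      # the "A" factor
    advancement = (TB + 0.26 * (BB - IBB + HBP)
                   + 0.52 * (SH + SF + SB))                 # the "B" factor
    opportunities = AB + BB + HBP + SH + SF                 # the "C" factor
    return on_base * advancement / opportunities


# Hypothetical team season totals (NOT a real team).
team = dict(H=1500, BB=550, CS=40, HBP=50, GIDP=130, TB=2400,
            IBB=40, SH=40, SF=45, SB=100, AB=5600)
actual_runs = 790  # hypothetical actual runs scored

estimate = runs_created_technical(**team)
print(f"RC estimate: {estimate:.0f}, discrepancy: {estimate - actual_runs:+.0f}")
```

The discrepancy printed at the end (estimate minus actual runs) is the per-team quantity that is averaged in the season list in this post.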



At Wednesday, April 22, 2009 1:13:00 PM, Anonymous Guy said...

I can't explain the sudden change in 1997 -- are you sure that doesn't reflect a change to a new version of RC?

But two thoughts on why there might be a general change in these years:
1) strikeouts have risen considerably. Although the difference is small, outs on BIP have more value than Ks because they sometimes advance runners. (And you're accounting for DPs separately.) So this should result in less scoring than RC predicts.

2) The SB success rate has increased considerably. If RC overvalues net SBs at all (I'm not sure that it does), then this would be exacerbated by the increased success rate.

At Wednesday, April 22, 2009 2:49:00 PM, Blogger Phil Birnbaum said...

Hey, Guy,

I'm pretty sure it's not a new version of RC, because I programmed the same formula for all years. Strikeouts could be it ... maybe I'll investigate.

At Wednesday, April 22, 2009 3:54:00 PM, Anonymous Guy said...

Phil: I think what you're mainly seeing is a function of OBP and SLG. RC overvalues SLG compared to OBP -- it values 1 point of OBP about 25% more than a point of SLG, while in fact the ratio is about 1.8:1 in terms of runs scored.

If you compare the current game to the pre-1993 game, OBP is only up about 10 points, while SLG is up 30-40 points. So it makes sense that RC's error has grown. Similarly, if you compare 1994-96 to 2006-08, OBP has fallen about 3 points while SLG has increased slightly. So RC's overestimate increases.

My guess is that if you looked at teams in terms of their SLG/OBP ratio, the high SLG/OBP teams will have a greater disparity between RC and RS.

Of course, Tango or Patriot could probably give you the answer off the top of their head....

At Wednesday, April 22, 2009 3:56:00 PM, Blogger Phil Birnbaum said...

That makes a lot of sense.

BTW, I found a similar (although lower) effect with BaseRuns ... I'm not a BsR expert, but isn't there a factor that has to change every year? Without that factor having changed, there was a similar overshooting.

At Thursday, April 23, 2009 3:55:00 AM, Blogger Colin Wyers said...

You don't HAVE to adjust the B multiplier for BsR every year; as with any formula, tuning to the specific run environment provides more accuracy at the team level.

The most obvious explanation is probably a change in how Baseball Reference quantifies its events, possibly a change in data providers? I don't know off the top of my head what years Retrosheet got from what data sources.

At Thursday, April 23, 2009 10:53:00 PM, Anonymous Eddy Elfenbein said...

The ratio of GDP to 1B+BB+HBP has been rising in recent years. That could be a factor.

