Sabermetric Research: Disputing Hakes/Sauer, part III

(This may not be of general interest ... it's mostly a technical addendum to part I and part II, which is where the meat of the argument is. You should read those first, if you haven't already.)

To recap again: the Hakes/Sauer study ran a regression to predict the log of player salary from last season's "eye" (BB/PA), "bat" (batting average), and "power" (TB/H). I argued that one of the problems with the regression is that they were using a geometric model for an arithmetic relationship.

Specifically: when it comes to salaries, research has shown that every expected walk is worth the same -- about $150,000, for a free agent. But the Hakes/Sauer model had every walk increasing salary by a certain *percentage*. That would make good players' walks appear to be worth more than mediocre players' walks, at the margin.
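
To make that concrete, here's a minimal sketch of what a log-salary model implies at the margin. The coefficient, the two salaries, and the 600 PA are made-up numbers, purely for illustration; they're not from Hakes/Sauer or from my own regressions. Only the $150,000 per walk comes from the paragraph above.

```python
import math

def gain_log_model(salary, coef, delta_eye):
    """Dollar gain implied by a log-linear model, where
    log(salary) rises by coef * delta_eye."""
    return salary * (math.exp(coef * delta_eye) - 1.0)

def gain_linear_model(extra_walks, dollars_per_walk=150_000):
    """Dollar gain when every walk is worth the same flat amount."""
    return extra_walks * dollars_per_walk

COEF = 2.0         # hypothetical coefficient on "eye" in a log-salary regression
DELTA_EYE = 0.01   # one extra walk per 100 PA
PA = 600           # hypothetical playing time, so 6 extra walks

star = gain_log_model(10_000_000, COEF, DELTA_EYE)   # hypothetical $10MM player
bench = gain_log_model(1_000_000, COEF, DELTA_EYE)   # hypothetical $1MM player
ratio = star / bench

flat = gain_linear_model(DELTA_EYE * PA)  # same for both players

print(f"log model:    star +${star:,.0f}, bench +${bench:,.0f}, ratio {ratio:.1f}")
print(f"linear model: +${flat:,.0f} for either player")
```

Whatever coefficient you plug in, the log model prices the same extra walks in proportion to salary, so the $10MM player's walks come out worth ten times the $1MM player's. The flat model gives both players the same dollar amount.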

I just wanted to show evidence that that's happening. In my own regression, where I used "next year salary value of offense" instead of just salary, I got these coefficients, which I reported last post:

year    eye    bat  power
-------------------------
2001   4.06   5.52   1.43
2002   5.82   3.53   1.20
2003   5.97   4.76   1.46
2004   8.48   4.76   1.46
2005   6.92   2.58   1.08
2006   7.61   5.98   0.82

Now, I'll give you the same chart, but with the sample divided into "full time" (400+ PA) and "part time" (130-399) players.
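
For anyone who wants to try the same kind of split, here's a rough sketch of fitting one regression per playing-time group. This isn't the actual code behind the tables. The function names are hypothetical, the data is synthetic (and noise-free, so the fit recovers the planted coefficients exactly), and `y` stands in for whatever dependent variable you're using.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ intercept + X (intercept first)."""
    design = np.column_stack([np.ones(len(X)), X])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs

def fit_by_playing_time(pa, X, y, full_time_min=400, part_time_min=130):
    """Fit separate regressions for full-time (400+ PA) and
    part-time (130-399 PA) players."""
    full = pa >= full_time_min
    part = (pa >= part_time_min) & (pa < full_time_min)
    return {
        "full": fit_ols(X[full], y[full]),
        "part": fit_ols(X[part], y[part]),
    }

# Synthetic stand-in data: columns of X play the role of eye, bat, power.
rng = np.random.default_rng(0)
n = 500
pa = rng.integers(130, 700, size=n)
X = rng.normal(size=(n, 3))
true_coefs = np.array([1.0, 5.0, 4.0, 1.5])   # intercept, eye, bat, power
y = true_coefs[0] + X @ true_coefs[1:]         # no noise added

result = fit_by_playing_time(pa, X, y)
print("full time:", np.round(result["full"], 2))
print("part time:", np.round(result["part"], 2))
```

With real data, the two sets of coefficients won't match, and that gap is what the tables below are about.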

This is full time:

year    eye    bat  power
-------------------------
2001   7.49   8.94   1.98
2002   4.90   3.28   1.39
2003   5.10   9.97   1.21
2004   6.04   5.68   1.29
2005   4.43   5.26   1.26
2006   4.77   9.90   1.63

And this is part time:

year    eye    bat  power
-------------------------
2001   0.41   1.87   0.88
2002   6.89   3.84   1.10
2003   7.19   1.03   1.61
2004  11.58   4.63   1.07
2005  10.48   1.50   1.03
2006  10.33   3.60   0.77

There's a huge difference in coefficients for the two groups.

The return on "eye" is higher for part-time players than for full-time players in every season except 2001, as I suspected. It's the reverse, though, for batting average, where the full-time coefficient is the higher one in every season except 2002.
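
A quick way to check that pattern year by year is to line the two tables up against each other. The numbers below are copied straight from the full-time and part-time tables; the variable names are just mine.

```python
# (eye, bat, power) coefficients by season, copied from the tables above
full_time = {
    2001: (7.49, 8.94, 1.98),
    2002: (4.90, 3.28, 1.39),
    2003: (5.10, 9.97, 1.21),
    2004: (6.04, 5.68, 1.29),
    2005: (4.43, 5.26, 1.26),
    2006: (4.77, 9.90, 1.63),
}
part_time = {
    2001: (0.41, 1.87, 0.88),
    2002: (6.89, 3.84, 1.10),
    2003: (7.19, 1.03, 1.61),
    2004: (11.58, 4.63, 1.07),
    2005: (10.48, 1.50, 1.03),
    2006: (10.33, 3.60, 0.77),
}

EYE, BAT = 0, 1
years = sorted(full_time)

# Seasons where the part-time "eye" coefficient beats the full-time one
part_higher_eye = [y for y in years if part_time[y][EYE] > full_time[y][EYE]]

# Seasons where the full-time "bat" coefficient beats the part-time one
full_higher_bat = [y for y in years if full_time[y][BAT] > part_time[y][BAT]]

print("part-time eye higher:", part_higher_eye)
print("full-time bat higher:", full_higher_bat)
```

The part-time "eye" coefficient is the larger one in 2002 through 2006, and the full-time "bat" coefficient is the larger one in every season but 2002.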

How come? Isn't the logic the same?

Maybe the difference is this: when one part-time player has a higher batting average than another part-time player, it's probably just luck, and there's not much relationship to next year's value. But when one part-time player has a higher *walk rate* than another part-time player, that's more likely to be real, and that comes out in next year's stats.

Walk talent does vary more among players than batting average talent ... the SD of batting average, among players with 4000+ career AB, is about 21 points. For walk average, it's about 30 points. Furthermore, the season SD due to luck is about 50% higher for batting average (since hits happen more often than walks).

So, overall, the "signal to noise ratio" is more than twice as high for walks as for hits.
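
Here's the arithmetic behind that, as a small sketch. I'm assuming "signal to noise" means talent SD divided by luck SD, and since only the relative size of the luck SDs matters for the comparison, the walk luck SD is set to an arbitrary 1 unit.

```python
# Talent SDs in points, from the paragraph above
talent_sd_bat = 21.0
talent_sd_walk = 30.0

# Luck SD is about 50% higher for batting average than for walks.
# Units are arbitrary: only the ratio between the two matters here.
luck_sd_walk = 1.0
luck_sd_bat = 1.5 * luck_sd_walk

snr_walk = talent_sd_walk / luck_sd_walk
snr_bat = talent_sd_bat / luck_sd_bat

snr_ratio = snr_walk / snr_bat   # (30 / 21) * 1.5
print(f"walk SNR is {snr_ratio:.2f} times the batting average SNR")
```

That works out to about 2.14. If you'd rather define the ratio in terms of variances instead of SDs, you square it and get about 4.6, so the conclusion holds either way.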

I think that's what's going on, why the walk numbers show one effect and the batting average numbers show another.

------

Oh, and, in case it's not obvious: the entire "Moneyball" effect for walks comes from the part-time players. There's no effect at all for the full-time players. What drove the Hakes/Sauer result, I think, is that, in 2004 and 2005, part-time players with walks just happened to have good numbers the following year.

Labels: baseball, economics, regression

Saturday, July 13, 2013
