### Disputing Hakes/Sauer, part I

The renowned 2003 book, "Moneyball," famously suggested
that walks were undervalued by baseball executives. Jahn K. Hakes
and Raymond D. Sauer wrote a paper studying the issue, in which they concluded
that teams immediately adapted to the new information. They claim that, as early as the very
next season, teams had adjusted their salary decisions to eliminate the market
inefficiency and pay the right price for walks.

Here's the paper (.pdf): "The Moneyball
Anomaly and Payroll Efficiency: A Further Investigation."

Hakes and Sauer’s claim seems to have been widely accepted as
conventional wisdom, as far as I can tell. A quick Google search shows many
uncritical references.

Here's Business Week from 2011. Here's Tyler Cowen and Kevin Grier from the same year. This is J.C. Bradbury from one of his books, and later on the Freakonomics blog. Here's David Berri from 2006 and Berri and Schmidt in their second book (on the authors' earlier, similar paper). Here’s Berri talking about the new paper, just a couple of months ago. Here's more and more and more and more.

I reviewed the study back in 2007, but, on
re-reading, I think my criticisms were somewhat vague. So, I thought I’d revisit the subject,
and do a bit more work. What
I think I found is strong evidence that what the authors found *has nothing to
do with Moneyball or salary.* There
is no evidence there was an inefficiency, and no evidence that teams changed
their behavior.

Read on and see if you agree with me. I’ll start with the intuitive
arguments and work up to the hard numbers.

-----

First, the results of the study. Hakes and Sauer ran a regression to predict (the logarithm of) a player’s salary, based on three statistics they call "eye," "bat," and "power." "Eye" is walks per PA. "Bat" is batting average. "Power" is bases per hit.

They predict this year’s salary based on last year’s eye/bat/power, on
the reasonable expectation that a player’s pay is largely determined by his
recent performance. They
included a variable for plate appearances, and dummy variables for year,
position, and contracting status (free agent/arbitration/neither).

Here are the coefficients the authors found:

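| Year | eye | bat | power |
|------|------|------|-------|
| 1986 | 0.69 | 2.26 | 0.22 |
| 1987 | 1.27 | 3.87 | 0.46 |
| 1988 | 0.20 | 2.76 | 0.37 |
| 1989 | 1.15 | 4.04 | 0.50 |
| 1990 | 1.48 | 1.75 | 0.63 |
| 1991 | 1.13 | 1.20 | 0.52 |
| 1992 | 0.40 | 2.76 | 0.57 |
| 1993 | 0.71 | 4.42 | 0.65 |
| 1994 | 0.36 | 4.78 | 0.86 |
| 1995 | 2.86 | 5.33 | 0.76 |
| 1996 | 0.78 | 1.85 | 0.73 |
| 1997 | 1.84 | 5.80 | 0.52 |
| 1998 | 2.21 | 4.23 | 0.74 |
| 1999 | 2.77 | 3.81 | 0.77 |
| 2000 | 2.72 | 5.30 | 0.73 |
| 2001 | 0.53 | 5.28 | 0.84 |
| 2002 | 1.52 | 3.64 | 0.68 |
| 2003 | 2.12 | 3.07 | 0.57 |
| **2004** | **5.26** | **4.14** | **0.78** |
| **2005** | **4.19** | **5.38** | **0.86** |
| 2006 | 2.14 | 4.66 | 0.58 |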

Moneyball was published in 2003 … the very next season, the coefficient
of "eye" -- walks -- jumped by a very large amount! Hakes and Sauer claim this shows how
teams quickly competed away the inefficiency by which players were undercompensated
for their walks.

Those 2004/2005 numbers are indeed very high, compared to the other
seasons. The next highest "eye"
from 1986 on was only 2.86. It
does seem, intuitively, that 2004 and 2005 could be teams adjusting their
payroll evaluations.

But it’s not, I will argue.

--------

First: it’s too high a jump to happen over one season. At the beginning of the 2004 season,
most players will have already been signed to multi-year contracts, with their
salaries already determined. You’d
think any change in the market would have to show itself more gradually, as
contracts expire over the following years and players renegotiate in the newer
circumstances.

Using Retrosheet transactions data, I found all players who were signed
as free agents from October 1, 2003 to April 1, 2004. Those players wound up accumulating
40,840 plate appearances in the 2004 season. There were 188,539 PA overall, so
those new signings represented around 22 percent.

The Retrosheet data doesn’t include players who re-signed with their old team. It also doesn’t include players who signed non-free-agent contracts (arbitration-eligible players, and "slaves," the players eligible for neither free agency nor arbitration). Also, what’s important for the regression isn’t necessarily plate appearances, but player count, since Hakes and Sauer weighted every player equally (as long as they had at least 130 PA in 2003).

So, from 22 percent, let’s raise that to, say, 50 percent of eligible
players whose salary was determined after Moneyball.

That means the jump in coefficient, from 2.12 to 5.26, was caused by
only half the players. Those
players, then, must have been evaluated at well over 5.26. If the overall coefficient jumped
around 3 points, it must have been that, for those players affected, the real
jump was actually six points.

Basically, Hakes and Sauer are claiming that teams recalibrated their
assessment of walks from 2 points to 8 points. That is -- the salary value of walks
*quadrupled* because of Moneyball.
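
Here's that arithmetic as a quick Python sketch. It assumes the overall coefficient is a simple weighted average of the players on old contracts and the players priced after the book came out; the 50 percent share is my guess from above, not a measured number, and the helper name `implied_coefficient` is just for illustration.

```python
def implied_coefficient(old_coef, new_overall, share_affected):
    """Coefficient the re-priced players must carry, assuming the overall
    figure is a simple weighted average of unaffected and affected players."""
    # new_overall = (1 - share) * old_coef + share * affected
    return (new_overall - (1 - share_affected) * old_coef) / share_affected


old_eye = 2.12   # 2003 "eye" coefficient from the table
new_eye = 5.26   # 2004 "eye" coefficient from the table
share = 0.50     # assumed share of players whose salary was set after Moneyball

affected = implied_coefficient(old_eye, new_eye, share)
print(f"implied coefficient for affected players: {affected:.2f}")
print(f"ratio to the 2003 value: {affected / old_eye:.1f}x")
```

If the true share were closer to the 22 percent from the Retrosheet count, the same formula would imply a coefficient of roughly 16 for the affected players, which is even harder to believe.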

That doesn’t make sense, does it? Nobody ever suggested that teams were
undervaluing walks by a factor of four. I
don’t know if Hakes and Sauer would even suggest that. That’s way too big. It suggests an undervaluing of a
free-agent walk by more than $100,000 (in today’s dollars).

For full-time players, the SD of walks is around 18 per 500 AB. That means your typical player would
have had to have been misallocated -- too high or too low -- by $1.8
million. That seems way too
high, doesn’t it? Can you
really go back to 2003, adjust each free agent by $1.8 million per 18 walks
above or below average, and think you have something more efficient than before?

Also: even if a factor of four happened to be reasonable, you’d expect
the observed coefficient to keep rising, as more contracts came up for
renewal. Instead, though,
we see a drop from 2004 to 2005, and, in 2006, it drops all the way back to the
previous value! Even if you
think the effect is real, that doesn’t suggest a market inefficiency -- it
suggests, maybe, a fad, or a bubble. (That doesn't make sense either: it's hard to believe that "Moneyball" was capable of causing a bubble that inflated the value of a walk by 300 percent.)

In my opinion, the magnitude, timing, and pattern of the difference
should be enough to make anyone skeptical. You can’t say, "well, yeah, the
difference is too big, but at least that shows that teams *did* pay more, at
least for one year." Well,
you can, but I don’t think that’s a good argument. When you have that implausible a
result, it’s more likely something else is going on.

Suppose I ask a co-worker what kind of car he has, and he says, "well,
I have three Bugattis, eight Ferraris, and a space shuttle." You don’t leave his office saying, "well,
obviously his estimate is too high, but he must at least have a couple of
BMWs!" (Even if it
later turns out that he *does* have two BMWs.)

--------

Second: the model is wrong.

We know, from existing research, that salary appears to be linear in
terms of wins above replacement, which means it’s linear in terms of runs,
which means it’s linear in terms of walks. That is: one extra walk is worth the
same number of dollars to a free agent, regardless of whether he’s a superstar
or just an average player.

The rule of thumb is somewhere around $5 million per win, or $500K per
run. That means a walk,
which is worth about a third of a run, should be worth maybe around
$150,000. (Turning an out
into a walk is more, maybe around $250,000.)
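
As a rough sketch of that rule-of-thumb arithmetic in Python: the 10 runs per win is what the $5 million and $500K figures imply, and the run values for the events are approximations, not numbers from the study.

```python
DOLLARS_PER_WIN = 5_000_000   # rule of thumb from the text
RUNS_PER_WIN = 10             # implied by $5 million per win = $500K per run

dollars_per_run = DOLLARS_PER_WIN / RUNS_PER_WIN


def event_value(runs):
    """Dollar value of an event worth the given number of runs."""
    return runs * dollars_per_run


walk_runs = 1 / 3   # "about a third of a run"
print(f"per run:  ${dollars_per_run:,.0f}")
print(f"per walk: ${event_value(walk_runs):,.0f}")
# Run value that corresponds to the $250,000 out-into-walk figure
print(f"runs implied by $250,000: {250_000 / dollars_per_run:.2f}")
```

Exactly a third of a run comes out a little above the round $150,000 I quoted, but these are ballpark figures either way. The point is only that the dollar value per walk is a constant, not a percentage of salary.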

But the Hakes/Sauer study didn’t treat events as linear on salary. They treated them as linear on the *logarithm* of
salary. In effect, instead
of saying a walk is worth an additional $150K, they said a walk should be worth
(say) an additional 0.5% of what the salary already is.

That won’t work. It
will try to fit the data on the assumption that, at the margin, a $10 million
player’s walk is *ten times as valuable* as a $1 million player’s walk.
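
To see the "ten times" problem concretely, here is a minimal Python sketch of what a log-salary model implies for one extra walk. The 0.5% effect is the "say" figure from above, not a number from the paper, and the function name is made up for the example.

```python
import math


def dollars_for_one_more_walk(salary, log_effect_per_walk):
    """Dollar raise implied by a log-salary model for one extra walk."""
    return salary * (math.exp(log_effect_per_walk) - 1)


effect = math.log(1.005)   # one walk raises salary by 0.5% (illustrative)

low = dollars_for_one_more_walk(1_000_000, effect)
high = dollars_for_one_more_walk(10_000_000, effect)
print(f"$1 million player:  ${low:,.0f} per walk")
print(f"$10 million player: ${high:,.0f} per walk")
print(f"ratio: {high / low:.0f}x")
```

Whatever percentage you plug in, the ratio between the two players stays at ten, because the model prices the walk as a fraction of whatever the player already earns.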

The other coefficients in the regression will artificially adjust for that. For instance, maybe plate appearances take up the slack … if double the plate appearances *should* mean 5x the salary, the regression can decide, maybe, to make it only 2x the salary. That way, the good player’s walk may be counted at 10 times as much as it should be, but his plate appearances will be counted at only 40 percent as much as they should be.

There are other factors that work in one direction or another. For instance, a utility player’s walks
actually *should* be worth less, since, with fewer plate appearances,
differences between players are more likely to be random luck. Also, the authors used walk
*percentage*, and it takes fewer walks to increase walk percentage with fewer
AB. So, that will also work
to absorb some of the "10 times" difference.

But there’s no guarantee all that stuff evens out … in fact, it would
be an incredible coincidence if it did.

So that means that the coefficient of walks now means something other
than what you think it means. And,
so, when you have the coefficient of a walk jumping between seasons … you can’t
be sure it’s really measuring the actual salary assigned to the walk. It could be just a difference in the
distribution of plate appearances, or one of a thousand other things.

Again, I would argue that this flaw -- on its own -- is enough to have us reject the conclusions of the study. When you try to fit a linear regression to a non-linear relationship -- or vice versa -- all bets are off. The results can be very unreliable. I bet I could create an artificial example where walks would appear to be worth almost any reasonable-sounding value you could name.
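
Here's a minimal sketch of the kind of artificial example I have in mind, in plain Python. Everything in it is made up for illustration: the base salaries, the 600 PA, and the grid of walk rates are toy numbers, and it fits a one-variable regression rather than the authors' full model. Every toy player is paid exactly the same $150,000 per walk; the only thing that changes between the two runs is how much salary comes from everything else.

```python
import math


def ols_slope(xs, ys):
    """Slope of a one-variable least-squares fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


DOLLARS_PER_WALK = 150_000   # identical for every toy player
PA = 600                     # every toy player gets a full season
walk_rates = [0.04 + 0.01 * i for i in range(13)]   # "eye" from 4% to 16%


def log_salary_slope(base_salary):
    """Fit log(salary) on walk rate when salary is truly linear in walks."""
    salaries = [base_salary + DOLLARS_PER_WALK * rate * PA for rate in walk_rates]
    return ols_slope(walk_rates, [math.log(s) for s in salaries])


slope_cheap = log_salary_slope(1_000_000)    # little value apart from walks
slope_pricey = log_salary_slope(5_000_000)   # lots of value apart from walks
print(f"'eye' coefficient, $1M base: {slope_cheap:.2f}")
print(f"'eye' coefficient, $5M base: {slope_pricey:.2f}")
```

The two "eye" coefficients come out different even though the price of a walk never changed. In this toy setup, a shift in the mix of players is enough to move the coefficient, with no change at all in how walks are paid.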

---------

These two objections are nice in theory, but I bet they won’t convince
many people who already believe the study’s conclusions are correct. My arguments sound too conjectural,
too nitpicky. There, you
have a real study with hard numbers and confidence intervals, and, here, you
just have a bunch of words about why it shouldn't work.

So, next post, I’ll get into the numbers. Instead of arguing about why my
coworker's sarcasm shouldn't be used as evidence, I'll try to actually show you his driveway.

UPDATE: Here's Part II.

## 3 Comments:

I don't doubt that the result is probably just noise, looking at some of the other jumps in the coefficients there (did someone write Anti-Moneyball in 1994?), but I'll nitpick one of your arguments. You're concerned that they use log salary when walks should be linear with salary, but then you note that the authors used walk percentage. Could using percentage make the log appropriate (or less inappropriate)?

An interesting topic of study is the opposite of the effect you describe in point 2, which is whether or not contracts should be contract value elastic/inelastic to (assumed) increases in player output. I'm of the persuasion that teams should be _inelastic_ in baseball, but highly _elastic_ in basketball. However, given the positively-skewed talent distribution in most sports, I imagine that people may argue against me in regards to baseball (I know one researcher who has). It certainly would change the linearity assumptions made by sabermetricians when analyzing win shares per dollar.

Thoughts?

It is astounding to me that someone could think that they have a reasonable regression formula when the coefficients have that much variation year to year. You've got to think that either the data is bad, the model is bad or there is just too much noise to tell anything.
