Saturday, October 21, 2006

Study: error rates and official scorer bias both declining

The latest issue of JQAS came out this week, and its five articles include one paper on baseball.

Simply titled “Baseball Errors,” it’s a nice research study on the history of error rates. It’s written in a more conversational style than most other academic papers, and its citation list includes a number of mainstream articles and interviews.

Authors David E. Kalist and Stephen J. Spurr start out with a discussion of official scorers, who, of course, are the ones who decide whether a ball is a hit or an error. They include a bit of history, including a discussion of how much scorers have been paid over the years (this year, $130 per game).

Kalist and Spurr are economists, so their interest turns to whether scorers may be biased. One of their interesting counterintuitive observations is that if the scorer does indeed favor the home side, he should call more errors on the home team than on the visiting team. That’s because an error on the visiting team hurts the home batter’s stats, depriving him of a base hit. But when the home team misplays a ball, either the home pitcher’s record is hurt (via a hit allowed and potential earned runs), or the home fielder’s record is hurt (via fielding percentage). There is no obvious reason for the scorer to prefer the fielder to the pitcher, and so there’s no incentive to call fewer errors on the home team.

The authors also quote LA Times writer Bill Plaschke’s claim “that scorers are allowed to drink on the job, fraternize with players, and play in rotisserie leagues in which a fictitious ‘team’ can win thousands of dollars.” And they note cases in which scorers made controversial calls that kept alive a hitting streak or a no-error streak. “While we certainly do not claim that our survey … is complete,” they write, “all the articles we have seen involved calls made in favor of a player on the home team.”

So, are scorers actually biased towards the home team? In the paper’s second regression, using Retrosheet data for games from 1969-2005, Kalist and Spurr find that from 1969-1976, scorers called 4.2% more errors against the home team. But from 1977-2005, the difference between home and visitors was essentially zero. Is that the result of a decreasing trend? We don’t know. The authors don’t give us year-by-year data for this variable, just the 1976 cutoff.

The rest of the regression doesn’t tell us a whole lot we didn’t already know. Error rates have steadily fallen over the last 36 years, fewer errors are committed on turf, expansion teams commit more errors than established teams, and the more balls in play, the more errors. What was interesting is that more errors were committed in April than in other months (5% more than in May, for instance), and the difference is statistically significant.

The paper’s other regression predicts annual error rates, instead of per-game rates. The authors divide baseball history into decades, and again find, unsurprisingly, that error rates have steadily declined. One finding that was surprising is that during World War II, error rates continued to decline from pre-war levels (even though conventional wisdom is that the caliber of play declined in almost all other respects).

The authors also tried to determine a relationship between error rates and speed. However, they used stolen base rates as a proxy for speed, and, over periods as long as a decade, SBs are probably much more related to levels of offense than to leaguewide speed. Managers play for a single run more frequently when runs are scarce and games are close – so you’d expect more SBs in the sixties because it was a low-scoring era, rather than just because those players were faster than players in the 70s. (And, of course, more steals may simply mean the catchers’ arms aren’t as good.)

However, the study does find a very significant relationship between stolen base rates and errors, in a positive direction. So I may be wrong. Or there might be other factors operating; for instance, with steals being more important, managers started filling their rosters with speedier players.

Also, the National League sported error rates 2% higher than in the American League – and the study starts in the 19th century, so the difference can’t be just the DH. The authors discuss this a bit (although you wish they had included a DH dummy variable). They found that the NL error rate was much higher (they don’t say by how much, but p=.0006) than the AL rate from 1960-72. And they note that fielders may be better in the AL, since old sluggers can DH instead of playing defense.

One feature of the article I liked is that not only do the authors give a brief explanation of what “multiple regression on the log of the error rate” means, but they also illustrate how to interpret the results mathematically. It’s a small courtesy that acknowledges that JQAS has many lay readers, and helps make this study one that should be of interest to many different kinds of baseball fans, not just the sabermetrics crowd.
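For readers who want to try that interpretation themselves, here is a minimal sketch in Python. It assumes the usual reading of a regression on the log of a rate: a coefficient b on a yes/no variable (home team, turf, April) multiplies the predicted error rate by exp(b). The function names are mine, and the coefficient is simply backed out of the 4.2% home-team figure quoted above for illustration; it is not a number taken from the paper's tables.

```python
import math


def pct_effect(coef: float) -> float:
    """Percent change in the error rate implied by a coefficient on a
    0/1 dummy variable in a regression on log(error rate)."""
    return (math.exp(coef) - 1.0) * 100.0


def coef_for_pct(pct: float) -> float:
    """Inverse of pct_effect: the log-scale coefficient that corresponds
    to a given percent change."""
    return math.log(1.0 + pct / 100.0)


# Illustrative only: the coefficient that would produce the 4.2% home-team
# effect mentioned in the post (not a value read from the paper).
home_coef = coef_for_pct(4.2)
print(round(home_coef, 4))              # 0.0411
print(round(pct_effect(home_coef), 1))  # 4.2
```

For small effects the coefficient and the percent change are nearly the same number (0.0411 versus 4.2%), which is why log regressions are convenient to read; the two drift apart as the effects get larger.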

