Home field advantage seems to disappear in 3-run games
The first article in the new issue of JQAS is a baseball paper on home field advantage (HFA), by William Levernier and Anthony Barilla. Unfortunately, the authors appear to be unfamiliar with some pertinent sabermetric results, and there are, in my opinion, a couple of problems with their analysis.
They start by examining run scoring: in games from 2004-2005, they find that the home team scored .093 runs per game more than the visiting team. This number looks small, and so the authors conclude that it "little supports" the explanation that home teams are "more proficient at scoring runs."
Of course, .093 runs per game is actually a meaningful amount. Using the rule of thumb that 10 runs equal one win, the effect the authors found is about 1.5 wins per season -- or a winning percentage difference of .009. Given that the entire home field advantage in 2004-5 (as found by this study) was .036, the run differential explains about 25% of it.
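Here's the arithmetic spelled out (the .093, .036, and ten-runs-per-win figures are the ones from the text; nothing else is assumed):

```python
# Back-of-the-envelope: convert the .093 runs-per-game edge into wins,
# using the rough rule of thumb that 10 runs equal one win.
runs_per_game = 0.093
season_games = 162
extra_wins = runs_per_game * season_games / 10   # about 1.5 wins a season
pct_edge = extra_wins / season_games             # about .009 in winning percentage
share_of_hfa = pct_edge / 0.036                  # vs. the study's observed .036 HFA
print(round(extra_wins, 2), round(pct_edge, 4), round(share_of_hfa, 2))
```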
But the authors didn't take into account the fact that home teams often don't bat in the bottom of the ninth inning. That is, the home team scores only .093 runs per game more than the visiting team, but *in fewer opportunities*. If we make a rough guess that home teams lose the equivalent of 36 full innings over an 81-home-game season, and that they score 5 runs per nine-inning game on average, that's 20 runs right there -- another two wins out of 81 home games, or 4 wins in 162. That brings home teams up to about .534, almost exactly what the authors found.
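The same accounting in code -- the 36 innings and 5 runs per game are the rough guesses from the paragraph above, and the .0093 term is the edge from the raw run differential:

```python
# Add the runs hidden in the skipped bottom-of-the-ninth innings.
missed_innings = 36                              # rough guess over 81 home games
runs_per_inning = 5 / 9                          # 5 runs per nine-inning game
hidden_runs = missed_innings * runs_per_inning   # 20 runs
wins_from_innings = hidden_runs / 10             # 2 wins at home, i.e. 4 per 162
pct_from_innings = wins_from_innings / 81        # about .025
implied_home_pct = 0.500 + 0.0093 + pct_from_innings  # .0093 from the run differential
print(round(implied_home_pct, 3))
```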
(The difference is some combination of the home team both scoring more runs and preventing opposition runs.)
The authors then note that the observed HFA is higher in close games, and lower in blowouts:
.602 Games decided by one run
.539 Games decided by two runs
.500 Games decided by three or more runs
This surprised me a bit; I didn't expect to see this kind of effect. Why might this happen? I can only think of two explanations:
First, there's a small effect from walk-off games, where the home team doesn't get to pile on more runs (but the visiting team does). Second, blowout games are disproportionately won by the better team, and better teams have smaller home field advantages than average teams. (I think Bill James showed this once, and started by observing that a 1.000 team must have a home field advantage of zero.)
It seems to me that these two factors alone shouldn't be enough to account for home teams' .500 record in 3+ run games. But I don't know. Are there other explanations?
In any case, the authors run a logistic regression on home/road, runs scored, run differential (unsigned), and roster size (25 or 40). They find everything significant except home/road. I'm not sure how to interpret that, but I think the idea is that if you know the home team scored 7 runs, you're pretty much assured that they won, home or road. If they scored 1 run and it was a two-run game, you know that they lost, home or road. The leftover games may be sufficiently few that a statistically significant result doesn't appear -- especially considering that runs scored aren't adjusted for innings.
In any case, it's a bit weird trying to predict home field advantage based on the run differential of the final score of the game. Shouldn't the predictions go the other way? Following the authors' logic, you could say the HFA is infinite in games won by walk-off home runs.
It may be true that home teams were only about .500 in games decided by more than two runs. But the statement "HFA is non-existent in games decided by more than two runs" is false. There is a home field advantage in those games, but selective sampling makes it look like there isn't.
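To see how the walk-off mechanism alone can reshuffle margins, here's a toy simulation of my own construction (not the authors' model): each half-inning's runs are Poisson, the home team gets a small arbitrary per-inning scoring edge standing in for a real HFA, and the game stops the moment the home team takes the lead in the ninth or later, so walk-off margins are forced to one run (ignoring multi-run walk-off hits).

```python
import random
from collections import defaultdict
from math import exp

random.seed(1)

def poisson(mu):
    # Knuth's method; fine for small means like these
    L, k, p = exp(-mu), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

def play_game(home_mu=0.52, away_mu=0.50):
    """Signed final margin (home minus away). The per-inning scoring
    edge for the home team is an arbitrary stand-in for a real HFA."""
    home = away = 0
    for _ in range(8):                    # innings 1 through 8
        away += poisson(away_mu)
        home += poisson(home_mu)
    while True:                           # inning 9 and beyond
        away += poisson(away_mu)          # top half
        if home > away:
            return home - away            # home leads; bottom half never played
        runs = poisson(home_mu)           # bottom half
        if runs > away - home:
            return 1                      # walk-off: play stops at the winning run
        home += runs
        if home < away:
            return home - away            # visitors win after a completed inning
        # tied after the inning: play on

N = 200_000
wins, games = defaultdict(int), defaultdict(int)
home_wins = 0
for _ in range(N):
    m = play_game()
    bucket = min(abs(m), 3)               # 1-run, 2-run, and 3+-run games
    games[bucket] += 1
    if m > 0:
        wins[bucket] += 1
        home_wins += 1

for b, label in [(1, "one run"), (2, "two runs"), (3, "three or more")]:
    print(f"decided by {label}: {wins[b] / games[b]:.3f}")
print(f"overall home winning percentage: {home_wins / N:.3f}")
```

Even in this crude model, the truncation pushes home wins out of the big-margin buckets and piles them into the one-run bucket, so the one-run home winning percentage comes out well above the 3+-run one -- the same pattern as the table above, without any real difference in how hard the home team is trying in blowouts.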