"Scorecasting:" is home field advantage just biased officiating?
There's a new book about to come out that you might have heard about. It's called "Scorecasting," and the publisher was kind enough to send me a review copy. It's basically a Freakonomics for sports: in intent, in tone, in writing style, and even in its authorship -- one academic economist (Tobias Moskowitz, a finance professor) and one journalist (Sports Illustrated's L. Jon Wertheim).
The book's website is here; it has an excerpt from one of the chapters, and you'll find other excerpts online if you search the authors' names.
The topics will be very familiar to sabermetricians and regular readers of this blog. There are chapters on the Hot Hand, on competitive balance, on NBA refereeing, on steroids, and so on. There isn't a huge amount of breakthrough stuff there, although there are certainly a few new insights. Mostly, the authors summarize what they've learned from academic articles on sports, and they add the results of a few little studies they did themselves.
Alas, by concentrating on the journals, they've missed much of the scholarship of us amateurs. For instance, in the chapter on competitive balance, they argue that baseball is less balanced than football because MLB teams play 162 games, while NFL teams only play 16. That, of course, is only a small part of the story. There are other parts, such as the distribution of talent in the league, and the internal details of the game itself. Tom Tango has effectively solved the problem of comparing different sports (here's just one of his many posts), but the authors seem unaware of that (although they do mention Tango once in the book, in a passage on leverage, referring to him as a "stats whiz").
And they're occasionally completely off, as when they say the sample size of the MLB playoffs is enough that the best team ought to win the series.
Still, a lot of the material is solid; the authors are at their strongest when they're reviewing one of the more famous and established studies, like the Romer "fourth down" paper, and the Massey/Thaler NFL draft study. (I'll probably do a full review of the book later, but, for now, just picture a sports Freakonomics that's not as rigorous as most of the websites, but does mention a few things that you didn't know before.)
Anyway, I'm going through the book, and suddenly I see that the authors claim to have solved the problem of what causes home field advantage (HFA). That was sort of shocking. My personal subjective view is that HFA is the biggest unsolved problem in sabermetrics, and very little progress has been made. There's so little progress, in fact, that I've started to take seriously a hypothesis that seems way off the wall -- the theory that humans have built-in evolutionary programming that makes them more physically and mentally effective when defending their own turf. (I'm not saying that it's necessarily true, just that I have a bizarre attraction to it.) In that light, finding these HFA claims was a bit like picking up a newspaper article on math, and finding that the reporter has proved Fermat's Last Theorem.
So what's the authors' solution to the long-standing HFA conundrum? Refereeing. After dismissing most of the usual suspects (fan enthusiasm, travel, tailoring the team to the park), Moskowitz and Wertheim believe that most, or all, of HFA can be explained by biased officiating.
They list a bunch of supporting evidence, which I'll summarize here. If you want to follow along, some of this stuff is also in a long excerpt from the book that appeared a couple of weeks ago in the Jan. 17 issue of Sports Illustrated (the article, unfortunately, does not appear to be online).
1. In soccer, the referee controls how much extra "injury time" is added to the end of a match. It turns out that injury time is longer when the home team would benefit. When the home side was ahead by a goal, there were two minutes of injury time, on average, in a sample of Spanish league games. But when they were *behind* by a goal, it was four minutes.
2. In 1998, the point structure changed to give the winning team three standings points instead of two. Immediately, the above injury time bias increased.
3. The same bias exists in England, Italy, Germany, Scotland, and the US.
4. "... home teams receive many fewer red and yellow cards even after controlling for the number of penalties and fouls on both teams."
5. In baseball, the authors looked at the percentage of called pitches that are strikes. In crucial situations (high leverage), home teams got a lot more favorable calls. But in low-leverage situations, it was *road* teams that got more favorable calls. "This makes sense," the authors write. "If the umpire is going to show favoritism to the home team, he or she will do it when it is most valuable -- when the outcome of the game is affected the most. You might even contend that in noncrucial situations the umpire might be biased against the home team to maintain an overall appearance of fairness."
6. " ... the success rates of home teams in scoring from second base on a single or scoring from third base on an out -- typically close plays at the plate -- are much higher than they are for their visitors in high-leverage/crucial situations. Yet they are no different or even slightly less successful in noncrucial situations."
7. Over a large sample of 5.5 million pitches, "called strikes and balls went the home team's way, *but only* in stadiums without QuesTec ... Not only did umpires not favor the home team when QuesTec was watching them, they actually gave *more* strikes and *fewer* balls to the home team. In short, when umpires knew they were being monitored, home field advantage on balls and strikes didn't simply vanish; the advantage swung all the way to the visiting team."
8. In low-leverage situations, even in non-QuesTec parks, there was no bias at all.
9. The authors then analyzed pitches using Pitch f/x data, to see how many pitches were miscalled based on the recorded location. For pitches on the corner of the strike zone, there were more miscalls in the home team's favor than in the visiting team's favor. The home advantage was largest on full-count pitches, followed by other three-ball counts, other two-strike counts, and, lastly, all other counts. So, the more crucial the pitch, the greater the HFA.
10. "Over the course of the season, all of this adds up to 516 more strikeouts called on away teams, and 195 more walks awarded to home teams than there otherwise should be, thanks to the home plate umpire's bias. And that includes only terminal pitches -- where the next called pitch will result in either a strikeout or a walk. Errant calls given earlier in the pitch count could confer an even greater advantage on the home team."
11. "This adds up to an extra 7.3 runs per season given to each home team by the plate umpire alone. That might not sound significant, but cumulatively, home teams outscore their visitors by only 10.5 runs in a season." [That latter number isn't correct ... in 2010, it was 23.5 runs. 23.5 runs equals 2.35 wins out of 81, which is a .530 winning percentage. (UPDATE: Oops! I forgot to adjust for the home team not batting in the bottom of the ninth when leading. If you adjust for that, the home advantage is a lot bigger than 10.5 or 23.5 runs.)]
12. In the NFL, "Home teams receive fewer penalties than away teams -- about half a penalty less per game -- and are charged with fewer yards per penalty. Of course, this does not necessarily mean officials are biased. But when we looked at more crucial situations in the NFL ... we found that the penalty bias [increases]."
13. When instant replay came to the NFL, the home winning percentage declined from 58.5 percent (1985-98) to 56 percent (1999-2008). "Before instant replay, home teams enjoyed more than an 8 percent edge in turnovers ... When instant replay came along ... the turnover advantage was cut in half." Also, "the home team does not actually fumble or drop the ball less often than the away team ... they simply lose fewer fumbles than away teams. After instant replay was installed, however, the home team advantage of *losing* fewer fumbles miraculously disappeared, whereas the frequency of fumbles remained the same. ... In close games, where referees' decisions may *really* matter ... home teams enjoyed a healthy 12 percent advantage in recovering fumbles. After instant replay was installed, that advantage simply vanished."
14. After instant replay, there was no change in the relative frequency of home and away penalties. That might be because penalties can't be challenged.
15. Away teams have their challenges upheld 37 percent of the time, versus 35 percent for home teams. But when the home team is losing, the visiting team wins 40 percent, versus only 28 percent for the home team. So it looks like the referees favor the home team more when they need it more.
16. In the NBA, fouls and turnovers that are not subjective referee calls (like shot clock violations) are equal for home and road teams. But for subjective calls, away teams get between 1 and 1.5 more of those per game. Visiting players are 15 percent more likely to be called for traveling than home players.
17. "How much of the [HFA] in the NBA is due to referee bias? If we attribute the differences in free throw attempts to referee bias, this would account for 0.8 points per game. If we gave credit to the referees for the more ambiguous turnover differences ... this would also capture another quarter of the home team's advantage. Attributing some of the other foul differences to the referees and adding the effects of those fouls (other than free throws) ... brings the total to about three-quarters of the home team's advantage. And, remember, scheduling in the NBA [visiting teams play more back-to-back games than home teams] explained about 21 percent of [HFA]. This adds up to nearly all of the NBA home court advantage."
18. In the NHL, home teams get 20 percent fewer penalties and receive fewer minutes per penalty. "On average, home teams get two and a half more minutes of power play opportunities ... than away teams. That is a *huge* advantage." If you multiply that by a 20 percent success rate, you get an extra 0.25 goals per game for the home team. Since the average overall differential is only 0.3 goals for the home team, "this alone accounts for more than 80 percent of the home ice advantage in hockey."
19. There is no apparent HFA in shootouts, where refereeing makes no difference. Nor is there any in NBA foul shooting, or even in Pitch f/x data: visiting pitchers throw no worse, according to Pitch f/x, than home pitchers do. It's only the umpires' calls that are different.
It's an impressive array of evidence and argument. But at least some of it doesn't hold up.
Look at number 5: in baseball, in low leverage situations (I believe this means the bottom 50%), the authors say that umpires favor the visiting team. That would mean that, in less critical situations, we should find a "visiting field advantage." But home teams outscore visiting teams even in medium-leverage situations. For instance, here's the breakdown of home and road runs scored by inning (1954 to 2007). The last column is the percentage by which the home team outscored the visiting team:
1 61872-52071 +19%
2 46823-42539 +10%
3 53590-48188 +11%
4 53357-49593 +8%
5 53203-48448 +10%
6 54401-50603 +8%
7 52231-48641 +7%
8 50451-47781 +6%
You would think that you'd have more high-leverage events in the later innings -- but the HFA goes *down* in the last few innings, not up.
But I might be wrong about that; maybe the eighth inning has no more high-leverage situations than the first inning (after all, there are more 8-1 games in the eighth than in the first). So, let's look at innings where, at the start, one team was at least four runs ahead of the other. Those should all be low leverage, for the most part, and should show the visiting team having the advantage.
2 2543-2139 +19%
3 4583-4176 +10%
4 8817-7801 +13%
5 10940-10057 +9%
6 14371-13279 +8%
7 15698-14583 +8%
8 16935-16180 +5%
Now, this could be just because, in a four-run game, the home teams are a lot better than the visiting teams. What if we look at situations when the *visiting* team is ahead by at least four runs? Then, we should see a huge effect in favor of the visiting team: first, they're probably a much better team, and, second, the low leverage means the umpire should still be favoring them.
But, no. Even in those situations, the home team still performs a little better, on average, having the advantage in five of the seven cases:
2 957-1022 -6%
3 1974-1799 +10%
4 3609-3355 +8%
5 4435-4645 -5%
6 6269-5705 +10%
7 6627-6562 +1%
8 7309-7179 +2%
So, I just don't see it. If umpires DO call more strikes for visiting teams in low-leverage situations, maybe that's compensated for by those pitches actually being strikes ... but being worse pitches in location and movement and velocity. That is, maybe HFA comes from pitchers throwing more accurately, but more hittably.
In any case, if my data are correct, and the authors' data are also correct, it can't be the case that the authors' findings are an explanation of HFA.
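If you want to check the percentages yourself, each one is just home runs divided by road runs, minus one. Here's a minimal Python sketch using the last table (visiting team ahead by four or more at the start of the inning). The variable and function names are my own illustrative choices, and the figures are simply copied from the table.

```python
# Runs scored by inning when the visiting team led by 4+ runs at the start
# of the inning: inning -> (home runs, road runs). Copied from the table.
runs_by_inning = {
    2: (957, 1022),
    3: (1974, 1799),
    4: (3609, 3355),
    5: (4435, 4645),
    6: (6269, 5705),
    7: (6627, 6562),
    8: (7309, 7179),
}


def home_edge_pct(home_runs, road_runs):
    """Percentage by which the home team outscored the road team."""
    return 100.0 * (home_runs / road_runs - 1.0)


edges = {inning: home_edge_pct(h, r) for inning, (h, r) in runs_by_inning.items()}

# Count the innings in which the home team scored more than the road team.
home_ahead = sum(1 for pct in edges.values() if pct > 0)

for inning, pct in edges.items():
    print(f"{inning}  {pct:+.0f}%")
print(f"home team ahead in {home_ahead} of {len(edges)} innings")
```

Swapping in the figures from either of the other two tables works the same way.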
Now, let's look at number 18, the hockey case. The authors argue that HFA is caused almost entirely by penalties. If that's the case, then you'd expect home and visiting teams to have similar numbers at equal strength.
They do not. The NHL.com website has home/road goal breakdowns. Here they are for the 2008-09 season, averaged by team:
Even strength... 124-110 (home advantage 12.5%)
Power play...... 35-30 (home advantage 15.1%)
Shorthanded..... 4-4 (home advantage 1.0%)
There's almost as large an advantage at even strength as there is on the power play. Admittedly, the extra power play boost is probably caused by more penalties, as the authors say, but the overall contribution of the extra penalties seems to be pretty small.
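To put rough numbers on that, here's a back-of-the-envelope sketch in Python. It uses only the rounded 2008-09 per-team averages above, so the shares are approximate. It also reads the authors' "20 percent success rate" in item 18 as goals per two-minute power play, which is my assumption about how they got from 2.5 minutes to 0.25 goals.

```python
# Authors' estimate (item 18): extra power-play time -> extra goals per game.
extra_pp_minutes = 2.5   # from the book
minutes_per_pp = 2.0     # assumption: a standard two-minute minor penalty
pp_success_rate = 0.20   # from the book, read here as goals per power play
overall_edge = 0.3       # home goal differential per game, from the book

extra_goals = extra_pp_minutes / minutes_per_pp * pp_success_rate
authors_share = extra_goals / overall_edge

# Rounded 2008-09 per-team season averages: (home goals, road goals).
goals = {
    "even strength": (124, 110),
    "power play": (35, 30),
    "shorthanded": (4, 4),
}
diffs = {situation: h - r for situation, (h, r) in goals.items()}
total_diff = sum(diffs.values())
even_strength_share = diffs["even strength"] / total_diff

print(f"authors: {extra_goals:.2f} goals/game, {authors_share:.0%} of the edge")
print(f"even strength: {diffs['even strength']} of {total_diff} goals "
      f"({even_strength_share:.0%} of the home surplus)")
```

On these rounded figures, about three-quarters of the home goal surplus comes at even strength. That's hard to square with penalties accounting for more than 80 percent of it.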
Just to make sure it wasn't a fluke, I ran the same numbers for 2009-10:
Even strength... 121-106 (home advantage 13.9%)
Power play...... 30-25 (home advantage 21.0%)
Shorthanded..... 4-3 (home advantage 32.9%)
A bit more extreme in favor of power play. But how do you explain the sizeable advantage for home teams at even strength? One possible explanation is that visiting teams have to play an overcautious game, to avoid being penalized by biased referees. But for a 13.9% disadvantage, that caution would have to be way out of line, wouldn't it?
Both of these examples -- and, by the way, they're the only two I checked -- cast doubt on the authors' hypothesis that HFA is almost all refereeing. I have never disagreed that *some* of it might be refereeing, but there's obviously a lot more going on.
And I have to say that the authors have indeed provided a blueprint for how this kind of research should go -- try to break down performance into its constituent parts, and check those.
If there's no home advantage in foul shooting, why not? If there's no HFA in hockey shootouts, why not? If we get a list of areas with high HFA, and a list of areas with low HFA, we can maybe start narrowing down what the causes might be.
But the authors have amassed a lot of evidence, and there must be something to at least some of it, no? For instance, I can't think of any explanation for the injury time phenomenon (maybe I should look up the relevant study). And it seems reasonable that referees will call more fouls on visitors, even if they're unbiased. Why? Because they might be using crowd noise as a guide to what is and what isn't a foul. If the fans scream when a visitor trips an opponent, but not when a home player trips an opponent, that will simply make it more likely that an unbiased referee will have enough evidence to correctly "convict" the visiting player.
But the question is not just whether referee bias exists, but *how much* of it there is, and how much of HFA it's responsible for. The authors of "Scorecasting" seem more focused on "existence" evidence, and it seems to me they've made only a small dent in terms of explaining the real-life observed HFA. I wish the authors had provided more details of some of their findings, so we can figure out what's going on and maybe quantify it a bit more ... but I guess it is what it is.
I know there are a lot of working sabermetricians reading this ... if you have expertise or evidence on any of the authors' points, please weigh in.
UPDATE: I have a full review of the book here.