How can we separate referee bias from other sources of home-field advantage? Part II
You have two competing theories about home field advantage (HFA). Theory one is that HFA is almost all the result of referee bias. Theory two is that referee bias is part of the cause, but only a small part, and that there are other factors involved.
What kind of evidence would distinguish between the two? You need something that's independent of referee or umpire influence. That's harder than it looks. For instance, in the NHL, you'd expect most of the referee bias to come from penalty calls. However, even if you leave out power plays, you can see home teams doing significantly better than road teams. With both teams at full strength, the home side still outscores the visiting side.
Does that prove that something other than officiating is at work? No, it doesn't. Because it's possible that, knowing that the referees are likely to call more infractions against them, road teams are forced to play a less aggressive style of hockey, one that costs them goals even at full strength. So that part of HFA could still be referee bias at work, even if you don't see it in the penalty calls.
To show more convincingly that other factors are at work, you need to find some statistic for which there's no plausible story about how the refereeing could be the cause.
For basketball, we have foul shooting. The evidence shows that HFA in foul shooting exists, and it's about what you'd expect it to be. Can we do the same for baseball? A couple of weeks ago, I asked for suggestions. Someone e-mailed me suggesting I look at fastball speed. That's a great idea, but I don't know where to find the data (can anyone help?). So, I tried something else: wild pitches.
It's hard to see how a wild pitch (or passed ball) could be the result of umpire bias. It's a straightforward call based on what happens on the field. I suppose that it's possible that if the catcher tries to throw out a baserunner, and the umpire is more likely to call that runner out, that would be a bias for the home team (since a WP is not charged when a runner is thrown out at the next base). But that happens so infrequently that it's not an issue.
Also, it has been found (by John Walsh, in the 2011 Hardball Times Annual) that home plate umpires favor home teams in their pitch calling, by about 0.8 pitches per game. So you could argue that maybe visiting pitchers have to pitch a little differently to overcome that disadvantage, and that's what causes any increase in wild pitches.
That's possible. However, you'd expect any effect to go the other way, to *fewer* wild pitches, wouldn't you? If the strike zone is smaller for the visiting pitcher, he's more likely to compensate by generally being closer to the strike zone. It's hard to see how that would lead to more pitches in the dirt.
So, all things considered, I think wild pitches are a pretty good candidate. For all MLB games from 2000 to 2009, I figured how many wild pitches were thrown by home teams and road teams. I omitted wild pitches on third strikes, for reasons I'll explain later.
The raw numbers:
So far, it looks like a definite HFA exists. But, since a wild pitch can only occur with runners on base, maybe it's just that home teams aren't in that situation as much as visiting teams (since home teams generally pitch better). Also, maybe road teams throw more pitches per batter than home teams, for whatever reason.
So I adjusted for the number of pitches thrown with runners on base (leaving out intentional balls). The results, per 100,000 pitches:
Road: 460 WP per 100,000 pitches
Home: 448 WP per 100,000 pitches
That's a 2.7% difference, which is reasonably large. For passed balls, the difference was even wider:
Road: 97 PB per 100,000 pitches
Home: 91 PB per 100,000 pitches
The difference is 6.6% this time.
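For concreteness, the road-versus-home comparison above can be sketched in a few lines of Python. The rates are the per-100,000-pitch figures quoted in the post; the helper function is mine, just to make the arithmetic explicit:

```python
def pct_diff(road_rate, home_rate):
    """Road rate relative to home rate, in percent."""
    return (road_rate - home_rate) / home_rate * 100

# Rates per 100,000 pitches with runners on base, 2000-2009
wp_gap = pct_diff(460, 448)   # wild pitches
pb_gap = pct_diff(97, 91)     # passed balls

print(f"WP: {wp_gap:.1f}%  PB: {pb_gap:.1f}%")   # WP: 2.7%  PB: 6.6%
```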
In case you're interested, there were 1,506,500 pitches for the visiting team, and 1,522,647 pitches for the home team. The difference is mostly the home team always having to pitch the ninth inning, partially offset by the fact that the home team faces proportionally fewer situations with men on base.
One other possible objection is that the probability of a wild pitch could vary by count. Maybe "wasting a pitch" on 0-2 leads to more wild pitches than at other counts.
So, I rechecked, but included only pitches at 0-0 counts, which seems like a reasonable control. The results:
Road: 428 WP per 100,000
Home: 401 WP per 100,000
Road: 116 PB per 100,000
Home: 110 PB per 100,000
The same pattern holds. I'll check the statistical significance before I present this at SABR next week, but I'm pretty sure that the chance of this happening by luck is pretty low.
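As a rough significance check, here's a sketch of a standard two-proportion z-test. The event counts are back-calculated from the per-100,000 rates and the pitch totals quoted in this post (557 and 539 combined WP+PB per 100,000), so they're approximations, not the actual play-by-play counts:

```python
import math

def two_prop_z(x1, n1, x2, n2):
    """z statistic for the difference between two proportions, pooled SE."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Approximate WP+PB counts, reconstructed from the quoted rates and
# pitch totals -- assumptions for illustration, not the raw data.
road_events = round(557 / 100_000 * 1_506_500)   # about 8,391
home_events = round(539 / 100_000 * 1_522_647)   # about 8,207

z = two_prop_z(road_events, 1_506_500, home_events, 1_522_647)
print(f"z = {z:.2f}")   # roughly z = 2.1, two-tailed p around .03
```

A z around 2 is consistent with the gap being unlikely to be pure luck, though not overwhelmingly so for wild pitches alone.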
What does this mean for HFA?
For the 1988 American League (which is all I have handy right now), the linear weight for a wild pitch was about 0.27 runs. Over the entire sample, the total road/home difference was 181 events (wild pitches and passed balls combined). That works out to about 49 runs, or, at roughly 10 runs per win, about 5 wins.
5 wins divided by 30 teams, divided by 10 years, divided by 81 home games, equals 0.0002 wins per game. Since the total HFA per game is about 0.040, that means that wild pitches (+PB) make up one-half of one percent of home field advantage.
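That back-of-the-envelope conversion is easy to verify in code. The 0.27 linear weight and the 181-event gap come from the text; the 10-runs-per-win figure is the usual sabermetric rule of thumb, which I'm assuming here:

```python
RUNS_PER_EVENT = 0.27    # 1988 AL linear weight for a wild pitch
RUNS_PER_WIN = 10        # standard rule-of-thumb conversion (assumption)
extra_events = 181       # road minus home, WP + PB, 2000-2009

runs = extra_events * RUNS_PER_EVENT           # ~49 runs
wins = runs / RUNS_PER_WIN                     # ~5 wins
wins_per_game = wins / (30 * 10 * 81)          # teams x years x home games
share_of_hfa = wins_per_game / 0.040           # total HFA ~ .040 wins/game

print(f"{runs:.0f} runs, {wins:.1f} wins, {share_of_hfa:.1%} of HFA")
```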
That seems reasonable to me.
So, I think we have some good evidence that there's HFA in aspects of the game not influenced by the umpire -- namely, wild pitches.