Umpires' racial bias disappears for other years of data -- Part II
Hopefully this will be my last post on that Hamermesh umpire bias study ...
Two days before I was going to talk about the study at a presentation last week, I discovered that someone else was presenting a poster on it at the same conference.
Jeff Hamrick and John Rasp ran the data for 21 years of MLB, 1980-2010; the original Hamermesh study was for three years, 2004-2006.
Here's the chart they got. The numbers are, as usual, percentages of called pitches that were strikes:
Pitcher ------ White Hspnc Black
White Umpire-- 30.70 30.60 29.30
Hspnc Umpire-- 31.70 31.30 28.80
Black Umpire-- 30.80 30.30 28.70
(Their numbers are actually only to one decimal place, not two, but I added the zero to make the numbers line up with the previous charts.)
There is no evidence of bias here ... the diagonal entries are not really any bigger than they "should" be. Still, after controlling for a whole bunch of stuff, including the identity of the pitcher, batter, and umpire, they got a significance level of 0.075. That falls short of the standard .05 threshold. And even if you accept the .075 as real, the amount of bias is very, very small.
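One rough way to see what the diagonal entries "should" be is to fit a plain additive model to the chart: each cell is the umpire's row average plus the pitcher's column average minus the overall average. This is only a sketch, not what Hamrick and Rasp did. It weights all nine cells equally (an assumption, since the actual pitch counts per cell aren't given here), and the variable names are just for illustration.

```python
# Strike percentages from the chart above: rows are umpires, columns are
# pitchers, both in the order White, Hispanic, Black.
rates = [
    [30.7, 30.6, 29.3],
    [31.7, 31.3, 28.8],
    [30.8, 30.3, 28.7],
]

n = len(rates)
grand_mean = sum(sum(row) for row in rates) / (n * n)
row_means = [sum(row) / n for row in rates]
col_means = [sum(rates[i][j] for i in range(n)) / n for j in range(n)]

# No-bias expectation: umpire effect plus pitcher effect, no interaction.
# Assumption: every cell counts equally, ignoring the pitch counts behind it.
expected = [
    [row_means[i] + col_means[j] - grand_mean for j in range(n)]
    for i in range(n)
]

# How much each same-race cell exceeds its no-bias expectation,
# in percentage points.
diagonal_excess = [rates[i][i] - expected[i][i] for i in range(n)]
mean_diagonal_excess = sum(diagonal_excess) / n

print([round(x, 2) for x in diagonal_excess])
print(round(mean_diagonal_excess, 2))
```

Under that equal-weighting assumption, the three diagonal cells come out about 0.3 points below, 0.2 above, and 0.1 above expectation, which averages out to roughly zero.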
As you may recall, the effect in the 3-year sample, which was significant at (I think) somewhere between .01 and .05, could have been caused by a mere 35 pitches per season. This study used seven times as much data, and got a weaker level of significance. With seven times the data, the same number of SDs corresponds to an effect smaller by a factor of the square root of 7. That's about 2.6, so we divide 35 by 2.6 to get about 13 pitches per season. And, since the new study found only maybe 1.7 SDs instead of 2.5, we divide by another 1.5 to get maybe 9 pitches per season.
(That 9 pitches is the minimum, if maybe only one or two umpires are biased. It could be a lot more, if it's white umpires who are biased. But there's no way to know for sure from the data.)
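Here's the back-of-the-envelope arithmetic above as a short sketch. The 35 pitches, the 2.5 and 1.7 SDs, and the 0.075 are the figures quoted in this post; treating the 0.075 as a two-sided p-value to recover the SD count is my assumption, and the variable names are just for illustration.

```python
import math
from statistics import NormalDist

pitches_3yr = 35         # pitches per season that could explain the 3-year result
data_ratio = 21 / 3      # the new study has seven times as many seasons
sd_old, sd_new = 2.5, 1.7  # rough SDs from zero in the old and new studies

# Same number of SDs with 7x the data means an effect smaller by sqrt(7).
after_sample_size = pitches_3yr / math.sqrt(data_ratio)

# The new result is also fewer SDs from zero, so shrink again by 2.5 / 1.7.
after_significance = after_sample_size / (sd_old / sd_new)

# Cross-check of the "1.7 SDs": z-score for a two-sided p-value of 0.075
# (assumption: the reported significance level is two-sided).
z_from_p = NormalDist().inv_cdf(1 - 0.075 / 2)

print(round(after_sample_size))   # about 13
print(round(after_significance))  # about 9
print(round(z_from_p, 1))
```

The two-sided conversion gives a z-score a little under 1.8, which is in line with the "maybe 1.7 SDs" used above.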
However, there's one thing we have to consider. The Hamermesh study found an effect only for low attendance situations. This new Hamrick/Rasp data is for all attendance situations. But when Hamrick and Rasp included attendance in their regression, they got a significance level of 0.977, which suggests the effect has essentially no relationship to attendance.
So it's probably safe to conclude that, when you extend the Hamermesh study from 3 seasons to 21, the effect goes away.
Thanks to Jeff and John for making the data available.