### The Hamermesh umpire/race study revisited -- part I (addendum)

In the previous post, I ran some regressions on the Hamermesh data using a 1/100 sample. But thanks to a couple of commenters who suggested "gretl" regression software, I am now able to regress on the full dataset (a bit over a million pitches).

I thought the full data would give almost the same results as the smaller dataset -- after all, I cut down the data in exact proportion. But the rounding errors proved to be more important than I thought. When I divided by 100 and rounded to the nearest strike, that introduced an error of up to 0.5 strikes in the new, smaller sample, which is the equivalent of an error of up to 50 strikes in the large sample. That's a fairly large difference, considering that we're dealing in hundredths of percentage points.
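To put rough numbers on that, here is a minimal Python sketch. It assumes only the strike counts were rounded (not the pitch counts), and it borrows the 800-pitch and 700,000-pitch cell sizes mentioned later in this post as rough extremes; the function name is just for illustration.

```python
def max_rate_error_pct(cell_pitches: int, divisor: int = 100) -> float:
    """Worst-case error, in percentage points, in a cell's called-strike rate
    when strikes are divided by `divisor` and rounded to the nearest whole strike."""
    max_error_small_sample = 0.5                       # rounding to nearest strike
    max_error_full_scale = max_error_small_sample * divisor  # 50 strikes for divisor=100
    return 100.0 * max_error_full_scale / cell_pitches


# Illustrative cell sizes only (assumed extremes, not the actual Table 2 counts).
for pitches in (800, 700_000):
    print(f"{pitches:>7} pitches: up to {max_rate_error_pct(pitches):.4f} points")
```

The point of the sketch is that the same 50-strike error is negligible in a huge cell but can swamp a small one.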

So I'm going to rerun the results for the larger sample, here, just to be as consistent as possible with the data in the study. If you're not interested in the details, I'll tell you the conclusions right now so you can skip the rest of this post. The full regression gives:

-- slightly different numbers;

-- slightly less evidence of same-race bias;

-- and pretty much the same overall conclusions.

For those who want to see the updated regressions, keep reading.

-----

Here's the "expected" strikes matrix for the regression that did NOT include a variable for racial bias:

| Umpire \ Pitcher | White | Hispanic | Black |
|---|---|---|---|
| White Umpire | 32.06 | 31.46 | 30.64 |
| Hispanic Umpire | 32.03 | 31.43 | 30.61 |
| Black Umpire | 31.83 | 31.23 | 30.41 |
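Because this regression has no same-race term, its fitted values are purely additive: the gap between any two umpire rows has to be the same in every pitcher column. A small Python sketch that checks this on the numbers above (the dictionary layout and function name are just for illustration):

```python
# Expected called-strike percentages from the no-UPM regression, keyed
# first by umpire race, then by pitcher race.
expected = {
    "White":    {"White": 32.06, "Hispanic": 31.46, "Black": 30.64},
    "Hispanic": {"White": 32.03, "Hispanic": 31.43, "Black": 30.61},
    "Black":    {"White": 31.83, "Hispanic": 31.23, "Black": 30.41},
}


def umpire_gaps(matrix, baseline="Black"):
    """For each umpire race, the gap versus the baseline umpire row,
    computed separately for every pitcher column."""
    return {
        ump: [round(row[p] - matrix[baseline][p], 2) for p in row]
        for ump, row in matrix.items()
    }


gaps = umpire_gaps(expected)
for ump, per_column in gaps.items():
    print(ump, per_column)
```

Each row's gaps come out identical across the three columns, which is exactly what "no interaction between umpire race and pitcher race" means.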

Subtracting that from the real-life observations in the original Table 2 matrix:

| Umpire \ Pitcher | White | Hispanic | Black |
|---|---|---|---|
| White Umpire | +0.00 | +0.01 | -0.03 |
| Hispanic Umpire | -0.12 | +0.37 | +0.18 |
| Black Umpire | +0.10 | -0.36 | +0.35 |

Converting that to pitches:

| Umpire \ Pitcher | White | Hispanic | Black |
|---|---|---|---|
| White Umpire | -15 | +23 | -8 |
| Hispanic Umpire | -29 | +27 | +1 |
| Black Umpire | +45 | -50 | +6 |
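One sanity check on this table: a least-squares regression with umpire-race and pitcher-race dummies forces the residuals to sum to zero within each umpire row and each pitcher column, so the pitch residuals above should do the same, up to rounding. A quick Python sketch, assuming the regression was fit at the pitch level (or weighted by pitches):

```python
RACES = ["White", "Hispanic", "Black"]

# Residuals in pitches (actual minus expected); rows are umpire race,
# columns are pitcher race, both in the order given by RACES.
residual_pitches = [
    [-15, +23, -8],
    [-29, +27, +1],
    [+45, -50, +6],
]

# Sum across each umpire row and down each pitcher column.
row_sums = {RACES[i]: sum(row) for i, row in enumerate(residual_pitches)}
col_sums = {
    RACES[j]: sum(row[j] for row in residual_pitches) for j in range(len(RACES))
}

print("by umpire race: ", row_sums)
print("by pitcher race:", col_sums)
```

Every row and column total lands within one pitch of zero, which is what rounding to whole pitches would allow.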

The results are a bit different than in the 1/100 sample. For instance, white umpires are even less biased in favor of their own race, by negative 15 pitches here vs. negative 4 pitches in the other regression. And the same-race cells add up to only +22, as compared to the +37 from before, which is also less suggestive of same-race bias.

-----

Now, here's the regression that includes the UPM variable for same-race bias:

Chance of a called strike equals:

30.4448%

---- plus .1906% if the ump is white

---- plus .1827% if the ump is hispanic

---- plus 1.377% if the pitcher is white

---- plus .8206% if the pitcher is hispanic

---- plus .0513% UPM (if the umpire matches the pitcher)
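To see what those coefficients imply for each umpire/pitcher pairing, here is a short Python sketch that just adds them up. The coefficients are the ones listed above; the names and layout are for illustration, and black umpires and black pitchers are the baseline (zero) categories.

```python
BASE = 30.4448  # intercept, in percent
UMP = {"White": 0.1906, "Hispanic": 0.1827, "Black": 0.0}
PITCHER = {"White": 1.377, "Hispanic": 0.8206, "Black": 0.0}
UPM = 0.0513    # added only when umpire and pitcher are the same race


def predicted_strike_pct(ump: str, pitcher: str) -> float:
    """Predicted called-strike percentage for one umpire/pitcher race pairing."""
    pct = BASE + UMP[ump] + PITCHER[pitcher]
    if ump == pitcher:
        pct += UPM
    return pct


for ump in UMP:
    print(ump, [round(predicted_strike_pct(ump, p), 2) for p in PITCHER])
```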

The old regression had the UPM term at .1169 – this one has it at less than half that, at .0513.

Now that we're using the full dataset, we can get a significance level for the UPM parameter. It turns out it's not significant at all, with a p-value of about .83, far more than the .05 required for significance. In fact, the estimated same-race effect is *smaller* than what random data with no bias at all would typically produce (where you'd expect a p-value of around 0.5).
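For a rough sense of how imprecise that estimate is, the p-value can be backed out into an approximate standard error. This Python sketch assumes the reported p-value is two-sided and based on a normal (z) approximation, which is reasonable with a million observations but is my assumption, not something read off the regression output:

```python
from statistics import NormalDist

upm_coef = 0.0513   # UPM coefficient, in percentage points
p_value = 0.83      # approximate p-value reported above (assumed two-sided)

std_normal = NormalDist()

# |z| such that the two tails beyond it together hold p_value of the probability.
z = std_normal.inv_cdf(1 - p_value / 2)

# Implied standard error and a rough 95% interval for the UPM effect.
std_err = upm_coef / z
z95 = std_normal.inv_cdf(0.975)
ci_low, ci_high = upm_coef - z95 * std_err, upm_coef + z95 * std_err

print(f"z = {z:.3f}, implied standard error = {std_err:.3f} points")
print(f"rough 95% interval: {ci_low:.3f} to {ci_high:.3f} points")
```

Under those assumptions the interval comfortably straddles zero, which is just another way of saying the UPM term is not significant.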

Doing the calculation for baseball significance shows that the proportion of pitches affected by the presence of a same-race pair is somewhere between 1 pitch in 2855, and 1 pitch in 6140.

-----

It's interesting how such a small change in the observed percentages – caused just by rounding! – could bring on such a large difference in the estimate of racial bias. In part, that's because this regression is trying to reproduce the numbers in the nine cells, regardless of whether those cells contain 800 pitches or 700,000 pitches. While the number of observations certainly does affect the results of the regression, it seems that the raw numbers in the cells matter even more.

So the conclusions in the previous post still stand – no evidence of bias, no statistical significance, and no baseball significance either.

And, again, we haven't actually got to the Hamermesh study itself yet. We'll do that next.

Labels: baseball, Hamermesh update, race

## 6 Comments:

By the way, for those reading this far:

1. My results might still vary slightly from the real-life data, because the original study displayed the percentages rounded off to two decimal places.

2. My version of the Hamermesh Table 2 dataset is available on request if anyone wants it.

3. The "gretl" software appears to have gone into an endless loop (while consuming CPU cycles) when I asked it to generate the actual values, fitted values, and residuals. So I did those manually. (Any other suggestions for free regression software?)

I think maybe you mentioned this previously. Assuming the effect was actually significant, perhaps the Ump-Pitcher connection is only due to a handful of umps. Since the effect is so small, maybe there was 1 or 2 or so umps out there that made the difference. In other words, all umps aren't 0.05% race-conscious, but a few guys really are. Do any of the umps in the data stand out in their 'UPM?'

Maybe the connection has nothing to do with race, but with common colleges, hometowns, etc. Let's say an ump gives extra strikes to his fellow Boise native. They're probably both white. Or another ump gives extra strikes to another guy who comes from inner-city Detroit. Or from the same church...Or who reminds him of himself when he was young...

It wouldn't be racism, but a third common factor that shows up in the regression results. For some reason the name for this effect escapes me. Lurking-something...

Hi, Brian,

Quite possible. Over at "The Book," mgl and Tango also suggested that most of the effect might be due to one umpire ... they said Angel Hernandez would be good to look at because of his reputation. I don't have the data broken down by individual umpires, but it would be good to look at.

I agree that the effect might have to do with some commonality other than race ... my thought was that, for instance, it might just be an effect of umpires liking certain players and disliking others.

I'm going to think up more theories in time for Part III of this.

Phil:

This would be a good project for one of the analysts working with the pitch/fx data. You could use that to establish the "true" strike% on all called pitches for pitchers of each race. That would solve the problem of relying on possibly-biased umpires to establish those true rates. (I know the data doesn't include all pitchers, but I would think the sample is big enough to give you pretty good estimates, at least for White and Hispanic pitchers.)

Right, when your "population" is 3 umpires, I wouldn't look to "race-bias", even if I had 1 million PA.

Imagine if the "population" was all pitchers who weighed at least 290 lbs. You'd think pitchers like that are biased to have a great K/BB ratio. Well, if that population was exactly CC Sabathia and David Wells, then, it's not their weight that is necessarily helping them. Or, what about looking at the "population" of lefty pitchers at least 6'10".

It's a complete joke to the other two hispanic umpires that their "population" includes a nutcase like Angel Hernandez.
