Did NFL teams discriminate against black coaching candidates? Part II
I posted recently about a "Rooney Rule" study that appeared in the Journal of Sports Economics. In that paper, the authors found that, from 1990 to 2002, NFL teams with black head coaches won 1.1 games per season more than teams with white head coaches. The authors took this as evidence that the NFL was discriminating against black candidates -- hiring only the best black coaches, and not the average ones.
A few more thoughts on the issue:
1. I'm not an expert on NFL coaching, but it seems to me very, very unlikely that a sample of 29 coaches, no matter how you selected them, could be, on average, as much as 1.1 games per season better than average. That seems way too high. Maybe one coach could, sure, under very specific circumstances (say, if he figures out he should start Tom Brady instead of Drew Bledsoe). But the average of 29 coaches? That would be nearly impossible, wouldn't it?
And it's not like the study chose the best 29 coaches -- they chose the only 29 black coaches there were. If the group averages 1.1 wins better, then the best of the 29 would have to be worth substantially more than 1.1 wins, season after season. That, again, seems implausible.
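One rough way to put numbers on that intuition is the sketch below. It is not anything from the study: it assumes coaching talent is normally distributed, that teams hire only candidates above some cutoff, and that the spread of talent is 0.5 wins per season (`SD_WINS`, a made-up figure). Under those assumptions it computes the average quality of a group drawn only from the top fraction of candidates.

```python
from statistics import NormalDist


def mean_of_top_fraction(sd_wins, top_fraction):
    """Average talent (in wins per season above the overall mean) of
    candidates drawn only from the top `top_fraction` of a normal
    talent distribution with standard deviation `sd_wins`.

    Uses the standard truncated-normal result:
    mean above cutoff z = pdf(z) / (1 - cdf(z)), in SD units.
    """
    std = NormalDist()  # standard normal: mean 0, SD 1
    cutoff = std.inv_cdf(1.0 - top_fraction)
    return sd_wins * std.pdf(cutoff) / top_fraction


# Hypothetical spread of true coaching talent; NOT taken from the study.
SD_WINS = 0.5

for fraction in (0.50, 0.25, 0.10, 0.05, 0.01):
    average = mean_of_top_fraction(SD_WINS, fraction)
    print(f"hire only the top {fraction:.0%}: "
          f"group averages {average:+.2f} wins per season")
```

With this assumed spread, hiring only from the top half of candidates gives a group that averages about 0.4 wins better, well short of 1.1. Plug in your own guess for `SD_WINS` to see how extreme the selection would have to be.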
It's a critical question, because, if the effect is too big to be coaching, the study is no evidence at all -- it literally has zero value!
Here's the logic. If you argue that the 1.1 games is statistically significant, then you're saying that there's evidence that the teams with the black coaches are significantly different, in some way, from the teams with the white coaches. You may believe that the difference is the coach's race. But since 1.1 is too big an effect to be just the coaches, the difference must be, in part, something else. So, since there must be something else going on, you have very little basis for thinking that there's evidence that even *any part of it* is coaching. After all, whatever the "something else" is, it could just as easily be responsible for all of the 1.1 as for part of it. In fact, it could be responsible for *more* than 1.1 games, and the black coaches might be *worse* than the white coaches!
If you get an effect size that couldn't possibly be what you're looking for, then all you have evidence for is that there's something else causing the effect. That means there are confounding factors your study hasn't controlled for, which means you have no evidence at all for your particular hypothesis. That doesn't mean you're wrong -- it's not that you have evidence against it, it's just that you have no evidence *for* it.
This is a little bit counterintuitive -- it means a small effect is better evidence than a large effect. If you get statistical significance with a difference of 1.1 wins, that means nothing. But if you get statistical significance with a difference of 0.1 wins, now at least there's a chance that you're seeing something real.
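The argument can be laid out as simple arithmetic. The sketch below assumes the observed gap splits into a coaching effect plus "something else," and that no group of coaches can differ from average by more than some cap. The cap of 0.5 wins (`MAX_COACH_EFFECT`) is a hypothetical number chosen only for illustration; it is not from the study.

```python
def other_factors_range(observed_diff, max_coach_effect):
    """Range of 'something else' consistent with the observed gap.

    Assumes observed_diff = coach_effect + other_factors, with
    |coach_effect| <= max_coach_effect (a hypothetical cap).
    """
    return (observed_diff - max_coach_effect,
            observed_diff + max_coach_effect)


def coaching_alone_could_explain(observed_diff, max_coach_effect):
    """True if zero confounding is consistent with the observed gap."""
    low, high = other_factors_range(observed_diff, max_coach_effect)
    return low <= 0 <= high


# Hypothetical cap on a group's coaching effect, in wins per season.
MAX_COACH_EFFECT = 0.5

for observed in (1.1, 0.1):
    low, high = other_factors_range(observed, MAX_COACH_EFFECT)
    alone = coaching_alone_could_explain(observed, MAX_COACH_EFFECT)
    print(f"observed gap {observed:+.1f}: other factors between "
          f"{low:+.1f} and {high:+.1f} wins; "
          f"coaching alone could explain it: {alone}")
```

With that cap, a 1.1-win gap forces the other factors to account for somewhere between 0.6 and 1.6 wins; the top of that range is the case where the black coaches are actually worse. A 0.1-win gap, on the other hand, is compatible with no confounding at all.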
2. In a different post a while ago, I quoted Bill James on psychology:
"... in order to show that something is a psychological effect, you need to show that it is a psychological effect -- not merely that it isn't something else. Which people still don't get. They look at things as logically as they can, and, not seeing any other difference between A and B conclude that the difference between them is psychology."
After this coaching study, it occurs to me that Bill's argument holds for *any* possible cause, not just psychology. Racial bias, for instance. Editing Bill's quote:
"... in order to show that something is a racial bias effect, you need to show that it is a racial bias effect -- not merely that it isn't something else. Which people still don't get. They look at things as logically as they can, and, not seeing any other difference between A and B conclude that the difference between them is racial bias."
The typical study will spend a lot of time and paragraphs and numbers persuading you that there is evidence that A and B are different at a statistically significant level. But then they'll give you only a few sentences *about what that evidence really means*. Shouldn't it be the other way around?
It's as if you're on trial for murder, and the prosecution spends five days nailing down how many millions of dollars you stand to inherit from the victim. They call a stockbroker, a banker, a real estate agent, all of whom testify for hours about how much the guy left you in his will, down to the penny. And then, after all that, the prosecutor says to the judge, "so, obviously, the accused must have done it. We rest our case."
That's backwards. Showing that A and B are different is the easy part -- it's just regression. The hard part is figuring out *why* A and B are different. Most of the effort should go into the argument, not into the statistics.
3. A reader was kind enough to send me a similar study from "Labour Economics." It's called "Moving on up: The Rooney rule and minority hiring in the NFL," by Benjamin L. Solow, John L. Solow, and Todd B. Walker. (Here's a press release.)
The authors create a model to predict whether a "level-two" assistant coach is promoted to head coach, based on performance, age, calendar year, years of experience, and race. It turns out that race is not significant, either before or after the Rooney Rule. Nonetheless, the coefficient for "minority coach" (most are black) is slightly negative (-0.6 SD) before, and slightly positive (+0.8 SD) after.
If you choose to interpret the pre-2003 coefficient at face value, even though it's not statistically significant (which I don't recommend), it's equivalent to two extra years of high-level coaching experience.