Did NFL teams discriminate against black coaching candidates?
The "Rooney Rule," adopted by the NFL in December 2002, required all teams searching for a head coach to interview at least one black candidate. Between 2002 and 2009, the number of black coaches roughly doubled. Was this the result of the rule, or not?
A paper in the latest issue of the "Journal of Sports Economics," by Janice Fanning Madden and Matthew Ruther, looks at some evidence on the question. It's called "Has the NFL's Rooney Rule Efforts 'Leveled the Field' for African American Head Coach Candidates?" A version of the paper can be found here (.pdf).
The authors find that before the Rooney Rule, black head coaches guided their teams to significantly superior records: an average of 9.1 wins, versus a mean of 8 for white coaches (and for coaches overall). For first-year coaches, the difference was even bigger: 9.6 wins versus 7.1 wins.
They note that these numbers are consistent with the hypothesis that black coaches had to be significantly better than average to get the job. That suggests discrimination on the part of hiring teams.
Again, that was before the Rooney Rule. Afterwards, there was no appreciable difference between white and black coaches. Is the difference between the two time periods significant?
The authors start by doing a t-test on the pre-Rooney race difference of 1.1 games, and they find it significant, at 2.57 standard deviations. However, I'm not so sure about that. I think their t-test assumes all observations are independent. In real life, they're not: a team's record this year is positively correlated with its record last year. One black coach being hired by one (perennially) good team might have made all the difference.
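To get a rough feel for how much non-independence could matter, here is a minimal sketch using the standard design-effect approximation for clustered observations, 1 + (m - 1) * rho. Only the 2.57 comes from the paper. The cluster size (seasons per coaching stint) and the season-to-season correlation are illustrative assumptions, not figures the authors report.

```python
import math


def design_effect(seasons_per_cluster: float, rho: float) -> float:
    """Kish design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1.0 + (seasons_per_cluster - 1.0) * rho


def adjusted_z(naive_z: float, seasons_per_cluster: float, rho: float) -> float:
    """Shrink a naive z-score by the square root of the design effect."""
    return naive_z / math.sqrt(design_effect(seasons_per_cluster, rho))


# 2.57 is the figure quoted from the paper; the other two inputs are
# hypothetical values chosen only for illustration.
NAIVE_Z = 2.57
ASSUMED_SEASONS_PER_STINT = 4  # assumption, not from the paper
ASSUMED_RHO = 0.3              # assumption, not from the paper

if __name__ == "__main__":
    z = adjusted_z(NAIVE_Z, ASSUMED_SEASONS_PER_STINT, ASSUMED_RHO)
    print(f"adjusted z: {z:.2f}")
```

Under those made-up inputs, the 2.57 shrinks to roughly 1.86, which is below the usual 1.96 cutoff. Different assumptions would give different numbers; the point is only the direction of the correction.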
And, indeed, the authors do find that black coaches get hired by better teams. They don't give us the data, but they mention it:
"... the teams that hired African American coaches in the 1990-2002 period had better records prior to the hires ... "
So they run a regression that tries to control for team quality, and they still get a significant result. But that regression uses payroll as a proxy for quality. The relationship between payroll and wins is probably pretty decent, but not as good as other possibilities. I'm sure you could find lots of teams that were excellent despite average payrolls, and, again, all it might take is one black coach to be hired by such a team.
Then, they try a regression that uses the Sports Illustrated preseason prediction as a variable. That's not perfect, but it should be better than pretty good: SI writers are subject matter experts, and will use a wide assortment of data to make their predictions. Those predictions are probably not as good as Vegas odds might be, but I think this is a reasonable way of doing it.
And, now, the result is no longer significant. It's only 1.43 SD, and probably less when you correct for the fact that seasons aren't independent.
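For readers who think in p-values rather than standard deviations, here is a small sketch that converts the two figures quoted so far. It assumes a standard normal approximation and a two-sided test, which may not be exactly what the authors used.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a z-score under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Both z-values are the ones quoted in the text above.
    for label, z in [("raw t-test", 2.57), ("with SI prediction", 1.43)]:
        print(f"{label}: z = {z:.2f}, p = {two_sided_p(z):.3f}")
```

On that approximation, 2.57 SD corresponds to a p-value of about .01, and 1.43 SD to about .15.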
But, in fairness, and as the authors mention, that may understate the significance if the SI staff had already raised their predictions because they recognized that the coach was of higher quality. I'm guessing that's not much of a factor, though.
The authors then look at firings. Controlling for several variables, including wins, whether the team made the playoffs, and how many years the coach was with the team (and the square of that figure), they find that, before the Rooney Rule, black coaches were more likely to be let go. After the Rooney Rule, the difference disappeared.
But ... aren't coaches fired for performance relative to expectations rather than for raw performance? Since the black coaches started with better teams in the first place, you'd expect them to get fired faster for a given record, because it's easier to disappoint from a higher level than from a lower level. If you start 10-6, and then fall to 7-9, your job will be in jeopardy. But if you go from 8-8 to 7-9, you're more likely to be safe.
Since that result is only barely significant (2.15 SD), I'm guessing that if you used more realistic "disappointment" variables, the significance would disappear.
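As a sketch of what a "disappointment" variable might look like, the snippet below measures wins short of expectation instead of raw wins. The Season class, the field names, and the choice of last year's record as the expectation are all assumptions for illustration; the paper does not define such a variable.

```python
from dataclasses import dataclass


@dataclass
class Season:
    # Hypothetical baseline: last year's wins, or a preseason forecast.
    expected_wins: float
    actual_wins: float


def disappointment(season: Season) -> float:
    """Wins short of expectation; 0.0 when the team met or beat expectations."""
    return max(0.0, float(season.expected_wins - season.actual_wins))


# The two examples from the text, using last year's record as the expectation.
fell_from_10_6 = Season(expected_wins=10, actual_wins=7)
fell_from_8_8 = Season(expected_wins=8, actual_wins=7)

if __name__ == "__main__":
    print(disappointment(fell_from_10_6))
    print(disappointment(fell_from_8_8))
```

Both coaches finish 7-9, so a regression on raw wins treats them identically, but the first falls three wins short of expectations and the second only one.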
Finally, the authors look at offensive and defensive coordinators. They find no significant difference in the performances of black and white coordinators, either before or after the Rooney Rule. However, they do find that in the entire period of the study -- 1984 to 2009 -- not even one black offensive coordinator was promoted to head coach. The authors say that's statistically significant at p=.01.
But, again, I think the authors are assuming independence, which causes the significance level to be overstated. Moreover, the authors' own Table 8 shows that black offensive coordinators worked for worse teams than white offensive coordinators. After the Rooney Rule, for instance, black offensive coordinators worked for teams in the 34th percentile of performance, while black defensive coordinators worked for teams in the 54th percentile. Perhaps that explains part of the difference.
Also, there are many comparisons in the authors' charts, so it becomes more likely that at least one of them will show significance. My unscientific feeling is that this one datapoint is a random anomaly, and, in any case, not all that significant anyway.
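The multiple-comparisons point can be made concrete with a short sketch. It assumes the tests are independent and each uses a .05 threshold. The number of comparisons is a hypothetical input, since the count in the authors' charts isn't tallied here.

```python
def prob_at_least_one_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_tests independent null tests looks significant."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the overall false-positive rate near alpha."""
    return alpha / num_tests


if __name__ == "__main__":
    # Hypothetical counts of comparisons, for illustration only.
    for k in (1, 5, 10, 20):
        p_any = prob_at_least_one_false_positive(k)
        print(f"{k:2d} tests: P(at least one) = {p_any:.2f}, "
              f"Bonferroni cutoff = {bonferroni_threshold(k):.4f}")
```

With 20 independent comparisons, for example, the chance of at least one spurious "significant" result is about 64 percent.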
My overall impression when reading this paper was ... geez, there were only 29 black coaches in the pre-Rooney Rule era. Why not actually look at them and see if their performance was unexpectedly good? That would require the assistance of subject matter experts (SMEs) -- people who knew the NFL -- which, admittedly, is not usual for an academic paper of this sort. And, of course, any SME judgments would necessarily be subjective.
But, still, if you want to get the best answer to the question, instead of the most journal-publishable answer to the question, that's the way to do it. Maybe coach X was hired just when player Y blossomed into a superstar, and so it would be incorrect to attribute the team's playoff success to the coach. Maybe black coaches are unproven, and teams are willing to hire an unproven coach only when they have a hugely disappointing season -- which suggests bad luck, which suggests maybe they bounce back to their previous level of excellence.
If there were thousands of datapoints, you couldn't check all those things. But, 29? That doesn't seem too difficult an obstacle. And, it's telling that the regression that comes closest to doing that -- the one that takes into account the SMEs at Sports Illustrated -- was the one that didn't find statistical significance.