Do NHL referees call "make up" penalties?
Among NHL fans, there's a perception that referees like to call "make up" penalties: if a ref has just called a minor penalty on one team, the next penalty is very likely to go to the other team.
I was skeptical, until I downloaded a bunch of data from The Hockey Summary Project ... they're like Retrosheet for hockey. (Their website is here, and if you want data downloads, you can join their group by going here.)
I looked at all penalties from 1953-54 to 1984-85 (for which the HSP data is almost complete). I eliminated all cases where both teams got penalties at the same time. Then, I checked what was left, to see if the team that got the current penalty was less likely to get the next one.
Absolutely, very much so. There's a 60% chance the next penalty will go to the other team -- 59.7%, to be more exact. (But, since I'm not sure that database is complete, and I forgot to remove misconducts, and I didn't consider situations where both teams got a penalty but one team got an extra one, I'm happier to drop the decimal and just go with 60%.)
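For anyone who wants to try the same tally, here's a minimal sketch of the counting described above. The data layout (a list of (seconds, team) pairs for one game) and the function names are assumptions for illustration, not the HSP file format, and "at the same time" is taken to mean an identical timestamp.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical format: (seconds elapsed in the game, team code)
Penalty = Tuple[int, str]

def drop_coincidental(penalties: List[Penalty]) -> List[Penalty]:
    """Drop every penalty whose timestamp has penalties to both teams."""
    teams_at: Dict[int, Set[str]] = {}
    for t, team in penalties:
        teams_at.setdefault(t, set()).add(team)
    return [(t, team) for t, team in penalties if len(teams_at[t]) == 1]

def switch_counts(penalties: List[Penalty]) -> Tuple[int, int]:
    """Return (switches, pairs) for one game.

    'pairs' is the number of consecutive penalty pairs; 'switches' is how
    many of those pairs had the second penalty go to the other team.
    """
    kept = sorted(drop_coincidental(penalties))
    switches = sum(1 for (_, a), (_, b) in zip(kept, kept[1:]) if a != b)
    return switches, max(len(kept) - 1, 0)

# Made-up game: the two penalties at 300 seconds are coincidental and dropped.
game = [(120, "MTL"), (300, "TOR"), (300, "MTL"),
        (450, "TOR"), (700, "TOR"), (900, "MTL")]
switches, pairs = switch_counts(game)
```

Summing `switches` and `pairs` over every game and dividing gives the kind of percentage quoted here. Like the original count, this sketch does nothing special about misconducts or uneven coincidental penalties.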
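The effect is reasonably consistent over time, although it was a little stronger back in the six-team era. Here's a too-long chart. Each row shows the season, the percentage of next penalties that went to the other team, and the counts behind it (other-team penalties / total).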
1953-54: 62.7 888/1416
1954-55: 61.3 857/1397
1955-56: 60.6 912/1506
1956-57: 61.2 833/1360
1957-58: 58.8 793/1348
1958-59: 61.4 801/1305
1959-60: 62.3 723/1160
1960-61: 61.7 740/1199
1961-62: 62.0 821/1324
1962-63: 62.0 797/1286
1963-64: 61.8 826/1337
1964-65: 61.7 841/1362
1965-66: 58.6 820/1399
1966-67: 58.6 710/1212
1967-68: 60.0 1515/2527
1968-69: 59.5 1666/2800
1969-70: 60.5 1793/2962
1970-71: 60.4 1944/3220
1971-72: 57.8 1918/3317
1972-73: 60.6 2200/3633
1973-74: 58.8 2135/3628
1974-75: 56.7 2873/5069
1975-76: 56.3 2890/5130
1976-77: 57.2 2371/4144
1977-78: 59.1 2316/3916
1978-79: 59.5 2337/3925
1979-80: 59.3 3021/5091
1980-81: 60.1 3780/6293
1981-82: 60.7 3613/5957
1982-83: 60.3 3479/5774
1983-84: 60.2 3788/6296
1984-85: 60.6 3542/5847
-------------------------
Overall: 59.7 58543/98140
Even though the effect is real, we can't say for sure that it's referee bias. It could just be that, after a penalty, the penalized team plays more cautiously, trying to avoid a second penalty. Or, it could be that the team that just had the power play decides to play more aggressively.
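To put a number on "the effect is real", here's a rough sketch of a Z-score against a 50/50 split. The totals come from adding up the season rows in the chart; treating every pair of penalties as an independent coin flip is a simplifying assumption.

```python
import math

# Totals from summing the season rows of the chart above.
other_team = 58543   # next penalty went to the other team
total = 98140        # all consecutive-penalty pairs

p_hat = other_team / total
# Standard error of a proportion if the true rate were exactly 50%.
se_null = math.sqrt(0.5 * 0.5 / total)
z = (p_hat - 0.5) / se_null

print(f"observed rate = {p_hat:.3f}, z = {z:.1f}")
```

The Z-score comes out around 60, so luck isn't a plausible explanation. That says nothing about *why* it happens, which is the real question.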
(As an aside: why did penalties drop so much between 1975-76 and 1976-77? At first I thought it might be bad data, but then I checked power-play opportunities on Hockey Reference, and it checked out.)
Here's what I think is some relevant evidence. I broke down the stats by referee (minimum 300 datapoints). The database only has the referee named for about a quarter of the total games (mostly older ones), but I figure it's probably good enough to at least look at.
The first column is the main number: the percentage of penalties called against the team that drew the previous one (that is, the percentage that went to the other team).
Pctg Z-sc Size Ref
---- ---- ---- ----------------------------
59.3 00.0 0509 Andy Van Hellemond
60.4 +0.4 0846 Art Skov
59.3 -1.2 1128 Ashley
60.0 00.0 0460 Bill Friday
57.0 -1.1 0537 Bob Myers
58.9 -0.3 0878 Bruce Hood
57.1 -0.9 0580 Bryan Lewis
59.3 -1.6 1453 Buffey
64.6 +1.6 0933 Chadwick
59.4 -0.1 0567 Dave Newell
60.4 -0.3 0356 Farelli
60.1 -0.3 0511 Friday
59.9 -0.1 0696 John Ashley
61.3 +0.8 0359 Lloyd Gilmour
61.3 -0.2 0789 Macarthur
56.7 -1.4 0319 Mehlenbecher
63.0 +0.5 0327 Olinski
60.4 -0.8 1906 Powers
55.2 -2.2 0698 Ron Wicks
63.7 +1.7 1095 Skov
57.7 -3.1 2197 Storey
63.1 +2.7 4717 Udvari
59.9 +0.4 0709 Wally Harris
The least "biased" referee is at 55.2%, and the most "biased" is at 64.6%. If you think it's only referee bias that keeps the numbers from being 50%, you'd have to think that EVERY referee is biased almost exactly the same way. It's hard for me to accept that none of the referees noticed the bias and saw fit to try to eliminate it.
The second column of the table is the Z-score, the number of standard deviations the referee is from expected (which is normalized to the seasons he officiated). Normally, you concentrate on those with at least plus or minus 2 SD. That gives you Red Storey and Ron Wicks (less biased than most) and Frank Udvari (more biased than most).
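Here's a sketch of how a Z-score like that can be computed for one referee. The `z_score` helper is hypothetical, and the 59.7 baseline is just the overall rate used as a stand-in. The table's numbers use an expectation adjusted for the seasons each referee worked, which isn't reproduced here, so this won't match the table exactly.

```python
import math

def z_score(pct_other_team: float, n: int, baseline_pct: float) -> float:
    """SDs between a referee's observed rate and an expected baseline rate."""
    p = pct_other_team / 100
    p0 = baseline_pct / 100
    # Binomial standard error of a rate over n penalties, under the baseline.
    se = math.sqrt(p0 * (1 - p0) / n)
    return (p - p0) / se

# Ron Wicks: 55.2% over 698 penalties, against the ASSUMED 59.7 baseline.
z_wicks = z_score(55.2, 698, 59.7)
print(f"{z_wicks:.1f}")
```

With the flat baseline this gives roughly -2.4, compared with -2.2 in the table, which is about what you'd expect if his seasons had a slightly lower league-wide rate.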
The standard deviation of the Z-scores was 1.29. If every referee were the same, and differences were only random, it would be 1.00. This suggests that there are real differences between referees. Specifically, the SD of referee tendencies (or "talent", you might say) is about 0.8 Z-score units (since 1 squared plus 0.8 squared approximately equals 1.29 squared).
In English, you can perhaps interpret that as saying that the differences in the table are about half real and half random, with a little more random than real (since 1.00 is a little higher than 0.8).
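The arithmetic behind that split, as a quick sketch. It assumes the real and random parts are independent, so their variances add; the variable names are just for illustration.

```python
import math

observed_sd = 1.29   # SD of the referees' Z-scores
random_sd = 1.00     # SD expected from luck alone

# Variances add: observed^2 = real^2 + random^2, so solve for the real part.
real_sd = math.sqrt(observed_sd**2 - random_sd**2)

# Share of the observed variance that reflects real referee differences.
real_share = real_sd**2 / observed_sd**2

print(f"real SD = {real_sd:.2f}, real share of variance = {real_share:.2f}")
```

The real SD works out to a bit over 0.8, and the real share of the variance to about 40 percent, which is the "a little more random than real" in numbers.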
The observed range is 55 to 64. Regressing to the mean, the actual range of referee tendencies is probably 57 to 62, or something like that.
So if you think it's referee bias, you have to explain why all the referees seem to be biased within such a tight range, especially when, presumably, they are all working hard to be as unbiased as possible.
--------
Here's another interesting breakdown, by time since the previous penalty:
0:01 to 1:00: 69.1 (7000)
1:01 to 2:00: 64.7 (9444)
2:01 to 3:00: 68.5 (11778)
3:01 to 4:00: 64.2 (10574)
4:01 to 5:00: 61.2 (8831)
5:01 to 6:00: 59.7 (7470)
6:01 to 7:00: 58.9 (6328)
7:01 to 8:00: 58.3 (5333)
8:01 to 9:00: 56.6 (4399)
9:01 to 10:00: 55.5 (3719)
10:01 to 11:00: 56.7 (3000)
11:01 to 12:00: 55.5 (2591)
12:01 to 13:00: 53.8 (2193)
13:01 to 14:00: 55.2 (1837)
14:01 to 15:00: 53.3 (1565)
15:01 to 16:00: 53.1 (1376)
16:01 to 17:00: 53.1 (1135)
17:01 to 18:00: 52.4 (1019)
18:01 to 19:00: 51.9 (807)
19:01 to 20:00: 53.6 (757)
20:01 to 99:99: 51.8 (3883)
The longer the interval since the previous penalty, the less likely the next penalty will go to the other team. That's consistent with many theories. The "referees are biased" theory would say that referees "forget" to even things up as the game goes on. The "other team wants revenge and plays aggressively" theory would say that if they don't get revenge early, they don't need it as much later. And the "penalized team takes fewer chances" theory would say that as time goes on, the players "forget" that they have to be more careful.
So, the data doesn't help us choose, but it's interesting nonetheless.
By the way, the 1:01 to 2:00 group is an exception to the pattern, but that's probably due to power plays, since the first penalty is probably still in effect. Actually, I'd have expected that part to go the other way, with the penalized team taking *more* than 50 percent of the next penalties in the first two minutes, on the logic that the shorthanded team, playing in its defensive zone, is more likely to be forced to take a penalty. But that doesn't happen.
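The one-minute groups in the table above can be built with something like the following sketch. The input format (a gap in game-clock seconds plus a True/False flag for "went to the other team") and both function names are assumptions, and the gap is assumed to be at least one second, since same-time penalties were thrown out.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

def bucket_label(gap_seconds: int) -> str:
    """Map a gap of 1 second or more to a label like '0:01 to 1:00'."""
    if gap_seconds > 20 * 60:
        return "20:01 to 99:99"
    minute = (gap_seconds - 1) // 60   # 1..60 -> 0, 61..120 -> 1, and so on
    return f"{minute}:01 to {minute + 1}:00"

def rate_by_gap(pairs: Iterable[Tuple[int, bool]]) -> Dict[str, Tuple[float, int]]:
    """pairs: (gap_seconds, went_to_other_team). Returns {label: (pct, n)}."""
    tally = defaultdict(lambda: [0, 0])   # label -> [switches, total]
    for gap, switched in pairs:
        cell = tally[bucket_label(gap)]
        cell[0] += int(switched)
        cell[1] += 1
    return {label: (100 * s / n, n) for label, (s, n) in tally.items()}

# Made-up example: three penalty pairs with gaps of 30, 45 and 90 seconds.
example = rate_by_gap([(30, True), (45, False), (90, True)])
```

Narrower groups, like five-second slices of the first minute, only need a different `bucket_label`.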
And here's an interesting breakdown of the first half of the first group:
81.9% within 5 seconds
78.1% between 6 and 10 seconds
76.0% between 11 and 15 seconds
73.8% between 16 and 20 seconds
69.3% between 21 and 25 seconds
67.0% between 26 and 30 seconds
-----
Finally, one more question: after one team gets, say, four straight penalties, what happens then? Is there an even stronger bias for the other team to take the next penalty?
Yup.
57.1 after exactly 1 in a row (64858 datapoints)
64.0 after exactly 2 in a row (23850)
66.6 after exactly 3 in a row (7042)
66.0 after exactly 4 in a row (1781)
63.8 after exactly 5 in a row (442)
60.6 after exactly 6 in a row (127)
67.5 after 7 or more in a row (40)
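Here's a sketch of how the "in a row" breakdown can be tallied. It assumes the streak resets at the start of each game and that the input is simply the penalized team for each penalty, in order, with coincidental penalties already removed. The function name and the `cap` parameter are mine.

```python
from collections import defaultdict
from typing import Dict, List

def rate_by_streak(teams: List[str], cap: int = 7) -> Dict[int, List[int]]:
    """Return {streak_length: [switches, total]} for one game.

    streak_length is how many consecutive penalties the same team had taken
    just before the next penalty; lengths of 'cap' or more are lumped together.
    """
    tally = defaultdict(lambda: [0, 0])
    streak = 0
    for i in range(len(teams) - 1):
        # Extend the run if this penalty matches the previous one, else restart.
        streak = streak + 1 if i > 0 and teams[i] == teams[i - 1] else 1
        key = min(streak, cap)
        tally[key][0] += int(teams[i + 1] != teams[i])
        tally[key][1] += 1
    return dict(tally)

# Made-up game: team A takes three straight, then B, then A again.
result = rate_by_streak(["A", "A", "A", "B", "A"])
```

Adding up the per-game tallies and dividing switches by total for each streak length gives percentages in the same shape as the list above.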
------
So: what's going on? Any ideas?
UPDATE: Part 2 is here. Part 3 is here.