Corsi, shot quality, and the Toronto Maple Leafs, part III
My last two posts argued that there might be an inverse relationship between shot quality and shot quantity -- or, in other words, between Corsi and shooting percentage (SH%).
Most hockey analysts disagreed with me. Which is fair enough; so far, the only evidence I've really put forward is the negative correlation, over the last two years, between Corsi percentage and shooting percentage. In each of the seasons 2011-12 and 2012-13, in even-strength tie-game situations, the team correlation was -0.24. Before that, it was close to zero.
I spent the last week searching for other evidence, and I think I finally found something.
If shooting percentage in tie games is just luck -- or mostly just luck -- you should expect no correlation between shooting percentage this year, and shooting percentage next year. And that's pretty much what you see in the data. From 2007-08 to 2011-12, the team "that-season-to-next-season" correlation was only about 0.03 -- virtually zero. (All "Corsi" numbers in this post are even-strength tie-game situations only.)
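Here's a minimal sketch of how a "that-season-to-next-season" correlation could be computed: pair each team's number in one season with the same team's number the following season, pool all the pairs, and take the Pearson correlation. The data layout (a dict keyed by season start year, then team), the function names, and the toy values are all assumptions for illustration. They are made-up placeholders, not real team data, and this is not necessarily how the 0.03 figure was produced.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)

def season_to_next_pairs(table):
    """Turn {season_start_year: {team: value}} into (this_year, next_year) pairs."""
    pairs = []
    for year, teams in table.items():
        following = table.get(year + 1)
        if following is None:
            continue  # no later season to compare against
        for team, value in teams.items():
            if team in following:
                pairs.append((value, following[team]))
    return pairs

# Hypothetical placeholder SH% values for three fictional teams.
toy = {
    2007: {"A": 7.0, "B": 8.0, "C": 9.0},
    2008: {"A": 8.0, "B": 9.0, "C": 7.0},
    2009: {"A": 9.0, "B": 7.0, "C": 8.0},
}
pairs = season_to_next_pairs(toy)
r = pearson([p[0] for p in pairs], [p[1] for p in pairs])
print(len(pairs), round(r, 3))
```

With real data, each (team, season) pair that has a following season contributes one point, so five seasons of 30 teams would give four season-to-season transitions.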
Doing the same thing to predict goal percentage from goal percentage (the percentage of goals in your games that are yours), I found a stronger relationship, a correlation of 0.27.
And, as the hockey analytics community has shown, this year's Corsi is even better for predicting next year's goal percentage. That correlation came out to 0.40.
So: Corsi is an excellent predictor, and shooting percentage appears to be just random. You'd expect, then, that if you try to predict next year's goals from this year's Corsi, knowing this year's shooting percentage wouldn't be much help. Right?
Well, apparently, that's *not* right. I ran that regression, and, surprisingly, shooting percentage was significant, both statistically and practically. The SH% variable had a p-value of .09, and the actual effect was large.
It turned out that every point of shooting percentage was worth 0.83 points of goals next year. So, if the league shooting percentage was 7.5%, but you had 8.5%, then, all things being equal, you should score 50.83% of goals next year.
Compare that to Corsi. Every point of Corsi was worth 0.72 points of goals next year. So, all things being equal, if this year you took 51% of shots, next year you should score 50.72% of goals.
So, the two were roughly similar, on a percentage-point-by-percentage-point basis.
In the particular case of the 2012-13 Maple Leafs ... they had a shooting percentage of 10.82, as compared to the league average of 7.77 (actually, that 7.77 weights all teams equally, so the true mean is probably a bit lower than that, but never mind). The difference of around 3.1 points corresponds to a goal percentage difference of 2.6. That's quite a lot.
Of course, the Leafs' low Corsi percentage of 43.8 more than makes up for that. So, I'm NOT saying this is a *good* strategy on the part of the Leafs (if, in fact, it *is* a strategy). I'm just saying that I think that (a) the high shooting percentage is partly real, and (b) the low Corsi is partly the flip side of the high shooting percentage.
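To make the trade-off concrete, here's a small sketch of the arithmetic using the five-season coefficients quoted above (0.83 per point of SH%, 0.72 per point of Corsi). The function name is made up, and the 50% Corsi baseline is an assumption (a league-average team takes half the shots). Note that the code uses the unrounded SH% gap of 3.05, so it gives about 2.5 rather than the 2.6 you get from rounding the gap to 3.1 first.

```python
SH_COEF = 0.83      # next-year G% points per point of SH% (five-season regression)
CORSI_COEF = 0.72   # next-year G% points per point of Corsi% (five-season regression)

def goal_pct_shift(sh_pct, corsi_pct, league_sh_pct, league_corsi_pct=50.0):
    """Return (SH% effect, Corsi effect, net) in next-year G% points vs. an average team.

    league_corsi_pct=50.0 is an assumed baseline, not a figure from the post.
    """
    sh_effect = SH_COEF * (sh_pct - league_sh_pct)
    corsi_effect = CORSI_COEF * (corsi_pct - league_corsi_pct)
    return sh_effect, corsi_effect, sh_effect + corsi_effect

# 2012-13 Leafs: SH% 10.82 vs. league 7.77, Corsi% 43.8
sh_effect, corsi_effect, net = goal_pct_shift(10.82, 43.8, 7.77)
print(round(sh_effect, 2), round(corsi_effect, 2), round(net, 2))
```

The Corsi term comes out larger in magnitude than the SH% term, which is the "more than makes up for that" point.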
But the relationship might even be stronger than that. The above coefficients were for a regression that included five seasons' worth of data, 2007-08 to 2011-12. If I limit it to only the most recent three seasons, the results are even more striking.
For the past three seasons, the coefficient of SH% is substantially higher, at 1.21. Meanwhile, the coefficient of Corsi stays pretty much the same, at 0.82. Here, let me give you the regression equation:
Next Year G% = 0.82 (this year Corsi%) + 1.21 (this year SH%) - 0.12
In the Leafs' case, their 3.1-percentage-point advantage in SH% translates to a 3.75-point increase in next year's goal percentage. Of course, their 6.2-point disadvantage in Corsi works out to a 5.08-point *decrease* in goal percentage. Again, I'm not saying the Leafs are superstars or geniuses ... I'm just arguing that you might have to offset some of their exceptionally low Corsi with some of their exceptionally high SH%.
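The three-season equation can also be evaluated directly. This is a sketch that assumes both inputs are expressed in percentage points (50 for Corsi, 7.77 for SH%), which is the reading under which an average team lands near 50% of goals. The function name is made up for illustration.

```python
def predict_next_goal_pct(corsi_pct, sh_pct):
    """Three-season regression from the post, inputs and output in percentage points."""
    return 0.82 * corsi_pct + 1.21 * sh_pct - 0.12

# 2012-13 Leafs: Corsi% 43.8, SH% 10.82
leafs = predict_next_goal_pct(43.8, 10.82)

# A league-average team, assuming 50% Corsi and the 7.77 average SH% quoted earlier
average = predict_next_goal_pct(50.0, 7.77)

print(round(leafs, 2), round(average, 2), round(leafs - average, 2))
```

Under those assumptions the Leafs project to roughly 48.9% of goals against about 50.3% for the average team, a gap of about 1.4 points. That is close to netting the 3.75 gain against the 5.08 loss; the small difference comes from rounding the SH% gap to 3.1.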
Now, you might argue that I'm guilty of selective sampling, when I choose to look at only the last three seasons. Which is true, except that I think you can justify the apparent cherry-picking, because there's other evidence that recent seasons are, in fact, different.
And that difference is actually the most interesting thing I found in these numbers -- there seems to have been a sudden, recent change in the relationship between shooting and winning.
Two years ago, Bruce McCurdy posted a study investigating the relationship between outshooting and winning games. He found that from 1997-98 to 2007-08, the team with more shots on goal in a game was significantly more likely to win. In each of those seasons, the team with more shots had an aggregate winning percentage above .500. (The league-seasons ranged from about .505 to .550. The percentages were computed as the percentage of game points received, so as to create a league average of .500. And the study counted all shots, not just tie-game shots.)
So, back then, if you got more shots, you tended to win the game.
But, in 2009-10, the situation reversed -- taking more shots meant you were more likely to have *lost* the game. That year, the team with more shots had a winning percentage of only around .490.
In 2010-11, it got even more extreme. The team with more shots played at only a .475 rate. Put another way, teams that were outshot posted a .525 record! That's almost the exact reverse of what it had been in prior years.
Check out McCurdy's table from his article: 25 of the 30 teams had higher winning percentages in games where they were outshot.
That's kind of shocking, in two ways. First, who would have thought that getting outshot would tend to be connected to winning? And, second, who would have thought that the relationship between shooting and winning would reverse itself within two seasons, with hardly anyone noticing?
McCurdy's study was written immediately after the 2010-11 season; I don't know if the reversal continued in later years. However, I did calculate that last year, Toronto's record fit the pattern. When the Leafs outshot their opponents, they were 5-7-0. When they were themselves outshot, they were 21-10-5.
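To put those two records on a common scale, here's a quick sketch that converts them to a points percentage. It assumes the records read as wins, regulation losses, and overtime/shootout losses, with two points for a win and one for an overtime loss. That is the plain points percentage, not McCurdy's version, which divides by all the points handed out in the games so that the league averages .500.

```python
def points_pct(wins, losses, ot_losses):
    """Share of available standings points earned (2 per win, 1 per OT/shootout loss)."""
    games = wins + losses + ot_losses
    if games == 0:
        raise ValueError("no games played")
    return (2 * wins + ot_losses) / (2 * games)

outshooting = points_pct(5, 7, 0)    # Leafs when they outshot the opponent
outshot = points_pct(21, 10, 5)      # Leafs when they were outshot

print(round(outshooting, 3), round(outshot, 3))
```

On that scale the Leafs come out around .417 when outshooting and around .653 when outshot.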
Here's another, similar, bit of evidence. In a recent post (hat tip: Tango), garret9 points to some correlation numbers from JLikens. Those look at early season numbers to try to predict future team wins in the same season.
In 2007-08, tie-score Corsi predicts the near future much better than goals. For instance, if you use the first 40 games of Corsi to predict the next 40 games of winning percentage, the correlation is .408. But if you use the first 40 games of goals, the correlation is only .312.
In 2008-09, Corsi is still better, but less so. The corresponding r values are .569 for Corsi, and .488 for goals.
But, in 2009-10, the difference disappears. Now, Corsi and goals are almost the same, at .419 and .409, respectively. As you move later in the season, the order reverses. If you try to predict 30 games from 50, Corsi and goals actually "tie" at .396. Predicting the last 20 games from the first 60 (or 10 from 70), goals "wins" outright.
That probably wouldn't happen if shooting percentage wasn't measuring something real, would it?
What would cause these sudden changes? Well, it could be that teams started to play an even more extreme style of hockey when leading. Maybe it used to be that, when ahead in the game, teams would get outshot at a rate of, I dunno, 30-25, or something. But, in recent years, maybe teams decided to switch to an even more "opportunistic" style, where they limit their offensive possessions and just capitalize on opponent mistakes, so then they started to get outshot at, I dunno, 33-22, or something.
I just made those numbers up, for illustration, but maybe that's what could be happening.
Hey, wait, I can check that. Seriously, as I'm writing this, I hadn't looked. Hang on.
Nope, it doesn't seem like that's the answer. Here are the Corsi percentages for teams behind by one goal:
It looks pretty level, with no obvious evidence of changes in "behind by 1" Corsi patterns.
So, if it's not that, what is it? And, does anyone know if the "getting outshot means it's more likely you won" effect continued after 2010-11?
Anyway, even without fully understanding what's going on, I think all this constitutes enough evidence that we need to take the "shot quality matters" hypothesis seriously. As we've seen:
1. Since 2011-12, there's a negative correlation between Corsi and SH%;
2. Since 2010-11, a high shooting percentage this year is an indicator of more goals next year;
3. In 2009-10, goals suddenly became as good a within-season predictor as Corsi;
4. In 2009-10 and (especially) 2010-11, outshooting your opponent became an indicator of losing the game, rather than winning.
I guess it's possible that what we're looking at is just random luck, and that the same stroke of random luck created all four phenomena. But ... really? I think that, at this point, it seems that some kind of "not luck" explanation is more reasonable.
Much more plausible than luck, IMO, is that I've just done something wrong, or that I haven't interpreted the evidence correctly, or that there are other arguments that counterbalance what I've said here, or that I just screwed up the data somehow.
And, again, I'm not saying that this is *proof* that shot quality and Corsi are related, just that it's good enough evidence that you have to at least consider it. Seriously, doesn't it seem like there might be something there?
UPDATE: On October 29, one day before this post, Tyler Dellow published a post about the anomaly where winning teams switched from outshooting to getting outshot. (Save it now if the link works; his site is likely to be down a fair bit.)
The trend of winning teams getting outshot continued in 2011-12, but then things went back to "normal" in 2012-13 and so far in 2013-14.
Tyler promises an investigation of what changed, which I'm looking forward to, because I have no idea.
(There are seven parts. Part II was previous. This is Part III. Part IV is next.)