Tuesday, December 09, 2014

Corsi vs. Tango

Tom Tango is questioning the conventional sabermetric wisdom on Corsi. In a few recent posts, he presents evidence that Corsi can be improved upon as a predictor of future NHL team performance. Specifically: goals are important, too.

That has always seemed reasonable to me. In fact, it seems so reasonable that I wonder why it's disputed. But it is. 

Goals is just shots multiplied by shooting percentage (SH%). The consensus among the hockey research community is that their studies show SH% is not a skill that carries over from year to year. And, therefore, goals can't matter once you have shots.

I've been disputing that for a while now -- at least seven posts worth (here's the first). But I've been doing it from argument. Tango jumps right to the equations. He split seasons in half randomly, and ran a regression to try to predict one half from the other half. Goals proved to be very significant. In fact, when you try to predict, you have to weight goals *four times as heavily* as Corsis. (Five times as heavily as unsuccessful Corsis.)
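Here's a minimal sketch of what that weighting looks like. It assumes the statistic is simply Corsi plus four extra Corsis for every goal; the function name `tango_stat` and the default weight are my own illustration, not Tango's code, and his exact coefficients may differ.

```python
def tango_stat(corsi, goals, goal_weight=4.0):
    """Corsi with extra weight on goals (illustrative sketch only).

    `corsi` counts every shot attempt, goals included, so a goal ends up
    worth 1 + goal_weight = 5 times an unsuccessful attempt.
    """
    return corsi + goal_weight * goals


# One unsuccessful attempt vs. one attempt that scored
miss = tango_stat(corsi=1, goals=0)
goal = tango_stat(corsi=1, goals=1)
print(miss, goal, goal / miss)
```

Since a goal is already counted once as a Corsi, "four times as heavily as Corsis" and "five times as heavily as unsuccessful Corsis" are the same statement.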

In a tongue-in-cheek jab at statistics named after their inventors, he called that new statistic the "Tango." 

Despite Tango's regression results, the hockey analysts who commented still disagree. I'm still surprised by that ... the hockey sabermetrics community are pretty smart guys, very good at what they do, and a lot of them have been hired by NHL teams. I've had times when I've wondered if I'm missing something ... I mean, when it's 1 dabbler against 20 analysts who do this every day, it's probably the 20 who are right. Well, now it's two against the world instead of one ... and the second is Tango, which makes me a little more confident. 

Also ... Tango jumps right to the actual question, and proves goals significantly improve the prediction. That's hard to argue with; at least, it's harder to argue with than what I'm doing, which is attacking the assumption that shooting percentage is all random. 

You can see one response here, and Tango's reply here.  

Tango got his data from war-on-ice.com, who agreed to make it available to all (thank you!!!). I was planning to do some work with the data myself, but ... I guess Tango and I think about things from different angles, because the more I thought about it, the more I kept coming up with my "usual," less direct arguments. So, there'll be another post coming soon. I'll play with the data when my thoughts wind up somewhere that I need to look at it.

For this post, a few of my observations from Tango's posts and the discussion that followed.


In one of his posts, Tango wrote,

"One of the first things that (many) people did was to run a correlation of Corsi v Tango, come up with an r=.98 or some number close to 1 and then conclude: “see, it adds almost nothing”. If only that were true. "

Tango is absolutely right (you should read the whole thing). It's just another case of jumping to conclusions from a high or low correlation coefficient.

Sabermetrics is pretty good at figuring out good and bad. It has to be -- I mean, even fans and sportswriters are pretty good at it, and the whole point of sabermetrics is to do better. We're already in "high correlation" territory, able to separate the good teams and players from the bad teams and players pretty easily. 

Find a 10-year-old kid who's a serious sports fan -- any sport -- and get him to rank the players from best to worst -- whether by formula, or by raw statistics. Then, find the best sabermetric statistic you can, and rank the players that way.

I bet the correlation would be over 0.9. Just by gut.

We're already well into the .9s, when it comes to understanding hockey. Any improvements are going to be marginal, at least if you measure them by correlation. And so, it follows that *of course* Tango and Corsi are going to correlate highly. 

Also, and as Tom again points out, if Corsi already has a high correlation with something, at first glance, Tango can appear to increase it only slightly. If you start with, say, 0.92, and Tango improves it to 0.93 ... well, that doesn't look like much, intuitively. But it IS much. If you look at it another way -- still intuitively -- it was 8% uncorrelated before, and now it's only 7% uncorrelated. You've improved your predictive ability by 12%!
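The arithmetic behind that last number, as a quick sketch. It uses the same informal "uncorrelated" framing as the paragraph above (1 minus the correlation), which is an intuitive yardstick and not a formal statistical measure; the function name is just for illustration.

```python
def relative_improvement(r_before, r_after):
    """Share of the 'uncorrelated' remainder (1 - r) that the new stat removes."""
    gap_before = 1 - r_before
    gap_after = 1 - r_after
    return (gap_before - gap_after) / gap_before


# 0.92 -> 0.93: the gap shrinks from 8 points to 7, one-eighth of what was left
print(round(relative_improvement(0.92, 0.93), 3))
```

One-eighth is 12.5%, which is where the "12%" comes from.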

The point is, you have to think about what the numbers actually mean, instead of just having it click "93 isn't much bigger than 92, so who cares?"

Tom illustrates the point by noting that, even though Tango and Corsi appear to be highly correlated to each other, Tango improves a related correlation from .44 to .50. There must be some significant differences there.


There's a more important argument, though, than just not underestimating how much better .93 is than .92. And that argument is: *it's not about the correlation*. 

Yes, it's nice to have a higher and higher r-squared, and be able to reduce your error more and more. But it's not really error reduction we're after. It's *knowledge*. 

It's quite possible that a model that gives you a low correlation actually gives you MORE of an understanding of hockey than a model that gives you a high correlation. Here's an easy example: the correlation between points in the standings, and whether or not you make the playoffs, is very high, close to 1.0. The correlation between your Corsi and whether or not you make the playoffs is lower, maybe 0.7 or something (depending how you do it -- which is another reason not to rely on the correlation alone). 

Which tells you more about hockey that you didn't already know? Obviously, it's the Corsi one. Everyone already knows that points determine whether you qualify for the playoffs. When you find out that shots seem to be important, that's new knowledge -- the knowledge that outshooting your opponents means something. (Of course, *what* it means is something you have to figure out yourself -- correlation doesn't tell you what type of relationship you've found.)

And that's what's going on here. Corsi has a high correlation with future winning, but Corsi *and goals* has an even higher correlation (to a significant extent). What does that tell us? 

Goals matter, not just shots. 

That's an important finding!  You can't dismiss it just because the predictions don't improve that much. If you do, you're missing the point completely. 

You wouldn't do that in other aspects of life, would you? Those faulty airbags in the news recently, the ones that kill people with shrapnel ... those constitute a small, small fraction of collision deaths. If you looked only at predicting fatalities, knowing about those airbags is a rounding error. 

But the point is not just to predict fatality rates!  Well, not for most of us. If you're an insurance company, then, sure, maybe it doesn't make that much difference to you, a couple of cents on each policy you write. But that doesn't mean the information isn't important. It's just important for different questions. Like, how can we reduce fatalities? We can reduce fatalities by replacing the defective air bags!

Also, the information means it is false to state that faulty airbags don't matter. You can still argue that they don't matter MUCH, relative to the overall total of collision deaths; that might be true. But for that, you can't argue from correlation coefficients. You have to argue from ... well, you can use the regression equation. You can say, "only 1 person in a million dies from the airbag, but 1000 in a million die from other causes."

In this case, Tango found that a goal matters four times as much as a shot. If, roughly speaking, there are 12 shots for every goal, then every 12 shots, you get 12 "points" of predictive value from the shots, and 4 "points" of predictive value from the goals. 

The ratio isn't 1000:1 like for the airbags. It's 3:1. How can you dismiss that? Not only is it important knowledge about hockey, but the magnitude of the effect is really going to affect your predictions.
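For anyone who wants to check the 3:1, here's the back-of-the-envelope version. The 12 shots per goal and the weight of 4 are the rough figures from above; the function and variable names are made up for the illustration.

```python
def predictive_points(shots_per_goal=12, goal_weight=4, shot_weight=1):
    """Split the 'points' of predictive value for one goal's worth of shots."""
    shot_points = shots_per_goal * shot_weight   # 12 shots at 1 point each
    goal_points = 1 * goal_weight                # 1 goal at 4 points
    return shot_points, goal_points


shot_points, goal_points = predictive_points()
ratio = shot_points / goal_points                          # shots-to-goals ratio
goal_share = goal_points / (shot_points + goal_points)     # goals' share of the total
print(ratio, goal_share)
```

Put another way, under these rough numbers the goals carry about a quarter of the total predictive weight.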


Why does the conventional wisdom dispute the relevance of goals? Because the consensus is that shooting percentage is random -- just like clutch hitting in baseball is random.

Why do they think that? Because the year-to-year team correlation for shooting percentage is very low.

I think it's the "low number means no effect" fallacy. Here are two studies I Googled. This one found an r-squared of .015, and here's another at .03. 

If you take the square root of those r-squareds, to give you correlations, you get 0.12 and 0.17, respectively.

Those aren't that small. Especially if you take into account how much randomness there is in shooting percentage. A signal of 0.17 in all that noise might be pretty significant.
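To put rough numbers on that, here's a small sketch. The first part is just the square roots. The second part leans on an assumption that is mine, not something from the two studies: if a team's true SH% talent is the same in both seasons and the luck in each season is independent, then the year-to-year correlation equals the share of the observed variance that is talent, and its square root is the talent spread as a fraction of the observed spread.

```python
import math


def r_from_r_squared(r_squared):
    """Correlation implied by an r-squared (taking the positive root)."""
    return math.sqrt(r_squared)


def talent_sd_fraction(year_to_year_r):
    """Talent SD as a fraction of observed SD.

    Assumes constant talent across the two seasons and independent luck,
    so that year_to_year_r = var(talent) / var(observed).
    """
    return math.sqrt(year_to_year_r)


for r_squared in (0.015, 0.03):
    r = r_from_r_squared(r_squared)
    print(r_squared, round(r, 2), round(talent_sd_fraction(r), 2))
```

Under those assumptions, a year-to-year correlation of 0.17 would mean the spread in talent is about 40% as wide as the spread in observed SH%. That's not nothing.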


It's well-known that both Corsi and shooting percentage change with the score of the game. When you're up by one goal, your SH% goes up by almost a whole percentage point -- and your Corsi goes down by four points. When you're up two or more, the differences are even bigger. 

That's probably because when teams are ahead, they play more defensively. Their opponents, who are trailing, play more aggressively -- they press in the offensive zone more, and get more shots. 

So, teams in the lead see their shots go down. But their expected SH% goes up, because they get a lot of good scoring chances when the opposition takes more chances -- more breakaways, odd-man rushes, and so on.

It occurred to me that these score effects could, in theory, explain Tango's finding. 

Suppose Team A has 30 Corsis and 5 goals in a game, and Team B has 30 Corsis and no goals. 

Even if shooting percentage is indeed completely random, team A is probably better than team B. Why? Because, with 5 goals, team A probably had a lead most of the game. If it had 30 Corsis despite leading, it must be a much better Corsi team to overcome the “handicap” of shooting so much when it’s ahead. So, when it’s behind, it’ll probably kick the other team’s butt.

I don't think Tango's finding is *all* score effects. And, even if it were, all that would mean is that if you didn't explicitly control for score, "Tango" would be a more accurate statistic than Corsi. And most of the hockey studies I've seen *don't* control for score.


Here's one empirical result that might help -- or maybe won't help at all, I'm actually not sure. (Tell me what you think.)

My hypothesis has been that some teams have better shooting percentages, along with lower Corsis, because they choose to set up for higher quality shots. Instead of taking a 5% shot from the point, they take a 50/50 chance on setting up a 10% shot from closer in. 

As I have written, I think the evidence shows some support for that hypothesis. In 5-on-5 tied situations, there's a negative correlation between Corsi rate and SH%. In 2013-14 (raw stats here), it was -0.16. In the shortened 2012-13 season, it was -0.04. In 2011-12, -0.14. 

Translating the -0.14 into hockey: for every additional goal due to shot quality, teams lowered their Corsi by 2.1 shots.

That's a tradeoff of around 2 Corsis per goal. Tango found 4 Corsis per goal. Does that mean that two of Tango's four Corsis come from shot quality, and the other two come from score effects?

Not sure. There's probably too much randomness in there anyway to be confident, and I'm not completely sure that they're measuring the same thing. But, there it is, and I'll think about it some more.


UPDATE, 30 minutes later: Colby Cosh pointed out, on Twitter, that Tango's regression used Corsi and goal *differential* -- offense minus defense.  That means the "goals against" factor is partially a function of save percentage, which partially reflects goalie talent, which, of course, carries over into the future. So, goalie talent absolutely has to be part of the explanation of the "goals" term.

So now we have two factors: score effects and goalie effects. Could that fully explain the predictive value of goals, without resort to shot quality?  I'll have to think about the actual numbers, whether the magnitudes could be high enough to cover the full "4 shots" coefficient.



At Tuesday, December 09, 2014 3:34:00 PM, Anonymous Anonymous said...

Hey Phil; It's great to have the pot stirred.:)
I ran just a basic linear regression on 6 years (458g) of game tied data P% - to each of our stats:
G/60 Ga/60 S/60 Sa/60 M/60 Ma/60 B/60 Ba/60 (I know it includes randomness with shootouts & perhaps 4on4)
My results in order of Greatest to least:
SF 0.42
GA 0.41
SA 0.38
GF 0.33
MA 0.32
Def. Blocked shots 0.26
Misses F 0.24
O Blocks att. 0.07

Does this mean generating shots is the most imp. skill?
then goals against (goalie skill)
then Shots against (team defense)
then Goals for (sh% skill)
Then Misses F and Block att(more team D skill)
Finally Blocks attempted

Is this of any value to our discussion?


At Tuesday, December 09, 2014 4:08:00 PM, Anonymous Anonymous said...

Can I just say that I really enjoyed the way you presented this and the examples you gave helped me to understand a lot more than the original Tango post about this. Also, I appreciate your approach. It was inviting and made my mind more open. So, thank you.

At Tuesday, December 09, 2014 10:12:00 PM, Anonymous Phil Kessel said...

Great article! Really helps explain the value-add of the "Tango". I agree that over the long-run Tango should take precedence over Corsi.

Two quick points:

1) I feel that there is an element of a strawman argument here as it relates to the pro-Corsi populace. You suggested that shooting percentage is largely dismissed as random. However, the vast majority of the "enlightened" media that I follow agree (granted, this could be selection bias) that shooting percentage is an important factor in evaluating players. Typically players' shooting percentages are compared with their career averages to determine whether it can be expected to continue. Which leads into my second point...

2) The reason goals are given less emphasis usually, in favour of shots, is because sh% tends to fluctuate wildly over short periods (as countless articles regarding PDO have shown). So would a metric that relies on goals not suffer similarly over short periods of time? It seems to me that Corsi retains the edge for smaller samples but that Tango would overtake it with full season sample sizes and larger. Would love to hear your thoughts.

Thanks and keep up the good work!

