Sabermetric Research: November 2014

Thursday, November 20, 2014

Does inequality make NBA teams lose?

I

Some people believe that income inequality can hurt group performance. They think that people work better together when employees are more likely to see themselves as equal.

I don't know if that's true or not. But it's a coherent hypothesis, that makes sense in terms of cause and effect.

On the other hand, here's something that doesn't make sense: the idea that when salaries are more unequal, the total becomes lower. That doesn't work, right? You can tell the CEO, "if you had paid your people more equally last year, the company would have done better." But you can't tell the CEO, "if you had paid your people more equally last year, they'd have collectively taken home more money."

Because, the relationship between total pay and individual pay is already known. The total is just the sum of the individuals. Equality can't possibly cause any additional pay, beyond adding up the amounts.

It would be like saying, "You shouldn't carry $50 bills and $1 bills in the same wallet. If you reduced inequality by carrying only $5 bills and $10 bills, you'd have more money."

That would be silly.

-------

Well, that's almost exactly what's happening in a recent NBA study, by the same poverty researcher who wrote the baseball inequality article I posted about three weeks ago.

The author looked up individual player Win Shares (WS) for the 2013-14 season. He measured Win Share inequality within each team by calculating the Gini Coefficient for the population of players. He then ran a regression to predict team wins from player inequality. He found a strong negative correlation, -0.43.

In other words, the more equal teams won significantly more games.

The author suggests this might be evidence of the benefits of equality. On the more equal teams, the better performance might have been created by the "psychological and motivational benefits" of the weaker players having "better opportunity to develop and showcase their skills."

But ... no, that doesn't make sense, for exactly the same reason as the $10 bill example.

Win Shares is really just a breakdown and allocation of actual team wins. The formulas take the number of games a team won, and apportion that total among the players. In other words, the team totals equal the sum of the individual totals, the same way the total amount in your wallet equals the sum of the individual bills. (*)

Last year, the Spurs won 62 games, while the 76ers won only 19. That can't have anything to do with equality. It's due to the fact that the Spurs had 62 win dollars in their wallets -- say, eleven $5 bills and seven $1 bills -- while the 76ers had only 19 win dollars -- say, a $10 bill and nine $1 bills.

It's true that the 76ers players' Win Shares were more concentrated among their best players. In fact, the top 5 percent of their players accounted for more than half the team total. But that doesn't matter. They had 19 wins total because they had a total of 19 wins individually.

If you want the Philadelphia 76ers to win 50 games this year, find players who add up to 50 Win Shares. It doesn't matter if you find ten guys with 5 WS each, or one guy with 30 WS and ten guys with 2 WS.
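To make that concrete, here's a minimal Python sketch using the two hypothetical rosters from the paragraph above. The `gini` helper is an assumption for illustration (the standard mean-absolute-difference formula), not necessarily the exact calculation the study's author used.

```python
def gini(values):
    """Gini coefficient: sum of all pairwise absolute differences / (2 * n * total)."""
    n = len(values)
    pairwise = sum(abs(a - b) for a in values for b in values)
    return pairwise / (2 * n * sum(values))

equal_roster = [5] * 10          # ten guys with 5 WS each
star_roster = [30] + [2] * 10    # one guy with 30 WS, ten guys with 2 WS

for name, roster in [("equal", equal_roster), ("star", star_roster)]:
    # The team total is just the sum of the individual Win Shares
    print(name, "total WS:", sum(roster), "Gini:", round(gini(roster), 2))
```

Both rosters total 50 Win Shares, even though the Gini goes from 0 to about 0.51. The inequality measure moves; the sum doesn't.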

In fairness to the author, he does explicitly say that the correlation does not necessarily imply causation here. But the point is: he doesn't realize that he's looking at a relationship where correlation CANNOT POSSIBLY imply causation.

And that's what I found so interesting about the study. At first reading, it looks like such a strong finding, that equality may cause teams to win more ... but after a bit of thought, it turns out it's logically impossible!

The only other time I remember seeing that kind of logical impossibility was that study "proving" that listening to children's music makes you older, by retroactively changing your year of birth. And that one had been created deliberately to make a point.

-------

II

As an aside, another thing I found interesting: in his baseball article, the author argued against unequal pay for baseball players because, he believed, pay seemed to have so little to do with actual merit. But, here, by measuring inequality in Win Shares instead of dollars, he seems to be arguing against inequality of merit itself!

Well, that may be just a tiny bit unfair. Reading between the lines, I think the author thinks Win Shares are much more heavily based on opportunity than they actually are. He writes, "maybe top teams, by virtue of their abundance of success, are more willing to share the glory ... Lack of opportunity, by contrast, can lead to despair and diminished performance."

But, actually, the author never demonstrates that the bad teams have more inequality of opportunity (playing time). I suspect that they don't.

In any case, we can see that the 76ers' high Gini isn't caused much by differences in opportunity. Even limiting the analysis to "regulars," players with 1,000 minutes or more, the effect remains. On the 76ers, the top two players had 53 percent of the regulars' total Win Shares. On the Spurs, it was only 29 percent.

-------

III

So why is it that the unequal teams tend to be worse? I think it's a combination of (a) the way the Gini coefficient measures inequality, and (b) the mechanism by which NBA performance creates wins.

Suppose that on a good team, the five regulars have field goal percentages (FG%) of 59, 57, 55, 53, and 51 percent, respectively. On a bad team, the five players are at 49, 47, 45, 43, and 41 percent.

If you measure inequality on the two teams by variance, it comes out equal: a standard deviation of 2.8 on each team. But if you measure it by Gini coefficient, or a similar calculation of "proportion of total wealth," they're different.

On the good team, the total percentage points add up to 275. The top player, with 59, has 21.5 percent of the total.

On the bad team, the total percentage points add up to 225. The top player, with 49, has 21.8 percent of the total. So, the bad team is equal by SD, but less equal by "percent of total."

The Gini is more than just the top player, of course ... the formula involves every member of the dataset. Using an online calculator, I found:

The Gini of the good team is 0.029.
The Gini of the bad team is 0.036.

So, by Gini, the bad team is less equal than the good team. (A higher Gini means less equality.)
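For anyone who wants to reproduce those figures without an online calculator, here's a rough Python sketch. It assumes the population standard deviation (which is what gives 2.8; the sample version would be about 3.2) and the mean-absolute-difference form of the Gini. Both choices are assumptions that happen to match the numbers above.

```python
from statistics import pstdev

def gini(values):
    """Gini coefficient: sum of all pairwise absolute differences / (2 * n * total)."""
    n = len(values)
    pairwise = sum(abs(a - b) for a in values for b in values)
    return pairwise / (2 * n * sum(values))

good_fg = [59, 57, 55, 53, 51]   # FG% of the good team's five regulars
bad_fg = [49, 47, 45, 43, 41]    # FG% of the bad team's five regulars

for name, team in [("good", good_fg), ("bad", bad_fg)]:
    top_share = 100 * max(team) / sum(team)   # top player's percent of the total
    print(name,
          "SD:", round(pstdev(team), 1),
          "top share:", round(top_share, 1),
          "Gini:", round(gini(team), 3))
```

The SD comes out the same for both teams, while the top player's share and the Gini are both a little higher for the bad team.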

Why does this happen, that the Gini is higher but the variance is the same? Because of the way the two measures differ. Variance stays the same when you *add* the same amount to every player. But not the Gini. The Gini stays the same when you *multiply* every player by the same amount.

If you *add* a positive number to every player instead of multiplying, the Gini drops. (And, if you *subtract* a positive number from every player, it increases.)

That's often what you want to have happen -- for incomes, say. If I make $50K and you make $10K, we're very unequal. But if you give both of us a $100K raise, now we're at $150K and $110K -- much more equal, intuitively.

The Gini confirms that. Before our raise, the Gini is 0.33. Afterwards, it's 0.08. (But if we use the variance instead, we look the same both ways.)
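Here's a small Python sketch of the add-versus-multiply difference, using the two salaries from the example (in thousands of dollars). The `gini` helper and the "doubled" case are assumptions added for illustration; the SD is the population version.

```python
from statistics import pstdev

def gini(values):
    """Gini coefficient: sum of all pairwise absolute differences / (2 * n * total)."""
    n = len(values)
    pairwise = sum(abs(a - b) for a in values for b in values)
    return pairwise / (2 * n * sum(values))

before = [50, 10]                          # $50K and $10K
after_raise = [x + 100 for x in before]    # add $100K to each
doubled = [x * 2 for x in before]          # multiply each by 2 (illustrative case)

for name, pay in [("before", before), ("raise", after_raise), ("doubled", doubled)]:
    print(name, "Gini:", round(gini(pay), 2), "SD:", round(pstdev(pay), 1))
```

Adding the raise leaves the SD alone and shrinks the Gini from 0.33 to 0.08. Doubling does the opposite: the Gini stays at 0.33 and the SD doubles.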

But for Win Shares, is the Gini-type of inequality really what we want? Are two players with 7 WS and 6 WS, respectively, really that much more equal, in an intuitive basketball sense, than two players with 2 WS and 1 WS? What about two players at 0.002 and 0.2 wins? In that case, one player has 100 times the wins of the other. But does "100 times" really give a proper impression of how different they are?

I don't think so. I think it's just an artifact of the way performance translates to wins.

What are wins? They're performance above replacement value. (Well, actually, WS is measuring above zero value, which is lower, but I'll call it "replacement value" anyway since the logic is the same.)

So, to get wins, you start with performance, and subtract a constant. As we saw, when you subtract the same positive number from every player, the Gini goes up. It's a "negative raise" that makes employees less equal.

Suppose the average FG% is 50 percent. Suppose that 40 percent is "replacement level" that leads to exactly zero wins, the level at which a team is so bad it will never win a game. Conversely, 60 percent is the level at which a team is so good it will never lose a game.

If the relationship is linear, it's easy to convert player FG% to Win Shares. Actually, I'll convert to "wins per 100 games," because the "out of 100" scale is easier to follow.

On the good team we talked about earlier, the players had FG% of 59, 57, 55, 53, and 51. That corresponds to W100 of 95, 85, 75, 65, and 55.

On the bad team, the players' FG% of 49, 47, 45, 43, and 41 translate to W100s of 45, 35, 25, 15, and 5.

See what happens? The FG% looks a lot more equal than the wins. On the bad team, the best player was only 20 percent better than the worst player in field goal percentage (49 vs. 41). But in wins ... he's 800 percent better! (45/5.) On the good team, though, there's still enough performance after subtracting that the numbers look reasonably equal.

The actual Gini coefficients:

FG%: The Gini of the good team is 0.029.
FG%: The Gini of the bad team is 0.036.

Wins: The Gini of the good team is 0.11.
Wins: The Gini of the bad team is 0.32.
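The conversion and the four Gini figures can be checked with a short Python sketch. The function name `w100` and its parameters are made up for this illustration; the linear mapping (40 percent gives zero wins, 60 percent gives 100) is the toy assumption from the text, not a real Win Shares formula.

```python
def gini(values):
    """Gini coefficient: sum of all pairwise absolute differences / (2 * n * total)."""
    n = len(values)
    pairwise = sum(abs(a - b) for a in values for b in values)
    return pairwise / (2 * n * sum(values))

def w100(fg_pct, zero_level=40, perfect_level=60):
    """Toy linear conversion from FG% to wins per 100 games."""
    return 100 * (fg_pct - zero_level) / (perfect_level - zero_level)

good_fg = [59, 57, 55, 53, 51]
bad_fg = [49, 47, 45, 43, 41]
good_wins = [w100(fg) for fg in good_fg]   # 95, 85, 75, 65, 55
bad_wins = [w100(fg) for fg in bad_fg]     # 45, 35, 25, 15, 5

print("FG%  Gini:", round(gini(good_fg), 3), round(gini(bad_fg), 3))
print("Wins Gini:", round(gini(good_wins), 2), round(gini(bad_wins), 2))
```

Same players, same gaps between them, but subtracting the 40-point baseline before measuring pushes the bad team's Gini from 0.036 to 0.32.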

That's just how the math works. The Gini coefficient is very sensitive to where you put your "zero". If you measure zero as 0 FG%, inequality looks low. If you measure zero as zero wins (say, FG% of 40 percent), inequality looks higher. If you measure zero as replacement level (say, FG% of 43 percent), inequality looks even higher. And if you measure zero as an NBA average team (say, FG% of 50 percent), it's even more unequal -- the top half of the teams have 100 percent of the wins! (**)

The higher the threshold that you call zero, the greater the inequality.
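As a quick sketch of that last claim, this Python snippet slides the "zero" up under the bad team's FG% figures and recomputes the Gini each time. The intermediate thresholds of 20 and 30 are arbitrary values chosen for illustration, and the `gini` helper is the usual mean-absolute-difference formula.

```python
def gini(values):
    """Gini coefficient: sum of all pairwise absolute differences / (2 * n * total)."""
    n = len(values)
    pairwise = sum(abs(a - b) for a in values for b in values)
    return pairwise / (2 * n * sum(values))

bad_fg = [49, 47, 45, 43, 41]
thresholds = [0, 20, 30, 40]    # where we choose to put "zero"

# Measure each player as "FG% above the threshold", then take the Gini
ginis = [gini([fg - zero for fg in bad_fg]) for zero in thresholds]

for zero, g in zip(thresholds, ginis):
    print("zero at", zero, "-> Gini", round(g, 3))
```

The Gini climbs from 0.036 to 0.064 to 0.107 to 0.32 as the threshold rises. Any threshold above 41 would push the worst player negative, which is where the measure starts to break down.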

In baseball, a player hitting .304 has only about 1% more hits than a player hitting .301. But he has 300% more "hits above .300".

In the economy, the top 10% of families may have (say) 45% of the income -- but probably close to 100% of the new Ferraris.

And a real-life example: In the NHL, over the last ten seasons, the Blackhawks have 13% more standings points than the Maple Leafs -- but 500% more playoff appearances.

-----

One last analogy:

Take a bunch of middle-class workers, and tax them $40,000 each. They become much more unequal, right? Instead of making, say, $50K to $80K, they now take home $10K to $40K. There's a much bigger difference, now, in what they can afford relative to each other.

But if you tax the same $40K away from a bunch of doctors, it matters less. They may have ranged from $200K to $300K, and now it's $160K to $260K. They're a bit less equal than before, but you hardly notice.

Measuring after the $40K tax is measuring "income above $40K," which is like measuring "FG% above replacement level of 40%" -- which is like measuring Win Shares.
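A rough Python sketch of the tax analogy follows. To avoid inventing a full salary distribution, it assumes each group is just two people sitting at the ends of the ranges given above (in thousands of dollars). That's a simplification for illustration only.

```python
def gini(values):
    """Gini coefficient: sum of all pairwise absolute differences / (2 * n * total)."""
    n = len(values)
    pairwise = sum(abs(a - b) for a in values for b in values)
    return pairwise / (2 * n * sum(values))

TAX = 40                 # the flat $40K tax
workers = [50, 80]       # assumed: two middle-class workers at the range endpoints
doctors = [200, 300]     # assumed: two doctors at the range endpoints

for name, pay in [("workers", workers), ("doctors", doctors)]:
    after_tax = [x - TAX for x in pay]
    print(name, "Gini before:", round(gini(pay), 3),
          "after:", round(gini(after_tax), 3))
```

Under those assumptions the workers' Gini jumps from about 0.115 to 0.3, while the doctors' barely moves, from 0.1 to about 0.119.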

So that's why bad teams in the NBA appear more unequal than the good teams -- because "Wins" are what's left of "Performance" after you levy a hefty replacement-level tax. Most of the players on the good teams stay middle-class after paying the tax -- but on the bad teams, while some stay middle class, more of the others drop into poverty.

It has nothing to do with the social effects of equality or inequality. It's just an artifact of how the Gini Coefficient and basketball interact.

------

* Actually, there's a bit of wiggle room in the particular version of WS the author used, the version from basketball-reference.com. That version doesn't add up perfectly, but it promises to be close, certainly close enough that it doesn't make a difference to this argument.

** That's if you give the bottom teams zero. If you give them a negative, the Gini actually winds up at infinity. (The overall total has to be zero relative to the average, and you can't divide by zero.)

Labels: basketball, income inequality, merit, NBA

Tuesday, November 04, 2014

Corsi, shot quality, and the Toronto Maple Leafs, part VII

In previous posts, I've argued that when it comes to shots, NHL teams might differ in how they choose to trade quantity for quality. That might partly explain why the Toronto Maple Leafs, for the past few seasons now, have had ugly-looking shot stats, but with an above-average shooting percentage.

Skeptics argue that team shooting percentage (SH%) doesn't seem to have predictive value from season to season, which suggests it's luck rather than skill or strategy. But, at the same time, Corsi for teams seems to have a negative correlation to SH%, which is one piece of evidence that shot quality strategy might be a real issue.

Anyway, read the previous six posts for that argument. This is just an anecdote.

It comes from a piece by James Mirtle, the Maple Leafs beat writer for the Globe and Mail. Mirtle notes that the Toronto coaching staff has directed Morgan Rielly to increase his shot attempts:

[In the October 28 game vs. Buffalo,] Rielly rang up two assists – including a beauty cross-crease pass on James van Riemsdyk’s goal – and was all over the puck generally, generating nine shot attempts.

That propensity to shoot has been Rielly’s biggest shift from a year ago. The coaches want him putting more pucks on the net, and he has responded in dramatic fashion, with 2.8 shots a game compared to 1.3 in his rookie year despite similar ice time.

Even more impressively, Rielly leads all NHL defencemen in generating shot attempts, with 21.6 per 60 minutes at even strength, meaning he’s getting a look at the net roughly every three minutes he’s on the ice.

He’s winding up more frequently than not only every Leafs defenceman but every Leaf, including shot demon Phil Kessel, something that’s helping drive Toronto to respectable totals on the shot clock most nights.

Entering [the October 31] game against the injury-plagued Blue Jackets in Columbus, the Leafs have been outshot, but only by one: 281-280.

"I told myself this year that I would shoot more," Rielly said.

Well, isn't that exactly the kind of thing Corsi skeptics should be looking for? It's evidence that coaching decisions can affect shot quantity and quality -- in other words, Corsi and SH%.

It's a small sample size -- the Leafs had played only nine games when Mirtle's piece came out -- but let's see what happens if we take Rielly's numbers at face value and make a few estimates.

Assume Rielly gets 20 minutes of ice time per game. If 80 percent of that is at even strength, it's 16 minutes at 5-on-5. Let's call it 15 to make the calculations easier.

Since he's generating 21.6 even-strength Corsis per 60 minutes, that's 5.4 even-strength Corsis per 15-minute game.*

*I'm assuming that the "shot attempts" in the article refers to Corsi. If it refers to Fenwick, the effect is even larger than what I'm about to calculate, because the denominators are smaller (since Fenwick leaves out blocked shots).

Rielly's shots roughly doubled since last year, so let's assume his Corsis doubled too. That means his increase from last year must be about 2.7 Corsis per game.

Last year, those extra 2.7 Rielly shot attempts would have been passes or stickhandles. Assuming half those attempts would have eventually resulted in shots by other players, the increase due to Rielly's shooting is down to 1.3.

How significant is 1.3 Corsis per game? In 2013-14, the Leafs were out-Corsied 4,342 to 3,259 at even strength, giving them a league-worst 42.9 Corsi percentage. If you add in 107 Corsis to the "for" side (1.3 times 82 games), it's now 4,342 to 3,366. That would bump Toronto to 43.7 percent. Now, only second worst.
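The chain of estimates above can be laid out in a few lines of Python. Every input is a figure from the post; the variable names and the `corsi_pct` helper are made up for this sketch, and the 1.3 is hard-coded because the post rounds 1.35 down.

```python
def corsi_pct(attempts_for, attempts_against):
    """Share of all shot attempts taken by the team, as a percentage."""
    return 100 * attempts_for / (attempts_for + attempts_against)

# Figures quoted in the post
CORSI_FOR, CORSI_AGAINST = 3259, 4342   # Leafs, 2013-14, even strength
ATTEMPTS_PER_60 = 21.6                  # Rielly, even strength
EV_MINUTES_PER_GAME = 15                # rounded-down estimate from the post
GAMES = 82

per_game = ATTEMPTS_PER_60 * EV_MINUTES_PER_GAME / 60   # about 5.4
increase = per_game / 2                                 # about 2.7 if attempts doubled
net_increase = 1.3   # half of 2.7 is 1.35; the post uses 1.3, so that is used here

extra = round(net_increase * GAMES)                     # 107 extra attempts per season
before = corsi_pct(CORSI_FOR, CORSI_AGAINST)
after = corsi_pct(CORSI_FOR + extra, CORSI_AGAINST)
print(round(before, 1), "->", round(after, 1))
```

That prints 42.9 -> 43.7, matching the numbers above. Changing `net_increase` is an easy way to see how sensitive the result is to the "half would have become shots anyway" guess.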

It's not huge, but it's something that would indeed show up in the stats. And, according to Mirtle, it's something that's due to a deliberate coaching decision.

How big would the effect be if the coaches decided everyone should shoot more, instead of just one defenseman whose minutes comprise only about six percent of total player ice-time?

-------

Also, you would think those extra shots would have to result in a reduction in shooting percentage, right? Last year, when Rielly wasn't shooting as much, it was probably because he thought he could set up a better quality shot some other way. And, I would assume, Rielly's shots are taken farther from the net than average, since defensemen usually play the point.

You could come up with a scenario where shot quality wouldn't drop ... maybe shots from the point lead to a lot of juicy rebounds, so long shots lead to a certain number of extra dangerous shots. Sure, that's possible. But I doubt that effect, or any other, would make up the quality difference completely. If there were *never* a tradeoff between quantity and quality, every team would be shooting all the time. So, there must be some level of "dangerousness" above which a point shot is a good idea, and below which a pass is better. For shot quality to stay the same when Rielly shoots more, all his new shots would have to come in situations where not only was the shot the best move, but the shot was SO dangerous from the point that it would even be higher quality than the best alternative from closer in.

That's unlikely to be happening if Rielly now leads the league in shot attempts by defensemen. There aren't that many ultra-super-dangerous shot opportunities, never mind ultra-super-dangerous shot opportunities that Rielly wouldn't have taken advantage of last year.

-----

As I write this, it's only eleven Leaf games into the season, which is a very small sample size. But I checked anyway. (Here's a YTD link that may be outdated if you're reading this later than today.)

In those 11 games, the Leafs have an above-average Corsi at 50.9% in 5-on-5 tied situations. But they've scored only 5 goals in 121 shots. That's a shooting percentage of 4.13%, dead last in the league.

There's not enough data for that to really be meaningful, but it's interesting nonetheless.

(There are seven parts. Part VI was previous. This is Part VII.)

Labels: corsi, hockey, NHL, shot quality, statistics
