Wednesday, November 29, 2006

Does "Win Score" overvalue rebounds?

In the past little while, there's been a debate about a basketball statistic from "The Wages of Wins" called "Win Score." The statistic, invented by authors Berri, Schmidt, and Brook, attempts to calculate how many wins each player contributed to the team. One of its forms is

Win Score = Points + Rebounds + Steals + ½*Assists + ½*Blocked Shots - Field Goal Attempts – ½*Free Throw Attempts – Turnovers – ½*Personal Fouls
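
To make the weights concrete, here is a minimal Python sketch of that formula. The function name `win_score` and the sample stat line are made up for illustration; only the weights come from the formula above.

```python
def win_score(pts, reb, stl, ast, blk, fga, fta, tov, pf):
    """Win Score as given above: full weight on points, rebounds, steals,
    field goal attempts and turnovers; half weight on the rest."""
    return (pts + reb + stl + 0.5 * ast + 0.5 * blk
            - fga - 0.5 * fta - tov - 0.5 * pf)

# Hypothetical single-game stat line (not a real player).
line = dict(pts=20, reb=10, stl=2, ast=4, blk=2, fga=15, fta=6, tov=3, pf=2)
with_rebounds = win_score(**line)
without_rebounds = win_score(**{**line, "reb": 0})
print(with_rebounds, without_rebounds)
```

Every rebound moves the total by exactly one point, the same as a point scored or a steal, which is the weighting the debate is about.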

The debate, for which details can be found at TWOW posts here and here, is this: does this statistic overrate rebounds?

King Kaufman believes it does. John Hollinger believes it does. I also believe it does.

First, the data shows that not every player has the same opportunity to try for a rebound. After a missed shot, only about 30% of rebounds are secured by the offense; the other 70% by the defense. (I got that 30% figure from this comment.)

Obviously, the circumstances in which players find themselves have a bearing on who gets the rebound. Otherwise, the breakdown would be 50-50, not 70-30.

So, for some reason, players have different chances of rebounding that are related to positioning, rather than raw skills. Crediting a player for plays he makes only because of his position tends to overrate the value of those skills. I don't know enough about basketball to know if or how certain players are somehow set up for more rebounds – but to the extent to which that happens, if any, rebounds will tend to be overvalued in players' accounts. Just like cleanup hitters have more RBI opportunities just due to circumstances, some players may have more rebounding opportunities due to circumstances. And the 70-30 split shows there is certainly some of that going on. And the more it's circumstances, the less it's skill on the part of the player.

To see why, consider a more extreme example. Imagine that the NBA institutes a new rule: the offense is prohibited from touching a rebound until it has bounced three times on the floor.

That rule change will do nothing to affect TWOW's regression or logic. A defensive rebound still constitutes a change of possession, and is therefore still worth exactly the same number of wins as it was before. But instead of 70% of rebounds going to the defense, the number is now 99%. Dennis Rodman might still snag a large proportion of rebounds, but now, instead of having to run and jump and position himself and maybe fight off an opposing player, he can just jog to where the ball is and pick it up.

Given that there is now no skill at all, doesn't it overrate Rodman to give him credit for those rebounds? Obviously, any excess rebounds picked up by Rodman, instead of his teammates, are positioning, luck, or opportunities given him by his coach and team. Even a caveman could get them.

The argument for 99% also applies to 70%, but to a lesser extent. Some, but not all, of Rodman's rebounds are, in effect, his team "letting him" have the ball more. Those are perhaps better classified as team rebounds, rather than individual rebounds. Since they aren't, Rodman winds up overrated.

That's opportunities. But there's a second reason rebounds are overrated, a much more important reason, and it has to do with the construction of Win Score itself.

It's the reason John Hollinger gives, the one TWOW disputes in the above links. That argument is that part (or even most) of the credit for a rebound should go to the other members of the team, for making the rebound possible. As Hollinger writes here, "missed shots can be rebounded while turnovers can't, and ... a defensive rebound is merely the completing piece of a sequence that began by forcing a missed shot."

Suppose the NFL makes a rule change. Starting immediately, a touchdown is worth zero points instead of six – but, to compensate, the convert [extra point] is now worth seven points instead of one. A touchdown and convert is still worth seven points total. And since almost all converts are good, this doesn't change scoring in the NFL very much.

But now, running a regression assigns the entire seven points to the kicker. So suddenly, kickers are overrated, because they get credit for seven points instead of one! There's a 90-yard drive... the quarterback takes the team down the field, the receivers make some great catches, the running back drags two defenders three yards down the field for a third-down conversion, and they finally get the ball into the end zone. But, if you do a regression, it's the kicker that gets all the points ... the rest of the players come out at zero!

And the regression is absolutely correct – all things being equal, only the kicker matters. It's the interpretation of the regression that's questionable.

Really, the touchdown drive and the kick are one unit. No matter how good the kicker is, the only way he can get an opportunity to try for seven points is to have the rest of the team score a touchdown first. We know in our gut that it's really the touchdown that's worth the seven points, not the kick, because that's where the important skills came out. But the regression has no idea where the skills lie. It has no idea about what really caused the points, in the human sense. It sees when a kick is good, that's seven points. When a kick is bad, it's zero points. And everything else is irrelevant.
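
The kicker example can be checked with a toy regression. This is only a sketch under made-up assumptions: the list of team-games is invented, field goals are ignored, and `fit_two_weights` is a hypothetical helper that solves the two-variable least-squares problem (no intercept) by hand.

```python
# Hypothetical team-games under the imagined rule: (touchdowns, converts made).
games = [(3, 3), (2, 1), (4, 4), (1, 1), (3, 2), (0, 0), (5, 4)]

def fit_two_weights(rows):
    """Least-squares weights for points ~ w_td * touchdowns + w_kick * converts."""
    stt = stk = skk = stp = skp = 0
    for td, kicks in rows:
        points = 7 * kicks  # imagined rule: touchdown worth 0, convert worth 7
        stt += td * td
        stk += td * kicks
        skk += kicks * kicks
        stp += td * points
        skp += kicks * points
    det = stt * skk - stk * stk  # nonzero because TDs and kicks are not collinear
    w_td = (stp * skk - stk * skp) / det
    w_kick = (stt * skp - stk * stp) / det
    return w_td, w_kick

w_td, w_kick = fit_two_weights(games)
print(w_td, w_kick)
```

On this invented data the fit puts a weight of seven on the kick and zero on the touchdown, even though no kick can happen without a touchdown first.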

A similar situation happens for rebounds in basketball. To get the opportunity for a defensive rebound (convert), the defense must first force the opposition to miss (touchdown). The defensive rebound is a combination of the two acts: good defense for up to 24 seconds, and one grab of the ball. Crediting the rebounder with the full value of the defensive play is like crediting the kicker with all seven points of the touchdown.

And to get the opportunity for an offensive rebound, the shooter must have missed a field goal attempt. Win Score sees the two events superficially – the missed field goal is a turnover, and gets scored as such, and the offensive rebound is treated like a steal back from the defense. The shooter is charged with minus one possession, and the rebounder is credited with plus one possession.

But that's the wrong weighting. Any field goal attempt has, intrinsically, built into it, the embedded feature that a missed shot results in a 30% chance of getting the ball back. The miss includes a consolation prize, a lottery ticket with a 30% chance of winning back the possession. The shooter figured that into his decision about whether to take the shot. That 30% chance belongs to the shooter. In effect, he hasn't wasted a whole possession with his miss, he's only wasted 70% of a possession. Remember Hollinger's point – a missed shot gives the team a chance to recover, but a turnover doesn't. Obviously, the shooter should be debited less for getting a shot away than for letting the shot clock expire.

I think the correct way to handle rebounds in a stat like Win Score is to start by ignoring them. Take the league average rebounding stats, and give the entire contribution to the shooter and defense.

For offensive rebounds, note that on average, a missed field goal causes no damage 30% of the time. And so give the shooter back his 30% and charge him with only 70% of a turnover.

For defensive rebounds, note that they are the statistically average outcome of a defense good enough to force a missed shot. And so give all the credit for defensive rebounds – 70% of opposition missed shots -- to the defense, and ignore the rebounder.

(Remember that assigning values this way is completely compatible with the empirical data. If you were to run a regression that leaves out rebounds entirely, those are the weights you'd get – 70% of a turnover for a missed shot by either team.)
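
A small sketch of the two bookkeeping schemes for the offensive side, assuming the league-average 30% offensive rebound rate used in this post. The function names are invented for illustration, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

ORB_RATE = Fraction(3, 10)  # share of misses rebounded by the offense (the post's 30%)

def win_score_accounting(misses):
    """Win Score style: shooter -1 per miss, rebounder +1 per offensive rebound."""
    shooter = -misses
    rebounder = ORB_RATE * misses
    return shooter, rebounder

def proposed_accounting(misses):
    """Proposal above: shooter charged 70% of a possession, rebounder nothing."""
    shooter = -(1 - ORB_RATE) * misses
    rebounder = Fraction(0)
    return shooter, rebounder

ws = win_score_accounting(100)
prop = proposed_accounting(100)
print(sum(ws), sum(prop))
```

At the league-average rate both schemes charge the team the same number of possessions. They differ only in who gets the credit, the shooter or the rebounder.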

After all that, if the team turns out to be different from average, we can figure out how much different, and assign the credit or debit to the players in proportion to what we think their contribution is. The hard part is figuring that out. Is Dennis Rodman a great rebounder with average opportunities, or an average rebounder with lots of opportunities? That's something you have to analyze properly, or you'll get bad results.

How much can the TWOW method overrate a rebounder? Let's take Kevin Garnett as an example. In 2005, the Timberwolves had 947 offensive rebounds and 3527 defensive rebounds. Garnett was responsible for about 16% of the team's playing time. If rebounding were exactly proportional to playing time, Garnett would have come in at 150 offensive rebounds and 559 defensive. His actual numbers were 247 and 861. Garnett got to 399 more rebounds than average, or about 56% more than expected.

Is that difference a matter of skill, or opportunity? It's hard to argue that it's completely a matter of skill. The average team gets 70% of defensive rebounds. If Garnett is 56% better, a team of five Garnetts would get 109% of defensive rebounds! Now, you could argue that the five Garnetts would get in each other's way and take rebounds away from each other – there's only one ball, after all. But if you argue that five Garnetts would take rebounds away from each other, then you have to admit that there are times when two players both have a chance to make the play. And, therefore, there must be cases where Garnett takes rebounds away from his existing teammates! And so we have deduced that not all of that 56% can be simply Garnett's exceptional skill, because some of his rebounds would be snagged by a teammate if he weren't there. There must be at least some effect of opportunity there, and possibly a lot.
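
The Garnett arithmetic from the last two paragraphs, as a short sketch. The expected totals (150 and 559) are taken as stated in the post rather than recomputed from the rounded 16% share, and the variable names are just for illustration.

```python
# Figures as quoted in the post.
expected_orb, expected_drb = 150, 559
actual_orb, actual_drb = 247, 861

expected_total = expected_orb + expected_drb   # rebounds if proportional to playing time
actual_total = actual_orb + actual_drb
excess = actual_total - expected_total
excess_pct = round(100 * excess / expected_total)

# "Five Garnetts": scale the 70% league-average defensive share by 1.56.
five_garnetts_pct = round(0.70 * 1.56 * 100)

print(excess, excess_pct, five_garnetts_pct)
```

A share above 100% is impossible, which is the point: the 56% edge cannot all be skill that would add up across teammates.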

Now take the other extreme -- suppose Garnett is just an average rebounder, and his numbers are completely the result of opportunity. Then Garnett is being credited with wins that should really be going to the defense (for defensive rebounds) and the shooters who missed (for offensive rebounds). 399 rebounds is worth 14 wins. When we reallocate the defensive-rebound wins among all the players, about two will come back to Garnett. If we reallocate the offensive-rebound wins among shooters who missed, maybe half a win will come back to Garnett. Call it three wins total.

So if Garnett's rebounding is simply a matter of other players deferring to him, Garnett would be overrated by 11 wins. That's huge. Instead of being responsible for 30 wins out of his team's 45, he'd be responsible for only 19.
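
The reallocation, step by step, using only the rough figures quoted above (14 wins for the extra rebounds, about two and a half coming back, 30 wins credited). Rounding the returned wins up to three follows the post's "call it three".

```python
wins_from_extra_rebounds = 14     # value of the extra rebounds, per the post
wins_returned = 2 + 0.5           # defensive share plus offensive share coming back
wins_returned_rounded = 3         # the post's "call it three"

overrated_by = wins_from_extra_rebounds - wins_returned_rounded
credited_wins = 30                # Garnett's wins under the TWOW method, per the post
adjusted_wins = credited_wins - overrated_by

print(overrated_by, adjusted_wins)
```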

The correct number is somewhere between 19 and 30. Logic and evidence suggest that it has to be at least somewhat lower than 30. And so I think Hollinger and Kaufman are right -- and that TWOW's Win Score does indeed seriously overvalue rebounders.



At Wednesday, November 29, 2006 10:32:00 AM, Blogger Phil Birnbaum said...

There is also discussion of the TWOW measures on the Sonics APBRmetrics board here.

At Wednesday, November 29, 2006 12:50:00 PM, Anonymous Anonymous said...

Experienced hoops stats geeks have always made a distinction between offensive and defensive rebounds. It makes no logical sense to put them together -- that would be like tracking a category for baseball players called "outs" which counted all the outs made by a player while playing as a batter and as a fielder.

I regressed team rebounding on individual player rebounding, for both offensive and defensive. The slopes are different. You can see the graph here.

At Wednesday, November 29, 2006 2:45:00 PM, Blogger Phil Birnbaum said...

Cool. From your post and the thread in the link you sent, I see why that distinction exists.

Good stuff, thanks for the link.

At Wednesday, November 29, 2006 2:54:00 PM, Blogger Phil Birnbaum said...

And, by the way, the thread at Ed's link does suggest strongly that many defensive rebounds are unrelated to rebounding skill. (This is probably old news to APBRmetricians, but new to me.)

That, of course, strongly supports the conclusion that giving rebounders 100% credit for their defensive rebounds will badly overrate the players with the most.

At Wednesday, November 29, 2006 4:42:00 PM, Blogger Bob Timmermann said...

Hey, down here we call 'em extra points or point after touchdowns, not converts!

In the early days of football, a touchdown was worth four points and so was the PAT. Eventually the values changed as American football became less "footy".

At Thursday, November 30, 2006 12:29:00 AM, Blogger Phil Birnbaum said...

Oops, I gotta learn to stop speaking Canadian ...

At Thursday, November 30, 2006 11:22:00 AM, Anonymous Anonymous said...

Interesting observations, Phil. I think another good analogy would be team defense in baseball. If you ran a regression using putouts and assists per batter faced (essentially OBP), it would do a very good job of predicting teams' runs allowed. But if you then assumed that players were contributing "value" based on their POs and As, you would conclude that catchers and first basemen provide almost half of all defense, while OFs hardly matter at all. The distribution of PO+A looks something like this:
C 18%
1B 27%
2B 14%
SS 12%
3B 8%
CF 7%
LF 5%
RF 5%
P 4%

Baseball fans know that this tells us virtually nothing about the defensive contribution of players. Giving catchers credit for pitcher strikeouts, for example, is silly. But I could 'prove', using regression, that my model does an excellent job of explaining runs allowed.

At Thursday, November 30, 2006 4:23:00 PM, Blogger Phil Birnbaum said...

Perfect analogy, Guy, wish I had thought of it.

At Friday, December 01, 2006 3:16:00 PM, Anonymous Anonymous said...

Beyond the problem of crediting one player with the value of RBs that could often have been made by other players, I wonder if WS doesn't overemphasize rebounds in the aggregate. Just playing around with 2005-2006 data, I find a correlation of .79 for actual wins and Win Score (using the formula in this post). If I simply remove rebounds -- disregard them entirely -- the R is .78, just as strong. It seems that including rebounds adds almost no explanatory power. However, if you remove points from the formula, the R drops to .55.
There is a weak correlation between Wins and RBs overall, but some of that is just the fact that a team with a better shooting percentage than its opponents will naturally have more RBs (as DRBs are easier to get than ORBs). Once you control for FG%, there can't be much there.

At the team level, including RBs doesn't seem to do any damage. But since it has a huge impact on individual player ratings, if it actually has little or no link to actual winning then that's a big problem for the metric.

Phil: Do the authors provide evidence in the book that RBs correlate with wins, or that including it in their metric improves accuracy?

* *

On a related issue, it seems to me that basketball stats might benefit by borrowing the replacement concept from sabermetrics. Berri says again and again that various metrics are clearly wrong because a below-avg percentage shooter can achieve a higher score by taking more shots -- as though this were self-evidently incorrect. But the best baseball metrics would also assign more value to a below-average hitter with 600 PAs than the same hitter with 200 PAs. And that's correct, so long as the player performed above replacement level.

So in basketball, Iverson is only hurting his team by taking a lot of shots if they have a player who can achieve a higher FG% with those possessions. And that means achieving a higher % even when Iverson is no longer drawing most of the attention of the best defenders. Maybe the 76s do have such players -- I don't know -- but Berri doesn't even seem to understand that this matters.
Too often, his analysis fails to ask the basic question: "as opposed to what?"

At Friday, December 01, 2006 5:21:00 PM, Anonymous Anonymous said...

Ah, I see a problem with my first point: as long as points is in the equation, that will capture much of the value of RBs (which lead to more points). Still, it seems odd that the WS metric essentially values points on an above-average basis (by subtracting FGAs), yet values RBs on an absolute basis. Mixing the two seems likely to overrate RBs.

