David Berri's FAQ and rebounding
A few years ago, "The Wages of Wins" introduced a basketball rating statistic called "Wins Produced" (WP). Since the book came out, there's been some debate about whether WP has a problem with rebounds when evaluating players. I have argued that it does; two of my posts on it are here and here, but you can probably find others elsewhere.
Dave Berri has recently updated a FAQ that tries to take on us doubters. I'll get to that, but first I guess I should summarize the disagreement, since it's been a few years.
WP values a rebound at +1 point. That's because a rebound takes a possession that would go to the other team, and effectively eliminates it. Since a possession is worth about one point, on average, so is the rebound.
There's no argument about that. The argument is about *who should get credit* for the +1 point. "The Wages of Wins" (or, more specifically, Berri, who is co-author and main blogger and spokesperson for the book), gives the entire +1 to the player who snagged the ball. Others argue that this is wrong.
The opposing argument goes something like this:
When a shot is missed, the ball is likelier to go some places than others. Whoever is standing in one of those places is more likely to pick up the rebound. When you award the entire value of the rebound to that player, you are mostly rewarding him for being in that spot. As Guy pointed out in a comment way back, it's like putouts in baseball. Getting an out is quite valuable, and many putouts are made at first base. But that doesn't mean that your 1B is five times more valuable than your CF just because he makes five times the putouts. His high total is because of where he plays.
It's obvious that this is also true in basketball. Here is one sample of overall rebounding percentage based on position played:
13.8% Power Forward
5.9% Point Guard
Obviously, it's not the case that centers are 250% as good at rebounding as point guards -- it's just that because of the way offenses and defenses work, they happen to be in position for a rebound much more often. That's why Berri adjusts WP scores for position. Otherwise, the numbers wouldn't make sense, and point guards as a group would look like they're horrible basketball players.
A slight variation on this is "diminishing returns". This is an argument that, when you have one player who snags a lot of rebounds, it's not just that he's good at rebounds -- it's that he's being given more opportunities that would otherwise go to his teammates. Perhaps he's also going into other players' "territory" to get them. Or, perhaps, the team has assigned him the role of primary defensive rebounder, reducing other players' rebounding responsibilities to allow them to better transition to offense.
If that's the case, a player shouldn't necessarily get credit for every extra rebound, because, if he didn't get it, one of his teammates would have. That is, there are diminishing opportunities available to the other four players on the team.
So, while there's no question that a rebound is worth +1 to the team, it certainly doesn't seem that the full value should be credited to the skill of the individual player.
Berri, however, is not as convinced. He acknowledges that *some* diminishing returns are at work, but he still gives the entire +1 to the player who picked up the rebound.
OK, now to Berri's FAQ. He makes four separate arguments for why his WP stat doesn't overvalue or misappropriate credit for rebounds. All four of those arguments, I think, are easily rebutted.
I'll go through them one at a time, using Berri's own numbering and titles. Keep in mind that I am *not* trying to provide evidence here for the other side of the debate -- I'm just trying to show why Berri's arguments do not prove his position.
Response #1 -- The Consistency of Rebounds
Rebounds per minute, for individual players, are consistent from year to year. Berri reports a correlation coefficient of over .9, and 0.83 even after adjusting for position played. This is higher than similar correlations in other sports. For instance (examples are Berri's):
0.65 -- Baseball OPS
0.47 -- Baseball batting average
0.37 -- Baseball ERA
0.36 -- NFL rushing yards per attempt (for running backs)
0.24 -- NHL goalie save percentage
0.07 -- NFL QB interceptions per attempt
First, you can't really compare the numbers that way. The actual correlations depend on all kinds of things other than skill -- mostly, the number of opportunities and the variance of the circumstances in which those opportunities happen. The fact that one correlation coefficient is higher than another doesn't necessarily mean that the underlying cause is more consistent.
But, suppose we let that go, and assume, with Berri, that rebounding is more consistent than (say) batting average.
So what? Even if you show that rebounding is consistent, that doesn't prove that rebounding is a skill. To go back to Guy's analogy, the consistency of putouts in baseball would be just as high: Albert Pujols had a lot of putouts in 2009 and 2010, and Alex Rodriguez had a lot fewer putouts in both 2009 and 2010. That doesn't prove that Pujols is a much better "putouter" than A-Rod ... it just proves that Pujols plays first base and Rodriguez plays third base.
To that, you could argue that the analogy isn't perfect, because Berri did adjust by position. Still, there are other reasons you could get a high baseball correlation, other than skill. Maybe some 3B play for teams with pitchers who give up a lot of ground balls, and others don't. That would create a higher correlation, while having nothing to do with skill. Maybe some LF play for teams with lots of RH pitching, so they get fewer fly balls hit to them. And so on.
Or, consider saves. There is a very high correlation between saves one year and saves the next year. That doesn't mean that David Aardsma has a talent for saves, but Felix Hernandez doesn't. It just means that, even though they play the same position on paper -- pitcher -- they are used in very different ways. In this case, the consistency isn't of talent, but of managerial decision-making. (I previously expanded on this thought here.)
So, when Berri says,
"When we look at rebounds, we see a higher correlation than all of these [other sports'] statistics. This leads one to conclude that rebounding is a skill that is primarily about the player credited with the rebound."
... it's obvious that doesn't follow. A high degree of consistency in rebounding rate could mean a consistency of talent, or it could mean a consistency of covering more of the other players' territory.
Consistency just means you're measuring something real. It doesn't mean that the "something real" is necessarily talent.
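Here is a rough Python sketch of that point. Everything in it is invented for illustration: 50 hypothetical players, all with exactly the same rebounding talent, who differ only in how many rebound chances their role hands them each season. It assumes equal minutes, so totals stand in for per-minute rates.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
TALENT = 0.5            # assumption: every player converts chances at the same rate

# Hypothetical roles: player i is sent after 200 + 16*i rebound chances per season,
# and keeps the same role in both seasons.
chances = [200 + 16 * i for i in range(50)]

def season(n_chances):
    """Rebounds collected in one season, given identical talent for everyone."""
    return sum(rng.random() < TALENT for _ in range(n_chances))

year1 = [season(n) for n in chances]
year2 = [season(n) for n in chances]
r = pearson(year1, year2)
print(f"year-to-year correlation: {r:.2f}")
```

The correlation comes out very high even though, by construction, nobody here is a better rebounder than anybody else. What repeats from year to year is the role, not the skill.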
Response #2 -- Rebounds Are Not the Same For All Teams
"If a player's rebounds are all "stolen" from his teammates, then teams would have to be getting the same number of rebounds. So do all teams end up with the same number of rebounds?"
As written, this is an egregious straw man. Nobody is saying that rebounds are *all* "stolen" from teammates -- just enough to make the raw statistic unreliable. And nobody is saying teams are exactly the same -- we're saying that teams show more similarity than you'd expect by just adding up the individuals. But I'll assume that Berri knows that, and is just exaggerating for effect.
To show how rebounds differ highly across teams, Berri goes on to compare various statistics by "coefficient of variation" (the SD divided by the mean). Again, as I have written before, that number is not meaningful in the way Berri thinks it is.
For offensive rebounding percentage, Berri gets a figure of .106, which is probably something like .027/.265. The .027 is the SD of OR%, and the .265 is the overall average.
But, what if you changed "offensive rebounding percentage" to "offensive rebounding missed percentage"? That is, suppose you start counting missed rebounds instead of made rebounds. In that case, the SD stays the same, but the mean reverses, from .265 to .735 (26.5% made is 73.5% missed). Now you get a "coefficient of variation" of .027/.735, which is about .037. That falls right in line with the other stats Berri cites (which range from .035 to .043). Still, that doesn't matter, because, as just a raw number, "coefficient of variation" has little to do with the subject at hand.
Intuitively, it may *look* like it does, at least to Berri. But it doesn't.
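The made/missed flip is easy to check with a few lines of Python. The five team rates below are made up (picked only so that their mean is .265); they are not Berri's data.

```python
from statistics import mean, pstdev

# Hypothetical team offensive-rebounding rates (illustrative values only).
made = [0.22, 0.25, 0.265, 0.28, 0.31]
# The same teams, counted by the share of offensive rebounds they did NOT get.
missed = [1 - p for p in made]

def coefficient_of_variation(values):
    """SD divided by the mean, as in Berri's FAQ."""
    return pstdev(values) / mean(values)

cv_made = coefficient_of_variation(made)
cv_missed = coefficient_of_variation(missed)

print(f"SD made:   {pstdev(made):.4f}   CV made:   {cv_made:.3f}")
print(f"SD missed: {pstdev(missed):.4f}   CV missed: {cv_missed:.3f}")
```

Same teams, same spread, same information, and yet the "coefficient of variation" shrinks by almost two-thirds just because the stat was counted from the other side.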
More generally, I don't understand Berri's argument that the more variation there is among teams, the more skill there is in the statistic. There's a lot more variation in sacrifice bunts than there is in batting average, isn't there? But bunting numbers vary mostly because of managerial decisions, not because of talent. The same is true for intentional walks by pitchers. And, to a lesser extent, it's also true for stolen bases.
Response #3 -- Do We Overvalue Rebounds?
Berri makes an argument that goes something like this: suppose rebounds were overvalued, the way his opponents think they are. Then, if we credit a player for only half the rebounds he makes (and spread the other half around to his teammates), that should change things a lot. But, when you look at the top 20 players in the league, the ranking doesn't change that much. (Chart in FAQ, or alone here.) And the new and old statistic correlate with each other at 0.95.
To which the response is:
First: It DOES make a significant difference in the rankings. Some of those top-20 players drop significantly. Carlos Boozer, for instance, goes from 16.2 wins to 12.5 wins. More importantly, it's the evaluations of Boozer's teammates that would change a lot. Since the Jazz players' stats still have to sum to Utah's total wins, Boozer's teammates will get quite a boost. The standings of the top 20 players may not change a whole lot, but, in the middle, where players are very close together, there will be a wholesale re-evaluation, with non-rebounders moving up and rebounders moving down.
Second: Of the top 20 players last year, 19 of them drop in total wins produced when you credit them only half their rebounds (the 20th one stays the same). That means that every one of the top 20 players was at or above his team's average in rebounds (otherwise, replacing half his rebounds with half his teammates' rebounds would make him look better). It looks like the average drop among the top 20 players is a win or two.
That means that it makes a big difference whether you get it right. If you're an NBA general manager, whether a player is worth 12 wins or 14 is very significant at contract negotiation time.
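A small Python sketch of the half-credit idea shows why above-average rebounders must drop. It assumes the shared half is split equally among five teammates, which is my simplification and not necessarily how Berri's chart was built; the rebound numbers are made up.

```python
def reallocate_half(rebounds):
    """Each player keeps half of his own rebounds; the other half of the
    team's rebounds is pooled and split equally (an assumed, simplified rule)."""
    pooled_share = 0.5 * sum(rebounds.values()) / len(rebounds)
    return {name: 0.5 * r + pooled_share for name, r in rebounds.items()}

# Hypothetical rebounds per game for a five-man unit (illustrative numbers only).
team = {"C": 12.0, "PF": 10.0, "SF": 6.0, "SG": 4.0, "PG": 3.0}
adjusted = reallocate_half(team)

for name in team:
    print(f"{name}: {team[name]:.1f} -> {adjusted[name]:.1f}")
```

The team total is unchanged, but anyone above the team average loses credit and anyone below it gains. So if 19 of the top 20 players drop, those 19 were above-average rebounders on their own teams.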
Third: A correlation coefficient of .95 does not imply that there's not much difference. It's true that .95 seems like a "big number," but you have to evaluate it in context. I feel pretty certain that I could take the established, proven values for baseball events, screw them all up to make them significantly wrong, and still come up with a .95 correlation to the original. I mean, think about it: any not-too-far wrong stat will put Babe Ruth at the top and Mario Mendoza at the bottom. In that light, mismeasuring some of the components will still leave the correlation pretty high.
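To put a number on that intuition, here is a toy Python sketch. It is not baseball or WP: it invents 100 players who each have two equally important, unrelated skills scored 0 to 9, then compares a "right" rating (skill A plus skill B) with a "wrong" one that counts skill B double.

```python
from itertools import product
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Hypothetical players: every combination of two skills scored 0..9,
# so the two skills are exactly uncorrelated and equally spread out.
players = list(product(range(10), range(10)))

right = [a + b for a, b in players]      # both skills weighted correctly
wrong = [a + 2 * b for a, b in players]  # skill B overvalued by 100%

r = pearson(right, wrong)
print(f"correlation between right and wrong ratings: {r:.3f}")
```

In this setup the correlation works out to 3 divided by the square root of 10, or about .949, even though one of the two weights is off by a factor of two.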
Response #4 -- WP isn't just about rebounds
This argument of Berri's says that rebounds aren't such a big deal in the entire context of the WP calculation. They're just one small part. Even if rebounds *were* misallocated, it doesn't matter all that much in context, not nearly enough to invalidate WP.
What's the evidence? Well, Berri shows how much a 1 percent change in various statistics changes the final value of WP:
+5.2% -- points per FG attempt
+3.2% -- rebounds
+1.2% -- free throw percentage
-1.1% -- personal fouls
-0.9% -- turnovers
+0.7% -- steals
+0.2% -- blocked shots
"Rebounding certainly matters. ... But WP is more "responsive" to shooting efficiency from the field."
Yes, except: players differ from one another far less in Points Per FG Attempt than they do in rebounding, so a 1% change in one is not comparable to a 1% change in the other.
For an analogy, consider baseball. The average player might hit .260 with 12 home runs. Now, a 100% change will increase home runs from 12 to 24 -- a significant increase, but not out of this world. On the other hand, a 100% change in batting average will have the player go from .260 to .520 -- which is pretty much impossible.
So the extent to which a statistic is influenced by one of its components is the product of two factors: "elasticity" (responsiveness to change), as Berri calculated, and the extent to which players actually differ in real life (that is, the variance). Berri has only considered the first.
What he could have done, instead, is something that's commonly done in other studies: show the response, not to a 1% change in the value, but to a 1 *standard deviation* change in the value. If Berri had done that, he would have noticed that the SD of rebounds is (I think) approximately 45% of average, while the SD of shooting percentage is only about 11% of average.
So, a 1 SD change in shooting percentage increases value by 5.2 times 11 -- 57.2%. And a 1 SD change in rebounds increases value by 3.2 times 45 -- 144%. So rebounds are indeed much more influential than shooting.
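The same arithmetic as a short Python sketch. The elasticities are the ones quoted from Berri's FAQ above; the SD figures are my rough guesses from the previous paragraph, not measured values.

```python
# Percent change in WP per 1% change in the stat (from Berri's FAQ, as quoted above).
elasticity = {"shooting": 5.2, "rebounds": 3.2}

# SD as a percentage of the league average (rough guesses from this post).
sd_pct_of_mean = {"shooting": 11, "rebounds": 45}

def effect_of_one_sd(stat):
    """Approximate percent change in WP from a 1-SD change in the stat."""
    return elasticity[stat] * sd_pct_of_mean[stat]

for stat in elasticity:
    print(f"{stat}: 1 SD moves WP by about {effect_of_one_sd(stat):.1f}%")

ratio = effect_of_one_sd("rebounds") / effect_of_one_sd("shooting")
print(f"rebounds / shooting: {ratio:.1f}x")
```

On these assumed SDs, a typical difference between players in rebounding moves WP roughly two and a half times as much as a typical difference in shooting.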
Now, in fairness to Berri, the real-life results won't be that extreme. Berri adjusted all players' stats by position, and, as we saw above, some positions rebound a lot more than others. The adjustment, therefore, will pull the SD of rebounding down. (Having said that, field goal percentage was also adjusted by position, and some positions probably shoot better than others too, so the SD of shooting percentage will drop too. But probably not as much.)
But my point is not to come up with a definitive answer to the question -- it's to argue that Berri's elasticity calculation doesn't mean what Berri thinks it does.
The strange thing is that, for this particular narrow question, it would actually make sense to compare correlation coefficients. You could look at the r (or even r-squared) for player rebounds vs. WP, and compare it to the one for player Points Per FG Attempt vs. WP. That would give you an intuitive idea of which season stat affects WP the most. But, in this case, Berri chose not to run a regression.
(And, while I know I promised not to argue for the facts either way, one note. Commenter Guy, in an e-mail, told me that last year, WP had a .75 correlation with rebounds, but only a .5 correlation with shooting percentage.)
So, that's why I think that Berri's four counterarguments are not relevant to the question of whether rebounds are misallocated to players. As for actual evidence and argument one way or another, there have been some posts lately at various basketball sabermetrics sites, that perhaps I will comment on in future. Here, for instance, is one of them -- both Guy and Berri make appearances.