Friday, January 28, 2011

Box-score statistics are the RBIs of basketball

Kevin Pelton has responded to my "basketball box score statistics don't work" post. (It's here, at Basketball Prospectus, but free even if you don't subscribe.)

He makes a couple of points.

First, rebounds. I argue that rebounds aren't necessarily a good measure of a player's skill in rebounding, because players vary in how much they "steal" rebounds from other players' territory. In response, Kevin shows that when players change teams, their rebounding numbers stay fairly constant, compared to other statistics. Doesn't that suggest, Kevin asks, that rebounds are relatively independent of the player's team, coach, and environment?

To which I answer: no, not really. I think players have a certain style of approaching rebounds, and that style doesn't necessarily change from team to team. I might be wrong about this, but if a player is known for his rebounding, it doesn't seem like the new coach will say, "yes, we thought player X was excellent at rebounding, which is why we acquired him, but we're now asking him to cover less territory and pick up fewer rebounds."

If it's a player who's *not* known for his rebounding, he might be traded to a team that has a famous "stealing" rebounder. His rebounds will go down, but not by very much: the headline rebounder takes his extra rebounds from among many teammates, so the effect on any individual teammate is small.

That's why the correlation doesn't change much.

Compare that to field goal percentage, where the correlation changes a lot when players switch teams. That's because of a big difference between rebounds and field goal percentage. With rebounds, one player takes from the others, and the team doesn't change much. With FG%, every player raises or lowers every other player with him, so the team changes a lot.

Obviously, you'll get a lower individual correlation when the sum of all the players varies a lot (FG%), versus when the sum of all the players varies only a little (REB).
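That logic can be checked with a quick simulation. This is a sketch under my own assumed numbers, not real NBA data: each player's observed stat is his stable skill, plus a team effect he re-draws when he changes teams, plus season noise. For a zero-sum stat like rebounds the team effect is tiny; for a shared stat like FG% it's large, and the year-to-year correlation drops accordingly.

```python
# Illustrative simulation (assumed parameters, not real data): why a stat
# with a big shared team component shows a lower year-to-year correlation
# for players who switch teams than a roughly zero-sum stat does.
import numpy as np

rng = np.random.default_rng(0)
n = 5000  # simulated players, each on a different team in year 2

skill = rng.normal(0.0, 1.0, n)    # stable individual skill
noise1 = rng.normal(0.0, 0.5, n)   # ordinary season-to-season randomness
noise2 = rng.normal(0.0, 0.5, n)

def year_to_year_corr(team_sd):
    # The player draws a fresh team effect each season (he switched teams).
    y1 = skill + rng.normal(0.0, team_sd, n) + noise1
    y2 = skill + rng.normal(0.0, team_sd, n) + noise2
    return np.corrcoef(y1, y2)[0, 1]

print("zero-sum stat, tiny team effect:", round(year_to_year_corr(0.1), 2))
print("shared stat, big team effect:   ", round(year_to_year_corr(1.0), 2))
```

The first correlation comes out well above the second, for exactly the reason in the text: the more of the stat that's shared team-wide, the less of the player's number survives the move.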


Second, Kevin especially disagrees with my conclusions from the Lewin-Rosenbaum (L-R) study. That study found that when you're trying to predict next year's wins based on this year's individual player stats, "minutes played" is a better predictor than some of the sabermetric basketball box-score stats.

Kevin says that, further on in the paper, the authors show a result more favorable to the box score stats, finding they correlate with plus-minus (which we both agree is an accurate stat if you can get past the random noise) much better than minutes played (MP).

He's right about that, and I should have mentioned that in my original post. Also, I shouldn't have given the impression that my conclusion is based on minutes played being better. That was just meant to be icing on the cake.

What I *should* have said was that, even though the correlations are higher for the new stats, that doesn't matter to my argument. My argument isn't that the new stats have a low correlation -- it's that they're biased.

Suppose that you have 30 centers. Some centers are better rebounders than others. And some centers "steal" rebounds more than others. In fact, some centers are big stealers, and some centers are "negative" stealers, in that they let their teammates get lots of rebounds they could have got instead.

If you rate each center by the actual number of rebounds he takes in, you are going to be biased in almost every case, because of the "stealing" variable.

Suppose the average skill is 11 rebounds per game for a center. You have center A who steals 3 rebounds per game, so he's at 14 every year. You have center B who lets his teammates take 3 rebounds away, but he's really good, so he gets an extra 3 from opponents, and he's at 11 every year.

Center B is a lot better than center A, 3 rebounds better. But he comes off looking 3 rebounds *worse*. And that'll happen over an entire career, so long as A and B have the same styles of play their whole careers.

This is NOT random variation, where one year, by luck, A winds up looking better than B. It's a bias in the statistic: it fails to accurately measure what it claims to be measuring, and the errors go a specific way for each player.

It's in that specific sense that I argue that the statistics "don't work".
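Here's the A/B example above as a toy simulation, using the post's numbers for the two centers plus an assumed amount of season noise. The point it illustrates: because the "stealing" style is persistent, averaging over a whole career never flips the ranking back to the right order.

```python
# Toy version of the center A / center B example. The skills and styles
# are from the post; the noise level is my own illustrative assumption.
import random

random.seed(42)

def season_rebounds(true_skill, steal_style):
    # observed per-game rebounds = real skill + persistent style + noise
    return true_skill + steal_style + random.gauss(0, 0.5)

seasons = 15  # a long career
a = [season_rebounds(11, +3) for _ in range(seasons)]  # A: average, big stealer
b = [season_rebounds(14, -3) for _ in range(seasons)]  # B: better, gives 3 away

avg_a = sum(a) / seasons
avg_b = sum(b) / seasons
print(f"A career average: {avg_a:.1f} rebounds")  # near 14
print(f"B career average: {avg_b:.1f} rebounds")  # near 11
```

No matter how many seasons you add, A keeps averaging around 14 and B around 11: the error is systematic, so a bigger sample just measures the wrong thing more precisely.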

A good baseball analogy would be RBIs. In 1985, one of the best years of Tim Raines' career, he had 41 RBIs. In 1990, one of the worst years of Joe Carter's career, he had 115 RBIs.

Obviously, that RBI total, by itself, is very misleading. By any reasonable standard, Raines had a much, much better season than Carter. In "OPS+", one of the most respected and accurate baseball rate statistics, Raines came in at 151, which means his OPS was 51% higher than league average. Carter was at 85, which is 15% below average. It's no contest.

But that wasn't just randomness. Almost every full-time season of Carter's career, he had more RBI than almost every full-time season of Raines' career. Why? Like rebounds, it's a matter of sending opportunities to teammates. Carter batted fourth, where his teammates were able to get on base for him to drive them in. Raines batted leadoff, where a lot of the time he would hit with nobody on base. Carter's manager played Carter to "steal" opportunities from his teammates, while Raines' manager played Raines to have his teammates "steal" opportunities from *him*.

Again like rebounds, more skilled players do get more RBIs, all else being equal. If Carter had been better in 1990, he would have got more than 115 RBIs. And if Raines had been better, he would have got more than 41. But, in this particular case, the difference is opportunities, not skill.

So, what happens if you rate players' effectiveness by RBIs? On the whole, you get a substantial positive correlation between RBIs and wins, just like Lewin and Rosenbaum got a substantial positive correlation between Wins Produced and Plus-Minus, or PER and Plus-Minus. But, in individual cases, you can't draw any conclusions. You never know for sure that you're not making an awful, awful error and rating a Carter ahead of Raines. Or even a medium-size error, which probably happens pretty often.
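The "good overall correlation, bad individual comparisons" point can be made concrete with a small sketch. All the distributions here are my own assumptions for illustration: the observed stat is true skill plus an opportunity term (lineup slot, teammates) that has nothing to do with skill.

```python
# Illustrative sketch (assumed distributions): a biased stat can correlate
# substantially with true value in aggregate while still mis-ordering a
# large share of individual player comparisons.
import numpy as np

rng = np.random.default_rng(1)
n = 1000
skill = rng.normal(0, 1, n)        # true player value
opportunity = rng.normal(0, 1, n)  # role/teammate effects, unrelated to skill
stat = skill + opportunity         # an "RBI-like" stat

corr = np.corrcoef(stat, skill)[0, 1]
print("corr(stat, true skill):", round(corr, 2))

# How often does the stat order a pair of players backwards?
i, j = np.triu_indices(n, k=1)
mis_ranked = np.mean((stat[i] > stat[j]) != (skill[i] > skill[j]))
print("share of player pairs mis-ranked:", round(mis_ranked, 2))
```

With these assumptions the aggregate correlation lands around 0.7, yet roughly a quarter of all head-to-head comparisons come out backwards: plenty of Carters rated ahead of Raineses.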

If you're thinking you can argue that, well, you're measuring other skills than rebounding, so the basketball stats aren't that bad ... well, that's not true. First, other statistics are just as biased (FG% is also heavily teammate-dependent, but with positive instead of negative correlation). And, second, even with other stats, the bias in rebounds will still come up and bite you in the ass, just on a smaller scale. (That's one of the reasons that no sabermetric baseball stats include RBIs in their formulations -- their bias makes predictions harder, not easier.)

Indeed, I bet that if you ranked every NBA player by even the best of the box-score statistics, and then got a bunch of NBA scouts to rank them based on their own expertise, the scouts would beat the crap out of the stats. That wouldn't happen in baseball, if you used the good sabermetric stats -- I bet the stats would beat the scouts, or at least come close -- but it WOULD happen in baseball if you just used RBIs.

The analogy between sabermetric basketball box-score statistics and RBIs is actually pretty strong. In both cases:

1. When you add up the individual totals, the correlation to team totals is almost perfect.

2. If you're a better player, your individual numbers are better.

3. Year-to-year individual player correlations are fairly high.

4. Individual player correlations to known-good stats (plus-minus in basketball, OPS in baseball) are also fairly high.

5. However, individual numbers depend not just on skill, but on teammates and role within the team.

6. If you move teams, you generally keep your same role, which means the correlation stays high.

7. This means that the statistic is biased for certain types of players, and the bias does not disappear with sample size.

8. Still, if you look casually, players at the top are much better than players at the bottom, which means the statistic looks like it works.

9. But there will be many cases where players with significantly higher totals will actually be worse players than others with significantly lower totals.

In fact, I think this is my new argument in one sentence: "Box score statistics are the RBIs of basketball." They just don't work well enough to properly evaluate players.



At Friday, January 28, 2011 11:38:00 AM, Anonymous DSMok1 said...

Strong article! This is a big issue.

*thinking about the implications*

At Tuesday, February 01, 2011 3:58:00 PM, Blogger Hank Gillette said...

It's funny that baseball got it right with hits: most people would recognize that batting average is a better indicator of ability than the raw number of hits, but way too many people take the raw number of RBIs as an indicator of ability. It's indicative that Joe Carter had over twice as many 100+ RBI seasons as Mickey Mantle, despite Carter not being able to carry Mantle's jockstrap.

Maybe creating a rate stat for RBI was considered too difficult. To be fair, you'd want to do some sort of weighting depending on which base the runners were on: it's much easier to get an RBI with a runner on third than on first, especially with fewer than two out.

I'm not familiar with any kind of RBI "average" stat. Maybe it's because the people who were smart enough to do something like that skipped RBI entirely and went on to develop better measurements of ability.

