Sunday, November 12, 2006

How does a player's team affect his output?

Does the team a batter plays for affect his performance? We know there are some factors, such as park effects, through which the characteristics of the team affect a hitter. Are there others? And how important are they?

In "A Variance Decomposition of Individual Offensive Baseball Performance," David Kaplan tries to find out. I didn't understand the full details of his statistical methodology, but he starts off by using a statistical technique called "analysis of variance." I studied this a long time ago, and have since forgotten most of it. But if I understand it correctly, it examines the variance of the performance of players within teams, and then the variance of the team means themselves, to figure out how important teams are relative to players.

For instance, suppose there are three teams, each with three players, and their batting averages are:

Expos ....... .260 .260 .260
Pilots ...... .270 .270 .270
Browns ...... .280 .280 .280

In this case, we can conclude that 100% of the variance comes from the teams, and zero percent is inherent in the players. Perhaps the Browns' manager is great at bringing out the best in his hitters, or their home park is very hitter-friendly, or perhaps the Browns can afford to sign all the .280 hitters while the other teams can't.

Now, suppose the numbers look like this:

Expos ....... .260 .270 .280
Pilots ...... .260 .270 .280
Browns ...... .260 .270 .280

Here, the opposite is happening: there is zero variance between teams, and so 100% of the variance is among the individual players.

And here's one more case:

Expos ....... .260 .270 .280
Pilots ...... .270 .280 .290
Browns ...... .280 .290 .300

In this example, the variance among teams seems to be the same as among the players on the team, so we can probably say the variance is decomposed 50-50 between players and teams.

And, of course, in real life, the percentage can be anywhere between 0 and 100% -- not just 100%, 0%, or 50% like in the three examples above.
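The three toy examples above can be checked with a few lines of Python. This is only a sketch of the basic sum-of-squares split as I understand it, not necessarily the method Kaplan used; the function name `between_team_share` and the dictionary names are my own, and batting averages are entered in points (.260 becomes 260) to keep the arithmetic exact.

```python
def between_team_share(teams):
    """Share of the total sum of squares that lies between team means."""
    players = [x for roster in teams.values() for x in roster]
    grand_mean = sum(players) / len(players)
    # Total variation of every player around the overall mean.
    total_ss = sum((x - grand_mean) ** 2 for x in players)
    # Variation of team means around the overall mean, weighted by roster size.
    between_ss = sum(
        len(roster) * (sum(roster) / len(roster) - grand_mean) ** 2
        for roster in teams.values()
    )
    return between_ss / total_ss


# The three toy leagues from the text, in batting-average points.
all_team = {"Expos": [260, 260, 260], "Pilots": [270, 270, 270], "Browns": [280, 280, 280]}
all_player = {"Expos": [260, 270, 280], "Pilots": [260, 270, 280], "Browns": [260, 270, 280]}
mixed = {"Expos": [260, 270, 280], "Pilots": [270, 280, 290], "Browns": [280, 290, 300]}

print(between_team_share(all_team))    # 1.0
print(between_team_share(all_player))  # 0.0
print(between_team_share(mixed))       # 0.5
```

Whatever isn't between teams is within teams, so the player share is just one minus the number printed.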

So what did Kaplan find? Here are a few of his numbers. He did 2000 and 2003 separately; I've taken the simple average of those two percentages. What's shown is how much of the variance can be attributed to the team:

12% R
20% H
05% 2B
11% RBI
02% SB
01% SH
12% RC
28% TB
03% SLG
02% OBP
04% BA
03% OPS

What to make of this? I have no idea. I'm not sure why teams would be responsible for 28% of the variance in total bases, but only 12% of the variance in runs created.

However: Kaplan appears to have used actual counts, not rates, for all the counting stats. And he considered everyone with 200 plate appearances. What that means is that the variances are very heavily influenced by how the manager used his players. For instance, suppose the Browns and Pilots have identical players, but the Browns platoon and the Pilots don't. Then their hits might look like this:

Pilots: 120, 140, 160
Browns: 60, 60, 70, 70, 80, 80

This makes it look like there are big differences between the teams, since Browns players average 70 hits and Pilots players average 140. Furthermore, the Pilots' within-team variance is four times as high as the Browns'!
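Here is a quick check of the platoon example, using the hit totals above. It's a sketch under my own assumptions: the helper names are made up, and I'm using the population variance (dividing by n, not n-1), which is what gives the factor of four.

```python
def mean(xs):
    return sum(xs) / len(xs)


def pop_variance(xs):
    """Population variance: average squared deviation from the mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


hits = {
    "Pilots": [120, 140, 160],          # three regulars
    "Browns": [60, 60, 70, 70, 80, 80],  # same talent, split into platoons
}

for team, h in hits.items():
    print(team, mean(h), round(pop_variance(h), 1))

variance_ratio = pop_variance(hits["Pilots"]) / pop_variance(hits["Browns"])

# Share of the total sum of squares that lies between the two team means.
everyone = [x for h in hits.values() for x in h]
grand = mean(everyone)
total_ss = sum((x - grand) ** 2 for x in everyone)
between_ss = sum(len(h) * (mean(h) - grand) ** 2 for h in hits.values())
team_share = between_ss / total_ss

print(round(variance_ratio, 2), round(team_share, 2))
```

By my arithmetic the team share comes out around 89%, even though the two teams were stipulated to have identical players. All of that "team effect" is playing time.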

I'd guess that player usage patterns are the reason for some of the numbers, such as between-team effects being 28% of the variance of total bases, but only 3% of slugging percentage, when, really, those two statistics measure the same skill.
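To see how a counting stat and its rate version can disagree, here is a made-up illustration. The at-bat and total-base figures are hypothetical, not from Kaplan's data: both teams have hitters who slug .380, .400, and .420, but the Pilots give each regular 500 AB while the Browns split the time at 250 AB apiece.

```python
def between_team_share(teams):
    """Share of the total sum of squares that lies between team means."""
    players = [x for roster in teams.values() for x in roster]
    grand = sum(players) / len(players)
    total_ss = sum((x - grand) ** 2 for x in players)
    between_ss = sum(
        len(r) * (sum(r) / len(r) - grand) ** 2 for r in teams.values()
    )
    return between_ss / total_ss


# Hypothetical (AB, TB) lines: same slugging skill on both teams,
# different playing time.
lines = {
    "Pilots": [(500, 190), (500, 200), (500, 210)],
    "Browns": [(250, 95), (250, 95), (250, 100), (250, 100), (250, 105), (250, 105)],
}

total_bases = {team: [tb for ab, tb in ps] for team, ps in lines.items()}
slugging = {team: [tb / ab for ab, tb in ps] for team, ps in lines.items()}

tb_share = between_team_share(total_bases)
slg_share = between_team_share(slugging)
print(round(tb_share, 3), round(slg_share, 3))
```

In this toy league nearly all of the total-bases variance lands on the team side and essentially none of the slugging variance does, which is the same direction as the 28% versus 3% gap in the chart.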

Also, the percentages result from a large muddle of a bunch of factors:

-- how much a team spreads out its at-bats among players;
-- park effects;
-- whether teams that have good players are more likely to have other good players, because they can afford to spend more;
-- whether teams that have good players are less likely to have other good players, because there's effectively a salary limit on how many good players they can afford;
-- whether GMs concentrate on certain types of players, such as Billy Beane buying up a lot of OBP;
-- whether managers are more likely to give playing time to certain types of players;
-- managers' decisions on elective strategies like bunts and steals;
-- and lots more that you can probably think of.

(As for that point about elective strategies: it's safe to assume that variation in the rate of sacrifice hits should be very strongly team-related. That's because the bunt is a strategy held in different levels of esteem by different managers, especially in the American League. In the 2003 AL, sac hits ranged from 11 (Blue Jays) to 65 (Tigers). The fact that this study didn't show any reasonable effect -- it showed that only 0.5% of bunt variance was team-related -- suggests that the methodology is flawed, or at least not powerful enough for its intended purpose.)

Because the methodology tangles so many causes together, I don't know what this study tells us. I have no idea, absolutely none, of what any of Kaplan's numbers might mean in the baseball sense. Kaplan doesn't make any suggestions either. Could it be there's nothing we can conclude? Is it possible that the figures in the chart are no better than random numbers? Or am I missing something important?

At Monday, November 13, 2006 1:43:00 PM,  Guy said...

Phil: One big reason that teams get more 'credit' for the counting stats is that team OBP has a big impact on how many PAs a player gets, and thus his Hs, TBs, etc.

I can see how team (other than park) could impact rate stats on the margins. For example, a LH on a high OBP team may have a slightly higher BA because he often hits with 1B holding a runner on. But hard to imagine it's a very big factor.

At Monday, November 13, 2006 3:17:00 PM,  Phil Birnbaum said...

Thanks, Guy, that makes sense about the OBP.

At Wednesday, November 15, 2006 10:58:00 PM,  Anonymous said...

Hi Phil,

Have to disagree with you about the bunts. Despite the fact that variation between teams is significant (max = 6x min), differences within teams must be huge. Some players almost never or never bunt, and some players probably account for a third to half of their team's bunts (max = 20-40x min). So any statistical measure should show bunts as more dependent on the player than on the team.

At Wednesday, November 15, 2006 11:18:00 PM,  Phil Birnbaum said...

Sure, I agree with you that players should have significantly more influence than teams ... my point was that I'd expect bunts to be closer to the top of the percentage list, maybe 20%/80% instead of the observed 1%/99%. That is, sac hits is one area where we KNOW teams make a significant difference, and the study doesn't show it.

But I see what you're saying ... every player on a team with 200 AB will have reasonable numbers of hits, but only one or two may have *all* the sac bunts. That huge player variance hides the team effect.

So overall I think you may be right, that maybe I shouldn't have expected the study to show a team effect. Maybe this technique isn't suited to picking up this kind of situation, where the variance between teams is concentrated among one or two players on the team.
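A made-up illustration of that last point. The team totals match the 11 and 65 sac hits mentioned in the post, but the player-by-player splits are entirely hypothetical, invented only to contrast bunts concentrated in a couple of hitters with bunts spread evenly.

```python
def between_team_share(teams):
    """Share of the total sum of squares that lies between team means."""
    players = [x for roster in teams.values() for x in roster]
    grand = sum(players) / len(players)
    total_ss = sum((x - grand) ** 2 for x in players)
    between_ss = sum(
        len(r) * (sum(r) / len(r) - grand) ** 2 for r in teams.values()
    )
    return between_ss / total_ss


# Hypothetical rosters of nine hitters; totals are 11 and 65 sac hits.
concentrated = {
    "low_bunt_team": [0, 0, 0, 0, 1, 1, 2, 3, 4],
    "high_bunt_team": [0, 0, 1, 2, 3, 5, 8, 18, 28],
}
spread_evenly = {
    "low_bunt_team": [1, 1, 1, 1, 1, 1, 1, 2, 2],
    "high_bunt_team": [7, 7, 7, 7, 7, 7, 7, 8, 8],
}

concentrated_share = between_team_share(concentrated)
even_share = between_team_share(spread_evenly)
print(round(concentrated_share, 2), round(even_share, 2))
```

With the same two team totals, the team share is above 95% when the bunts are spread evenly and under 20% when a couple of players do most of the bunting. And that's with only the two most extreme teams in the league; adding the teams in the middle would presumably pull the share down further.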

At Monday, April 20, 2009 3:51:00 AM,  cvxv said...