Sabermetric Research: Pitchers influence BAbip more than the fielders behind them

It's generally believed that when pitchers' teams vary in their success rate in turning batted balls into outs, the fielders should get the credit or blame. That's because of the conventional wisdom that pitchers have little control over balls in play.

I ran some numbers, and ... well, I think that's not right. I think individual pitchers actually have as much influence on batting average on balls in play (BAbip) as the defense behind them, and maybe even a bit more.

------

UPDATE: turns out all the work I did is just confirming a result from 2003, in a document called "Solving DIPS" (.pdf). It's by Erik Allen, Arvin Hsu, and Tom Tango. (I had read it, too, several years ago, and promptly forgot about it.)

It's striking how close their numbers are to these, even though I'm calculating things in a different way than they did. That suggests that we're all measuring the same thing with the same accuracy.

One advantage of their analysis over mine is they have good park effect numbers. See the first comment in this post for Tango's links to "batting average on balls in play" park effect data.

------

For the first step, I'll run the usual "Tango method" to divide BAbip into talent and luck.

For all team-seasons from 2001 to 2011, I figured the SD of team BAbip, adjusted for the league average. That SD turned out to be .01032, which I'll refer to as "10.3 points", as in "points of batting average."

The average SD of binomial luck for those seasons was 7.1 points. Since

SD(observed)^2 = SD(luck)^2 + SD(talent)^2

We can calculate that SD(talent) = 7.5 points.

"Talent," here, doesn't yet differentiate between pitcher and fielder talent. Actually, it's a conglomeration of everything other than luck -- fielders, pitchers, slight randomness of opposition batters, day/night effects, and park effects. (In this context, we're saying that Oakland's huge foul territory has the "talent" of reducing BAbip by producing foul pop-ups.)

So:

7.2 = SD(luck)
7.5 = SD(talent)

For a team-season from 2001 to 2011, talent was more important than luck, but not by much.

I did the same calculation for other sets of seasons. Here's the summary:

Obsrvd Luck Talents
--------------------------------
1960-1968 11.41 6.95 9.05
1969-1976 12.24 6.86 10.14
1977-1991 10.95 6.94 8.46
1992-2000 11.42 7.22 8.85
2001-2011 10.32 7.09 7.50
-------------------------------
"Average" 11.00 7.00 8.50

I've arbitrarily decided to "average" the eras out to round numbers: 7 points for luck, and 8.5 points for talent. Feel free to use actual averages if you like.

It's interesting how close that breakdown is to the (rounded) one for team W-L records:

Observed Luck Talent
--------------------------------
BABIP 11.00 7.00 8.50
Team Wins 11.00 6.50 9.00
--------------------------------

That's just coincidence, but still interesting and intuitively helpful.

-------

That works for separating BAbip into skill and luck, but we still need to break down the skill into pitching and fielding.

I found every pitcher-season from 1981 to 2011 where the pitcher faced at least 400 batters. I compared his BAbip allowed to that of the rest of his team. The comparison to teammates effectively controls for defense, since, presumably, the defense is the same no matter who's on the mound.

Then, I took the player/rest-of-team difference, and calculated the Z-score: if the difference were all random, how many SDs of luck would it be?

If BAbip was all luck, the SD of the Z-scores would be exactly 1.0000. It wasn't, of course. It was actually 1.0834.

Using the "observed squared = talent squared plus luck squared", we can calculate that SD(talent) is 0.417 times as big as SD(luck). For the full dataset, the (geometric) average SD(luck) was 21.75 points. So, SD(talent) must be 0.417 times 21.75, which is 9.07 points.

We're not quite done. The 9.07 isn't an estimate of a single pitcher's talent SD; it's the estimate of the difference between that pitcher and his teammates. There's randomness in the teammates, too, which we have to remove.

I arbitrarily chose to assume the pitcher has 8 times the luck variance of the teammates (he probably pitched more than 1/8 of the innings, but there are more than 8 other pitchers to dilute the SD; I just figured maybe the two forces balance out). That would mean 8/9 of the total variance belongs to the individual pitcher, or the square root of 8/9 of the SD. That reduces the 9.07 points to 8.55 points.

8.55 = SD(single pitcher talent)

That's for individual pitchers. The SD for the talent of a *pitching staff* would be lower, of course, since the individual pitchers would even each other out. If there were nine pitchers on the team, each with equal numbers of BAbip, we'd just divide that by the square root of 9, which would give 2.85. I'll drop that to 2.5, because real life is probably a bit more dilute than that.

So for a single team-season, we have

8.5 = SD(overall talent)
-------------------------------------
2.5 = SD(pitching staff talent)
8.1 = SD(fielding + all other talent)

------

What else is in that 8.1 other than fielding? Well, there's park effects. The only effect I have good data for, right now (I was too lazy to look hard), is foul outs. I searched for those because of all the times I've read about the huge foul territory in Oakland, and how big an effect it has.

Google found me a FanGraphs study by Eno Sarris, showing huge differences in foul outs among parks. The difference between top and bottom is more than double -- 398 outs in Oakland over two years, compared to only 139 in Colorado.

The team SD from Sarris's chart was about 24 outs per year. Only half of those go to the home pitchers' BAbip, so that's 12 per year. Just to be conservative, I'll reduce that to 10.

Ten extra outs on a team-season's worth of BIP is around 2.5 points.

So: if 8.1 is the remaining unexplained talent SD, we can break it down as 2.5 points of foul territory, and 7.7 points of everything else (including fielding).

Our breakdown is now:

11.0 = SD(observed)
--------------------------
7.1 = SD(luck)
2.5 = SD(pitching staff)
2.5 = SD(park foul outs)
7.7 = SD(fielders + unexplained)

We can combine the first three lines of the breakdown to get this:

11.0 = SD(observed)
--------------------------
7.9 = SD(luck/pitchers/park)
7.7 = SD(fielders/unexplained)

Fielding and non-fielding are almost exactly equal. Which is why I think you have to regress BAbip around halfway to the mean to get an unbiased estimate for the contribution of fielding.

UPDATE: as mentioned, Tango has better park effect data, here.

------

Now, remember when I said that pitchers differ more in BAbip than fielders? Not for a team, but for an individual pitcher,

8.5 = SD(individual pitcher)
7.7 = SD(fielders + unexplained)

The only reason fielding is more important than pitching for a *team*, is that the multiple pitchers on a staff tend to cancel each other out, reducing the 8.5 SD down to 2.5.

-------

Well, those last three charts are the main conclusions of this study. The rest of this post is just confirming the results from a couple of different angles.

-------

Let's try this, to start. Earlier, when we found that SD(pitchers) = 8.5, we did it by comparing a pitcher's BAbip to that of his teammates. What if we compare his BAbip to the rest of the pitchers in the league, the ones NOT on his team?

In that case, we should get a much higher SD(observed), since we're adding the effects of different teams' fielders.

We do. When I convert the pitchers to Z-scores, I get an SD of 1.149. That means SD(talent) is 0.57 as big as SD(luck). With SD(luck) calculated to be about 20.54 points, based on the average number of BIPs in the two samples ... that makes SD(talents) equal to 11.6 points.

In the other study, we found SD(pitcher) was 8.5 points. Subtracting the square of 8.5 from the square of 11.6, as usual, gives

11.6 = SD(pitcher+fielders+park)
--------------------------------
8.5 = SD(pitcher)
7.9 = SD(fielding+park)

So, SD(fielding+park) works out to 7.9 by this method, 8.1 by the other method. Pretty good confirmation.

-------

Let's try another. This time, we'll look at pitchers' careers, rather than single seasons.

For every player who pitched at least 4,000 outs (1333.1 innings) between 1980 and 2011, I looked at his career BAbip, compared to his teammates' weighted BAbip in those same seasons.

And, again, I calculated the Z-scores for number of luck SDs he was off. The SD of those Z-scores was 1.655. That means talent was 1.32 times as important as luck (since 1.32 squared plus 1 squared equals 1.655 squared).

The SD of luck, averaged for all pitchers in the study, was 6.06 points. So SD(talent) was 1.32 times 6.06, or 8.0 points.

10.0 = SD(pitching+luck)
------------------------
6.1 = SD(luck)
8.0 = SD(pitching)

The 8.0 is pretty close to the 8.5 we got earlier. And, remember, we didn't include all pitchers in this study, just those with long careers. That probably accounts for some of the difference.

Here's the same thing, but for 1960-1979:

9.3 = SD(pitching+luck)
------------------------
6.0 = SD(luck)
7.2 = SD(pitching)

It looks like variation in pitcher BAbip skill was lower in the olden times than it is now. Or, it's just random variation.

--------

I did the career study again, but compare each pitcher to OTHER teams' pitchers. Just like when we did this for single seasons, the SD should be higher, because now we're not controlling for differences in fielding talent.

And, indeed, it jumps from 8.0 to 8.8. If we keep our estimate that 8.0 is pitching, the remainder must be fielding. Doing the breakdown:

10.5 = SD(pitching+fielding+luck)
---------------------------------
5.8 = SD(luck
8.0 = SD(pitching)
3.6 = SD(fielding)

That seems to work out. Fielding is smaller for a career than a season, because the quality of the defense behind the pitcher tends to even out over a career. I was surprised it was even that large, but, then, it does include park effects (and those even out less than fielders do).

For 1960-1979:

10.2 = SD(pitching+fielding+luck)
---------------------------------
5.7 = SD(luck)
7.2 = SD(pitching)
4.4 = SD(fielding)

Pretty much along the same lines.

------

Unless I've screwed up somewhere, I think we've got these as our best estimates for BAbip variation in talent:

8.5 = SD(individual pitcher BAbip talent)
2.5 = SD(team pitching staff BAbip talent)
7.7 = SD(team fielding staff BAbip talent)
2.5 = SD(park foul territory BAbip talent)

And, for a single team-season,

7.1 = SD(team season BAbip luck)

For a single team-season, it appears that luck, pitching, and park effects, combined, are about as big an influence on BAbip as fielding skill.

Labels: BABIP, basebal, fielding, statistics, Tango

Sabermetric Research

Thursday, May 28, 2015

Pitchers influence BAbip more than the fielders behind them

1 Comments:

About Me

Previous Posts