Splitting defensive credit between pitchers and fielders (Part III)
UPDATE, 2021-02-01: Thanks to Chone Smith in the comments, who pointed out an error. I investigated and found an error in my code. I've updated this post -- specifically, the root mean square error and the final equation. The description of how everything works remains the same.
------
Last post, we estimated that in 2018, Phillies fielders were 3 outs better than league average when Aaron Nola was on the mound. That estimate was based on the team's BAbip and Nola's own BAbip.
Our first step was to estimate the Phillies' overall fielding performance from their BAbip. We had to do that because BAbip is a combination of both pitching and fielding, and we had to guess how to split those up. To do that, we just used the overall ratio of fielding BAbip to overall BAbip, which was 47 percent. So we figured that the Phillies fielders were -24, which is 47 percent of their overall park-adjusted -52.
We can do better than that kind of estimate, because, at least for recent years, we have actual fielding data that can substitute for that estimate. Statcast tells us that the Phillies fielders were -39 outs above average (OAA) for the season*. That's 75 percent of BAbip, not 47 percent ... but still well within typical variation for teams.
(*The published estimate is -31, but I'm adding 25 percent (per Tango's suggestion) to account for games not included in the OAA estimate.)
So we can get much more accurate by starting with the true zone fielding number of -39, instead of the weaker estimate of -24.
-------
First, let's convert the -39 back to BAbip, by dividing it by 3903 BIP. That gives us ... almost exactly -10 points.
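For anyone who wants to check the arithmetic, here is a minimal sketch of the two steps so far: scaling up the published OAA and converting it to BAbip points. The variable and function names are my own, and "points" means thousandths of BAbip, with negative meaning worse than average for the defense.

```python
# Values quoted in the post; the names are illustrative, not from any library.
PUBLISHED_OAA = -31      # Statcast team OAA as published
COVERAGE_BUMP = 0.25     # add 25 percent for games not included in OAA
TEAM_BIP = 3903          # Phillies balls in play, 2018


def oaa_to_babip_points(oaa, bip):
    """Convert outs above average to BAbip points (1 point = .001)."""
    return oaa / bip * 1000


scaled_oaa = PUBLISHED_OAA * (1 + COVERAGE_BUMP)   # -38.75, rounded to -39 in the post
team_points = oaa_to_babip_points(round(scaled_oaa), TEAM_BIP)

print(f"scaled OAA: {scaled_oaa:.2f}")
print(f"team fielding in BAbip points: {team_points:.1f}")
```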
The SD of fielding talent is 6.1 points. The SD of fielding luck in 3903 BIP is 3.65 points. So it works out that luck is 2.6 of the 10 points, and talent is the remaining 7.3. (That's because luck's share of the variance is 3.65^2/(3.65^2+6.1^2) = 0.26, and 26 percent of 10 points is 2.6.)
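The luck/talent split is just a ratio of variances, which can be sketched like this. The function name is hypothetical, and I'm feeding it the rounded -10 points, so the talent figure comes out within a tenth of a point of the -7.3 quoted here rather than matching it exactly.

```python
def split_luck_talent(observed, sd_talent, sd_luck):
    """Split an observed figure into (talent, luck) by share of variance."""
    var_talent = sd_talent ** 2
    var_luck = sd_luck ** 2
    luck_share = var_luck / (var_talent + var_luck)
    luck = observed * luck_share
    talent = observed - luck
    return talent, luck


# Team fielding: -10 points observed, SDs of 6.1 (talent) and 3.65 (luck).
talent, luck = split_luck_talent(-10.0, sd_talent=6.1, sd_luck=3.65)
print(f"talent: {talent:.2f} points, luck: {luck:.2f} points")
```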
We have no reason (yet) to believe Nola is any different from the rest of the team, so we'll start out with an estimate that he got team average fielding talent of -7.3, and team average fielding luck of -2.6.
Nola's BAbip was .254, in a league that was .296. That's an observed 41 point benefit. But, with fielders that averaged -0.0073 talent and -0.0026 luck, in a park that was +0.0025, that +41 becomes +48.5.
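The adjustment removes the team-level fielding and park effects from Nola's observed number. A small sketch, using the rounded figures from the text (so it lands about a tenth of a point below the 48.5 I quote, which came from unrounded inputs). The variable names are mine.

```python
# All values in BAbip points; positive means "helped the pitcher".
observed_benefit = 41.0        # Nola vs. league
team_fielding_talent = -7.3    # team-average fielding talent behind him
team_fielding_luck = -2.6      # team-average fielding luck behind him
park = 2.5                     # park effect

# Subtracting each known effect leaves what still has to be explained.
adjusted = observed_benefit - team_fielding_talent - team_fielding_luck - park
print(f"left to break down: {adjusted:+.1f} points")
```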
That's what we have to break down.
Here's Nola's SD breakdown, for his 519 BIP. We will no longer include fielding talent in the chart, because we're using the fixed team figure for Nola, which is estimated elsewhere and not subject to revision. But we keep a reduced SD for fielding luck relative to team, because that's different for every pitcher.
9.4 fielding luck
7.6 pitching talent
17.3 pitching luck
1.5 park
--------------------
21.2 total
Converting to percentages:
20% fielding luck
13% pitching talent
67% pitching luck
1% park
--------------------
100% total
Using the above percentages, the 48.5 becomes:
+ 9.5 points fielding luck
+ 6.3 points pitching talent
+32.5 points pitching luck
+ 0.2 points park
-------------------
+48.5 points
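The three charts above all come from the same step: square the SDs, take each one's share of the total variance, and hand out the 48.5 points in those proportions. Here's a sketch of that, with names of my own choosing. Because it works from unrounded shares, a couple of rows differ from the charts by a tenth of a point.

```python
import math

# SDs for Nola's 519 BIP, in BAbip points, as listed in the chart.
sds = {
    "fielding luck": 9.4,
    "pitching talent": 7.6,
    "pitching luck": 17.3,
    "park": 1.5,
}
ADJUSTED_POINTS = 48.5   # the figure being broken down

variances = {name: sd ** 2 for name, sd in sds.items()}
total_variance = sum(variances.values())
total_sd = math.sqrt(total_variance)   # SDs add in quadrature

# Each source gets credit in proportion to its share of the variance.
shares = {name: var / total_variance for name, var in variances.items()}
points = {name: ADJUSTED_POINTS * share for name, share in shares.items()}

print(f"total SD: {total_sd:.1f}")
for name in sds:
    print(f"{name:16s} {shares[name]:5.0%} {points[name]:+6.1f} points")
```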
Adding back in the -7.3 points for observed Phillies talent, -2.6 for Phillies luck, and 2.5 points for the park, gives
-7.3 points fielding talent [0 - 7.3]
+6.9 points fielding luck [+9.5 - 2.6]
+6.3 points pitching talent
+32.5 points pitching luck
+2.7 points park [0.2 + 2.5]
-----------------------------------------
41 points
Stripping out the two fielding rows:
-7.3 points fielding talent
+6.9 points fielding luck
-----------------------------
-0.4 points fielding
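Putting the team-level pieces back is simple addition, sketched below with the rounded chart values. The dictionary names are illustrative. The total comes to 41.1 rather than exactly 41 because of that rounding.

```python
NOLA_BIP = 519

# Nola-specific allocation of the 48.5 points (fielding talent is fixed at team level).
allocated = {
    "fielding talent": 0.0,
    "fielding luck": 9.5,
    "pitching talent": 6.3,
    "pitching luck": 32.5,
    "park": 0.2,
}
# Team-level effects that were removed before the breakdown.
team_level = {"fielding talent": -7.3, "fielding luck": -2.6, "park": 2.5}

final = {name: value + team_level.get(name, 0.0) for name, value in allocated.items()}
total = sum(final.values())
fielding = final["fielding talent"] + final["fielding luck"]
fielding_outs = fielding / 1000 * NOLA_BIP   # points back to outs

for name, value in final.items():
    print(f"{value:+6.1f} points {name}")
print(f"{total:+6.1f} points total")
print(f"net fielding: {fielding:+.1f} points, about {fielding_outs:+.2f} outs")
```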
The conclusion: instead of hurting him by 10 points, as the raw team BAbip might suggest, or helping him by 6 points, as we figured last post ... Nola's fielders only hurt him by 0.4 points. That's less than a fifth of a run. Basically, Nola got league-average fielding.
--------
Like before, I ran this calculation for all the pitchers in my database. Here are the correlations to actual "gold standard" OAA behind the pitcher:
r=0.23 assume pitcher fielding BAbip = team BAbip
r=0.37 BAbip method from last post
r=0.48 assume pitcher OAA = team OAA
r=0.53 this method
And the root mean square error:
13.7 assume pitcher fielding BAbip = team BAbip
11.3 BAbip method from last post
10.2 assume pitcher OAA = team OAA
10.0 this method
-------
Like in the last post, here's a simple formula that comes very close to the result of all these manipulations of SDs:
F = 0.8*T + 0.2*P
Here, "F" is fielding behind the pitcher, which is what we're trying to figure out. "T" is team OAA/BAbip. "P" is player BAbip compared to league.
Unlike the last post, here the team *does* include the pitcher you're concerned with. We had to do it this way because presumably we don't have OAA data for the team without the pitcher. (If we did, we'd just subtract it from the team total and get the pitcher's number directly!)
It looks like 20% of a pitcher's discrepancy is attributable to his fielders. That number is for workloads similar to those in my sample -- around 175 IP. It does vary with playing time, but only slightly. At 320 IP, you can use 19% instead. At 40 IP, you can use 22%. Or, just use 20% for everyone, and you won't be too far wrong.
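As a quick sketch, the shortcut formula looks like this in code. The function name and the choice to expose both weights as parameters are mine. I'm assuming T and P are both in BAbip points with positive meaning "good for the defense," and I'm plugging in Nola's rounded -10 and +41 as an illustration.

```python
def fielding_behind_pitcher(team, pitcher, team_weight=0.8, pitcher_weight=0.2):
    """Shortcut estimate F = team_weight*T + pitcher_weight*P, all in BAbip points.

    team:    team OAA per BIP, relative to league (includes this pitcher)
    pitcher: pitcher's BAbip relative to league
    """
    return team_weight * team + pitcher_weight * pitcher


# Nola, 2018: team fielding -10 points, his own BAbip 41 points better than league.
nola = fielding_behind_pitcher(team=-10.0, pitcher=41.0)
print(f"shortcut estimate: {nola:+.1f} points")

# Heavier or lighter workloads nudge the pitcher weight (19% at 320 IP, 22% at 40 IP).
low_ip = fielding_behind_pitcher(team=-10.0, pitcher=41.0, pitcher_weight=0.22)
print(f"with a 22% pitcher weight: {low_ip:+.1f} points")
```

With these rounded inputs the shortcut gives roughly +0.2 points, in the same neighborhood as the -0.4 from the full breakdown. Either way, it rounds to league-average fielding.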
-------
Full disclosure: the real life numbers for 2017-19 are different. The theory is correct -- I wrote a simulation, and everything came out pretty much perfect. But on real data, not so perfect.
When I ran a linear regression to predict OAA behind the pitcher from team OAA and player BAbip, it didn't come out to 20%. It came out to only about 11.5%. The 95% confidence interval only brings it up to 15% or 16%.
The same thing happened for the formula from the last post: instead of the predicted 26%, the actual regression came out to 17.5%.
For the record, these are the empirical regression equations, all numbers relative to league:
F = 0.23*(Team BAbip without pitcher) + 0.175*P
F = 0.92*(Team OAA/BIP including pitcher) + 0.115*P
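To get a feel for how much the choice of coefficients matters, here's a sketch that runs the theoretical and empirical versions of the OAA-based equation side by side. The inputs are Nola's rounded 2018 figures from earlier in the post, used purely as an illustration. The names are mine, and I'm again assuming everything is in BAbip points relative to league.

```python
# (team weight, pitcher weight) for the equation using team OAA/BIP including the pitcher.
COEFFICIENTS = {
    "theoretical": (0.80, 0.20),
    "empirical 2017-19": (0.92, 0.115),
}


def estimate_fielding(team, pitcher, team_weight, pitcher_weight):
    """F = team_weight*T + pitcher_weight*P, all in BAbip points."""
    return team_weight * team + pitcher_weight * pitcher


TEAM_POINTS = -10.0     # team OAA per BIP, as points
PITCHER_POINTS = 41.0   # pitcher BAbip vs. league, as points

results = {
    label: estimate_fielding(TEAM_POINTS, PITCHER_POINTS, tw, pw)
    for label, (tw, pw) in COEFFICIENTS.items()
}
for label, value in results.items():
    print(f"{label:18s} {value:+.1f} points")
```

For a pitcher this far from his team's number, the two sets of coefficients disagree by a few points, but both land far closer to zero than the raw -10 you'd get by assigning the team figure to him outright.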
Why so much lower than expected? I'm pretty sure it's random variation. The empirical estimate of 11.5% is very sensitive to small variations in the seasonal balance of variation in pitching and fielding luck vs. talent -- so sensitive that the difference between 11.5 points and 20 points is not statistically significant. Also, the actual number changes from year-to-year because of variation. So, I believe that the 20% number is correct as a long-term average, but for the seasons in the study, the actual number is probably somewhere between 11.5% and 20%.
I should probably explain that in a future post. But, for now, if you don't believe me, feel free to use the empirical numbers instead of my theoretical ones. Whether you use 11.5% or 20%, you'll still be much more accurate than using 100%, which is effectively what happens when you use the traditional method of assigning the overall team number equally to every pitcher.