Wednesday, November 16, 2022

Home field advantage is naturally higher in a hitter's park

The Rockies have always had a huge home-field advantage (HFA) at Coors. From 1993 to 2001, Colorado has played .545 at home, but only .395 on the road. That's the equivalent of the difference between going 89-73 and 64-98. 

Why such a big difference? I have some ideas I'm working on, but the most obvious one -- although it's not that big, as we will see -- is that higher scoring naturally, mathematically, leads to a bigger HFA.

When teams play better at home than on the road -- for whatever reason -- the manifestation of "better" is in physical performance, not winning percentage as such. The translation from performance to winning percentage depends on the characteristics of the game.

In MLB, historically, the home team plays around .540. But if the commissioner decreed that games would now be 36 innings long instead of 9, the home advantage would roughly double, with the home team now winning at a .580 pace.

(Why? With the game four times as long, the SD of the score difference by luck would double. But the home team's run advantage would quadruple. So the run differential by talent would double compared to luck. Since the normal distribution is almost linear at such small differences (roughly, from 0.1 SD to 0.2 SD), HFA would approximately double.)
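The parenthetical argument can be checked numerically. This is only a sketch under the same normal approximation: it assumes the home edge scales with game length while the luck SD scales with its square root, and the function name is mine, not part of any calculator mentioned here.

```python
from statistics import NormalDist

def scaled_home_win_pct(base_pct, length_factor):
    """Home winning percentage after multiplying game length by length_factor.

    Assumes a normally distributed score difference: the home edge grows by
    length_factor, the luck SD by sqrt(length_factor), so the edge measured
    in SDs grows by sqrt(length_factor).
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(base_pct)           # home edge in SDs at 9 innings
    return std_normal.cdf(z * length_factor ** 0.5)

# 36 innings instead of 9: four times as long, so the edge in SDs doubles.
pct_36_innings = scaled_home_win_pct(0.540, 36 / 9)
print(f"{pct_36_innings:.3f}")
```

Under those assumptions the result lands right around the .580 quoted above.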

But higher scoring doesn't *always* increase HFA. If it was decided that all runs now count as 2 points, like in basketball, scoring would double, but, obviously, HFA would stay the same.

Roughly speaking, increased scoring increases the home advantage only if it also increases the "signal to noise ratio" of performance to luck. Increasing the length of the game does that; doubling all the scores does not.

In 2000, Coors Field increased scoring by about 40%. If that forty percent was obtained by increasing games from 9 innings to 13 innings, HFA would be around 20% higher. If the forty percent was obtained by making every run count as 1.4 runs, HFA would be 0% higher. In reality, the increase could be anywhere between 0% and 20%, or beyond.
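Here is a rough sketch of those two extremes, again assuming a normal approximation and a .540 baseline. The `signal_scale` and `noise_scale` parameters are hypothetical names for how much the home run edge and the luck SD get multiplied.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()
BASE_PCT = 0.540  # historical MLB home winning percentage, from the post

def home_win_pct(signal_scale, noise_scale, base_pct=BASE_PCT):
    """Home winning pct after scaling the run edge and the luck SD."""
    z = STD_NORMAL.inv_cdf(base_pct)
    return STD_NORMAL.cdf(z * signal_scale / noise_scale)

def hfa_gain(new_pct, base_pct=BASE_PCT):
    """Relative increase in HFA, measured as points above .500."""
    return (new_pct - 0.5) / (base_pct - 0.5) - 1

# Extreme 1: 13 innings instead of 9. Edge scales by 13/9, luck by sqrt(13/9).
longer_gain = hfa_gain(home_win_pct(13 / 9, (13 / 9) ** 0.5))

# Extreme 2: every run counts as 1.4 runs. Edge and luck both scale by 1.4.
rescaled_gain = hfa_gain(home_win_pct(1.4, 1.4))

print(f"longer games: {longer_gain:+.0%}, rescaled runs: {rescaled_gain:+.0%}")
```

The first case comes out near 20% and the second at zero, which brackets the range described above.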

We probably have the tools available to get a pretty good estimate of the true increase.


Let's start with the overall average HFA. My subscription to Baseball Reference allowed me to obtain home and road batting records, all teams combined, for the 1980-2022 seasons:

         AB        H     2B    3B    HR     BB     SO
home   3209469 846723 161290 19928 95790 321178 612545
road   3363640 859813 163954 17203 96043 308047 668363

What's the run differential between those two batting lines? We can look at actual runs, or even the difference in run statistics like Runs Created or Extrapolated Runs. But, for better accuracy, I used Tom Tango's on-line Markov Calculator (the version modified by Bill Skelton, found here). It turns out the home batting line leads to 4.79 runs per nine innings, and the road batting line works out to 4.36 R/9.

         AB        H     2B    3B    HR     BB     SO    R/9
home   3209469 846723 161290 19928 95790 321178 612545  4.79
road   3363640 859813 163954 17203 96043 308047 668363  4.36
difference                                              0.43

That's a difference of 0.43 runs per game. Using the rule of thumb that 10 runs equals one win, a rough estimate is that the home team should have a win advantage of 0.043 wins per game, for a winning percentage of .543. 

That's a pretty good estimate -- home teams actually went .539 in that span (51832-44409). But, we'll actually need to be more accurate than that, because the "10 runs per win" figure will change significantly for higher-scoring environments such as Coors. 
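For anyone who wants to reproduce the arithmetic, here is a minimal sketch using only the figures quoted above. The variable names are mine.

```python
# Run values from the Markov calculator, as quoted in the post.
home_r9, road_r9 = 4.79, 4.36
run_diff = home_r9 - road_r9                 # home edge in runs per game

# Rule of thumb: 10 runs equals one win.
predicted_pct = 0.5 + run_diff / 10

# Actual home record for 1980-2022, as quoted in the post.
home_wins, road_wins = 51832, 44409
actual_pct = home_wins / (home_wins + road_wins)

print(f"predicted {predicted_pct:.3f}, actual {actual_pct:.3f}")
```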

So let's calculate an estimate of the actual runs per win for this scoring environment.

The Tango/Skelton Markov calculator includes a feature where, given the batting line, it will show the probability of a team scoring any particular number of runs in a nine-inning game. Here's part of that output:

          home   road
2 runs:  .1201  .1342
3 runs:  .1315  .1404
4 runs:  .1282  .1309

From this table, which actually extends from 0 to 30+ runs, we can calculate how many runs it would take for the road team to turn a loss into a win.

Case 1:  If the road team is tied after 9 innings, it has about a 50% chance of winning. With one additional run, it turns that into 100%. So an additional run in a tie game is worth half a win.

How often is the game tied? Well, the chance of a 2-2 tie is .1201*.1342, or about 1.6%. The chance of a 3-3 tie is .1315*.1404, or 1.8%. Adding up the 2-2 and the 3-3 and the 0-0 and the 1-1 and the 4-4 and the 5-5, and so on all the way down the line, the overall chance is 9.7%.

Case 2:  If the road team is down a run after 9 innings, it loses, which is a 0% chance of winning. With one additional run, it's tied, and turns that into a 50% chance. So, an additional run there is also worth half a win.

How often is the road team down a run? Well, the chance of a 3-2 result is .1315*.1342, or about 1.8%. The chance of 4-3 is .1282*.1404, another 1.8%. And so on.

The total: a 9.54% chance the road team winds up losing by one run.

What's the chance that the additional run will give the *home* team the extra half win? We can repeat the calculation, but instead of 3-2, we'll calculate 2-3. Instead of 4-3, we'll calculate 3-4. And so on.

The total: only 8.54%. It makes sense that it's smaller, because the better team is less likely to be behind by a run than ahead by a run.

We'll average the home and road numbers to get 9.04%. 
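The summations can be sketched as follows. This assumes, as the multiplications above do, that the two teams' run totals are independent. Only the three rows shown earlier are used here, so the sums are partial and will not reach the full 9.7%, 9.54% and 8.54% totals; the function name is hypothetical.

```python
def swing_probabilities(home, road):
    """Return (tie, road_down_one, home_down_one) probabilities.

    home and road map a run total to the probability of scoring exactly
    that many runs in nine innings. Scoring is assumed independent.
    """
    tie = sum(p * road.get(r, 0.0) for r, p in home.items())
    # Home scores r, road scores r - 1: the road team loses by one.
    road_down_one = sum(p * road.get(r - 1, 0.0) for r, p in home.items())
    # Home scores r, road scores r + 1: the home team loses by one.
    home_down_one = sum(p * road.get(r + 1, 0.0) for r, p in home.items())
    return tie, road_down_one, home_down_one

# Partial distributions: only the 2-, 3- and 4-run rows quoted in the post.
home_partial = {2: 0.1201, 3: 0.1315, 4: 0.1282}
road_partial = {2: 0.1342, 3: 0.1404, 4: 0.1309}

tie, road_down_one, home_down_one = swing_probabilities(home_partial, road_partial)
print(f"tie {tie:.4f}, road down one {road_down_one:.4f}, "
      f"home down one {home_down_one:.4f}")
```

Feeding in the calculator's full 0-to-30+ table instead of the three rows should give the totals quoted in the text.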

So, we have:

9.7% chance of a tie
9.0% chance of behind one run
18.7% chance that a run will create half a win

Converting that 18.7% chance to R/W:

    0.187 half-wins per run  
=   5.35 runs per half-win 
=   10.7 runs per win

So, we'll use 10.7 runs per win for our calculation.
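The conversion is short enough to write as a function. This is a sketch with my own naming, taking the three totals from the text as inputs.

```python
def runs_per_win(p_tie, p_road_down_one, p_home_down_one):
    """Runs per win from the chance that one extra run is worth half a win."""
    # Average the two one-run cases, then add the tie case.
    p_half_win = p_tie + (p_road_down_one + p_home_down_one) / 2
    # Each run is worth p_half_win half-wins, so a full win costs 2 / p_half_win runs.
    return 2 / p_half_win

mlb_rpw = runs_per_win(0.097, 0.0954, 0.0854)
print(f"{mlb_rpw:.1f} runs per win")
```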

(Why, by the way, do we get 10.7 runs per win instead of the rule of thumb that it should be 10.0 flat? I think it's because the Markov simulation always plays the bottom of the ninth, even when the home team is already up. It therefore includes a bunch of meaningless runs that don't occur in reality. When some of the run currency is randomly useless, it pushes the price of a win higher.

We'd expect that roughly 1/18 of all runs scored are in the bottom of the ninth with the home team having already won. If we discount those by multiplying 10.7 by 17/18, we get ... 10.1 runs per win. Bingo.)
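As a quick check of that discount, here is a sketch where `dead_fraction` is a hypothetical name for the share of runs assumed to be meaningless:

```python
def discount_dead_runs(rpw, dead_fraction=1 / 18):
    """Scale runs per win down by the share of runs assumed to be meaningless."""
    return rpw * (1 - dead_fraction)

adjusted_rpw = discount_dead_runs(10.7)   # same as multiplying by 17/18
print(f"{adjusted_rpw:.1f} runs per win")
```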

We saw earlier that the home team had an advantage of 0.43 runs per game. Dividing that by 10.3 runs per win gives us

Predicted: HFA of .042 wins per game (.542)
Actual:    HFA of .039 wins per game (.539)

We're off a bit. The difference is about 2 SD. My guess is that the Markov calculation, which is necessarily simplified, is very slightly off, and we only notice because of the huge sample size of almost 100,000 actual games. 
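The "about 2 SD" figure can be sanity-checked with a simple binomial SD. This sketch assumes every game is an independent coin flip near .500, which is a simplification.

```python
from math import sqrt

home_wins, road_wins = 51832, 44409          # 1980-2022, from the post
games = home_wins + road_wins
actual_pct = home_wins / games
predicted_pct = 0.542                        # Markov-based estimate, from the post

# SD of an observed winning percentage over `games` independent games.
sd = sqrt(0.25 / games)
z = (predicted_pct - actual_pct) / sd

print(f"{games} games, SD {sd:.4f}, gap of {z:.1f} SD")
```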


OK, now let's do the same thing, but this time for Coors Field only.

I could do the same thing I did for MLB as a whole: split the combined Coors batting line into home and road, and calculate those individually. The problem with that is ... well, if I do that, I'll be getting the Rockies' actual HFA at Coors, which is huge, because it includes all kinds of factors that we're not concerned with, like altitude acclimatization, tailoring of personnel to field, etc.

So, I'm going to try to convert the Coors line into an approximation of what the split would look like if it were similar to MLB as a whole.

Here's that 1980-2022 MLB split from above, except I've added the percentage difference between home and road (on a per-AB basis) below:

         AB        H     2B      3B     HR     BB     SO
home   3209469 846723 161290   19928  95790 321178 612545
road   3363640 859813 163954   17203  96043 308047 668363
diff            +3.2%  +3.1%  +21.4%  +4.5%  +9.3%  -3.9%
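The percentage row can be recomputed from the two batting lines. This is a sketch; the dictionary layout and function name are my own.

```python
# 1980-2022 MLB batting lines, from the table above.
home = {"AB": 3209469, "H": 846723, "2B": 161290, "3B": 19928,
        "HR": 95790, "BB": 321178, "SO": 612545}
road = {"AB": 3363640, "H": 859813, "2B": 163954, "3B": 17203,
        "HR": 96043, "BB": 308047, "SO": 668363}

def per_ab_pct_diff(home_line, road_line):
    """Percentage by which the home per-AB rate exceeds the road per-AB rate."""
    diffs = {}
    for stat in home_line:
        if stat == "AB":
            continue
        home_rate = home_line[stat] / home_line["AB"]
        road_rate = road_line[stat] / road_line["AB"]
        diffs[stat] = (home_rate / road_rate - 1) * 100
    return diffs

pct_diff = per_ab_pct_diff(home, road)
for stat, diff in pct_diff.items():
    print(f"{stat:>2}: {diff:+.1f}%")
```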

I'll try to create something similar for 2000 Coors.  The overall batting line, for both teams, looked like this:

         AB    H   2B 3B  HR  BB  SO     R/9   
Coors  5843  1860 359 56 245 633 933    7.43

Here's my arbitrary split into Rockies vs. road team, chosen to keep roughly the same percentage differences as in MLB overall while also keeping the average R/9 at roughly 7.43:

          AB      H      2B     3B      HR     BB     SO  
  home   5843   1884    362     66     249    672    936
  road   5843   1826    350     54     238    615    974
  diff         +3.2%  +3.4%  +22.2%  +4.6%  +9.3%  -3.9%

I ran those through Tango's calculator to get runs per 9 innings:

          AB     H    2B     3B   HR    BB    SO     R/9
  home   5843  1884  362     66  249   672   936    7.783
  road   5843  1826  350     54  238   615   974    7.071
  avg                                               7.427
  diff                                              +.712

Next, I ran the runs-per-game distribution calculation to get a runs-per-win estimate. (I won't go through the details here, but it's the same thing as before: calculate the probability of a tie, then a one-run home win, then a one-run road win, etc.)

The result: 14.37 runs per win. 

As expected, that's significantly higher than the 10.7 we calculated for MLB overall. (Adjusting 14.37 for the superfluous bottom-of-the-ninth gives about 13.6, so, if you prefer, you can compare 13.6 Coors to 10.1 overall.)

The difference of .712 runs per game, divided by 14.37 runs per win, gives an HFA of 

0.0495 wins per game

Which translates to a home winning percentage of .5495. 
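The Coors arithmetic, sketched with the figures quoted above (variable names are mine; 14.37 is the unadjusted runs-per-win result):

```python
home_r9, road_r9 = 7.783, 7.071      # hypothetical Coors split, from the table above
run_gap = home_r9 - road_r9          # home edge in runs per game
coors_rpw = 14.37                    # runs per win at Coors, from the post

coors_hfa = run_gap / coors_rpw      # extra wins per game for the home team
coors_home_pct = 0.5 + coors_hfa

print(f"gap {run_gap:.3f}, HFA {coors_hfa:.4f}, home pct {coors_home_pct:.4f}")
```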

Comparing the two results:

.542 home field winning percentage normal
.549 home field winning percentage Coors
.007 difference

The difference of .007 is worth only about half a win per home season. Sure, half a win is half a win, but I'm a little disappointed that's all we wind up with after all this work. 
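The "half a win" figure follows from the 81 home games in a 162-game schedule. A small sketch using the rounded percentages from the comparison above:

```python
HOME_GAMES = 81                      # half of a 162-game season

normal_pct = 0.542                   # home winning pct, MLB overall
coors_pct = 0.549                    # home winning pct, Coors scoring environment

extra_wins = (coors_pct - normal_pct) * HOME_GAMES
print(f"{extra_wins:.2f} extra home wins per season")
```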

It's certainly not as much of an effect as I thought there would be before I started. Even if you deducted this inherent .007, it would barely make a dent in the Rockies' 150 percentage point difference between Coors and road. The Rockies would still be in first place on the FanGraphs chart by a sizeable margin -- 42 points instead of 49.

Looked at another way, an additional .007 would move an average team from the middle of the 29-year standings, to about halfway to the top. So maybe it's not that small after all.

Still, our conclusion has to be that the Rockies' huge HFA over the years is maybe 10 percent a mathematical inevitability of all those extra runs, and 90 percent other causes.
