The Royal Canadian Golf Association (RCGA), Canada's governing body for golf, has a committee to consider updating the system by which a golfer's handicap is computed. Tim B. Swartz, the committee's statistical guy, has a paper in the most recent JQAS explaining the new proposed system.
I'm going to simplify things a bit and explain the situation as I understand it from the paper. Leaving out the technical adjustments (even at the cost of a bit of inaccuracy), I'll describe the current handicap system like this:
You start with your 20 most recent scores (with a few adjustments that I'll discuss later), relative to par. Then, you drop the 10 worst scores, leaving only the 10 best. You average those 10 best. That's your handicap.
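As a rough sketch of that simplified rule (ignoring the course and stroke-control adjustments), here it is in Python. The function name and the sample scores are made up for illustration; this is not the official RCGA formula.

```python
def simple_handicap(scores_vs_par):
    """Average of the 10 best (lowest) of the 20 most recent scores.

    scores_vs_par: scores relative to par, oldest first.
    Simplified rule only; the real system applies further adjustments.
    """
    recent = scores_vs_par[-20:]
    if len(recent) < 20:
        raise ValueError("need at least 20 scores")
    best_ten = sorted(recent)[:10]  # lower is better in golf
    return sum(best_ten) / len(best_ten)


# Invented scores relative to par for one golfer
scores = [18, 22, 15, 25, 19, 21, 17, 30, 16, 20,
          23, 14, 19, 27, 18, 21, 16, 24, 20, 17]

print(simple_handicap(scores))     # 16.9
print(sum(scores) / len(scores))   # 20.1, the plain average of all 20
```

The gap between the two printed numbers is the point: the handicap sits well below the golfer's ordinary average.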
Why change this system? For one thing, as Swartz points out, your handicap isn't a true indication of your expected score; golfers fail to shoot their handicap more often than not. As I see it, you're taking the average of the top half, so you'd expect the handicap to be at about the 25% mark. If you assume that scores are normal, it's actually a bit less than 25%, because the long tail of the half-normal pulls its mean out past its median. As it turns out, the mean of the right half of the normal curve is about .798, and the chance of beating a Z-score of +.798 is about 21.2%. So you'd expect a golfer to beat his handicap about 21% of the time.
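The half-normal figures can be checked with a few lines of Python. This is only a sanity check of the normal-curve arithmetic, using the standard library's `math.erf`; the helper name `normal_cdf` is my own.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Mean of the right half of a standard normal curve: sqrt(2 / pi)
half_normal_mean = math.sqrt(2.0 / math.pi)

# Chance that a single standard normal draw lands beyond that mean
p_beat = 1.0 - normal_cdf(half_normal_mean)

print(f"{half_normal_mean:.3f}")  # 0.798
print(f"{p_beat:.2f}")            # 0.21
```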
Swartz checked, using a database of scores from a golf club in Alberta. As it turns out, golfers actually beat their handicap 36% of the time, not 21%. Maybe I made a mistake in the calculation; maybe golf scores aren't really normal; or maybe the various adjustments are causing the difference.
Another problem with the current system is that in casual head-to-head play, it favors the better golfer. Swartz generated a bunch of random matches from his database, and found that the better golfer won 55 percent of the time, rather than 50 percent.
A third problem, and an important one, is that in multi-player tournaments, the winner is likely to be a golfer with a higher handicap. That's because a bad golfer, with a handicap of (say) 20 (which represents a score of about 92), could reasonably have a very good day and shoot an 80, finishing at -12. But a scratch golfer (0 handicap, 72 average) is much less likely to match the -12 by shooting a 60 on the day.
The more players in the tournament, the more likely someone will have a much better game than normal. And those "much betters" are likely to be from the worst golfers.
In his simulation, Swartz found that the top third of golfers won only 27% of the 99-player tournaments. The middle third won 33%, and the worst third won 40%. So the current system favors the better golfer in tournaments of two players, but favors the worse golfers in tournaments of many players.
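Both effects can be reproduced in a toy simulation. The sketch below is not Swartz's simulation and does not use his data: it assumes normal scores, invents a standard deviation for each third of the field, and assumes each player's net score averages about 0.8 SD worse than his handicap (the half-normal figure from earlier). All names and numbers are hypothetical.

```python
import random

# Invented spreads (SD of a round, in strokes) for three skill tiers.
# These are illustration values, not figures from the paper.
TIER_SD = {"best": 2.5, "middle": 3.5, "worst": 4.5}


def net_score(rng, sd):
    # Assumption: the handicap is about 0.8 SD better than the player's
    # average round, so the net score (score minus handicap) averages +0.8 SD.
    return rng.gauss(0.8 * sd, sd)


def win_shares(field, n_events=4000, seed=1):
    """field: list of tier names, one per player. Returns win share by tier."""
    rng = random.Random(seed)
    wins = {tier: 0 for tier in TIER_SD if tier in field}
    for _ in range(n_events):
        # Lowest net score wins the event
        winner = min(field, key=lambda tier: net_score(rng, TIER_SD[tier]))
        wins[winner] += 1
    return {tier: count / n_events for tier, count in wins.items()}


head_to_head = win_shares(["best", "worst"])
big_field = win_shares(["best"] * 33 + ["middle"] * 33 + ["worst"] * 33)
print(head_to_head)
print(big_field)
```

With these invented spreads, the better golfer should come out ahead in the two-player case, while the worst tier takes the largest share of the 99-player events. The exact percentages depend entirely on the assumed SDs and won't match the paper's.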
So how does Swartz fix the current system? In two ways. First, he makes the handicap represent the player's average score, instead of a score the player beats only a minority of the time (about his 79th percentile score by the normal-curve estimate, or his 64th going by the Alberta data). Second, he divides by a player's standard deviation (effectively converting a raw score to a Z-score), which neutralizes the luck factor in large tournaments.
Here are the details.
Like the current system, Swartz considers only the 20 most recent scores. But instead of dropping the worst 10, he drops only the worst four, leaving 16 scores. Then, instead of just averaging them, the new system uses statistical techniques to estimate the best normal curve to fit the data (keeping in mind that the four worst scores are missing). That is, it asks the question: what is the best-fit normal curve, given that we're looking at only the best 16 of 20 observations?
Swartz gives linear formulas (like a Linear Weights estimate of the 16 scores) to estimate the mean and SD of that best-fit curve; he says that those formulas are minimum variance linear unbiased estimators, which means you can't do better (by using different weights) unless you go to a non-linear estimator.
Those estimates of mean and SD become the player's stated handicap (so, effectively, there are two numbers for the handicap instead of one). Then, for his next (21st) round, his raw score is converted to a Z-score, and that's what gets compared to the other players' Z-scores to determine the winner.
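Here's a minimal sketch of that last step, taking the mean and SD as given rather than computing them with Swartz's formulas. The players, handicaps, and scores are hypothetical.

```python
def z_score(raw_score, mean, sd):
    """Standardize a round against the player's own (mean, SD) handicap."""
    return (raw_score - mean) / sd


# Hypothetical handicaps as (mean, SD) pairs, in strokes over par
handicaps = {"Alice": (2.0, 2.5), "Bob": (20.0, 4.5)}

# Hypothetical raw scores over par in the next round
rounds = {"Alice": 0.0, "Bob": 17.0}

z = {name: z_score(rounds[name], *handicaps[name]) for name in rounds}
winner = min(z, key=z.get)  # lower (more negative) Z is the better round

print(z)
print(winner)  # Alice
```

Bob beat his own average by three strokes and Alice by only two, but Bob's rounds vary more, so Alice's round is the more impressive one in Z-score terms.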
In the study's simulations, golfers shot their handicap 45% of the time with this new system (fairer than 36% with the old system); one-on-one matchups were won by the better golfer only 48 to 51 percent of the time (fairer than 55%); and in tournaments, the best golfers won 29% of the time (fairer than 27%) while the worst won 32 to 34 percent of the time (fairer than 40%).
I promised some details of the adjustments to player scores that go into the formulas. I'll outline them here, and you can see the paper (which is nicely presented and very easy to read) for the details.
First, under both systems, scores are adjusted twice for the difficulty of the course. There's the course rating, which specifies how hard the course is for excellent (scratch) golfers, and the slope rating, which specifies how hard the course is for worse (bogey) golfers after adjusting for the course rating.
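For a sense of how the two ratings combine, here's a sketch of the handicap differential as it is commonly defined under the USGA system, where 113 is the slope rating of a course of standard difficulty. I'm assuming the same form applies here; see the paper for the exact adjustment it uses. The function name and the example numbers are mine.

```python
STANDARD_SLOPE = 113  # slope rating of a course of standard difficulty


def differential(adjusted_score, course_rating, slope_rating):
    """Score relative to the course rating, rescaled to a standard slope."""
    return (adjusted_score - course_rating) * STANDARD_SLOPE / slope_rating


# Invented example: a 90 on a course rated 71.5 with a slope of 125
print(round(differential(90, 71.5, 125), 1))  # 16.7
```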
Then, there's something called "equitable stroke control" (ESC). That sets a maximum possible score for each hole, so that (for instance) a bad golfer can't score more than a quadruple bogey. Even if it takes him ten strokes to finish a par-three, he can't put more than 7 down on the scorecard. (In Canada, the stroke limit varies by handicap between bogey and quadruple-bogey; in the US, it seems the limits are fixed and not based on par. Swartz says this is the only difference in the current system between the two countries.)
The idea is that very high scores measure golfer frustration rather than skill, and should be discounted. Also, Swartz says, it discourages "sandbagging," which is deliberately trying to inflate your handicap, and provides a maximum if you forget to write down your score.
In this study, Swartz often gives results both with ESC and without, and the results are fairly similar.
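The ESC cap itself is simple to express. In this sketch the limit is passed in as a number of strokes over par, since the actual Canadian limit depends on the golfer's handicap; the function and parameter names are hypothetical.

```python
def apply_esc(hole_scores, pars, max_over_par):
    """Cap each hole's score at par + max_over_par."""
    return [min(score, par + max_over_par)
            for score, par in zip(hole_scores, pars)]


pars = [3, 4, 5]
scores = [10, 5, 9]  # ten strokes on the par-three

# Quadruple-bogey cap: at most par + 4 on any hole
capped = apply_esc(scores, pars, max_over_par=4)
print(capped)  # [7, 5, 9]
```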
But, after all those adjustments, I think the essence is:
-- the old system measures your score against roughly your own 64th percentile (the mark golfers beat 36% of the time in the Alberta data).
-- the new system sets aside your worst quintile of scores, where any outliers would be, and gives you a Z-score relative to your own distribution.
I'm not a serious golfer, so I don't know if the added complexity of the new system is worth the advantages. It does seem to me that the new system is better, though.