## Friday, February 09, 2007

### "Win Zone" gives golfer win probabilities for tournaments in progress

Suppose that after the first round of a four-round PGA golf tournament, Tiger Woods is at 67 while some guy you never heard of is leading at 66. Who has the better chance of being the eventual winner?

It's obviously Tiger: he's capable of putting a good score together every round, while the leader probably just had a lucky Thursday. Tiger is likely to wind up around 67-67-67-67, while the other guy will probably go something like 66-73-78-74.

A new statistic from the Golf Channel, called "Win Zone," follows this logic to try to estimate players' chances of winding up the winner of the tournament. It looks at the players' histories, their scores, what hole they're on, how hard the course is, and so on, and comes up with a number representing the player's actual chances.

The Golf Channel website describes the system via a text summary and a video (follow above link), but doesn't give many details on how it works. They do say that it runs "over 2 million calculations every minute," which suggests a simulation, or perhaps a Markov chain analysis (though a simulation seems much more likely).
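To make the simulation idea concrete, here is a minimal Monte Carlo sketch. It is not Win Zone's method, which the Golf Channel hasn't published. It assumes each remaining round is an independent normal draw rounded to whole strokes, and that ties are settled by a coin-flip playoff. The function name, the player labels, and every number in `field` are made up for illustration, loosely following the Tiger-versus-unknown example at the top of the post.

```python
import random

def simulate_win_probabilities(players, rounds_left, n_sims=20000, seed=0):
    """Estimate each player's chance of finishing with the lowest total.

    players maps a name to (strokes_so_far, mean_round, sd_round).
    All three numbers are illustrative inputs, not Win Zone's model.
    """
    rng = random.Random(seed)
    wins = {name: 0.0 for name in players}
    for _ in range(n_sims):
        totals = {}
        for name, (so_far, mean_round, sd_round) in players.items():
            # Assumption: rounds are independent normal draws, rounded to strokes.
            remaining = sum(round(rng.gauss(mean_round, sd_round))
                            for _ in range(rounds_left))
            totals[name] = so_far + remaining
        best = min(totals.values())
        tied = [name for name, total in totals.items() if total == best]
        # Assumption: a tie is a playoff that each tied player is equally likely to win.
        for name in tied:
            wins[name] += 1.0 / len(tied)
    return {name: w / n_sims for name, w in wins.items()}

# Hypothetical field after round one: a star at 67 who averages 68,
# and an unknown leader at 66 who averages 72.
field = {
    "star": (67, 68.0, 3.0),
    "unknown leader": (66, 72.0, 3.0),
}
probs = simulate_win_probabilities(field, rounds_left=3)
for name, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {p:.1%}")
```

A real system would presumably simulate hole by hole and adjust for course difficulty, but even this crude version shows why a one-stroke lead counts for little against a large gap in expected scoring.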

As I write this, the second round of the "2007 AT&T National Pebble Beach Pro-Am" is complete. Here are the Golf Channel's top 10 in "Win Zone," along with those players' leaderboard stats. (This link is probably good until tomorrow when the next round starts.)

| Rank | Win Zone | Player | Score to par | Leaderboard position |
|---|---|---|---|---|
| 1 | 35.7% | Jim Furyk | –12 | tied for 1st |
| 2 | 33.0% | Phil Mickelson | –12 | tied for 1st |
| 3 | 11.1% | John Mallinger | –9 | tied for 3rd |
| 4 | 6.5% | Kevin Sutherland | –9 | tied for 3rd |
| 5 | 3.4% | Craig Kanada | –7 | tied for 5th |
| 6 | 3.0% | Davis Love III | –7 | tied for 5th |
| 7 | 2.5% | Mark Hensby | –7 | tied for 5th |
| 8 | 1.8% | Vijay Singh | –5 | tied for 11th |
| 9 | 1.5% | Aaron Baddeley | –5 | tied for 11th |
| 10 | 1.0% | Jesper Parnevik | –4 | tied for 18th |
| 10 | 1.0% | Arjun Atwal | –2 | tied for 40th |
| 10 | 1.0% | Kirk Triplett | –2 | tied for 40th |
| 10 | 1.0% | Justin Leonard | –1 | tied for 57th |

This is pretty much what you'd expect, but I'm surprised that Justin Leonard is given as good a chance to win as Jesper Parnevik. Parnevik is three strokes ahead of Leonard, and has only 17 players to pass. Leonard has to make up 11 strokes in two rounds, and jump over 56 other players on the way up.

You'd expect that Leonard's equal probability of winning, despite his mediocre score so far, would mean he's a better golfer than Parnevik. It doesn't seem that way. I don't follow golf much, but Leonard has a "World Golf Rating" (whatever that is) of 176th, while Parnevik is 107th. So that can't be it.

What is it, then? According to the Golf Channel article and video, the ranking takes into account such other factors as:

-- who played well on the course in previous rounds of this tournament
-- how players have done on this course in the past
-- how players have done on *specific holes* of this course in the past.

So I'm wondering whether Win Zone is reading too much into the small samples of previous player/course/hole results. There's no way to tell, because they don't give details of their calculations or weightings. But I'd be wary of any statistic that considers whether a player has a "hot hand," in light of the fact that studies have generally been unable to find such an effect in other sports.

Aside from that, does Win Zone work?

Well, there's really no way to know; Win Zone is new, and there's not enough data to analyze. The Golf Channel's arguments in its favor aren't really all that relevant. For instance, they tell us that Win Zone gives a better chance of picking the winner than just the current leaderboard. But that's not much of an achievement – in the example in the first paragraph, it would be obvious to any fan that Tiger had a better chance of winning than his no-name opponent, and we wouldn't need a fancy methodology to tell us that.
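Once enough tournaments have accumulated, one conventional way to test forecasts like these would be a scoring rule such as the Brier score, which rewards well-calibrated probabilities rather than just naming the winner. The sketch below is only an illustration of that idea: the function name is mine, and the forecasts and winners are invented placeholders, not real Win Zone output.

```python
def brier_score(forecasts, winners):
    """Mean squared gap between forecast probabilities and 0/1 outcomes.

    forecasts: list of dicts mapping player -> forecast win probability.
    winners:   list of the actual winner for each tournament.
    Lower is better; 0.0 means every forecast was perfect.
    """
    total = 0.0
    count = 0
    for probs, winner in zip(forecasts, winners):
        for player, p in probs.items():
            outcome = 1.0 if player == winner else 0.0
            total += (p - outcome) ** 2
            count += 1
    return total / count

# Hypothetical data: two tournaments, three players each.
model_forecasts = [
    {"A": 0.6, "B": 0.3, "C": 0.1},
    {"A": 0.2, "B": 0.5, "C": 0.3},
]
# Naive baseline: everyone gets the same chance.
naive_forecasts = [
    {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3},
    {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3},
]
actual_winners = ["A", "C"]

print("model:", brier_score(model_forecasts, actual_winners))
print("naive:", brier_score(naive_forecasts, actual_winners))
```

The useful comparison would be Win Zone's score against a sensible baseline, such as betting-market odds, over many tournaments. A uniform guess is the easiest baseline to beat, just as the raw leaderboard is.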

(Also, I disagree with some of the explanations in the video – for instance, they argue that a Win Zone probability of 50% is a "milestone," because at that point "the odds are with you." To which I say, "so?" Why should 50.1% be that much more important than 49.9%?)

But one reasonable way to check the system is to compare it to "market odds" from tradesports.com:

| Player | Market odds | Win Zone |
|---|---|---|
| Furyk | 31.8 to 34.3% | 35.7% |
| Mickelson | 33.2 to 35.5% | 33.0% |
| Love III | 3.8 to 6.7% | 3.0% |
| Singh | 1.9 to 4.3% | 1.8% |

Of course, the bettors' actions could be influenced somewhat by Win Zone – but betting markets tend to be pretty smart, and I trust their estimates to be hard to beat.
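For anyone who wants to put numbers on the comparison, here is a small sketch that measures each Win Zone figure against the midpoint of the quoted market range. Using the midpoint as the market's "best guess" is my assumption, as are the function and field names. The figures themselves are the ones quoted above.

```python
# (low end of market range %, high end %, Win Zone %), as quoted in this post.
quotes = {
    "Furyk": (31.8, 34.3, 35.7),
    "Mickelson": (33.2, 35.5, 33.0),
    "Love III": (3.8, 6.7, 3.0),
    "Singh": (1.9, 4.3, 1.8),
}

def compare_to_market(quotes):
    """Return midpoint, gap (Win Zone minus midpoint), and an in-range flag."""
    rows = {}
    for name, (low, high, win_zone) in quotes.items():
        midpoint = (low + high) / 2
        rows[name] = {
            "midpoint": midpoint,
            "gap": win_zone - midpoint,
            "inside_range": low <= win_zone <= high,
        }
    return rows

rows = compare_to_market(quotes)
for name, row in rows.items():
    print(f"{name}: midpoint {row['midpoint']:.2f}%, "
          f"gap {row['gap']:+.2f} points, inside range: {row['inside_range']}")
```

On these four players, every Win Zone number falls a little outside the quoted range, but never by more than about a point and a half, and never more than three points from the midpoint.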

So it seems to me that Win Zone would give you a reasonably accurate rundown of the probabilities, at least for the favorites. I do wish they had given more details of their system, but even without those details, their win probabilities are a useful addition to the regular leaderboard stats.

(Thanks to John Matthew IV for the pointer.)

