Saturday, February 23, 2008

Straight-up picks can't distinguish good pundits from bad

Experts are no better at picking winners of NFL games than a simple algorithm, "Numbers Guy" Carl Bialik reports in his latest Wall Street Journal blog posting.

The easy method is dubbed the Isaacson-Tarbell postulate, after the two readers who proposed it. Pick the team with the better record; if the two teams have the same record, choose the home team. According to Gregg Easterbrook, no pundit was able to beat Isaacson-Tarbell. Only one was able to tie.
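To make the rule concrete, here is a minimal sketch of it in Python. The `Team` class, the function name, and the choice to count a tie as half a win are my own assumptions for illustration; the postulate as stated says only "better record" and "home team."

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class Team:
    # Hypothetical container for a team's record; not from the original post.
    name: str
    wins: int
    losses: int
    ties: int = 0

    def win_pct(self) -> Fraction:
        """Winning percentage as an exact fraction.

        Assumption: a tie counts as half a win. A team with no games
        played is treated as having a .000 record.
        """
        games = self.wins + self.losses + self.ties
        if games == 0:
            return Fraction(0)
        return Fraction(2 * self.wins + self.ties, 2 * games)


def isaacson_tarbell_pick(home: Team, away: Team) -> Team:
    """Pick the team with the better record; if records are equal, pick the home team."""
    if away.win_pct() > home.win_pct():
        return away
    return home
```

Calling `isaacson_tarbell_pick(Team("Home", 5, 5), Team("Away", 8, 2))` returns the away team, since it has the better record. Two teams at 3-1 and 6-2 have the same record, so the home team gets the pick.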

Easterbrook writes,

"You don’t need incredible insider information, you don’t need to spend hours in fevered contemplation … Whatever you do, don't think!"

While normally I love to join the "Super Crunchers"-esque refrain that formulas often know better than "experts," I don't think it really applies in this case, where you have to pick winners straight-up.

NFL matchups are often lopsided. If an .800 team is playing a .300 team, it's obvious that you have to pick the .800 team. No matter how expert you are, no matter how much insider knowledge you have, you simply aren't going to be able to know that the .300 team is better, because it *isn't* better. The same is true for a .700/.400 matchup, or even a .650/.450 matchup. You may be more expert than the rest of the world, but the rest of the world isn't dumb. Everyone picks the .650 team over the .450 team, so your insider knowledge doesn't do you any good. The best you can do is to *tie* the rest of the dumb-but-not-that-dumb punditocracy.

It's only when it's a close matchup that expertise can come into play. Suppose that two evenly-matched teams are going at it, and most experts think team A has a 52% chance of winning. So, they pick team A. For you to outpredict those guys, you have to have insider knowledge, and that knowledge has to be in the direction that leads you to believe that team A has *less than a 50% chance*. That's the only way you'll predict team B will win, and the only way you'll beat the rest of the (A-picking) experts.

How many close matchups are there in a season? Maybe one or two a week? Suppose there are 30 close games a season. In those, the best predictor might be more accurate than the pack, say, 50% of the time (to be generous). That's 15 games. In those 15 games, 7.5 will go the "wrong" way -- that is, the additional expertise will confirm the pack's pick, not contradict it. That leaves 7.5 games. Again to be generous, call it 8.

So, in eight games a season, the expert predicts a different team than the pack. But those eight games are already pretty close, almost 50/50. It's probably the case that the pack thinks they have a .520 pick, but they only have a .480 pick. So the expert has a .040 edge for eight games a year. Over 256 games, the best, most expert pundit has an advantage of about a third of a game.
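For anyone who wants to check the back-of-envelope arithmetic, here is a short sketch. The function and parameter names are mine, and every input is one of the guesses above (30 close games, 50% and 50%, a .520 pick that is really .480), not measured data.

```python
def expected_extra_wins(disagreements: float, edge_per_game: float) -> float:
    """Expected extra correct picks per season for the expert over the pack.

    disagreements: games where the expert picks against the consensus
    edge_per_game: expert's win probability minus the pack's in those games
    """
    return disagreements * edge_per_game


# Guesses taken from the text above, not from data.
close_games = 30                  # close matchups per season
share_expert_knows_more = 0.5     # games where the expert has better information
share_contradicting_pack = 0.5    # of those, games where it flips the pick

informed_games = close_games * share_expert_knows_more        # 15 games
disagreements = informed_games * share_contradicting_pack     # 7.5 games

# The pack thinks it has a .520 pick but really has a .480 pick.
edge = 0.520 - 0.480

unrounded = expected_extra_wins(disagreements, edge)   # using 7.5 games
generous = expected_extra_wins(8, edge)                # rounding up to 8 games

print(f"7.5 disagreements: {unrounded:.2f} extra wins per season")
print(f"8 disagreements:   {generous:.2f} extra wins per season")
```

Either way, the edge works out to roughly a third of a game over a 256-game season, which is far too small to show up in one year of straight-up picks.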

Is it any wonder you can't find out who the experts are by picking straight-up?

If you want to evaluate the experts, just have them pick against the spread. Now *all* the games are close to 50/50, not just 30 of them. Now, the expert has a fighting chance to emerge from the pack.

Most of the touts I've heard of *do* pick against the spread. I haven't seen how they've performed, long-term, but I bet most of them are pretty close to 50%. And, if so, *then* you can conclude that those so-called insiders can't beat a simple algorithm.

But looking at straight-up picks? That's like trying to find the best mathematician in a crowd by asking them what 6 times 7 is. Under those circumstances, the Ph.D. will do about as well as a sixth-grader. It doesn't mean the guy with the doctorate in mathematics doesn’t know more than the eleven-year-old. It just means you asked the wrong question.



At Sunday, February 24, 2008 4:38:00 PM, Blogger Brian Burke said...

I think picking straight-up NFL winners is both easier and harder than people think.

I disagree about a couple things. First, in defense of the computer models, the vast majority of them aren't done well. There is an enormous amount of data available on NFL teams, and people tend to take the kitchen-sink approach to prediction models. I started out doing that myself. But if you can identify what part of team performance is repeatable skill and what is due to randomness particular to non-repeating circumstances, you can build a really accurate model.

I also disagree that picking against the spread is a good way to grade pundits. The actual final point difference of a game has as much to do with the random circumstances of "trash time" as with any true difference in team ability. A better alternative may be to have experts weight their confidence in each game as a way to compare their true knowledge.

The example you cited about the .800 team facing a .300 team can be misleading. The true .800 team vs. true .300 team is actually fairly rare. As you've previously pointed out, the .800 team may just be a .600 team that's been a little lucky, and the .300 team could really be a .500 team that's been a little unlucky. There are many more "true" .500 and .600 teams than .300 and .800 teams, so this kind of match-up is far more common than you'd expect. And if the ".500" team has home field advantage, we're really talking about a near 50/50 match-up. Although the apparent ".800" team may still be the true favorite, a good expert can recognize games like this and set his confidence levels appropriately.

One other point. Humans making predictions are often in contests with several others (like the ESPN experts). By picking the favorite in every game, you are guaranteed to come in first...over a several-year contest. But in a single-season contest, you'd be guaranteed to come in 2nd or 3rd to the guy that got a little lucky.

The best strategy is to selectively pick some upsets and hope to be that lucky guy. Plus, toward the end of the year, players that are several games behind are forced to aggressively pick more and more upsets hoping to catch up.

Both of those factors have the effect of reducing the overall accuracy of the human experts. The comparison between math models and experts can often be unfair.

I have several more thoughts but I'll shut up. Sorry for the long comment--you're in my wheelhouse here.

At Monday, February 25, 2008 12:51:00 AM, Blogger Phil Birnbaum said...

Hi, Brian,

That's a good point about regression to the mean, and how lopsided matchups are rarer than I assumed. But still, to beat the averages on picking games straight up, you need to (a) find a game where the consensus is team A, but (b) where the actual favorite should be team B. So you need the game to straddle .500 by a small amount. And then, you still need to get lucky and have the .520 (say) true favorite beat the .480 true underdog.

As for picking against the spread: I agree with you about the final spread being somewhat random because of trash time. But that doesn't matter. As long as the median spread correlates with the winning chances, that's all you need. For instance, if .650 teams win, on average, by +7, and .700 teams win by +9, then if you spot a .700 team that the mob thinks is .650, you give the 7 points and you'll still beat the spread more than half the time. Won't you?

And I agree with you about picking some underdogs to try to win a contest, but I'm assuming that every pundit is trying to maximize his own record, regardless of whether he necessarily beats others. This is perhaps unrealistic for newspaper touts, but accurate for gamblers.

And if you have more comments, I'm happy to hear them!

