Thursday, April 04, 2013

Accurate prediction and the speed of light II

Last post, I argued that there's a natural limit to how good a prediction can be.  If you try to forecast an MLB team's season record, the best you can hope for, in the long run, is a standard error of 6.4 games.
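That 6.4, by the way, is just the binomial luck in a 162-game season for a roughly .500 team:

    \sqrt{162 \times 0.5 \times 0.5} \approx 6.4 \text{ wins}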

But ... even if that's true, can't you at least tell the good forecasters from the bad?

Suppose you checked 20 forecasts, and you ranked them at the end of the year.  The guy who finishes first should still get some credit, right?

Well ... not necessarily.

Suppose that, typically, sportswriters are pretty good estimators of talent.  Maybe they're within 3 games, so if the God's-eye view is that the Brewers should go 81-81, most of the forecasters will guess between 78 and 84.  (The true spread of talent is roughly 9 games.  So we're assuming experts can spot 2/3 of the differences between teams, in a certain sense.)

However, some forecasters are particularly astute, and they're within 2 games.  Others aren't very good, and they're within, say, 5 games.

What happens?  Well, by my simulation, the good forecaster (every team off by exactly two games) should be expected to finish with an average discrepancy (standard error) of 6.7 wins.  The lousy forecaster (every team off by exactly five) will finish with a discrepancy of 8.1 wins.

Not much difference, right?  One forecaster was actually two and a half times worse than the other in the only part of the task under his control (estimating the talent).  But his results were only about 20 percent worse.

And he might still "win" the competition!  Again by the simulation, the inferior forecaster will have a lower error more than 35 percent of the time.  Remember, that's when one guy's talent estimates were two and a half times worse than the other's!
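If you want to see the mechanics, here's a minimal Python sketch of that kind of simulation.  It's not my exact code, and the details are assumptions (talent spread of about 9 games around 81 wins, binomial luck over 162 games, forecasters missing the true talent by exactly 2 or 5 wins in a random direction), but it lands in the same neighborhood:

    import numpy as np

    rng = np.random.default_rng(0)

    GAMES = 162
    N = 100_000   # simulated team-seasons

    # True talent in wins: centred on 81, with a spread of roughly 9 games.
    talent = rng.normal(81.0, 9.0, N)

    # Actual record = talent plus binomial luck (SD of about 6.4 wins).
    actual = rng.binomial(GAMES, np.clip(talent / GAMES, 0.0, 1.0))

    def forecast_misses(talent_error):
        """Forecast errors when a forecaster misjudges true talent by
        exactly `talent_error` wins, in a random direction each time."""
        forecast = talent + rng.choice([-1.0, 1.0], N) * talent_error
        return forecast - actual

    good = forecast_misses(2)   # the astute forecaster
    bad = forecast_misses(5)    # the lousy forecaster

    print(np.sqrt(np.mean(good ** 2)))          # about 6.6, close to the 6.7 above
    print(np.sqrt(np.mean(bad ** 2)))           # about 8.1
    print(np.mean(np.abs(bad) < np.abs(good)))  # about .35: the lousy guy comes out ahead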

-----

That's with two forecasters.  I reran the simulation with nine forecasters, ranging from 0 games off to 8 games off in talent appraisal.  The best forecaster won 17.5 percent of the time.  But the worst forecaster -- who misestimated every team's talent by eight games -- still won 6.5 percent of the seasons!  

And eight games ... well, if you estimate every team's talent will be 81-81, you'll be off by an average of around 9 games.  So, the guy who almost can't tell one team from another ... he still finished first one season in 16.

Why does that happen?  Because the luck differences overwhelm most of the skill differences.  The standard error from luck is 6.4 games.  The worst predictor adds on an error of 8.  The square root of (6.4 squared plus 8 squared) equals 10.2.
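In symbols: because the talent misjudgment and the luck are independent, the two errors combine as the square root of the sum of the squares:

    \sqrt{6.4^2 + 8^2} = \sqrt{40.96 + 64} \approx 10.2 \text{ games}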

So: the best predictor -- in fact, the best possible predictor -- winds up at 6.4.  The worst in the simulation winds up at 10.2.  That's not that much worse.  In fact, it's "not that much worse" enough that he still wins 6.5 percent of the seasons.

-----

And ... what are you going to do if the winner winds up under the natural limit of 6.4?  It seems weird that you'd award a prediction trophy for something that MUST have been luck-aided.  Yes, the winner was likeliest the best judge of talent (in the Bayesian sense), but ... it would still be weird.

It's like ... suppose you have high-school track tryouts, and you time the athletes independently with stopwatches.  The judges aren't trained, so they're a bit random.  Sometimes they start the watch too early or too late, or stop it too early or too late.  Sometimes, their view of the finish line is obscured and they have to guess a bit.

Every runner takes his turn.  When it's over, you discover that the fastest guy ran the 100 metres in ... 9.4 seconds.

Since the world record is 9.58 ... well, you KNOW this high-school kid didn't get 9.4.  He was obviously just lucky, lucky that the judge's stopwatch deficiencies worked in his favor that time.

That's what it's like when someone makes an almost perfect prediction of the MLB season.  It's not possible to truly be that skilled.  



-----

(Correction above: "team" changed to "team's talent", 4pm.)
