You can't forecast outcomes that are random
Predictions are often wrong. In an article in the Wall Street Journal last month, "Numbers Guy" Carl Bialik points out a few that went awry. Two years ago, for instance, a government energy agency predicted that the price of oil would be between $75 and $85 in 2008. In reality, it started out the year close to $100, ran up past $140 in July, and dropped back below $40 by the end of the year. Bialik writes, "winging darts at numbers on a board might have been more accurate."
It's easy to make fun of prognosticators when they get this stuff wrong. But let's not be too hasty. The fact is, the things that are most worth predicting are the things that are most unpredictable. If you want to know what time the sun will rise tomorrow morning, any competent astronomer can give you a 100% accurate prediction. But what would be the point?
The price of oil varies so much because there are so many factors that influence it: wars, foreign government policies, consumer behavior, US election results, technological advances, natural disasters, and so on. These things are random. And they are very, very complex, most of them being the result of human thought and action.
Still, shouldn't some people be more skilled at making those predictions than others? Absolutely. Tancred Lidderdale, the economist quoted in Bialik's article, has an excellent understanding of the factors that affect the price of oil, much better than mine. So what's wrong with evaluating his predictions after the fact, to see if he's any good?
The problem is that no matter how much you know about the price of oil, it's random enough that the spread of outcomes is really, really wide: much wider than the effects of any knowledge you bring to the problem.
Suppose that on the basis of Miguel Tejada's career, everyone thinks he should hit .290 next year. But suppose Bob, who's a big fan of Tejada and follows his plate appearances closely, has noticed something about his performance and thinks differently. Maybe it's some detailed observation that Tejada swings a certain way, and other players with the same swing have declined more in their thirties than average. So Bob figures Tejada should hit only about .286.
Suppose Bob is absolutely right, and figuring that out was an act of staggering sabermetric genius. His estimate of .286 is correct, and the .290 estimates are all wrong. Bob is literally the only one in the world whose estimate is correct.
But in practice, how do you prove that? The standard deviation of batting average over 500 AB is about 20 points: so even with .286 being correct, there's still a 46% chance that Tejada will hit closer to .290 than .286 next year. There's actually about a 1 in 3 chance that Tejada's average will be below .266 or above .306. For practical purposes, it's impossible to evaluate the two predictions from this single sample. Even if Bob is omniscient, knowing everything possible about Tejada's talent, health, and diet, it's going to take a lot of evidence to prove that he's a better estimator than the mob, so long as the results of individual at-bats are random.
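Those three numbers are easy to verify with the binomial model and its normal approximation. Here's a minimal sketch (my own, not from the article), assuming a true .286 hitter over 500 at-bats:

```python
from math import sqrt
from statistics import NormalDist

# True talent: a .286 hitter, over a 500 at-bat season.
p, n = 0.286, 500

# Standard deviation of a season's batting average (binomial approximation):
sd = sqrt(p * (1 - p) / n)   # about 0.020, i.e. the "20 points" in the text

season = NormalDist(mu=p, sigma=sd)

# Chance the observed average lands closer to .290 than to .286:
# that happens whenever it comes in above the midpoint, .288.
closer_to_290 = 1 - season.cdf(0.288)   # about 0.46

# Chance of finishing below .266 or above .306, one sd either side:
outside_one_sd = season.cdf(0.266) + (1 - season.cdf(0.306))   # about 1 in 3
```

In other words, Bob's 4 points of insight are buried under 20 points of noise.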
The problem is the small sample size: over 1,000 predictions, or 1,000,000, Bob is going to have a better record than everyone else. But who makes a million predictions, and who keeps track of them to evaluate them afterwards? And even if we do this a reasonable number of times, like 100, Bob still isn't assured of beating me. If his chance of beating me on any single prediction is 54%, then, if we predict 100 times each, I still have an almost 21% chance of coming out the winner.
That is: an omniscient expert can beat a reasonably-informed layman only about 79% of the time. And that's after 100 trials each, 100 trials where the predictor actually has a significant edge in knowledge or analysis. In real life, if you get only one trial, and you're not even close to omniscient, and the prediction you're making may not be the one in which you have the most confidence, the public's expectations of you shouldn't be very high. Not because you're ignorant, but because life is just too random.
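The 21% figure can be checked directly with the binomial distribution. A quick sketch (my own setup, not the author's): Bob's prediction beats mine on any single trial with probability 0.54; over 100 trials, I come out the winner if he wins fewer than half the head-to-head comparisons, counting a 50-50 tie as half a win each.

```python
from math import comb

n, p = 100, 0.54   # 100 predictions; Bob's is closer on each with probability 54%

def binom_pmf(k: int) -> float:
    """Probability Bob wins exactly k of the n head-to-head comparisons."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability the layman comes out ahead: Bob wins fewer than 50 of 100,
# plus half the probability of an exact 50-50 tie.
layman_wins = sum(binom_pmf(k) for k in range(50)) + 0.5 * binom_pmf(50)

print(round(layman_wins, 2))   # about 0.21, so the expert wins only about 79%
```

Even a permanent, genuine 54-46 edge takes hundreds of trials to show up reliably in the win-loss record.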
Of course, this is an arbitrary example, with more randomness (20 points) than knowledge (4 points). But isn't it roughly the same situation for the price of oil? The randomness in the economy is just huge. Part of the reason oil went down last year is the recession. The recession happened because of the credit crisis. And very few people foresaw the credit crisis, including people who had thousands, or millions, or billions of dollars on the line. For a government economist to be omniscient, he has to be omniscient about mortgage finance, and about the government's and public's reaction to every crisis that might possibly occur. That's asking a lot, isn't it? To an energy economist, the state of mortgage finance has to be taken as random.
Because life is random, and the price of oil is very sensitive to the randomness of human-caused shocks, you can't expect a single point estimate of the price of oil to be 95% accurate within $1, or even $5. An estimate that precise is impossible, beyond the scope of human capability, and probably beyond the scope of the most powerful computers that could be imagined. An honest and competent forecaster will tell you that the best he can do is give you a *distribution* for the future price of oil: maybe there's a 60% chance it will be between $60 and $110, a 10% chance it will be below $60, and maybe a 5% chance it'll go over $200 (if there's a major war in the Middle East, say), and so on. That's not something the newspapers are keen to report on -- it's hard to put in a headline, and it's harder for readers to understand.
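A distribution forecast like that is nothing fancier than a table of ranges and probabilities. Here's a sketch using the hypothetical figures above; the $110-to-$200 bucket and its 25% are my own assumption, filled in so the probabilities cover all outcomes:

```python
# A hypothetical oil-price forecast stated as a distribution, not a point.
# The "$110 to $200" bucket is assumed here to absorb the leftover probability.
forecast = {
    "below $60":    0.10,
    "$60 to $110":  0.60,
    "$110 to $200": 0.25,   # assumed catch-all for the remaining probability
    "over $200":    0.05,   # e.g., a major war in the Middle East
}

# A sanity check any honest forecaster owes you: probabilities over
# mutually exclusive outcomes must sum to 1.
assert abs(sum(forecast.values()) - 1.0) < 1e-9
```

Stated this way, a headline number like "$75 to $85" is just one summary of the table, and "wrong" has to mean the probabilities were badly calibrated, not that a single number missed.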
What Mr. Lidderdale's agency was probably saying was, "we have our best guess at a probability distribution for what the price of oil will be next year. Its mean is in the $75 to $85 range." If that phrasing makes journalists uncomfortable, fine. But that doesn't change the fact that it's the best anybody can do. And it doesn't change the fact that you can't decide how good a predictor is on the basis of one, two, or even a hundred point estimates. You need a LOT of data. And if an outlier happens, all evaluations are off. I'd bet that anyone who predicted, back in 2007, that oil would jump to $140 and then drop back to $37, is a kook, not an expert. What happened in 2008 was something of an outlier, random, unpredictable, and unknowable. Anyone who came close was probably just drop-dead lucky.