Accurate prediction and the speed of light
This is the time of year when you see lots of baseball prediction stuff ... how many games teams will win, who will finish in first place, how the post-season brackets will go, and so on.
And I hate these predictions when they're taken seriously, because predicting outcomes with a high degree of accuracy is impossible. All you can do is guess at the basic probabilities. After that, it's all luck.
Suppose that you're able to figure that a certain team -- Milwaukee, say -- is actually a .500 team in terms of talent. Obviously, there's going to be a certain amount of error in your assessment, since it's impossible to know for sure -- but, for the sake of argument, let's say you just know.
Then, subject to certain nitpicks (which I'll leave in the comments), you can consider the Brewers season like 162 coin tosses. The most likely outcome is 81 heads and 81 tails, but it's probably going to be different just because of luck. Statistically, you can calculate that the standard error is around 6.4 wins. That means that, around 1/3 of the time, your estimate will be off by more than around 6.4 wins either way. And, around 1/20 of the time, your estimate will be off by more than 12.8 wins.
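For anyone who wants to check the arithmetic, here is a minimal sketch of the coin-toss model, assuming 162 independent games at exactly 50/50. The names (`sd`, `tail`) are mine, purely for illustration. It computes the standard error from the binomial formula sqrt(n * p * (1 - p)), then the exact chance of missing by more than one or two standard errors.

```python
import math

GAMES = 162
P_WIN = 0.5

# Binomial standard deviation of the season win total.
sd = math.sqrt(GAMES * P_WIN * (1 - P_WIN))

# Exact probability of each possible win total for a true .500 team.
pmf = [math.comb(GAMES, k) / 2 ** GAMES for k in range(GAMES + 1)]

def tail(threshold: float) -> float:
    """Chance the actual win total misses 81 by more than `threshold` wins."""
    mean = GAMES * P_WIN
    return sum(p for k, p in enumerate(pmf) if abs(k - mean) > threshold)

print(f"standard error: {sd:.2f} wins")
print(f"off by more than {sd:.1f} wins: {tail(sd):.3f}")
print(f"off by more than {2 * sd:.1f} wins: {tail(2 * sd):.3f}")
```

Because win totals are whole numbers, the exact tail chances come out near, rather than exactly at, the familiar 1/3 and 1/20 from the normal curve.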
Suppose that, being rational, you predict 81-81. And, at the end of the season, the Brewers indeed wind up 81-81. You're a hero! But, you were lucky. The chance that an average team will go exactly 81-81 is ... well, I'm too lazy to calculate it, so I simulated it, and it's around 6.3 percent. You hit a 15:1 longshot.
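The 81-81 figure can be calculated directly or simulated. Here is a rough sketch of both, again assuming 162 independent fair coin tosses. This is not the simulation used for the post, just one way it could be done. The function name `simulate` and the seed are arbitrary choices.

```python
import math
import random

GAMES = 162
TARGET = GAMES // 2  # 81 wins

# Exact: count the ways to pick which 81 games are wins,
# out of 2**162 equally likely win/loss sequences.
exact = math.comb(GAMES, TARGET) / 2 ** GAMES

def simulate(seasons: int, seed: int = 0) -> float:
    """Fraction of simulated seasons in which a .500 team goes exactly 81-81."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(seasons):
        # 162 random bits stand in for 162 coin tosses; the 1-bits are the wins.
        wins = bin(rng.getrandbits(GAMES)).count("1")
        if wins == TARGET:
            hits += 1
    return hits / seasons

print(f"exact:     {exact:.4f}")
print(f"simulated: {simulate(100_000):.4f}")
```

The two numbers should agree to within simulation noise, which shrinks as you add seasons.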
-------
Basically, it's like a law of nature that it is impossible to regularly forecast team records with a standard error of less than around 6.4 wins. Not difficult, but *impossible*. It's impossible in the same sense as constructing a perpetual motion machine is impossible, or turning lead into gold on your kitchen stove is impossible, or accurately determining the temperature 100 years from today at 4:33 pm is impossible. No matter how much you know about the team, and the players, and the second baseman's diet, and the third baseman's mental state, and whether the right fielder is on PEDs ... the best you can do, in the long run, is a standard error of around 6.4 wins.
When forecasters have a contest and, after the season, one of them has "won" with a standard error of, say, 4.9 wins ... well, you may be impressed. But he was certainly at least partly lucky. He beat the natural limit of 6.4. He was better than perfect. You may think you're praising his forecasting acumen, but, really, you're implicitly praising his ability to influence coin tosses.
-------
As far as I'm concerned, this feature of randomness -- the existence of a "speed of light" limit to accuracy -- is so fundamental that it should be called "The First Law of Forecasting," or something. There is a natural limit that cannot be breached, and it usually comes much sooner than we expect.
The newspapers are full of writers and pundits who ignore this law, not just in sports, but in everything. They assume that if you're smart enough, and expert enough, you can accurately predict who's going to win tomorrow's game, or what the Dow Jones average will be next year, or what's going to happen in North Korea.
But you can't.
Labels: forecasting, randomness, statistics
8 Comments:
If you want to get nitpicky, you could say the 6.4 limit is an overestimate. Maybe the Brewers are sometimes .650 (home against a weak team) and sometimes .350 (road against a strong team). That would reduce the amount of luck. Maybe games aren't strictly independent, because of different starting pitchers, among other things.
Also, if you predict all MLB teams at once, the 30 results aren't independent. If one team outperforms, that means the other teams, on average, must underperform.
But even all that stuff would reduce the standard error only from 6.4 wins to ... guessing here, but maybe 6 wins. And, you'd have to add in other stuff that would *increase* the error, like injuries and unexpected in-season talent changes.
If you want to mentally rewrite the post using 6.0 instead of 6.4, I won't argue with you. I still think 6.4 is reasonable, though.
One other thing: 6.4 games is the *root mean square* error. If you just want the *mean absolute* error -- as in, "on average, how many games will you be off?" -- it's only around 5. (The mean absolute error is never larger than the root mean square error.)
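To see the gap between the two error measures, here is a small sketch that computes both exactly for the coin-toss season. It assumes the same 162 independent 50/50 games as the post. The variable names are just for illustration.

```python
import math

GAMES = 162
MEAN = GAMES / 2  # 81 expected wins

# Exact probability of each possible win total for a true .500 team.
pmf = [math.comb(GAMES, k) / 2 ** GAMES for k in range(GAMES + 1)]

# Root mean square error: square the misses, average them, take the square root.
rms_error = math.sqrt(sum(p * (k - MEAN) ** 2 for k, p in enumerate(pmf)))

# Mean absolute error: average the size of the miss directly.
mean_abs_error = sum(p * abs(k - MEAN) for k, p in enumerate(pmf))

print(f"root mean square error: {rms_error:.2f} wins")
print(f"mean absolute error:    {mean_abs_error:.2f} wins")
```

Squaring gives the big misses extra weight, which is why the first number comes out larger.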
Doesn't the claim that this is a fundamental law assume that sports is actually governed by a random process?
You could apply the ideas to the sequence of "random" numbers that computers generate, and it would be true that you could never do better than guessing uniformly IF the computer was truly using a random process to generate the numbers. But in reality, computers use completely deterministic algorithms to generate their "random" numbers. If you drop the assumption that it's a random process, now you can build computer programs that can break the fundamental law, and these numbers are no longer cryptographically secure.
It seems like there are plenty of simplifying assumptions that are made in sabermetrics. For example, we may assume that all at-bats are independent. But suppose a player comes into a season nursing a shoulder injury, which causes his batting average early in the season to be lower than expected. A pundit who takes that information into account may be able to make a better prediction of his batting average than the sabermetric prediction.
What are your thoughts on these simplifying assumptions and how they relate to the fundamental limit?
Hi, Ben,
All you need for the limit to hold is for it to be impossible to predict the outcome of any single game. With that assumption, the rest of it follows mathematically. That is, if you can't predict one game better than 50/50, it MUST be true that you can't predict 162 games better than with an SD of 6.4.
The fundamental limit falls very slowly as you get better. If you can predict each game at 80 percent, rather than 50, the limit is around 5 games instead of 6.4. If you get to 90%, the limit is still around 3.8 games.
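Here is a quick sketch of how slowly the limit falls. It assumes every one of the 162 games can be called with the same probability p, which is a simplification, and `limit` is just an illustrative name.

```python
import math

GAMES = 162

def limit(p: float) -> float:
    """Best achievable standard error, in wins, if every game is a p / (1 - p) proposition."""
    return math.sqrt(GAMES * p * (1 - p))

for p in (0.5, 0.6, 0.7, 0.8, 0.9):
    print(f"predict each game at {p:.0%}: limit of {limit(p):.1f} wins")
```

The limit depends on p * (1 - p), which is nearly flat around 50 percent. That is why moving from 50 to 60 percent barely changes it.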
And, of course, nobody can predict 90% of games. If they could, bookmakers wouldn't exist.
WHY you can't predict 90% of games -- whether you call it "random processes" or otherwise -- is irrelevant.
Also: your example about understanding injuries can get you closer to the limit, but not past it. The limit of 6.4 assumes you know EVERYTHING about the underlying talent, including injuries and attitudes. If you don't know everything, then your expected error is going to be higher than 6.4, perhaps substantially higher.
That makes a lot of sense. Thanks for the explanation!
The N of 162 in the binomial distribution is what makes preseason baseball predictions such a crapshoot. Take NFL football and its much shorter 16-game season and...WHOA, even in football, given the same .500 example, it's STILL only a 19.6% chance. It's amazing how intuition and reality can be so far off. Great Post!
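The football number is easy to check against the baseball one. This is a minimal sketch assuming a true .500 team and independent games in both sports. The function name `chance_of_exact_500` is made up for the example.

```python
import math

def chance_of_exact_500(games: int) -> float:
    """Chance a true .500 team wins exactly half of an even number of games."""
    return math.comb(games, games // 2) / 2 ** games

for label, games in (("NFL, 16 games", 16), ("MLB, 162 games", 162)):
    print(f"{label}: {chance_of_exact_500(games):.1%}")
```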
Hey Phil,
Can you explain if Vegas beats the 6.4 on teams o/u for a season regularly, rarely, not at all?
If so, any thoughts on why?
10can't: I have no idea. My guess is, if you used the Vegas O/U as a prediction, the standard error would be somewhere around ... I dunno, 8 or 9?
That means that 2/3 of teams should wind up within 8 or 9 wins of the O/U number. If that's true, then ... I dunno, maybe 40 to 50% of teams would wind up closer than 6.4? I could do the calculation, but I'm too lazy.
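For what it's worth, the skipped calculation is short if you assume the misses are normally distributed. Here is a sketch. The SDs of 8 and 9 are just the guesses above, not measured figures, and `share_within` is an illustrative name.

```python
import math

def share_within(limit: float, sd: float) -> float:
    """Share of normally distributed errors with the given SD that land within +/- limit."""
    return math.erf(limit / (sd * math.sqrt(2)))

for sd in (8.0, 9.0):
    print(f"SD of {sd:.0f} wins: {share_within(6.4, sd):.1%} of teams within 6.4 wins")
```

Under those assumptions the share comes out a bit over half, somewhat higher than the 40 to 50% guess.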