An economist predicts the Olympic medal standings
Daniel Johnson is an economics professor who, according to Forbes magazine, makes "remarkably accurate" predictions on how many Olympic medals each country will win. But I'm not sure, based on the description given, that the predictions are all that remarkable.
From the article, it sounds like what Johnson is doing is running some kind of regression, on "per-capita income, the nation's population, its political structure, its climate and the home-field advantage for hosting the Games or living nearby." He doesn't consider anything specific about the sports or athletes.
How accurate are Johnson's predictions? I'm not really sure. Forbes says,
"Over the past five Olympics, from the 2000 Summer Games in Sydney through the 2008 Summer Games in Beijing, Johnson's model demonstrated 94% accuracy between predicted and actual national medal counts. For gold medal wins, the correlation is 87%."
What does that mean? From the word "correlation," my guess is that those numbers are the correlation coefficient, or "r". But an r of .94 doesn't mean that the predictions are 94% accurate. It just means that the actual medal counts fall close to the *best fit straight line* through the predictions, and that line doesn't have to be the one where prediction equals actual. It's possible to be wrong, perhaps badly wrong, in every guess, but still have an r of 1, which means 100% correlation. For instance, if every estimate sits twice as far from the average as the actual result does, so that you overpredict every above-average country and underpredict every below-average one, you'll get a perfect correlation, but really crappy guesses. Here's an example of how that might happen:
Country A: estimate 80, actual 65
Country B: estimate 60, actual 55
Country C: estimate 40, actual 45
Country D: estimate 20, actual 35
Regressing actual on estimate (the variable after "on" is the predictor, though the correlation comes out the same either way) gives a 100% correlation, but the actual guesses aren't spectacular in the least: every one of them is off by 5 or 15 medals.
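To check the four-country example above, here's a minimal Python sketch. It uses only the numbers from the example; the helper name `pearson_r` is mine, and this is just the textbook correlation formula, not anything taken from Johnson's paper.

```python
import math

# Numbers from the four-country example (Countries A through D).
estimates = [80, 60, 40, 20]
actuals = [65, 55, 45, 35]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson_r(estimates, actuals)

# Average size of the miss, in medals, ignoring direction.
mean_abs_error = sum(abs(e - a) for e, a in zip(estimates, actuals)) / len(estimates)

print(f"r = {r:.2f}")
print(f"mean absolute error = {mean_abs_error:.1f} medals")
```

The correlation comes out perfect because every estimate equals 2 × actual − 50, which is an exact straight line. The average miss is still 10 medals per country.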
Anyway, it might be some other method that Johnson uses to compute the accuracy percentage, but it's hard to evaluate the claims without an explanation.
(UPDATE: as this blog post was going to press, I found Johnson's website, which confirms that it *is* correlation. It still could be that it's some kind of method that doesn't have the flaw of my example above. The site contains a media release, but no actual copy of the paper, which was published in "Social Science Quarterly" in December, 2004.)
More importantly, you can't tell how impressive a set of predictions is without something to compare it to. At Forbes, commenter "Doubter" points this out, and tries using the results of the previous Olympics to predict the current one. For the top five countries, he gets an 85% accuracy rating, and correctly points out that "include a bunch of countries with stable medal counts (Jamaica, Japan, Nigeria, Kenya, most European countries) and I am sure it gets much better."
I'm pretty confident that if you were to just use a weighted average of previous Olympics results and adjust for home field advantage, you'd come pretty close to what Mr. Johnson was able to do. Forbes should have realized that the results probably aren't "remarkably accurate" -- just "accurate".
Also confusing is the estimate of home field advantage. There were no results given for the Winter Olympics, but for the Summer Games, the host team "typically garners 25 additional medals compared with its expected performance, 12 of them gold." It doesn't really make sense that the home field advantage should be a fixed number of medals. Shouldn't it be a percentage increase? Canada won 11 medals in 1976 in Montreal, none of them gold. That was a few more medals than usual, probably because of home field. Should they really have been expected to win minus 14 medals in 1988 in Seoul, of which minus 12 would be gold? Or, on the other hand, were they just unlucky in Montreal, where a 25-medal bonus says they should have won about 30, given that they were in the single digits in 1968 and 1972?
Or, if Pakistan were to host the Olympics, would you really expect them to jump from (say) 1 medal to 26?
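The arithmetic behind those two objections is simple enough to spell out. This is only a sketch of what a flat bonus implies, using the figures quoted above; the variable names are mine, and the Pakistan baseline of 1 medal is the "(say)" guess from the text, not a real count.

```python
# Flat host bonus quoted by Forbes for the Summer Games.
HOST_BONUS_TOTAL = 25
HOST_BONUS_GOLD = 12

# Canada as host in Montreal, 1976 (figures from the text above).
canada_1976_total = 11
canada_1976_gold = 0

# If the bonus were a fixed number, stripping it out gives the
# "expected" performance without home field, which goes negative.
implied_away_total = canada_1976_total - HOST_BONUS_TOTAL
implied_away_gold = canada_1976_gold - HOST_BONUS_GOLD

# Hypothetical baseline for Pakistan: the "(say) 1 medal" guess.
pakistan_usual_total = 1
pakistan_as_host_total = pakistan_usual_total + HOST_BONUS_TOTAL

print(implied_away_total, implied_away_gold)  # -14 -12
print(pakistan_as_host_total)                 # 26
```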
Oh, and one more thing: in his 2010 predictions, Johnson has the top 13 countries winning 250 medals, but only 57 golds. Overall, golds are 33.3% of medals, but for those 13 countries, Johnson has them winning only 22.8% golds. How come? An eyeballing of the 2006 chart shows about 1/3 golds for those countries then ... I wonder why the drop?
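For what it's worth, here's the size of that gap as a quick Python calculation. It uses only the two numbers quoted above, plus the assumption (mine) that those 13 countries would otherwise take golds at the overall one-in-three rate.

```python
# Johnson's 2010 predictions for the top 13 countries, as quoted above.
predicted_total = 250
predicted_golds = 57

predicted_gold_share = predicted_golds / predicted_total

# Assumption: at the overall rate, one medal in three is gold.
golds_at_one_third = predicted_total / 3
shortfall = golds_at_one_third - predicted_golds

print(f"predicted gold share: {predicted_gold_share:.1%}")
print(f"golds at a one-third share: {golds_at_one_third:.0f}")
print(f"shortfall: about {shortfall:.0f} golds")
```

So at a one-third share, those 13 countries would be down for about 83 golds rather than 57, a gap of roughly 26.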