## Friday, July 13, 2012

### Are economists bad at statistics?

(Warning: this is a boring "how to interpret a regression" post, not much sports.)

Felix Salmon comments on a paper that presented academic economists with the results of a hypothetical regression, and asked them several questions about the results.  It turned out that most of them got it right when they looked at a scatter plot of the raw data.  But when they were given traditional regression results, as produced by statistical software, they blew it.

Specifically, for three of the four questions, a majority of econometricians got them wrong when looking at only the regression results.

I'll translate one of the questions into baseball (using unrealistic numbers that I made up).

A regression finds that each point of OPS (that is, .001) is worth \$20,000 in salary.  The regression found an r-squared of 0.5.  In the data, salary had an SD of \$4 million, and OPS had an SD of .200.

1.  What is the minimum OPS for which a player has a 95% chance of earning more than \$10 million?

2.  What minimum OPS would give a player a 95% chance of earning more money than a player with an OPS of .600?

3.  Given that the confidence interval for salary-per-point-of-OPS is (\$19,500, \$20,500), if a player has an OPS of .800, what is the chance he will earn more than \$15.6 million (which is \$19,500 multiplied by 800)?

4.  If a player has an OPS of .800, what is the chance he will earn more than \$16 million (the point estimate)?

You should be able to figure all these out exactly, except #3 (which you can still estimate).

1.  The SD of salary was \$4 million.  The r-squared was 0.5.  So, after the regression, the SD of unexplained salary is around \$2.8 million (\$4 million divided by the square root of 2).

For a 95% chance, you need the expected salary to be about 2 SDs above \$10 million.  2 SDs above \$10 million is about \$15.7 million.  That translates into an OPS of about .783.

(Actually, I think you need only 1.65 SDs, because it's one-tailed, but never mind.)
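To check the arithmetic for #1, here is a minimal Python sketch. It assumes things the post only implies: that the fitted line runs through the origin (so expected salary is just \$20,000 times OPS points, which matches the \$12 million at .600 used in #2), and that the unexplained part of salary is normal with the same SD at every OPS. The names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

SALARY_SD = 4_000_000        # SD of salary in the data
R_SQUARED = 0.5
DOLLARS_PER_POINT = 20_000   # one point of OPS is .001
TARGET = 10_000_000

# SD of salary left unexplained by OPS (assumes constant spread around the line)
residual_sd = SALARY_SD * sqrt(1 - R_SQUARED)


def min_ops_points(target, z):
    """OPS points at which expected salary sits z residual SDs above target."""
    return (target + z * residual_sd) / DOLLARS_PER_POINT


z_rule_of_thumb = 2.0
z_one_tailed = NormalDist().inv_cdf(0.95)   # about 1.645

print(f"unexplained SD: ${residual_sd:,.0f}")
for z in (z_rule_of_thumb, z_one_tailed):
    print(f"z = {z:.3f}: OPS of about {min_ops_points(TARGET, z) / 1000:.3f}")
```

Under those assumptions, the 2-SD rule gives an OPS of about .783, and the one-tailed 1.645 SDs gives about .733.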

2.  The SD of unexplained salary is \$2.8 million.  So the SD of the difference of two of those is \$4 million (\$2.8 million times the square root of 2).  The .600 guy is expected to make \$12 million.  For a 95% chance, we add 2 SDs of the difference, giving \$20 million.  That works out to an OPS of 1.000.
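The same kind of sketch works for #2. Again, this is only an illustration under assumptions not stated in the post: a line through the origin, normal unexplained salary, and the two players' unexplained parts being independent of each other. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

RESIDUAL_SD = 4_000_000 * sqrt(1 - 0.5)   # unexplained SD, about $2.8 million
DOLLARS_PER_POINT = 20_000                # one point of OPS is .001
BASELINE_POINTS = 600                     # the .600 player

# SD of the difference between two independent unexplained salaries
diff_sd = RESIDUAL_SD * sqrt(2)


def min_ops_to_outearn(z, baseline_points=BASELINE_POINTS):
    """OPS points whose expected salary gap over the baseline is z SDs of the difference."""
    return baseline_points + z * diff_sd / DOLLARS_PER_POINT


def prob_outearns(ops_points, baseline_points=BASELINE_POINTS):
    """Chance a player at ops_points earns more than one at baseline_points."""
    gap = (ops_points - baseline_points) * DOLLARS_PER_POINT
    return NormalDist().cdf(gap / diff_sd)


print(f"2 SDs:     OPS of about {min_ops_to_outearn(2.0) / 1000:.3f}")
print(f"1.645 SDs: OPS of about {min_ops_to_outearn(NormalDist().inv_cdf(0.95)) / 1000:.3f}")
print(f"chance a 1.000 player out-earns a .600 player: {prob_outearns(1000):.3f}")
```

With 2 SDs this lands on 1.000, as in the text. With the one-tailed 1.645 SDs, it comes out around .929.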

3.  The confidence interval is a bit of a red herring.  My first reaction was to estimate the chance the *estimate* was greater than \$19,500, which is .975.  But that's not what's being asked.  What's being asked is the chance a *player* earns more than \$19,500 per point, which for an .800 OPS is \$15.6 million.

The point estimate is \$20,000 per point, and the chance of the player beating the estimate would be exactly 50 percent.  However, we're being asked the chance the player beats \$19,500 per point.  That's a bit easier, so the chance is a bit higher.  The bar is \$400,000 below the estimate, which is about 0.14 of the \$2.8 million unexplained SD.  Call it, say, 56 percent or something like that (assuming salaries are spread normally around the estimate).

4.  Since the point estimate is unbiased, the chance of beating it is exactly 50 percent.
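For #3 and #4, the rough guess can be pinned down with a normal curve. This is a sketch, not an exact answer: it assumes expected salary is \$20,000 times OPS points, that unexplained salary is normal with an SD of about \$2.8 million, and it ignores the small uncertainty in the \$20,000 slope itself. `prob_salary_exceeds` is a made-up helper name.

```python
from math import sqrt
from statistics import NormalDist

RESIDUAL_SD = 4_000_000 * sqrt(1 - 0.5)   # unexplained SD, about $2.8 million
DOLLARS_PER_POINT = 20_000                # point estimate of salary per OPS point


def prob_salary_exceeds(ops_points, threshold):
    """Chance that a player with this OPS earns more than threshold dollars."""
    expected = ops_points * DOLLARS_PER_POINT
    return 1 - NormalDist(mu=expected, sigma=RESIDUAL_SD).cdf(threshold)


# Question 3: beat $19,500 per point at an OPS of .800 ($15.6 million)
p3 = prob_salary_exceeds(800, 19_500 * 800)

# Question 4: beat the point estimate itself ($16 million)
p4 = prob_salary_exceeds(800, 16_000_000)

print(f"question 3: {p3:.3f}")
print(f"question 4: {p4:.3f}")
```

Under those assumptions, #3 comes out to roughly 0.556 and #4 to exactly 0.5.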

-----

These questions are much easier if you look at the scatter plot.  I've stolen it from the original post:

The equivalent of question 1 was: what value of X do you need for a 95% chance the value of Y is greater than zero?  That's really easy from the plot -- it looks like it's somewhere between 40 and 50.

Most of the economists got that.

-----

But most got the first three questions wrong when they had the numbers.  In their defense, though, those aren't really the kind of questions normally answered in academic papers.  Normally, questions involving "95%" refer to the coefficient estimates, not the individual datapoints.

So, I'm not convinced that, in every case, the results show a real flaw in their education; rather, I think some of the economists answered a different question, by force of routine.

My guess is that if you explained to some of the participants why their "red herring" answer was wrong, they'd say, "oh, right," and most of them would come up with the right answer.

But I might just be making excuses, because I fell for the red herring trap myself, at first.

-----

I agree with Salmon that more of the economists should have been able to answer the questions.  But I'm not sure about his conclusion:

" ... I see a paper demonstrating a statistically significant correlation between one variable and another, and I generally assume that if the experiment were repeated, we'd see the same thing again.  But that's not actually true.

And so it's easy to see, I think, how economists become convinced of things the rest of us aren't sure of at all -- and how the economists often end up being wrong, while the rest of us were right to be dubious.

... A lot of papers are written; a few of them have interesting findings.  Those are the papers which tend to get publicity.  But there's a very good chance they don't actually show what the headlines say that they show."

Actually, I don't disagree with these statements: I *agree* with them, very much so.  But I disagree that it has much to do with the economists being wrong about this quiz.  Yes, it's true that the incorrect answers tended to discount the amount of randomness in a single observation, assuming that individual datapoints were clustered much closer to their estimate than they really are.  But, strictly speaking, that has nothing to do with whether the experiment is repeatable, or whether the effect is real.

It's like, a study finds that smokers have a 20 percent chance of getting cancer, plus or minus 2 percent.  And the incorrect economists say, "Joe Smith is a smoker.  There's a 95 percent chance that between 16 and 24 percent of Joe's body will get sick."

The economists have missed the point, sure.  But that doesn't affect how real the link is between cancer and smoking.

Hat tip: anonymous commenter in the previous post.


At Saturday, July 14, 2012 10:51:00 PM,  Mike said...

For #2, I think I got the right answer but did it the wrong way. .600, SD of .200, 2 SD's (for 95%) above it makes 1.000 (assuming we're being lazy with the one-tail/two-tail stuff). I ignored salary completely. Which I think we can do, since the unexplained salary impact of OPS is equally unexplained for both the .600 player and our hypothetically higher-paid player.

At Saturday, July 14, 2012 10:56:00 PM,  Phil Birnbaum said...

Hi, Mike,

I think you answered a slightly different question. You answered, "what is the 95th percentile of salary in the population?"

The reason it works out the same is just the numbers I arbitrarily chose. I could just as easily have said the SD of OPS is .220, and then it wouldn't work.

At Sunday, July 15, 2012 11:05:00 AM,  Mike said...

That's only the case if the population mean is .600, which wasn't specified.

Still not sure what I did wrong. Salary varies due to unexplained stuff, and it's equally likely to vary up or down for the .600 guy as it is for the 1.000 guy. Why can't we ignore that in the calculation?

At Sunday, July 15, 2012 11:09:00 AM,  Phil Birnbaum said...

You're right, I assumed a mean of .600. My response is wrong.

Let me think for a bit.

At Sunday, July 15, 2012 11:10:00 AM,  Mike said...

Also, possibly a SUPER dumb question, but if r^2 is .5, then r=.71 or so. Why is the \$2.8M the UNexplained portion? Wouldn't that be the explained portion?

At Sunday, July 15, 2012 11:12:00 AM,  Phil Birnbaum said...

OK, imagine if salaries were only very loosely tied to performance. It's still \$20,000 a point, but some bad players make \$50 million and some good players make \$2.

In that case, it takes a LOT of performance to have a 95% chance you're better than a .600 player, right?

That's why you need to know the r-squared, so you can calculate the SD of salary after correcting for performance.

At Sunday, July 15, 2012 11:23:00 AM,  Phil Birnbaum said...

You scared me for a minute ...

I think it's the unexplained portion. Before the regression, suppose the SD of Y was 2. That means the variance was 4.

Now, the r-squared of 0.5 means the variance is halved, down to 2. The SD is the square root of that, which is root 2.

So it's the unexplained portion.

At Sunday, July 15, 2012 11:25:00 AM,  Phil Birnbaum said...

The explained portion, by the way, is also \$2.8 million. That's because the square root of (2.8 million squared, plus 2.8 million squared) is the original \$4 million.

BTW, I don't think anyone says "unexplained portion" for r, only for r-squared, since the r-squareds add up and the r's don't. But I know what you mean.
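Written out, and assuming the usual least-squares decomposition of variance (the subscripts are just labels for this note), the numbers in this exchange are:

$$
\sigma_{\text{total}}^2 = \sigma_{\text{explained}}^2 + \sigma_{\text{unexplained}}^2,
\qquad
\sigma_{\text{explained}}^2 = r^2\,\sigma_{\text{total}}^2,
\qquad
\sigma_{\text{unexplained}}^2 = (1 - r^2)\,\sigma_{\text{total}}^2
$$

$$
r^2 = 0.5,\ \sigma_{\text{total}} = 4
\;\Rightarrow\;
\sigma_{\text{explained}} = \sigma_{\text{unexplained}} = \frac{4}{\sqrt{2}} \approx 2.83,
\qquad
\sqrt{2.83^2 + 2.83^2} \approx 4
$$

(all in millions of dollars). The variances add up to the total; the SDs don't, which is why the two \$2.8 million pieces don't sum to \$4 million.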

At Sunday, July 15, 2012 11:46:00 AM,  Mike said...

Ugh, sorry about that last comment, I got interrupted and didn't mean to post it. Agree with your last note there. But since r-squared is the one you add up, it would make more sense to me that in fact \$2M was the portion of SD explained by OPS, and \$2M was the portion not explained by OPS.

I think my stumbling onto the right answer for #2 is actually due to the choice of r-squared of 0.5 and because 2+2=4 and 2*2=4. I think I'd have gotten lucky with any SD's as long as the r-squared is 0.5.

Anyways, if I've learned anything here it's that I need to read up on this stuff and relearn it. Thanks for the time.

At Sunday, July 15, 2012 11:53:00 AM,  Phil Birnbaum said...

Thanks for the comments, Mike. They help me understand it better, too.

At Monday, July 16, 2012 12:03:00 PM,  David said...

Three thoughts:

1. I agree that economists failed at these questions because the questions don't match up with the hypotheses economists usually test. Your typical economist (including me) hasn't worked a problem like this out with pencil and paper in several years.

2. I'd push this thought further, though, and say that economists don't ask questions like this for good reason. Suppose I'm interested in evaluating the effectiveness of a job training program. I don't care as much about the probability that a person participating in the program subsequently has a higher wage than someone who did not. I care about what would be the average improvement in wages if everyone participated. That's what you need to know to do cost-benefit calculations, blah, blah. Of course there's variation in individual outcomes and if the effect of the program varies with some observable characteristics, that's important. But the questions asked in the paper just introduce a bunch of unnecessary noise by focusing on individual outcomes instead of mean outcomes.

(Note: this discussion of course does not apply to a lot of interesting questions about baseball, particularly involving forecasting. This suggests to me that economists interested in baseball should stick to measuring treatment effects and get out of the forecasting business...)

3. Maybe most importantly, the incentive for the respondents to this survey was pretty low. I can't imagine they put in much effort. That seems important for the results.

At Monday, July 16, 2012 12:05:00 PM,  Phil Birnbaum said...

I agree, on all three counts.