Saturday, August 30, 2014

Is MLB team payroll less important than it used to be?

As of August 26, about 130 games into the 2014 MLB season, the correlation between team payroll and wins is very low. So low, in fact, that *alphabetical order* predicts the standings better than salaries!

Credit that discovery to Brian MacPherson, writing for the Providence Journal. MacPherson calculated the payroll correlation to be +0.20, and alphabetical correlation to be +0.24. 

When I tried it, I got .2277 vs. .2346 -- closer, but alphabetical still wins. (I might be using slightly different payroll numbers, I used winning percentage instead of raw win totals, and I may have done mine a day or two later.)
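(If you want to try the replication yourself, here's a minimal sketch in Python. The file name and column names are placeholders -- you'd supply your own payroll and standings data -- and the sign of the "alphabetical" correlation depends on which end of the alphabet you treat as first.)

```python
import pandas as pd

# One row per team; the file name and column names are hypothetical --
# you'd supply your own payroll and standings data.
df = pd.read_csv("mlb_2014.csv")          # columns: team, payroll, wpct

# payroll vs. winning percentage
r_payroll = df["payroll"].corr(df["wpct"])

# "alphabetical" correlation: rank the team names A-to-Z and correlate
# the rank with winning percentage (the sign depends on which end of
# the alphabet you treat as "first")
r_alpha = df["team"].rank().corr(df["wpct"])

print(round(r_payroll, 4), round(r_alpha, 4))
```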

The alphabetical regression is cute, but it's the payroll one that raises the important questions. Why is it so low, at .20 or .23? When Berri/Schmidt/Brook did it in "The Wages of Wins," they got around .40.

It turns out that the season correlation has trended over time, and MacPherson draws a nice graph of that, for 2000-2014. (I'll steal it for this post, but link it to the original article.)  Payroll became more important in the middle of last decade, but then dropped quickly, so that 2012, 2013, and 2014 are the lowest of all 15 years in the chart:

[Graph from the Providence Journal article: correlation between payroll and wins, by season, 2000-2014]

What's going on? Why has the correlation dropped so much?

MacPherson argues it's because it's getting harder and harder to buy wins. There is an "inability of rich teams to leverage their financial resources."  The end of the steroids era  means there are fewer productive free-agent players in their 30s for teams to buy. And the pool of available signings is reduced even further, because smaller-market teams can better afford to hang on to their young stars.


"Having money to spend remains better than not having money to spend. That might not ever change. Unfortunately for the Red Sox and their brethren, however, it matters far less than it once did."


------

My thoughts:

1.  The observed 2014 correlation is artificially low, because it's taken after only about 130 games (late-August), instead of a full season. 

Between now and October, you'd expect the SD due to luck to drop by about 10 percent. So, instead of 2 parts salary to 8 parts luck (for the current correlation of .20), you'll have 2 parts salary to 7.2 parts luck. That will raise the correlation to about .22.

Well, maybe not quite. The non-salary part isn't all binomial luck; there are some other things in there too, like the distribution of over- and underpriced talent. But I think .22 is still a reasonable projection.
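(Here's that back-of-the-envelope arithmetic as a quick sketch, following the same rough "parts" logic as above; it's not a formal variance decomposition.)

```python
import math

r_now = 0.20                     # observed correlation after ~130 games
salary_part = r_now              # "2 parts salary"
luck_part = 1 - r_now            # "8 parts luck"

# binomial luck SD shrinks roughly with the square root of games played
shrink = math.sqrt(130 / 162)            # about 0.90
luck_full = luck_part * shrink           # about "7.2 parts"

r_projected = salary_part / (salary_part + luck_full)
print(round(r_projected, 2))             # about 0.22
```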

It's a small thing, but it does explain a tenth of the discrepancy.

------

2.  The lower correlation doesn't necessarily mean that it's harder to buy wins. As MacPherson notes, it could just mean that teams are choosing not to buy them. More specifically, it could mean that teams are closer in spending than they used to be, so payroll doesn't explain wins as well as it used to.

Here's an analogy I used before: in Rotisserie League Baseball, there is a $260 salary cap. If everyone spends between $255 and $260, the correlation between salary and performance will be almost zero -- the $5 isn't enough of a signal amidst the noise. But: if you let half the teams spend $520 instead, you're going to get a much higher correlation, because the high-spending half will do much, much better than the lower-spending half.
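(A quick simulation makes the point. This is just a sketch with made-up numbers: I assume performance is spending plus a big dollop of luck, and compare a league where everyone spends $255-$260 to one where half the teams spend around $520.)

```python
import numpy as np

rng = np.random.default_rng(0)
n_teams, n_sims = 12, 2000

def payroll_win_corr(low, high):
    # half the league spends near the low figure, half near the high one
    spend = np.concatenate([
        rng.uniform(low, low + 5, n_teams // 2),
        rng.uniform(high - 5, high, n_teams // 2),
    ])
    # performance = spending plus a big dose of luck
    performance = spend + rng.normal(0, 40, n_teams)
    return np.corrcoef(spend, performance)[0, 1]

# everyone spends $255-$260: the correlation is mostly noise
tight = np.mean([payroll_win_corr(255, 260) for _ in range(n_sims)])

# half the league spends around $520: spending dominates
loose = np.mean([payroll_win_corr(255, 520) for _ in range(n_sims)])

print(round(tight, 2), round(loose, 2))
```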

That could explain what's happening here.

In 2006, the SD of payroll was around 42% of the mean ($32MM, $78MM). In 2014, it was only 38% ($43MM, $115MM). It doesn't look that much different, but ... teams this year are 10 percent closer to each other in spending than they were, and that has to be contributing to the difference.

(This is the first time I've done something where the "coefficient of variation" (the SD divided by the mean) actually helped me -- here, as a way to correct the SDs for inflation.

Also, this is a rare (for me) case where the correlation (or r-squared) is actually more relevant than the coefficient of the regression equation. That's because we're debating how much salary explains what we've actually observed -- instead of the usual question of how much more salary leads to how many more wins.)


------

3.  While doing these calculations, I noticed something unusual. The 2014 standings are much tighter than normal. 

So far in 2014, the SD of team winning percentage is .058 (9.4 games per 162). In 2006, the SD was larger, at .075 (12.2 games per 162). That might be a bit high ... I think .068 (11 games per 162) is the recent historical average.

But even 9.4 compared to 11 is a big difference.  It's even more significant when you remember that the 2014 figure is based on only 130 games. (I'd bet the historical average for late-August would be between 12 and 13 games, not 11.)

What's going on? 

Well, it could be random luck. But, it could be real. It could be that team talent "inequality" has narrowed -- either because of the narrowing of team spending (which we noted), or because all the extra spending isn't buying much talent these days.

I think the surrounding evidence shows that it's more likely to be random luck. 

Last year, the SD of team winning percentage was at normal levels -- .074 (12.04 games per 162). It's virtually impossible for the true payroll/wins relationship to have changed so drastically in the off-season, considering the vast majority of payrolls and players stay the same from year to year.

Also, it turns out that even though the correlation between 2014 payroll and 2014 wins is low, the correlation between 2014 payroll and 2013 wins is higher. That is: this year's payroll predicts last year's wins (0.37) better than it predicts this year's wins (0.23)! 

Are there other explanations than 2014 being randomly weird? 

Maybe the low-payroll teams have young players who improved since last year, and the high-payroll teams have old players who declined. You could test that: you could check if payroll correlates better to last year's wins than this year's for all seasons, not just 2013-2014.

If that happened to be true, though, it would partially contradict MacPherson's hypothesis, wouldn't it? It would say that the money teams spend on contracts *does* buy wins as strongly as before, but that those wins are front-loaded relative to payroll.

We can see how weird 2014 really is if we back out the luck variance to get an estimate of the talent variance.

After the first 130 games of 2014, the observed SD of winning percentage is .058. After 130 games, the theoretical SD of winning percentage due to luck is .044.

Since luck is generally independent of talent, we know

SD(observed)^2 - SD(luck)^2 = SD(talent)^2 

Plugging in the numbers: .058 squared minus .044 squared equals .038 squared. That gives us an estimate of SD(talent) of .038, or 6.12 games per 162.
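(In code, the back-out is one line; a minimal sketch:)

```python
import math

sd_observed = 0.058        # SD of team winning percentage after ~130 games
sd_luck = 0.044            # roughly .5 / sqrt(130)

sd_talent = math.sqrt(sd_observed**2 - sd_luck**2)
print(round(sd_talent, 3), round(sd_talent * 162, 1))   # .038, or about 6.1 games
```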

I did the same calculation for 2013, and got 10.2.

2013: Talent SD of 10.2 games
2014: Talent SD of  6.1 games

That kind of drop in one off-season is pretty much impossible, isn't it? 

If that huge a compression were real, it would have to be due to huge changes in the off-season -- specifically, a lot of good players retiring, or moving from good teams to bad teams.

But, the team correlation between 2013 wins and 2014 wins is +0.37. That's a bit lower than average, but not out of line (again, especially taking the short season into account). 

It would be very, very coincidental if the good teams got that much worse while the bad teams got that much better, but the *order* of the standings didn't change any more than normal.

So, I think a reasonable conclusion is that it's just random noise that compressed the standings. This year, for no reason, the good teams have tended to be unlucky while the bad teams have tended to be lucky. And that narrowed the distance between the high-payroll teams and the low-payroll teams, which is part of the reason the payroll/wins correlation is so low. 

------

4. We can just look at the randomness directly, since the regression software gives us confidence intervals. 

Actually, it only gives an interval for the coefficient, but that's good enough. I added 2 SDs to the observed value, and then worked backwards to figure out what the correlation would be in that case. It came out to 0.60. 

That's huge!  The confidence interval actually encompasses every season on the graph, even though 2014 is the lowest of all of them.

To confirm the 0.60 number, I used this online calculator. If the true correlation for the 30 teams is 0.4, the 95% confidence interval goes up to 0.66, and down to 0.05. That's close to my calculation for the high end, and easily captures the observed value of 0.23 in its low end. 
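(If you'd rather not trust an online calculator, the standard Fisher z-transformation gives roughly the same interval; here's a sketch:)

```python
import math

r, n = 0.40, 30                        # assumed true correlation, 30 teams
z = math.atanh(r)                      # Fisher z-transformation
se = 1 / math.sqrt(n - 3)              # standard error on the z scale

lo = math.tanh(z - 1.96 * se)
hi = math.tanh(z + 1.96 * se)
print(round(lo, 2), round(hi, 2))      # roughly 0.05 and 0.66
```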

That's not to say that I think they really ARE all the same, that the differences are just random -- I've never been a big fan of throwing away differences just because they don't meet significance thresholds. I'm just trying to show how easily it *could be* random noise.

I can try to rephrase the confidence interval argument visually. Here's the actual plot for the 2014 teams:

[Scatterplot: 2014 team payroll vs. winning percentage, with the regression line]

The correlation coefficient is a rough visual measure of how closely the dots adhere to the green regression line. In this case, not that great; it's more a cloud than a line. That's why the correlation is only 0.23.

Now, take a look at the teams between $77 million and $113 million, the ones in the second rectangle from the left.

There are eighteen teams bunched into that small horizontal space, a payroll range of only $36 million in spending. Even at the historically high correlations we saw last decade, and even if the entire difference were due to discretionary free-agent spending, the true talent difference in that range would be only about 3 or 4 games in the standings. That would be much smaller than the effects of random chance, which would be around 12 games between luckiest and unluckiest. 

What that means is:  no matter what happens, that second vertical block is dominated by randomness, and so the dots in that rectangle are pretty much assured of looking like a random cloud, centered around .500. (In fact, for this particular case, the correlation for that second block is almost perfectly random, at -.002.)

So those 18 teams don't help much. How much the overall curve looks like a straight line is going to depend almost completely on the remaining 12 points, the high-spending and low-spending teams. In our case, the two low-spending teams are somewhat worse than the cloud, and the ten high-spending teams are somewhat better than the cloud, so we get our positive correlation of +0.23. 

But, you can see, those two bad teams aren't *that* bad. In fact, the Marlins, despite the second-lowest payroll in MLB, are playing .496 ball.

What if we move the Marlins down to .400? If you imagine taking that one dot, and moving it close to the bottom of the graph, you'll immediately see that the dots would get a bit more linear. (The line would get steeper, too, but steepness represents the regression coefficient, not the correlation, so never mind.)  I made that one change, and the correlation went all the way up to 0.3. 

Let's now take the second-highest-payroll Yankees, and move them from their disappointing  .523 to match the highest-payroll Dodgers, at .564. Again, you can see the graph will get more linear. That brings the correlation up to 0.34 -- almost exactly the average season, after mentally adjusting it a bit higher for 162 games.

Of course, the Marlins *aren't* at .400, and the Yankees *aren't* at .564, so the lower correlation of 0.23 actually stands. But my point is not to argue that it should actually be higher -- my point is that it only takes a bit of randomness to do the trick. 

All I did was move the Marlins down by about 2 SDs' worth of luck, and the Yankees by less than 1 SD's worth. And that was enough to bump the correlation from historically low to historically average.
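(The mechanics of that experiment are trivial to reproduce; here's a sketch, where `payroll` and `wpct` stand in for the actual 30-team arrays.)

```python
import numpy as np

def corr_after_shift(payroll, wpct, team_idx, new_wpct):
    """Correlation of payroll with winning pct after moving one team's record."""
    shifted = np.array(wpct, dtype=float)
    shifted[team_idx] = new_wpct
    return np.corrcoef(payroll, shifted)[0, 1]

# e.g., drop the Marlins to .400 and see how far the correlation moves
# (marlins_idx is a placeholder for wherever they sit in your arrays):
# corr_after_shift(payroll, wpct, marlins_idx, 0.400)
```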

------

5. Finally: suppose the change isn't just random luck, that there's actually something real going on. What could it be?

-- Maybe money doesn't matter as much any more because low-spending teams are getting more of their value from arbitration-eligible and pre-arbitration players ("arbs and slaves"). They could be doing that so well that the high-spending teams are forced to spend more on free agents just to catch up. It wouldn't be too hard to check that empirically, just by looking at rosters.

-- It could be that, as MacPherson believes, there are fewer productive free agents to be bought. You could check that easily, too: just count how many free agents there are on team rosters now, as compared to, say, 2005. If MacPherson is correct that careers are ending after fewer years of free agency, that should show up pretty easily.

-- Maybe teams just aren't as smart as they used to be about paying for free agents. Maybe their talent evaluation isn't that great, and they're getting less value for their money. Again, you could check that, by looking at free-agent WAR, or expected WAR, and comparing it to contract value.

-- Maybe teams don't vary as much as they used to, in terms of how many free-agent wins they buy. I shouldn't say "maybe" -- as we saw, the SD of payroll, adjusted for inflation, is indeed lower in 2014 than it was in 2006, by about 10 percent. So that would almost certainly be part of the answer. 

-- More specifically: maybe the (otherwise) bad teams are *more* likely to buy free agents than before, and the (otherwise) good teams are *less* likely to buy free agents than before. That actually should be expected, if teams are rational. With more teams qualifying for the post-season, there's less point in making yourself into a 98-win team when a 93-win team will probably be good enough. And, even an average team has a shot at a wild card, if it gets lucky, so why not spend a few bucks to raise your talent from 79 wins to (say) 83 wins, like maybe the Blue Jays did last year?

-----

I'll give you my gut feeling, but, first, a disclaimer: I haven't really thought a whole lot about this, and some of these arguments occurred to me as I wrote. So, keep in mind that I'm really just thinking out loud.

On that basis, my best guess is ... that most of the correlation drop is just random noise. 

I'd bet that money buys free agents just as reliably as always, and at the usual price. The correlation is down not because spending buys fewer wins, but because more equal spending makes it harder for the regression to notice the differences.

But I'm thinking that part of the drop might really be the changing patterns of team spending, as MacPherson described. I wonder if that knot of 18 mid-range teams, clustered in such a small payroll range, might be a permanent phenomenon, resulting from more small-market teams moving up the payroll chart after deciding their sweet spot should be a little more extravagant than in the past. 

Because, these days, it doesn't take much to almost guarantee a team a reasonable shot at a wildcard spot -- which means, meaningful games later in the season than before, which means more revenue. 

In fact, that's one area where it's not zero-sum among teams. If most of the fan fulfillment comes from being in the race and having hope, any team can enter the fray without detracting much from the others. What's more exciting for fans -- being four games out of a wildcard spot alone, or being four games out of a wildcard spot along with three other teams? It's probably about the same, right? 

Which makes me now think, the price of a free agent win could indeed change. By how much? It depends on how increased demand from the small market teams compares to decreased demand from the bigger-spending teams.

------

Anyway, bottom line: if I had to guess the reasons for the lower correlation:

-- 80% randomness
-- 20% spending patterns

But you can get better estimates with some research, by checking all those things I mentioned, and any others you might think of.





Hat Tip: Craig Calcaterra



Tuesday, August 26, 2014

Sabermetrics vs. second-hand knowledge

Does the earth revolve around the sun, or does the sun revolve around the earth?

The earth revolves around the sun, of course. I know that, and you know that.

But do we really? 

If you know the earth revolves around the sun, you should be able to prove it, or at least show evidence for it. Confronted by a skeptic, what would you argue?  I'd be at a loss. Honestly, I can't think of a single observable fact that I could use to make a case.

I say that I "know" the earth orbits the sun, but what I really mean by that is, certain people told me that's how it is, and I believe them. 

Not all knowledge is like that. I truly *do* know that the sun rises in the east, because I've seen it every day. If a skeptic claimed otherwise, it would be easy to show evidence: I'd make sure he shared my definition of "east," and then I'd wake him up at 6 am and take him outside.

But that sun/earth thing?  I can only say I "know" it because I believe that astronomers *truly* know it, from direct evidence.

------

It occurred to me that almost all of our "knowledge" of scientific theories comes from that kind of hearsay. I couldn't give you evidence that atoms consist, roughly, of electrons orbiting a nucleus. I couldn't prove that every action has an equal and opposite reaction. There's no way I could come close to figuring out why and how e=mc^2, or that something called "insulin" exists and is produced by the pancreas. And I couldn't give you one bit of scientific evidence for why evolution is correct and not creationism. 

That doesn't stop us from believing, really, really strongly, that we DO know these things. We go and take a couple of undergraduate courses in, say, geology, and we write down what the professors tell us, and we repeat them on exams, and we solve mathematical problems based on formulas and principles we are told are true. And we get our credits, and we say we're "knowledgeable" in geology. 

But it's a different kind of knowledge. It's not knowledge that we have by our own experience or understanding. It's knowledge that we have by our own experience of how to evaluate what we're told -- how and when to believe other people. We extrapolate from our social knowledge. We believe that there are indeed people, "geologists," who have firsthand evidence. We believe that evidence gets disseminated among those geologists, who interact to reliably determine which hypotheses are supported and which ones are not. We believe that, in general, the experts are keeping enough of a watchful eye on what gets put in textbooks and taught at universities, that if Geology 101 was teaching us falsehoods, they'd get exposed in a hurry.

In other words, we believe that the system of scientists and professors and Ph.D.s and provosts and deans and journals and textbook publishers is a reliable separator of truth from falsehood. We believe that, if the earth really were only 6,000 years old, that's what scientists would be telling us.

------

Most of the time, it doesn't matter that our knowledge is secondhand. We don't need to be able to prove that swallowing arsenic is fatal; we just need to know not to do it. And, we can marvel at Einstein's discovery that matter and energy are the same thing, even if we can't explain why.

But it's still kind of unsatisfying. 

That's one of the reasons I like math. With math, you don't have to take anyone's word for anything. You start with a few axioms, and then it's all straight logic. You don't need geology labs and test tubes and chemicals. You don't need drills and excavators. You don't actually have to believe anyone on indirect evidence. You can prove everything for yourself.

The supply of primes is infinite. No matter how large a prime you find, there will always be one larger. That's a fact. If you like, you can look it up on the internet, or ask your math teacher, or find it in a textbook. It's a fact, like the earth revolving around the sun.

If you do it that way, you know it, but you don't really KNOW it. You can't defend it. In a sense, you're believing it on faith. 

On the other hand, you can look at a proof. Euclid's proof that there is no largest prime number is considered one of the most elegant in mathematics. The versions I found on the internet use a lot of math notation, so I'll paraphrase.

-----

Suppose you have a really big prime number, X. The question is: is there always a prime bigger than X?  

Try this: take all the numbers from 1 to X, and multiply them together: 1 times 2 times 3 .... times X. Now, add 1. Call that really huge number N. That huge N is either prime, or is the product of some number of primes. 

But N can't be divisible by X, or by anything else from 2 up to X, because any of those divisions always leaves a remainder of 1. Therefore: either N is prime, or, when you factor N into primes, they're all bigger than X. 

Either way, there is a prime bigger than X.
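(If you'd rather check the argument with actual numbers than take my paraphrase on faith, here's a tiny sketch that factors N = X! + 1 by trial division for small X; every smallest prime factor it finds comes out bigger than X.)

```python
import math

def smallest_prime_factor(n):
    """Return the smallest prime factor of n (n itself if n is prime)."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f
        f += 1
    return n

for x in range(2, 11):
    n = math.factorial(x) + 1
    p = smallest_prime_factor(n)
    print(x, p, p > x)   # the smallest prime factor of x!+1 is always bigger than x
```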

------

I may not have explained that very well. But, if you get it ... now you know that there is no highest prime. If you read it in a book, you "know" it, but if you understand the proof, you KNOW it, in the sense that you can explain it and prove it to others.

In fact ... if you read it in a textbook, and someone tells you the textbook is wrong, you may have some doubt. But once you see the proof, you will *never* have doubt (except in your own logic). Even if the greatest mathematician in the world tells you there's a largest prime, you still know he's wrong. 

-----

In theory, everything in math is like that, provable from axioms. In practice ... not so much. The proofs get complicated pretty quickly. (When Andrew Wiles solved Fermat's Last Theorem in 1993, his proof was 200 pages long.)  Still, there are significant mathematical results where we can all say we know from our own efforts. For years, I wondered why it was that multiplication goes both ways -- why 8 x 7 has to equal 7 x 8. Then it hit me -- if you draw eight rows of seven dots, and turn it sideways, you get seven rows of eight dots.

There are other fields like math that way ... you and I can know things on our own, fairly easily, in economics, and finance, and computer science. Other sciences, like physics and chemistry, take more time and equipment. I can probably prove to myself, with a stopwatch and ruler, that gravitational acceleration on earth is 9.8 m/s/s, but there's no way I could find evidence of what it is on the moon. 

But: sabermetrics. What started me on all this is realizing that the stuff we know about sabermetrics is more like infinite primes than like the earth revolving around the sun. Active researchers don't just know sabermetrics because Bill James and Pete Palmer told us. We know because we actually see how to replicate their work, and we see, all the way back to first principles, where everything came from. 

I can't defend "e equals mc squared," but I can defend Linear Weights. It's not that hard, and all I need is play-by-play data and a simple argument. Same with Runs Created: I can pull out publicly-available data and show that it's roughly unbiased and reasonably accurate. (I can even go further ... I can take partial derivatives of Runs Created and show that the values of the individual events are roughly in line with Linear Weights.)

DIPS?  No problem, I know what the evidence is there, and I can generate it myself. On-base percentage more important than batting average?  Geez, you don't even need data for that, but if you do need it, you can still make the case formally without too much difficulty. 

For my own part -- and, again, many of you active analysts reading this would be able to say the same thing --  I don't think I could come up with a single major result in sabermetrics that I couldn't prove, from scratch, if I had to. Even the ones from advanced data, or proprietary data, I'm confident I could reproduce if you gave me the database.

For all the established principles that are based on, say, Retrosheet-level data ... honestly, I can't think of a single thing in sabermetrics that I "know" where I would need to rely on other people to tell me it's true. That might change: if something significant comes out of some new technique -- neural nets, "soft" sabermetrics, biomechanics -- I might have to start "knowing" things secondhand. But for now, I can't think of anything.

If you come to me and say, "I have geological proof that the earth is only 6,000 years old," I'm just going to shrug and say, "whatever."  But if you come to me and say, "I have proof that a single is worth only 1/3 of a triple" ... well, in that case, I can meet you head on and prove that you're wrong. 

I don't really know that creationism isn't right -- I only know what others have told me. But I *do* know firsthand what a triple is worth, just as I *do* know firsthand that there is no highest prime. 

------

And that, I think, is why I love sabermetrics so much -- it's the only chance I've ever had to actually be a scientist, to truly know things directly, from evidence rather than authority.

I have a degree in statistics, but if nuclear war wiped out all the statistics books, how much of that science could I restore from my own mind?  Maybe, a first-year probability course, at best. I could describe the Central Limit Theorem in general terms, but I have no idea how to prove it ... one of the most fundamental results in statistics, one they teach you in your first statistics class, and I still only know it from hearsay.

But if nuclear war wipes out all the sabermetrics books ... as long as someone finds me a copy of the Retrosheet database, I can probably reestablish everything. Nowhere near as eloquently as Bill James and Palmer/Thorn, and I probably wouldn't think of certain methods that Tango/MGL/Dolphin did, but ... yeah, I'm pretty sure I could restore almost all of it. 

To me, that's a big deal. It's the difference between knowing something, and only knowing that other people know it. Not to put down the benefits of getting knowledge from others -- after all, that's where most of our useful education comes from. It's just that, for me, knowing stuff on my own ... it's much more fulfilling, a completely different state of mind. As good as it may be to get the Ten Commandments from Moses, it's even better to get them directly from God.




Tuesday, August 12, 2014

More r-squared analogies

OK, so I've come up with yet another analogy for the difference between the regression equation coefficient and the r-squared.

The coefficient is the *actual signal* -- the answer to the question you're asking. The r-squared is the *strength of the signal* relative to the noise for an individual datapoint.

Suppose you want to find the relationship between how many five-dollar bills someone has, and how much money those bills are worth. If you do the regression, you'll find:

Coefficient = 5.00 (signal)
r-squared = 1.00 (strength of signal)
1 minus r-squared = 0.00 (strength of noise)
Signal-to-noise ratio = infinite (1.00 / 0.00)

The signal is: a five-dollar bill is worth $5.00. How strong is the signal?  Perfectly strong --  the r-squared is 1.00, the highest it can be.  (In fact, the signal to noise ratio is infinite, because there's no noise at all.)

Now, change the example a little bit. Suppose a lottery ticket gives you a one-in-a-million chance of winning five million dollars. Then, the expected value of each ticket is $5.  (Of course, most tickets win nothing, but the *average* is $5.)

You want to find out the relationship between how many tickets someone has, and how much money those tickets will win. With a sufficiently large sample size, the regression will give you something like:

Coefficient = 5.00 (signal)
r-squared = 0.0001 (strength of signal)
1 minus r-squared = 0.9999 (strength of noise)
Signal-to-noise ratio = 0.0001 (0.0001 / 0.9999)

The average value of a ticket is the same as a five-dollar bill: $5.00. But the *noise* around $5.00 is very, very large, so the r-squared is small. For any given ticketholder, the distribution of his winnings is going to be pretty wide.

In this case, the signal-to-noise ratio is something like 0.0001 divided by 0.9999, or 1:10,000. There's a lot of noise in with the signal.  If you hold 10 lottery tickets, your expected winnings are $50. But, there's so much noise, that you shouldn't count on the result necessarily being close to $50. The noise could turn it into $0, or $5,000,000.
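(A simulation shows both halves of the analogy at once. I've scaled the lottery down -- a 1-in-1,000 chance at $5,000, still worth $5 a ticket on average -- just so the sample doesn't need to be enormous.)

```python
import numpy as np

rng = np.random.default_rng(1)
n_people = 200_000

# each person holds between 0 and 20 tickets
tickets = rng.integers(0, 21, n_people)

# each ticket independently wins $5,000 with probability 1/1,000
winnings = rng.binomial(tickets, 1 / 1000) * 5000

# least-squares slope (the "signal") and r-squared (the signal strength)
slope, intercept = np.polyfit(tickets, winnings, 1)
r = np.corrcoef(tickets, winnings)[0, 1]

print(round(slope, 2), round(r**2, 4))   # slope near 5.00, r-squared tiny
```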

On the other hand, if you own 10 five-dollar bills, then you *should* count on the $50, because it's all signal and no noise.

It's not a perfect analogy, but it's a good way to get a gut feel. In fact, you can simplify it a bit and make it even easier:

-- the coefficient is the signal.
-- the r-squared is the signal-to-noise ratio.

You can even think of it this way, maybe:

-- the coefficient is the "mean" effect.
-- the (1 - r-squared) is the "variance" (or SD) of the effect.

Five-dollar bills have a mean value of $5, and variance of zero. Five-dollar lottery tickets have a mean value of $5, but a very large variance.  

------

So, keeping in mind these analogies, you can see that this is wrong: 

"The r-squared between lottery tickets and winnings is very close to zero, which means that lottery tickets have very little value."

It's wrong because the r-squared doesn't tell you the actual value of a ticket (mean). It just tells you the noise (variance) around the realized value for an individual ticket-holder. To really see the value of a ticket, you have to look at the coefficient.  

From the r-squared alone, however, you *can* say this:

"The r-squared between lottery tickets and winnings is very close to zero, which means that it's hard to predict what your lottery tickets are going to be worth just based on how many you have."

You can conclude "hard to predict" based on the r-squared. But if you want to conclude "little value on average," you have to look at the coefficient.  

------

In the last post, I linked to a Business Week study that found an r-squared of 0.01 between CEO pay and performance. Because the 0.01 is a small number, the authors concluded that there's no connection, and CEOs aren't paid by performance.

That's the same problem as the lottery tickets.

If you want to see if CEOs who get paid more do better, you need to know the size of the effect. That is: you want to know the signal, not the *strength* of the signal, and not the signal-to-noise ratio. You want the coefficient, not the r-squared.

And, in that study, the signal was surprisingly high -- around 4, by my lower estimate. That is: for every $1 in additional salary, the CEO created an extra $4 for the shareholders. That's the number the magazine needs in order to answer its question.

The low r-squared just shows that the noise is high. The *expected value* is $4, but, for a particular case, it could be far from $4, in either direction.  I haven't checked, but I bet that some companies with relatively low-paid executives might create $100 per dollar, and some companies who pay their CEOs double or triple the average might nonetheless wind up losing value, or even going bankrupt.

------

Now that I think about it, maybe a "lottery ticket" analogy would be good too: 


Think of every effect as a combination of lottery tickets and cash money.

-- The regression coefficient tells you the total value of the tickets and money combined.

-- The r-squared tells you what proportion of that total value is in money.  

That one works well for me.

------

Anyway, the idea is not that these analogies are completely correct, but that they make it easier to interpret the results, and to spot errors of interpretation. When Business Week says, "the r-squared is 0.01, so there is no relationship," you can instantly respond:

"... All that r-squared tells you is, whatever the relationship actually turns out to be, the signal-to-noise ratio is 1:99. But, so what? Maybe it's still an important signal, even if it's drowned out by noise. Tell us what the coefficient is, so we can evaluate the signal on its own!"

Or, when someone says, "the r-squared between team payroll and wins is only .18, which means that money doesn't buy wins," you can respond:

"... All that r-squared tells you is, whatever the relationship actually turns out to be, 82 percent of it comes in the form of lottery tickets, and only 18 percent comes in cash. But those tickets might still be valuable! Tell us what the coefficient is, so we can see that value, and we can figure out if spending money on better players is actually worth it."

------

Does either one of those work for you?  




(You can find more of my old stuff on r-squared by clicking here.)

