Saturday, November 18, 2006

Payroll vs. wins for basketball, football

On the "Wages of Wins" blog, David Berri now posts the results of a salary vs. performance regression in basketball, and another in football. For basketball, they find 15% of wins "explained" by salary, and only 5% in the NFL. I assume this means r-squared; the r's would be .39 and .22 respectively. (For baseball, they had previously found r-squared of 18%, or r of 43%.)
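As a quick check on that conversion, r is just the square root of r-squared. A minimal sketch, using the rounded percentages quoted above (the variable names are mine, not Berri's):

```python
from math import sqrt

# r-squared values as quoted above (rounded percentages)
r_squared = {"NBA": 0.15, "NFL": 0.05, "MLB": 0.18}

# r is the square root of r-squared; the sign comes from the slope,
# which is positive in all three cases here
r = {league: round(sqrt(value), 2) for league, value in r_squared.items()}
print(r)
```

The rounded 18% works out to about .42 for baseball rather than .43, so the .43 presumably comes from an unrounded r-squared.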

For basketball, Berri takes the analysis a step further, and shows the average record for each of the five NBA salary quintiles:

Highest salary ....... \$78MM .... 37.8 wins
Quintile 2 ........... \$61MM .... 42.5 wins
Quintile 3 ........... \$54MM .... 39.7 wins
Quintile 4 ........... \$47MM .... 47.7 wins
Lowest payroll ....... \$38MM .... 39.5 wins

One surprising thing about this breakdown is that if you run a regression on just the five quintile averages, you get a negative correlation between salary and wins – and it's -.39, exactly equal in magnitude to the positive correlation Berri got for the regression on the full league! I don't think that means anything important, although it's an interesting coincidence that it worked out that way. And it does mean that the relationship between salary and wins within each quintile must be exceptionally strong, in order to cancel out the negative relationship between quintiles.
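For anyone who wants to check the quintile correlation, here is a minimal sketch that computes the ordinary Pearson r on the five rows of the table. The helper name `pearson_r` is my own; the data are just the payroll and win averages listed above.

```python
from math import sqrt

# The five salary-group averages from the table: payroll in $MM, average wins
salary = [78, 61, 54, 47, 38]
wins = [37.8, 42.5, 39.7, 47.7, 39.5]

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

r = pearson_r(salary, wins)
print(round(r, 2))      # correlation across the five groups
print(round(r * r, 2))  # share of variance "explained"
```

With only five points, a single group (here the 47.7-win group) can move the result a lot, so the sign is not something to lean on.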

By the way, commenter Guy presents similar data for MLB playoff appearances (2004-2006) here:

6 highest payrolls ....... 11 appearances .... 1.8 per team
13 middle payrolls ....... 12 appearances .... 0.9 per team
Next 5 lowest payrolls .... 1 appearance ..... 0.2 per team
6 lowest payrolls ......... 0 appearances .... 0.0 per team

This does show a strong relationship between payroll and success. Which means that when Berri says he "really, really believe[s] that money cannot buy love in baseball," he presumably is arguing that for money to buy success, a strong relationship in a chart like the above is necessary but not sufficient.

At Sunday, November 19, 2006 9:15:00 AM,  Guy said...

Phil:
The NBA numbers, even for a single season, are quite striking. I know almost nothing about the NBA salary structure. Are there limits on free agency, as in MLB, that allow teams to underpay talented young players?

* * *

Using payroll data put together by Tango, I calculated championships for the years 1992-2005. For each payroll quintile, this is the number of championships won per 10 years per team.
Top 4.9
2nd 3.0
3rd 2.4
4th 1.8
5th 1.2

The top quintile pays about twice as much as the bottom quintile, but gets four times as many championships. And most of the championships for low-payroll teams came in the early 90s; it appears the payroll-success link is even more powerful today.

J.C. Bradbury ran a regression using 1985-2006 data (at Sabernomics) and reported that a 10% increase in payroll generates one extra win. But he's mistaken: if you run a regression on his own data, you find it's actually 1.6 wins. And looking at 1992-2005 data, 10% in payroll generates 1.8 wins.

So what we have is a strong link to wins, and an even stronger link to championships.

At Sunday, November 19, 2006 11:47:00 AM,  Phil Birnbaum said...

Hi, Guy,

I don't know much about the NBA salary structure either.

I did find this list of team payrolls ... to my eye, it looks like a lot of teams in the middle are bunched together with similar payrolls. Maybe that's the reason for the low correlation, that the differences are so small?

In my rotisserie league, the limit was \$260. Every team spent between \$250 and \$260. I bet the correlation between payroll and score was close to zero there, just because the range was too small for a relationship to emerge.

That's just a guess, I'm thinking out loud.

Plus, the Knicks were the second-worst team in the league but had (by far) the highest payroll. Maybe 2005-06 was an exception? Later, I'll try another year and see how the numbers come out.

At Sunday, November 19, 2006 12:57:00 PM,  Phil Birnbaum said...

For 2002-03, the r is .35, r-squared is .12. (That's with payroll rounded to the nearest million, not that that should make much difference.)

All but the top five teams are between \$41MM and \$61MM, so I'd bet it's just the compression. But that's just a guess.
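The compression idea can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the payroll range, the slope of a quarter of a win per \$1MM, and the noise level are all hypothetical, not estimates from any real league. It shows how the same underlying payroll-wins relationship produces a much smaller r when you only look at teams in a narrow payroll band.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical league: payroll spread evenly from $40MM to $80MM,
# wins = 41 + 0.25 per $1MM above $60MM, plus random noise (SD of 8 wins)
teams = []
for _ in range(20000):
    payroll = rng.uniform(40, 80)
    team_wins = 41 + 0.25 * (payroll - 60) + rng.gauss(0, 8)
    teams.append((payroll, team_wins))

# Correlation over the full payroll range
r_full = pearson_r([p for p, w in teams], [w for p, w in teams])

# Same teams, same relationship, but only those bunched between $50MM and $60MM
bunched = [(p, w) for p, w in teams if 50 <= p <= 60]
r_bunched = pearson_r([p for p, w in bunched], [w for p, w in bunched])

print(round(r_full, 2), round(r_bunched, 2))
```

The slope is identical in both groups; only the spread in payroll differs. So a low r in a league where most payrolls are bunched together doesn't by itself show that an extra dollar buys fewer wins.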

At Thursday, January 04, 2007 9:01:00 AM,  Luke said...

Highest salary ....... \$78MM .... 37.8 wins
Quintile 2 ........... \$61MM .... 42.5 wins
Quintile 3 ........... \$54MM .... 39.7 wins
Quintile 4 ........... \$47MM .... 47.7 wins
Lowest payroll ....... \$38MM .... 39.5 wins

Hi everybody,

There might be evidence of "diminishing marginal returns" when it comes to money buying wins.

Eyeballing the data above, you can see that the lowest-payroll group averages more wins than the highest-payroll group, but the three middle groups all average more wins than either the highest or the lowest.

To check for a curvilinear relationship between salary and wins, I ran a regression with 2 steps. In Step 1, I entered “salary” as the predictor. In Step 2, I added “salary*salary” as the second predictor.

The results for Step 1 are consistent with Phil's findings. The standardized regression coefficient for salary was -.39, indicating that a single Z-score increase in salary is associated with a .39 Z-score decrease in wins.

These results assume a linear relationship between wins and salary and lead to the perhaps erroneous conclusion that spending more money actually decreases the number of wins.

In Step 2, adding “salary*salary” to the regression equation leads to a different conclusion.

The standardized coefficient for salary changes from -.39 to 3.73. Holding constant “salary*salary”, a single Z-score increase in “salary” is associated with a 3.73 Z-score increase in wins.

BUT the standardized coefficient for “salary*salary” is -4.15. This relationship can be interpreted as: for each single point Z-score increase in salary, the salary-wins slope changes by -4.15 Z-scores.

In normal English, the results indicate that the overall effect of spending more money on payroll is more wins (B = 3.73), but that, at a certain point, spending more money actually decreases the number of wins (B = -4.15).

Making a scatterplot shows the inverted U-shape of the data and might be the easiest way to determine the "optimal" relationship between money and wins.

At least in this example, testing for a curvilinear relationship between salary and wins explains a lot more of the variance in wins. In Step 1, when only the variable "salary" is in the equation, the R-square is .16, but when the variable "salary*salary" is added to the equation, R-square jumps to .37, which is an increase in variance explained of .21.
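For anyone who wants to reproduce the two steps, here is a minimal sketch using only the five group averages quoted at the top of this comment. It solves the two-predictor least-squares problem by hand from sums of squared deviations, so no statistics package is assumed; the variable names and the `peak_salary` calculation are my own additions.

```python
from math import sqrt

# The five salary-group averages: payroll in $MM, average wins
salary = [78, 61, 54, 47, 38]
wins = [37.8, 42.5, 39.7, 47.7, 39.5]

def deviations(values):
    """Return each value minus the mean of the list."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

x = deviations(salary)                    # salary
q = deviations([s * s for s in salary])   # salary*salary
y = deviations(wins)

sxx, sqq, syy = dot(x, x), dot(q, q), dot(y, y)
sxq, sxy, sqy = dot(x, q), dot(x, y), dot(q, y)

# Step 1: salary only. With one predictor, the standardized
# coefficient equals the correlation, and R-square is its square.
beta_step1 = sxy / sqrt(sxx * syy)
r2_step1 = beta_step1 ** 2

# Step 2: salary and salary*salary together (2x2 normal equations)
det = sxx * sqq - sxq ** 2
c_salary = (sxy * sqq - sxq * sqy) / det       # unstandardized slopes
c_salary_sq = (sxx * sqy - sxq * sxy) / det
r2_step2 = (c_salary * sxy + c_salary_sq * sqy) / syy

# Standardized coefficients: slope times SD(predictor) / SD(wins)
beta_salary = c_salary * sqrt(sxx / syy)
beta_salary_sq = c_salary_sq * sqrt(sqq / syy)

# Top of the inverted U: payroll where the fitted curve peaks
peak_salary = -c_salary / (2 * c_salary_sq)

print(round(beta_step1, 2), round(r2_step1, 2))
print(round(beta_salary, 2), round(beta_salary_sq, 2), round(r2_step2, 3))
print(round(peak_salary, 1))
```

The huge standardized coefficients in Step 2 come from salary and salary*salary being almost perfectly correlated with each other over this range, so they are best read together as describing the curve rather than one at a time.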

Having said all of this, there are serious criticisms of what I have attempted to show. None of the results comes anywhere close to statistical significance. Five data points (N = 5) are far too few to draw conclusions with any confidence. It would also be better to treat the teams individually rather than to put them in 5 groups.

If anybody has a raw data set of teams and payrolls for any sport or year, I could take a look at this again.