Tuesday, February 01, 2011

Scorecasting: are the Cubs unlucky, or is it management's fault?

The Chicago Cubs, it has been noted, have not been a particularly huge success on the field in the past few decades. Is Cubs' management to blame? The last chapter of the recent book "Scorecasting" says it's true. I'm not so sure.

The authors, Tobias J. Moskowitz and L. Jon Wertheim, set out to debunk the idea that the Cubs' lack of success -- they haven't won a World Series since 1908 -- is simply due to luck.

How do they check that? How do they try to estimate the effects of luck on the Cubbies? Not the way sabermetricians would. Instead, the authors ... well, I'm not really sure what they did, but I can guess. Here's how they start:

"Another way to measure luck is to see how much of a team's success or failure can't be explained. For example, take a look at how the team performed on the field and whether, based on its performance, it won fewer games than it should have."

So far, so good. There is an established way to look at certain aspects of luck. You can look at the team's Pythagorean projection, which estimates its won-lost record from its runs scored and runs allowed. If it beat its projection, it was probably lucky.
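
To make the Pythagorean idea concrete, here is a minimal Python sketch. It assumes the classic form of the formula, with an exponent of 2 and a 162-game schedule; the function name and the run totals are hypothetical, chosen only for illustration.

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2):
    """Expected wins from runs scored and allowed (classic Pythagorean form)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)

# Hypothetical team: 750 runs scored, 700 allowed, actual record 80-82.
expected = pythagorean_wins(750, 700)
actual_wins = 80
# A negative number means the team won fewer games than its runs suggest.
luck = actual_wins - expected
print(f"expected wins: {expected:.1f}, luck: {luck:+.1f}")
```

A team that scores exactly as many runs as it allows projects to 81-81 under this formula, which is a handy sanity check.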

You can also compute its Runs Created estimate. The Runs Created formula takes a team's batting line, and projects the number of runs it should have scored. If the Cubbies scored more runs than their projection, they were somewhat lucky. If they scored fewer, they were somewhat unlucky.
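
As a rough illustration, here is a sketch that assumes the basic version of Runs Created, (H + BB) x TB / (AB + BB). Later versions of the formula add more terms, and nothing here says which version any particular study used. The batting line and the runs total below are made up.

```python
def runs_created_basic(hits, walks, total_bases, at_bats):
    """Basic Runs Created: (H + BB) * TB / (AB + BB)."""
    return (hits + walks) * total_bases / (at_bats + walks)

# Hypothetical team batting line (not a real season).
rc = runs_created_basic(hits=1450, walks=550, total_bases=2300, at_bats=5500)
actual_runs = 740
# Negative: the team scored fewer runs than its batting line projects.
run_luck = actual_runs - rc
print(f"runs created: {rc:.1f}, run luck: {run_luck:+.1f}")
```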

But that doesn't seem to be what the authors do. At least, it doesn't seem to follow from their description. They continue:

"If you were told that your team led the league in hitting, home runs, runs scored, pitching, and fielding percentage, you'd assume your team won a lot more games than it lost. If it did not, you'd be within your rights to consider it unlucky."

Well, yes and no. Those criteria are not independent. If I were told that my team scored a certain number of runs, I wouldn't care whether it also led the league in home runs, would I? A run is a run, whether it came from leading the league in home runs, or leading the league in "hitting" (by which my best guess is that the authors meant batting average).

The authors do the same thing in the very same paragraph:

"How, for instance, did the 1982 Detroit Tigers finish fourth in their division, winning only 83 games and losing 79, despite placing eighth in the Majors in runs scored that season, seventh in team batting average, fourth in home runs, tenth in runs against, ninth in ERA, fifth in hits allowed, eighth in strikeouts against, and fourth in fewest errors?"

Again, if you know runs scored and runs against, why would you need anything else? Do they really think that if your pitchers give up four runs while striking out a lot of batters, you're more likely to win than if your pitchers give up four runs while striking out fewer batters?

(As an aside, just to answer the authors' question: The 1982 Tigers underperformed their Pythagorean estimate by 3 games. They underperformed their Runs Created by 2 games. But their opponents underperformed their own Runs Created estimate by 1 game. Combining these three measures shows the '82 Tigers finished four games worse than they "should have".)

Now, we get to the point where I don't really understand their methodology:

"Historically, for the average MLB team, its on-the-field statistics would predict its winning percentage year to year with 93 percent accuracy."

What does that mean? I'm not sure. My initial impression is that they ran a regression to predict winning percentage based on that bunch of stats above (although if they included runs scored and runs allowed, the other variables in the regression should be almost completely superfluous, but never mind). My guess is that's what they did, and they got a correlation coefficient of .93 ... or perhaps an r-squared of .93. But that's not how they explain it:

"That is, if you were to look only at a team's on-the-field numbers each season and rank it based on those numbers, 93 percent of the time you would get the same ranking as if you ranked it based on wins and losses."

Huh? That can't be right. If you were to take the last 100 years of the Cubs, and run a projection for each year, the probability that you'd get *exactly the same ranking* for the projection and the actual would be almost zero. Consider, for instance, 1996, where the Cubs outscored their opponents by a run, and nonetheless wound up 76-86. And now consider 1993, when the Cubs were outscored by a run, and wound up 84-78. There's no way any projection system would "know" to rank 1993 eight games ahead of 1996, and so there's no way the rankings would be the same. The probability of getting the same ranking, then, would be zero percent, not 93 percent.

What I think is happening is that they're really talking about a correlation of .93, and this "93 percent of the time you would get the same ranking" is just an oversimplification in explaining what the correlation means. I might be wrong about that, but that's how I'm going to proceed, because that seems the most plausible explanation.

So, now, from there, how do the authors get to the conclusion that the Cubs weren't unlucky? What I think they did is to run the same regression, but for Cub seasons only. And they got 94 percent instead of 93 percent. And so, they say,

"The Cubs' record can be just as easily explained as those of the majority of teams in baseball. ... Here you could argue that the Cubs are actually less unlucky than the average team in baseball."

What they're saying is, since the regression works just as well for the Cubs as any other team, they couldn't have been unlucky.

But that just doesn't follow. At least, if my guess is correct that they used regression. I think the authors are incorrect about what it means to be lucky and how that relates to the correlation.

The correlation in the data suggests the extent to which the data linearly "explain" the year-to-year differences in winning percentage. But the regression doesn't distinguish luck from other explanations. If the Cubs are consistently lucky, or consistently unlucky, the regression will include that in the correlation.

Suppose I try to guess whether a coin will land heads or tails. And I'm right about half the time. I might run a bunch of trials, and the results might look like this:

1000 trials, 550 correct
200 trials, 90 correct
1600 trials, 790 correct
100 trials, 40 correct

If I run a regression on these numbers, I'm going to get a pretty high correlation -- .9968, to be more precise.
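
If you want to check that figure yourself, here is a short Python sketch that computes the Pearson correlation between the number of trials and the number correct, using the four rows above. The helper name `pearson_r` is just my own label.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

trials = [1000, 200, 1600, 100]
correct = [550, 90, 790, 40]

r = pearson_r(trials, correct)
print(round(r, 4))  # 0.9968
```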

But now, suppose I'm really lucky. In fact, I'm consistently lucky. And, as a result, I do 10 percent better on every trial:

1000 trials, 605 correct
200 trials, 99 correct
1600 trials, 869 correct
100 trials, 44 correct

What happens now? If I run the same regression (try it, if you want), I will get *exactly the same correlation*. Why? Because it's just as easy to predict the number of successes as before. I just do what I did before, and add 10%. It's not the correlation that changes -- it's the regression equation. Instead of predicting that I get about 50% right, the equation will just predict that I get about 55% right. The fact that I was lucky, consistently lucky, doesn't change the r or the r-squared.
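
Here is a sketch of that comparison in Python, fitting an ordinary least-squares line to each set of numbers. The function name `fit_line` is hypothetical; the data are the two sets of trials listed above.

```python
from math import sqrt

def fit_line(xs, ys):
    """Return (slope, intercept, r) for a least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy / sqrt(sxx * syy)

trials = [1000, 200, 1600, 100]
normal = [550, 90, 790, 40]   # ordinary guessing
lucky = [605, 99, 869, 44]    # every result 10 percent higher

slope_n, icpt_n, r_n = fit_line(trials, normal)
slope_l, icpt_l, r_l = fit_line(trials, lucky)

print(round(r_n, 4), round(r_l, 4))          # same correlation both times
print(round(slope_n, 3), round(slope_l, 3))  # the lucky slope is 10 percent steeper
```

The luck shows up entirely in the fitted equation (slope and intercept), and not at all in r.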

The same thing will happen in the Cubs case. Suppose the Cubs are lucky, on average, by 1 win per season. The regression will "see" that, and simply adjust the equation to somehow predict an extra win per season. It'll probably change all the coefficients slightly so that the end result is one extra win. Maybe if the Cubs are lucky, and a single "should be" worth 0.046 wins, the regression will come up with a value of 0.047 instead, to reflect the fact that, all other things being equal, the Cubs' run total is a little higher than for other teams. Or something like that.

Regardless, that won't affect the correlation much at all. Whether the Cubs were a bit lucky, a bit unlucky, about average in luck, or even the luckiest or unluckiest team in baseball history, the correlation might come out higher than .93, less than .93, or the same as .93.

So, what, then, does the difference between the Cubs' .94, and the rest of the league's .93, tell us? It might be telling us about the *variance* of the Cubs' luck, not the mean. If the Cubs hit the same way one year as the next, but one year they win 76 games and another they win 84 games ... THAT will reduce the correlation, because it will turn out that the same batting line isn't able to pinpoint the number of wins very accurately.

If you must draw a conclusion from the regression in the book -- which I am reluctant to do, but if you must -- it should be only that the Cubs' luck is very slightly *more consistent* than other teams' luck. But it will *not* tell you if the Cubs' overall luck is good, bad, or indifferent.
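
A small simulation can illustrate the distinction between the mean and the variance of luck. This is only a sketch under made-up assumptions: hypothetical seasons whose "expected" wins are drawn uniformly between 60 and 100, with luck added as normally distributed noise. None of the numbers come from the book or from real Cubs seasons, and the function names are my own.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def simulate(mean_luck, sd_luck, seasons=2000, seed=0):
    """Correlation between expected and actual wins for simulated seasons."""
    rng = random.Random(seed)  # fixed seed so every scenario sees the same draws
    expected = [rng.uniform(60, 100) for _ in range(seasons)]
    actual = [e + mean_luck + rng.gauss(0, sd_luck) for e in expected]
    return pearson_r(expected, actual)

r_neutral = simulate(mean_luck=0, sd_luck=4)   # average luck
r_unlucky = simulate(mean_luck=-5, sd_luck=4)  # five wins unlucky every year
r_erratic = simulate(mean_luck=0, sd_luck=8)   # average luck, but twice as variable

print(f"neutral: {r_neutral:.3f}")
print(f"unlucky: {r_unlucky:.3f}")
print(f"erratic: {r_erratic:.3f}")
```

In this setup the consistently unlucky team gets the same correlation as the neutral one, while the team with more variable luck gets a lower one.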

------

So, have the Cubs been lucky, or not? The book's study doesn't tell us. But we can just look at the Cubs' Pythagorean projections, and runs created projections. Actually, a few years ago, I did that, and I also created a method to try to quantify a "career year" effect, to tell if the team's players underperformed or overachieved for that season, based on the players' surrounding seasons. (For instance, Dave Stieb's 1986 was marked as an unlucky year, and Brady Anderson's 1996 a lucky year, because both look out of place in the context of the players' careers.)

My study gave a total of a team's luck based on five factors:

-- did it win more or fewer games than expected by its runs scored and allowed?
-- did it score more or fewer runs than expected by its batting line?
-- did its opponents score more or fewer runs than expected by their batting line?
-- did its hitters have over- or underachieving years?
-- did its pitchers have over- or underachieving years?

(Here's a PowerPoint presentation explaining the method, and here's a .ZIP file with full team and player data.)

The results: from 1960 to 2001, the Cubs were indeed unlucky ... by an average of slightly over half a win. That half win was made up of about 1.5 wins of unlucky underperformance by their players, mitigated by about one win of being lucky in turning that performance into wins.

But the Cubs never really had seasons in that timespan in which bad luck cost them a pennant or division title. The closest were 1970 and 1971, when, both years, they finished about five games unluckier than they should have (they would have challenged for the pennant in 1970 with 89 wins, but not in 1971 with 88 wins). Mostly, when they were unlucky, they were a mediocre team that bad-lucked their way into the basement. In 1962 and 1966, they lost 103 games, but, with normal luck, would have lost only 85 and 89, respectively.

However, when the Cubs had *good* luck, it was at opportune times. In 1984, they won 96 games and the NL East, despite being only an 80-82 team on paper. And they did it again in 1989, winning 93 games instead of the expected 77.

On balance, I'd say that the Cubs were lucky rather than unlucky. They won two divisions because of luck, but never really lost one because of luck. Even if you want to consider that they lost half a title in 1970, that still doesn't come close to compensating for 1984 and 1989.

------

But things change once you get past 2001. It's not in the spreadsheet I linked to, but I later ran the same analysis for 2002 to 2007, at Chris Jaffe's request for his book. And, in recent years, the Cubs have indeed been unlucky:

2002: 67-95, "should have been" 86-76 (19 games unlucky)
2003: 88-74, "should have been" 86-76 (2 games lucky)
2004: 89-73, "should have been" 90-72 (1 game unlucky)
2005: 79-83, "should have been" 86-76 (6 games unlucky)
2006: 66-96, "should have been" 82-80 (16 games unlucky)
2007: 85-77, "should have been" 88-74 (3 games unlucky)

That's 42 games of bad luck over six seasons -- an average of 7 games per season. That's huge. Even if you don't trust my "career year" calculations, just the Pythagoras and Runs Created bad luck sum to almost 5.5 of those 7 games.

So, yes ... in the last few years, the Cubs *have* been unlucky. Very, very unlucky.

------

In summary: from 1960 to 2001, the Cubs were a bit of a below-average team, with about average luck. Then, starting in 2002, the Cubs got good -- but, by coincidence or curse, their luck turned very bad at exactly the same time.

------

But if the "Scorecasting" authors don't believe that the Cubs have been unlucky, then what do they think is the reason for the Cubs' lack of success?

Incentives. Or, more accurately, the lack thereof. The Cubs sell out almost every game, win or lose. So, the authors ask, why should Cubs management care about winning? They gain very little if they win, so they don't bother to try.

To support that hypothesis, the authors show the impact (elasticity) of wins on tickets sold. It turns out that the Cubs have the lowest elasticity in baseball, at 0.6. If the Cubs' winning percentage drops by 10 percent, ticket sales drop by only 6 percent.

On the other hand, their crosstown rivals have one of the highest elasticities in the league, at about 1.2. For every 10 percent drop in winning percentage, White Sox ticket sales drop by 12 percent -- almost twice as much.
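
The arithmetic behind those two figures, as a quick sketch. It assumes the simplest reading of elasticity (percent change in tickets equals elasticity times percent change in winning percentage); the function name is hypothetical.

```python
def ticket_change_pct(elasticity, win_pct_change):
    """Percent change in ticket sales for a percent change in winning percentage."""
    return elasticity * win_pct_change

# A 10 percent drop in winning percentage for each team.
cubs = ticket_change_pct(0.6, -10)
white_sox = ticket_change_pct(1.2, -10)
print(f"Cubs: {cubs:.0f}%  White Sox: {white_sox:.0f}%")
```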

But ... I find this unconvincing, for a couple of reasons. First, if you look at the authors' tables (p. 245), it looks like it takes a year or so after a good season for attendance to jump. That makes sense. In 2005, it probably took a month or two for White Sox fans to realize the team was genuinely good; in 2006, they all knew beforehand, at season-ticket time.

Now, if you look at the Cubs' W-L record for the past 10 years, it really jumps up and down a lot; from 1998 to 2004, the team seesawed between good and bad. For seven consecutive seasons, they either won 88 games or more (four times), or lost 88 games or more (three times). So, fan expectations were probably never in line with team performance. Because the authors predicted attendance based on current performance, rather than lagged performance, that might be why they didn't see a strong relationship (even if there is one).

But that's a minor reason. The bigger reason I disagree with the authors' conclusions is that, even when they're selling out, the Cubs still have a strong incentive to improve the team -- and that's ticket prices. Isn't it obvious that the better the team, the higher the demand, and the more you can charge? It's no coincidence that the Cubs have the highest ticket prices in the Major Leagues (.pdf) at the same time as they're selling out most games. If the team is successful, and demand rises, the team just charges more instead of selling more.

Also, what about TV revenues, and merchandise sales, which also rise when a team succeeds?

It seems a curious omission that the authors would consider only that the Cubs can't sell more tickets, and not that total revenues would significantly increase in other ways. But that's what they did. And so they argue,

"So, at least financially, the Cubs seem to have far less incentive than do other teams -- less than the Yankees and Red Sox, and certainly less than the White Sox. ... Winning or losing is often the result of a few small things that require extra effort to gain a competitive edge: going the extra step to sign the highly sought-after free agent, investing in a strong farm team with diligent scouting, monitoring talent, poring over statistics, even making players more comfortable. All can make a difference at the margin, and all are costly. When the benefits of making these investments are marginal at best, why undertake them?"

Well, the first argument is the one I just made: the benefits are *not* "marginal at best," because with a winning team, the Cubs would earn a lot more money in other ways. But there's a more persuasive and obvious argument. If the Cubs have so small an incentive to win, if they care so little that they can't even be bothered to hire "diligent" scouts ... then why do they spend so much money on players?

In 2010, the Cubs' payroll was $146 million, third in the majors. In 2009, they were also third. Since 2004, they have never been lower than ninth, in a 30-team league. Going back as far as 1991, there are only a couple of seasons that the Cubs are below average -- and in those cases, just barely. In the past 20 years, the Cubs have spent significantly more money on player salaries than the average major-league team.

It just doesn't make sense to assume that the Cubs don't care about winning, does it, when they routinely spend literally millions of dollars more than other teams, in order to try to win?

At Thursday, February 03, 2011 4:09:00 PM,  j holz said...

I believe you're right about the 93 percent comments. If you've read the rest of the book, it's clear that the authors are writing for a mass audience and often don't understand the meaning of the numbers they're reporting.

After Wages of Wins, Stumbling on Wins, and Scorecasting, I'm beginning to wonder if there is any mathematical truth at all to the bold claims made in books like Freakonomics...

At Thursday, February 03, 2011 5:33:00 PM,  Jared McKiernan said...

These mass "economics" books are generally so weakly supported by a true mathematical/statistical analysis of why the method should mean what the author claims that it is consistently enraging to see Levitt, Bradbury, et al. rarely engage with comments from anyone without tenure or a big media name behind them.

Every meaningful result their books are based on has turned out to be either complete bullshit, a misunderstood model, or basically trivial in practice.

These guys make it tougher to get people interested in what we could really do with the data and rigorous modeling and analysis. Luckily the Internet is here and we can build up our amateur army :)

At Monday, February 07, 2011 5:03:00 AM,  MIke Musinski said...

This is quasi-related to your post, but I was wondering where I could find data on the Cubs' performance in August/September/October over the past say 50 years, versus not only what they should have done, but the league average. My hypothesis is that the Cubs players get drained more over the season due to more day games than any other team, and as a result have a worse record later in the season. It's probably rather elementary material compared to the sort of analysis you're doing, but I've always been curious as a Cubs fan, since I notice players quite often mention being tired later in the season.

Also in regards to Jared's comment, you have to agree that Freakonomics and the like definitely draw people into the world of statistical analysis. Whether they intend to get into the depths that you gentlemen do is their own prerogative. I for example absolutely love what you guys are doing in terms of understanding sports, but at the same time I could never be fully immersed in it because numbers and data quite frankly make me dizzy and anxious. I'm just not wired for it, but some people are. And the people who read "Scorecasting" may be, but might not be aware of the thought-provoking statistical analysis done in said circles.

At Friday, February 18, 2011 4:54:00 PM,  kds said...

Phil, by "93% of the time you would get the same rank", they may have meant that the ordinal ranking of the team finish in their division/league would not have changed. That is, the way they adjusted for luck would have changed the Cubs place of finish only 7% of the time, even though it would have frequently changed the games W/L.

At Friday, February 18, 2011 5:00:00 PM,  Phil Birnbaum said...

KSS: OK, I guess that's possible. 93 percent does seem a bit high, though.

At Saturday, February 19, 2011 1:03:00 AM,  Phil Birnbaum said...

Sorry, KDS, I called you KSS by typo.