Sunday, January 10, 2010

Book review: Wayne Winston's "Mathletics"

"Mathletics," by Wayne Winston, is a fine book. It's meant as an introduction to the sabermetrics of baseball, football, and basketball, with a little bit of math/Excel textbook built in. It's not perfect, but it suits its purpose very well, and it's probably the first book I'd suggest for anyone who wants a quick overview of what sabermetrics is all about in practical terms.

One of the things that I think makes the book work well is that it's not full of itself. It doesn't make grand pronouncements about how it's a revolution in thinking about sports, or how its breakthroughs are going to change the game. It just gets to work, with clear explanations of the various findings in sabermetrics. Every subject gets its own chapter, and the chapters are generally exactly as long as they need to be to get the point across. The discussion of Joe DiMaggio's hitting streak takes eight pages, but the Park Factors chapter is only three, because, really, that's all it takes to explain park factors.

About a third of the book is devoted to each of the three sports. I guess I'm most qualified to evaluate the baseball section, and I'd say the selection of subjects is pretty decent. The first few chapters deal with the oldest, most established results -- Pythagoras, linear weights, and runs created. There are chapters on the various fielding evaluation methods, on streakiness, and on the "win probability added" method of evaluating offense. DIPS gets its own chapter, in the context of evaluating pitchers. There's even a chapter on replacement value, although, strangely, Winston discusses it only in the context of win probability, rather than methods that don't involve the timing of events.

For the most part, it's a matter of personal opinion what topics in sabermetrics are more important and what topics are less important, and, since this is Winston's book and not mine, you should take my recommendations with a grain of salt. But my main complaint is that I wish there had been a discussion of random chance in the statistical record, and regression to the mean. Throughout the book, no mention is made of the fact that most extreme values of sports statistics are biased away from the mean, although I think there are a few casual mentions of small sample sizes. (But even as I write this, other topics occur to me ... Hall of Fame induction standards, for instance, and baseball draft findings.)

On the football side, there are discussions of quarterback rating methods, an analysis of NCAA overtime strategies, and NFL overtime probabilities. There's a chapter on the paradox of the passing premium, and one on fourth-down decision-making. All of this seems solidly summarized to me, at least from what I've learned about football strategy from research blogs like Brian Burke's.

One thing I learned about football that I'd never seen before (which might just be a gap in my football sabermetrics education, although I got the impression that this was original research by Winston) is a summary of the strategy of when to go for a two-point conversion instead of kicking the extra point. Winston presents a full table of the appropriate strategy depending on the score. Some of the findings are obvious, like never go for two when you're seven points behind (after the TD but before the PAT). But some are intriguing -- for instance, when you're six points ahead, you should go for the two-pointer if and only if there are likely to be fewer than 18 possessions left in the game.
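To see why the chart has to depend on the score and possessions remaining at all, it helps to notice that the raw expected-points comparison is nearly a wash. Here's the simplest version of that comparison; the success rates are ballpark assumptions of mine, not Winston's inputs:

```python
# The raw expected-points comparison that underlies any conversion chart.
# Both success rates below are rough assumed figures, not Winston's.
P_KICK = 0.98   # extra-point success rate (assumed)
P_TWO  = 0.45   # two-point conversion rate (assumed)

ev_kick = 1 * P_KICK    # expected points from kicking
ev_two  = 2 * P_TWO     # expected points from going for two

# On average points alone, kicking narrowly wins -- which is exactly
# why the interesting cases turn on the score and the number of
# possessions left, not on this average.
print(ev_kick, ev_two)  # 0.98 0.9
```

Since the two averages are so close, even a small change in the value of a particular point total (being up 8 instead of 7, say) is enough to flip the decision.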

Most of the baseball and football material was already familiar to me, as was about half the material in the basketball section (formulas for ranking players, a summary of the research on referee racism, etc.). But there was a bunch of basketball stuff I hadn't seen before, or didn't know much about. Again, some of that might be because I don't follow basketball research that closely. But I'm sure some of the stuff is original, as Winston works as a consultant to the NBA's Dallas Mavericks. I found the "plus-minus" chapters to be the most interesting (and they're also the longest), but, after reading them, I still wasn't quite sure how much of the results was real, and how much was just noise due to small sample sizes.

The plus-minus system tries to figure out a player's value by how his team does when he's on the floor. The problem with that, of course, is that the player's rating will be biased by the teammates he plays with: a crappy player might look good if he plays with Kevin Garnett all the time. The system tries to factor that out, by keeping track of all the teammates and opposition players on the floor at the same time, and finding a set of ratings that most consistently predicts outcomes based on those other nine players. (Winston uses a feature of Microsoft Excel called "Excel Solver" for this; I'm not sure how it would differ from an ordinary least-squares regression.)
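For the curious, here's a minimal sketch of the least-squares version of that idea, usually called adjusted plus-minus. The players and stint margins are invented for illustration; this is my toy, not Winston's spreadsheet:

```python
import numpy as np

# Toy adjusted plus-minus. Each row is a "stint" (a stretch with a fixed
# set of players on the floor): +1 if the player was on the floor for the
# home side, -1 for the away side, 0 if on the bench. The target is the
# home team's point margin during the stint.
players = ["A", "B", "C", "D"]
X = np.array([
    [ 1,  1, -1, -1],   # A,B vs C,D
    [ 1, -1,  1, -1],   # A,C vs B,D
    [-1,  1,  1, -1],   # B,C vs A,D
])
margins = np.array([4.0, 2.0, 0.0])  # home margin in each stint

# Least-squares ratings. In practice a ridge penalty is usually added,
# because teammates who always share the floor make the columns nearly
# collinear -- probably the main difference from a plain regression.
ratings, *_ = np.linalg.lstsq(X, margins, rcond=None)
for name, r in zip(players, ratings):
    print(name, round(r, 2))
```

With real data the design matrix has thousands of stints and hundreds of player columns, which is where the small-sample worry about the ratings comes from.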

The results are impressive, but there aren't any confidence intervals, or even simple intuitive measures of how reliable the results might be. I really like the plus-minus method in theory, but I've always wondered about how much you can trust its answers, and Winston doesn't really tell us here. The question is especially relevant because Winston goes on to try to figure out the "chemistry" of various lineups. For instance, suppose you have five players who are +1 each, but, when they're on the court together, the team winds up +15 instead of the expected +5. Winston would say that those five players complement each other somehow and perform exceptionally well together. I'd ask, could it just be random?

Another interesting study in the book, which I think is original to Winston, is a measure of which draft positions give you the best value per dollar (similar to the Massey/Thaler study of the NFL draft). It turns out that the 1-10 choices are by far the most lucrative, but that 6-10 slightly outperforms 1-5 after adjusting for salaries. There are only five years in Winston's study, though, and he tells us that the 6-10s are "pumped up by the phenomenal success of #10 picks Paul Pierce, Jason Terry, Joe Johnson, and Caron Butler."

Finally, there's a fourth section of the book, which discusses topics that aren't specific to a single sport. Gambling probabilities are covered, along with team rating methods, competitive balance, and other such things.


As I said, I really like the book and its method of presentation ... but I have to say I don't agree with everything in it, and I think there are things in it that are just plain wrong. Winston spends two chapters trying to evaluate how play has improved over the decades ("Would Ted Williams Hit .406 today?"), but I don't think the computations work. The method, which has been used by many others, is to look at all players who played two consecutive years, and see how their performance changed from one year to the next. If their performance dropped by (say) two points, you conclude that the league improved by two points between those two seasons.

As I have argued before, I think that method doesn't measure league improvement -- I think it measures the difference between player performance in the first year of their career, as compared to the last year of their career. So I think Winston's conclusion, that Ted Williams would have hit .344 in 2005, is completely without basis.

Another problem is the chapter on parity. Winston regresses each NFL team's performance this season on its performance the previous season, and gets an r-squared of .12. He does the same thing for the NBA, and gets an r-squared of .32. He therefore concludes that the NFL has more parity, and it must be because of the salary cap, the draft, and the fact that contracts in the NFL are not guaranteed.

Those might all be factors, but, as I (and Tango, and GuyM, and many others) have pointed out, the main reason is that NFL teams play 16 games, while NBA teams play 82 games. Even if the other factors affecting year-to-year performance were exactly the same, the correlation would be lower in the NFL just because random chance is a much higher proportion of performance in a 16-game season than in an 82-game season.
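A toy simulation makes the point. Give two leagues an identical spread of true talent, and even perfect year-to-year talent persistence, and the 16-game schedule alone produces a much lower r-squared than the 82-game one. (This sketch is mine, not the book's, and the talent spread is an invented round number.)

```python
import numpy as np

rng = np.random.default_rng(0)

def year_to_year_r2(n_games, n_teams=1000):
    # True winning-percentage talent, same spread for both leagues,
    # carried over unchanged from season 1 to season 2.
    talent = rng.normal(0.5, 0.06, n_teams).clip(0.05, 0.95)
    season1 = rng.binomial(n_games, talent) / n_games
    season2 = rng.binomial(n_games, talent) / n_games
    return np.corrcoef(season1, season2)[0, 1] ** 2

print(year_to_year_r2(16))   # short NFL-style schedule: low r-squared
print(year_to_year_r2(82))   # NBA-length schedule: much higher, same talent
```

The binomial noise shrinks like 1/n_games, so the shorter schedule leaves random chance as a bigger share of the observed variance, and the correlation drops even though nothing about the underlying talent differs.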

Winston also revisits the question of whether payroll can buy performance. He finds that there's a reasonable correlation between team pay and performance in baseball, but low or negative correlations in the NBA and NFL. That, he speculates, is because it's much easier to evaluate the statistics to figure out if a baseball player is good, than to figure out the relative skill of a football player or basketball player. Under that theory, NBA and NFL teams just aren't very good at figuring out who's valuable and who's not.

That doesn't sound plausible to me, that teams could be that blind. Most of the effect, I think, is that because the NBA and NFL both have a salary cap, the distribution of team payroll is very narrow. Therefore, most of the variation is luck, which means the r-squared is going to be lower.

That is: the r-squared is not an absolute measure of the relationship between pay and performance -- it's a *relative* measure, relative to the other sources of variance. In any given year, there will be a high correlation between my salary and the total salary of people in my house -- but a lower correlation between my salary and the total salary of people in the country. The r-squared depends heavily on the size of the *other* factors that contribute to variance. In the NBA and NFL, those factors are much larger than the (compressed) payroll. In MLB, however, you have teams that spend $200 million, and teams that spend $60 million. That means a lot more of the observed difference between teams is payroll-related.

One last way to look at this: in Rotisserie League Baseball, there is a high correlation between player salary and performance: Albert Pujols goes for a lot more rotisserie dollars than Eric Hinske. But if you do a correlation between team pay and performance, you'll get a very low number, because all teams pay around $260!
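The same argument in simulation form (again my own toy numbers, not Winston's): hold the true effect of spending on wins exactly constant, and shrinking the spread of payrolls alone collapses the r-squared.

```python
import numpy as np

rng = np.random.default_rng(1)

def payroll_r2(payroll_sd, n_teams=5000, wins_per_dollar=0.1, luck_sd=3.0):
    # Identical "true" effect of spending in both leagues; the only
    # difference is how spread out the payrolls are (wide like MLB,
    # narrow like a capped league or a rotisserie league).
    payroll = rng.normal(100, payroll_sd, n_teams)
    wins = wins_per_dollar * payroll + rng.normal(0, luck_sd, n_teams)
    return np.corrcoef(payroll, wins)[0, 1] ** 2

print(payroll_r2(payroll_sd=40))  # wide MLB-style payroll spread
print(payroll_r2(payroll_sd=5))   # capped-league spread: far lower
```

Same wins-per-dollar in both runs; only the payroll variance changes, and the r-squared falls with it, because luck becomes the dominant source of variance.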

Winston would do better to regress individual player performance on individual player salaries. If he did that, he'd find that there is indeed a strong link between pay and performance, but that the salary cap means it doesn't apply at the team level.


I should also mention a few picky things that could be improved. There are some silly errors that could have been fixed with a little more reviewing. For instance, in Chapter 1, Winston notes that in July, 2005, the Washington Nationals were 50-32 despite having allowed more runs than they scored. According to Pythagoras, based on their runs scored and allowed, they should have been around .500. "Sure enough," the book says, "the poor Nationals finished 81-81."

But, of course, that doesn't follow. Perhaps the Nationals should have finished .500 in their remaining 80 games, but that should have brought them to 90-72, not 81-81 -- you can't go back and reverse the games that already happened. That's just a little oversight that should have been caught, and could be misleading to someone who's reading about Pythagoras for the first time.
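The corrected arithmetic is easy to check (the exponent of 2 here is the classic Pythagorean version; Winston, like most authors, discusses refinements of it):

```python
def pythag_pct(runs_scored, runs_allowed, exponent=2):
    """Classic Pythagorean expected winning percentage."""
    rs, ra = runs_scored**exponent, runs_allowed**exponent
    return rs / (rs + ra)

# The Nationals were 50-32 despite being outscored. If they were truly
# a .500 team, the .500 applies only to the 80 games still to play:
games_played, wins_so_far = 82, 50
remaining = 162 - games_played
projected_wins = wins_so_far + round(0.500 * remaining)
print(projected_wins)  # 90, i.e. a 90-72 projection -- not 81-81
```

The 50-32 start is already in the bank; Pythagoras can only speak to the games that haven't been played yet.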

Another thing I found is that some of the Excel charts were a little off-putting. That's my opinion, which is not necessarily better than Winston's own editorial judgment (and, after all, it is his book, and part of its mandate is teaching a bit of Excel). But at least a little better formatting would have helped. In particular, numbers in cells should be rounded to the appropriate number of decimals; a chart showing the "mean strength" of the Buffalo Bills to be 3.107639211 is obviously a little too exact.

And I hate the term "Mathletics" as a substitute for sabermetrics. Hate, hate, hate. Hate.


Another strength of the book is its bibliography. Even before getting to it, at the back of the book, it's obvious that the author is quite well read in the current state of the sabermetric art; almost every source I can think of (including this blog) is cited somewhere in the text. The bibliography expands on the text references, with a listing of somewhere around 100 articles and websites, with full opinionated descriptions of what's in them. (Disclaimer: Winston says some very kind things about this site ... thanks!)

The only omission I found -- and it's a big one -- is that "The Book" blog isn't included. In my opinion, that should be among the first places sabermetricians go to learn what's new in the field (especially in baseball). Tango is very thorough in identifying which new research is worthy and which isn't, and I'm disappointed Winston didn't include that particular blog. However, "The Book" itself is listed, with a nicely favorable review and a link to Tango's own website (if not the book's).


I think of "Mathletics" as a bit of a sabermetric Wikipedia between hard covers. Despite some shortcomings that I've described here, it's the only concise, current, beginner's description of sabermetric findings that I can think of. My preference would be to see it expanded a bit. I'd love for it to have a section on hockey -- there's lots of stuff we know thanks to Alan Ryder, Gabriel Desjardins, Tyler Dellow, and others -- and there are lots of other topics in the other three sports that could be added. I'd also prefer if more of the Excel stuff was left out of the book and placed on the author's website (where the full spreadsheets can be found).

But, as I said, it's Winston's book, not mine, and until he appoints me paid editor, I should appreciate it for what it is, which is a book that fills what I think is an important unmet need. Even as it stands, it's now the first book I'd recommend to any beginner who wants a quick overview of the state of sabermetric knowledge.



At Sunday, January 10, 2010 2:51:00 PM, Blogger KiranR said...

I just started reading the book, and find it quite entertaining. As I am fairly new to the field of Sabermetrics, it does provide for a good, brief introduction. The bibliography is excellent as you point out.

At Sunday, January 10, 2010 4:41:00 PM, Anonymous Jim A said...

I've read much of the book as well and have also found it entertaining and informational, if not ground-breaking.

One thing I was pleasantly surprised by was, whereas most academics are painfully unaware of non-academic sports research, Winston seems to be well-versed in the research in all three sports and gives ample credit to previous findings where it is due.

At Monday, January 11, 2010 6:06:00 AM, Blogger Unknown said...

Yeah - I'll have to pick it up. Phil (or anyone else) are there any good 'The Book' type books on Football?

At Monday, January 11, 2010 11:04:00 AM, Blogger BMMillsy said...

Thanks for the review, Phil. Very thorough (and I was curious at what level this book would be written). I have a question, though.

"Those might all be factors, but, as I (and Tango, and GuyM, and many others) have pointed out, the main reason is that NFL teams play 16 games, while NBA teams play 82 games. Even if the other factors affecting year-to-year performance were exactly the same, the correlation would be lower in the NFL just because random chance is a much higher proportion of performance in a 16-game season than in an 82-game season."

I agree with the above statement, however do you think that your definition of 'balance' is a bit narrow? If we define perfect balance as "equal chance that fans see their own team as being able to make the playoffs", then noise could still play an important role in how fans perceive the balance of the league. Under this scenario, the idea that the NFL is more balanced makes sense.

While with more games, the playoff contenders may converge to the "true talent" levels, that doesn't mean that the fans see it that way. Given the short schedule, the turnover is much higher for NFL (random or not), creating a balance in contenders from year to year, no?

For example, if preseason you were going to guess who was more likely to make the playoffs this year: the Lakers or the Vikings, who would you choose? The Celtics or the Saints? While noise may not be an important factor in the actual talent dispersion among teams, I think it's an important part of balance in the NFL. Some of the low correlation could also be attributed to the adjusted schedule strengths based on finishes in the previous year.

At Monday, January 11, 2010 11:13:00 AM, Blogger Phil Birnbaum said...

Hi, Millsy,

Sure, I agree with you that there are multiple ways to look at competitive balance. For the record, the book doesn't actually use that term: it writes about "league parity," and in terms of correlation between seasons.

My criticism is that I think schedule length (and, although I didn't mention it in the post, game "length") is the main reason for the effect that Winston found, whether you call that "competitive balance" or not.

I agree with you also that "having a reasonable hope of your team making the playoffs" is desirable, although it's a matter of personal taste. I prefer the NFL to the NBA for that reason, but I can see how others might not. But, and again it's personal taste, I don't like how the NHL increased the "hope of your team making the playoffs" with the extra point for overtime losses.

Also, Winston uses team ratings rather than actual record in his regressions, so he does adjust for schedule strength. But your point is well taken.

At Monday, January 11, 2010 5:20:00 PM, Blogger BMMillsy said...

Interesting point about the NHL. The unintended effects of the new scoring system are really interesting. I think you raise a great question about personal tastes. I still don't think we have a firm grasp on what fans really want to see as a whole.

At Tuesday, March 01, 2011 2:32:00 PM, Blogger DRP said...

I think parity should be measured by the ability of the worst teams in the league to improve the next season. Another way is to calculate the inter-quartile range of wins for the league; a smaller number will imply greater parity as more teams have similar win totals, and the converse is also true. Yet another effective, if crude, way is to calculate total wins of the 5 best teams in the league as a percent of total wins.

