Thursday, January 21, 2010

Chopped liver II

David Berri and J.C. Bradbury have a new paper out. Called "Working in the Land of the Metricians," it purports to be a guide for Ph.D. sports economists in how to interact with sabermetricians.

The subject is an important one, but, sadly, the paper isn't very constructive.
There's a good quote-by-quote critique of the paper at Tango's blog, which I agree with substantially enough that I can just refer you there. (Also, since the paper is gated, you can get a good feel for what's happening in it by reading Tango's take.)  

UPDATE: the paper is now available in full (.pdf) at Dave Berri's site.

For my part, I'll just concentrate on one major point -- that Berri and Bradbury still treat the non-academic community's work as if it barely has any value at all. And they're pretty forthright about it. Despite paying lip service to the idea that maybe non-academic researchers have *something* to contribute, the authors remain wilfully blind to 98 percent of the sabermetric community's contribution to the field.

Indeed, Berri and Bradbury explicitly refuse to recognize that the active research community has any expertise at all:


"Birnbaum considers sabermetricians to be "no less intelligent than academic economists" and superior to economists in their understanding of baseball. This statement reveals a curious worldview. On one hand, the aspect that is universal across both groups -- members of both communities have been devoted sports fans since an early age -- is considered unique to the nonacademic sports anaysts. On the other hand, when it comes to the aspect that is unique to academics -- academia normally involves many years of advanced training and requires its participants to be judged competent by their peers in a "publish or perish" environment -- metricians demand equal recognition. In our view, this mentality begets misplaced confidence."


Get what they're saying here? Bill James, Tom Tango, Mitchel Lichtman, Andy Dolphin, Pete Palmer, those guys hired by baseball teams, all those other sabermetric researchers -- they're not baseball experts at all. They're *just sports fans*. How could they possibly know more about analyzing baseball than any other fan, unless they've formally studied econometrics?

It's astonishing that Berri and Bradbury could possibly believe that economists, even sports economists, know more about baseball than these guys, who have built their careers around analyzing the game. Equally astonishing is their implication that we *are* less intelligent than they are. At first, I thought that they didn't realize what they were saying. But, no, it seems pretty clear that they *do* believe it.

What has become apparent in sabermetricians' debates with Bradbury (and, to a lesser extent, Berri, who doesn't engage us as much) is that they disagree with us on almost every point we make, and that they seem to be uncomfortable arguing informally. I can't recall a single time that either of them has conceded that they're wrong, even on a small point. Economists are supposed to be fond of hypotheticals, simple models, and playful arguments to illustrate a point (see this Paul Krugman column), but Bradbury and Berri, not so much. Attempts to describe their models with simplified analogies are usually met with detailed rebukes about how we don't understand their econometric methods.

In that light, it's easier to see where they're coming from. They believe that (a) only formal, peer-reviewed research counts as knowledge, and (b) all us non-peer-reviewed people have been wrong every time we've disagreed with their logic. If both of those were actually true, they'd be right -- we'd just be ignorant sports fans who don't have any expertise and need to be educated.

As for the sabermetric findings that they actually use, like DIPS and OPS ... Berri and Bradbury seem to consider them a form of folk wisdom that the unwashed baseball fans managed to stumble upon, and argue that economists should not accept them until they've been verified by regression methods that would pass peer review and be publishable in academic journals.

Back in 2006, in the post that Bradbury and Berri quoted above, I accused Bradbury of ignoring the findings of sabermetricians in one of his papers. At the time, I thought perhaps I was too harsh. I was wrong. This paper shows that he truly believes that, as non-peer-reviewed sports fans with no special expertise, our research findings are unworthy of being cited.

Indeed, in Bradbury's exposition (I am assuming all the arguments in the baseball portion of the current paper are Bradbury's, although Berri is still listed as co-author), he treats the history of sabermetric knowledge as if it were mostly a series of academic papers. That's absurd; it's indisputable that, conservatively, at least 90 percent of our sabermetric knowledge came from outside academia. If you were writing the history of research about Linear Weights, what would you include? Think about it for a minute.

Ready? Here's how Bradbury sees it:

--In 1992, academic A.A. Blass published a study where he estimated linear weights by regression.

--In 2005, academic Ted Turocy published a paper which highlighted the "omitted variable" bias making the results not as accurate as they could be.

--In 1963, academic George Lindsey had published a paper with a rudimentary form of the same equation, but not using regression.

--In 1984, sabermetricians John Thorn and Pete Palmer "popularized" and "updated" Lindsey's work.

--In 2003, academics Jim Albert and Jay Bennett compared the two approaches.

Got it? Four academic papers (Albert/Bennett is actually their book, Curve Ball, but no matter) and Thorn/Palmer. On top of that, the only mentions of Thorn and Palmer are their "updating" and "popularizing". Each of the academics' work, on the other hand, is described in some detail.

Is that how you would characterize the state of knowledge, that the history of accumulated knowledge about Linear Weights comes from these five academics and a cursory contribution from Pete Palmer? I mean, come on. There's a huge literature out there if you look outside academia, including studies on various improvements to the original.

It gets worse, in the DIPS discussion.

Bradbury starts off by reviewing the seminal Gerald Scully paper. In 1974, Scully (a famed sports economist who passed away in 2009) published a paper that tried to value players' monetary contributions to their teams. Regrettably, he used strikeout-to-walk ratio as a proxy for the pitcher's value to his team's success on the field. That's not a particularly good measure of the value of a pitcher; ERA is much better. Even before sabermetrics, everyone knew that, including casual baseball fans and Joe Morgan. And if they didn't, Bill James would have made it clear to them in his Abstracts, so there was no excuse for a baseball researcher to not be aware of that after, say, 1983.

And Bradbury acknowledges that ERA was perceived to be better than K/BB ratio. Does he cite common sense, conventional wisdom, or Bill James? Nope. He cites two academic papers from the 1990s. No, really:


"... [Andrew] Zimbalist (1992b) and [Anthony] Krautmann (1999) argue that ERA is a better measure of pitcher quality ..."


So, before 1992, nobody else argued it? Well, OK, if you say so.

Anyway, with that established, Bradbury continues. In 2001, Voros McCracken came along, and, in an "essay" on the "popular sabermetric Web site Baseball Prospectus," he "suggested" that pitchers have little control over what happens to balls in play. At this point, Bradbury checks the correlation of pitchers' BABIP in consecutive seasons, and finds it's fairly low (.24).

"This supports McCracken's assertion," he writes.

Okay, thanks for the verification! But, er, actually, it's not like the sabermetric community was just sitting on its hands the past eight or nine years, staring at McCracken's hypothesis and wondering if someone would ever come along and tell them if it was true. Sabermetricians of all sorts have done huge volumes of work to refine and prove versions of the DIPS hypothesis. For a long time, you couldn't hit any sabermetric website without DIPS this and DIPS that hitting you from every angle, with all kinds of theories about it tested and studied.
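For readers who haven't seen one, the kind of consecutive-season check Bradbury ran is easy to sketch. The BABIP and correlation formulas below are standard; the pitcher-season numbers are made up purely for illustration (with real data, Bradbury reports r ≈ .24):

```python
# Sketch of a consecutive-season correlation check like Bradbury's.
# The BABIP figures below are invented for illustration only.

def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    return (hits - home_runs) / (at_bats - strikeouts - home_runs + sac_flies)

def correlation(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical BABIP-allowed for the same eight pitchers in
# back-to-back seasons.
year1 = [0.294, 0.312, 0.288, 0.305, 0.276, 0.301, 0.283, 0.317]
year2 = [0.301, 0.289, 0.296, 0.310, 0.299, 0.284, 0.302, 0.291]

# A low r is what "little control over balls in play" predicts;
# Bradbury's real-data figure was about .24.
print(correlation(year1, year2))
```

Nothing exotic: any sabermetrician with a spreadsheet could (and many did) run exactly this check.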

It's perfectly fine that Bradbury verifies McCracken with his quick little regression here, but why wouldn't he acknowledge everyone else, by, for instance, saying that his study is an example of the kinds of confirmation studies that the sabermetric community has been doing since 2001? In light of the rest of the article, which implies that sabermetricians are just unsophisticated baseball fans, that omission would reasonably lead readers to incorrectly assume that Bradbury's little study is the first of its kind.

But, instead, Bradbury ignores all those years of sabermetric study of the DIPS issue, just because it wasn't academically published or peer-reviewed. That's as wrong now as it was when I wrote about it in 2006 -- especially in an essay that's ostensibly suggesting that academics can benefit from outside research.

The DIPS approach is as well accepted in sabermetrics as the Coase Theorem is accepted in economics. The difference is, if I published something mentioning Coase's hypothesis, and then published a study "supporting" it without citing any other study or mentioning that it's a canon of the economics literature, Bradbury would go ballistic, laying into my ignorance like ... like Tiger Woods on a supermodel. (Sorry.) The other way around, though, and it's all OK.

Anyway, that's just the prelude -- this is the point where Bradbury's argument gets really bizarre.

Why did Bradbury bring up DIPS? Because it shows that, instead of using ERA to evaluate the skill of a pitcher, it's better to just use walks, strikeouts, and home runs allowed, thus eliminating a lot of random chance from the pitcher's record. That part is absolutely fine. But then he comes back to Scully.

Remember when Scully did his 1974 study that used strikeout-to-walk ratio as a measure of a pitcher's value? And everyone agreed that he should have used ERA instead? Well, now, hang on! McCracken has shown us that if we ignore a pitcher's balls in play, we can get a better measure of his talent than if we use ERA. Eliminating balls in play leaves only BB, K, and HR. That means strikeouts and walks are really important. And, in turn, that means that Scully was actually correct back in 1974 when he emphasized strikeouts and walks by using K/BB ratio as his statistic of choice!

To make this bizarre argument Bradbury ignores the fact you have to combine K and BB in a very specific way, or it doesn't work well at all, and that K/BB ratio is still worse than ERA. And HR are important too. But, argues Bradbury, at least Scully was right in that it had to be K and BB. Maybe he wasn't completely correct, but he had the right idea.
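To see why "it had to be K and BB" isn't enough, compare Scully's raw ratio with a proper DIPS-style combination. The sketch below uses FIP (Fielding Independent Pitching), a later formalization of the DIPS idea with the now-standard weights 13/3/-2 and a league constant; the constant and the pitcher lines are assumptions for illustration, not anything from Scully's or Bradbury's papers:

```python
# One standard way the DIPS components get combined: FIP, a later
# formalization with fixed weights, scaled to ERA by a league constant.
def fip(hr, bb, k, ip, league_constant=3.10):
    """FIP = (13*HR + 3*BB - 2*K) / IP + constant."""
    return (13 * hr + 3 * bb - 2 * k) / ip + league_constant

def k_per_bb(k, bb):
    """Scully's stat: raw strikeout-to-walk ratio."""
    return k / bb

# Two hypothetical pitchers with identical K and BB but very different
# HR totals: K/BB can't tell them apart at all, FIP can.
a = dict(hr=12, bb=50, k=150, ip=200)
b = dict(hr=30, bb=50, k=150, ip=200)
print(k_per_bb(a["k"], a["bb"]), k_per_bb(b["k"], b["bb"]))  # 3.0 3.0
print(round(fip(**a), 2), round(fip(**b), 2))  # about 3.13 vs. 4.3
```

The components matter, but so do the weights and the home runs; picking K/BB in 1974 is not the same insight.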

Why did Gerald Scully choose to value pitchers by K/BB? According to Bradbury, it wasn't because he just gave it a guess. He did it because he intuitively anticipated that DIPS was true. Voros McCracken, the non-academic sabermetrician, just served later to confirm Scully's original insight.

See? The academics were right all along!

If that sounds ludicrous, it is. I really hope you don't believe what I'm saying here ... I hope you're thinking that nobody could actually make that argument with a straight face -- that Gerald Scully's choice of K/BB ratio as his metric is an anticipation of a completely different, complex, 12-part formula for DIPS ERA that happens to partly depend on K and BB. I hope you believe that I'm making it up.

But I'm not. That's actually what Bradbury argues! Here are the quotes:


"If a measure varies considerably for an individual player over time, it is likely that the measure is heavily polluted by luck ... Scully appeared to understand this point with his choice of the strikeout-to-walk ratio to measure pitcher quality.

"Later research from outside academia [McCracken] confirmed Scully's original approach and suggested [also] including home-run prevention when evaluating pitchers. ...

"Consequently, we see Scully's general approach is confirmed by a metrician, demonstrating what the nonacademic sports research community can contribute to sports economics research."


Could there be a sillier, more self-serving rationalization?

Oh, and by the way ... it seems that Scully wasn't even a baseball fan at the time he started writing his study. How does Bradbury think Scully was able to anticipate a result that wouldn't become apparent until 27 years later? A result, moreover, that pertained to a sport Scully probably didn't know that much about, in a field of science that didn't even exist at the time, that wouldn't emerge until years of study by non-academic researchers, and that was so surprising it shocked even the most veteran researchers in the field?

Maybe it was that "years of advanced training" in economics -- obviously so much more valuable than the non-expertise in sabermetrics that Voros McCracken, a mere baseball fan, couldn't possibly have had.


9 Comments:

At Thursday, January 21, 2010 11:55:00 AM, Blogger Chiasmus said...

You've ably disassembled these guys. I'd just add a couple of things.

First off, although you and Tango have cast this as an academics-vs.-proles debate, it reminds me a lot of the way many economists treat other academic disciplines. The same things happen there: the arrogance, the ignorance of non-economics work, the re-inventing the wheel and then claiming credit for it. I'm a sociologist, and I can imagine re-writing your first quote like so:

[X] considers sociologists to be "no less intelligent than economists" and superior to economists in their understanding of culture and social structure. This statement reveals a curious worldview. On one hand, the aspect that is universal across both groups -- members of both communities have been members of society since an early age -- is considered unique to the sociological analysts. On the other hand, when it comes to the aspect that is unique to economists -- economics normally involves many years of advanced mathematical and econometric training and requires its participants to be judged competent by their peers in the economics journal environment -- sociologists demand equal recognition. In our view, this mentality begets misplaced confidence.

The only difference is that while many economists probably think the above, they'd be less likely to say it so explicitly, whereas it's more acceptable to publicly slag the intelligence of people who are entirely outside academia.

The other thing is that I think this kind of disciplinary imperialism is particularly severe among mediocre economists. Neither of these guys is as famous as Steve Levitt, and neither is teaching at Harvard or contending for the Nobel Prize, and I think you have to understand their insecurity and their need to rigorously police the boundaries of acceptable debate in that light. They want to impress other, higher-status economists, and to do that they have to talk mostly about other economic work. Plus, they probably have a lot of their egos bound up with the idea that they're smarter and more sophisticated than just "some guy on the Internet".

The unfortunate thing, of course, is that casual observers and the media still give unwarranted weight to guys with "Ph.D. in economics" on their C.V., even after the whole discipline has been shown to be a bit less than...reliable as a guide to reality.

 
At Thursday, January 21, 2010 12:42:00 PM, Blogger Phil Birnbaum said...

Sure, I can believe there's a lot of infighting going on between disciplines, or even within disciplines. What strikes me about this one, though, is that it's so explicit. Normally, you don't quote group X saying they're just as intelligent as your group, and then imply that it's false. Nor do you normally take a group that has spent years and years passionately devoted to an area of human activity, and argue that they're no more knowledgeable about it than anyone else.

I mean, even if it were TRUE, you wouldn't say such things. And whether it's true or not true, you normally pay lip service, mumble something about "yes, although we don't agree with you on this particular point, don't take it personally, we've learned a lot from you in the past, and keep up the good work" ... you know, like MLB probably used to say publicly about sabermetricians 20 years ago.

That's the most interesting thing about this ... it's so brazen. They're basically telling Tango and Bill James, "you don't know any more about baseball than anyone else." And we look at them kind of stunned, like, are they really saying that? And they are!

 
At Thursday, January 21, 2010 5:42:00 PM, Blogger Vic Ferrari said...

Man, you really hate this dude. I suspect that you and Tom have done more to make him famous than a publicist ever could. That's a shame.

I'm interested to see his math for his aging paper. I read Mitchel Lichtman's terse explanation of Bradbury's methodology at The Hardball Times. Surely there is more to it than that, no? At the very least he must have corrected for the plate appearances for each player-season. So the initial error of each data point would be divided by PA in his regression.

Even then, it's pretty much useless. Who gives a toss about the career trajectory of the mythical 'average player'? Or, in this case, the mythical 'average player' from a subset (guys with 5000 ABs or PAs, whichever it was).

The sensible way to do the math is revealed here, as examples in Jim Albert's "Bayesian Computation with R". If I've linked that properly you should be able to read three excerpts from the book that cover the topic with sound reason and math that matches said reason nearly exactly. To my mind, his process is impeccable.

The page that is skipped by google preview (p.277) presumably explains the regression he used (quadratic by the looks of it) and the correction I described in the first paragraph of this post (that's why I wrote it).

On page 277 he surely also explains why he chose that regression model, which is presumably because he knew some really big math (BFM) lay ahead. In any case that missing page is very likely to be similar to Bradbury's paper. Though if we're wagering I'll put my money on Albert's preamble to the solution being more thorough and reasonable than Bradbury's paper.

 
At Thursday, January 21, 2010 5:48:00 PM, Blogger Phil Birnbaum said...

Vic,

Actually, I don't hate J.C. at all. I've never even met him. I just really, really disagree with him.

You can get his paper by sending him an e-mail ... he said he sends it to anyone who asks. That's how I got it. He's reachable through his website, http://www.sabernomics.com .

Your link sounds interesting, but didn't work.

 
At Thursday, January 21, 2010 5:54:00 PM, Blogger Vic Ferrari said...

Just to add:

If you look at Albert's career trajectory estimates for several players, there is a marked difference between the career trajectories of different guys.

Some of that may be injury, in all sports we know that when average players get injured in a way that significantly hurts their performance ... it's retirement time, by their choice or management's.

When the same thing happens to a great player ... he carries on at a lower level of true ability. Still a higher level than most, but a lower level for him.

If someone replicated Albert's math, and printed out the estimated career trajectories for all the 5000 PA guys ... then people with good historical knowledge of MLB, and terrific memories ... they could probably identify similar player types and see if their career trajectories tended to match up.

Then we could convert those observations back into the language of reason (math) ... and we'd be getting somewhere.

 
At Thursday, January 21, 2010 5:56:00 PM, Blogger Vic Ferrari said...

Sorry about the link:

http://books.google.ca/books?id=AALhk_mt7SYC&printsec=frontcover&dq=computation+with+r+albert&cd=1#v=onepage&q=career+trajectory&f=false

 
At Thursday, January 21, 2010 6:10:00 PM, Blogger Vic Ferrari said...

Further, on a more philosophical level. It is unthinkable that Bradbury could confuse McCracken with Scully.

When I was at my family's home for Thanksgiving, I picked a baseball book off of the shelf. My father has been dead for 25 years, but he was a huge baseball fan, he has a lot of baseball books.

It was called "My Best Day In Baseball", or something similar. Published in the mid sixties if my memory is right. Dozens of authors contributed interviews with famous players on the subject. I suspect that most were print journalists.

Firstly, I was surprised how almost all of the writers busted out stats all over the place. Secondly, the use of the k/w stat was common. Maybe not quite as much as ERA, but close.

So I suspect that this is why Scully used k/w.

McCracken, with DIPS, conveyed an understanding that baseball is a sequence of events, and randomness is the essence of the universe. This as opposed to treating randomness as an element of error ... which is the road to nowhere IMO.

Not that we should build a statue for Voros. Like most good ideas, it's simple, and pretty damn obvious in hindsight. Even though that train of thought does lead to some ungodly math.

It is also unthinkable that many other people hadn't come to the same conclusion long ago. Just that it is extremely unlikely that Scully was one of them, and equally unlikely that Bradbury understands what McCracken's math tells us, or why.

 
At Thursday, January 21, 2010 7:17:00 PM, Blogger Vic Ferrari said...

Phil, thanks for the info. Hopefully I'll get a copy of his paper soon.

Albert only looked at players with 5000 PA as well, though he never suggested that lesser players should be expected to have similar career trajectories.

Having found some of his other estimated career trajectories for players ... there are significant differences between many star players.

So intuitively, I suspect that the age of peak performance is generally quite a bit higher than Lichtman has calculated. Fortunately he is a very good writer, and intellectually honest to an uncommon degree. So I know exactly what he has done, and why.

I also think "couplets" are a terrific way to assess the extent baseball's overreliance on previous results in assessing players, and to get a feel for the randomness within the game. Unfortunately that makes them very difficult to use for his purpose.

The problem comes with the "survival of the luckiest" phenomenon, which Mitchel explains very well.

If we believe that the baseball gods are a lot less like the God we were taught to love in Sunday school, and a lot more like the God in the parts of the Old Testament that nobody reads ... vengeful, impetuous, unfair, wholly frightening and not particularly likable. A God who often punishes the innocent with verve for the sins of an otherwise decent man, and lets the evil and murderous walk among us. Well, that's probably going to shift his curves to the right. Perhaps not, but I suspect so.

Regrettably that leads us to some BFM. The math would be a lot simpler if you looked at the elements of wOBA separately and then summed them with weights after the fact. It would still be a bear, though.

You could use OBP (admittedly much weaker than wOBA, but let's try and make life easier :) ) and the ability distribution of the beta form g(p) ~ p^(LOBP*K-1) * (1-p)^((1-LOBP)*K-1), where LOBP is league average OBP (or more correctly, expected league average ... what can ya do, randomness is everywhere).

I believe that Albert deduced the best-fit K to be 220 for OBP, or something like that. You'd want to check for yourself over a few seasons. Even at that, there is no earthly reason that ability in MLB is distributed in Beta fashion, and we will never, ever know how it truly exists. We can only hope to build sensible models from bricks of reason and check them against future results to test their veracity. And there is never such a thing as too much common sense. Ever. So you can consider that good enough, or start tweaking the distribution of talent manually if you have a hundred years to kill ... or with a computer sim ... or with brilliant math if you're smarter than me.
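The beta prior Vic describes is conjugate to the binomial, so the posterior mean has a simple closed form: a weighted average of the player's observed rate and the league rate, with K plate appearances' worth of weight on the prior. A minimal sketch, treating K = 220 and the league OBP below as placeholder assumptions:

```python
# Shrinkage estimate of OBP ability under a Beta(LOBP*K, (1-LOBP)*K)
# prior with a binomial likelihood. K = 220 is the ballpark best-fit
# value attributed to Albert above; league_obp is a placeholder.
def posterior_obp(times_on_base, pa, league_obp=0.333, k=220):
    """Posterior mean of OBP ability: (OB + LOBP*K) / (PA + K)."""
    return (times_on_base + league_obp * k) / (pa + k)

# A hot .450 stretch over 100 PA gets pulled most of the way back
# toward the league average; the same .450 over 600 PA moves the
# estimate of true ability much further from the prior.
print(posterior_obp(45, 100))   # pulled strongly toward .333
print(posterior_obp(270, 600))  # pulled much less
```

The plate-appearance count is absorbed automatically: more PA, more weight on the observed rate.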

Then you'd have likelihood distributions for the abilities of each player and each season. The number of plate appearances would be absorbed in that.

At that point it's straightforward.

I wrote a post recently that explains the process with histograms. Bayes by Lego, really. It's recent. It uses the "brute force" method, but you don't have to have too much imagination to envision the lego blocks becoming vanishing small ... and further envisioning the math hell that follows.

 
At Thursday, January 21, 2010 8:50:00 PM, Blogger Vic Ferrari said...

Mr Bradbury was gracious enough to send me his paper. And although I am spectacularly unqualified to review an economics paper, I'll comment anyways.

I have no idea what the Baltagi and Wu random-effects method is. An internet search turns up nothing that is free to view. Presumably nu is a plate appearance weighting, though he describes it as player specific, not player-season specific ... so I dunno. Theta-D is not explained, I won't even venture a guess. Presumably Baltagi and Wu make that explicit.

The use of the Z-score makes sense. Levels the field for when MLB changes the height of the mound, etc. I'm sure that some sabermetricians have better measures ... but they won't be far off that, I suspect. The adjusting for ballpark effects is good, and something that the Albert article in the reference list did NOT do.

He has not accounted for the role of chance in season by season fluctuations, so his curves for individual players are going to be pretty wild. Has Mr Bradbury provided the trajectory curves for the individual players in the study? Even for some guys whose true abilities arced along in a normalish way ... the universe will have made some of their resulting career trajectories absolutely madass. Not "perhaps", that is as near a certainty as can exist. Unless the regression method contains some ethereal element that dampens that.

If this wild variability in the trajectories did occur ... then that's going to intuitively lead the author to believe that this sample of the population is not unique. And for him to suppose that the same applies to the league as a whole, with "survival of the luckiest", as eloquently explained by Lichtman, hiding it from our view. Which may or may not be true.

Having wasted hours reading all of this stuff today, my conclusion ... for wagering purposes, for a random player whose true underlying ability vs time curve is about to be revealed by God himself ... smart money takes the OVER on 29 years for hitters, and 28 years for pitchers.

Kind of a moot point for pitchers I know ... sabermetricians surely know a lot of detail on the subject, but surely all other considerations stand in the huge shadows of injury (for everyone) and wear & tear (esp for guys who've thrown a lot of pitches or have awkward styles).

One other thing: Bradbury references an article by Bradbury and Drinen (2008) that concludes that team effects on batter LW performance are negligible. Is this true? Intuitively I would have thought it would be better to play for a strong team in a weak NL division than a weak team in a strong AL division. Perhaps not.

And now I'll show myself out. :)

 
