Friday, August 26, 2016

"Bias" in log5 estimates -- a clarification

Last post, I argued that the log5 method has a bias. When you estimate a team's talent, in the sense of how well it would do playing a normal season against a league's worth of teams, you wind up being too conservative, giving the underdog too much of a chance to win.

Why does that happen? Because for the log5 formula to work, you need to use a team's expectation against a .500 team, not a team's expectation averaged out among all teams. The two aren't the same. You could come up with a method to figure out how big the difference is; it varies by the empirical spread of talent in the league.

It's easy to see why this is the case with a simple example. Suppose I know more statistics than 90 percent of the population with a degree. If I played a season's worth of stats exams against all of them, I'd finish with a .900 record. But, consider someone with an average amount of stats knowledge, a .500 graduate. That person probably took one or two stats courses, at most. So, I'd beat him or her almost 100 percent of the time, not just 90 percent.

The differences aren't that big in pro sports. By my estimate, an NFL or NBA team that has a .686 talent over an average season actually has a .700 talent against an average team. In the normal range of MLB team talent, the difference is negligible. A team with .565 talent -- that's 91.5 wins out of 162 -- would probably play about .566 against an average team, a discrepancy of only one point.
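To make the gap between the two kinds of talent concrete, here's a rough Python sketch. It takes a team's talent against a .500 team, runs it through log5 against every opponent in a league, and averages the results to get "talent against league." The five-team league and its talent values are made up for illustration; they are not the distribution behind my estimates above, and the function names are my own.

```python
def log5(p_a, p_b):
    """Chance that team A beats team B, where p_a and p_b are each
    team's talent (win probability) against a .500 team."""
    return p_a * (1 - p_b) / (p_a * (1 - p_b) + p_b * (1 - p_a))


def talent_against_league(p_team, opponents):
    """Average log5 win probability over a list of opponent talents."""
    return sum(log5(p_team, q) for q in opponents) / len(opponents)


# ASSUMPTION: a hypothetical league, symmetric around .500.
# Real leagues have more teams and a different spread.
opponents = [0.300, 0.400, 0.500, 0.600, 0.700]

vs_500 = 0.700
vs_league = talent_against_league(vs_500, opponents)

print(f"talent against a .500 team: {vs_500:.3f}")
print(f"talent against the league:  {vs_league:.3f}")
```

With this toy league, the "against league" figure comes out lower than .700. Plugging that lower number back into log5 as if it were "against .500" talent is what shortchanges the favorite.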

-------

Anyway, after I posted that, Ted Turocy wrote me, disagreeing with how I described the problem. Ted agreed with my argument itself, but felt strongly that it doesn't show that log5 is "biased."

Here's how I understand his objections:

1.  The log5 (or odds ratio) method has been used successfully for years.  In the academic literature, it's called the "Bradley-Terry" method, named after two academic researchers who introduced it in 1952. In one of the fields Ted studies, Contest Theory, it's been the standard for decades. In the academic world, researchers don't make the mistake I described -- it's understood completely that talent estimates relate to performance against a .500 team.  In fact, the algorithms used to estimate talent don't usually even mention season-against-league performance.

2.  The log5 formula (or the odds ratio formula, which is algebraically identical) has been formally proven to provide unbiased estimates under certain assumptions (which I'll talk about in a future post).

3.  My objection, that log5 is biased if you use "against league" estimates of talent instead of "against .500" estimates of talent, applies to ANY estimator, not just log5. That's because for "average talent against all teams" to always equal "talent against an average team", the formula would have to be linear. But linearity won't work for a correct formula, since all estimates have to be between .000 and 1.000, and linear formulas would routinely exceed those limits.
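Two of the points above are easy to check numerically, so here's a small Python sketch. It checks that log5 and the odds ratio formula give the same answer, and it shows how a linear formula can leave the .000 to 1.000 range. The linear formula here (difference in talent plus .500) is just one I picked as an example of "linear"; it isn't taken from Ted's argument or from the literature.

```python
def log5(p_a, p_b):
    """log5 form: chance A beats B, given talents against a .500 team."""
    return p_a * (1 - p_b) / (p_a * (1 - p_b) + p_b * (1 - p_a))


def odds_ratio(p_a, p_b):
    """Odds ratio form: convert each talent to odds, take the ratio,
    then convert the ratio back to a probability."""
    odds_a = p_a / (1 - p_a)
    odds_b = p_b / (1 - p_b)
    ratio = odds_a / odds_b
    return ratio / (1 + ratio)


def linear(p_a, p_b):
    """ASSUMPTION: a hypothetical linear estimator, for illustration only."""
    return 0.5 + (p_a - p_b)


# The two nonlinear forms agree for every pair tried here.
for p_a, p_b in [(0.700, 0.500), (0.565, 0.435), (0.900, 0.100)]:
    assert abs(log5(p_a, p_b) - odds_ratio(p_a, p_b)) < 1e-12

# The linear form breaks on a lopsided matchup: it returns more than 1.
print(f"log5(.900, .100)   = {log5(0.900, 0.100):.3f}")
print(f"linear(.900, .100) = {linear(0.900, 0.100):.3f}")
```

The linear version is the only kind for which "average against all teams" would always equal "against an average team," and it's also the one that hands back impossible probabilities.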

I agree with all three of these objections, with one minor nitpick: the academic literature rarely uses the term "log5." It mostly uses "Bradley-Terry," or "odds ratio."  While the formula is the same, the application is different.

In "normal" sabermetrics, "log5" just uses a season's record as an estimate of talent -- I have *never* seen a mainstream sabermetric study acknowledge that the "against .500" talent should be used instead. In my experience, it's just been commonly assumed that "talent against league" and "talent against .500" were exactly the same number -- and, sometimes that's been stated explicitly.  (In fairness, while the two aren't the same, it turns out that in baseball, they're close enough for most purposes.)

So, I was prepared to say, OK, maybe we can say that "log5" is biased the way it's used in the sabermetric literature, but Bradley-Terry isn't biased in the way it's used in the academic literature. But, Ted let me know that, no, that won't work -- the term "log5" actually *is* used in the academic literature to mean "odds ratio formula," and it's used properly.  So my description of it as "biased" is still wrong.

OK, fair enough. Ted has convinced me that my title is misleading, that it implies that the log5 formula *itself* is biased, even when used properly and talent is assumed to mean "against a .500 team." I had considered "log5" to implicitly mean "using talent against league," which I shouldn't have.

I should have said something like: "A log5 estimate is biased against the favorite when 'record against league' is used as the measure of team talent instead of 'record against .500.'"

Having said that, I say again that both Ted and I agree that the bias is there, the explanation is correct, and it's just the characterization of "log5 is biased" that's in dispute.

I'll soon update the previous post to make that clear.

------

BTW, during our e-mail exchange, Ted educated me about other aspects of the issue, for which I thank him. My current understanding of log5, as I will describe it in future posts, is much clearer because of his help.  However, I think Ted still disagrees with me on a few of the things I will be posting. I may wind up being wrong, but probably less wrong than if Ted hadn't helped me out.