## Saturday, September 03, 2016

### A case where log5 works perfectly

Suppose there's a coin-flipping league where every team has the same talent. After the season is over, you notice one team in the standings at .800, and another is at .400. Those records include the two teams facing each other at least once.

What is the probability, in retrospect, that the .800 team beat the .400 team in a particular game where they met? The log5 formula says you figure it out like this:

.800 is a ratio of 4 wins per loss
.400 is a ratio of 2/3 wins per loss

4 divided by 2/3 equals 6

6 wins per loss is 6 wins per 7 games, which is .857.

(You can use the traditional form of the log5 formula if you want, to get the same .857.)
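Here's a minimal sketch of both calculations in Python. The function names are mine, and I'm taking the "traditional form" to be the usual (A - AB) / (A + B - 2AB) expression. Exact fractions are used so the two results can be compared without rounding.

```python
from fractions import Fraction

def log5_odds_form(pa, pb):
    """Log5 via win/loss ratios: divide A's ratio by B's, then convert back to a probability."""
    ratio_a = pa / (1 - pa)      # .800 -> 4 wins per loss
    ratio_b = pb / (1 - pb)      # .400 -> 2/3 wins per loss
    ratio = ratio_a / ratio_b    # 6 wins per loss
    return ratio / (1 + ratio)   # 6 wins per 7 games

def log5_traditional(pa, pb):
    """Traditional form, assumed here to be (A - A*B) / (A + B - 2*A*B)."""
    return (pa - pa * pb) / (pa + pb - 2 * pa * pb)

a, b = Fraction(4, 5), Fraction(2, 5)
print(log5_odds_form(a, b))                    # 6/7
print(log5_traditional(a, b))                  # 6/7
print(round(float(log5_odds_form(a, b)), 3))   # 0.857
```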

And it turns out that, in this case, the log5 formula DOES work. It works perfectly. The probability is indeed .857, and you can prove that.

I'll work out this particular example. Call the two teams A (.800) and B (.400). Suppose there were only 5 games in the season, so that A went 4-1 and B went 2-3.

Suppose the two teams only met one time. What's the chance A won that game?

Start with the case where A beat B. If that happened, A must have gone 3-1 in its other four games, and B must have gone 2-2.

There are four permutations where A goes 3-1 (WWWL, WWLW, WLWW, LWWW), and six ways for B to go 2-2 (WWLL, WLWL, WLLW, LWWL, LWLW, LLWW).

That means there are 24 (6 x 4) ways to draw up the season when A beats B.

Now, suppose that B beat A. That means A went 4-0 otherwise, and B went 1-3.

There is only one way for A to go unbeaten (WWWW), and only four ways to arrange B's 1-3 (WLLL, LWLL, LLWL, LLLW).

That means that there are 4 (1 x 4) ways to draw up the season when B beats A.

Since this is coin flipping, all the cases have equal probability of happening. So, A beats B 24 times for every 4 times that B beats A.

That's a ratio of 6:1, which is 6/7, which is .857 -- exactly as log5 predicts.
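If you'd rather not trust the counting, it can be checked by brute force. This sketch (the function name and layout are my own) flips a coin for the A-vs-B game and for each team's other four games, keeps only the seasons where A ends up 4-1 and B ends up 2-3, and tallies who won the head-to-head game. It assumes, as above, that the two teams met exactly once.

```python
from fractions import Fraction
from itertools import product

def count_seasons(n, a, b):
    """Count equally likely coin-flip seasons where A finishes with a wins and
    B with b wins in n games each, split by who won the single A-vs-B game."""
    a_beat_b = b_beat_a = 0
    # One flip for the head-to-head game, then n-1 flips for each team's other games.
    for h2h, *rest in product([0, 1], repeat=1 + 2 * (n - 1)):
        a_total = h2h + sum(rest[: n - 1])        # h2h == 1 means A beat B
        b_total = (1 - h2h) + sum(rest[n - 1 :])
        if a_total == a and b_total == b:
            if h2h:
                a_beat_b += 1
            else:
                b_beat_a += 1
    return a_beat_b, b_beat_a

a_beat_b, b_beat_a = count_seasons(n=5, a=4, b=2)
print(a_beat_b, b_beat_a)                        # 24 4
print(Fraction(a_beat_b, a_beat_b + b_beat_a))   # 6/7
```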

-------

It's not that hard to go from this example to a proof. Just replace the raw numbers with variables for the total number of games (n), the number of games A wins (a), and the number of games B wins (b). When you count permutations, you'll wind up with factorial terms, and when you divide the A permutations by the B permutations, the factorials cancel out. Writing p for the probability that A won the game, you'll be left with the odds

p/(1-p) = (a/(n-a)) / (b/(n-b))

Which is exactly the log5 formula.
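For anyone who wants to see the cancellation, here's a sketch of the counting. It assumes, as in the example, that A and B meet exactly once and each plays n-1 other games. The binomial notation is my shorthand for "number of ways to arrange the wins."

```latex
\begin{aligned}
\#\{\text{A beat B}\} &= \binom{n-1}{a-1}\binom{n-1}{b}, \\
\#\{\text{B beat A}\} &= \binom{n-1}{a}\binom{n-1}{b-1}, \\
\frac{\#\{\text{A beat B}\}}{\#\{\text{B beat A}\}}
  &= \frac{\binom{n-1}{a-1}}{\binom{n-1}{a}} \cdot \frac{\binom{n-1}{b}}{\binom{n-1}{b-1}}
   = \frac{a}{n-a} \cdot \frac{n-b}{b}
   = \frac{a/(n-a)}{b/(n-b)}.
\end{aligned}
```

With n = 5, a = 4, and b = 2, that's (4/1) x (3/2) = 6, the same 6:1 ratio as in the worked example.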

I don't know much about the history of log5, but some of you do. Was this part of the genesis of log5, that it could be proven to work retrospectively, so when it seemed to work pretty decently as a forecast, it became the standard?

-------

But wait a minute -- last month, I argued that log5 couldn't possibly work when you used season records. If you recall, I posted this chart:

matchup           log5
---------------------------
.800 vs .800       .500
.800 vs .700       .631
.800 vs .600       .727
.800 vs .500       .800
.800 vs .400       .857
.800 vs .300       .903
.800 vs .200       .941
---------------------------
Average           .766

This says that an .800 team, playing against the league as log5 would predict, would actually play at a .766 pace. That's a contradiction -- it should be .800 -- so log5 must be wrong!

One difference between then and now is that, before, we had the .800 team playing against a clone of itself. That's not true here, so let's redo the chart without the first line:

matchup           log5
---------------------------
.800 vs .700       .631
.800 vs .600       .727
.800 vs .500       .800
.800 vs .400       .857
.800 vs .300       .903
.800 vs .200       .941
---------------------------
Average           .810

Well, the average still isn't .800, so we still have a problem.
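Both averages are easy to reproduce. This is a small sketch with my own function names, using unrounded log5 values, so the individual lines can differ from the chart in the last digit even though the averages agree.

```python
def log5(pa, pb):
    """Log5 estimate of the chance that a team with record pa beats a team with record pb."""
    return (pa - pa * pb) / (pa + pb - 2 * pa * pb)

def average_vs(team, opponents):
    """Average log5 winning percentage of `team` against the listed opponents."""
    probs = [log5(team, opp) for opp in opponents]
    return sum(probs) / len(probs)

opponents = [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
print(f"{average_vs(0.8, opponents):.3f}")       # 0.766 (with the .800 clone)
print(f"{average_vs(0.8, opponents[1:]):.3f}")   # 0.810 (without it)
```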

So, what's going on? Is my logic wrong here, or is my logic wrong there? Does log5 work, or doesn't it?

This bugged me for a while, until I sorted it out in my head. I think both conclusions are correct. The log5 formula actually *does* work in this case, and it actually *does not* work in the other case, for exactly the reasons described.

But what about these charts that show the contradiction? They apply there, but they don't apply here.

The difference is: when .800 is the *talent* of the team, it's constant, and you can use it on every line of the chart. But, when you use the *retrospectively observed performance*, it changes with every game. So you can't use .800 in every line of the chart.

Suppose the (eventual) 4-1 team wins the first game. In that case, it's only an (eventual) 3-1 team after that, so its retrospectively observed performance for the next game isn't .800, it's .750. You have to draw up the chart like this:

matchup           log5
---------------------------
.800 vs .700       .631
.750 vs .600       .667
...

If it loses the first game, it's 1.000 after, and you draw up the chart like this:

matchup           log5
---------------------------
.800 vs .700       .631
1.000 vs .600      1.000
...

So the chart has to be different every time, based on what actually happens in the games.

I believe that if you were to do every possible permutation of the season, weighted by log5 probability, and average the averages, you would indeed wind up with .800.

-------

Now, the proof that validates the retrospective use of log5 only works because we assumed that games are decided by coin flips. If that weren't the case, then all the permutations wouldn't prospectively have an equal chance of happening, and the logic would fall apart.

But would the *result* still hold? If you don't know A's talent or B's talent, but they still go 4-1 and 2-3, respectively, does the 6 out of 7 still hold?

I don't think it does. Again, imagine "height baseball," where the taller team always wins. It could be that A is the second-tallest team out of 6, and B is the fourth-tallest. That would be consistent with the 4-1 and 2-3 records (imagine a round-robin season). But A would have a 100% chance of beating B, not 85.7%.
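To make the height-baseball example concrete, here's a quick sketch of a six-team round-robin where the taller team always wins. The heights are made-up placeholders; only their order matters.

```python
from itertools import combinations

# Hypothetical heights, tallest first. Index 1 is team A, index 3 is team B.
heights = [6, 5, 4, 3, 2, 1]
wins = [0] * len(heights)

# Every pair of teams meets once, and the taller team always wins.
for i, j in combinations(range(len(heights)), 2):
    winner = i if heights[i] > heights[j] else j
    wins[winner] += 1

records = [(w, len(heights) - 1 - w) for w in wins]
print(records)      # [(5, 0), (4, 1), (3, 2), (2, 3), (1, 4), (0, 5)]

a_beats_b = heights[1] > heights[3]
print(a_beats_b)    # True: A beats B every time, not 6 times out of 7
```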

So this is a special case. Whether log5 works here because there's something special about the 50%, or whether it's because all teams are the same, or whether it's just that the average record against all teams happens to equal the record against the average team ... I don't know.

But still. To me, there are no coincidences in math, just relationships that look coincidental until we see the connection. Maybe when I understand log5 better, it'll be self-evident why it works here.

As I said, some of you guys reading this are much more familiar with the intricacies of log5 than I am. Is this a known result? Am I reinventing a wheel?


At Monday, September 05, 2016 9:30:00 AM,  Tangotiger said...

The height example doesn't work because it's not a probability distribution. It's pre-determined fate. It's like saying "home team wins".

It's been a while since I looked, but I think the Odds Ratio only works when you have a normal distribution with a mean of .500 and a standard deviation that is "small". The more you deviate from that, the less true it is.

You can try for example with a uniform distribution, and I don't know how well it works.

In addition, there's the point that "scoring confrontations" is not the same thing as flipping a coin. A .600 vs. .400 talent matchup in baseball and one in basketball wouldn't necessarily produce the same outcome.

At Wednesday, September 14, 2016 8:33:00 PM,  Anonymous said...

RE "height baseball":
You're using the wrong measure of talent again. If the teams' talents are Ta >> Tb >> Tc >> Td ..., then team a beats all teams ~=100%, team b beats team a ~=0% and all the other teams ~=100%, etc.