Thursday, August 24, 2006

Finding a true talent level for an outcome distribution

What is the distribution of team talent in major league baseball? For instance, how many teams are good enough to actually win 100 games in a season?

You can’t just go by actual won-loss records, because any team that actually wins 100 games has probably done so aided by a bit of luck. To oversimplify, a team that wins 100 games might be a 95-game team that got lucky, or a 105-game team that got unlucky. There are many, many more 95-game teams than 105-game teams, and so the average team that wins 100 games is probably closer to 95 games than to 105 games.

In general, the distribution of talent is much narrower than the distribution of actual results. One way to see this is to consider the extreme case: suppose every team had the same (.500) talent, as if every game outcome were determined by flipping a coin. In that case, the talent distribution is the narrowest it can possibly be, but season win totals are still approximately normally distributed, with a standard deviation of about 6 wins over a 162-game season.
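As a quick check of that figure, here is a minimal sketch of the binomial calculation. It assumes a 162-game season in which every game is an independent coin flip; the names `GAMES` and `P_WIN` are just labels chosen for this illustration.

```python
import math

GAMES = 162   # assumed MLB season length
P_WIN = 0.5   # every game is a coin flip

# Binomial standard deviation of season wins: sqrt(n * p * (1 - p))
sd_wins = math.sqrt(GAMES * P_WIN * (1 - P_WIN))

# The same spread expressed as a winning percentage
sd_pct = sd_wins / GAMES

print(f"SD of wins: {sd_wins:.2f}")
print(f"SD of winning percentage: {sd_pct:.4f}")
```

So even a league of identical teams would show a spread of roughly six wins from luck alone.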

So, how can we determine the talent distribution, given that we only know the outcome distribution?

Tangotiger has a method. He points out that

var(outcome) = var(talent + luck) = var(talent) + var(luck) + 2 cov(talent, luck)

Since luck is random, it is uncorrelated with talent, and the covariance term is zero. So

var(outcome) = var(talent) + var(luck)
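A small simulation can illustrate the decomposition. This is only a sketch under stated assumptions: 2,000 hypothetical team-seasons, 162 games each, and talent drawn from a normal distribution with mean .500 and SD .060. None of these settings is a measured value; they are chosen for illustration.

```python
import random
import statistics

rng = random.Random(2006)   # fixed seed so the run is repeatable
N_TEAMS = 2000              # hypothetical number of simulated team-seasons
GAMES = 162                 # assumed season length
TALENT_SD = 0.060           # assumed spread of true winning percentage

talent = [rng.gauss(0.5, TALENT_SD) for _ in range(N_TEAMS)]

# Each game is an independent win with probability equal to the team's talent.
outcome = [sum(rng.random() < p for _ in range(GAMES)) / GAMES for p in talent]

# Luck is whatever part of the outcome is not talent.
luck = [o - t for o, t in zip(outcome, talent)]

var_outcome = statistics.pvariance(outcome)
var_talent = statistics.pvariance(talent)
var_luck = statistics.pvariance(luck)

print(f"var(outcome)            = {var_outcome:.6f}")
print(f"var(talent) + var(luck) = {var_talent + var_luck:.6f}")
```

The two printed values differ only by the sample covariance term, which shrinks toward zero as the number of simulated teams grows.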

We can observe var(outcome) from actual W-L records, and we can figure out var(luck) from the binomial distribution. And so we can easily figure out var(talent). Here’s a quote from Tom’s post that describes the method in more detail. He’s figuring out var(talent) for the NFL:

Here is one way to figure out the var(true) for any league.

Step 1 - Take a sufficiently large number of teams (preferably all with the same number of games).

Step 2 - Figure out each team’s winning percentage.

Step 3 - Figure out the standard deviation of that winning percentage. I just did it quick, and I took the last few years in the NFL, and the SD is .19, which makes var(observed) = .19^2

Step 4 - Figure out the random standard deviation. That’s easy: sqrt(.5*.5/16), where 16 is the number of games for each team.

So, var(random) = .125^2

Solve for: var(obs) = var(true) + var(rand)

var(true), in this case, is .143^2

So the standard deviation of talent in the NFL is .143.
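The steps above can be sketched as a short function. This is an illustration, not Tango's own code: the name `talent_sd` and its parameters are hypothetical, and it assumes every team plays the same number of games with a league-average win probability of .5.

```python
import math

def talent_sd(observed_sd, games, p=0.5):
    """SD of true talent implied by var(obs) = var(true) + var(rand)."""
    var_observed = observed_sd ** 2
    var_random = p * (1 - p) / games      # binomial variance of winning percentage
    var_true = var_observed - var_random
    if var_true < 0:
        # Observed spread is smaller than luck alone would produce.
        raise ValueError("observed variance is below the random variance")
    return math.sqrt(var_true)

nfl = talent_sd(0.19, 16)   # figures from the quoted steps
print(f"NFL talent SD: {nfl:.3f}")
```

The guard against a negative result matters for small samples, where the observed spread can fall below what luck alone would give.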

Tango tells us the SD of talent in MLB is about 0.060. If we assume that MLB team talent is normally distributed with mean .500 and SD .060, then 99.5 wins (.614) is 1.90 standard deviations above the mean.

Looking up 1.90 in a cumulative normal distribution table tells us that 2.9% of teams have the talent to win 100 games or more. That’s 1 in 34, or about one team per season.
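The table lookup can be reproduced with the standard library. This sketch assumes the normal talent distribution described above, a 162-game season, and a 30-team league; the last two are assumptions added here, and `upper_tail` is a helper name chosen for illustration.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

MEAN, SD = 0.500, 0.060      # talent distribution assumed in the text
GAMES = 162                  # assumed season length
TEAMS = 30                   # assumed number of MLB teams

threshold = 99.5 / GAMES     # 99.5 wins as a winning percentage
z = (threshold - MEAN) / SD
share = upper_tail(z)

print(f"z = {z:.2f}")
print(f"share of teams with 100-win talent: {share:.3f}")
print(f"expected such teams per season: {share * TEAMS:.2f}")
```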

But in any case, the point of this post is not the specific value, but rather, Tangotiger’s method. It’s simple, it’s easy to calculate, it’s theoretically sound, and it’s extremely useful in all kinds of situations.



At Friday, August 25, 2006 1:34:00 PM, Blogger Tangotiger said...


I said the following:
I took all the teams since 1962, with a minimum of 158 GP (n=1022), and calculated the standard deviation of their actual (observed) winning percentage. That’s .072.

I then figured the random variation, expected, given 161.8 trials (the average number of games per team in my sample). That’s .039.

variance(observed) = variance(true) + variance(random)

...we can calculate the standard deviation as .060.


At Friday, August 25, 2006 4:05:00 PM, Blogger Phil Birnbaum said...

This post originally said that I had results that were different from Tango's. Since it became obvious that Tango's were correct, I have removed that reference -- it's obviously my work that I have to take a second look at.

At Saturday, June 05, 2010 5:09:00 PM, Blogger dan said...

Hi Phil;
Great blog.
I've been using Tangotiger's 'true talent level' formula for the NHL.
Just checking that my calculations are right.

NHL, last 5 years (since the lockout):
150 team-seasons; SD of win % is 0.083, so var(obs) = 0.083^2

SD(luck) is sqrt(.5*.5/82) = 0.055, so var(luck) = 0.055^2

therefore var(obs) 0.083^2 (or .007) - var(luck) 0.055^2 (or .003) = var(true) .063^2 (or .004)

This means that hockey is very close to MLB, which Tango calculated at .06^2, not the .08 he suggested.

The luck factor for the NHL is also close to MLB's. Again using Tango's formula:
.5^2 / (.5^2 + .063^2) = .984
and 1 - .984/2 = 0.508 true win% from one game
So if you watch one NHL game, only 1.6% of the result can be attributed to skill,
and it takes approx. 63 NHL games to establish skill level (not the 36 Tango suggested).

sqrt(.5*.5/63) = .063 (63 is the number of games)
As Tango suggests, this needs to equal the square root of var(true).

does this seem right?

thanks dan

At Saturday, June 05, 2010 6:02:00 PM, Blogger Phil Birnbaum said...

Hey, Dan,

I'm with you up to the .063^2 (although it's .064^2 with less rounding).

After that, I'm not familiar enough with Tango's "games" method to know ... maybe ask Tango?

BTW, are you just ignoring the extra point in overtime? The NHL isn't really binomial with that extra point, is it?

At Saturday, June 05, 2010 11:53:00 PM, Blogger dan said...

Hi Phil;
Thanks for responding.
I found Tango's post on "The Book" blog, so I did use his games method.
As for overtime, I have not found any correlation between overtime/shootout and skill. (This makes sense, since overtime is like a five-minute game, or 1/12 of a normal game. If the skill is approx. 25%, then overtime skill would account for 2%, or barely seen.)
Since I just wanted to see how much of an NHL game involves luck exactly the way it is, I am not concerned with points, just wins, so I used W%.
Even with the crazy loser point there is still only one winner, so it's still binomial, isn't it? If you try to filter out overtime/shootout, then I think your point is valid.
I mean, if MLB decided to flip a coin or have a home-run hitting contest after nine innings instead of continuing the game, it would still be binomial?

I'm just interested in how the game has changed (from an SD of .10 before the lockout) with the salary cap and the rule changes, larger goalie equipment, overtime/shootout, etc. More and more randomness has been introduced, but it is still hard to believe it's 75% luck.
2 possible reasons:
1) Human bias that fails to realize streaks occur even with randomness (I remember a post you linked regarding a prof. and coin flipping along these lines?)
2) As Brian Burke of Advanced NFL Stats points out, the 'more' skilled team wins half the time due to luck as well, and most often we incorrectly see these wins as due to skill and exaggerate the dominance of the better team because the result is expected.


