Sabermetric Research: The large supply of tall people

Monday, May 28, 2007

In this New York Times article from three weeks ago, Dave Berri, co-author of "The Wages of Wins," once again argues that the relatively low level of competitive balance in the NBA is due to the "short supply of tall people":

"The population of [necessarily tall] athletes the N.B.A. draws upon is quite small. … As the evolutionary biologist Stephen Jay Gould observed, when a population is relatively small, the difference between the very best and the average athlete will be quite large. In other words, when your population of athletes is small, your league will have less competitive balance."

I have previously made two points in response to this argument. First, there are other, stronger reasons that the level of competitive balance in the NBA is low. And, second, every sport has specialized requirements for its athletes -- why should height be any different?

It's the second point I want to address in more detail here.

Berri asserts that basketball players need to be tall. Which is true. But, obviously, height isn't enough. They also need to be able to shoot straight. They need to be able to anticipate plays. They need to be able to read a defense. And so on.

Why, then, should height be special? Isn't there equally a "short supply of people who can feed a teammate with a blind pass?" Isn't there a "short supply of people who can make a difficult fadeaway jumper?"

And, as I wrote previously, Wayne Gretzky saw plays unfold in slow motion, which gave him extra time to react. This made him perhaps the greatest playmaker in hockey history. You've got to think that even among all the hockey players in the world, there's a very short supply of people who can read plays anywhere close to the NHL level.

What makes height different from shooting, or playmaking? Almost any skill (or "characteristic," if you don't consider height a skill) is roughly normally distributed across the population. Professional athletes are drawn from the far right tail of that distribution -- the best of the best (or tallest of the tallest). Of course, any sport requires not just one skill, but many. So an NBA player could probably be only "above average" in some areas, so long as he's *way* above average in enough other areas.

But there's a short supply of humans who are in the right tail of any and every normal distribution. Again, why should height be different? I see three ways height is different, and all three of them work *against* Berri's argument.

First: not all basketball players are tall. A quick check of the Detroit Pistons roster shows that four of their players are 6'3" or shorter. It's not really rare to be that tall. I know several people who are around 6-3.

But how many people do I know who can beat out even one Detroit Piston in any basketball-related skill? Probably none. In passing, shooting, or dribbling, any NBA player could beat any of my friends with no effort at all.

What does this mean? That while height is important for a basketball player, it appears to be almost the *least* important of all the skills! If you count all the men in North America who are at least as tall as the fourth-shortest player on an NBA team (6-3), you'll get a pretty high number. If you count all the men in North America who can block shots at least as well as the fourth-worst player on an NBA team, you'll get a pretty low number.

When you look at it this way, there's comparatively a pretty *large* supply of tall people.
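
A rough way to put a number on "pretty large" is a normal-tail calculation. The sketch below assumes adult male height is normally distributed with a mean of 70 inches (5-10) and a standard deviation of 3 inches. Both parameters are assumptions made for illustration, not figures from this post, so treat the output as an order-of-magnitude guide only.

```python
import math


def fraction_at_least(height_in, mean_in, sd_in):
    """Share of a normal population at or above height_in (all in inches)."""
    z = (height_in - mean_in) / sd_in
    # Upper-tail probability of a standard normal, via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2))


# Assumed, not from the post: adult male height ~ Normal(mean 70 in, SD 3 in).
ASSUMED_MEAN_IN = 70.0
ASSUMED_SD_IN = 3.0

share_6_3 = fraction_at_least(75, ASSUMED_MEAN_IN, ASSUMED_SD_IN)  # 6-3 = 75 inches
share_6_7 = fraction_at_least(79, ASSUMED_MEAN_IN, ASSUMED_SD_IN)  # 6-7 = 79 inches

print(f"6-3 or taller: {share_6_3:.2%}")
print(f"6-7 or taller: {share_6_7:.3%}")
```

Under those assumptions, about 5% of men clear 6-3, which is a very large pool next to the number of men who can match an NBA player at any single basketball skill.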

Second: tall men -- really tall men, say 6-7 or such -- may be fairly rare in the general population, but they are easily noticed. So while there might be a short supply of tall people, probably all of them have tried to play basketball! That is, suppose that in the general US population, men who are 6-7 are exactly as rare as men who can hit a fadeaway jumper under pressure with two seconds left on the shot clock. You'd find more of the former in the NBA than the latter. Why? Because almost all tall teenagers attract attention as basketball players, and they all will have had tryouts. Meanwhile, not-so-tall teenagers who might have had basketball skills could wind up not playing at all (especially if they are younger than their peers, and so temporarily smaller), or concentrating on other sports.

Relative to other skills, and considering only the population that considers playing basketball, height is again in larger supply, not shorter.

Third: tallness is evident early, but basketball skill may develop later. According to Wikipedia, even Michael Jordan "was not initially a standout player for the North Carolina Tar Heels." In any sport, history is full of players drafted in the very late rounds who turn out to be superstars, and there are lots of players who unexpectedly develop into elite players even after making the big leagues.

"Tall" is obviously easier to scout for than other skills. How many teenagers would have pursued an NBA career, but didn't know they would eventually be able to pass and shoot and read a defense extremely well? Probably lots. How many teenagers would have pursued an NBA career, but didn't know they would eventually be extremely tall? Probably very few.

Again, the height requirement means that more of the best players in the world will make the NBA -- not fewer.

In summary:

1. Height is not as rare in the general population as other basketball attributes, which suggests that it's less important.

2. People with height are almost 100% likely to consider and pursue basketball, unlike people with other basketball skills.

3. Height is evident early in a player's life, unlike other basketball skills.

The fact that height is more important in the NBA than in other sports results in an increase in the basketball-suited population that actually pursues the sport. The market for professional basketball players is much more efficient at recognizing productivity when it appears in the guise of height than when it appears in other ways.

And so I don't think it's correct that NBA play quality -- and therefore competitive balance -- is reduced by a shortage of tall players. If anything, height is *more* abundant than other skills.

The argument should go exactly the other way. The fact that height is important in the NBA results in *more* competitive balance compared to other sports, not less.

Labels: basketball, competitive balance, NBA, The Wages of Wins

28 Comments:

At Tuesday, May 29, 2007 12:42:00 PM, Tangotiger said...: I've said this many times: the reason for the competitive disbalance is tied directly to the length of the match.

If tennis ended after 1 set, not best 3-of-5, Federer wouldn't have won as much.

If baseball was a 3-inning game, you wouldn't have a team winning 116 games.

If basketball could be played in a 12-minute game, you wouldn't have such disbalance.

Physical exhaustion aside, if basketball could be played in 96 minutes instead of 48 (or similarly, play two 48 minute games, where the scores are added up), the disbalance would be even greater.
At Tuesday, May 29, 2007 12:57:00 PM, Phil Birnbaum said...: Tango,

I agree of course, and I've also said that many times. But a possible rebuttal to that might be, "yes, the number of confrontations is the main factor, but the supply of tall people contributes a little bit too."

I'm addressing the "yes, but" portion of the rebuttal here. The "tall people" effect, I'd bet, is very very tiny (no pun intended), but it goes the other way than TWOW thinks it does.
At Tuesday, May 29, 2007 5:33:00 PM, Unknown said...: I can't believe that people are still debating this issue!! The short supply of tall people argument is nonsensical.

Soccer is another example where height is an apparent advantage (to head the ball to goal) but few tall people are successful because other skills are more important.

Height plays more of a role in basketball, for sure, but as you rightly point out there are a bunch of other necessary skills to forge a successful career.

Another great post, Phil.

Beamer
At Tuesday, May 29, 2007 5:39:00 PM, Unknown said...: On a separate note, I hear WoW II is coming out ... more fodder for all.
At Tuesday, May 29, 2007 9:40:00 PM, Anonymous said...: First, I should note that I recently stumbled across this blog (and in the past have skimmed a few issues of "By the Numbers") and am amazed at how good the analyses are.

Second, I think Dave Berri has addressed the "structure of the game" argument. In the comments section to this post-- http://dberri.wordpress.com/2007/05/22/some-nice-things-i%e2%80%99ve-missed-%e2%80%93-nba-playoff-edition -- Dave Berri wrote:

"(...) let me quickly note again the problem with the “structure of the game” arguments for competitive balance. Competitive balance in baseball improved dramatically in the 20th century. If structure of the game drives competitive balance, then how could balance change in baseball (when the game hasn’t changed)? I would add, we see the same pattern in football and hockey. Balance seems to improve over time. This is consistent with the population hypothesis. We see less improvement in basketball, primarily because the populations drawn upon in basketball remain extremely low."

Also, I'm just making this point hypothetically, but if every sport had comparable skill constraints (i.e. requiring someone to be the same number of standard deviations out on the right side of the population distribution) and basketball additionally required an extra dimension (i.e. a height constraint), then this condition would be inconsistent with your post. My point is somewhat gratuitous because I have no evidence supporting it, but I thought I'd explicitly bring it up anyway.

--Okapi
At Tuesday, May 29, 2007 10:37:00 PM, Phil Birnbaum said...: Hi, Okapi,

1. Thanks for the kind words!

2. Thanks for the link. I had seen Berri's post, but not the comments.

Berri's argument is a bit of a straw man. Nobody is saying competitive balance is *solely* a function of the structure of the game -- only that structure is a major component.

Berri may be correct that balance has increased *within* sports over time because of an increase in the population of players. But there is another major factor acting *between* sports, and that's the structure of the game.

Berri also says that balance in basketball has not increased over time as much as in other sports. Whatever the reason for this, I stick by my argument here that it's not anything to do with tall people.

3. I'm not sure it's the case that the more skills a sport requires, the less competitive balance it has.

A linear combination of normal variables is still normal, isn't it? If you rated all players on a million skills, in the exact combination in which they contributed to victory, they would still follow a normal distribution, and you could still choose from the far right end.

So, for instance, there is no theoretical reason that decathlon should be less balanced than triathlon, just because it needs ten skills instead of three.
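
The claim about linear combinations can be checked with a quick simulation. The sketch below assumes the individual skills are independent and normal (a weighted sum of normals is guaranteed normal only when they are independent or jointly normal), and the ten weights are hypothetical, chosen just to stand in for "skills in the combination in which they contribute to victory."

```python
import math
import random
import statistics


def composite_ratings(weights, n_players, rng):
    """One overall rating per player: a weighted sum of independent N(0, 1) skills."""
    return [sum(w * rng.gauss(0.0, 1.0) for w in weights) for _ in range(n_players)]


rng = random.Random(2007)
# Hypothetical weights for ten skills; any other weights would work the same way.
weights = [0.5, 0.3, 0.2, 0.1, 0.1, 0.05, 0.05, 0.05, 0.02, 0.01]
ratings = composite_ratings(weights, 50_000, rng)

# Theory: the weighted sum is normal with variance equal to the sum of squared weights.
theory_sd = math.sqrt(sum(w * w for w in weights))
sample_sd = statistics.pstdev(ratings)
# A normal variable falls within one SD of its mean about 68% of the time.
within_one_sd = sum(abs(r) <= theory_sd for r in ratings) / len(ratings)

print(f"theoretical SD {theory_sd:.4f}, simulated SD {sample_sd:.4f}")
print(f"share within one SD: {within_one_sd:.3f}")
```

If the composite behaves like a normal variable, the simulated SD should sit close to the theoretical one and the share within one SD should be near 0.68, whether the rating is built from three skills or ten.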

But while there is no *theoretical* reason, there might be a practical one. If a sport requires few particular skills, it might attract more casual players. That would lead to a greater population that practices, which means a larger supply of players. On the other hand -- decathlon, say -- who wants to train in ten different sports? The number of people willing to devote years to decathlon is probably pretty small.

I don't think this applies to basketball, though -- the appeal is not that it requires few skills, but just that it's fun.

While I'm here, a counterexample: Home Run Derby requires only one skill, while real baseball requires many. But HRD would be dominated by a few sluggers, while baseball sees lots of competitive balance, even in individual accomplishments.
At Wednesday, May 30, 2007 2:31:00 AM, Jason Lisk said...: "Competitive balance in baseball improved dramatically in the 20th century. If structure of the game drives competitive balance, then how could balance change in baseball (when the game hasn’t changed)?"

I don't think this holds up. The "game" has changed, at least in how rosters are constructed and how the champion is determined. In 1969, MLB went to divisions following expansion, and introduced an extra playoff round. The Curt Flood challenge to the reserve clause was also around this time. In 1995, yet another playoff round was added.

Based on his argument in the article, I assume he measures competitive balance by the number of teams winning a championship over a given period. Berri's argument that racial integration increased the available player pool and thus improved competitive balance is dubious. You would expect that the balance would have been improved for the period from 1947-1968 if that were true. The top two teams of that period were even more dominant than previous eras.

Here are the top 2 pennant winners for similar periods of time:

1903-1924: Giants (10), Red Sox (6)
1925-1946: Yankees (11), Cards (9)
1947-1968: Yankees (15), Dodgers (10)
1969-1990: Athletics (6), Orioles and Reds (5)
1991-2006: Yankees (6), Braves (5)

The Yankees actually won 14 of 16 pennants between 1949 and 1964, their most dominating stretch, in the time period that Berri claims the influx of new players improved competitive balance. If his claim were true, I would think the evidence would appear before the league began expanding in the 1960s (thus diluting the increased player pool) and before the playoff structure changed.

And to bring this back to the basketball topic, if he is relying on this unconvincing argument to support his argument that the same happens with a decreased player pool in the NBA, I am not convinced.
At Wednesday, May 30, 2007 6:28:00 AM, Anonymous said...: Jason: You have to separate the question of the disparity in player talent from team competitiveness. Berri makes the mistake of assuming they are the same thing (or have a constant relationship). He is right that as the talent pool increases, variation in player talent should shrink (and overall talent rises). The variance in player performance HAS tended to shrink over time, though the trend has slowed considerably in recent decades.

However, team competitiveness depends on how teams are constructed, and the structure of the game. So you can't use the SD of team win % to tell you whether talent differences at the player level are shrinking.

* *

The core problem with the Tall People thesis is that there is no "height requirement" in the NBA. There is only a requirement that you can score and/or prevent scoring. Tall players tend to do that better. So, as Phil suggests, we should treat height as a skill, not an arbitrary eligibility requirement that shrinks the talent pool. In 1950, the average NBA player was 6-4, 198. Today he is 6-7, 225. Can anyone seriously doubt that today's players are better? (And in fact, that change proves that the talent pool has expanded considerably.)

What is true is that when you demand one skill -- height -- then players' OTHER skills will tend to be weaker. For example, we would expect 7 ft players to have a lower 3-pt percentage than average, just as we expect SSs to be weaker hitters than 1Bmen (because we've demanded better defensive skills). But that doesn't mean the players' TOTAL skill set, including height-related advantages, is lower.

For example, without Shaq the NBA would have had less variance in FT%. But did he lower the overall skill level of the NBA? Obviously not.
At Wednesday, May 30, 2007 8:12:00 AM, Phil Birnbaum said...: Jason: I think Berri's measure of competitive balance is a function of regular season standings, not playoff performance. If memory serves, he divides the standard deviation of winning percentage by the expected standard deviation you'd get if all teams had .500 talent.
At Wednesday, May 30, 2007 8:24:00 AM, Phil Birnbaum said...: Guy wrote, "However, team competitiveness depends on how teams are constructed, and the structure of the game. So you can't use the SD of team win % to tell you whether talent differences at the player level are shrinking."

Agreed. As Guy wrote in a comment to Berri's post (see Okapi's comment above for the link), there are at least three factors to competitive balance:

1. The structure of the game;
2. The structure of how players are arranged on the teams;
3. The standard deviation of player skill.

It seems to me that (1) is strongest, (2) is strong, and (3) is minor. I'm not sure why Berri is concentrating on the least important factor, and ignoring the other two.
At Wednesday, May 30, 2007 8:32:00 AM, Anonymous said...: Re: Phil#9: WoW relies on the Noll-Scully ratio of observed SD to ideal SD to measure competitive balance in each sport. This is apparently a convention in sports economics (they borrowed it from another researcher). However, the ratio of the two SDs will not tell you the true variance in team strength within a league, which is presumably what competitive balance means. And if you measure it correctly, it turns out that football, not basketball, has the largest "imbalance." I wrote this at WoW:

"We know the relationship between observed SD, true strength, and random error (also the “ideal” SD), which is SD(observed)^2 = SD(true)^2 + SD(error)^2. So, SD(true) = sqrt (SD(obs)^2 - SD(error)^2), for which the Noll-Scully ratio of SD(obs)/SD(error) will not be a good approximation.
The ratio will tend to understate the real variance in leagues with a small number of games. For example, the “ideal” SD is .039 for MLB, vs .125 for NFL. So, to match baseball’s 2.10 Noll-Scully ratio, football would need an observed SD in win% of .263 (huge). But in that scenario, the NFL true strength variance would be more than three times greater than MLB’s (.231 v .072), despite identical Noll-Scullys.

So the Noll-Scully ratio is not a good proxy for true strength differences among teams. However, we can use the ratios Dave reports to reverse engineer the observed SD, and then calculate the true strength differences in leagues:
(Noll-Scully, True strength SD)
MLS: 1.38, .071
NFL: 1.56, .150
MLB: 2.10, .072
NBA: 2.54, .129
So, soccer and baseball actually have almost identical levels of competitive balance. The NBA and NFL have much larger spreads of talent, but interestingly, it is the NFL (.150), not the NBA (.129), that has had the largest team strength variation. (Although, in the 1990s the NBA was slightly larger). And in fact, we can observe that great NFL teams will often be over .800, while great baseball teams are just .650 — the NFL is not highly competitive in this sense.
And since the NFL is the most unbalanced sport, I suppose we’ll soon be hearing about “The Small Supply of Big People”!"
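
The reverse-engineering step in the quoted comment can be sketched in a few lines. This assumes the "ideal" SD is the binomial value 0.5 / sqrt(games), which reproduces the .039 and .125 quoted above. The schedule lengths (16, 162 and 82 games) are supplied here as assumptions, and MLS is left out because ties and its schedule length complicate the error term.

```python
import math


def ideal_sd(games):
    """SD of winning percentage if every team were a true .500 team (binomial luck only)."""
    return 0.5 / math.sqrt(games)


def true_sd_from_noll_scully(ns, games):
    """Recover SD(true) from a Noll-Scully ratio, using var(obs) = var(true) + var(error)."""
    sd_error = ideal_sd(games)
    sd_obs = ns * sd_error  # Noll-Scully is SD(obs) / SD(error)
    return math.sqrt(sd_obs ** 2 - sd_error ** 2)


# (Noll-Scully ratio from the comment, assumed games per team per season)
leagues = {"NFL": (1.56, 16), "MLB": (2.10, 162), "NBA": (2.54, 82)}

for name, (ns, games) in leagues.items():
    print(f"{name}: ideal SD {ideal_sd(games):.3f}, "
          f"true SD {true_sd_from_noll_scully(ns, games):.3f}")
```

The results agree with the true-strength figures listed above to within rounding, and the function is algebraically the same as SD(error)*SQRT(NS^2-1).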
At Wednesday, May 30, 2007 8:46:00 AM, Phil Birnbaum said...: Guy,

A most excellent point. Why use Noll-Scully to estimate the SD when you can figure the SD itself?

The var(observed) = var(talent) + var(luck) equation is a very useful tool.
At Wednesday, May 30, 2007 9:31:00 AM, Anonymous said...: It occurs to me that once we measure true strength and luck/error, we can use those to determine "standings integrity," i.e. how well league standings reflect real differences in team strength. (I say 'strength' rather than 'talent', just to avoid confusion with individual player talent.) I can see two metrics: a) true strength variance as % of total variance, and b) signal:noise ratio. For the major sports, we'd get:
SD(true)^2/SD(obs)^2
NBA .845
MLB .773
NFL .589
MLS .390

Signal/Noise -- SD(true)/SD(error)
NBA 2.34
MLB 1.85
NFL 1.20
MLS 0.80

Same ranking, but different magnitudes. Anyone here think one of these (or another metric) is best?

If I've done this right, MLS is a real crapshoot. But I don't know from soccer -- does that sound right?

The NFL standings clearly reflect a lot of luck, but on the other hand it's amazing what just 16 games can do if talent differences are large. If MLB used a 16-game schedule, just 25% of the variance would be real strength differences, and signal:noise would be just .58! And the NFL gives itself an out by letting so many teams make the playoffs.
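
Both "standings integrity" metrics follow directly from SD(true) and SD(error). A minimal sketch, using the rounded SDs quoted in this thread as inputs (the function name is made up for illustration):

```python
def standings_integrity(sd_true, sd_error):
    """Return (share of observed variance that is true strength, signal-to-noise ratio)."""
    var_true = sd_true ** 2
    var_error = sd_error ** 2
    share = var_true / (var_true + var_error)  # SD(true)^2 / SD(obs)^2
    snr = sd_true / sd_error                   # SD(true) / SD(error)
    return share, snr


# MLB as played: true SD .072, luck SD .039 (162 games).
share162, snr162 = standings_integrity(0.072, 0.039)
# Hypothetical 16-game MLB season: same talent spread, but the NFL's luck SD of .125.
share16, snr16 = standings_integrity(0.072, 0.125)

print(f"MLB, 162 games: share {share162:.3f}, signal/noise {snr162:.2f}")
print(f"MLB,  16 games: share {share16:.3f}, signal/noise {snr16:.2f}")
```

Holding the talent spread fixed and changing only the schedule length shows how much of the ranking is driven by the number of games rather than by the players.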
At Wednesday, May 30, 2007 5:06:00 PM, Anonymous said...: About standings integrity, here's another idea for a metric: given two randomly selected teams, the probability that the one with the better observed record is actually the better team in true strength.

There is a simple formula for this in terms of the signal/noise ratio.

P = 1/2 + taninv(SD(true)/SD(error))/pi

The SD(true)/SD(error) will be a positive number from 0 to infinity, so its taninv will be between 0 and pi/2. That means the formula gives a number between 0.5 and 1 as it should.

Using Guy's numbers, we get
NBA .871
MLB .842
NFL .779
MLS .715

It's more compressed than the other metrics. I like it because it is easy to interpret: in the NBA, the team with the better record is actually the better team 87% of the time.
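
The arctangent formula is easy to sanity-check by simulation. The sketch below assumes, as the formula does, that true strength and luck are independent and normal. It draws pairs of teams and counts how often the team with the better observed record is also the better team. The function names are invented for this example.

```python
import math
import random


def prob_better_record_is_better_team(snr):
    """The formula from the comment: P = 1/2 + atan(SD(true) / SD(error)) / pi."""
    return 0.5 + math.atan(snr) / math.pi


def simulate(snr, n_pairs, rng):
    """Monte Carlo estimate of the same probability, with the luck SD scaled to 1."""
    agree = 0
    for _ in range(n_pairs):
        true_a, true_b = rng.gauss(0.0, snr), rng.gauss(0.0, snr)
        obs_a = true_a + rng.gauss(0.0, 1.0)
        obs_b = true_b + rng.gauss(0.0, 1.0)
        # Count the pair when the observed ordering matches the true ordering.
        if (true_a > true_b) == (obs_a > obs_b):
            agree += 1
    return agree / n_pairs


rng = random.Random(2007)
for league, snr in [("NBA", 2.34), ("MLB", 1.85), ("NFL", 1.20), ("MLS", 0.80)]:
    formula = prob_better_record_is_better_team(snr)
    estimate = simulate(snr, 50_000, rng)
    print(f"{league}: formula {formula:.3f}, simulated {estimate:.3f}")
```

The formula column reproduces the four figures above, and the simulated column should land close to it. The simulation does not test the normality assumption itself, which is the part most open to doubt for a 16-game season.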
At Wednesday, May 30, 2007 5:24:00 PM, Phil Birnbaum said...: That's pretty cool! I like it when you unexpectedly wind up with a formula with arc tan and pi, when what you're studying seems to have no relation to such things.

Is there an explanation you can link to as to why it works?
At Wednesday, May 30, 2007 7:48:00 PM, Anonymous said...: I'll write something up. I was surprised too that it came out so nicely.
At Wednesday, May 30, 2007 9:53:00 PM, Anonymous said...: A variation on that would be the correlation between observed records and true strength. Tango speculates on his blog that we can get that from SQRT(1-Var(error)/var(observed)), which is also SQRT of my % of variance formula above. That would give us:
NBA .92
MLB .88
NFL .77
MLS .62
Anyone know if that formula is right? It seems a little more intuitive to me than likelihood team A is better than B, but that's totally subjective.

* *

BTW, the formula for converting a Noll-Scully ratio (NS) into the correct SD(true) is: SD(error)*SQRT(NS^2-1).
At Wednesday, May 30, 2007 11:58:00 PM, Anonymous said...: I put up a derivation of my formula here (PDF file). It's an interesting trick the way the formula falls out.

Guy, I think I agree with you that the correlation formula is a bit more intuitive. (No idea whether Tango's formula is correct, but it seems very plausible.) One thing that occurs to me from looking at the MLS numbers is that even if the correlation is weak, the standings will still reflect the strengths of the teams relatively well.

All these metrics can be expressed in terms of each other, so the ranking will always remain the same.

--

I am not sure about the validity of these methods for the NFL. The calculation of SD(error) is an approximation that's very good when the odds of most games are within 65/35 or so, but that might fail for the NFL (and, to a lesser extent, the NBA).

Also for my own formula, I assume that the distributions of team skill and error are normal, which may not work with as few as 16 games. Besides, it's not uncommon for two teams to end up with exactly the same record, which is a warning that the continuous approximations are creating distortion. This doesn't apply to any of the other formulas, though.

For MLS, are ties considered as half a win and half a loss? I imagine that this would also throw off the SD(error) calculation.
At Thursday, May 31, 2007 2:40:00 AM, Unknown said...: Tango's formula for calculating r is correct. Actually it is a metric called reliability ... which can be proved to be similar.

1 - var(err)/var(obs)

= [var(obs)-var(err)]/var(obs)

= var(true)/var(obs)

Remember that:

r = cov(x,y)/(std(x)*std(y))

cov(x,y) ~= var(true)
std(x)std(y) ~= var(obs)
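
One way to spell out the steps, as a sketch that assumes obs = true + err with the error uncorrelated with true strength (the notation is mine, not the commenters'). The first two lines give the correlation between observed records and true strength, which is the square-root form. The last line is the reliability: the correlation between two independent seasons x and y of the same teams, which is what the cov(x,y) and std(x)std(y) approximations above describe.

```latex
% Assumption: obs = true + err, with cov(true, err) = 0.
\begin{align*}
\operatorname{cov}(\mathrm{obs},\mathrm{true})
  &= \operatorname{var}(\mathrm{true}) \\
r(\mathrm{obs},\mathrm{true})
  &= \frac{\operatorname{var}(\mathrm{true})}{\operatorname{SD}(\mathrm{obs})\,\operatorname{SD}(\mathrm{true})}
   = \frac{\operatorname{SD}(\mathrm{true})}{\operatorname{SD}(\mathrm{obs})}
   = \sqrt{1 - \frac{\operatorname{var}(\mathrm{err})}{\operatorname{var}(\mathrm{obs})}} \\
r(x, y)
  &= \frac{\operatorname{var}(\mathrm{true})}{\operatorname{var}(\mathrm{obs})}
  \qquad \text{for two independent seasons } x,\ y \text{ with equal error variance}
\end{align*}
```

So the reliability is the square of the observed-versus-true correlation, which is why the two quantities are so closely related.
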
At Thursday, May 31, 2007 7:58:00 AM, Phil Birnbaum said...: Thanks, dcj, great stuff. And John, thanks for that too.
At Thursday, May 31, 2007 8:02:00 AM, Anonymous said...: Thanks, John.
At Tuesday, June 05, 2007 12:40:00 PM, Brian Burke said...: The structure of the game of basketball is such that rare outlier players, like Shaquille O'Neal or Michael Jordan, can dominate play. One player can account for up to 1/5 of his team's minutes on the floor.

Someone like Barry Bonds or Ray Lewis is on the field a lot less. A baseball player gets only about 1/9 of his team's at-bats. Football players share the field even more. Ray Lewis can account for at most 1/22 of his team's "man-minutes."
At Thursday, June 07, 2007 10:01:00 AM, birtelcom said...: Brian Burke's comment seems the most persuasive to me -- to the extent basketball exhibits less competitive balance than other sports it seems likely to be primarily the result, not of anything to do with height, but rather of the fact that a relatively small number of individuals on a team contribute a relatively high proportion of a team's total performance over a season. A baseball starting pitcher contributes more to a single game's outcome than any other single player in any major sport, but only participates in one of every five games or so. If a starting pitcher could start every game for a team, competitive balance in baseball would be much lower (and indeed was when one or two starting pitchers pitched most of a team's innings).

All that said, the discussion in Phil's original blog post describing height as a "talent" like other talents seems to miss something important. Talents or skills such as the ability to shoot a basket, read a defense, hit a curveball, etc. can perhaps be distinguished from an attribute such as height by characterizing such talents or skills as "cultivatable talents". Training from an early age, repeated practice, high quality coaching, etc. can improve to a significant degree the performance level of the skills required for professional sports. Of course some minimum inherent level of ability is necessary -- many dedicated and well-trained players still fall short without the necessary underlying talent. But the thing about shooting a basket or hitting a curveball is that, if we put performance levels of these cultivatable talents on a scale of 1 to 10 (5 being professional performance level), a player with in-born skills at say a 2 might cultivate that skill with maximum effort and training enough to achieve, say, a 5 level. This means that the full population of all those with in-born skills at 2 or better can achieve professional level. Height doesn't work that way. If you are born 5 foot 10 (which I believe is about the average height of an American male) you have a tremendous disadvantage in seeking to play professional basketball, a disadvantage that itself cannot be cultivated away. Yes, you can cultivate your shooting skills such that the height is less of a disadvantage, but the height disadvantage itself is completely and totally unamenable to being cultivated away. The population of people who are viable candidates to be trained into professional status is substantially smaller for basketball than baseball because the height element is not a cultivatable talent.
At Thursday, June 07, 2007 10:50:00 AM, Phil Birnbaum said...: I agree completely with Brian Burke's comment that individual players are more important in basketball than other sports, for the reasons he states. I think that's one part of the reason that competitive balance is lower in basketball than other sports.
At Thursday, June 07, 2007 11:01:00 AM, Phil Birnbaum said...: birtelcom, I disagree on a couple of points.

First, I agree with you that "you can't teach height". And I agree with you that if you're 5-11, you are very unlikely to wind up in the NBA, regardless how much other talent you have.

However, I don't think that makes a difference. I think there are other things, like *aptitude* for basketball, that are just as determinative as height. If I were 6-5 instead of 5-8, I still would never be an NBA player, because I don't have the co-ordination or ability to read the play.

That is, there are other attributes than height that *you can't teach*. Whatever those attributes are, there is also a short supply of them. It's just that height is visible while aptitude is not.

You argue that a "2" can become a "5" with training. That's true if the 2 is a 2 in specific basketball skills, but the guy is already an 8 or a 9 in *aptitude*. There are other things that can't be taught.

And those guys, some of them go unnoticed, because they're short, and so they never have incentive to develop those basketball skills from a 2 to a 5.

Which comes back to my original argument: of all the aptitudes, height is the only one that's visible, and therefore *not wasted*.

What that means is that the importance of height makes it *more likely* that the best players make the NBA. The importance of height should *increase* competitive balance.
At Thursday, June 07, 2007 11:06:00 AM, Phil Birnbaum said...: One other thing I should say is that competitive balance doesn't depend on how good the average player is -- it depends on how good the players are *relative to each other*. If every NBA player practiced twice as much in high school, competitive balance probably wouldn't change much.

That's why the fact that height isn't cultivatable, but shooting is, doesn't really matter much.

Where it matters, it matters the other way. Some really good players are overlooked because the scouts don't realize how good they are. This lowers competitive balance because worse players replace them. But NO players are overlooked because the scouts don't realize how tall they are.

And that's why the importance of height improves competitive balance -- by making important something that can very, very easily be measured.
At Sunday, June 17, 2007 4:19:00 AM, Bob Timmermann said...: I picked you for the job, not because I think you're so darn smart, but because I thought you were a shade less dumb than the rest of the outfit. Guess I was wrong. You're not smarter, Walter... you're just a little taller.
