Wednesday, February 29, 2012

Are early NFL draft picks no better than late draft picks? Part II

This is about the Dave Berri/Rob Simmons paper that concludes that QBs who are high draft choices aren't much better than QBs who are low draft choices. You probably want to read Part I, if you haven't already.


If you want to look for a connection between draft choice and performance, wouldn't you just run a regression to predict performance from draft choice? The Berri/Simmons paper doesn't. The closest they come is their last analysis of many, the one that starts at the end of page 47.

Here's what the authors do. First, they calculate an "expected" draft position, based on a QB's college stats, "combine stats" (height, body mass index, 40 yard dash time, Wonderlic score), and whether he went to a Division I-A school. That's based on another regression earlier in the paper. I'm not sure why they use that estimate -- it seems like it would make more sense to use the real draft position, since they actually have it, instead of their weaker (r-squared = 0.2) estimate.

In any case, they use that expected draft position, and run a regression to predict performance for every NFL season in which a QB had at least 100 plays. They also include terms for experience (a quadratic, to get a curve that rises, then falls).

It turns out, in that regression, that the coefficient for draft position is not statistically significant.

And, so, Berri and Simmons conclude,

"Draft pick is not a significant predictor of NFL performance. ... Quarterbacks taken higher do not appear to perform any better."

I disagree. Two reasons.

1. Significance

As the authors point out, the coefficient for draft position wasn't close to significant -- it was only 0.52 SD from zero.

But, it's of a reasonable size, and it goes in the right direction. If it turns out to be non-significant, isn't that just that the authors didn't use enough data?

Suppose someone tells me that candy bars cost $1 at the local Kwik-E-Mart. I don't believe him. I hang out at the store for a couple of hours, and, for every sale, I mark down the number of candy bars bought, and the total sale.

I do a regression. The coefficient comes out to $0.856 more per bar, but it's only 0.5 SD.

"Ha!" I tell my friend. "Look, that's not significantly different from zero! Therefore, you're wrong! Candy bars are free!"

That would be silly, wouldn't it? But that's what Berri and Simmons are doing.

Imagine two quarterbacks with five years' NFL experience. One was drafted 50th. The other was drafted 150th. How much different would you expect them to be in QB rating? If you don't know QB rating, think about it in terms of rankings. How much higher on the list would you expect the 50th choice to be, compared to the 150th? Remember, they both have 5 years' experience and they both had at least 100 plays that year.

Well, the coefficient would say the early one should be 1.9 points better. I calculate that to be about 15 percent of the standard deviation for full-time quarterbacks. It'll move you up in the rankings two or three positions.

Is that about what you thought? It's around what I would have thought. Actually, to be honest, maybe a bit lower. But well within the bounds of conventional wisdom.

So, if you do a study to disprove conventional wisdom, and your point estimate is actually close to conventional wisdom ... how can you say you've disproven it?

That's especially true because the confidence interval is so wide. If we add 2 SD to the point estimate, we find that the effect of draft choice could be as high as -0.091. That means that 100 draft positions is worth 9.1 points. That's a huge difference between quarterbacks. Nine points would move you up at least 6 or 7 positions -- just because you were drafted earlier. It's almost 75 percent of a standard deviation.

Basically, the confidence interval is so wide that it includes any plausible value ... and many implausible values too!
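The interval arithmetic can be sketched in a few lines. This is only a rough check, and the per-pick coefficient of -0.019 is an assumption inferred from the "1.9 points per 100 draft positions" figure above, not a number read from the paper's table. With that input, the lower bound lands at about -0.092, within rounding of the -0.091 quoted above.

```python
# Rough check of the confidence interval described in the post.
# ASSUMPTION: the per-pick coefficient is inferred from "1.9 points per
# 100 draft positions"; it is not copied from the paper's regression table.
coef_per_pick = -0.019   # QB rating points per draft slot (inferred)
t_stat = 0.52            # the coefficient is 0.52 standard errors from zero

std_err = abs(coef_per_pick) / t_stat
low = coef_per_pick - 2 * std_err    # strongest draft effect inside the interval
high = coef_per_pick + 2 * std_err   # weakest effect (here it has the wrong sign)

print(f"standard error per pick: {std_err:.4f}")
print(f"rough interval per pick: [{low:.3f}, {high:.3f}]")
print(f"over 100 picks: {100 * low:.1f} to {100 * high:.1f} rating points")
```

The interval runs from a large advantage for early picks to a moderate advantage for late picks, which is why the regression can't settle the question either way.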

That regression doesn't disprove anything at all. It's a clear case of "absence of evidence is not evidence of absence."

2. Attrition

In Part I, I promised an argument that doesn't require the assumption that QBs who never play are worse than QBs who do. However, we can all agree, can't we, that if a QB plays, but then he doesn't play any more because it's obvious he's not good enough ... in *that* case, we can say he's worse than the others, right? I can't see Berri and Simmons claiming that Ryan Leaf would have been a star if only his coaches gave him more playing time.

If we agree on that, then I can show you that the regression doesn't work -- that the coefficient for draft choice doesn't accurately measure the differences.

Why not? Again, because of attrition. The worse players tend to drop out of the NFL earlier. That means they'll be underweighted in the regression (which has one row for each season). So, if those worse players tend to be later draft choices, as you'd expect, the regression would underestimate how bad those later choices are.

Here, let me give you a simple example.

Suppose you rate QBs from 1 to 5. And suppose the rating also happens to be the number of seasons the QB plays.

Let's say the first round gives you QBs of talent 5, 4, 4, and 3, which is an average of 4. The second round gives you 4, 2, 1 and 1, which averages 2.

Therefore, what we want is for the regression to give us a coefficient of 4 minus 2, which is 2. That would confirm that the first round is 2 better than the second round.

But it won't. Why not? Because of attrition.

-- The first year, everyone's playing. So, those years do in fact give us a difference of 2.

-- The second year, the two "1" guys are gone. The second round survivors are the "4" and "2", so their average is now 3. That means the difference between the rounds has dropped down to 1.

-- The third year, the "2" guy is gone, leaving the second round with only a "4". Both rounds now average 4, so they look equal!

-- The fourth year, the "3" drops out of the first round pool, so the difference becomes 0.33 in favor of the first round.

-- The fifth year ... there's nothing. Even though the first round still has a guy playing, the second round doesn't, so the results aren't affected.

So, see what happens? Only the first year difference is correct and unbiased. Then, because of attrition, the observed difference starts dropping.

Because of that, if you actually do the regression, you'll find that the coefficient comes out to 1.15, instead of 2.00. It's understated by almost half!
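The year-by-year gaps in that example can be reproduced with a short sketch. The helper names are hypothetical, made up for the illustration. The last step pools every QB-season into one average per round. That is a simpler calculation than the regression with experience terms, so it isn't expected to return exactly 1.15, but it is understated in the same direction.

```python
# Toy data from the post: each QB's rating is also the number of seasons he plays.
first_round = [5, 4, 4, 3]
second_round = [4, 2, 1, 1]

def mean(values):
    return sum(values) / len(values)

def survivors(ratings, season):
    """Ratings of the QBs still playing in the given season (1-based)."""
    return [r for r in ratings if r >= season]

def season_rows(ratings):
    """One row per QB-season, which is how a season-level regression sees the data."""
    return [r for r in ratings for _ in range(r)]

# The gap we would like to recover: average talent by round, one vote per QB.
true_gap = mean(first_round) - mean(second_round)

# The gap actually observed in each season, among QBs who are still playing.
yearly_gaps = {}
for season in range(1, 6):
    a = survivors(first_round, season)
    b = survivors(second_round, season)
    if a and b:  # a season is informative only if both rounds still have a QB
        yearly_gaps[season] = mean(a) - mean(b)

# ASSUMPTION: a plain pooled comparison of season rows, with no experience terms.
pooled_gap = mean(season_rows(first_round)) - mean(season_rows(second_round))

print("true gap:", true_gap)
for season, gap in yearly_gaps.items():
    print(f"season {season}: observed gap {gap:.2f}")
print(f"pooled gap over all QB-seasons: {pooled_gap:.3f}")
```

Only the first season shows the full gap of 2. Every later season, and the pooled figure, comes in lower because the weakest second-round QBs have already dropped out of the data.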

This will almost always happen. Try it with some other numbers and assumptions if you like, but I think you'll find that the result will almost never be right. The exact error depends on the distribution and attrition rate.

Want a more extreme case? Suppose the first round is four 4s (average 4), and the second round is a 7 and three 1s (average 2.5). The first round "wins" the first year, but then the "1"s disappear, and the second round starts "winning" by a score of 7-4.

In truth, the first round players are 1.5 better than the second round (4 versus 2.5). But if you do the Berri/Simmons regression, the coefficient comes out negative, saying that the first round is actually 0.861 *worse*!
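The sign flip shows up even without running a regression. Here is a minimal sketch, taking the ratings exactly as stated (four 4s against a 7 and three 1s) and comparing a per-player average with a per-season average. The per-season average is an assumption standing in for the regression: it leaves out the experience terms, so it doesn't reproduce the 0.861 figure, only the direction of the error.

```python
# Extreme toy case from the post: rating equals number of seasons played.
first_round = [4, 4, 4, 4]
second_round = [7, 1, 1, 1]

def mean(values):
    return sum(values) / len(values)

def season_rows(ratings):
    """Repeat each QB's rating once per season he plays."""
    return [r for r in ratings for _ in range(r)]

# One vote per QB: what the draft actually delivered to each round.
per_player_gap = mean(first_round) - mean(second_round)

# One vote per QB-season: the "7" is counted seven times, each "1" only once.
# ASSUMPTION: plain pooled averages, with no experience terms.
per_season_gap = mean(season_rows(first_round)) - mean(season_rows(second_round))

print(f"per-player gap: {per_player_gap:+.2f}")
print(f"per-season gap: {per_season_gap:+.2f}")
```

Counting by player, the first round comes out ahead. Counting by season, the one long-lived "7" swamps the three "1"s and the second round looks better.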

So, basically, this regression doesn't really measure what we're trying to measure. The number that comes out isn't very meaningful.


Choose whichever of these two arguments you like ... or both.

I'll revisit some of the paper's other analyses in a future post, if anyone's still interested.


UPDATE: Part III is here.



At Wednesday, February 29, 2012 12:05:00 PM, Anonymous Alex said...

Your significance argument assumes that with more data, any result currently in the correct direction would become significant. This is false.

The second argument assumes the result (as has been pointed out in other discussions about this issue). If you believe that worse players are drafted in later rounds, then attrition accounts for the results. If you assume that players are fairly equally talented across the draft, you get the same results (and more so if there's attrition to help wipe out the already weak effect).

I'm not sure that there's a way to reason around this without making an assumption on the talent level of guys who don't play much/ever, and thus give yourself the answer.

At Wednesday, February 29, 2012 12:23:00 PM, Blogger Phil Birnbaum said...

My first argument assumes no such thing.

My second argument shows that IF worse players are drafted in later rounds, that would improperly skew the results towards zero. Therefore, the regression does not show that worse players are NOT drafted in later rounds.

At Wednesday, February 29, 2012 12:56:00 PM, Blogger Phil Birnbaum said...

Specifically, the first argument shows that the confidence interval is so wide that you can't tell whether there's an effect or not.

For the record, I am NOT trying to argue whether or not it's true that early choices are no better than late choices. That might be true; it might not be true.

What I *am* trying to argue is this: that (a) this study doesn't come close to showing the hypothesis is true, and (b) if anything, this study actually gives data that contradict that hypothesis.

At Wednesday, February 29, 2012 1:28:00 PM, Blogger j holz said...

Count me as still interested. I really wish Freakonomics would link to this.

At Wednesday, February 29, 2012 1:34:00 PM, Anonymous Guy said...

Alex: it's not true that our only choice is to make an assumption, either that there is or is not a correlation. Have you looked at Brian Burke's post? He has a very large sample of plays by QBs who didn't meet Berri's playing time requirement. These non-qualifiers perform very poorly as a group. He then makes a very generous assumption (in Berri's direction) that players who don't play at all are as good as those who played only a little (very unlikely, since those who play a little are much worse than the qualifiers). And when you incorporate this data on non-qualifiers for all draft positions, there is a very robust relationship between draft position and performance. The absence of a relationship is entirely a function of the qualifying requirement.

I also think Berri uses a weighted average for his draft position groupings. Even limiting the analysis to qualifiers, I bet that if he took a straight average of players he would find a strong relationship. This is hidden because at each successively lower draft level, the very best players account for a larger share of pass attempts. (Tom Brady alone accounts for 14% of plays by QBs in draft positions 150-250!).

At Wednesday, February 29, 2012 1:37:00 PM, Anonymous Guy said...

I would also say it's quite remarkable that Berri -- who uses multiple regression quite promiscuously, to answer nearly every question -- somehow neglects to run the obvious regression Phil suggests. Does draft position predict performance at the player level, or not? I bet the answer is yes.

At Wednesday, February 29, 2012 2:45:00 PM, Anonymous Alex said...

Phil - you said "But, it's of a reasonable size, and it goes in the right direction. If it turns out to be non-significant, isn't that just that the authors didn't use enough data?". Perhaps I was wrong to say 'any result', but you certainly mean 'this result', right? What makes your statement true in this case?

Guy - Yes, I read Brian's work when he posted and I read it again when Phil put up his post. But keep in mind that selection works both ways - players who play poorly to start are likely to not play again, and thus to be non-qualifiers in Brian's set. Players from poor draft positions who start poorly don't often get further chances to show improvement, whereas top choices do and thus can benefit from experience and, potentially, regression to the mean to get better.

I have yet to really see any of this discussion come down to other than 'I agree with this research because, despite the issues, I think NFL talent evaluation is poor' or 'I disagree with this research because I think NFL talent evaluation is decent and there are some issues with the research'.

At Wednesday, February 29, 2012 4:38:00 PM, Blogger BMMillsy said...

Hey Phil,

I haven't yet read this article closely, so I don't have any comment there. I suspect there is some selection bias going on, but beyond that I can't accurately comment at this point.

But, you might be interested in knowing that there is a paper in the 2003 Journal of Labor Economics that looks at the tendency for teams to take higher risk in the later rounds of the NFL draft, therefore looking for the 'diamond in the rough'. As you have discussed here, they are looking for that superstar that everyone else missed (usually these are players that they have less information about). Teams are more likely to take a marginal projected player from, say, an FCS school (less information about performance) than a projected marginal player from an FBS school in the later rounds.

So, ultimately, this creates a point in the draft at which taking a player with more uncertainty would be the correct choice. I suspect this is a large factor in what led to the findings in Berri et al. This would also explain why so many later round players don't even get rostered...but that one player they do find turns out to be a big deal.

But again I will need to read both papers more closely.

For reference, the citation is:

Hendricks, W., DeBrock, L. & Koenker, R. (2003). Uncertainty, hiring, and subsequent performance: The NFL draft. Journal of Labor Economics, Vol. 21, No. 4, pp. 857-886.

At Wednesday, February 29, 2012 5:48:00 PM, Anonymous Guy said...

"I have yet to really see any of this discussion come down to other than 'I agree with this research because, despite the issues, I think NFL talent evaluation is poor' or 'I disagree with this research because I think NFL talent evaluation is decent and there are some issues with the research'."

Alex, that is a fair description of Berri's study, and also of Brian's first post on the issue. In that case, he simply assumed that non-qualifiers performed at the 5th percentile level. While I find 5th percentile vastly more plausible than 50th percentile (Berri's theory), it's fair to say these were simply competing assumptions.

But then Brian produced DATA on the non-qualifiers: we have thousands of plays to look at, and we know these players performed very poorly. At the 6th percentile, in fact, exactly as Brian had guessed. If you simply include these players (and ignore those who didn't play at all), then there is a clear relationship between draft position and performance. So now we don't have competing assumptions, but rather an assumption on one side and data on the other. (I also think Berri's data would support my view, if he hadn't weighted players by pass attempts.) Now, you may have a theory as to why these low-round players are really quite good, despite playing execrably. You may even be right. But the burden of proof is on you, or anyone who wants to prove that some of these low-round choices are really better than they appear. And AFAICT, no one has even tried to prove this case. For now, the data we have says that the lower the draft round, the worse the QB.

At Wednesday, February 29, 2012 6:27:00 PM, Blogger Phil Birnbaum said...


If your study comes up with a point estimate, and it's almost the exact estimate you're trying to disprove, then you can't fall back on "not significantly different from zero."

You need to add more data to your study, so that your confidence interval either excludes zero, or excludes the value you're trying to disprove.

More data would eventually settle the question either way. Either zero would fall out of the confidence interval, giving a significant result, or the confidence interval would shrink so close to zero that there would obviously be no economic significance, despite conventional wisdom.

At Friday, March 02, 2012 4:22:00 PM, Anonymous Doug said...

Hey Phil,

Thanks for putting this together. The only reason I can think of for using the proxy for draft position (instead of draft position) is because the authors think an instrumental variable approach is required (,,

But this is misguided, because draft position is not endogenous to QB performance in the NFL; because the draft pre-dates any NFL performance there can be no endogeneity there. What Berri should be doing is modeling playing time as a predictor of NFL performance, with THAT being endogenous to the model (because PT affects performance, which affects PT, etc.). Actually modeling that makes my head hurt, but it would be possible. Look up Heckman selection models (like here:

At Wednesday, March 21, 2012 8:19:00 AM, Anonymous Anonymous said...

The notion that a player has a "true" draft position that is somehow different from when he was actually drafted exposes these guys as yahoos. It's an expression of the bizarre notion that "mock drafts" and amateur scouting aren't attempts to predict future behavior but instead should be used to evaluate that behavior. As a mode of criticism it is identical to claiming that the results of the NCAA tournament are invalid because it didn't precisely match your bracket.

