Are NFL coaches too conservative on fourth down?
NFL coaches are cowards.
That’s the conclusion of a study by David Romer, “Do Firms Maximize? Evidence From Professional Football.” Romer finds that teams could do better by kicking on fourth down less often, and “going for it” more often.
Romer took NFL play-by-play data from 1998, 1999, and 2000. (He looked only at first quarter results, in order to minimize the effects of score and time remaining.) He then used “dynamic programming” to fit a smooth curve to the point values of first downs on every yard line on the field. Having done that, he’s now easily able to figure the best fourth-down strategy.
For instance, Romer’s analysis found that having the ball first-and-ten on your own 30-yard line is worth about +0.9 points (these numbers may not be exact – I’m reading them off a graph). The opponent having the ball in the same spot is worth –2.9 points to you.
Now, suppose you have a fourth-and-short situation on your 30. If you go for it and make it, you’ll be at +0.9 points. If you go for it and miss, you’re at –2.9. If you kick to your opponent’s 30, you’re at –0.9 (since the other team will be at +0.9).
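Therefore, Romer would find that if you can make fourth down with at least 45% probability, you should go for it. That’s the probability p at which the expected value of going for it, (p × 0.9) + ((1 – p) × –2.9), matches the sure –0.9 you get from punting. (My rounded graph readings actually put that breakeven nearer 53% than 45%, which shows how rough they are; 45% is the figure the rest of this discussion uses.)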
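The breakeven arithmetic is easy to sketch in a few lines of Python. This is only an illustration using the rounded point values I read off the graph, not Romer’s actual procedure; the function names are my own.

```python
def go_for_it_value(p, v_success, v_fail):
    """Expected point value of going for it with conversion probability p."""
    return p * v_success + (1 - p) * v_fail


def breakeven_probability(v_success, v_fail, v_punt):
    """Probability p at which going for it is worth exactly as much as punting.

    Solves p * v_success + (1 - p) * v_fail == v_punt for p.
    """
    return (v_punt - v_fail) / (v_success - v_fail)


# Rounded values for fourth-and-short on your own 30 (eyeballed from a graph).
V_SUCCESS = 0.9   # you keep the ball around your own 30
V_FAIL = -2.9     # opponent takes over on your 30
V_PUNT = -0.9     # opponent takes over on their own 30

p_star = breakeven_probability(V_SUCCESS, V_FAIL, V_PUNT)
print(f"breakeven conversion probability: {p_star:.1%}")
```

Because the inputs are eyeballed, the threshold is sensitive to them: with a hypothetical failure value of –2.4 instead of –2.9, the same formula gives roughly 45%.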
Romer then goes one more step. He figures out that the chance of making five yards or more is very close to that 45% chance. Therefore, a team facing fourth-and-5 would presumably be indifferent (in the economic sense) between punting and going for it. And so you’d expect that teams in that situation would punt about half the time.
But in real life, teams don’t go for it 50% of the time. Romer doesn’t tell us the actual percentage. However, he does figure out the empirical “yards to go” breakeven point for coaches based on their behavior. That is, if coaches won’t go for five yards 50% of the time, when *will* they go 50% of the time? Is it three yards? Two? One?
In their own territory, it’s effectively zero – coaches never went for it more than 50% of the time, even on fourth-and-1. In the opponent’s half of the field, though, they did. Eyeballing the graph shows the trend is irregular, but appears to average about 1/4 of the theoretical yardage. For instance, on the opponent’s 30, the breakeven point is fourth-and-7. But coaches acted as if the breakeven were about fourth-and-3. (This information comes from figure 5, which is a very interesting chart.)
It would be interesting to see the converse – now that we know that coaches go for it half the time on fourth-and-3, how often do coaches actually go for it on fourth-and-7? Is it 10%? 20%? Never? Romer doesn’t give us this information. That leaves open the possibility (at least in theory) that coaches are rational, but are working from different calculations. For instance, maybe they think that fourth-and-6 is the breakeven, so they seldom go for it with 7 yards to go, but often go for it with 5 yards to go. This is theoretically possible, especially since in “The Hidden Game of Football,” authors Bob Carroll, Pete Palmer, and John Thorn [CPT] come up with different numbers.
CPT’s methods seem similar to Romer’s, but their conclusions are different. Romer writes that CPT “do not spell out their method for estimating the values of different situations ... and it yields implausible results.” (I wasn’t able to notice anything that implausible, but I’ll give Chapter 10 another read.) Also, Romer writes that “their conclusions differ substantially from mine.”
One significant difference is that Romer advocates going for it on your own 30 when you have fewer than 5 yards to go. CPT, on the other hand, advocate punting even on fourth-and-1 (according to CPT, punting is worth –0.5 points, going for it on fourth-and-1 is worth –0.8, and going for it on fourth-and-6 is worth –1.9).
The difference seems to be the values assigned to the different field positions. Both studies give –2 points for first-and-ten on your own goal line, and +6 points for first-and-goal on the opponent’s one-inch line. But CPT assign values linearly between those two endpoints, while Romer has an S-shaped curve, flat in the middle but steep near both goal lines. To CPT, the value of a punt from one 30-yard line to the other is more than three points – but to Romer, it’s only two points. That’s a big difference, and it’s enough to make the difference between punting and not punting.
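To see where the “more than three points” comes from, here is a rough sketch of the linear scale. It assumes a strict straight line from –2 at your own goal line to +6 at the opponent’s, which is my simplification of CPT’s scale (their published values may be rounded differently), and it reuses my eyeballed Romer values for the comparison. The function names are mine.

```python
def linear_value(yards_from_own_goal, v_own_goal=-2.0, v_opp_goal=6.0):
    """Point value of first-and-ten, assuming a straight line between the goal lines."""
    return v_own_goal + (v_opp_goal - v_own_goal) * yards_from_own_goal / 100.0


def punt_swing(v_opp_at_their_70, v_opp_at_their_30):
    """Points gained by punting from your own 30 to the opponent's 30.

    Without the punt, the opponent takes over on your 30 (70 yards from their
    own goal line); with it, they start on their own 30. Your value is the
    negative of theirs, so the gain is the difference between their two values.
    """
    return v_opp_at_their_70 - v_opp_at_their_30


# Linear scale: 3.6 points at the 70-yard mark versus 0.4 at the 30.
linear_swing = punt_swing(linear_value(70), linear_value(30))

# Romer, using my eyeballed readings: 2.9 versus 0.9.
romer_swing = punt_swing(2.9, 0.9)

print(f"linear scale punt swing: {linear_swing:.1f} points")
print(f"Romer punt swing:        {romer_swing:.1f} points")
```

On a straight line every yard is worth the same 0.08 points, so a 40-yard change of field position is always worth 3.2. On Romer’s flatter mid-field curve the same 40 yards buys less, which is what tilts his numbers toward going for it.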
I don’t know whose numbers are right. It would have been nice if either source had shown us a chart of actuals – that is, how many times did team X have the ball on their 30, and what were the eventual results? That way, we could try to evaluate both authors ourselves.
In any case, it’s probably true that all the conclusions, on both sides, are dependent on the situation values. Without actuals, we really don’t know which is correct. If CPT are correct, and Romer is wrong, it could be that coaches aren’t all that conservative.
This is a meaty paper. It’s interesting to read, and provides much data (mostly in graphs) that would be useful to football sabermetricians. I do wish I understood Romer’s method a bit better – I’m not familiar with “dynamic programming” – so I could form an opinion on whether his values are empirically superior to CPT’s. (CPT don’t provide any raw data either.)
But even with Romer’s interesting data, I’d still wish for more. We are told that when it’s a 50% shot, coaches go for it less than 50% of the time. But what about when it’s a 70% shot? Or a 90% shot? Does that make a difference? Do coaches go by yards, so that they’ll always go for it on fourth-and-inches regardless of field position? And, as I wondered above, is it possible that coaches agree with Romer’s analysis, but are just using different numbers?
It seemed like Romer found one way of looking at the data that showed strong risk aversion, and stopped there. Sabermetricians, who are more interested in the particulars of coaches’ strategy than the yes/no question of whether they maximize, may be frustrated that the author has so much data yet chose to show only a small part of it.
And if anyone has any data or suggestions on figuring out whether Romer’s data is better or worse than CPT’s, I’d be interested in knowing.
Hat tip: The Wages of Wins. This study got a fair bit of press ... see this espn.com article for NFL coaches’ reactions. Here’s the first of a four-part series from Doug Drinen. And you can google “Romer NFL” for others.