A new basketball free throw choking study
"Performance Under Pressure in the NBA," by Zheng Cao, Joseph Price, and Daniel F. Stone; Journal of Sports Economics, 12(3). Ungated copy here (pdf).
------
There's a new paper presenting evidence that NBA players choke when shooting free throws under pressure. The link to the paper is above; here's a newspaper article discussing some of the claims.
Using play-by-play data for eight seasons -- 2002-03 to 2009-10 -- the authors find that players' foul-shooting percentage drops significantly late in the game when they're behind. They present their results through a regression, but the pattern is easier to see in their summary statistics. Let me show you the trend, which I got by doing some arithmetic on the cells in the paper's Table 1:
In the last 15 seconds of a game, foul shooters hit
.726 when down 1-4 points (922 attempts)
.784 when tied or up 1-4 points (5505)
.776 when up or down 5+ points (4510)
With 16-30 seconds left, foul shooters hit
.748 when down 1-4 points (727 attempts)
.775 when tied or up 1-4 points (2652)
.779 when up or down 5+ points (6174)
With 31-60 seconds left, foul shooters hit
.752 when down 1-4 points (922 attempts)
.742 when tied or up 1-4 points (1634)
.767 when up or down 5+ points (8969)
In all other situations, foul shooters hit about .751 regardless of score differential (400,000+ attempts).
------
Take a look at the first set of numbers, the "last 15 seconds" group. When down 1 to 4 points, it appears that shooters do indeed "choke," shooting about 2.5 percentage points (.025) worse than normal. In 5+ point "blowout" situations late in the game, they shoot about 2.5 percentage points *better* than normal.
But neither of these numbers is statistically significantly different from the overall average (which I'm guessing is about .751). The difference of .025 is about 1.7 SDs.
The real statistical significance comes when you compare the "down by 1-4" group, not to the average, but to the "5+ points" group. In that case, the difference is double: the "down 1-4" is .025 below average, and the "5+" group is .025 *above* average. The difference of .050 is now significant at about 3 SDs.
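If you want to check those SD figures, here's the arithmetic in Python -- a rough sketch, using the simple binomial approximation and my guessed .751 baseline:

```python
from math import sqrt

p = 0.751                                    # my guessed overall FT%

# "down 1-4" (922 attempts at .726) vs. the overall average
se_one = sqrt(p * (1 - p) / 922)
print(0.025 / se_one)                        # about 1.7-1.8 SDs

# "down 1-4" (922 attempts) vs. "5+" (4510 attempts) directly
se_two = sqrt(p * (1 - p) * (1 / 922 + 1 / 4510))
print(0.050 / se_two)                        # about 3.2 SDs
```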
So, if you look only at the last 15 seconds of games, it looks like players down by 1-4 points choke significantly compared to players who are up or down by at least five points.
There are similar (but smaller) differences in the 16-30 seconds group and the 31-60 seconds group. And I'm pretty sure you also get statistical significance if you combine the three groups and compare the "down 1-4" shots to the "5+" shots -- the rough check below bears that out.
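Here's that pooled check, same back-of-envelope style (the attempt counts and percentages are the ones listed above; the flat .75 base rate in the SE is my simplification):

```python
from math import sqrt

# pool the "down 1-4" shots across the three late-game windows
n_down = 922 + 727 + 922
p_down = (0.726 * 922 + 0.748 * 727 + 0.752 * 922) / n_down     # ~.742

# pool the "up or down 5+" shots
n_blow = 4510 + 6174 + 8969
p_blow = (0.776 * 4510 + 0.779 * 6174 + 0.767 * 8969) / n_blow  # ~.773

se = sqrt(0.75 * 0.25 * (1 / n_down + 1 / n_blow))
print((p_blow - p_down) / se)                # about 3.4 SDs
```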
-------
So that's what we're dealing with: when you compare "down 1-4 late in the game" to "up or down 5+ late in the game", the difference is big enough to constitute evidence of choking. The most obvious explanation is that the foul shooters in the two groups might be different. However, that can't be the case, because the authors controlled for who the shooter was, and the results were roughly the same. Indeed, they controlled for a lot of other stuff, too: whether the previous shot was made or not, which quarter it is, whether it's a single foul shot or multiple, and so on. But even after all those controls, the results are pretty much the way I described them above.
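Just so it's concrete, the regression is presumably something along these lines. This is only my sketch of that kind of specification -- the file name and every column name are made up, not the authors':

```python
import pandas as pd
import statsmodels.formula.api as smf

# hypothetical input: one row per foul shot, with a 0/1 "made" column
shots = pd.read_csv("free_throws.csv")

# a linear probability model in the spirit of the paper's regression
model = smf.ols(
    "made ~ C(player)"                        # who the shooter is
    " + C(margin_bucket):C(seconds_bucket)"   # score/time situation
    " + prev_shot_made + C(quarter) + C(shot_in_set)",
    data=shots,
).fit()
print(model.params.filter(like="margin_bucket"))
```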
Again, I repeat: the authors (and the data) do NOT say that the "choke" group shoots significantly worse than average. They can only say that the "choke" group is significantly worse than one specific group of players: the "don't care" group, shooting late in the game when the result is pretty much already decided, with a gap of 5+ points.
But this fits in with the authors' thesis: that the higher the leverage of the situation, the more choking you see. They later break down the "5+" group into "5-10" and "11+", and they find that even that breakdown is consistent -- the 11+ group shoots better than the (slightly) higher leverage 5-10 group. Indeed, for most of the study, they compare to "11+" instead of "5+". For some of the regressions, they present two sets of results, one relative to the "5-10 points" group, and one relative to the "11+" group. The "11+" results are more extreme, of course.
-------
As I said, the authors don't present the results the way I did above ... they have a big regression with lots of tables and results and such. The result that comes closest to what I did is the first column of their Table 5. There, they say something like,
"In the last 15 seconds of a game, a player down 1-4 points will shoot 5.8 percentage points (.058) worse than if the game were an 11+ point blowout. The SD of that is 2.1 points, so that's statistically significant at p=.01."
-------
Oh, and I should mention that the authors did try to eliminate deliberate misses, by omitting the last foul shot of a set with 5 seconds or less to go. Also, they omitted all foul shots with less than 5 minutes to go in the game (except those in the last 15/30/60 seconds that they're dealing with). I have absolutely no idea why they did that.
-------
Although the authors do report the overall "down 1-4" effect I described above, it's almost in passing -- they spend most of their time breaking the effect down in a bunch of different ways.
The biggest effect they find is for this situation:
-- shooting a foul shot that's not the last of the set (that is, the first of two, or the first or second of three);
-- in the last 15 seconds of the game;
-- team down exactly one point.
compared to
-- shooting a foul shot that's not the last of the set (that is, the first of two, or the first or second of three);
-- in the last 15 seconds of the game;
-- score difference of 11+ points in either direction.
For that particular situation, the difference is a huge 10.8 percentage points (.108), significant at 2.5 SDs.
Also: change "down by one point" to "down by two points", and it's a 6.0 percentage point choke. Change "not the last of the set" to "IS the last of the set," and the choke effect is 6.6 points when down by 1, and 6.0 points when down by 2.
This highly specific stuff doesn't impress me that much ... if you look at enough individual cases, you're bound to find some effects that are bigger and some that are smaller. My guess is that the differences between the individual cases and the overall "down 1-4" case are probably random. However, the authors could counter that the biggest sub-effects were the ones they predicted -- the "down by 1" and "down by 2" cases. On the other hand, late performance is actually *better* than in blowouts when the score is tied (by around 0.2 points), a finding the authors say they didn't expect.
So my view is that maybe the "1-4 points" result is real, but I'm wary of the individual breakdowns. Especially this one. In this situation:
-- last 15 seconds of the game
-- for a visiting team
-- where the most recent foul shot was missed
-- down by 1-4 points
the player is 9.6 percentage points (.096) less likely to make the shot than
-- last 15 seconds of the game
-- for a visiting team
-- where the most recent foul shot was missed
-- score 11+ points in favor of either team.
Despite the large difference in basketball terms, this one's only significant at .05.
------
Another thing about the main finding is ... we actually already knew it. Last year, I wrote about a similar study (which the authors reference) that found roughly the same thing. Here, copied from that other post, are the numbers those researchers found, for all foul shots in the last minute of games, broken down by score differential:
-5 points: -3% [percentage points]
-4 points: -1%
-3 points: -1%
-2 points: -5% (significant at .05)
-1 points: -7% (significant at .01)
+0 points: +2%
+1 points: -5% (significant at .05)
+2 points: +0%
+3 points: -1% ("also significant")
+4 points: +1%
+5 points: -1%
There are some differences in the two studies. The older study controlled for player career percentages, instead of player season percentages. It didn't control for quarter (which is why commenters suspected it might just be late-game fatigue). It didn't control for a bunch of other stuff that this newer study does. And it used only three seasons of data instead of eight.
But the important thing is: the newer study's eight seasons *include* the older study's three seasons. And so, you'd expect the results to be somewhat similar. It's possible that the three significant seasons are enough to make all eight seasons look significant, even if the other five seasons were just average.
Let's try, in a very rough way, to see if we can tease out the new study's result for those other five seasons.
In the older study, if we average the -1, -2, -3, and -4 effects, we get -3.5. So, in the last minute, down by 1-4 points, shooters choked by 3.5 percentage points.
How do we get the same number from the newer study? Well, the top-left cell of Table 5 says that, in the last 30 seconds and down by 1-4 points, shooters choked by 3.8 percentage points.
That's our starting point. But the new study's selection criteria are a little different from the old study's, so we need to adjust.
First, the "-3.8" in the new study comes from comparing to games in which the point differential is 11 or more. The "5-10" group is probably a more reasonable comparison to the previous study. The difference between "11+" and "5-10", at 30 seconds, appears to be about one percentage point (compare the second columns of Tables 3 and 4). So we'll adjust that 3.8 down to 2.8.
Second, the new study is for the last 30 seconds, while the old study is for the last minute. From earlier in this post, we see that the 31-60 difference between the "down 1-4 group" and the "5+" group is only about -1.5 percentage points. Averaging that with the -2.8 from the above step (but giving a bit more weight to the -2.8 because there were more shots there), we get to about -2.4.
So we can estimate, very roughly, that for the same calculation,
Old study (three seasons): -3.5
New study (eight seasons): -2.4
Let's assume that if the new study had confined itself to only the same three seasons as the older study, it would have come up with the same result (-3.5). In that case, to get an overall average of -2.4, the other five seasons must have averaged -1.74. That's because, if you take five seasons of -1.74, and three seasons of -3.5, you get -2.4.
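In code, the whole back-out looks like this (every input is a number quoted earlier in this post, not taken from the paper directly):

```python
# old study, last minute: average of its -1 through -4 rows
old_three = (-7 - 5 - 1 - 1) / 4             # -3.5

# new study, last minute, after the two adjustments above
new_eight = -2.4

# if the three overlapping seasons reproduce the old -3.5, the
# other five seasons must have averaged:
other_five = (8 * new_eight - 3 * old_three) / 5
print(round(other_five, 2))                  # -1.74
```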
So, as a rough guess, the new study found:
-3.5 -- same three seasons as the old study;
-1.7 -- five seasons the old study didn't cover;
-------------------------------------------------
-2.4 -- all eight seasons combined.
So, in the new data, this study finds only half the choke effect that the other study did. Moreover, I estimate it's only 1 SD from zero.
-------
That's for "down 1-4 points." Here's the same calculation, broken down by individual score. Here "%" means percentage point difference:
-2 points: First three: -5%. Next five: -2.3%. All eight: -3.3%.
-1 points: First three: -7%. Next five: -1.7%. All eight: -3.7%.
+0 points: First three: +2%. Next five: -1.8%. All eight: -1.0%.
+1 points: First three: -5%. Next five: -0.5%. All eight: -2.2%.
+2 points: First three: +0%. Next five: -0.3%. All eight: -1.3%.
Generally, the five new seasons are closer to zero than the three original seasons. That's what you would expect if the original numbers were mostly luck.
-------
So, in summary:
-- The study finds that in the last seconds of games, players behind in close games shoot significantly worse than in blowouts.
-- In the last 30 seconds, they're maybe about 2.8 percentage points worse. In the last 15 seconds, they're maybe about 4.8 percentage points worse.
-- The effect is biggest when down by 1 in the last fifteen seconds.
-- However, they are not statistically significantly better or worse than *average*, just statistically significantly worse than blowouts (although they certainly are "basketball significantly" worse than average).
-- The effect is mostly driven by the three seasons covered in the earlier study. If you look at the other five seasons, the effect is not statistically significant (but still has the same sign).
What do you think? I'm not absolutely convinced there's a real effect overall, but yeah, it seems like it's at least possible.
However, I do think the most extreme individual breakdowns are overstated. For instance, the newspaper article says that in the last 15 seconds, down by 1, players will shoot "5 to 10 percentage points worse than normal." (They really mean "worse than 11+ blowouts," but never mind.) Given that that's the most extreme result the study found, I think it's probably a significant overestimate. I'd absolutely be willing to bet that, over the next five seasons, the observed effect will be less than five percentage points.
--------
P.S. One last side point: the newspaper article says,
"Shooters who average 90 percent from the line performed slightly better than that under pressure, while 60 percent shooters had a choking effect twice as great as 75 percent shooters. That suggests that a lack of confidence begets less confidence, and vice versa."
This is a correct summary of what the authors say in their discussion, but I think it's wrong. The regressions that this comes from (Table 5, columns 2 and 6) don't include an adjustment for the player. So what it really means is that the 60 percent shooter will be *twice as far below the average player* as the 75 percent shooter. That makes sense -- because he's a worse shooter to begin with, even before any choke effect.
------
UPDATE: After posting this, I realized that I may have missed one aspect of the regression ... but I think my analysis here is correct if I make one additional assumption that's probably true (or close to true). I'll clarify in a future post.
Labels: basketball, choking, clutch, free throws