### Huge "choke" effect reported in soccer, part II

OK, someone was kind enough to send me a copy of one of the studies (gated link), the one that presents this result:

"... 86.6 percent for the first shooter, 81.7 for the second, 79.3 for the third and so on."

The authors looked at three tournaments: Copa America, the European Championship, and the World Cup. Strangely, or perhaps not so strangely (you guys can tell me what you think), the scoring percentages varied quite a bit across the three tournaments:

82.7% Copa America (133 kicks)

84.6% European Championship (104 kicks)

71.2% World Cup (153 kicks)

The difference between the World Cup and the European Championship is about 2.5 standard deviations.
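Here's a rough sketch of how that gap can be checked with a two-proportion z-score. It treats every kick as an independent binomial trial, which is my assumption and not something the study states. The goal counts are back-calculated from the quoted percentages, and the function name is just for illustration.

```python
from math import sqrt

def two_proportion_z(goals_a, kicks_a, goals_b, kicks_b):
    """z-score for the difference between two success rates, using a pooled SE."""
    p_a = goals_a / kicks_a
    p_b = goals_b / kicks_b
    pooled = (goals_a + goals_b) / (kicks_a + kicks_b)
    se = sqrt(pooled * (1 - pooled) * (1 / kicks_a + 1 / kicks_b))
    return (p_a - p_b) / se

# Assumption: goal counts reconstructed by rounding percentage * kicks.
euro_goals = round(0.846 * 104)  # 88 goals in 104 kicks
wc_goals = round(0.712 * 153)    # 109 goals in 153 kicks

z = two_proportion_z(euro_goals, 104, wc_goals, 153)
print(f"European Championship vs. World Cup: z = {z:.2f}")
```

With these reconstructed counts, the pooled z-score comes out near 2.5.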

(Of the missed kicks, here are the percentages that were saves, as opposed to hitting the post or shooting wide or high:

78.3% non-goal saves Copa America

57.9% non-goal saves European Championship

70.5% non-goal saves World Cup

But, as the authors point out, those differences aren't significant because of the small numbers of misses.)

So why is the percentage of goals so low in the World Cup? I'm not sure. The authors think it's pressure and psychology. They mention the possibility of better goaltending, but argue against it on the grounds that the save percentages were "almost 10 percent higher" in the Copa America than in the World Cup. But they weren't: those figures were saves as a percentage of non-goals. The true save percentages, as a share of all kicks, were

13.5% saves Copa America

8.9% saves European Championship

20.3% saves World Cup
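The conversion from "saves as a share of misses" to "saves as a share of all kicks" is just the miss rate times the save share. A minimal sketch, using only the percentages quoted above (the function and variable names are mine):

```python
# (goal %, saves as % of non-goals), as quoted in the post
tournaments = {
    "Copa America": (82.7, 78.3),
    "European Championship": (84.6, 57.9),
    "World Cup": (71.2, 70.5),
}

def save_pct_of_all_kicks(goal_pct, saves_pct_of_misses):
    """Share of ALL kicks that were saved, in percent."""
    miss_pct = 100.0 - goal_pct
    return miss_pct * saves_pct_of_misses / 100.0

for name, (goal_pct, saves_pct) in tournaments.items():
    overall = save_pct_of_all_kicks(goal_pct, saves_pct)
    print(f"{name}: {overall:.1f}% of all kicks saved")
```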

It seems to me that better goaltending could still be part of the reason. If you think goaltending is a bigger factor than kicking -- the same way the starting pitcher is a bigger factor in a baseball game than the opposing hitters -- that theory makes a lot of sense.

-----

Anyway, moving on to the shooters. Here's the raw data for each position in the shooting order:

86.6% shooter 1

81.7% shooter 2

79.3% shooter 3

72.5% shooter 4

80.0% shooter 5

64.3% shooter 6+

(It should be noted that the "shooter 5" is subject to "bottom of the 9th" effects -- the team leading won't take its 5th shot if not necessary, so the fifth kick should be weighted more towards the first-kicking team. Not sure how big a deal that is, but there were only 55 kicks there out of 82 shootouts. The "shooter 4" had 80 kicks, and the "6+" was only 28 kicks.)

My first reaction is still: aren't teams just putting their best shooters first? The authors mention this theory, but then don't come back to it.

What they do instead is run a regression. They include tournament, positional role of the player (forwards should be better at scoring than defenders), and age of the player. They also include a dummy for playing time, in case the substitutes are less tired because they didn't play as much.

What are the results? Well, as you'd expect, young players score more than old players. Forwards score more than midfielders who score more than defenders. And playing time doesn't seem to matter much.

But what about shooting position? After the adjustments, does shooter 1 still score more often than shooter 4?

Shockingly, the authors don't tell us! They list the results for everything in the regression except shooting order. I can't figure out why they would choose to omit that one.

Nonetheless, they write,

" ... tournament and kick order were most strongly related to kick outcome .... there were especially marked differences ... between kicks #6-9 and kick #1 (p=.0002). All kicks except kick #4 were significantly different from kicks #6-9 in the analysis."

So the results appear to be roughly the same as above, even after adjusting for age and position and tournament.

Doesn't that still support the hypothesis that it's just skill? The authors make a good case for that:

" ... skilled players are probably picked for the first kick ... and less-skilled players are picked for the sixth kick and on, because the five most skilled players have already been used for kicks #1-5. Skilled players may have also been picked for kick #5, which would explain why these kicks, contrary to what would be expected from the trend of the other kicks, were more successful than kicks #3 and #4."

Sounds right to me. But, then, the authors make a last-ditch attempt to preserve their "choking under pressure" hypothesis:

"However, these confounding factors cannot explain the decline in success between kicks #2, #3, and #4. Thus, the hypothesis about the influence of kick importance is still plausible."

Er, *why* can't skill explain the decline from #2 to #4? And if skill is not a factor, and players choke under pressure, why is #5 so high?

This seems to be a case where the study couldn't be much more compatible with the obvious explanation, but the authors support the less-plausible hypothesis anyway.

-----

They also say:

"... coaches may be interested to know that forwards have a tendency to score more goals than defenders ... and younger players often score more than older players."

If the authors think coaches don't know this, why do they think some players are assigned to be forwards and some not? And why do they think there are so few 45-year-old professional soccer players?

-----

Anyway, we now have a better idea of what might be causing the other result. From the NYT article:

"Kick takers in a shootout score at a rate of 92 percent when the score is tied and a goal ensures their side an immediate win. But when they need to score to tie the shootout, with a miss meaning defeat, the success rate drops to 60 percent."

The two situations are probably about equally likely if you're taking your fifth shot: my guess is that a 4-4 is about as likely as a 5-4 there. But if you're on your sixth shot, it's much more likely that it's 1-0 and not 0-0 (because there's a 64% chance that the other team scored on their sudden death shot).

So, when you're tied, it's more likely that you're on your fifth shot, where your chance of scoring is high. When you're behind by one, it's a little less likely, which means it's a little more likely that you're on your sixth shot, where your chance of scoring is low (because you're using a lesser shooter).

And that probably explains at least part of the "92% to 60%" result -- it's better players, not just "stress". We'd need the other study to know for sure.
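To get a feel for how much of the gap a pure "who is kicking" effect could produce, here's a small sketch. It assumes the only thing that matters is kick position, using the 80.0% and 64.3% rates from the table above. The shares of tied and behind situations that fall on the fifth kick are hypothetical placeholders, since the study I have doesn't report them.

```python
# Success rates by kick position, from the table in the post.
P_FIFTH = 0.800
P_SIXTH_PLUS = 0.643

def blended_rate(share_on_fifth):
    """Expected success rate if kick position (skill) is the only factor."""
    return share_on_fifth * P_FIFTH + (1 - share_on_fifth) * P_SIXTH_PLUS

# HYPOTHETICAL shares of each situation that occur on the fifth kick.
tied_share_fifth = 0.7
behind_share_fifth = 0.5

tied = blended_rate(tied_share_fifth)
behind = blended_rate(behind_share_fifth)
print(f"tied: {tied:.1%}  behind: {behind:.1%}  gap: {tied - behind:.1%}")

# The largest gap this effect alone can create is the difference
# between the two position rates.
max_gap = P_FIFTH - P_SIXTH_PLUS
print(f"maximum possible gap from composition alone: {max_gap:.1%}")
```

Whatever shares you plug in, this model can't produce a gap bigger than the difference between the fifth-kick and sixth-plus rates, which fits with "at least part" and not "all" of the 92%-to-60% result.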

"shooter 5 is subject to "bottom of the 9th" effects...there were only 55 kicks there out of 82 shootouts...and the "6+" was only 28 kicks.)"

I think this has to mean that the "60% vs. 92%" samples are very small, unless other tournaments are considered. At most, maybe 40 of the 55 fifth kicks (probably a lot fewer) could qualify, plus 14 of the 28 "6+" kicks. So that would be 54 kicks, probably closer to 40, which then have to be divided into two samples: tied and behind by one. Is that even worth talking about?

Good point. The SD for the difference between two groups of 20 is about 14 percentage points. So the observed 32-point difference is a bit more than 2 SDs.

Add in the fact that the ties are disproportionately in the fifth kick, where the kickers are better, and you probably get well below 2 SDs.
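The "14 percentage points" figure can be roughly reproduced with the usual binomial formula. This is a sketch, assuming independent kicks and a common underlying rate of 76%, which is just the midpoint of 92% and 60% (my assumption, not a number from the study).

```python
from math import sqrt

def sd_of_difference(p, n_a, n_b):
    """SD of the difference between two sample proportions with common rate p."""
    return sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))

# Assumption: common success rate taken as the midpoint of the two reported rates.
p_common = (0.92 + 0.60) / 2

sd = sd_of_difference(p_common, 20, 20)
gap = 0.92 - 0.60
print(f"SD of difference: {sd:.3f}")
print(f"Observed gap in SDs: {gap / sd:.1f}")
```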

390 kicks doesn't seem like a lot of data even if you assume you can perfectly account for all the biases in the data.

Isn't soccer the most played game in the world? They should be looking at thousands of penalty kicks.

Phil: Technically, the 32-point difference would be significant. But my guess is the result for the tied/goal=win sample is 11-for-12, since the tied scenario should happen much less often. If so, a change of just one goal (10 of 12) would make the gap fall far short of statistical significance. I don't think that's much to rest a finding on. (Even before we get to the -1 sample being biased downward, and the tied sample being biased upward.)

Makes sense to me. I bet you're right.
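The 11-for-12 versus 10-for-12 point can be sketched with a one-sided Fisher exact test. The 12-kick tied sample is the commenter's guess; the behind-by-one sample of 17 goals in 28 kicks (about 60%) is a purely hypothetical number chosen so the two groups add up to roughly 40 kicks. Neither comes from the study.

```python
from math import comb

def fisher_one_sided(a_goals, a_kicks, b_goals, b_kicks):
    """P(group A scores at least a_goals) under the hypergeometric null,
    holding total goals and both group sizes fixed."""
    total_goals = a_goals + b_goals
    total_kicks = a_kicks + b_kicks
    total_misses = total_kicks - total_goals
    denom = comb(total_kicks, a_kicks)
    p_value = 0.0
    for k in range(a_goals, min(a_kicks, total_goals) + 1):
        # comb() returns 0 when a_kicks - k exceeds total_misses
        p_value += comb(total_goals, k) * comb(total_misses, a_kicks - k) / denom
    return p_value

# HYPOTHETICAL behind-by-one sample: 17 goals in 28 kicks.
BEHIND_GOALS, BEHIND_KICKS = 17, 28

p_11_of_12 = fisher_one_sided(11, 12, BEHIND_GOALS, BEHIND_KICKS)
p_10_of_12 = fisher_one_sided(10, 12, BEHIND_GOALS, BEHIND_KICKS)
print(f"11 of 12 tied kicks scored: one-sided p = {p_11_of_12:.3f}")
print(f"10 of 12 tied kicks scored: one-sided p = {p_10_of_12:.3f}")
```

Under these made-up sample sizes, a single goal moves the p-value from right around the conventional 0.05 line to well above it.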

The uptick for the 5th kicker in a shootout has a simple explanation: many coaches leave a "finisher" for the fifth kick, similar to a closer in baseball. If 55/82 (almost 70%) get to that 5-4 (must score to keep going) or 4-4 (score to win), then it makes sense that a coach wants his best or second-best penalty taker for this very important kick. That definitely explains the uptick.

