Sunday, April 18, 2010

Does a cricketer's career really depend on luck?

A couple of days ago, the "Freakonomics" blog, and several others, quoted a study, based on cricket, that purported to show that random luck in your first job can have a big impact on your career. The paper is called "What Can International Cricket Teach Us About the Role of Luck in Labor Markets?" and it's by Shekhar Aiyar and Rodney Ramcharan.

It's probably true that luck plays a big part in your work life, but I don't think the study actually shows that.

Here's what the authors did (and apologies in advance if I get some of the cricket terminology wrong). They figured out home field advantage in a player's cricket batting average (runs per out made), and found that the average batter hits for about 25% more runs at home than on the road. Given that, decision-makers should take home field advantage into account when evaluating players. Obviously, a batter who hits for 30 runs on the road likely has more ability than a batter who hits for 30 runs at home.

Now, consider a batter who's playing in an elite "test" cricket match for the first time. Some portion of batters, after their debut match, will be dropped from the team -- presumably those who didn't bat well. (It turns out that figure is about 25% of first-time batters being immediately dropped.)

Obviously, the worse a player bats, the more likely he'll be dropped. But managers should take into account whether it's a home match or a road match. All things being equal, a batter who hits for X runs at home should be more likely to be dropped than a player who hits for X runs on the road.

The authors do a regression, to predict the probability of a player being dropped, based on his batting in that match, a dummy for whether it was home or road, and an interaction term (the dummy times the number of runs). It turned out that neither the dummy nor the interaction term was statistically significant.

Therefore, the authors concluded, managers neglect to take home field advantage (HFA) into account when evaluating players -- they just look at runs. Therefore, a player's career is strongly affected by random chance -- the luck of whether his first match happened to be at home (where he gets a better chance of making the team) or on the road (where he gets a worse chance of making the team).

----

Except that ... the regression does NOT show that the manager ignores HFA, not at all. The regression equation the authors found (Table 9, column 2) was

Chance of being dropped = -0.0043 * (runs) + .000356 * (runs) if at home + .0527 if at home

Now, suppose a batter hits for 10 fewer runs than average. If he does that on the road, his additional chance of being dropped is

0.0043 * 10 = 4.3%.

But suppose he does that at home. His additional chance of being dropped is:

0.0043 * 10 + .000356 * 10 + .0527 = 9.9%.

Doesn't that seem like a reasonable adjustment for HFA? I think it does. I'm not sure what an average batter hits for ... say, 35 runs? That means if the player hits for 25 runs at home, he'll be cut 10% of the time. If he hits for 25 runs on the road, he'll be cut 4% of the time. What's wrong with that?
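The arithmetic above can be checked with a short sketch. The three coefficients are the ones quoted from Table 9; the function name and the `literal_signs` switch are my own, made up for illustration. By default it adds the magnitudes the way I did above. If you instead plug a 10-run shortfall into the equation exactly as written, the small interaction term flips sign and the home figure comes out nearer 9.2% than 9.9%, which doesn't change the picture.

```python
# Coefficients as quoted from Table 9, column 2 of the paper.
RUNS_COEF = -0.0043          # change in drop chance per run scored
HOME_RUNS_COEF = 0.000356    # extra per-run term for a home debut
HOME_DUMMY = 0.0527          # flat extra drop chance for a home debut


def extra_drop_chance(shortfall, at_home, literal_signs=False):
    """Additional chance of being dropped for batting `shortfall` runs below average.

    literal_signs=False mirrors the arithmetic in the post (magnitudes added).
    literal_signs=True plugs runs = -shortfall into the equation as written.
    (Hypothetical helper, not something from the paper.)
    """
    chance = -RUNS_COEF * shortfall  # 0.0043 per run of shortfall
    if at_home:
        interaction = HOME_RUNS_COEF * shortfall
        chance += HOME_DUMMY + (-interaction if literal_signs else interaction)
    return chance


road = extra_drop_chance(10, at_home=False)
home = extra_drop_chance(10, at_home=True)
home_literal = extra_drop_chance(10, at_home=True, literal_signs=True)
print(f"road: {road:.1%}, home: {home:.1%}, home (literal signs): {home_literal:.1%}")
```

Either way, the home debutant's extra chance of being dropped is roughly double the road debutant's.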

What the authors would say is wrong with that is that the significance levels for the last two terms were too low, so we have to drop them. To which I say, nonsense! They look almost exactly as you'd think they would based on your prior expectation of managers not being dumb. If they're not significant, it's because you don't have enough data!

Looking at it in a different way: the authors chose the null hypothesis that the managers' adjustment of HFA is zero. They then fail to reject the hypothesis.

But, what if they chose a contradictory null hypothesis -- that managers' HFA *irrationality* was zero? That is, what if the null hypothesis was that managers fully understood what HFA meant and adjusted their expectations accordingly? The authors would have included a "managers are dumb" dummy variable. The equations would have still come up with 4% for a road player and 10% for a home player -- and it would turn out that the "managers are dumb" variable would not be significant.

Two different and contradictory null hypotheses, neither of which would be rejected by the data. The authors chose to test one, but not the other. Basically, the test the authors chose is not powerful enough to distinguish the two hypotheses (manager dumb, manager not dumb) with statistical significance.

But if you look at the actual equation, which shows that home players are twice as likely to be dropped as road players for equal levels of underperformance -- it certainly looks like "not dumb" is a lot more likely than "dumb".
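To get a feel for how easily a real adjustment of this size can fail to reach significance, here is a rough power sketch. It is a simplified stand-in, not the authors' regression: it treats the 4% and 10% figures as plain drop rates, compares them with an ordinary two-sided z-test for two proportions, and tries a few sample sizes that I made up (I don't know the paper's actual counts). The function name `approx_power` is likewise just for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(p_road, p_home, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two drop rates.

    Uses the normal approximation with an unpooled standard error.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    se = sqrt(p_road * (1 - p_road) / n_per_group
              + p_home * (1 - p_home) / n_per_group)
    z_effect = abs(p_home - p_road) / se
    # Chance the observed z-statistic lands beyond either critical value.
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


# Hypothetical sample sizes, chosen only to show the trend.
for n in (50, 100, 200, 400, 800):
    print(f"n per group = {n:>3}: power ~ {approx_power(0.04, 0.10, n):.2f}")
```

Under these assumptions, with 100 debutants in each group the test would miss a genuine 4%-versus-10% difference more often than it would catch it.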

----

It's like this: suppose I want to sell lots of lottery tickets. So I claim that your chances of winning the Lotto 6/49 jackpot are 1 in 1000. Mathematicians and experts all say that I'm wrong, that the odds are really 1 in 13,983,816. But I don't think that's right, and I have a study to back me up!

I randomly took 500 ticket buyers, and, it turns out, none of them won the jackpot. But I run an analysis on that dataset anyway. And you know what? I find that if the odds truly were 1 in 1000, the chance of nobody winning the jackpot would be 60%. That's really insignificant, not even close to the 5% level required to reject the null hypothesis that the chances are 1 in 1000!

What's the flaw in my logic? Well, technically, there isn't one: it's actually true that the data don't permit me to reject the 1 in 1000 hypothesis. But the data also don't permit me to reject the null hypothesis that the chances are 1 in 10000, or 1 in 100,000, or 1 in 1,000,000, or 1 in 10,000,000, or 1 in 13,983,816, or 1 in 1,000,000,000,000,000,000! Why should I specifically focus on 1 in 1000? Only because I want that one to be true? That's not right.
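Here is that point as a quick calculation, a minimal sketch assuming the 500 tickets are independent. The function name is mine; the sample size and the odds are the ones from the example.

```python
N_BUYERS = 500  # sample size from the lottery example


def chance_no_winner(odds, n=N_BUYERS):
    """Probability that none of n independent tickets wins, at 1-in-`odds`."""
    return (1 - 1 / odds) ** n


for odds in (1_000, 10_000, 1_000_000, 13_983_816):
    p = chance_no_winner(odds)
    verdict = "cannot reject" if p > 0.05 else "reject"
    print(f"1 in {odds:>10,}: P(no winner) = {p:.4f} -> {verdict}")
```

Every one of those hypotheses survives the 5% test, so failing to reject any single one of them tells you very little.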

What I should have done, and what the authors should have done, is calculate a confidence interval. My confidence interval would be 1 in (168, infinity), and the reader would see that, even though 1 in 1000 is in the interval, so is the much more plausible result of 1 in 13,983,816.
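For the lottery case, the interval can be worked out directly. This sketch assumes a one-sided 95% interval and independent tickets: with zero winners out of 500, the largest win probability still consistent with the data is the one at which "nobody wins" has exactly a 5% chance.

```python
N_BUYERS = 500
ALPHA = 0.05

# With zero winners, the largest win probability p still consistent with the
# data at the 5% level solves (1 - p) ** N_BUYERS = ALPHA.
p_upper = 1 - ALPHA ** (1 / N_BUYERS)
odds_lower = 1 / p_upper
print(f"upper bound on p: {p_upper:.5f}, i.e. about 1 in {odds_lower:.1f}")
```

The bound lands a little above 1 in 167, which is the 168 quoted above once you round up to a whole number. Anything from there to infinity is in the interval.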

If the authors of this study had done that, they would have noticed that their confidence interval, which included "managers ignore home field advantage completely", also included "managers are perfectly rational." Not only is the "rational" hypothesis more plausible than the "dumb" hypothesis -- it sure does seem to fit the authors' data a lot better.


13 Comments:

At Sunday, April 18, 2010 4:35:00 PM, Anonymous Peter said...

Thanks for this, what a fantastic analysis of a ridiculously bad paper. It's a particularly egregious and obvious example of something that unfortunately goes on all the time in social science. I had to go to the paper to verify that yes, they really did do what you said they did. This is going to become my go-to example of how statistical significance testing can go horribly wrong and why confidence intervals really are vastly superior.

Another nice little detail is their use of "robust" standard errors. Robust s.e.'s are supposed to prevent Type I errors by making confidence intervals larger. But in this case, it made the analysis *less* conservative by making it easier to get the null finding that the authors wanted!

 
At Sunday, April 18, 2010 6:32:00 PM, Blogger Brian Burke said...

The regressions are horribly constructed. The paper reeks of the kind of overly-complex stuff you'd do to find results you're looking for and hide the results you're not.

The results for bowlers (pitchers) are very different than for batsmen, which to the authors is "puzzling." It doesn't dawn on them that one of the reasons may be their strained regression models. What the results really show is that coaches (or "selectors" as they're awkwardly called) take HFA into account for one position but not the other. Puzzling, indeed.

What's particularly troubling is the entire goal of the paper. The title is "...Cricket...and the Role of Luck in Labor Markets." And the conclusion starts, "How important is luck in labor market outcomes?" The paper is part of the larger movement in modern economics that successful people don't deserve what they earn. We're all just lucky or unlucky. The market doesn't work. People are stupid (except a chosen few elite). And underneath it all is the implication that really, really smart people need to make our decisions for us.

If you think I'm stretching it, consider this little nugget from the introduction: "Nevertheless, public perceptions about the relative importance of luck versus ability can shape societal notions of 'social justice' and 'fairness,' and significantly influence taxation and broader redistributive policies." The goal of the paper is nakedly obvious.

Not to turn this thread too political, but this paper is nothing more than socialist-advocacy research. Guys like this can fool reporters, their fellow PhDs, and even themselves, but they can't fool Phil!

 
At Sunday, April 18, 2010 6:55:00 PM, Blogger Phil Birnbaum said...

I think they found significance for bowlers (although not batters) because bowlers' datapoints have more batting attempts involved.

In baseball, a batter gets 4 PA but a pitcher may be involved in 25 PA or more. It's similar in cricket, where I think a bowler could pitch for 6 or 8 or 10 "wickets" (outs). So the individual bowler matches have (say) eight times as much data as the individual batter matches.

Since the "selectors" have a lot more data to go by when deciding if a bowler will continue to be on the team, it could be that their decisions are more predictable from the data, and that's why the bowler coefficients come out significant.

This is off the top of my head. As Brian says, the analysis in the paper is much more complicated than it has to be. There are much easier and more understandable ways of looking at the data.

 
At Monday, April 19, 2010 10:10:00 AM, Anonymous Anonymous said...

While I think the study itself, and the ideas behind it, are valid (not 'horribly constructed' as Brian feels--the methods are pretty straightforward as far as IV goes, but you can always field a complaint over an exclusion restriction in an IV regression or matching), it seems that what you're worried about is 'accepting the null' rather than 'failing to reject', in which case I'd agree in general that the conclusions made could be overreaching.

I'm hesitant to say anything about the researchers' prior bias, as I know nothing about them (though, Yglesias certainly reveals his interests on his blog), but the first thing you learn in hypothesis testing is that you don't "accept the null". Perhaps you try to explain away why you didn't get a significant result, but those are simply observations.

However, I disagree with this statement:

"If they're not significant, it's because you don't have enough data!"

While it's true that more data with a similar result would give you significance, you should definitely be careful here. With more data, what's to say the effect won't be zero? If you have more data, and the same effect, then you probably will find something significant (which, as you say, certainly depends on the CIs), but adding more data doesn't necessarily make it significant unless those data follow the same pattern/distribution as your smaller dataset (or a larger effect).

Just a pet peeve about the precision of language, but I agree with the idea generally.

It's like when I talk to friends in science fields who complain how their study is ruined because they didn't get a significant result...to which I ask 'well how large was your sample size'...and they answer, "well, we had 6 groups and 4 animals in each group".


-Millsy (Google sign-in doesn't seem to be working)

 
At Monday, April 19, 2010 10:24:00 AM, Blogger Hugh said...

Phil, with all the bad sports economist papers/books and all the seemingly bad links that the Freakonomics blog thinks are good...what do you think that says in general about economists as a whole? Or why is it that sports economists are so bad or make so many mistakes?

 
At Monday, April 19, 2010 11:11:00 AM, Blogger Brian Burke said...

Hugh-They're usually not "sports" economists. They're just economists who are using sports as a neatly constructed laboratory. When I read stuff like this, I wonder if there is any real, valid, correct, or useful economic research out there at all.

And give me a break! Those regressions are about 1000x more complex than necessary. There is nothing wrong with IV regression per se, but it obscures what's really going on inside the model.

Using the interaction term to determine significance can be misleading. Why not just have 2 separate regressions--one for debuting at home and one for debuting away? Then compare the size of the coefficients.

You don't even need regression for what they're trying to do. The paper reminds me of this disaster.

 
At Monday, April 19, 2010 12:54:00 PM, Blogger John said...

I posted this on The Book Blog

Thinking about it a bit more I think an even bigger issue with this study is selection bias. In summary, worse players are more likely to debut at home. This is because on the road it is harder to change players (you select for the entire road trip, whereas at home you select for the next match).

This is compounded by the fact that at home if you are losing (or winning a test series - a set of 3-6 games) for some of the dead rubbers you are more likely to roll the dice on some unproven players. So I’d say if you make your debut on the road the odds are you are a much better player than if you make your debut at home.

So as I see it you also have to control for quality of batter somehow too.

 
At Monday, April 19, 2010 1:27:00 PM, Blogger Millsy said...

Brian,

I'm honestly not an IV fan myself, but I don't think rejecting the model based on the fact that they used it is good practice. I'm not here arguing for the conclusions of the paper, but that their model was a reasonable approach. I'd give a somewhat similar argument about complexity that you do for a lot of panel model regressions where simple dummy variables are perfectly sufficient.

John,

It seems you're implying the IV is actually correlated with the outcome...and you could be right (it's one of the reasons I personally am not an IV fan).

 
At Monday, April 19, 2010 9:04:00 PM, Blogger John said...

Millsy - I don't know much about IV regressions (would love a simple primer if you can link to one).

But back to this selection bias, what is clear in my mind is that better players will debut on the road. Better players are more likely to score more runs, will have a better non-international track record so are less likely to be dropped. In fact you could go so far as to say that very few road players will ever be dropped so would love to see the drop rate for home and road (I can't see it anywhere in the paper).

So in that sense, yes, there is a correlation between the outcome and home/road variables. Based on this your prior HAS to be that it is far more likely to be dropped at home. The burden of proof is on the authors to prove this unequivocally to be not true.

You need a variable to account for quality of hitter (perhaps career domestic batting average at time of debut - or perhaps needs to be related to age too).

Which brings me on to ...

Another issue (this paper really is bad) is that surely you have to control for park effect too. The number of runs scored depends a lot on the quality of the pitch. On a bad pitch, scoring 25 runs might be a *great* result. On a batter's pitch, not so much. What you should be trying to figure out is expected runs given environment. Only by isolating those effects will you capture the true effect of home or road. And once you adjust for run expectation (which I argue you need to do for an individual match) actually the home/road thing is irrelevant.

 
At Tuesday, April 20, 2010 10:37:00 AM, Blogger Millsy said...

John,

Honestly, I think Wikipedia does a good job of explaining IV. I'm not a theoretical econometrician, and understanding the nitty gritty can, as Brian says, be complex. Below are some online lecture notes from Dartmouth:

http://www.dartmouth.edu/~dstaiger/Papers/Staiger%20IV%20Handout.pdf

You certainly know more about cricket than I do (I literally know zero). The authors state their own assumption that debuting at home and away is random, but you're suggesting otherwise. You could be right, and I have no clue. They do try to use some data to defend this assumption, though, finding that the 'top 20 players' ever debut symmetrically home and away, among other fairly rigorous tests for endogeneity on other factors.

In this IV model, it seems that they used home/away as the instrumental variable--or a way to control for the effect of playing at home vs. away.

An instrument is supposed to be correlated with the covariates in the regression (which are endogenous on the error term and have the selection bias problem), but NOT your outcome variable. Finding a viable instrument is difficult, and often isn't valid over time (a good example is that even using month of birth to predict who plays pro hockey doesn't work as an IV, despite the fact that you'd think it's all random).

What they do is use 'debut location' as the instrument for 'debut performance' which 'controls' for unobserved ability when performance doesn't do a great job of predicting by itself (biased by home vs. away debut). They then use this debut performance as the predictor for likelihood of being dropped. So they're trying to do what you say in using ability rather than just performance and make sure that this bias is accounted for when predicting career performance.

I'm not sure the park factor adjustment is necessary, assuming the number of players is randomly distributed around the different fields (home and away). I don't know if that's the case, but I don't think that's too much worry. The same goes for controlling for hitter quality and its effect on debuting away, though you could argue it's extremely heterogeneous. But if we assume that we have the same number of road players at X talent and Y age as home players at X talent and Y age, then it shouldn't be a big deal.

As for the final probit results, it's a simple probit regression. Putting a dummy in the regression isn't much different than running 2 regressions, as Brian suggests, though I would be interested in seeing the regression with and without the interaction term. The IV regression isn't actually a factor in the retention decision probit model, which is good (since using it as a regressor implies it could be correlated with the result).

In the end, I think Phil's point as to 'accepting' the null, rather than failing to reject is a good one. But it's important to remember there's not necessarily enough evidence to say they take it into account by standard statistical practices. Chalking an insignificant result up to being 'luck' is something that is a little fishy to me in general: maybe you just didn't include something important. But that's another story.

Looking a little closer, though, it seems that they do find a significant result for Bowlers for debut location in the probit...in the direction of road debut players more likely to be dropped, and does suggest disproportionate penalization for poor road performance: evidence for their alternative hypothesis, which Phil does not present here.

 
At Tuesday, April 20, 2010 10:52:00 AM, Blogger Hugh said...

So if Economists are not using Econometrics correctly, then who the heck is?!?!?! Is everyone really down on economists in general or are these bad papers we see just something we'd see in any academic field from time to time?

 
At Tuesday, April 20, 2010 12:33:00 PM, Blogger John said...

Thanks Millsy - I now understand how an IV regression works (a little anyway)!

Think this thread is about done but the 20 best players equally debuting home or road is a silly test. I'd expect it to be 'random' for the best players. It is the marginal players that are more likely to be selected at home - and these marginal players are more likely to be dropped.

 
At Wednesday, April 21, 2010 6:59:00 AM, Anonymous andrew said...

Bit of a late-comer to all this, but as an English person, just wanted to add that John's dead right about players' selection likelihood. As pointed out, it also depends on state of the series - if you are losing, you may well choose a new player, but then they may well fail as well, as you are, after all, losing the series against a presumably better team.
There are also form issues, as away series are in the off season, which I guess will affect your form, and when you're home, selectors will also pick the available in-form players. You're also right about the type of pitch.

When I first saw the headline on Freakonomics, I assumed it was about the luck of umpiring decisions, not selection. That's where luck truly plays a part in cricket. A bad decision early on in a career when you may well have gone on to score a century on debut could have massive effects on your career.
Cricket is a statistic-lover's paradise, much as I understand Baseball is in the USA. Most things are recorded in some way. But if you're looking over some past scores, you'll see that the match was in Bangladesh or wherever. You probably won't find out that the batsman was given out in controversial circumstances.

 
