Sabermetric Research: Explaining the sibling study

I promised to revisit Frank Sulloway and Richie Zweigenhaft's sibling study in light of their response to me last week. And I thought it might be best if I start over from scratch.

Today, I'm going to try to just *describe* the study and the results. In a future post, I'll add my own opinions. For now, my presumption is that on everything in this post, the authors and I would agree.

In the original study, the authors' primary finding was that younger baseball players attempt more stolen bases (per opportunity) than their older brothers, with an odds ratio of 10.58. If I explain this right, by the end of this post, it should be clear what the authors did and what the 10.58 figure actually means. I've been a bit unclear on it in the past, but I think I've got it now.

------

The purpose of the study was to check whether younger brothers attempt more stolen bases than older brothers. The authors hypothesized that would be the case. That's because the psychology literature on siblings finds that younger siblings tend to develop a more risk-taking personality than their older siblings, and stolen base attempts are the baseball manifestation of taking risks.

The authors found approximately 95 pairs of siblings for their study. When they compared the younger player to the older in each pair, they found support for their hypothesis. It turned out that the younger brother "beat" the older (in the sense of trying more SB attempts per time on base) 58 times, and the older "beat" the younger only 37 times. That's a .611 winning percentage for the younger brother.

Younger brother: 58-37 (.611)

By my calculation, that's statistically significant at the 5% level (2.15 SDs from .500).
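
For anyone who wants to check that figure, here's a minimal sketch in Python. It assumes the usual normal approximation to a binomial, with every pairing treated as a coin flip under the null hypothesis; that's my reading of where the 2.15 comes from, and the variable names are just for illustration.

```python
import math

# Younger brother's record against the older brother (from the post).
wins, losses = 58, 37
n = wins + losses

# Null hypothesis: each pairing is a 50/50 proposition.
p_null = 0.5
expected_wins = n * p_null
sd_wins = math.sqrt(n * p_null * (1 - p_null))

# How many standard deviations the observed record sits above .500.
z_score = (wins - expected_wins) / sd_wins

print(f"SDs from .500: {z_score:.2f}")
```

Anything past about 1.96 SDs clears the two-tailed 5% bar, which is why this result counts as significant.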

However, for reasons to become clearer later, the authors preferred to give the results in terms of odds. Stated that way, the younger brothers had odds of 58:37, which is odds of 1.57:1.

Younger brother: 1.57:1

However, that's still not quite the way the authors chose to describe results like this one. In both the paper and the response, the authors quote the "odds ratio": the younger player's odds divided by the older player's odds. The younger brother is 1.57:1, and the older brother is obviously the reverse, 1:1.57. If you divide the first odds (1.57/1) by the second (1/1.57), you get 1.57 squared, or 2.46. That means:

The "odds ratio" of younger brother to older brother is 2.46.

This 2.46 does not appear in the actual paper; I'm doing it here for comparison later.
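
The arithmetic is short enough to spell out. This is just a sketch of the calculation described above, using the 58-37 record; the variable names are mine.

```python
# Younger brother's record against the older brother (from the post).
wins, losses = 58, 37

younger_odds = wins / losses    # about 1.57 to 1
older_odds = losses / wins      # the reverse, 1 to about 1.57

# The odds ratio is one set of odds divided by the other,
# which here is the same as squaring the younger brother's odds.
odds_ratio = younger_odds / older_odds

print(f"younger odds: {younger_odds:.2f}:1")
print(f"odds ratio:   {odds_ratio:.2f}")
```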

------

However, the authors argue, the numbers above are misleading. They don't adequately represent the true effects of being an older or younger brother. That's because, they argue, there is a confounding effect not controlled for -- which player got called up first.

The authors and I disagree on whether this is an appropriate control, but both of us agree that controls are often useful and revealing.

Take an unrelated example. Suppose you did a study, and you found that there was an equal proportion of Canadians and Brazilians who were world-class hockey players. Would you be able to conclude that Canadians and Brazilians are generally equal in hockey talent?

No, you wouldn't, because you'd see that Canadians tend to get a lot more practice time than Brazilians. Canadians live in a cold climate, with lots of frozen ponds. And, Canada has a lot more ice rinks per capita than Brazil does. That means Canadians should get a lot more practice than Brazilians.

Given those facts, you'd expect a lot more hockey players from Canada than Brazil. If you find equal numbers, that's evidence that Brazilians must have a higher aptitude for hockey than Canadians, to be able to succeed equally despite fewer opportunities to practice.

More formally, you might control for the number of ice rinks that each group has access to. You'd find that *holding ice rinks constant*, Brazilians are a lot better at hockey than Canadians are. I won't do a numerical example, but you can probably see how this would work.

So, going back to baseball siblings: the authors argue that "getting called up first" is this study's equivalent to "having lots of ice rinks". If you control for that, the odds ratio gets a lot bigger.

To prove that, they took 80 pairs of siblings where one got called up before the other, and they split them up according to whether it was the older or the younger who got called up first. The results were:

-- When the younger brother got called up first, the brother called up first went 5:1.

-- When the *older* brother got called up first, the brother called up first went 32:42.

The authors now calculate the odds ratio as it stands with this control. They divide (5/1) by (32/42), and get 6.56.

What they're saying is something like this: "The raw data makes it look like the odds ratio is 2.46. However, that's because we didn't control for callup order, which is important, just as important as the "access to hockey rinks" control. If we do that, we see that the raw data understates the true odds ratio, which is 6.56."
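
Here's the same calculation as a small Python sketch. The two records are the ones quoted above; the helper function `odds` is only there for illustration.

```python
# (wins, losses) for the brother who was called up first, from the post.
younger_called_up_first = (5, 1)
older_called_up_first = (32, 42)

def odds(wins, losses):
    """Odds in favor, expressed as a single number (wins per loss)."""
    return wins / losses

# Odds ratio with the callup-order control: (5/1) divided by (32/42).
controlled_odds_ratio = odds(*younger_called_up_first) / odds(*older_called_up_first)

print(f"odds ratio with control: {controlled_odds_ratio:.2f}")
```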

------

Putting in the control for callup order is like adding a variable to a regression. Whenever you do that, you check the apparent size of the effect, and also the significance level.

As we just saw, the effect size is fairly large: we went from an odds ratio of 2.46, without the control, to 6.56 with the control.

But what about the significance level? As it turns out, the control variable is not significant.

My simple argument goes like this: the six pairs in the "younger called up first" group went 5-1, or .833. If the control had no effect, we'd expect them to go .588, like the 80-pair sample as a whole (47-33). The chance of a .588 team going 5-1 or better is more than 21%, far higher than the 5% required for significance.
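
That 21% is a plain binomial tail probability. Here's a rough sketch, assuming the six pairs are independent and that the "true" rate is the younger brothers' overall record across the 80 pairs (5 + 42 = 47 wins); the function name is hypothetical.

```python
from math import comb

def prob_at_least(k_min, n, p):
    """Chance of k_min or more wins in n independent tries at win rate p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

# Younger brothers won 5 + 42 = 47 of the 80 pairs, about .588.
overall_rate = 47 / 80

# Chance that a .588 "team" goes 5-1 or 6-0 over six games.
tail = prob_at_least(5, 6, overall_rate)

print(f"chance of 5-1 or better: {tail:.1%}")
```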

The evidence from the authors' paper goes something like this: for their entire sample (which we'll talk about a bit later), the authors wound up with an odds ratio of 10.58 (instead of 6.56). On page 13 of their response, they report a 95% confidence interval of (2.21, 50.73). That is easily wide enough to include the odds ratio that would have resulted if the control had no effect, which is 2.46 for our sample, and a bit higher for the authors' larger sample.

The authors never explicitly say that the "called up first" variable is not statistically significant, but it seems clear that that's the case.

-----

Now: what does an odds ratio actually *mean*? In our example above, where we came up with an odds ratio of 6.56, what does that 6.56 mean? How do we use it in an English sentence?

The answer, I think, is this:

The Vegas odds if you bet on the older brother are 6.56 times higher than the Vegas odds if you bet on the younger brother.

If you bet on the older brother attempting more steals than the younger brother when the older brother is called up first, your odds are 42:32, which is 1.3125 to 1. If you bet on the younger brother attempting more steals than the older brother when the younger brother is called up first, you get 1:5, which is 0.2 to 1. Divide 1.3125 by 0.2, and you get 6.56.

The numbers in the odds are 6.56 times higher, but maybe it's easier to understand that your *winnings* are also 6.56 times higher. Let's check:

If you bet on the older brother, you'll get odds of 42:32. If you bet $32, and the older brother steals more, you'll make a profit of $42.

If you bet the same $32 on the younger brother, you'll get odds of 1:5. That means if the younger brother steals more, you'll make a profit of $6.40.

Divide $42 by $6.40, and you get ... 6.56, as expected.
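
If you'd like to try other stakes, here's a minimal sketch of the betting arithmetic. It assumes a bet at odds of x:y pays a profit of x for every y staked, which is how I've been using the odds in this section; `profit` is a made-up helper.

```python
def profit(stake, odds_for, odds_against):
    """Profit on a winning bet paid at odds_for:odds_against."""
    return stake * odds_for / odds_against

stake = 32

# Older brother called up first: odds of 42:32.
older_profit = profit(stake, 42, 32)

# Younger brother called up first: odds of 1:5.
younger_profit = profit(stake, 1, 5)

print(f"older brother profit:   ${older_profit:.2f}")
print(f"younger brother profit: ${younger_profit:.2f}")
print(f"ratio of winnings:      {older_profit / younger_profit:.2f}")
```

The stake cancels out of the ratio, so any bet size gives the same 6.56.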

------

So that's what odds ratio means in terms of betting. Are there other intuitive ways of explaining it?

The authors don't really give any, but they do agree that it's difficult to interpret. They write,

"Ironically, although odds ratios are often used in an attempt to clarify complex statistical findings, people who are not familiar with them sometimes misinterpret what odds ratios do and do not mean.... [A]n odds ratio of 10.58 ... does not mean that younger brothers attempted 10.58 times the number of steals as did their older brothers. Similarly, this statistic does not mean, as Schwarz mistakenly reported in the New York Times, that more than 90 percent of younger brothers attempted more steals per opportunity than their own older brothers."

The authors write what it *doesn't* mean, but not what it *does* mean. Let me try to tackle that now, in a couple of different ways.

------

The most important thing is that the odds ratio of 6.56 doesn't tell you anything about how much the younger brothers outsteal the older brothers, or vice versa. It only tells you how that ratio *changes* when you swap "younger" for "older".

Again, going back to the actual numbers, which I'll repeat from a few paragraphs ago:

-- When the younger brother got called up first, the brother called up first went 5:1.

-- When the *older* brother got called up first, the brother called up first went 32:42.

What are the odds the younger brother beats the older brother? Well, if the younger brother got called up first, the odds are 5:1. If the older brother got called up first, the odds are 42:32 (1.31:1). So the younger brother occasionally wins 5:1 (6 out of 80 times), and frequently wins 42:32 (74 out of 80 times). Neither of those numbers, alone, has anything to do with 6.56 to 1.

What are the odds the brother called up first beats the other brother? Well, if the younger brother got called up first, the odds are 5:1. If the older brother got called up first, the odds are 32:42 (0.76:1). So the first-callup brother occasionally wins 5:1 (6 out of 80 times), and frequently wins only 32:42 (74 out of 80 times). Again, neither of those numbers, alone, has anything to do with 6.56:1.

The 6.56 only comes in when you divide the 5:1 by the 0.76:1. It's the *difference* between betting on the younger brother and betting on the older brother.

Look at this sentence:

"The _________ brother was called up first. The odds that he beats his sibling are _____:1."

There are two ways to fill in this sentence:

"The younger brother was called up first. The odds that he beats his sibling are 5:1."

"The older brother was called up first. The odds that he beats his sibling are 0.76:1."

What the 6.56 is saying is that, if you switch the words "older" and "younger", you have to divide or multiply the odds number by 6.56 for the sentence to still be true.

That's regardless of what the actual odds are. If, instead of 5:1 and 0.76:1, the odds turned out to be 65.6:1 and 10:1, the odds ratio would *still* be 6.56. Again, the 6.56 represents the *difference* in how the odds change, not what the odds actually are.

------

Odds and odds ratios are used a lot in regressions that try to predict probabilities. If you're trying to figure out how various factors affect cancer survival, for instance, you'll probably use odds. Why? Because, if you use probabilities, you can wind up with something greater than 1, which doesn't make sense.

Suppose you test a new treatment for cancer. If you figure that it doubles your chances of a cure, you're in trouble. Because, suppose a patient presents with a 60% chance of surviving without the new treatment. Do you really want to say that, with the treatment, his chances go up to 120%?

To solve that problem, statisticians use odds, instead. The 60% patient is actually 3:2. Suppose the treatment triples the odds. Now, he's 9:2 to survive, instead of 3:2. That's perfectly OK. 3:2 is 60%, and 9:2 is 82%. Nothing over 100%.
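
Converting back and forth between probabilities and odds is a one-liner each way. This is a minimal sketch of the 60% patient; the function names are mine, not anything from the study.

```python
def prob_to_odds(p):
    """Convert a probability to odds in favor (0.60 -> 1.5, i.e. 3:2)."""
    return p / (1 - p)

def odds_to_prob(odds):
    """Convert odds in favor back to a probability (4.5 -> 9/11)."""
    return odds / (1 + odds)

baseline_prob = 0.60
baseline_odds = prob_to_odds(baseline_prob)   # 3:2

# Tripling the odds gives 9:2, which can never map to more than 100%.
treated_odds = 3 * baseline_odds
treated_prob = odds_to_prob(treated_odds)

print(f"baseline: {baseline_prob:.0%}, treated: {treated_prob:.0%}")
```

However large the multiplier, `odds_to_prob` always returns something below 1, which is the whole point of working in odds.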

When there are two factors, the usual assumption is that you can just multiply out the odds. If chemotherapy doubles the odds, and surgery triples the odds, then the model will say that if you get both treatments, you multiply your odds by 6. (This is the idea behind logit regression -- you just take the log of the odds so you can add linearly, like a regular regression.)
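
To make the parenthetical concrete, here's a rough sketch of the log-odds bookkeeping. The 60% baseline is borrowed from the earlier example and is purely an assumption for illustration, as are the function names; this shows the multiply-the-odds assumption at work, not a fitted regression.

```python
import math

def logit(p):
    """Log of the odds in favor."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Turn log-odds back into a probability."""
    return 1 / (1 + math.exp(-x))

baseline_prob = 0.60            # assumed baseline, odds of 3:2

# On the log scale, "doubles the odds" and "triples the odds" become additions.
chemo_effect = math.log(2)
surgery_effect = math.log(3)

both = inv_logit(logit(baseline_prob) + chemo_effect + surgery_effect)

# Same answer as multiplying the odds by 6: 3:2 becomes 18:2, or 9:1.
print(f"with both treatments: {both:.0%}")
```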

This is done all the time in statistics, but whether it's actually how nature works, I'm not sure. I can't think of an intuitive reason why it *would* work. (And I could easily make up an example where it doesn't.)

Generally, I believe it works well when the probabilities are really small, like when a longshot improves from 2,000,000:1 against to 1,000,000:1 against -- but not when they're bigger, like when the odds go from 1:4 to 2:4. In any case, the technique is used all the time, so there may be something to it that I don't see, and so I certainly won't argue against it.

------

So, in the baseball context, I think the implication of quoting an odds ratio of 6.56 goes something like this:

Suppose you look at two siblings, and evaluate their baseball skills and psychology from head to toe.

Brother A is a little fatter and slower than brother B. And you figure that, if Brother A is older than brother B, and called up first, he has only a 20% chance of attempting more steals than brother B.

But then, you ask, what if Brother A is *younger* than brother B, and still called up first? Well, you start with the original 20% probability, and convert it to odds, giving you 1:4. Then, you multiply by the odds ratio of 6.56. That gives you new odds of 6.56:4, which works out to 62%.

The conclusion: when A is called up first, he may have only a 20% chance to beat a *younger* brother, but he'd have a 62% chance to beat an *older* brother.

If you want, you can start with something other than 20%. Suppose if A is older, he's 1:1 to beat his brother. Then, if A is younger, he's 6.56:1. The probability goes from 50% to 87%.

Or, suppose A is so much faster that even when he's older, he's 9:1. Then, if he's younger, he'd be 59:1. So he'd go from 90% to 98.3%.

Or, just use the example from the actual data. If A is older, he's 32:42. If he's younger, multiply that by 6.56, and you get that he's now 5:1.
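
All four of those examples follow the same recipe: convert to odds, multiply by the odds ratio, convert back. Here's a minimal sketch; `shift_probability` is a hypothetical helper, and the odds ratio is computed from the 5:1 and 32:42 records rather than typed in as 6.56.

```python
# Odds ratio from the post: (5/1) divided by (32/42), about 6.56.
ODDS_RATIO = (5 / 1) / (32 / 42)

def shift_probability(p_if_older, odds_ratio=ODDS_RATIO):
    """Chance A out-attempts his brother if A is the younger one instead,
    given his chance if he were the older one (both called up first)."""
    odds_if_older = p_if_older / (1 - p_if_older)
    odds_if_younger = odds_if_older * odds_ratio
    return odds_if_younger / (1 + odds_if_younger)

# The starting points used in the text: 20%, 50%, 90%, and the actual 32:42.
for p in (0.20, 0.50, 0.90, 32 / 74):
    print(f"{p:.1%} if older -> {shift_probability(p):.1%} if younger")
```

Notice that the same odds ratio moves the probability by very different amounts depending on where you start.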

That's what the 6.56 implies, in brothers-stealing terms.

As an aside, I know I promised no criticism, but here's just one quick thing. In their second paper, the authors write,

"For brothers called up to the major leagues first, a younger brother was 6.56 times more likely than an older brother to have a higher rate of attempted steals."

I think they didn't mean to say that. While the younger player has 6.56 times the odds, he's not 6.56 times more likely. He's only 1.93 times more likely -- 83% (5:1) versus 43% (32:42). The authors do argue elsewhere that odds ratios have to be interpreted carefully, so I think this was just a slip of the tongue.
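
The two numbers come from the same records, just divided differently. A quick sketch, with variable names of my own choosing:

```python
# Records of the brother called up first, from the post.
younger_wins, younger_losses = 5, 1
older_wins, older_losses = 32, 42

# Probabilities: wins divided by total pairs.
p_younger = younger_wins / (younger_wins + younger_losses)   # about 83%
p_older = older_wins / (older_wins + older_losses)           # about 43%

# "Times more likely" compares probabilities; the odds ratio compares odds.
relative_likelihood = p_younger / p_older
odds_ratio = (younger_wins / younger_losses) / (older_wins / older_losses)

print(f"ratio of probabilities: {relative_likelihood:.2f}")
print(f"ratio of odds:          {odds_ratio:.2f}")
```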

------

There's nothing in the method that tells you *how* the 6.56 change happens. Maybe if A is the younger brother, he keeps himself in better shape, so he's not as fat as if he were the older brother. Maybe first siblings tend to be fatter and slower than second siblings for genetic reasons, and so it's because when A is older than B, he's more likely to have been born fatter, and B more likely to have been born skinnier.

Or, maybe the authors' hypothesis is correct: If A is the younger brother, his personality develops him into a bigger risk-taker. If the authors are correct that the personality issue is the primary driver of the 6.56, the implications are that even if A is fatter than B, the personality factor alone is enough to more than compensate, since it can bring him from 20% to 62%, or from 43% to 83%, for psychological reasons alone.

Also, the study's raw data came up with the effect that being the younger brother instead of the older brother increases your chances from 43% to 83%. It does NOT speculate on what happens if the other player you're being compared to is not related to you at all. The study compares "has an older brother" to "has a younger brother", but ignores the "doesn't have a brother at all" case.

Furthermore, the study doesn't consider a pair of siblings unless both of them make the major leagues. But, the authors hypothesize that the risk-taking strategy is adopted in childhood. If that's the case, you'd think that it doesn't matter if one of the brothers doesn't make it past the low minor leagues; the effect should still be there. Probably, even if one brother never made it out of Little League, the effect should still exist.

That would make an interesting follow-up study.

------

I've been talking about an odds ratio of 6.56, but the authors' conclusion is an even higher odds ratio, 10.58. How do they get the 10.58?

It's a combination of the 6.56, and two other cases. One other case is where the brothers got called up the same year (giving an odds ratio of 7.00). The third case is almost exactly the same as the first case, but slightly different because of the way the authors dealt with families of more than two siblings. For that third case, they wound up with a "younger brother called up first" ratio of 5:0, instead of 5:1. That works out to "infinity".

So the authors combined the three odds ratios:

6.56
7.00
infinity

How do you combine an "infinity"? Well, they used the "Mantel-Haenszel common odds ratio" technique, which is able to handle ratios with a zero denominator. They wound up with an overall odds ratio of 10.58.
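
To see why a zero doesn't break things, here's a rough sketch of the standard Mantel-Haenszel estimator. It is not a reproduction of the authors' calculation: I don't have the full tables for the other two cases, so the second stratum below is MADE UP purely to show a 5:0 cell being handled. Only the first stratum (5:1 and 32:42) comes from the post.

```python
def mantel_haenszel(strata):
    """Common odds ratio across 2x2 tables, each given as (a, b, c, d):
    a, b = wins, losses when the younger brother was called up first
    c, d = wins, losses when the older brother was called up first."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

known_stratum = (5, 1, 32, 42)      # from the post; odds ratio 6.56 on its own
made_up_stratum = (5, 0, 30, 40)    # HYPOTHETICAL counts, not from the paper

# With one stratum, the estimator is just the ordinary odds ratio.
single = mantel_haenszel([known_stratum])

# The zero cell adds nothing to the denominator, but the sum stays positive,
# so the combined ratio is finite even though 5:0 alone would be "infinity".
combined = mantel_haenszel([known_stratum, made_up_stratum])

print(f"single stratum: {single:.2f}")
print(f"combined:       {combined:.2f}")
```

The sums are taken across strata before dividing, which is how an "infinity" can be pooled with finite ratios.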

So that's where the "10 times" in the original NYT article comes from, and that's where the "10.58" in the paper and the response comes from.

------

There you go. As I said, I think it's right this time. Let me know if you find anything that needs correcting. If all is OK, I'll eventually prepare a response to the authors' response, outlining more clearly why I disagree with some of their conclusions.

Labels: baseball, baserunning, siblings

Tuesday, November 23, 2010
