Wednesday, May 26, 2010

Do younger brothers steal more bases than older brothers?

Alan Schwarz's latest "Keeping Score" column, which appeared in last Sunday's New York Times, quotes an academic study that found startling sibling effects among baseball players.

A hypothesis in psychology is that younger siblings exhibit riskier behavior than older ones, "perhaps originally to fight for food, now for parental attention." If that's the case, you'd expect younger brothers to attempt more stolen bases (baseball's equivalent of risky behavior) than older ones.

In a just-published academic study, psychologists Frank J. Sulloway and Richard L. Zweigenhaft checked that, and found evidence supporting their hypothesis to a very significant degree: a full 90 percent of younger brothers outstole their older siblings!

That's astonishing, to find an effect that large. Since the study isn't available online, I tried to reproduce it. I didn't get 90% -- I got 56%. Which makes a lot more intuitive sense.

Here's what I did. I went to this Baseball Almanac page, which lists all brother combinations in history. I downloaded their list and fixed the spellings as best I could. Then I eliminated:

(a) all sets of twins;
(b) all sets of brothers where either or both was born before 1895 (Babe Ruth's birth year);
(c) all sets of brothers where one or both was primarily a pitcher;
(d) all sets of brothers who had identical SB rates (always both zero, I think).

That left 114 sets of batting brothers. I then computed their rate of SB per (1B + BB), to see which brother tended to steal more bases. (The original study used H+BB+HBP instead of 1B+BB, but I don't think that would affect the results much.)
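For anyone who wants to redo the comparison, here is a minimal Python sketch of the rate calculation. The two career lines are invented placeholders, not real players, and the class and function names are my own. Singles are derived as H - 2B - 3B - HR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CareerLine:
    """Career totals for one batter (hypothetical structure)."""
    hits: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    stolen_bases: int


def sb_rate(line: CareerLine) -> float:
    """Stolen bases per time on first base, approximated as 1B + BB."""
    singles = line.hits - line.doubles - line.triples - line.home_runs
    times_on_first = singles + line.walks
    return line.stolen_bases / times_on_first if times_on_first else 0.0


def younger_outstole(older: CareerLine, younger: CareerLine) -> Optional[bool]:
    """True or False for a pair; None when the rates tie (rule (d) drops those)."""
    older_rate, younger_rate = sb_rate(older), sb_rate(younger)
    if older_rate == younger_rate:
        return None
    return younger_rate > older_rate


# Invented career totals, for illustration only.
older = CareerLine(hits=1000, doubles=150, triples=30, home_runs=100,
                   walks=400, stolen_bases=50)
younger = CareerLine(hits=500, doubles=80, triples=20, home_runs=20,
                     walks=150, stolen_bases=40)
print(round(sb_rate(older), 4), round(sb_rate(younger), 4),
      younger_outstole(older, younger))
```

Counting the pairs where `younger_outstole` returns True, out of all pairs where it is not None, gives the percentage I quote.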

64 of the 114 younger brothers outstole their older siblings. Since random chance would be 57, I don't think there's an effect there. It's 1.3 SD above expected.
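The 1.3 SD figure comes from treating each pair as a coin flip. A quick sketch of the arithmetic, assuming independent pairs and a 50/50 chance under the null hypothesis (variable names are my own):

```python
import math

n_pairs = 114        # sets of batting brothers
younger_wins = 64    # pairs where the younger brother had the higher SB rate
p_null = 0.5         # chance the younger brother "wins" if birth order is irrelevant

expected = n_pairs * p_null
sd = math.sqrt(n_pairs * p_null * (1 - p_null))  # binomial standard deviation
z = (younger_wins - expected) / sd

print(f"expected={expected:.0f}, sd={sd:.2f}, z={z:.2f}")
# expected=57, sd=5.34, z=1.31
```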

I have no idea if I did something wrong, or if the authors of the study did something wrong. I'm betting it's them, just because 90% is kind of outrageous.

My data, in a not particularly easy-to-read format, is here.


UPDATE: The authors' study appears to have been a regression where:

" ... several other factors were considered, like age differences, body size and even the order in which the players were promoted to the majors."

Still, it seems unlikely that those factors would raise the rate from 56% to 90%.


UPDATE: Here are the career batting lines for both groups, divided by 1000:

------ AB -R --H 2B 3B HR RBI BB SB -avg RC/G
Young 444 58 118 20 05 06 051 36 10 .267 4.22
-Old- 538 75 145 24 05 11 068 48 12 .270 4.55

And per 600 PA (differences in rate stats due to rounding):

------ AB -R --H 2B 3B HR RBI BB SB -avg RC/G
Young 554 73 148 25 06 08 064 46 13 .267 4.24
-Old- 550 76 149 25 06 11 070 50 12 .271 4.62

So the older brothers were a bit better than the younger brothers, although the younger ones stole bases at a slightly higher rate.
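As a rough check on the stolen base claim, the same SB per (1B + BB) rate can be computed from the group totals in the first table. This is only a sketch: the totals are rounded to thousands, so the rates are approximate, and it is a group-level rate rather than the pair-by-pair comparison.

```python
# Group totals from the first table above (in thousands).
young = {"H": 118, "2B": 20, "3B": 5, "HR": 6, "BB": 36, "SB": 10}
old = {"H": 145, "2B": 24, "3B": 5, "HR": 11, "BB": 48, "SB": 12}


def sb_per_time_on_first(line):
    """SB divided by (singles + walks), with singles = H - 2B - 3B - HR."""
    singles = line["H"] - line["2B"] - line["3B"] - line["HR"]
    return line["SB"] / (singles + line["BB"])


young_rate = sb_per_time_on_first(young)
old_rate = sb_per_time_on_first(old)
print(f"Young {young_rate:.3f}, Old {old_rate:.3f}")
# Young 0.081, Old 0.078
```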



At Wednesday, May 26, 2010 1:49:00 PM, Anonymous Guy said...

Phil: the idea that younger brothers attempt steals at ten times the rate of older siblings is, obviously, absurd on its face. The summary also notes that younger brothers are much better hitters. Is it possible that there are a lot of pairs in which the older brother is a pitcher (and thus steals far less)?

I suppose they could have done matched pairs, in which brothers are compared year by year. In that case, the younger brother would be - duh - younger in each year, and thus would steal more.

But it probably isn't worth trying to figure out, as they've obviously made one or more huge errors somewhere.....

At Wednesday, May 26, 2010 1:52:00 PM, Blogger Phil Birnbaum said...

It wasn't that they stole at 10 times the rate, was it? It was that they had a 9 in 10 chance of having a higher rate than their brother.

I should check if the younger brothers are better hitters overall. Will do that tonight.

The matched pairs theory makes sense. I agree with you; they probably did something way too fancy and didn't understand what it meant.

At Wednesday, May 26, 2010 1:56:00 PM, Anonymous Guy said...

The abstract says: "Consistent with their greater expected propensity for risk taking, younger brothers were 10.6 times more likely to attempt the high-risk activity of base stealing and 3.2 times more likely to steal bases successfully (odds ratios). In addition, younger brothers were significantly superior to older brothers in overall batting success, including two measures associated with risk taking."

They also had more than 700 players in their study, compared to your 228. So I don't know who they were looking at, but it must have included a lot of pitchers.
Would be interesting (but surprising) if older brothers were more likely to be pitchers.

At Wednesday, May 26, 2010 2:02:00 PM, Blogger Phil Birnbaum said...

Wow! But an odds ratio of 11 might just mean that 90% of older brothers attempted to steal (odds of 90:10 = 9), but 99% of younger brothers attempted to steal (odds of 99:1 = 99), for a ratio of 99/9 = 11.
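To make that concrete, here is the 90% versus 99% example worked out in Python. The two percentages are the made-up ones from this comment, not figures from the study. It shows how a large odds ratio can sit on top of a small difference in probabilities.

```python
def odds(p: float) -> float:
    """Convert a probability into odds in favour."""
    return p / (1.0 - p)


p_older, p_younger = 0.90, 0.99   # illustrative probabilities only

odds_ratio = odds(p_younger) / odds(p_older)
probability_ratio = p_younger / p_older

print(f"odds ratio = {odds_ratio:.1f}, probability ratio = {probability_ratio:.2f}")
# odds ratio = 11.0, probability ratio = 1.10
```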

If that's what they mean, then, yuck. In any case, yuck.

It looks like they didn't eliminate pitchers. When I leave pitchers in the study, I get ... 49%: 86 of 175 pairs. When I leave in pitchers and old players, I get 42% (122 of 291 pairs).

Also, when there were more than two brothers, I consider only the youngest and oldest.

At Wednesday, May 26, 2010 2:19:00 PM, Blogger Phil Birnbaum said...

Oh, and thanks, Guy, I hadn't noticed the abstract. I saw a bunch of white space under the title and didn't realize there was an abstract if I scrolled down.

At Wednesday, May 26, 2010 4:33:00 PM, Anonymous Guy said...

It would be very surprising if younger brothers outhit their older siblings, since family pedigree should give a few of the younger brothers a better chance to play in the majors than is justified by their prior performance alone. And, your data suggests they don't.

At Wednesday, May 26, 2010 7:25:00 PM, Blogger Hawerchuk said...

Just fyi - the first author's prior work includes:

"Psychology. Birth order and intelligence. Sulloway FJ."

At Thursday, May 27, 2010 7:41:00 PM, Anonymous Nick Steiner said...

I don't understand why the authors would possibly need to run a regression for this type of study. It's so simple.

At Friday, May 28, 2010 9:19:00 AM, Anonymous Paercival said...

Sulloway is no stranger to manipulating data to fit his agenda. A brief intro to that idea can be read here.

At Friday, May 28, 2010 9:42:00 AM, Blogger Phil Birnbaum said...

Does anyone have access to the actual study so we can see what's going on? I'd hate to have to pay $30 for it.

At Friday, May 28, 2010 11:40:00 AM, Blogger BMMillsy said...

Hey Phil,

The first difference I see is that their sample size was much larger (472 batters who had a brother that also played in MLB). I have no clue what the discrepancy is there, since they also used Baseball Almanac (and they have 228 pitchers as well...perhaps they coded their pitchers wrong when comparing position players and accidentally included them?).

It looks like there could be a bias in using the simple data you use in that the researchers found that the brother called up first usually played a much longer career (6.6 times more likely to be called up first).

Assuming the first called player is better, and controlling for the talent of the player and the expected SB total based on that talent, I imagine you could get a slightly larger effect with the right information...but I agree that being 10x as likely to steal is an oddly high number...especially when compared to their meta-analysis of sport participation that found:

"The odds of laterborns engaging in such activities (high-risk sports) were 1.48 times greater than for firstborns (N = 8,340)."

They use a Principal Components Analysis at one point, which I would actually question pretty heavily for a number of reasons: they only report 2 of the components in their table, which account for only 52% of the variance in the data (the third was only 18% more). The one they don't specifically report in the table for the PCAs is what they call 'base-stealing proclivity'. A strange exclusion...the results of the PCA itself are fairly interesting when talking about risk-taking...but here I don't think it was necessary...and they don't discuss it very heavily.

And this sentence is weird:

"Younger brothers were also 4.2 times more likely to be caught stealing. In spite of being caught stealing more often, younger brothers were 3.2 times more likely than older brothers to successfully complete their stealing attempts."

I'm guessing they're just talking about count data there, so younger brothers steal more bases overall, and get caught more times. But that's a strange way to put it...perhaps an indication that...well...they're just faster runners for whatever reason.

Anyway, I don't believe what they're necessarily telling me either, but I've only had time for a quick glance.

At Friday, May 28, 2010 11:43:00 AM, Blogger Phil Birnbaum said...

I've e-mailed a couple of anonymous people to ask if they can send a copy of the study ... hopefully someone is willing to send one along.

At Friday, May 28, 2010 12:42:00 PM, Anonymous Anonymous said...

I suspect, based on reading the account of Sulloway's work on other birth order studies, that he counted each year of MLB playing as a separate pair of brothers' experience - i.e., if brothers each played 7 years they constituted 7 pairs of brothers in his counting method.

At Sunday, May 30, 2010 12:09:00 PM, Blogger Driver said...

A small point that shouldn't have a large effect on the data, but my understanding of the Times article is that they were talking about attempted steals, whereas you seem to be using SB.

At Sunday, May 30, 2010 12:12:00 PM, Blogger Phil Birnbaum said...

Driver: good point, but CS data is unavailable for the earlier parts of MLB history ... at least in my database. I suppose I could have included them where they were available for the full career of both brothers ...

At Sunday, May 30, 2010 5:11:00 PM, Anonymous Anonymous said...

younger vs older brothers--why don't you get a life--this isn't baseball, it's mental masturbation

At Thursday, June 03, 2010 2:47:00 AM, Blogger bradluen said...

Just glommed the paper. This is from their Table 2:

Stealing attempts per on-base opportunity:
Older brothers (n = 88): 0.056
Younger brothers (n = 97, includes middle brothers): 0.093
Odds ratio: 10.58 to 1

(I think the population here is brothers who are both position players and for whom stealing attempt data are available.)

The odds ratio I naively calculate from the older-younger estimates (which are from an ANOVA) is 1.72, which is obviously not 10.58.
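The naive calculation can be reproduced from the two rounded rates quoted from Table 2. A sketch, treating each rate as a probability of attempting a steal per on-base opportunity:

```python
def odds(rate: float) -> float:
    """Convert a per-opportunity rate into odds."""
    return rate / (1.0 - rate)


older_rate = 0.056     # older brothers, from Table 2 as quoted above
younger_rate = 0.093   # younger brothers, from Table 2 as quoted above

naive_odds_ratio = odds(younger_rate) / odds(older_rate)
print(f"naive odds ratio = {naive_odds_ratio:.2f}")
# naive odds ratio = 1.73
```

With the rates rounded to three decimals this comes out at about 1.73, in line with the 1.72 above and nowhere near 10.58.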

Their work is "controlled for call-up sequence": they're stratifying as "called up first"/"called up in same year"/"called up later". Their "odds ratio" is actually a Mantel-Haenszel estimator. I've never used the Mantel-Haenszel thingie in practice, so have no idea if it could be responsible for the huge difference between my odds ratio and their odds ratio. I do wonder how big the "called up in same year" group is.
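For reference, the Mantel-Haenszel estimator pools stratified 2x2 tables as sum(a*d/n) / sum(b*c/n). Below is a minimal Python sketch. The counts are invented purely to show the mechanics and have nothing to do with the paper's data; the stratum labels are only there to mirror the call-up idea. It shows that a stratified estimate can land far from the odds ratio of the collapsed table, which says nothing about whether that is what happened in the study.

```python
from typing import List, Tuple

# One 2x2 table per stratum: (a, b, c, d), where
#   a = younger, outcome yes    b = younger, outcome no
#   c = older,   outcome yes    d = older,   outcome no
Table = Tuple[int, int, int, int]


def pooled_odds_ratio(strata: List[Table]) -> float:
    """Odds ratio after collapsing all strata into a single table."""
    a, b, c, d = (sum(t[i] for t in strata) for i in range(4))
    return (a * d) / (b * c)


def mantel_haenszel_odds_ratio(strata: List[Table]) -> float:
    """Mantel-Haenszel estimate: sum(a*d/n) / sum(b*c/n) over strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Invented counts, for illustration only.
strata = [
    (9, 1, 30, 30),   # hypothetical "called up first" stratum
    (20, 40, 1, 9),   # hypothetical "called up later" stratum
]

print(f"pooled = {pooled_odds_ratio(strata):.2f}, "
      f"Mantel-Haenszel = {mantel_haenszel_odds_ratio(strata):.2f}")
# pooled = 0.89, Mantel-Haenszel = 6.43
```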

At Thursday, June 03, 2010 9:30:00 AM, Blogger Phil Birnbaum said...

Thanks, Bradluen ... someone actually sent me the paper a couple of days ago, and I'm going to post again soon.

Younger players who are called up first are obviously much better players than their older brothers: they are called up earlier DESPITE being younger. In my (incomplete) sample, 7 of 8 such younger brothers had more steal attempts than their older brothers.

I can see THAT odds ratio being 10.58 to 1, since my partial sample (I don't have CS for all years) gave me 7 to 1.

