Sunday, November 25, 2007

The "K" study: now reproducible

OK, here's another update on the "Dave Kingman strikes out a lot because his name starts with K" study.

Last post, I mentioned that I reran the study and couldn't duplicate the results. In a comment there, Joe Arthur mentioned that the study actually used first name, while I used last name. (Thanks, Joe!) So I reran the study for players whose first name began with K.

There, I was able to duplicate the difference in strikeout rates. The "K" players did indeed strike out more often than the non-Ks, by about 1.8 percentage points:

14.7% for K players (37096/252439)
12.9% for the others (1365946/10607440)
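
As a quick arithmetic check, the two rates and the gap between them can be recomputed from the raw counts quoted above. This is only a sketch: the variable names are made up for illustration, and the denominators are assumed to be at-bats, which the counts themselves don't state.

```python
# Counts quoted above: strikeouts and (assumed) at-bats for each group.
k_so, k_ab = 37096, 252439              # first name starts with K
other_so, other_ab = 1365946, 10607440  # everyone else

k_rate = k_so / k_ab
other_rate = other_so / other_ab

# Difference in percentage points, not percent.
gap_points = (k_rate - other_rate) * 100

print(f"K players: {k_rate:.2%}")
print(f"Others:    {other_rate:.2%}")
print(f"Gap:       {gap_points:.2f} percentage points")
```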

I didn't recheck significance levels, but my guess is that the difference is about 3 SDs.

However, the difference can be fully explained by the fact that first names starting with K are more popular now than they were 50 years ago. So proportionally more of the "K" hitters played in high-strikeout eras.

Go to the "Baby Name Voyager," choose "boys" only, and enter "K". You'll see a consistent rise in K names from the 19th century to the end of WWII -- about 10 times as many "K" boys at the end than at the beginning. But then, they accelerate upward even faster, doubling again by the late 1960s before dropping a little bit after that. (Most of the post-war effect, by the way, seems to be concentrated in "Kevin." Which makes sense; I can't think of any really old guys named Kevin. Or Kyle, for that matter.)

If you average out the calendar seasons played by Ks, you get 1977. If you average the seasons played by non-Ks, it's 1963.

I think this accounts for the entire effect. Here are the strikeout rates for Ks vs. the non-Ks by decade (players with 100+ AB):

1910s: Ks 9.2%, non-Ks 8.5% (starting 1913)
1920s: Ks 7.0%, non-Ks 6.3%
1930s: Ks 8.7%, non-Ks 7.5%
1940s: Ks 8.0%, non-Ks 8.2%
1950s: Ks 12.3%, non-Ks 10.2%
1960s: Ks 14.3%, non-Ks 13.6%
1970s: Ks 12.5%, non-Ks 12.8%
1980s: Ks 13.0%, non-Ks 13.6%
1990s: Ks 15.5%, non-Ks 15.5%
2000s: Ks 17.1%, non-Ks 16.3% (up to 2003)

1910s to 1950s: Ks 8.8%, non-Ks 8.1%
1960s to 2000s: Ks 14.2%, non-Ks 14.2%

Once you normalize by decade, the effect all but disappears. From 1960 to 2003, the rates are exactly the same. There does appear to be a small "K" effect from 1913 to 1959, but it almost certainly is not statistically significant.
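
For a rough feel of what comparing within decades does, the decade table above can be turned into per-decade gaps and averaged. This sketch uses only the rates listed in the table and gives every decade equal weight, which is a simplifying assumption made here; the two-era summary lines are presumably weighted by playing time, so the result won't match them exactly.

```python
# Strikeout rates (%) by decade, copied from the table above: (Ks, non-Ks).
rates = {
    "1910s": (9.2, 8.5),
    "1920s": (7.0, 6.3),
    "1930s": (8.7, 7.5),
    "1940s": (8.0, 8.2),
    "1950s": (12.3, 10.2),
    "1960s": (14.3, 13.6),
    "1970s": (12.5, 12.8),
    "1980s": (13.0, 13.6),
    "1990s": (15.5, 15.5),
    "2000s": (17.1, 16.3),
}

# Gap within each decade, in percentage points (Ks minus non-Ks).
gaps = {decade: k - other for decade, (k, other) in rates.items()}

# Unweighted mean of the within-decade gaps (assumption: equal weight per decade).
mean_gap = sum(gaps.values()) / len(gaps)

for decade, gap in gaps.items():
    print(f"{decade}: {gap:+.1f}")
print(f"Mean within-decade gap: {mean_gap:+.2f} points")
```

With equal weights, the average within-decade gap comes out to roughly half a percentage point, well below the pooled gap. A more careful version would weight each decade by at-bats, which the table doesn't provide.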

But maybe the authors did correct for this, or did something different. We can check for sure when the study comes out.



At Sunday, November 25, 2007 6:41:00 PM, Anonymous Matt Dobra said...

Seems like the easiest way to test that is to run a fixed effects panel regression with an additional dummy for k names. If the k dummy is significant, you have a K effect.

At Monday, November 26, 2007 12:12:00 AM, Anonymous joe p said...

You can download the paper here. Pages 4 and 5 are the baseball part.

At Monday, November 26, 2007 12:49:00 AM, Blogger Phil Birnbaum said...

Thanks, joe p ... unfortunately, when I try to download the paper, I get a "not found" error.

At Monday, November 26, 2007 9:32:00 AM, Anonymous drmarcey said...

It seems like part of the link got cut off. It should have an abstract_id of 946249 instead of 94, making the link (when you put the two lines together with no space)

According to the paper, the researchers acknowledged an increase in both the frequency of strikeouts and the number of K-initialed players. It would be interesting to see how they controlled for average year of play as they say they did.

At Monday, November 26, 2007 9:36:00 AM, Blogger Phil Birnbaum said...

Thanks, got it now. I had the whole abstract ID, but I guess the system was down last night or something.

At Friday, November 30, 2007 5:30:00 PM, Blogger Oilman said...

If anything good can come from this, it's that Roger Clemens's arrogance has doomed his kids (Koby, Kory, Kacy, and Kody) to be marginal players at best... finally, karma strikes a small blow against that ass
