Sunday, December 23, 2007

A Retrosheet-based fielding evaluation method

UPDATE: After posting this as Dan Fox's method, Joe Arthur noted in the comments that Sean Smith previously introduced much the same method back in April. See Joe's link in the comments. As far as I can tell, it looks like Sean deserves credit for the method, and Dan for the improvements. I've changed the title, but the rest of the post remains the same.


Dan Fox has a significant new fielding evaluation method, which he explains over at Baseball Prospectus. (If that link is subscriber-only, try this post from Dan's own blog.)

How good your fielding stat is depends in large part on how much data you have. If you're only using "Baseball Encyclopedia" data, you've got range factor, or, more notably, Bill James' improved range-factor-type stat as explained in "Win Shares." If you've got full information of what "zone" of the field the ball covered, you can calculate a zone rating, the percentage of plays within reach of the defender that he actually turned into outs.

Fox's new stat is in the middle. It doesn't use full observational data, like Zone Rating, but it does use the play-by-play data found at Retrosheet. Specifically, it uses Retrosheet's rough description of the type of hit, and where on the diamond it went. I wish I had thought of it myself. I think it's got to be close to the best possible evaluation given the limit of publicly available data. Back in Win Shares, Bill James said, about his new method,

"this is the best sabermetric work I have done in the last ten years, maybe twenty ... I feel I have made some ... breakthroughs which will certainly lead, when other people see the work I have done and apply their own abilities to the issues I have raised, to even more and even larger advances."

My feeling is that Dan's method is the next advance above Bill's. I don't think Dan explicitly started with Bill's method, but wound up where he did through other, more obvious, paths. But, perhaps because of my experience with Bill's method (we adapted it for the eighth edition of Total Baseball), my first reaction was to see Dan's work as an extension of Bill's, so that's how I'll explain it.

In Win Shares, Bill noted that ordinary range factor (plays made per team game) suffers from a major flaw: the composite range factor for a team will always be around (27 minus strikeouts). No matter how bad a team's defense, it will keep getting chances to make plays until it's made 27 outs total. And no matter how good its defense, once it makes 27 outs, even 27 super-spectacular Ozzie-Smithian or Masafuri-Yamamorian outs, it has to stop playing.

So Bill's insight was that you have to adjust range factor to take into account plays *not* made. If team A has a "defensive efficiency ratio" of 68% (meaning it makes an out on 68% of the balls in play), but team B has a DER of 72%, then team B's players are actually making about 6% more plays per ball in play than team A's (72/68 is about 1.06). You can't tell, just from the DER, which players are responsible for that extra 6%. But, as a first approximation, if you're comparing A's shortstop to B's shortstop, you can start by devaluing A's numbers by 6%. That may be wrong, of course. The team's 6% shortfall could be due to a bad second baseman, or left fielder, or first baseman, or some combination – the shortstop could be perfectly good, or even spectacular. But in the absence of any evidence either way, you assume the 6% reduction. In effect, A's man handled the same number of chances *per game* as B's man – but 6% fewer chances *per ball in play*. And the latter figure is what really matters when evaluating fielding.
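To make the arithmetic concrete, here is a rough Python sketch of that first approximation. It is not Bill James's actual Win Shares formula. The function names, the 20 outs on balls in play per game (27 minus an assumed 7 strikeouts), and the 500 plays for the shortstop are made-up illustration values, not figures from either article.

```python
def balls_in_play_needed(outs_on_balls_in_play: float, der: float) -> float:
    """Balls in play a defense must face to record the given number of outs."""
    return outs_on_balls_in_play / der


def der_adjusted_plays(plays: float, own_der: float, other_der: float) -> float:
    """Scale a fielder's raw plays by his team's DER relative to another team's."""
    return plays * own_der / other_der


# Assumption for illustration: 27 outs minus 7 strikeouts per game.
OUTS_PER_GAME = 20.0

bip_a = balls_in_play_needed(OUTS_PER_GAME, 0.68)  # team A, DER of 68%
bip_b = balls_in_play_needed(OUTS_PER_GAME, 0.72)  # team B, DER of 72%

# Both teams record the same outs per game, but team A needs more balls
# in play to get there, so its fielders make fewer plays per ball in play.
print(f"Team A balls in play per game: {bip_a:.2f}")
print(f"Team B balls in play per game: {bip_b:.2f}")
print(f"B's plays per ball in play relative to A: {bip_a / bip_b:.3f}")

# Hypothetical shortstop on team A with 500 raw plays, devalued to team B's scale.
print(f"A's shortstop, adjusted: {der_adjusted_plays(500, 0.68, 0.72):.1f}")
```

The point of the sketch is only that equal plays per game can hide unequal plays per ball in play. Which fielder is actually responsible for the gap is exactly what the DER alone can't tell you.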

What Dan Fox now adds is information that tries to figure out how much of that 6% difference should actually be attributed to the shortstop. He uses what Retrosheet provides: the type of hit (line drive, fly, ground), and who fielded the ball. Bill James used, as the denominator, all balls in play for the entire team. Dan improves the stat by limiting the denominator to balls hit near the defender.

For shortstops, Fox considers all the ground balls fielded by the shortstop or left fielder, and half the ground balls fielded by the center fielder; all the line drives fielded by the shortstop, all the line drive hits fielded by the left fielder, and half the line drive hits fielded by the center fielder. For all those balls, he figures, there's at least a chance the shortstop could have handled them. But for balls hit to the right fielder, there was no chance, so they shouldn't affect your evaluation of the shortstop. The Bill James method doesn't have a breakdown of where the balls went, so it has to assume that every fielder had a chance at every ball.
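The shortstop denominator described above can be sketched in a few lines of Python. This is my own reading of the weights in the previous paragraph, not Dan's or Sean's actual code. The `BallInPlay` record, its field names, and the sample plays are hypothetical and do not follow Retrosheet's real event-file layout. The sketch also ignores the later refinements (extra-base hits, batter handedness, and the different difficulty of line drives versus ground balls).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BallInPlay:
    ball_type: str  # "G" ground ball, "L" line drive, "F" fly ball
    fielder: int    # scorer's position number: 6 = SS, 7 = LF, 8 = CF, 9 = RF
    is_hit: bool    # True if the batter reached on a hit


def shortstop_weight(ball: BallInPlay) -> float:
    """Share of this ball counted as a shortstop opportunity."""
    if ball.ball_type == "G":
        # All grounders fielded by SS or LF, half of those fielded by CF.
        return {6: 1.0, 7: 1.0, 8: 0.5}.get(ball.fielder, 0.0)
    if ball.ball_type == "L":
        if ball.fielder == 6:
            return 1.0  # every line drive the shortstop fielded
        if ball.is_hit:
            # Line drive hits only: all to LF, half to CF.
            return {7: 1.0, 8: 0.5}.get(ball.fielder, 0.0)
    return 0.0  # fly balls, balls to right field, outfield line drive catches


def shortstop_chances(balls: list[BallInPlay]) -> float:
    """Weighted count of balls the shortstop had some chance to handle."""
    return sum(shortstop_weight(b) for b in balls)


def shortstop_rate(balls: list[BallInPlay]) -> float:
    """Outs recorded by the shortstop per weighted chance."""
    chances = shortstop_chances(balls)
    outs = sum(
        1 for b in balls
        if b.fielder == 6 and not b.is_hit and b.ball_type in ("G", "L")
    )
    return outs / chances if chances else 0.0


# Made-up sample, for illustration only.
sample = [
    BallInPlay("G", 6, False), BallInPlay("G", 6, False), BallInPlay("G", 6, False),
    BallInPlay("G", 7, True),   # grounder through to left: full chance
    BallInPlay("G", 8, True),   # grounder up the middle: half chance
    BallInPlay("L", 6, False),  # liner caught by the shortstop
    BallInPlay("L", 7, True),   # line drive hit to left: full chance
    BallInPlay("L", 7, False),  # liner caught by the left fielder: not counted
    BallInPlay("G", 9, True),   # grounder to right: not counted
    BallInPlay("F", 8, False),  # fly out to center: not counted
]

print(f"chances = {shortstop_chances(sample)}, rate = {shortstop_rate(sample):.3f}")
```

Compared with a team-wide DER, the only thing that changes is the denominator: balls to right field and routine fly balls simply drop out of the shortstop's evaluation.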

Put another way: whereas Bill James considered that the shortstop is exactly as good or bad as the DER for the team, Dan figures a specific DER for the shortstop by considering only balls in play that are roughly in his area. Also, he takes into consideration the fact that line drives are more likely to be unfieldable than ground balls, so a shortstop who sees a lot of line drive hits past him will have a better rating than one with the same number of hits, but more of those on ground balls.

Of course, the method is not perfect. For one thing (as Dan notes), a third-baseman who covers a lot of ground to his left will make a shortstop look good. For another thing, not all ground balls handled by the left fielder were actually balls the shortstop could have got to. (However, in later revisions, Dan addresses this point by, for instance, tweaking his formula to assume any doubles to left were actually solely the third baseman's responsibility. There are other excellent tweaks too, such as updating the "50%" figure depending on the handedness of the batter.)

As an empirical test, there's a pretty good correlation between this method and the state-of-the-art, watch-where-every-ball-actually-goes, "Ultimate Zone Rating".

I really liked this system when I first read about it; now, with the revisions, I really, *really* like it. It's the best approach I've seen for players from the past. Do you want to know, for instance, how good a fielder Clete Boyer was, without needing access to any proprietary data? In my opinion, this is by far the best objective method to use.



At Monday, December 24, 2007 8:59:00 AM, Anonymous joe arthur said...

The 2nd Dan Fox link is still to the BPro article, not his blog entry.
I don't subscribe to BP, so I don't have a direct understanding of his method, but
1) I think in essence the same approach was taken by Sean Smith about 7 months ago. I think Dan may now have a couple additional adjustments - such as the allocation of ground ball extra base hits.
2) Retrosheet doesn't have hit types for all balls in play for all years, so it would seem that Dan's method cannot be used "as is" to evaluate players in most seasons before 2003. I think Sean tackled that somewhere with an attempt to impute the missing hit types on base hits for older seasons, so that he could tackle the Clete Boyer-type questions.

At Monday, December 24, 2007 9:41:00 AM, Blogger Phil Birnbaum said...

Thanks, Joe. Didn't realize this had been done before ... will update the post.

As for hit types ... right, Retrosheet doesn't have them for every season, but even just using the field the hit went to is a big improvement over the James method.

At Monday, December 24, 2007 9:51:00 AM, Blogger Phil Birnbaum said...

Anybody know of any additional work on this in between Sean in April and Dan now? I did find another post by Sean on his blog, by searching for "Total Zone".

At Monday, December 24, 2007 10:24:00 AM, Blogger Dave said...

Not to denigrate Dan's work at all, but AFAIK Sean should really be getting the kudos for moving this idea forward when he did.

At Monday, December 24, 2007 1:47:00 PM, Blogger Dan Agonistes said...

I could not agree more that Sean deserves the credit.

When I sat down to construct the system late one Friday night I thought that someone must have done this already but my brief Googling failed to find Sean's articles. I said as much on Tango's blog and in my column last week but I apologize for the confusion.

At Friday, January 04, 2008 9:43:00 AM, Anonymous Sean said...

Thanks guys.

For my system, groundball XBH have from the start been charged to the corner infielders.

I have the system ready for older seasons, and will publish soon. There are still a few improvements I'd like to make, but most of the framework is there and I'd rather spend my time writing up an explanation instead of tinkering. What good is the perfect system if you never get around to letting everyone else see it?

I think Clete Boyer did very well. Brooks Robinson was by far the best at third. Ozzie saved the most runs at short over his career, though I have Belanger a bit better on a rate basis.

