## Saturday, April 07, 2007

### A new "protection" study using ball/strike data

Here's a baseball study described on a brand new blog by Ken Kovash, who works for "Freakonomics" economist Steve Levitt.

Kovash sets out to check whether "protection" exists. But rather than checking the hitter's batting line for evidence, Kovash checks what the pitcher throws him. He finds two statistically significant effects:

-- pitchers are more likely to throw fastballs when the on-deck hitter is better;
-- pitchers are more likely to throw strikes when the on-deck hitter is better.

I can't completely evaluate the study, because Mr. Kovash's blog just posts an outline of the method. I can't even be sure how to interpret the results, because he gives a coefficient without saying whether it's an increase in the probability, or an increase in the log of the odds ratio.

But I think that either way, the results are barely "baseball significant." Assuming the coefficient is a straight increase in the proportions, then:

-- An increase of .200 in the OPS of the on-deck hitter would increase the chance of a strike by about three-tenths of a percentage point (so if the pitcher would normally throw 60% strikes, he would throw 60.3% strikes instead).

-- Similarly, with the same .200 increase, the pitcher will throw 0.2 percentage points more fastballs – say from 40% to 40.2%, or whatever.
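To make the ambiguity concrete, here's a back-of-envelope sketch of the two possible readings of the coefficient. The coefficient value used (0.015 per full point of OPS) is a hypothetical, chosen only so that a .200 OPS change reproduces the roughly 0.3-percentage-point figure above; the study itself doesn't publish enough detail to know the true number.

```python
import math

coef = 0.015          # hypothetical: effect per 1.000 of on-deck OPS
ops_change = 0.200    # on-deck hitter is .200 points of OPS better
base_strike_rate = 0.60

# Reading 1: the coefficient is a straight increase in the proportion.
linear = base_strike_rate + coef * ops_change          # 60% -> 60.3%

# Reading 2: the coefficient shifts the log of the odds ratio.
base_log_odds = math.log(base_strike_rate / (1 - base_strike_rate))
new_log_odds = base_log_odds + coef * ops_change
logit = 1 / (1 + math.exp(-new_log_odds))              # 60% -> ~60.07%

print(round(linear, 4), round(logit, 4))
```

Either way the shift is tiny; the log-odds reading would make it even smaller than the straight-proportion reading.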

What surprises me is not necessarily that the differences are so small, but that such tiny effects are statistically significant -- I guess that's what happens when you have four full years' worth of data.
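A rough sketch of why four years of pitch data makes even a microscopic effect significant. The figure of 700,000 pitches per season is an assumption for illustration (not from the study), and this uses the one-sample standard error of a proportion as a simplification:

```python
import math

n = 4 * 700_000                     # assumed: ~700,000 pitches/season, 4 seasons
p = 0.60                            # baseline strike rate
se = math.sqrt(p * (1 - p) / n)     # standard error of the proportion, ~0.0003

# A 0.003 shift in strike rate is then roughly ten standard errors out --
# overwhelmingly "statistically significant" despite being trivial in
# baseball terms.
z = 0.003 / se
print(round(se, 5), round(z, 1))
```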

Also, you could argue that the "chance of a strike" number doesn't actually show "protection," since it could be caused by the batter's actions -- swinging at outside pitches he wouldn't normally swing at, or some such.

Hat tip: "Freakonomics" blog.


At Saturday, April 07, 2007 10:53:00 AM,  Pizza Cutter said...

There are two possible problems that could cause "fog," to borrow Mr. James's word. Either we are defining protection/clutch/whatever wrongly, or we don't have a big enough sample, as he suggests. The first problem is a matter of debate, and if someone comes up with a more ingenious method of measurement, that's great. I suppose one can define "protection" in any number of ways, statistically. However, if the problem really is sample size, and we are just making Type II errors all the time, then with the data now available (PBP data for a few decades!), the effect size we are chasing must be microscopic. I don't begrudge someone wanting to look into all the little details (it's part of the charm of the field of sabermetrics), but is an effect size that small really useful to anyone?

At Sunday, April 08, 2007 11:16:00 PM,  Anonymous said...

I worry about the reliance on individual pitch information as an ostensible improvement (because there is more data) over the "output data" represented by actual results of plate appearances. The nice thing about the results of plate appearances is that they are objectively verifiable: a home run is a home run and a strikeout a strikeout. The pitch data this study relies on is much more subjective. Fastball or slider? Outside pitch or a pitch over the plate? These are matters that can depend on the eye of the beholder. If Barry Bonds is on deck and the reporter of pitch data expects to see a fastball over the plate, will he see a fastball over the plate even if, in other circumstances, the identical pitch might have been reported as a hard slider off the outside corner? Personally, I'd rely on the hard data of plate appearance outcomes, even if the amount of data is less than ideal, before I'd rely on subjective data about individual pitches. If it's the number of data points this writer is concerned about, stats on plate appearance outcomes go back a century (very reliable stats go back at least 50 years), whereas individual pitch data goes back only a few years. There really is no need I can see for relying on subjective pitch data.

At Wednesday, April 11, 2007 3:23:00 PM,  Ted said...

Yes -- one thing to be careful about with "statistical significance" is that, if you've got enough data, you're in a sense *guaranteed* to get it. That's why it's important to focus on the significance of any difference in terms of its practical application.

At Friday, April 13, 2007 11:53:00 PM,  Anonymous said...

The author confirmed by email that he did not use logs, and that OPS is in the form N.NNN (0.780, for example). So the effect is significantly less than 1%.

At Saturday, April 14, 2007 12:06:00 AM,  Phil Birnbaum said...

Excellent, thanks!