Thursday, May 13, 2010

Why are Yankees/Red Sox games so slow? Part II

This is part II -- part I is here.

----------

After posting the results of my regression on how players influence the length of baseball games, some commenters here and at "The Book" blog suggested I add more variables to the mix, to see if that would affect the results.

So I added a few things. I added pickoff throws (even though those weren't available for the first couple of years of the study). I added whether the game was on the weekend. Finally, I split pitches into whether the bases were empty or not (Mike Fast noticed that pitches come a lot slower with runners on).

The "weekend" variable was not significant. Including pickoff throws helped a fair bit. But it was splitting the pitches that made the biggest improvement. It turns out that with the bases empty, a pitch takes about 19 seconds. With runners on, it takes 27 seconds. That was a big difference.
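To make the effect of splitting the pitch variable concrete, here is a minimal sketch on synthetic data. It is not the regression from the study, which had many more variables. The number of games, the pitch-count ranges, the noise level, and the helper names (`fit_two_predictors`, `make_synthetic_games`) are all assumptions for illustration; only the 19- and 27-second figures come from the paragraph above.

```python
import random

def fit_two_predictors(x1, x2, y):
    """Least squares for y ~ b1*x1 + b2*x2 (no intercept), via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

def make_synthetic_games(n_games=500, seed=0):
    """Hypothetical games: pitch counts with bases empty / runners on, plus total seconds."""
    rng = random.Random(seed)
    empty, runners_on, seconds = [], [], []
    for _ in range(n_games):
        e = rng.randint(130, 190)   # assumed range of bases-empty pitches
        r = rng.randint(80, 150)    # assumed range of runners-on pitches
        # "True" per-pitch times taken from the post; the noise level is an assumption.
        t = 19.0 * e + 27.0 * r + rng.gauss(0.0, 120.0)
        empty.append(e)
        runners_on.append(r)
        seconds.append(t)
    return empty, runners_on, seconds

empty, runners_on, seconds = make_synthetic_games()
sec_empty, sec_on = fit_two_predictors(empty, runners_on, seconds)
print(f"bases empty: {sec_empty:.1f} s/pitch, runners on: {sec_on:.1f} s/pitch")
```

With the two kinds of pitches entered separately, the fit can assign each its own per-pitch time instead of one blended number.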

The r-squared of the regression went only from .93 to .94, but I think some of the coefficients became more accurate. The estimate for the effect of adding an extra half-inning went from less than zero (which was obviously wrong) to two minutes (which is probably right). My explanation for why this happened is here.

Still, there wasn't much difference in the overall results. The players who came out as fast last time still came out as fast now, and the slow players are roughly the same too. The rankings changed a little bit, though. Here's a new Excel file with the results (two worksheets), for those who are interested.

------

Over at "The Book" blog, Mike Fast raised some legitimate questions about whether the results are correct. He wrote,


"I believe these regressions are giving results that are incorrect ... [by actual clock times, in a certain subset of games], Jeter uses a little more than one extra minute per game as compared to Jose Lopez, rather than 6 extra minutes that the regression tells us. Now that’s just comparing between-pitch-time, but I would expect that to be the bulk of the difference between any two players."


So there are three possibilities:

1. Mike's sample of games is not representative;
2. Jeter and Lopez affect game times in other ways than just speed between pitches;
3. The regression has incorrect results.

I think there might be a little bit of all three there, but especially the third. Let me go through them:

1. I ran the regression for different sets of years, and I found that Jeter seemed to be slower in the early years of the decade than in the latter years. Mike's empirical data came from 2008-2009, so that might explain part of it.

2. Here's a New York Times article about the first game of the season. In that game, Derek Jeter stepped out of the box for 13.6 seconds between pitches. That's not the total time between pitches, just the time he spent out of the box. Robinson Cano delayed 18.7 seconds.

Now, it's only one game. But in the full study, Jeter still comes out as slower than Cano: +3.48 minutes to +2.05 minutes. Again, the NYT only measured one game, which is perhaps not representative. And perhaps Jeter sees more pitches than Cano. But this might be very slight evidence that something else is going on.

What could it be? It would have to be something not covered by the regression. Maybe it's mound conferences. Does Jeter initiate them or prolong them? (That's not a rhetorical question: I legitimately have no idea.) Maybe it's that Jeter takes a long time to get into the batter's box at the beginning of an at-bat. If he takes an extra 10 seconds, that's 40 seconds a game, which is quite a bit. Again, I don't know. Maybe he takes a long time to get to first base on a walk? Nah, that couldn't be more than a second or two per walk, which wouldn't add up to much over an entire season. Maybe when he's on first and a subsequent batter hits a long foul, he runs all the way to third base and takes a long time to get back? Doesn't sound plausible. But, anyway, those are the kinds of things we'd be looking for.

3. Is it possible that the numbers are wrong? For sure, they're somewhat wrong: a regression only gives estimates, with standard errors. For Jeter, the estimate was 3.48 minutes with a standard error of 0.65 minutes. Since, in general, an estimate is within two standard errors of the true value about 95% of the time, that means there's about a 5% chance that Jeter's actual effect is less than 2.18 minutes, or more than 4.78 minutes. Splitting the 5% between "too low" and "too high," we might guess that there's a 2.5% chance that Jeter's impact is actually less than 2.18 minutes, even though he came out at 3.48 minutes.
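The two-standard-error arithmetic can be checked with a short sketch. It assumes the error in the estimate is roughly normal, which the post does not state explicitly; the helper name `two_se_interval` is made up for illustration.

```python
from statistics import NormalDist

estimate = 3.48  # Jeter's estimate in minutes per game (from the post)
std_err = 0.65   # standard error of that estimate (from the post)

def two_se_interval(est, se, k=2.0):
    """Return the (low, high) bounds k standard errors either side of the estimate."""
    return est - k * se, est + k * se

low, high = two_se_interval(estimate, std_err)

# One-sided tail beyond 2 standard errors, assuming a normal error distribution.
tail_below = NormalDist().cdf(-2.0)

print(f"interval: {low:.2f} to {high:.2f} minutes")
print(f"one-sided tail beyond 2 SE: {tail_below:.4f}")
```

The exact normal tail at 2 SE is a bit under 2.5%; the round 2.5% figure strictly corresponds to about 1.96 SE, which makes no practical difference here.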

But, actually, it's more than 2.5%. That's because by choosing Jeter, we're falling prey to selective sampling: we're only picking on Jeter because he came out at the top of the list (actually, second from the top, after Denard Span). And players at the extremes are more likely to be inaccurate than players in the middle.

You can see why this is so if you imagine taking all the batting lines from one day, and putting them in order. All the players at the top -- the ones who were player of the game, who went 3-for-5 with a home run -- overachieved their actual talent. And all the players at the bottom -- the ones who were 0-for-5 -- underachieved their actual talent.

When Ryan Howard goes 3-for-4, it would be silly to assume that .750 is a good estimate of his actual ability. When Ken Griffey Jr. goes 0-for-5, it would be equally silly to assume .000 is a good estimate of his actual ability.

The same thing is true here. When Derek Jeter shows up at 3.48, which is the ".750 batting average" of game slowness, we have to keep in mind that it's probably significantly too high. And when Nick Markakis appears at -2.77, the third-fastest batter in the study, it's probably significantly too low.

How much too high or too low? I don't know. But what we *do* know is that we expect 2.5% of our players to be more than 2 standard errors too high. There are 694 batters in the study, so about 17 batters (2.5% of 694) should fall into that category. Is Derek Jeter among them? We can't say for sure. We *do* know that those 17 players should be concentrated near the top of the list of players, so there's a pretty good chance that Jeter is one of them. Even if he's not, there's a good chance that he's still too high, just less than 2 standard errors too high.
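A small simulation illustrates why the players at the top of the list tend to be overestimated. Everything here is hypothetical except the 694 batters and the 0.65 standard error: the spread of "true" effects (`true_sd=1.0`), the use of one common standard error for every player, and the function name `simulate_selection` are assumptions, so the output shows the direction of the effect rather than its actual size.

```python
import random

def simulate_selection(n_players=694, true_sd=1.0, se=0.65, top_k=10, seed=1):
    """Draw true effects, add independent estimation noise, and inspect the top of the list."""
    rng = random.Random(seed)
    true = [rng.gauss(0.0, true_sd) for _ in range(n_players)]  # assumed talent spread
    est = [t + rng.gauss(0.0, se) for t in true]                # noisy estimates
    errors = [e - t for e, t in zip(est, true)]

    # Rank players from slowest (highest estimate) to fastest.
    ranked = sorted(range(n_players), key=lambda i: est[i], reverse=True)
    top = ranked[:top_k]

    # Players whose estimate is more than 2 SE above their true value.
    too_high = {i for i in range(n_players) if errors[i] > 2 * se}
    top_100 = set(ranked[:100])

    return {
        "mean_error_top": sum(errors[i] for i in top) / top_k,
        "mean_error_all": sum(errors) / n_players,
        "n_too_high": len(too_high),
        "too_high_in_top_100": len(too_high & top_100),
    }

result = simulate_selection()
for name, value in result.items():
    print(name, round(value, 2))
```

Across all players the errors average out to about zero, but among the top-ranked players the average error is positive, and the players who are more than 2 SE too high show up disproportionately in the top 100.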

Having said that: you can only figure the extremes are more likely to have been "lucky" if, after you strip out a reasonable amount of "luck", the player would collapse into the middle of the other players. If not, you have less of a reason to believe that luck was a big deal. For instance, as we figured, if you take 2 SE of luck out of Derek Jeter, he goes from 3.48 to 2.18. That moves him from 2nd place to ... 10th place. That's not that big a drop, and it means that although Jeter is still likely to have his slowness overestimated, it's probably not overestimated as much as if he dropped closer to the middle of the pack. If I absolutely had to guess Jeter's actual slowness, based only on the results of the regression, I'd say maybe regress him to the mean by about 1.5 SDs. That's just a gut feeling.
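For reference, here is the arithmetic for pulling an estimate back toward zero by a given number of standard errors. This is only the subtraction described above, not a formal shrinkage method, and the function name `pull_back` is made up for illustration.

```python
def pull_back(estimate, std_err, n_se):
    """Move an estimate toward zero (the average player) by n_se standard errors."""
    return estimate - n_se * std_err

jeter_estimate = 3.48  # minutes per game (from the post)
jeter_se = 0.65        # standard error (from the post)

after_2_se = pull_back(jeter_estimate, jeter_se, 2.0)    # the "2 SE of luck" case
after_1_5_se = pull_back(jeter_estimate, jeter_se, 1.5)  # the gut-feeling case

print(f"after removing 2 SE:   {after_2_se:.2f} minutes")
print(f"after removing 1.5 SE: {after_1_5_se:.3f} minutes")
```

Removing 1.5 standard errors leaves Jeter at about two and a half minutes slower than average, based on the regression alone.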

What we *can* conclude is:

-- the order of the players is roughly correct. As a group, the players at the top are indeed much slower than the guys at the bottom.

-- A player at the top is always more likely to be slower than a player below him in the list. Even though Jeter, the second-slowest hitter in the study, is likely too high, so are the third-, fourth-, and fifth-slowest hitters, so Jeter is still likely to be slower than they are. Not 100% likely, but always more than 50% likely.

-- most of the players near the top and bottom of the list project to being significantly more extreme than they actually are in real life. They are still significantly faster or slower than average -- just not as much so as the raw numbers imply.

-- your limit of plausibility is about 3 standard errors (under a normal approximation, fewer than 1 in 300 players would be that far off, so only a handful in the list). So for Tim Wakefield, who is 11.5 standard errors faster than your typical pitcher ... well, maybe, in the worst case, he's only 8.5 SE faster. So he's obviously still really fast. (And, since even that would only move him from fastest to fifth-fastest ... well, 3 SE is probably too much to deduct from him in the first place.)

-- if you do NOT selectively sample your players based on how high they are on the list, you should expect the regression estimates to be about right. So if you average a whole team, or look at lefties vs. righties, or high draft choices vs. low draft choices, or some other criterion that doesn't have to do with how fast or slow a player is ... you should be able to reasonably trust the results.

-----

So, anyway, my guess about Jeter is that, these days, he's maybe 2 minutes slower than normal, not 3.5. Why? Because

-- he's faster now than he used to be
-- empirical data suggests he's not as slow as 3.5 minutes, although that only looked at pitch times
-- he finished near the top of the list of players, which suggests that he needs to be regressed to the mean a fair bit.

-----

One technical point: I was assuming that, for the 694 batters, the estimates are all independent (which is why you'd expect 2.5% of players to be too high). I'm not sure if that's true ... it could be that if you found out for sure that Jeter is only (say) 90 seconds slower than average, and forced that into the regression, a whole lot of other players would change significantly. If anyone has expertise on this point, or any other statistical argument raised here, please chime in.


4 Comments:

At Sunday, May 16, 2010 7:23:00 AM, Blogger Hugh said...

I thought one of the most important factors left out in the regression was "National TV" - ESPN or Fox national games (not local Fox games). If the length of commercial breaks is in fact longer, and the Yankees/Red Sox in fact do play more national TV games than most other teams, that may be a big impact.


I thought you had agreed there was a possibility that might be an issue. Maybe it is difficult to get that data, but I was a bit disappointed not to see that in the new regression since that seems like a possible difference maker.

Frankly, to me, this is the type of research usually done in academic studies --- in other words, it leaves it open for another academic to write another paper with the new variable, and thus changing the conclusions. It allows two guys to publish a paper and make a living. I know that sounds a bit strong, and sorry if it comes out that way, I just don't see any reason not to include the National TV factor. It can't be that difficult to find data for that, can it?

 
At Sunday, May 16, 2010 8:30:00 AM, Blogger Phil Birnbaum said...

Yes, it's difficult to get the data. If you have a source, let us know!

I think someone did say they checked for that and didn't find anything. The "weekend" variable was an attempt to control for that, since maybe national games are played more on weekends.

 
At Sunday, May 16, 2010 9:31:00 AM, Blogger Phil Birnbaum said...

I think someone told me they checked ESPN games for one season, and didn't notice much difference. But, as you say, that's probably not enough to go on.

Redoing the study is very easy if anyone knows a data source!

Failing that ... has anyone timed the inning breaks for both a Fox game and an ESPN game? I could do that next time I'm home for a broadcast ...

 
At Friday, June 11, 2010 1:03:00 PM, Anonymous Anonymous said...

Phil: As regards half-inning breaks, couldn't you go to the Pitch f/x logs and tally a break time between the last pitch of the previous half-inning and the first pitch of the next, based on the time stamps (to the nearest second)?

 
