Sabermetric Research: Charlie Pavitt reviews a hitting streak study

Monday, February 09, 2009

I [Charlie Pavitt] am writing this in response to Trent McCotter’s piece on hitting streaks from the 2008 Baseball Research Journal. I want to begin by commending Trent on this fine piece of work. In short, a series of Monte Carlo tests revealed that the number of actual hitting streaks of lengths beginning with 5 games and ending with 35 games or more between 1957 and 2006 was, in each case, noticeably greater than what would have been expected by chance. It is always good to see evidence inconsistent with our “received wisdom.” What I have to say here in no way attempts to contradict his research findings. My problem is with his attempt to explain them.

Trent first proposed three “common-sense” explanations for what he found. The first was that a batter might face relatively poor pitching for a significant stretch of time, increasing the odds of a long streak. But, in his words (page 64), “the problem with this explanation is that it’s too short-sided; you can’t face bad pitching for too long without it noticeably increasing your numbers, plus you can’t play twenty games in a row against bad pitching staffs, which is what would be required to put together a long streak.” He then goes on (page 65) “The same reasoning is why playing at a hitter-friendly stadium doesn’t seem to work either, since these effects don’t continue for the necessary several weeks in a row.” His third “common-sense explanation” is that, as hitting overall is thought to be better during the warm months, hitting streaks may be more common than expected during June through August. This is because, and this is critical (page 65), “hitting streaks are exponential…a player who hits .300 for two months will be less likely to have a hitting streak than a player who hits .200 one month and .400 the next...[because]…hitting streaks tend to highly favor batters who are hitting very well, even if it’s just for a short period.” This is absolutely correct. Unlike the first two proposed explanations, in this case Trent looked for relevant evidence, claiming that he looked for more streaks in June, July or August and found no more than in May. Trent, how about April and September?

Anyway, rejecting all three of these, Trent then proposed two possible psychological explanations. The first is that hitters aware of a streak intentionally change their approach to go for more singles, particularly when the streak gets long; and he has evidence that longer streaks occur less randomly than shorter ones, which is what this assumption predicts (players would be more likely to think about keeping a streak going once it was already long). The second is that hot hands really exist, and his claimed evidence is that taking games out of his random sample in which the player does not start increases the number of predicted hitting streaks, bringing it more in line with the number that actually occurred. Makes sense; a hitting streak is easier to maintain the more at bats one has in a game. He proposes that this could reflect real life because managers would start a player proportionally more often when he was hitting well. True, but we should keep in mind that the same statistical effect for starting games would occur whether there is a hot hand or not. In other words, I don’t think his evidence is very telling.

I want to be very clear here about my position on this issue. I have absolutely no problem with the suggestion that players’ performance is impacted by psychological factors; I don’t see how they aren’t. My problem is with the way in which those suggestions are treated. If we are serious about sabermetrics as a science, then we have to meet the standards of scientific explanation. As esteemed philosopher Karl Popper pointed out in his now-classic 1934 book "The Logic of Scientific Discovery," if a proposed explanation for observations is impossible to disconfirm, then we can’t take it seriously as a scientific explanation. This is my problem with Trent’s treatment. Let us suppose that rather than finding more hitting streaks than chance would allow, Trent had found fewer. He could then say that the reason for this is that batters crumble under the stress of thinking about the streak and perform worse than they would normally. If Trent found no difference, he could then say that batters are psychologically unaffected by their circumstance. The point is that this sort of attempted explanation can be used to explain anything, and given our present store of knowledge about player psychology such explanations are impossible to evaluate. Again, Trent’s proposals may be correct, but we can’t judge them, so we can’t take them as seriously as Trent appears to.

In contrast, the first three proposed explanations can be disconfirmed, so we can take them more seriously. Trent claims to have disconfirmed the third, but we need to know about April and September. But the real issue I have is with his dismissal of the first two, because he did not apply the logic in their case that he correctly applied for his “hot weather” proposal. Let me begin with the first. A batter does not have to face a bad pitching staff in consecutive games for his odds of a hitting streak to increase. Let us suppose that a batter faces worse pitching than average during only 10 of 30 games in May and makes up for it by facing worse pitching than average during 20 of 30 games in June. We use the same exact logic that Trent used correctly for the “hot weather” proposal; his odds of having a batting streak, which would occur during June, would be greater than those of another batter who faced worse pitching than average during 15 games in May and 15 games in June. The same explanation goes for hitter-friendly and hitter-unfriendly ballparks, and is strengthened in this case because of well-supported known differences in ballpark effects. If a player’s home field was hitter-friendly and, during a stretch of time, many of his road games were in hitters’ parks, he could easily have 20 or more games in this context in a given month.

I have no idea whether either of these two explanations for Trent’s findings is correct. But the difference between these and his psychological proposals is that we could test these two and not those he favors. Given the importance of Trent’s original findings, I would obviously like to see that happen. And I would very much like it if we remain very careful about not taking our psychological speculations too seriously.

Labels: baseball, hot hand, streakiness

21 Comments:

At Monday, February 09, 2009 10:48:00 AM, Anonymous said...: Interesting study. However, I think that some -- perhaps all -- of the gap between actual streaks and predicted streaks is a function of players having streaks of high-AB games. The odds of a 20-game hit streak are much higher if a player has 20 consecutive games with at least 4 PAs and a mean of 4.8 PAs, compared to a stretch with a mean of 3.8 and one or two games with only 1 AB. The number of PAs per game is decidedly non-random, as it's a function of 1) being a starter every day, 2) lineup position (the higher in the lineup, the more PAs), and 3) strength of team offense (the better the offense, the more PAs). Clearly, players will have stretches within a season in which they consistently have more (or fewer) average PAs, for example because they win a starting job or the manager moves them up in the lineup. Less frequently, players will change teams mid-season. And of course, a hitter in the midst of a 20-game hit streak is very unlikely to be asked to pinch hit, or even to be batted 8th in the lineup. So in the study, when the sequence of players' games is randomly scrambled, I'm sure the result for many players is fewer extended periods of consistently high-PA games, and thus a lower likelihood of long hitting streaks.

To determine whether hitters' performance in each game is independent, you would need to find a way to control for the AB factor.
At Monday, February 09, 2009 12:41:00 PM, Anonymous said...: Having read the paper more carefully, I see that the author notes that hitters with long streaks do indeed average about 6.9% more ABs during the streak than over the rest of their season. He speculates that this may reflect managers rewarding players with more ABs once their streak reaches a certain length. While I'm sure that's true, it probably accounts for a small share of the increased ABs (since it can't possibly impact the first 15 or so games). Far more likely is that McCotter has reversed cause and effect here: more ABs yield a better chance of having a streak, and that's why he finds more ABs during streaks.

The chance of a streak is extremely sensitive to the number of ABs, in a way I'm not sure McCotter recognizes. He uses the mean ABs over a period of games to estimate streak probabilities. So a .300 hitter with an average of 4 AB/G has a 76% chance of 1+ hit per game. But all means are not equal in this case. Here are probabilities of a 20-game streak over any given 20 games for a .300 hitter averaging 4 ABs in every case:
20 @ 4 AB: .0041
10 @ 3 AB, 10 @ 5 AB: .0024
4 @ 2 AB, 8 @ 4 AB, 8 @ 5 AB: .0017

Mixing in just a few 2-AB games has a huge impact on streak probabilities (and a 1-AB pinch-hit game has a still larger one).
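
The three figures above can be reproduced with a short sketch. It assumes every AB is an independent trial at a fixed .300 average, so a game with n ABs produces at least one hit with probability 1 - 0.7^n. The function name `streak_prob` and the (games, ABs) schedule format are illustrative choices, not anything taken from McCotter's paper.

```python
from math import prod

def streak_prob(avg, ab_schedule):
    """Chance of at least one hit in every game of a stretch.

    ab_schedule is a list of (number_of_games, ABs_per_game) pairs.
    Assumption: each AB is an independent trial that succeeds with
    probability `avg`.
    """
    return prod((1 - (1 - avg) ** ab) ** games for games, ab in ab_schedule)

# Three 20-game stretches, each averaging exactly 4 ABs per game.
schedules = {
    "20 @ 4 AB": [(20, 4)],
    "10 @ 3 AB, 10 @ 5 AB": [(10, 3), (10, 5)],
    "4 @ 2 AB, 8 @ 4 AB, 8 @ 5 AB": [(4, 2), (8, 4), (8, 5)],
}

for label, schedule in schedules.items():
    print(f"{label}: {streak_prob(0.300, schedule):.4f}")
```

Changing the schedule while holding the mean fixed is a quick way to see how sensitive the streak probability is to the spread of ABs, not just their average.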

So again, I think the higher than predicted number of streaks likely reflects the fact that some players enjoy long (and non-random) streaks in which they consistently get 4+ PAs in every game.
At Monday, February 09, 2009 2:17:00 PM, Don Coffin said...: I just want to note that almost everything Guy suggests in his comments is, in fact, testable. To do so requires game-level data, but that is available. For example, take the hypothesis that players with hitting streaks exceeding some threshold get more plate appearances per game. That's clearly directly testable, and, in fact, can be used to try to determine what the minimum length hitting streak is that gives rise to this (managerial) behavior--if, in fact, such behavior exists.
At Monday, February 09, 2009 3:01:00 PM, Anonymous said...: Doc: I think that would actually be pretty hard to test. Let's say you looked at hitters with a 15-game streak. Most of them would end their streak within 3-4 games. Once the streak is over, the effect presumably disappears. That gives you a pretty small sample to look at. In any case, I think the impact of the streak itself is small, except that a player with a long streak probably won't be asked to pinch hit in a game he sits out.

The real point is that PAs are not randomly distributed. The year B. Santiago had a 34-game streak, he batted mainly 6th or 7th in April and May and averaged 3.76 PA/G. Then he moved up to 5th, and soon thereafter started his streak, during which he averaged 4.08 PA/G. That lineup change greatly increased the odds of his streak (and was not the result of the streak). But if you go back and randomly mix up Santiago's games, as McCotter's model does, you'll underestimate the chance of his streak.

Probably the biggest factor here is injuries. For players good enough to put long streaks together, low-PA games are not randomly distributed. They mainly happen when the player is injured (or maybe in late Sept. if the team is a non-contender). For example, Pujols had a 30-game hit streak in 2007, during which he never had fewer than 4 PAs. But in April of that season, fighting injuries, he had 11 games with fewer than 4 PAs, including five (!) 0-for-1 days as a PH. If you randomize those games, Pujols will basically never have his streak. But in fact, such low-PA games will tend to be grouped together for a player like Pujols, so his chance of a 30-game streak was much higher than the model will imply.
At Tuesday, February 10, 2009 12:59:00 AM, Anonymous said...: 1. McCotter says his simulation disregarded 0 for 0 batting lines because they don't affect a batting streak, but there is an exception to this: sacrifice flies do end batting streaks, so any 0 for 0 batting lines with a sacrifice fly should have been included in his simulations, reducing the number of modelled streaks. Of course handling this correctly would slightly increase the discrepancy between modelled and actual streaks.
2. The model is likely to overstate streaks for another reason - it does not correct for random fluctuation. Thus a "true ability" .300 hitter might hit .320 over one season by chance and .280 the next. By modelling the player as a .320 hitter for 162 games and .280 for 162 games, instead of as a .300 hitter for 324 games, the model should be overstating the overall likelihood of long streaks. And yet the modelled streaks still fall short of the real streaks.

Of course I agree with Guy about the importance of avoiding low PA games in order to have a chance to assemble a long streak, though I think it should be acknowledged for those who do not read the paper itself that McCotter is also aware of this and offers an alternate simulation toward the end of his article composed only of batting lines from games started, which would rarely include low PA games. How fully this compensates for the probable selection bias Guy notes is another matter, but McCotter's main intent was to undermine the "random coin flip" model as previous writers have implemented it, and in this I think he succeeds.

I do agree with Charlie Pavitt that McCotter has too hastily discarded explanations involving ballpark and quality of opposition. These may well account for some of the excess of long batting streaks in real life over the model, but I don't agree with him that McCotter's so-called "psychological" theories are either unfalsifiable or unscientific. Pavitt imagines one of them to be falsified by observation, then imagines some more that McCotter would respond to this falsification by coming up with a new theory which would be the opposite of his original theory. Though not strongly corroborated yet, McCotter's actual theory is certainly falsifiable. And in addition, from what I read, Popper's falsification principle isn't widely accepted as a satisfactory demarcator between scientific and non-scientific explanation anyway.
At Tuesday, February 10, 2009 5:54:00 AM, Anonymous said...: Joe:
You're right of course -- I skimmed the final section and missed the discussion of the model based only on games started. But I think those results actually confirm my point: that the frequency of streaks reflects greater opportunities, not superior performance, than the original model predicts. For 20-game streaks, fully 82% of the "surplus" streaks disappear using the second model, and for 15-game streaks there are no longer any extra streaks. (But a significant gap persists at 30 games, 19 real vs. 10 modeled.)

However, while eliminating pinch hit games deals with the biggest problem in the first model, there remain other important changes in players' opportunities due to batting order. I mentioned Santiago above, whose extended stretch hitting higher in the order greatly impacts the chance of a streak. Chase Utley (2006) is another good example: he batted 4th or 5th the first 17 games of the season, averaging just 4.1 PA, then moved to the 2nd slot where he averaged about 4.7. He had 9 0-fers in those first 17 games, a much higher rate than in the rest of the season.

I still think that McCotter is largely confusing cause and effect when he concludes players' additional PAs during streaks are the result of the streak. These guys are all regulars anyway, and most hit high in the order. A few move up in the order during their streak, but in most cases the player's usage changed first, creating the streak. And either way, of course, the change in opportunities does not mean a "hot hand."

Other points:
McCotter suggests the high number of 3- and 4- hit games, as well as 0-hit games, compared to a coin flip model shows the existence of a hot hand. But I think he's failing to account for pitcher quality, which changes a hitter's true talent each day.

Joe, I don't agree with your 2nd point. The fact is that a true .300 hitter will hit .320 in some seasons (just by random chance), and will be more likely to have a long streak in that season (retrospectively). He's trying to figure out if there are more streaks than that random variation would normally produce. So I don't think the model overstates the chance of streaks for that reason.
At Wednesday, February 11, 2009 6:36:00 AM, Anonymous said...: I'll make a few minor points first.
1) Examination of Santiago's batting log in 1987 indicates that he was removed from games late in his streak twice after he had gotten a hit. So this might be evidence of another way in which batters or managers can alter their ordinary behavior under the influence of a long streak. In this case it would decrease average AB during the later part of a streak.
2) Hitting streaks can wrap across seasons; Jimmy Rollins added a couple of games in 2006 to his long streak ending the 2005 season. It's not clear that McCotter made any recognition of this in his model. He may be giving an apples-to-apples comparison of simulated streaks to real streaks in the sense that he counts only the in-season streaks - thus he may have counted Rollins' real streak as a 36 game streak instead of a 38 game streak spanning two seasons.
3) In terms of "psychological" explanations [batters and perhaps managers and opponents adapting their behavior under knowledge of the streak], although I do think such theories are formally falsifiable, I want to point out that such explanations are also historically contingent. With media coverage of "statistical" matters like streaks perhaps more intense now, a player these days might tend to start adapting when his streak reaches e.g. 20 games. That may not have been true in 1968 or 1978. Unfortunately you can't just grab every game available from retrosheet for 50 years to increase your sample size and look for a timeless cause-and-effect relationship.

The main issue I see now is that McCotter's model effectively involves sampling without replacement. Here's a simple example describing what I think he says his model does:
Suppose a player has a 5 game season and has hits in 4 of the games. Whatever the actual order of these games, there are 5 permutations of this, where x means a game with at least 1 hit, and o means a hitless game:
xxxxo
xxxox
xxoxx
xoxxx
oxxxx
It seems that McCotter takes the actual sequence of games, whatever it was, and randomly permutes it many times, counting the average number of streaks he finds. In this example, per five seasons, he will find an average of 2 2-game streaks [both in same season], 2 3 game streaks and 2 4-game streaks. He will never find streaks of 0,1, or 5 games. In his model if a player has a 4 game streak going at the beginning of his season, he has zero chance to extend it to five games, because the only game left to permute is a zero hit game. In the "coin-flip" model which he is challenging, the player would still have a 4/5 chance at that point to extend his streak to 5 games. If I've calculated correctly, the coin flip model would give 1.64 5 game streaks per five seasons, 0.82 4 game streaks, 1.13 3 game streaks and 1.49 2 game streaks.
For a more realistic example, like Santiago who got hits in 107 out of 140 starts in 1987 (76.4% chance of at least one hit), in McCotter's model, if Santiago had a 10 game streak pending in the model, there would be 97 games with hits available out of the other 130 to be the next game in the permutation, and he would have a 74.6% chance of extending the streak to 11 games. If Santiago's streak did reach 30 games in McCotter's model, he would have just a 70% chance remaining, 77/110, of lengthening the streak to 31 games.
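
For anyone who wants to check figures like these, here is a minimal sketch. It enumerates the distinct orderings of the five-game season, computes the expected number of streaks of each length when games are independent with a 4/5 chance of a hit, and evaluates the "games remaining" fraction used in the Santiago example. The function names are illustrative, and `extend_prob` conditions only on the games already used up by the current streak, which is how the example above is framed.

```python
from collections import Counter
from itertools import groupby, permutations

def run_lengths(season):
    """Lengths of the maximal runs of 'x' (games with a hit)."""
    return [len(list(group)) for key, group in groupby(season) if key == "x"]

def permuted_streaks(season):
    """Streak counts summed over every distinct ordering of the season."""
    counts = Counter()
    for order in set(permutations(season)):
        counts.update(run_lengths(order))
    return counts

def coin_flip_streaks(p, n_games):
    """Expected number of streaks of each exact length, independent games."""
    q = 1 - p
    expected = {}
    for k in range(1, n_games + 1):
        total = 0.0
        for start in range(n_games - k + 1):
            before = q if start > 0 else 1.0            # hitless game before the run
            after = q if start + k < n_games else 1.0   # hitless game after the run
            total += before * p ** k * after
        expected[k] = total
    return expected

def extend_prob(hit_games, total_games, streak):
    """Chance the next permuted game has a hit, with `streak` hit games used up."""
    return (hit_games - streak) / (total_games - streak)

perm = permuted_streaks("xxxxo")
coin = coin_flip_streaks(0.8, 5)
for k in range(2, 6):
    print(f"{k}-game streaks per five seasons: "
          f"permuted {perm[k]}, coin flip {5 * coin[k]:.2f}")

print(f"Santiago, extending a 10-game streak: {extend_prob(107, 140, 10):.3f}")
print(f"Santiago, extending a 30-game streak: {extend_prob(107, 140, 30):.3f}")
```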

So I think McCotter's finding that there are more real streaks than predicted by his model doesn't mean that there is a hot hand, and doesn't mean that players and managers adapt to the streak. I think it means his model is biased against generating long streaks.

So I've changed my mind from my prior comment; I don't think he has undermined the "coin flip" model after all. He claims the "coin flip" model does not accurately predict the frequency of zero and multi-hit games. If the coin flip model doesn't predict zero hit games accurately enough, it won't give the right probabilities for long batting streaks either. So I think that this was the impetus for his approach, which in essence uses actual frequencies of each type of game. Instead of simulating seasons by random permutations of the game logs, maybe he can improve his model by doing sampling with replacement.
At Wednesday, February 11, 2009 10:08:00 AM, Anonymous said...: Joe: I don't think you've identified a problem in the model. Yes, Santiago's odds go down after a series of games with a hit. But it's also true that when the model spits out 0-fers for his first 10 games, his odds of a hit go up to 82.3% for the remaining 130 starts. So it's a wash.

You're thinking prospectively: Santiago's chances of a hit are (we assume) the same regardless of what happened yesterday. But the model works retrospectively. The randomness is already "baked in" in terms of how many 0-hit, 1-hit, etc. games a player had. The question is whether those games are distributed randomly or non-randomly, and I think the model does test that.

However, to the extent the results appear non-random -- which is mainly more 30+ hit streaks than expected -- I still think that reflects a non-random distribution of opportunities (ABs) rather than hitter performance. Even a very subtle tendency to group high-AB games together has a big impact. For example, take a hitter who averages 4.2 ABs over 30 games, with 4 3-ABs games, 16 @ 4 ABs, and 10 @ 5 ABs. Change that to 24 4-AB games and 6 5-AB games -- same 4.2 mean -- and his chance of a 30-game streak increases 25%.

I think McCotter could figure out whether this is true (though it might not be easy). One option is to see what proportion of 20-game streaks in the real data and the model results, using starts only, create good conditions for a streak: something like mean PA > 4.5, 20 G >= 3PA, 18 B >= 4PS. My guess is there are more real than modeled stretches of this kind. More ambitiously, he could look at all .300+ hitter seasons, and estimate the probability of a hit streak for every 20-game string based on the sequence of ABs, and again compare the actual data to his model results.
At Wednesday, February 11, 2009 10:11:00 AM, Anonymous said...: Oops: last graf should say "mean PA > 4.5, 20 G >= 3PA, 18 G >= 4PA."
At Thursday, February 12, 2009 5:25:00 AM, Anonymous said...: Guy,
I don't understand the relevance of your prospective/retrospective distinction. One thing which is "retrospective" about McCotter's model is that he computes probabilities using actual game lines, rather than building simulated game lines. But once you have taken that step, you could still use a coin flip model, or a random sample with replacement model, to simulate the stringing together of games. McCotter's random sort/permute model amounts to sampling without replacement when looking for streaks. It is guaranteed to add up to the real overall season totals in every test, but so what?

You also argued:
Yes, Santiago's odds go down after a series of games with a hit. But it's also true that when the model spits out 0-fers for his first 10 games, his odds of a hit go up to 82.3% for the remaining 130 starts. So it's a wash.
I'm not sure how to construe this specific example into a general argument. Regardless of what happened before and therefore what the starting probability for game 1 of the streak is, in McCotter's model the probability of extending a streak keeps going down as you lengthen the streak. I assume you are saying something like 'if I'm having a 30 game streak, I more likely got there in the first place by using up a lot of my hitless games beforehand.' Apart from the special case of the very start of a season, we can infer a hitting streak is preceded by one hitless game, and otherwise we are ignorant of what preceded that single hitless game. (It's random to us.) When comparing sampling without replacement to a coin flip model, the probability of extending a hitting streak from x to x+1 games is a wash at a specific value of x, not generally for hitting streaks of any length. For Santiago, a .300 hitter in 1987, it's a wash at about 8 games. For the average hitter in McCotter's dataset, the break-even point would occur earlier. Shorter streaks are more likely, longer streaks are less likely. A 27 game streak for Santiago is about twice as likely when using a coin flip model as when using McCotter-style sampling without replacement [his starts-only model]. And that is roughly the same result McCotter has found when comparing reality to his model across all players. Reality is twice as likely to have 30 game streaks.

Translate my argument to this: What is your chance of drawing a King from a standard deck of cards 4 times in a row? if each draw is independent, it's 4/52 *4/52 *4/52 * 4/52. If the cards are used up (not replaced) and you don't know anything about other already discarded cards, then the probability is 4/52 *3/51 *2/50 *1/49. In the case of drawing Kings, the right model is the one which actually models the game -are the cards replaced or not? Just so for hitting streaks - sampling without replacement is not a realistic model. Rather than just "baking in randomness", it also bakes in an unrealistic constraint. Thus it substantially underpredicts long streaks, and mildly overpredicts very short ones.
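
The two card calculations can be written out exactly with a small sketch using Python's `fractions` module. The function name `kings_in_a_row` and its parameters are made up for illustration.

```python
from fractions import Fraction

def kings_in_a_row(draws, replace, kings=4, deck=52):
    """Exact probability that every one of `draws` cards is a King."""
    prob = Fraction(1)
    for i in range(draws):
        if replace:
            # The card goes back, so every draw sees the full deck.
            prob *= Fraction(kings, deck)
        else:
            # The card is used up: one fewer King and one fewer card each time.
            prob *= Fraction(kings - i, deck - i)
    return prob

with_replacement = kings_in_a_row(4, replace=True)
without_replacement = kings_in_a_row(4, replace=False)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
print("ratio:", float(with_replacement / without_replacement))
```

The ratio printed on the last line shows how much the "used up" constraint matters for a run of four, which is the effect being argued about for long hitting streaks.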
At Thursday, February 12, 2009 11:32:00 AM, Anonymous said...: Joe: Perhaps we'll just have to agree to disagree here. If the model had the flaw you describe, I think we'd see much larger disparities in 20- and 30-game streaks. And the model does not over-predict 5-game streaks, as you say it should.

Look at it this way: Suppose you ran a large number of 150-game simulations using sampling with replacement, as you recommend, and tally the 20-game streaks. Now, take each of those 150-game trials and randomize the sequence of games within that trial many times (so Trial #40 always has 100 hit games, #41 always has 112, etc.). You should get the same frequency of 20-game hit streaks, right? The second round of randomization hasn't added anything.

That's all McCotter has done, but he's used reality as the first set of trials. (Which is a whole lot easier than trying to figure out the true BA talent of every player.) If those "real" trials were a random walk, they should yield the same # of streaks as you get when you randomize the order of games (the second randomization is essentially redundant). But if you get fewer streaks after randomization (or more), it means the real data was not actually a random walk.
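
This argument can be checked with a small Monte Carlo sketch. It assumes a fixed 76% chance of a hit in each of 150 games, generates seasons by independent draws, then shuffles each season once and compares how often a 20-game streak shows up before and after shuffling. The parameter values, the seed, and the function names are illustrative assumptions, and this is not McCotter's actual code.

```python
import random

def longest_streak(games):
    """Length of the longest run of True values (games with a hit)."""
    best = current = 0
    for hit in games:
        current = current + 1 if hit else 0
        best = max(best, current)
    return best

def streak_rates(p=0.76, n_games=150, streak=20, trials=2000, seed=1):
    """Share of seasons containing a streak, before and after shuffling."""
    rng = random.Random(seed)
    original = shuffled = 0
    for _ in range(trials):
        # First randomization: independent games (sampling with replacement).
        season = [rng.random() < p for _ in range(n_games)]
        original += longest_streak(season) >= streak
        # Second randomization: permute the same games, keeping the hit total.
        rng.shuffle(season)
        shuffled += longest_streak(season) >= streak
    return original / trials, shuffled / trials

orig_rate, shuffled_rate = streak_rates()
print(f"seasons with a 20-game streak, as generated: {orig_rate:.3f}")
print(f"seasons with a 20-game streak, after shuffle: {shuffled_rate:.3f}")
```

If the two rates differ only by sampling noise, the second randomization has added nothing, which is the claim being made here about permuting real seasons.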

* *

There is one way in which I think this model could over-predict streaks, and thus underestimate the hot hand effect. Suppose you believe that players' true talent is highly variable over short periods of time, for example that Joe DiMaggio really was a .357 hitter in 1941 and truly became a .400 hitter during the final 20 games of his streak (or whatever). If so, DiMaggio's high number of games with a hit in that season already reflects, in part, his hot hand that year. But McCotter's method takes the season line as a given, and so only tells us if DiMaggio's streak is unlikely assuming that high level of performance. Now, as far as I know, everyone who has studied this finds that players' seasonal variation is consistent with the coin-flip model. But it's hard to disprove the idea that players' true talent suddenly changes, because the variance in season performances would only be slightly larger than a coin-flip model predicts (and hard to distinguish from real variance caused by injuries, parks, quality of opposing pitcher, etc. etc.). Which is why I suppose some people will always believe in the hot hand....
At Friday, February 13, 2009 9:47:00 AM, Anonymous said...: Phil: No thoughts on this paper? Seems like your cup of tea.
At Friday, February 13, 2009 10:52:00 PM, Phil Birnbaum said...: Hey, Guy,

OK, a few thoughts ...

1. I agree with you that the permutation method is valid. I'm not sure I understand 100% what Joe is saying, but still.

2. I agree with you, again, that the fact that the discrepancy goes down when you consider only games started is strong evidence that AB are a large part of the cause.

3. If, as Joe points out with the Santiago example, players get removed late in the streak after getting their hit, then any analysis of streaks in terms of AB has to take that into account. Late in the streak, a 3 AB game is really as valuable as a 4 AB game, because 3 AB would only occur if the 3 AB included a hit.

4. I like the "less likely to walk" explanation a lot -- the one that says perhaps players get more AB because they are less inclined to take pitches, especially late in the game when they haven't got their hit yet.

5. Maybe the "change in quality" factor isn't completely out of the question. If a player improves between age 25 and 26, why wouldn't he improve between age 25.25 and 25.75? Just a thought, and it probably wouldn't affect the results that much if, for instance, the guy was a .300 hitter for half the year and then a .310 hitter for the other half. (Or would it?)

6. As Doc says, you can actually check some of these hypotheses using play-by-play data. It wouldn't be that hard, would it?

7. Here's a way to check the magnitude of the effect. 20-game streaks happened about 30% more than expected. So, figure out the theoretical chance of a .290 hitter getting a 20-game streak. What does he have to hit to get 30% more such streaks? If it's .292, the "change in talent" hypothesis is reasonable. If it's .320, it's not.

Then go further: how many extra AB per game does he have to have in order to get 30% more such streaks? How often would a friendly official scorer have to call an error a hit to increase the chance by 30%? (Probably too much for this to be the cause.) How many more strikes would the opposition pitchers have to throw to increase the chance by 30%? Is that number reasonably consistent with the hypothesis that there's an unwritten code to avoid walking an 0-for-3 streaker? (Indeed, are "too many" streaks kept alive in a fourth plate appearance?)

These questions can't be too hard to answer.
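
As a rough sketch of the first of these checks: assume exactly 4 AB in every game and independent ABs (both simplifications, not figures from the study), and solve by bisection for the average a .290 hitter would need in order to have a 30% better chance of hitting in 20 straight games. The function names are illustrative.

```python
def streak_prob(avg, ab_per_game=4, games=20):
    """Chance of a hit in every one of `games` games, with independent ABs."""
    return (1 - (1 - avg) ** ab_per_game) ** games

def avg_for_boost(base_avg, boost, hi=0.999, tol=1e-7):
    """Batting average whose streak chance is `boost` times the baseline's."""
    target = streak_prob(base_avg) * boost
    lo = base_avg
    # streak_prob rises with avg, so bisection closes in on the target.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if streak_prob(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

needed = avg_for_boost(0.290, 1.30)
print(f"average needed for 30% more 20-game streaks: {needed:.4f}")
```

The same bisection could be pointed at `ab_per_game` instead of `avg` to ask how many extra AB per game would produce the same 30% increase.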

P.S. Disclaimer: I was one of SABR's anonymous peer reviewers for this paper.
At Saturday, February 14, 2009 9:31:00 AM, Anonymous said...: Phil: I agree that a lot of these theories could be tested. But you'll face some pretty serious sample size barriers. Let's say you think players and managers start to alter their decisions after a streak reaches 15 games. You could look at whether players' BB-rate drops and frequency of BIP increases in games 16 and beyond. But the large majority of these streaks will end within the next 4 games, so you don't have a lot to work with. Same thing with ABs.

It would be interesting to see how much players' AB/G increases once they have a 10- or 15-game streak. My guess is that you'd find the 7% increase in ABs applies nearly as much to the early games (when managers can't yet be influenced by the streak) as the later games.

Question: how do you calculate a 30% disparity for 20+-game streaks?
At Saturday, February 14, 2009 9:57:00 AM, Phil Birnbaum said...: On the sample size issues: there were 75 20-game hit streaks. If you start looking at the 21st game, and the average 20-game streak continues for two more games, then you have a 150-game sample. It's not huge, but shouldn't it be enough?

You may be right that there's no difference after the 20th game.

I don't understand your last question ... I read the 30% figure off the chart on page 6 of the study.
At Saturday, February 14, 2009 11:17:00 AM, Anonymous said...: With that sample size, you would need to have a change in BB rate of about .02 BB/PA to be statistically significant, about a 20% drop for these kinds of hitters. So you'd need to see a pretty dramatic change. Not sure how you would estimate the expected variance in AB/G.
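The .02 figure can be roughly reproduced with a back-of-the-envelope calculation. This sketch is not the commenter's own arithmetic. It assumes about 4.3 PA per game, a baseline walk rate of .10 BB/PA (which is what ".02" being "about a 20% drop" implies), and a one-sample normal approximation at the two-sided 5% level.

```python
import math


def min_detectable_change(base_rate: float, n_pa: float, z: float = 1.96) -> float:
    """Smallest change from a known base rate that n_pa trials would flag.

    One-sample normal approximation: z standard errors of a binomial proportion.
    """
    standard_error = math.sqrt(base_rate * (1.0 - base_rate) / n_pa)
    return z * standard_error


# From the thread: 75 streaks, each lasting about two more games.
GAMES = 75 * 2
PA_PER_GAME = 4.3   # assumption, not from the thread
BASE_BB_RATE = 0.10  # assumption implied by ".02 ... about a 20% drop"

threshold = min_detectable_change(BASE_BB_RATE, GAMES * PA_PER_GAME)

if __name__ == "__main__":
    print(f"Change in BB/PA needed for significance: about {threshold:.3f}")
```

Under these assumptions the threshold lands in the neighborhood of the .02 BB/PA quoted above; a different PA-per-game or baseline walk rate would shift it somewhat.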

I see: the 30% gap is using the original model. It shrinks to 6% if you use the starter-only model. (Although, I think the author should determine the number of real streaks excluding non-starts to make that comparison exact, and it doesn't appear he did that.)

FYI: Tango et al. looked at hitter performance after short hitting streaks (5 or 7 games) in The Book, and found a very small improvement. But it was small enough that weather and park could explain it. Still, even a very small hot hand effect could show up in the 30-game frequencies.
At Saturday, February 14, 2009 11:48:00 AM, Phil Birnbaum said...: I'm not too worried about statistical significance ... some evidence is better than no evidence. If you get baseball significance but not statistical significance, that's still pretty good.

And, besides, I wouldn't be at all surprised to see a 20% drop in BB in the 3rd and 4th PA of a (so far) hitless game.

Agreed, there might be an actual small hot hand effect happening, for reasons you discussed. It does seem strange to think that ability can change from year to year, but not from month to month.
At Monday, February 16, 2009 9:59:00 AM, Anonymous said...: After running a lengthy simulation of my own, I must admit that I was in error - random sorting does not distort the probabilities.
At Tuesday, February 17, 2009 5:19:00 PM, Anonymous said...: Joe: Nice of you to report back, especially since the results weren't what you expected. Much appreciated....
At Friday, March 13, 2009 7:37:00 PM, Anonymous said...: I'm glad you guys are talking about the article. I was the one who wrote it (I just now came across this blog).

Here are a few points that might clarify:

1) I looked only at single-season streaks, so 'wrap-around streaks' (which I actually think are just as valid as single-season ones) aren't included in either the permutations or the 'actual' streak totals.

2) I removed all 0-for-0's, EXCEPT the 0-for-0's with sac. flies, so those aren't influencing anything.

3) My real purpose was to undermine the standard method of computing probabilities, which relies on the players' final season totals (150 games, 600 AB, etc.) and requires the independence assumption in order to come up with probabilities that make sense. I think that the random permutation method gets rid of the independence part (almost a de facto thing, with the way that the test works), but of course not many casual fans care about independence tests. So I included all of the 'explanations' that I could think of, to try to get people thinking about why we see this effect.
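For readers unfamiliar with the random permutation idea in point 3, here is a minimal sketch of the general technique. It is not the code used for the article. It assumes a single season's game log has already been reduced to a list of True/False values (hit or no hit), with the 0-for-0 games from point 2 already removed, and it counts a streak as any maximal run of at least `min_len` hit-games. The function names are invented for the example.

```python
import random


def count_streaks(games: list[bool], min_len: int) -> int:
    """Count maximal runs of hit-games that are at least min_len long."""
    count = 0
    run = 0
    for got_hit in games:
        if got_hit:
            run += 1
        else:
            if run >= min_len:
                count += 1
            run = 0
    if run >= min_len:  # a run that reaches the end of the season
        count += 1
    return count


def expected_streaks_by_permutation(
    games: list[bool], min_len: int, trials: int = 10_000, seed: int = 0
) -> float:
    """Average streak count over random reorderings of the same game log."""
    rng = random.Random(seed)
    shuffled = list(games)
    total = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        total += count_streaks(shuffled, min_len)
    return total / trials
```

Comparing `count_streaks` on the real game order with `expected_streaks_by_permutation` on the same log is the kind of actual-versus-shuffled comparison being discussed; summing over many player-seasons would be needed before reading anything into the gap.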

I agree that a lot of the results can probably be explained by combinations of at-bat effects, but I really don't know how we can test it.

One last thing: one of the reasons that I tended to disregard the ballpark effect was that there have been so few long hitting streaks at particular stadiums (like Ty Cobb hitting in 48 consec. games at the Polo Grounds over 1913-1918). Naturally, we expect fewer 30g streaks at a particular stadium than vs. the league as a whole, since each stretch of games at a stadium is like a short season that doesn't allow wrap-around.

But there have been about 20 streaks of 30g since 1957 vs. the league as a whole, but only four or five 30g streaks at a particular stadium. If stadiums are so conducive to hitting streaks, it seems like we'd see more long streaks at particular stadiums (esp. home stadiums) than we actually have. Not sure why they're so rare. But I figure it's worth something.

--Trent McCotter
At Friday, March 13, 2009 7:44:00 PM, Anonymous said...: Oops, I had it switched in my mind. There have been about as many 30g streaks at particular stadiums as there have been vs. the league as a whole.

The kind of streaks that are so rare are 30g streaks vs. a particular opponent. There have only been five streaks of 30+g vs. one particular team, since 1956:

44: Vladimir Guerrero, vs. TEX AL, 4/09/2004-8/04/2006.
35: Ken Griffey Jr., vs. CLE AL, 5/24/1992-8/06/1996.
34: George Bell, vs. CLE AL, 6/01/1985-8/08/1987.
32: Paul Molitor, vs. CHI AL, 5/01/1993-9/20/1996.
32: Dave Parker, vs. CIN NL, 7/09/1976(G1)-5/13/1979.

...which I used as a short-cut to give less weight to the idea that facing BAD TEAMS could help the chances of long streaks overall.
