### Chipper Jones' chance for .400

Over at Baseball Prospectus, Nate Silver gives us an estimate of the chance that Chipper Jones will wind up hitting .400 this year. (Subscription required, I think.)

Silver starts by estimating that, before the season started, our best estimate for Chipper's talent was a normal curve centered around .310. But, now that he's hitting .419 so far this year (or whatever it was when Silver wrote his analysis), the best estimate is now a normal curve centered at .345 or so.

Then, it's straightforward: figure the chance that Chipper is a .300 hitter, and multiply it by the chance that, as a .300 hitter, he'll hit well enough the rest of the way to finish above .400. Repeat for .301, .302, and so on. Add up all those numbers and you have his probability.
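To make that sum concrete, here is a minimal Python sketch. The inputs are my assumptions, not Silver's: 92 hits in 219 AB so far (about .420), 300 AB still to come, and a standard deviation of 20 points on the talent curve. The function names are hypothetical, and walks, sac flies and injuries are ignored.

```python
from math import comb, exp

def hits_needed(hits, ab, ab_left, target_thousandths=400):
    """Additional hits required to finish at or above the target average."""
    total_ab = ab + ab_left
    # Integer ceiling of target * total_ab, avoiding float rounding.
    total_hits = -(-target_thousandths * total_ab // 1000)
    return total_hits - hits

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(max(k, 0), n + 1))

def chance_of_400(mean, sd, hits, ab, ab_left):
    """Weight each talent level by a normal curve, then add up the chances."""
    need = hits_needed(hits, ab, ab_left)
    talents = [t / 1000 for t in range(200, 501)]   # .200, .201, ..., .500
    weights = [exp(-0.5 * ((t - mean) / sd) ** 2) for t in talents]
    total = sum(weights)
    return sum(w / total * binom_tail(ab_left, need, t)
               for w, t in zip(weights, talents))

# Assumed inputs: talent centered at .345 with an sd of .020 (a guess).
print(chance_of_400(0.345, 0.020, 92, 219, 300))
```

Changing the assumed center or spread by a few points moves the answer a lot, so treat the output as an illustration of the method rather than an estimate.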

Actually, Silver didn't quite do it that way: he did it by simulation, 1000 repetitions of picking a random talent, then playing out the season. I think a thousand reps isn't very many to get a precise estimate, but it should be unbiased, at least.
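Here is a rough sketch of that kind of simulation, along with the sampling error you'd expect from it. It is not Silver's actual model (his included DL time, for one thing). The 92-for-219 starting point, the 300 remaining AB, and the 20-point spread in talent are all my assumptions.

```python
import random
from math import sqrt

def simulate_400(mean, sd, hits, ab, ab_left, reps, seed=0):
    """Fraction of simulated seasons that finish at .400 or better."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(reps):
        talent = rng.gauss(mean, sd)            # pick a random true talent
        new_hits = sum(rng.random() < talent for _ in range(ab_left))
        # Integer comparison: (hits / AB) >= .400
        if 1000 * (hits + new_hits) >= 400 * (ab + ab_left):
            successes += 1
    return successes / reps

def standard_error(p, reps):
    """Standard error of a simulated proportion p based on `reps` trials."""
    return sqrt(p * (1 - p) / reps)

print(simulate_400(0.345, 0.020, 92, 219, 300, reps=1000))
print(standard_error(0.125, 1000))
```

With a true probability around 12.5%, the standard error from 1000 reps is about one percentage point, so a reported "12-13%" could plausibly be anywhere from about 10% to 15% on sampling noise alone.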

Silver comes up with a probability of 12-13%.

However, there are a couple of problems with the analysis, which Tom Tango nails over at his blog.

First, the estimate of Chipper's talent shouldn't be normal. It should be biased towards the left. That is, even if your best guess is that Chipper is .345, you should give him a much better chance to be .335 than .355. Silver makes them equal. [UPDATE: Tango's comment, and further reflection, have convinced me that this criticism is not correct, and that Silver's symmetrical distribution is in fact correct. See the comments.]

Second, bumping Chipper from .310 to .345 on the basis of a couple of hundred at-bats is too big a jump. His talent almost certainly should be centered lower than .345.

As Tango points out, when you're looking at low-probability events, a small change in assumptions makes a huge change in probabilities. And so Silver's probability estimate is probably too high – much too high.


## 7 Comments:

Also at Tango's blog, Guy makes a good point that Chipper's PA this year have (so far) come against below-average pitchers. That should bring our estimate of his talent down even more. Throw in the fact that age is working against him (he's 36), and that brings it down even further.

----

On the same subject, Carl Bialik quotes David Pinto as saying that Chipper is a 1 in 600 shot if he's a .310 hitter. And it's only a 1 in 60 chance if he's a .331 hitter.

I just want to point out that when I did the Bayes analysis, it did come out as a symmetrical distribution. It's contrary to my initial instinct, but makes perfect sense. If you have a normal distribution of true rates, the only guys who can be from the far right are the really good ones. A whole ton of players can be those just to the right of center. So, when you do the Bayes analysis, you end up with the result that a hitter with a great sample rate is drawn from a distribution centered at a point far to the right, and that distribution is symmetrical. A bit weird, but you can try it out for yourself.
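For anyone who wants to try it, here is a minimal grid version of that Bayes calculation in Python. The numbers are assumptions for illustration only: a normal prior centered at .310 with a 15-point standard deviation, and an observed 92 hits in 219 AB. The function names are hypothetical.

```python
from math import exp, sqrt

def posterior_on_grid(prior_mean, prior_sd, hits, ab):
    """Posterior over true batting average: normal prior times binomial likelihood."""
    grid = [t / 1000 for t in range(150, 551)]      # .150 through .550
    weights = []
    for p in grid:
        prior = exp(-0.5 * ((p - prior_mean) / prior_sd) ** 2)
        likelihood = p ** hits * (1 - p) ** (ab - hits)
        weights.append(prior * likelihood)
    total = sum(weights)
    return grid, [w / total for w in weights]

def mean_sd_skew(grid, probs):
    """Mean, standard deviation, and skewness of a discrete distribution."""
    mean = sum(p * w for p, w in zip(grid, probs))
    var = sum((p - mean) ** 2 * w for p, w in zip(grid, probs))
    skew = sum((p - mean) ** 3 * w for p, w in zip(grid, probs)) / var ** 1.5
    return mean, sqrt(var), skew

grid, probs = posterior_on_grid(0.310, 0.015, 92, 219)
print(mean_sd_skew(grid, probs))
```

A skewness near zero means the posterior is close to symmetrical around its center. Where that center lands depends heavily on the assumed prior spread, which is exactly the sensitivity the post is worried about.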

In the last 4 months last year, Jones batted .354. Just doing a cursory look at his Retrosheet stats, that comes pretty close to what his best 4 month period (within one season) might have been anytime in his career. But it was over only 353 ABs.

Anyway, combining the last 4 months from last year with this year, he has batted .378 over his last 580 ABs. And his yearly AVGs have been going up lately. Starting with 2004, here are his averages, with his age in parentheses:

.248 (32)

.296 (33)

.324 (34)

.337 (35)

This does not seem like a normal aging/performance pattern. How unusual, I don't know. But I just wonder if something is going on or changing with this guy that makes it really hard to know his true ability.

While I doubt Chipper Jones will hit .400 it is impressive to watch him play. He is one of the best switch hitters ever to play the game and he does remind me of Mickey Mantle (his hero). Jones is no doubt a Hall of Famer.

Tango, yeah, now that you say it, I think you're right. The point is that what matters is not the distribution of all players but the distribution of *that* player, which, after regressing to the mean, is symmetrical.

That is: if Chipper hits .340, he might be a .310 hitter who got lucky or a .370 hitter who was unlucky. The probability of hitting exactly .340 in either case is equal, but there are so many more .310 hitters than .370 hitters that you have to conclude Chipper is closer to .310.

But, once you have adjusted Chipper (say, to .320), the argument doesn't hold any more. Yes, you can point out that there are many more .290 hitters than .350, but, now, Chipper would have been much more likely to hit .340 *if he actually were a .350 hitter*. The lower probability of him being a .350 hitter by the distribution of hitters is cancelled out by the higher probability of him being a .350 hitter by virtue of the fact that he did in fact hit .340. So the distribution can wind up symmetrical.

I'll correct the post.

There's one aspect of Silver's simulation which I haven't seen mentioned much, and that is his rather aggressive assumptions about the possibility and effects of injury.

At the time Silver's article was written, I think Chipper had 257 PA, meaning that he could qualify as a .400 hitter with as few as 245 more PA. At an average rate of just over 4 PA per team game, he'd need to stay healthy about 60 more games. The "best case" scenario for Jones to hit .400 for the season was to sustain a .380 or better average for maybe 205 more AB [and of course he also could get lucky and get a greater proportion of walks and sac flies and have even fewer at bats]. A true .348 hitter would have about an 18% chance to hit that well in 205 AB (7% chance for a .330 hitter). If he got 250 more AB the rest of the season, he'd have to average about .384 (.348 hitter: 13% chance; .330 hitter: 4% chance). If he got 300 more AB, he'd have to average .387 (.348 hitter: 9% chance; .330 hitter: 2% chance). If he got 350 more AB, he'd have to average .389 (6% and 1% chances).
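Those required averages and chances can be checked with a short Python sketch using the exact binomial rather than a simulation. It assumes Jones stood at 92 hits in 219 AB, which is consistent with the .420 in 219 AB quoted elsewhere in this thread, and it treats every AB as an independent trial at a fixed true average. The function names are hypothetical.

```python
from math import comb

def hits_needed(hits, ab, ab_left, target_thousandths=400):
    """Additional hits required to finish at or above the target average."""
    total_ab = ab + ab_left
    total_hits = -(-target_thousandths * total_ab // 1000)  # integer ceiling
    return total_hits - hits

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(max(k, 0), n + 1))

for ab_left in (205, 250, 300, 350):
    need = hits_needed(92, 219, ab_left)
    print(ab_left,
          need,
          round(need / ab_left, 3),                    # required average
          round(binom_tail(ab_left, need, 0.348), 3),  # chance for a .348 hitter
          round(binom_tail(ab_left, need, 0.330), 3))  # chance for a .330 hitter
```

The more AB he gets, the higher the average he has to sustain and the less likely luck alone carries him there, which is why the injury assumptions matter so much.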

Silver's simulation broke the rest of the season into 15 day segments and assumed that Jones has about a 13% chance of being on the DL for that segment, with no impact on his performance if he returned from the DL. (13% is his simple rate of time on the disabled list since 2003, according to Silver [I didn't attempt to verify]). Intuitively I doubt that players who are DL'd have no related performance decline, either before the DL stint (for a wear-and-tear type condition) or after (for a "catastrophic" injury).

Basically Silver's model gives Jones a good chance of being limited to 60, 75 or 90 team games the rest of the season and needing to sustain his performance for 300 or fewer ABs.

I guess staying very healthy for more than 2.5 months in 2008 hasn't affected the estimate for his ability to stay healthy in the same way that hitting very well has affected the estimate for his ability to hit. :-)

My probability estimates, by the way, were based on 100,000 season trials in a simulation.

Assuming no bias in difficulty of opportunities through weather or park effects or quality of opposition, there's a 1.6% chance a .348 hitter would hit .420 or better in 219 AB, as Jones had done at the time of Silver's simulation. There is a 0.3% chance for a .330 hitter to hit that well, and a 0.1% chance for a .320 hitter.
