Wednesday, June 26, 2013

Pitchers who never appeared on baseball cards

(Continued from last post)

Here are the pitchers (last MLB game 1999 or earlier) with the most career innings pitched who never appeared on a regular-issue Topps baseball card.  

(The first column in the table is actually the number of *thirds* of innings.  The year is the season of the player's last MLB game.)

IP*3   Year    Pitcher
---------------------------------------
1246   1998    Rich Robertson
1170   1988    Salome Barojas
1158   1999    Doug Johns
1095   1958    Jim McDonald
1036   1965    Marshall Bridges
 885   1998    Matt Beech
 851   1998    A.J. Sager
 838   1955    Fred Baczewski
 801   1966    Jim Duckworth
 725   1956    Dick Marlowe
 711   1983    Steve Baker
 691   1955    Al Corwin
 677   1999    Jim Pittsley
 636   1997    Vaughn Eshelman
 608   1987    Keith Creel

Marshall Bridges and Jim Duckworth are the two with no major issue cards at all.  Bridges pitched 345 1/3 innings in his career, Duckworth 267 innings.

Barojas and Creel both appeared in Fleer and Donruss.  Baker appeared in the 1983 Topps Traded set.  The 1950s players appeared in Bowman.  The 90s players appeared in one of the many other sets around at the time.

Here's a brief online conversation I found about why there was never a card for Bridges.


(Update, 6/28: Removed Randy St. Claire from the list; he shouldn't have been there.  Thanks to Don for letting me know in the comments.)



(Update, 2/4/14: Removed William Van Landingham, he appeared on a 1995 Topps card.  Thanks to an anonymous commenter.)


Hitters who never appeared on baseball cards

Last post, I gave the answer and asked you to give me the question.  The answer was:

For hitters: Dave Schneck (followed by Loren Babe).

For pitchers: Marshall Bridges (followed by Jim Duckworth).

-----

The question is:


"What player had the most career AB [innings pitched] without ever appearing on a major issue baseball card?"

-----

The answer depends on what you consider "major issue".  Tony Horton retired in 1970, with 2228 career at-bats, after stress issues and a suicide attempt.  He never appeared on a Topps card, but he did appear on a 1971 Kellogg's card.  If you don't want to count Kellogg's as a major set, then Horton is the answer.  I'm not sure why he didn't appear on a Topps card ... a friend suggested that maybe he refused because he didn't want the attention.  Anybody know?

I also ignored cards produced well after the player's career ended, including  cards where he appeared as a manager.

And, finally, I counted only AB accumulated in 1952 or later, and I eliminated players still active in 2010.

-----

Here's the list of hitters with at least 250 AB but no card.  The year in the chart is the season of the player's last game in the major leagues.

 AB    Year    Player
----------------------------
413    1974    Dave Schneck
382    1953    Loren Babe
298    1970    Van Kelly
274    1966    Ernie Fazio
257    1961    Joe Altobelli
255    1957    Jack Littrell

There are no players after 1974.  That's probably because of the explosion in card sets that started in 1981, after Fleer and Donruss ended Topps' monopoly.  By the 1990s, there were dozens of sets.  

Topps also had competition from Bowman from 1950 to 1955, but that didn't seem to stop Jack Littrell and Loren Babe from making the list.

----

The way I figured this out ... I copied Topps checklists off the web (thanks to this site) for every year (Topps regular sets only).  Then, I tried to cross-reference the player names to the Lahman database.  I fixed as many problems as I could -- different name spellings, Bob vs. Robert, "Vandeberg" vs. "Vande Berg", too many Greg Harrises, stuff like that.

Then, I eliminated everyone on the list who played at least one game in 2000 or later ... I figure there are so many sets these days, that I probably wouldn't find any candidates from this century.  When I get less lazy, I'll confirm that.  It seems not unreasonable, considering there were no players even from the 1980s or 1990s who qualified.

Then, I sorted the non-Toppsed batters by AB.  There were still some uncaught false positives that I just crossed out.  Finally, I checked every remaining player against an online card database, to make sure.
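For anyone who wants to try something similar, here's a rough sketch of the cross-referencing step.  It's not my actual program.  The record layout, the `normalize` helper, and the "Placeholder" player are all made up for illustration; the only real numbers are the ones from this post.  It handles the "Vandeberg" vs. "Vande Berg" kind of mismatch, but not Bob vs. Robert, which still needs fixing by hand.

```python
import re

def normalize(name):
    """Lowercase and keep letters only, so 'Vande Berg' matches 'Vandeberg'."""
    return re.sub(r"[^a-z]", "", name.lower())

def uncarded(players, checklist_names, last_year_cutoff=1999, min_ab=250):
    """Players with no checklist match, sorted by career AB (high to low)."""
    carded = {normalize(n) for n in checklist_names}
    hits = [
        p for p in players
        if normalize(p["name"]) not in carded
        and p["last_year"] <= last_year_cutoff
        and p["ab"] >= min_ab
    ]
    return sorted(hits, key=lambda p: p["ab"], reverse=True)

# Toy inputs.  The first three rows use figures quoted in this post;
# the "Placeholder" row is invented to show the name-matching step.
players = [
    {"name": "Loren Babe", "ab": 382, "last_year": 1953},
    {"name": "Dave Schneck", "ab": 413, "last_year": 1974},
    {"name": "Sal Fasano", "ab": 1109, "last_year": 2008},
    {"name": "Placeholder Vande Berg", "ab": 300, "last_year": 1980},
]
checklist = ["Placeholder Vandeberg"]  # hypothetical checklist spelling

for p in uncarded(players, checklist):
    print(p["ab"], p["last_year"], p["name"])
```

Fasano drops out because of the 1999 cutoff, and the placeholder drops out because his name matches the checklist once the space is removed.  Anything that survives a filter like this still has to be checked by hand.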

For the record, here's the list of batters without a Topps regular set card who played their last game before 2000:

 AB     Year    Player
------------------------------
2228    1970    Tony Horton
1160    1955    Tom Umphlett
 625    1957    Mel Clark
 579    1956    Wayne Belardi
 564    1999    Manny Martinez
 501    1999    J.R. Phillips
 482    1998    Damon Mashore
 423    1998    Jesus Tavarez
 413    1974    Dave Schneck
 382    1953    Loren Babe
 381    1960    Billy Shantz
 355    1997    Sherman Obando
 336    1999    Dave Silvestri
 319    1999    Bobby Hughes
 317    1998    Rico Rossy
 313    1978    Art Kusnyer *
 307    1976    Jim Cox *
 305    1998    Andy Tomberlin
 301    1999    Ed Giovanola
 298    1970    Van Kelly
 274    1966    Ernie Fazio
 269    1999    Matt Luke
 258    1998    Frank Bolick
 257    1961    Joe Altobelli
 255    1957    Jack Littrell
 254    1992    Kevin Ward
 253    1956    Rudy Regalado
 252    1997    Tilson Brito

Art Kusnyer and Jim Cox were each on one Topps "Future Stars" card (1972 and 1974, respectively), but my programming didn't pick it up because of how the checklists were written.  So, maybe, think of this as players not having a Topps card *to themselves*.

The unbolded players from the 1950s all had Bowman cards.  The 1990s players all had multiple cards from one of the many, many other sets of the era.

Except Manny Martinez, who is fifth on the list.  Martinez came *very* close to qualifying for the overall "title".  He had only one "major set" card that I could find -- this one, from 2000.  Why only one appearance when so many sets were produced?  I don't know.  Most of his AB came with the Expos in 1999, his last season.  Perhaps Topps had planned a card for him, but dropped it after he was released?  

Also, Topps' 1999 and 2000 sets were very small -- less than 500 cards each, compared to 660 in the 1970s and 792 in most of the 1980s.  But that still doesn't explain why none of the other sets picked him up.

------

While I'm here ... if I extend the cutoff past 1999, the list gets much bigger ... there are a lot of 21st-century players who didn't have a regular Topps card.  That's probably because of Topps' small set sizes in that era.  

In fact, there are eight recent hitters who would finish ahead of all but Tony Horton and Tom Umphlett on the all-time list.  Here they are.  I haven't checked for errors; there might be a false positive or two in the list.  

 AB     Year   Player
------------------------------
1109    2008   Sal Fasano
1085    2006   Lou Merloni
 779    2005   Jeff Liefer
 754    2000   Aaron Ledesma
 735    2008   Adam Melhuse
 713    2007   Josh Paul
 713    2004   Lou Collier
 636    2004   Robert Machado


------

I'll do the pitchers next.




Tuesday, June 25, 2013

Baseball trivia question

There's a baseball (trivia) question I was always curious about.  So, I decided to do a bit of research to try to figure out an answer.

The question is actually more interesting than the answer ... so, for now, I'll give you the answer, and see if you can figure out the question.  It has nothing to do with sabermetrics, and it has a pop culture aspect to it, so you won't get it just by looking at baseball statistics.

----

For hitters, the answer is: Dave Schneck (followed by Loren Babe).

For pitchers, the answer is: Marshall Bridges (followed by Jim Duckworth).

-----

Notes/Hints:

-- The question has to be slightly altered between pitchers/hitters.

-- It's nothing complicated. You can pretty much ask the question in 10 or 15 words.

-- My research may be bad.  But even if I got the wrong answer, you can probably still figure out the question, I think.

-- The timeframe is from the early 50s to now, which is a good hint.  

-- I'd be surprised if the answer changed in my lifetime.

-- Depending how you phrase the question, the answer for hitters *could* be, "Tony Horton, by a mile."  (Which, I think, could be another hint for some people on the pop culture side.)

-----

Answer next post ... by that time, I might have more results from my programming to answer some other related questions.





Thursday, June 20, 2013

Are eBay baseball card buyers racially biased?

Here's a nice recent post by Bo Rasny, "Ten Articles on Baseball Cards and Race".  Number ten is actually one of my own posts ... I'm going through the other nine.   Most of them are academic studies, some of which aren't available in full.

But this one is, a downloadable 2011 study called "Race Effects on eBay."  It finds that cards auctioned on eBay by black (African American) sellers sell for significantly less money than cards by white sellers.

Authors Ian Ayres, Mahzarin Banaji and Christine Jolls -- I'll call them "ABJ" -- bought 394 baseball cards on eBay in order to resell them.  Before posting their auctions, they split the cards into two groups, at random (actually, alternating alphabetically, but close enough).  The first group would ostensibly be posted by a black seller, and the second by a white seller.

How would the bidders know the race of the seller?  When posting, ABJ used a photograph of the card held in what was ostensibly the seller's hand.  Here's one of their example photos I'm stealing, just for fun (and because I love 1973 Topps, and because I keep seeing Don Sutton on Match Game):




When all was said and done, it turned out the white sellers' cards went for 14 or 20 percent more than the black sellers' cards, depending on how you look at it.  The result was significant at 2.2 SDs, and the authors conclude that eBay bidders discriminate by race.

It's kind of a fun study.  But, as usual, I'm dubious.  

--------

Suppose you want to find out whether white sellers can sell their car for more money than black sellers.  So you get a bunch of BMWs, and a bunch of Chevrolets, and assign them randomly to your test subjects.  It turns out the white sellers got higher prices than the black sellers.

But, what if the white sellers got more expensive cars to sell?  It could be that the white guys (and gals) wound up with 70 percent of the BMWs, and that might explain why they got the results they did.  If so, that explains the result without recourse to race, right?  It doesn't matter that the cars were assigned randomly, with the best random number generator money can buy.  Remember, even the best-designed study is going to give a false positive, by chance, 5 percent of the time.  This would be one of those times.  The fact that we see the intermediate result -- more BMWs for white guys -- lets us catch that false positive for what it is.  

Of course, if the cars were assigned randomly and *we never looked* at which races got which cars, we'd publish our study without hesitation.  It would still be a false positive, but we wouldn't know it.  

--------

Well, that's what happened with the baseball cards.  The authors showed the original purchase prices of the cards in the two groups.  One group sold cards that were originally purchased for an average of $9.82 each, and the other sold cards that were purchased for an average of $9.23 each.  And that difference was statistically significant at the 5 percent level! 

That is: one group got significantly "more expensive" cards to sell than the other group!  It's exactly like the BMW and Chevy example.

Except -- and here's the interesting part -- the difference went the *other way*.  It turned out that the white sellers had the "cheaper" cards, and the black sellers had the "more expensive" cards! 

In that case, it looks like that makes the effect *harder* to explain, not *easier*.  In fact, the effect is "double" what we thought it was.  Not only did the white sellers get more for their cards, but they did so even while selling less worthy cards in the first place!

The thing is, though, I'm not sure they *were* less worthy cards.  And that's the next part of the argument.  

--------

As you can imagine, when the authors auctioned off the cards they just bought, they didn't sell for the exact amount as the purchase price.  Sometimes they sold for more, and sometimes less.

But, mostly less.  A lot less.  It appears that, on average, the authors paid $9.75 for their cards, and sold them for around $6.15.  They took a 37 percent bath.

It seems that ABJ overpaid quite a bit for the cards they bought.  How is that possible, in an auction format?  On eBay, you never pay much more than the second-highest bidder.  Even if you bid $1000 ... if the next highest bid was $1, you'd wind up paying only $1.25.

It might be just because cards in this price range vary a lot in winning bids.  It's kind of random, depending on how many people happen to be interested at that particular time.  

For instance: the card in the picture, 1973 Topps Don Sutton #10, graded PSA 8.  I'm looking that up in eBay's completed auctions right now ... and I find two of them.  (Here's a link, but the results change over time.)

One sold at $7.50.  The other sold six weeks later at $13.49, almost double.  The first one had only two bids, while the second one had ten.  The first auction drew bids only on the last day.  The second one started drawing bids three days before the auction's close.

It's probably just a random thing -- the first auction had one serious bidder, while the second one happened to have two.

Checking out another card from the study, 1975 Topps #195 ... in PSA 8, there were three: $23, $33.57, and $45.  In PSA 7, they're more consistent, from $9.95 to $11.95.  

Finally, I checked 1983 Fleer Wade Boggs, #179, PSA 8 ... there are six, ranging from $1.04 to $7.50.  (The $1.04 has a "PD" downgrade, but, strangely, so does the $7.50!)

So: prices vary a lot.  And, I think, the authors' rules for when to bid amount to selective sampling on the expensive auctions.  

First: they chose only auctions with existing bids.  The auctions where the price gets bid up are the ones most likely to have bids early.  If ABJ chose to visit eBay on a random day, they'd be three times more likely to wind up in the second Don Sutton auction than the first.  

Second: ABJ chose only auctions where the existing high bid was between $3 and $8.  But bids don't reach close to their final level, usually, until close to the end of the bidding... sometimes, even, the last few minutes of a week-long auction.  That means the authors, again, selectively sampled those auctions where the price had been bid up early.

For instance: on those 1983 Wade Boggs cards, ordered by final selling price (low to high), the amount of time the bidding was over $3 was: 

never (the $1.04 card)
1 minute
36 hours
14 hours
9 hours
13 hours (the $7.50 card)

The weighted average of selling prices, by time over $3, is $4.80.  The average price overall was $4.01.  ABJ's strategy, in this case, would have them overpay by 20 percent.

ABJ acknowledge that they overpaid for the cards.  They write, " ... in purchasing the cards initially we did not exert significant effort to minimize our buying prices." 

It doesn't really matter that the selling prices were lower than the purchase prices ... so this isn't meant to be a criticism.  But the pattern of how the cards were bought is something that's going to factor into my coming argument.  

------

The authors sold the cards, overall, for about 60 percent of the prices they paid.  But, the relationship between purchase price and selling price is actually weaker than that.  In their regressions, the authors show that for every additional $1 they paid for the card, they received only an extra 40 cents.  That's the kind of relationship you'd expect when you overpay differently for some cards than others.  (For instance, if they paid the same for all the cards in a "grab bag" random purchase, the relationship would be zero.)

But, 40 cents is 40 cents, right?  The cards that cost more actually *did* bring in more, so that means the black/white effect is still compounded.  It's still true that the white sellers brought in more money while selling cards that should have brought in *less* money.  Not as much less as originally thought, but still less.  Right?

Well ... it's possible that there's something else going on.  It may sound a little contrived, but ... I think it might be right.  That is, I'm not just playing devil's advocate in trying to shoot down the finding: I actually think it's plausible.  

Suppose (to simplify things) that there are two types of cards sold on eBay.

Group 1 cards are "commodities".  There are lots of copies sold, and prices are pretty constant.  There are many "buy it now" copies available, so nobody bids prices up too high, because you can always just abandon the auction and buy the card at a fixed price.  

Group 2 cards are "obscure".  They're not actually obscure in the real world -- there are lots of them around -- but they're scarce on eBay, since they're low demand and harder to sell.  They don't come up for auction much.  Common cards, say, or semi-commons.  If you have one of those, you might just put it up on a fixed-price basis, since there aren't going to be throngs of people lining up to bid. 

You can't pay too much for "commodity" cards.  For "obscure" cards, though ... it's a crapshoot.  There are few bidders.  If you're the only bidder, you get it cheap.  But if there's competition, the card goes for more than it's "worth", because it's a card that doesn't come up often and you want for personal reasons, like to complete a set.  

That is: there is wider variance in prices received for "obscure" cards.  Sometimes they go cheap, and sometimes way expensive.  

In my experience selling on eBay, this actually happens.  I once sold a batch of graded commons, all from the same set, and, it turned out, some of them went for two or three times as much as others.  Of course, they were different players, but ... they were all classed as commons, and there was no obvious reason why one should go for that much more than another.  It just so happened that some cards had more interest than others, for what I think was just luck of the draw.  

As I argued, ABJ would selectively wind up in the "expensive" obscure card auctions, since they'd never be the only bidder, and they'd never bid until the price passed $3.  The "obscure" cards are probably the ones for which they overpaid the most.

Now -- and here's the weakest part of my argument -- maybe those are the cards they paid more for, on an *absolute* basis.  Maybe the scarce cards went for, say, $12 each, on average, and the commodity cards went for, say, $7 each, on average, just to pick numbers out of a hat.

In that case ... it's possible that the more expensive cards, overall, would actually be expected to return less on resale than the cheaper cards!  And, since the black sellers got more expensive cards, at a statistically significant level, maybe *that's* why they earned less.  Not racial bias, but worse cards.

-------

Here's a made-up example of how that might happen.  There are five commodity cards, where you pay $2, $4, $6, $8, and $10, and their resale value is 80% of what you paid.  And, there are five obscure cards, where you pay $8, $9, $10, $11, and $12, but those have a resale value of only $3 each.

The obscure cards cost you an average $10 each; the commodity cards cost you $6 each.

If you predict resale price from purchase price, using all ten cards in the regression, the coefficient is positive: for every additional $1 you pay, you get an extra 16 cents in resale.

But: suppose you give the black sellers the $4 and $8 commodity cards, and the $8, $11, and $12 scarce cards.  Now, they're selling cards bought for an average of $8.60.  The white sellers get the rest, which were bought for an average of $7.40.  

The black sellers receive $18.60 for their cards.  The white sellers receive $20.40.

The white sellers got higher prices than the black sellers, even though they sold the "cheaper" cards.  When you run a regression to include a dummy variable for race, you find that the coefficient for "black seller" is negative 57 cents.

That works -- it matches the direction of every effect in the actual study.  The black sellers receive less money for cards that actually cost more, with no race bias whatsoever.  What makes this work is that the *category* of cards is more important than the *purchase price*, and cards in the loser category tend to be the ones that cost more.
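If you want to check the arithmetic in the made-up example, here's a short sketch that rebuilds the ten cards and runs both regressions.  The variable names and the use of numpy's least-squares solver are my choices for illustration; the card prices and the assignment to sellers are exactly the ones described above.

```python
import numpy as np

# (purchase price, resale price) for the ten made-up cards
commodity = [(p, 0.8 * p) for p in (2, 4, 6, 8, 10)]   # resell at 80% of cost
obscure = [(p, 3.0) for p in (8, 9, 10, 11, 12)]       # resell at $3 each

# Assignment from the example: black sellers get the $4 and $8 commodity
# cards plus the $8, $11 and $12 obscure cards; white sellers get the rest.
black = [commodity[1], commodity[3], obscure[0], obscure[3], obscure[4]]
white = [commodity[0], commodity[2], commodity[4], obscure[1], obscure[2]]

def ols(columns, y):
    """Least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y))] + columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

cards = white + black
paid = np.array([c[0] for c in cards], dtype=float)
sold = np.array([c[1] for c in cards], dtype=float)
is_black = np.array([0.0] * len(white) + [1.0] * len(black))

slope_all = ols([paid], sold)[1]                 # about 0.16
black_coef = ols([paid, is_black], sold)[2]      # about -0.57

print("avg paid, black vs white:", paid[5:].mean(), paid[:5].mean())
print("total sold, black vs white:", sold[5:].sum(), sold[:5].sum())
print("extra resale per $1 paid:", round(slope_all, 2))
print("black-seller coefficient:", round(black_coef, 2))
```

Nothing in the setup knows about race.  The negative dummy comes entirely from which category of card each group happened to get.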

--------

Why do I think this is what happened?  Here's one thing.

The lowest selling price for any card was 99 cents, because that was what ABJ chose as the minimum bid.  If you look at the study's "Figure 3b", which shows selling prices, it turns out that white sellers had only 7 cards that sold at the minimum, while black sellers had 16 of them.  Now, that could be racial bias ... but it could also be that black sellers just happened to have more "obscure" cards, the overpriced ones, which are more likely to have only one bidder.

Why do I favor the "obscure" hypothesis over the "racial bias" one?  I find it much more plausible, in general.  The idea that eBayers are so racially biased that, for nine extra cards, *none* of the thousands of baseball card collectors on eBay wanted to bid $1.24 because the seller was black ... that seems way too extreme.  

But even if you don't agree with that ... there are probably other things it could be.  When the study itself shows that there was a significant difference between the two sets of cards, you have to at least suspect that there might be something else going on, don't you?

Look, suppose it turned out that the black sellers had wound up with mostly hockey cards, and white sellers with mostly baseball cards.  Wouldn't it be plausible that hockey cards resell for less because they're easier to overpay for?  Or, suppose that the black sellers had wound up with ungraded cards, and the white sellers with graded cards.  Again, couldn't the argument be that ungraded cards are easier to overpay for in an auction format?

Now, suppose that the black sellers wound up with cards that originally cost more, and white sellers with cards that originally cost less.  Which is what happened!  Isn't it plausible to argue that there might be something about the "expensive" cards that makes them actually harder to resell than the "cheap" cards?  Even if I haven't got that "something" exactly right?

--------

If we had the complete data, we'd have a way to test my hypothesis, that the price difference between the groups is causing the effect.  

You could take a random sample of 50 cards from the white group, and 50 cards from the black group.  If the black group still has more expensive cards, on average, throw it away and repeat.  Eventually, after several tries, you'll have a random sample with roughly equal card prices in both groups.  Test that new sample, based on the original eBay sales, and see if the results still hold.  

You can repeat that a few times, if you like, and see what happens.
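Here's a rough sketch of that resampling idea.  I don't have ABJ's card-by-card data, so everything below runs on fake cards generated on the spot.  The `balanced_subsample` function, the 10-cent tolerance for "roughly equal", and the fake price formula are all my own assumptions, not anything from the study.

```python
import random
from statistics import mean

def balanced_subsample(white, black, n=50, tol=0.10, max_tries=10_000, seed=1):
    """Draw n cards per group, retrying until the average purchase
    prices of the two samples are within `tol` dollars of each other.
    Each card is a (purchase_price, sale_price) tuple."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        w = rng.sample(white, n)
        b = rng.sample(black, n)
        gap = mean(c[0] for c in b) - mean(c[0] for c in w)
        if abs(gap) <= tol:
            return w, b
    raise RuntimeError("no balanced sample found; loosen tol or raise max_tries")

# Synthetic stand-in data: 197 cards per group, with the "black" group
# costing about 60 cents more on average.  There is NO race effect built
# into the fake sale prices.
data_rng = random.Random(2013)

def fake_card(extra_cost):
    paid = round(data_rng.uniform(3, 15) + extra_cost, 2)
    sold = round(max(0.99, 0.4 * paid + data_rng.uniform(0, 5)), 2)
    return (paid, sold)

white = [fake_card(0.0) for _ in range(197)]
black = [fake_card(0.6) for _ in range(197)]

w, b = balanced_subsample(white, black)
paid_gap = mean(c[0] for c in b) - mean(c[0] for c in w)
sold_gap = mean(c[1] for c in w) - mean(c[1] for c in b)
print("purchase-price gap in the balanced sample:", round(paid_gap, 2))
print("white-minus-black sale-price gap:", round(sold_gap, 2))
```

With the real data, you'd swap in the actual cards and see whether the white-minus-black gap in sale prices survives once the purchase prices are matched.  Changing `seed` gives you the "repeat a few times" part.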




Thursday, June 13, 2013

Eliminating stupidity is easier than creating brilliance

Is poker a game of pure luck, or is there skill involved too?  

One way to test, as Steve Levitt suggested, is to check if it's possible to lose on purpose.  If it is, then there must be skill involved, because the player has some control of the outcome.  And, of course, it *is* possible to lose at poker at will, if you want to ... so it's reasonable to argue that poker is a game of skill.

On the other hand, you can't lose the lottery on purpose, no matter how hard you try.  So, the lottery is just a game of luck.

But ... there are exceptions to the "lose on purpose" rule.  

An easy one is tic-tac-toe.  It's easy to lose on purpose -- just go second, and take a side square when it's your turn.  The first player, assuming he plays his best, is certain to beat you.  On the other hand, you can't *win* on purpose.  If both competitors play optimally, the result will always be a draw.

If you don't like that one, try casino blackjack.  You can lose on purpose just by hitting every hand -- eventually, you'll go over 21 and bust.  But, can you *win* on purpose?  Only to a certain limit.  If you aren't a card counter, the best you can do is to faithfully follow "basic strategy." In that case, you'll reduce the house advantage to its minimum possible value -- 0.5% -- which means that you'll lose, on average, $1 for every $200 you bet.  Any deviation from that will be random luck.

That is: you can lose as much as you want, on purpose.  But you can't win any more than the best of the other players, on purpose.

Even though blackjack passes the "lose on purpose" rule, I think most people would argue that it's a game of luck.  Even though you can lose on purpose, there's no way to *win* on purpose ... that is, there's no way to beat the best players by improving your skill.  

------

Why is this, that you can lose on purpose, but you can't win on purpose?  In this case, it's deliberate, human-caused.  When we invent games of skill, we keep the ones that have an interesting struggle to win.  We don't care whether there's a struggle to lose, because, who cares?  The object is to win.

Or, you can look at it this way.  When there's competition for a goal, it's hard to win, because you have to beat your opponent, who's trying just as hard as you.  When there's no competition for a goal -- like losing -- it's easy, because nobody is trying to prevent you.

If everyone is trying for X, it's hard to be the most X.  But it's easy to be the most "not X".

------

This seems like it doesn't matter much, but ... there are interesting consequences.  Let's suppose that you're a baseball team, and you're trying to decide who to draft.  There are 29 other teams competing with you to make the best choice, but nobody competing with you to make the worst choice.

That means it's hard to beat the other teams on purpose.  But it's easy to lose to the other teams on purpose -- just pick your mother, for instance.

Now, the interesting part.  The same thing applies about winning and losing by accident.  It's hard to *beat* the other teams by accident, but it's easy to *lose* to the other teams by accident.

Suppose you scout a player, and you think he's the next Mike Trout.  The other teams are scouting him too.  If you're right about him, and the other teams are too, the only way you're going to get him is if you have the first draft choice.  Otherwise, some other team will snap him up before you. 

But ... suppose he's *not* the next Mike Trout.  You just happened to see him on a day he went 5-for-5 with three home runs.  He's really just a fourth round pick, and you've badly overrated him.  What happens?  You inevitably draft him too high, and you suffer.  You've lost by "accident".  By mistake.  By lack of skill.

It's hard to win by intention, fluke, or skill -- but it's easy to lose by intention, fluke, or (lack of) skill.

------

Let's suppose your scouting department concentrates on a few players.  It spends substantial time analyzing those players, and it usually does OK evaluating them.  

There's a player named Andrew.  The MLB consensus is that he's going to be the 20th pick.  Your scouts spent a lot of time on him, and they think he's better than that.  Their opinion is that he's actually the 9th best player in the draft.  

There's another player named Bob.  MLB thinks he's the 16th best player.  Your scouts think he's only the 30th best.  

The draft comes along, and you have pick number 16.  How much benefit do you gain from all that intelligence gathering?  Suppose, if you like, that your scouts are absolutely correct, that they have the players ranked perfectly.

Well, if Andrew is available when your turn comes along, you snap him up for a gain of "7" spots.  But that's not guaranteed, because, after all, you're not the only team doing scouting!  If any one of the teams drafting from 9th to 15th came to the same conclusion, they've already grabbed him.  In that case, your benefit from all that scouting winds up being ... zero.

What if Bob is available when your turn comes along?  Well, you're going to pass on him, because you know he's not that good.  But, if you hadn't done the scouting, you would have taken him with your number sixteen pick.  You would have had a loss of "14" spots.  

In the case of Bob, your intelligence *did* help you.  It helped you a lot.  And, it doesn't matter if other teams scouted him.  Even if every other team reached the same conclusion you did, you've *still* saved yourself a big mistake by scouting him too.  If you hadn't scouted him, you would have made a big mistake.

The moral: you gain more by not being stupid, than you do by being smart.  Smart gets neutralized by other smart people.  Stupid does not.

-------

If you're still not convinced, try this.  I gather 10 people, and show them a jar that contains $1, $5, $20, and $100 bills in equal proportions.  I pull one out, at random, so nobody can see, and I auction it off.  The bidding will probably top out at around $31.50, which is the value of the average bill.

I do it again, but, this time, I'm not that careful, and you get a glimpse of the bill.  So does Susan, the stranger sitting next to you.

What happens?  

Well, if it's a $100 bill, you and Susan bid up the price to $99.99.  Neither of you really benefits.

But, if it's a $1 bill ... neither you nor Susan bids.  Each of you would have had a 1-in-10 chance of paying $31.50 for the bill and suffering a loss of $30.50.  On an expected value basis, each of you gained $3.05 from your secret knowledge.
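The arithmetic for the jar, spelled out.  The variable names are mine, and the one assumption (already in the example) is that each of the 10 bidders is equally likely to win a blind auction at the fair price.

```python
bills = [1, 5, 20, 100]            # equal proportions in the jar
n_bidders = 10

fair_price = sum(bills) / len(bills)           # where blind bidding tops out
loss_if_stuck_with_a_one = fair_price - 1      # paid the fair price, got $1
chance_you_win_blind = 1 / n_bidders

# What the glimpse is worth to you when the bill turns out to be a $1
expected_saving = chance_you_win_blind * loss_if_stuck_with_a_one

# What the glimpse is worth when it's a $100: Susan saw it too, so the
# two of you bid it up to $99.99 and the winner clears a penny.
profit_on_the_hundred = 100 - 99.99

print(fair_price, loss_if_stuck_with_a_one, round(expected_saving, 2))
```

The knowledge that saves you from the $1 bill keeps its value even though Susan has it too.  The knowledge about the $100 bill gets competed away.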

------

As I said at the Sloan Conference -- well, I don't remember saying it, but someone else said I did -- "one of the things that analytics can do really well is filter out the really stupid decisions."  

What I was probably thinking, was something like this: If the 1980 Expos had had a sabermetrics department, they could have spent hours trying to squeeze out a couple of extra runs by lineup management ... but they would have been much, much better off figuring out that Rodney Scott's offense was so bad, he shouldn't have been a starter.

It works that way in your personal life, too.  You can spend a lot of time and money picking out the perfect floral bouquet for your date ... but you're probably better off checking if you have bad breath and taking the porn out of the glove compartment.

-------

If it's true that sabermetrics helps teams win, I'd bet that most of the benefit comes from the "negative" side: having a framework that flags bad decisions before they get made.

And that's why, if I owned a professional sports team, that would be my priority for my sabermetrics department.  First, concentrate on eliminating bad decisions, not on making good decisions better.  And, second, figure out what  everyone else knows, but we don't.



Tuesday, June 04, 2013

The OBP/SLG regression puzzle -- Part V

(Links: Part I, Part II, Part III, Part IV)

When you run a regression to predict a team's runs per game based on OBP and SLG, you get that a point of OBP is about 1.7 times as important as a point of SLG.  When you predict *opposition* runs per game, you get 1.8.  But, when you try to predict run differential based on OBP differential and SLG differential, you get 2.1.

Why the difference?

It all hinges on the idea that the relationship between runs and OBP/SLG is non-linear.  

Suppose there are seasons where OBP and SLG are higher than normal -- steroid years, say.  And suppose those are the exact seasons where a single point of OBP or SLG is worth more in terms of runs.  That's not farfetched -- an offensive event seems like it would be worth more in years when there's more offense.  When there's lots of hitting, a walk has a better chance to score, and a double has more men on base to drive in.

So, it's a double whammy.  OBP/SLG are higher, and each point of OBP/SLG is worth more.  It's almost like a "squared" relationship, which is non-linear.  

-----

A good analogy would be something like, tickets sold vs. ticket revenue.  On one level, it's linear, because for each extra ticket you sell, your revenue goes up $25 or whatever.  But, then, the second whammy: attendance is much higher now than it was in the sixties.  So, if you sell a lot of tickets, it's more likely you're in 2011 than 1966, which means that, along with your higher attendance, you also have higher-priced tickets!  So, more tickets means more revenue because of more sales, but also more revenue because of more revenue *per ticket*.

Now, what happens when you switch to *differences*, so you're predicting "revenue over opposition" based on "tickets over opposition"?

Well, that depends.  The original source of the "double whammy" was that teams with high tickets sold were more likely to have higher ticket prices.  Is that still true for teams with high *differences* of tickets sold?

Maybe, or maybe not.  Suppose you have two teams, team A from 2002 that averaged 40,000 tickets, and team B from 1964 that averaged 10,000 tickets.  That's a ratio of 4:1, four times as many tickets sold when per-ticket prices were high (the analogue of seasons when per-point OBP/SLG values were high).

After you take the difference, does the 4:1 ratio change?  If it stays at 4:1, nothing happens.  If it moves higher -- maybe team A outdrew its opposition by 10,000, but B outdrew its opposition by only 1,000, for a ratio of 10:1 -- the relationship becomes even more non-linear.  If it moves lower -- team A outdraws by 2,000, team B outdraws by 1,000, for a 2:1 ratio -- the relationship becomes *less* non-linear.

(Actually, I'm not sure if I should be dividing (to get the ratio, which I did) or subtracting (to get the raw difference) or something else.  But this is just an intuitive explanation anyway.)

Since there is no reason to expect that the differences will have the exact same ratio as the original attendance figures, we are almost *assured* that the nature of the relationship will change.  And, since, as a general rule, the coefficient increases with non-linearity, we expect the coefficients to change.
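Here's a rough sketch of the ticket analogy as a simulation, for anyone who wants to see the coefficient move.  Everything in it is invented for illustration: two eras with made-up prices ($5 and $25), made-up attendance levels, and opponents drawn from the same era as the team.  It just shows that the regression slope for raw tickets and the slope for ticket *differences* need not agree.

```python
import random

def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(0)

# Invented eras: (ticket price, mean attendance, SD of attendance).
ERAS = [(5.0, 10_000, 2_000), (25.0, 40_000, 5_000)]

tickets, revenue = [], []
ticket_diff, revenue_diff = [], []
for price, mean, sd in ERAS:
    for _ in range(2_000):
        team = rng.gauss(mean, sd)
        opp = rng.gauss(mean, sd)  # opponent drawn from the same era
        tickets.append(team)
        revenue.append(price * team)
        ticket_diff.append(team - opp)
        revenue_diff.append(price * (team - opp))

raw_slope = ols_slope(tickets, revenue)
diff_slope = ols_slope(ticket_diff, revenue_diff)
print(f"raw slope:        ${raw_slope:.2f} per ticket")
print(f"difference slope: ${diff_slope:.2f} per ticket")
```

With these made-up numbers, the raw slope comes out higher than either era's actual ticket price (that's the double whammy), while the difference slope lands somewhere between the two prices.  Change the spreads in ERAS and the difference slope moves, which is the point: there's no reason for the two regressions to agree.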

So it was a mistake, on my part, to originally assume that the "difference" regression should have the same ratio as the "single-team" regression.

-----

BTW, that was the "non-baseball" explanation that I promised you.  If tickets is still too basebally, just substitute any non-baseball relationship where a high X is correlated with a high value per X.  What works well is a time series featuring some commodity that sells more now, when per-unit prices are obviously higher because of inflation.  

Like, say, Starbucks coffee, or bicycle helmets.  If you want a "decreasing returns" example, I bet cigarettes is a perfect one -- lower cigarettes sold is strongly associated with higher prices per cigarette.

------

Now, to the actual baseball data.  

For the MLB teams and their oppositions in the dataset, we need to know: when we look at *differences* in OBP and SLG instead of the actual values, do high differences still correlate with times when individual points are more valuable?  That is: do the ratios get wider, or narrower?  It's hard to know, exactly, but we can get a rough idea, by looking at the spread of the data before and after.  

Let's start with OBP.

For the seasons in the study, the range of team OBP was .273 to .372, which is 99 points.  For opposition OBP, the range was 103 points.  In terms of standard deviations, it was .015 for the offenses, and .016 for the opposition offenses.

But, for the differences, the spread was wider.  The range was 123 points, and the SD was .019.

Teams:       spread  99 points, SD .015
Oppositions: spread 103 points, SD .016
Differences: spread 123 points, SD .019

That makes sense ... when you subtract opposition hitting, it's like adding team pitching.  The spread should be wider because if you take a team with awesome hitting, and they also have awesome pitching, they stick out from the average twice as much. 

In general, when you subtract one variable from another, if they're independent (or have a negative correlation), the SD and spread increase.  If hitting and opposition hitting were, in fact, independent, you'd expect the SD to increase by a factor of root 2 -- from .015 to .021.  It only increased to .019, because hitting and opposition hitting are, in fact, somewhat correlated.  They both are affected by whether you play in hitters' parks or pitchers' parks.  They also depend on what era you play in.  
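The rule at work is the standard formula for the SD of a difference: SD(X - Y) = sqrt(SDx^2 + SDy^2 - 2*r*SDx*SDy), where r is the correlation between the two variables.  A minimal sketch, plugging in the OBP SDs quoted above; the trial values of r are my own picks, not anything estimated from the data:

```python
import math

def sd_of_difference(sd_x, sd_y, r):
    """SD of (X - Y) when X and Y have correlation r."""
    return math.sqrt(sd_x**2 + sd_y**2 - 2 * r * sd_x * sd_y)

sd_team, sd_opp = 0.015, 0.016  # OBP SDs for teams and oppositions

# r = 0 is the independent case; the other values are trial correlations.
for r in (0.0, 0.25, 0.5):
    sd_diff = sd_of_difference(sd_team, sd_opp, r)
    print(f"r = {r:.2f}: SD of difference = {sd_diff:.4f}")
```

With r = 0 the result is close to the root-2 figure, and a correlation of around 0.25 reproduces the observed .019.  The inputs are rounded to three decimals, so treat that 0.25 as a ballpark.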

Still, the difference in OBP seems to have increased the spread.  We don't know for sure, though, that the new difference numbers are still correlated with high values for each point OBP, but it seems reasonable to expect.  

-------

What about SLG?  

Well, the range for teams was .301 to .491 (190 points).  The range for opposition teams was .306 to .499 (193 points).  

But -- surprisingly -- the range for (team minus opposition) was almost the same.  It was 194 points.

And the SD of the differences was *smaller* than the SDs of the originals.  The teams were .034, the opposition was .033, but the SD of the differences was only .031.

Teams:       spread 190 points, SD .034
Opposition:  spread 193 points, SD .033
Differences: spread 194 points, SD .031

Why is the SD actually *lower* when you combine the two variables?  

Well, it's the same argument as for OBP: they're correlated with each other.  But, it seems, SLG correlates much more highly than OBP did.  Probably, park and era effects are slugging related more than on-base related -- after all, parks are better known for home runs than for walks, say.  
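To put a rough number on "correlates much more highly," you can back out the correlation implied by the three SDs, using SD(X - Y)^2 = SDx^2 + SDy^2 - 2*r*SDx*SDy and solving for r.  This is only a sketch: it uses the rounded SDs quoted in this post, not the underlying team data, so the results are approximate.

```python
def implied_correlation(sd_x, sd_y, sd_diff):
    """Correlation r that makes SD(X - Y) equal sd_diff."""
    return (sd_x**2 + sd_y**2 - sd_diff**2) / (2 * sd_x * sd_y)

# (team SD, opposition SD, SD of differences), as quoted in this post
SPREADS = {
    "OBP": (0.015, 0.016, 0.019),
    "SLG": (0.034, 0.033, 0.031),
}

implied = {}
for stat, (sd_team, sd_opp, sd_diff) in SPREADS.items():
    implied[stat] = implied_correlation(sd_team, sd_opp, sd_diff)
    print(f"{stat}: implied team/opposition correlation ~ {implied[stat]:.2f}")
```

On these rounded inputs, the implied correlation for SLG comes out at more than double the one for OBP, which fits the park-and-era story.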

I wasn't expecting those effects to be so huge ... this seems to suggest that environment may be *more important* than team talent, at least for raw SLG!  

So, it would be reasonable to assume that, when we subtract opposition runs from team runs, the "non-linearness" of SLG to runs stays about the same, or decreases a bit.  

-------

What have we found?  We found that when we use "team minus opposition," we're increasing the non-linearity of OBP, but *decreasing* the non-linearity of SLG.  So, we'd expect the OBP coefficient to increase, and the SLG coefficient to decrease.  And that's what happens:

Teams:       16.62 OBP, 9.98 SLG   [ratio: 1.67]
Opposition:  17.38 OBP, 9.38 SLG   [ratio: 1.85]
Differences: 18.73 OBP, 8.97 SLG   [ratio: 2.09]

And that explains why the ratio for differences is higher than the ratio for individual teams.
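For anyone who wants to check the arithmetic, here's a quick sketch that recomputes the ratios from the coefficients in the table, and also shows how far each coefficient moved going from "Teams" to "Differences."  The percentage-change comparison is my own addition, not part of the regressions.

```python
# (OBP coefficient, SLG coefficient) from the table above
COEFFICIENTS = {
    "Teams": (16.62, 9.98),
    "Opposition": (17.38, 9.38),
    "Differences": (18.73, 8.97),
}

ratios = {name: obp / slg for name, (obp, slg) in COEFFICIENTS.items()}
for name, ratio in ratios.items():
    print(f"{name:<12} ratio {ratio:.2f}")

# Relative change in each coefficient, Teams -> Differences
obp_change = COEFFICIENTS["Differences"][0] / COEFFICIENTS["Teams"][0] - 1
slg_change = COEFFICIENTS["Differences"][1] / COEFFICIENTS["Teams"][1] - 1
print(f"OBP coefficient change: {obp_change:+.1%}")
print(f"SLG coefficient change: {slg_change:+.1%}")
```

The OBP coefficient goes up by roughly an eighth and the SLG coefficient drops by about a tenth, so both moves push the ratio in the same direction.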

By the way: in the above table, every higher coefficient in a column is associated with a higher SD of that variable, and every lower coefficient in a column is associated with a lower SD of that variable.  That doesn't have to be the case, but it's more likely that way than any other way, I would think.



Sunday, June 02, 2013

The OBP/SLG regression puzzle -- Part IV

(Here's part 1, part 2, and part 3.)

---

A couple of posts ago, Alex and others suggested that I try a regression to predict runs per game (RPG) instead of winning percentage.  Maybe *that* regression would come out to the OBP/SLG ratio of 1.7 that we've been expecting.  

Jared Cross actually did that, in another comment, and it worked!  Here's my version of Jared's result:

RPG = (16.6*OBP) + (10.0*SLG) - 4.9  [ratio: 1.67]

And the same regression, but for opposition runs per game:

RPG = (17.4*OBP) + (9.4*SLG) - 4.9  [ratio: 1.85]

The two ratios, 1.67 and 1.85, are almost perfectly in line with the expected 1.7!
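To make the "ratio" concrete, here's a small sketch that plugs a batting line into the first equation and then bumps OBP and SLG by one point each.  The .330/.420 team is hypothetical, picked only as a plausible-looking example, and the function name is mine.

```python
def predicted_rpg(obp, slg, obp_coef=16.6, slg_coef=10.0, intercept=-4.9):
    """Runs per game from the team regression above (rounded coefficients)."""
    return obp_coef * obp + slg_coef * slg + intercept

# Hypothetical team: .330 OBP, .420 SLG
base = predicted_rpg(0.330, 0.420)
plus_obp = predicted_rpg(0.331, 0.420) - base  # one extra point of OBP
plus_slg = predicted_rpg(0.330, 0.421) - base  # one extra point of SLG

print(f"predicted RPG:        {base:.3f}")
print(f"one point of OBP adds {plus_obp:.4f} runs per game")
print(f"one point of SLG adds {plus_slg:.4f} runs per game")
print(f"ratio:                {plus_obp / plus_slg:.2f}")
```

Because the equation is linear, the ratio is just the ratio of the coefficients, whatever batting line you start from.  (With the rounded 16.6 and 10.0 it prints 1.66 rather than 1.67.)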

------

So, my first reaction was ... I wasted my time!  All those worries about walks and non-linearity and increasing returns ... well, they weren't necessary.  The issue was just that I used a different variable, winning percentage instead of runs!

But ... actually, on reflection, I don't think that's it.  I think it's just a bit of a coincidence that these regressions work out to 1.7.  Let me give you a couple of intuitive arguments that may or may not convince you.  

First argument: I redid the first regression above, the 1.67 one, but three times, with the dataset split based on walk tendencies (BB/(H+BB)).  "High" and "low" mean one percentage point above or below average; "medium" is everyone else.  The ratios:

High walks:   1.70
Medium walks: 1.25
Low walks:    1.89

It's strange: the teams with average numbers of walks had a much lower ratio than the high-walk and low-walk teams.  But the original 1.7 ratio, the one that Tango got, was based on a perfectly average team.  

So: if this is the answer, that this regression is the right one ... shouldn't Tango's result have come in at 1.25 instead of 1.7?  

Second argument: I repeated the regression, but this time I combined the team with its opponents.  That is, I predicted (RPG minus opposition RPG) from (OBP minus opposition OBP) and (SLG minus opposition SLG).

Since we're just subtracting the two equations, you'd expect that the ratio would be somewhere in between 1.67 and 1.85.  But, no:

RPG difference = (18.73*OBP difference) + (8.97*SLG difference)   [ratio: 2.09]

The ratio goes up to 2.09.

Why should that persuade you that the original 1.67 is coincidence?  This is a red herring: Tango's analysis didn't include opposition.

But ... it did, in a way.  Tango's logic and numbers work exactly the same way if you include opposition.

Instead of asking, "what happens if we add an event to an average batting line," you can ask, "what happens if we add an event to the zero batting line that's the difference of two average teams."  The calculation is exactly the same either way.  

Specifically, if adding a point of OBP to an average team gives 1.7 times as many additional runs as a point of SLG, then adding a point of OBP to an average team *with a given opponent* should add 1.7 times as many additional runs *over that opponent* as adding a point of SLG.  Right?

But ... here, the results are different.  When we added in the opponent, we got 2.09 instead of 1.67.  And the difference, I will argue next post, is real, not just a random artifact.  Actually, I don't even think the difference has anything to do with baseball.

-------

Talking baseball for now, though ... why are these results so much different from the winning percentage case?  Especially the walk breakdown.  With winning percentage, it seemed like walks increased the ratio, but, when it comes to runs, it seems like sometimes walks increase the ratio and sometimes they decrease it!

Well, it's going to sound like I'm just making this up, but here's my latest theory, for what it's worth:

The linear weights values of the various offensive events are based on an average team.  In real life, their actual values are different for good teams and bad teams, but we just assume the differences don't matter.  And they wouldn't, if they all changed roughly equally, because then the regression would adjust.

But, maybe they don't.

I ran a regression of runs on the basic offensive events to estimate their linear weights values.  Then, I ran it again for only the best offenses (based roughly on a 1.7*OBP + SLG stat), and for only the worst.  (I know it's not very accurate to use regression for this, but I was too lazy to do play-by-play data, and I only need roughly correct values anyway.)

event     1B   2B   HR    BB   out
----------------------------------
average  .53  .71  1.45  .34  -.10
----------------------------------
high     .58  .69  1.50  .34  -.13
low      .49  .72  1.47  .34  -.08

Almost all the difference is in singles and outs!  
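Here's a quick sketch that makes that comparison explicit: it subtracts the "low" row from the "high" row of the table above and sorts the events by the size of the gap.  Nothing here is new data, just the table's own numbers.

```python
EVENTS = ["1B", "2B", "HR", "BB", "out"]
HIGH = [0.58, 0.69, 1.50, 0.34, -0.13]  # best offenses
LOW = [0.49, 0.72, 1.47, 0.34, -0.08]   # worst offenses

# Gap in run value (high minus low), rounded to the table's precision
gaps = {e: round(h - l, 2) for e, h, l in zip(EVENTS, HIGH, LOW)}

ranked = sorted(gaps.items(), key=lambda kv: -abs(kv[1]))
for event, gap in ranked:
    print(f"{event:>3}: {gap:+.2f}")
```

Singles and outs come out on top, with doubles and home runs a few hundredths apart and walks not moving at all.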

In our sample, every team had roughly the same number of outs, so the out value doesn't matter that much.  But, not every team has the same number of singles, relative to the other events.  

For a given (1.7 * OBP + SLG), the team with the higher slugging will probably have more singles.  Therefore, they will score more runs than expected from their 1.7-weighted OPS.  Therefore, the regression will attribute that to the SLG, and weight it higher, reducing the ratio.

This is the reverse of what I thought happened in the "winning percentage" case, which is why you should assume I don't know what I'm talking about.  But, if you're still with me ... well, if I'm going to change my mind, I should probably come up with an explanation of why the cases are different. 

Here's an attempt.  I'll put it in block quotes to emphasize that it's a guess and I'm just throwing it out there ...


The singles thing was happening in the winning percentage case, too, obviously.  However, it was overshadowed by another factor.
 
When a team has a high SLG, much of that is caused by the park.  In that case, the opposition will also have a higher SLG.  So, much of the high SLG doesn't translate into a higher winning percentage (although it *does* translate into more runs).  

On the other hand, if a team has a high OBP, more of that is "real", since park factors don't affect walks and singles as much as, say, doubles and home runs.  

So that's why walks mattered more when we were looking at winning percentage.  When it comes to runs, walks are taken at face value.  But for winning percentage, we have to give them extra weight, because the relationship between slugging and winning percentage is too tainted by park. 

Is that right?  Who knows.  It sounds plausible.  But, so did everything else I argued earlier.  I'm probably full of crap.  Take this part with a grain of salt.

-------

For the record, though: when we combine a team and its opposition, we get a ratio of 2.33 for winning percentage, and 2.09 for run differential.  Those aren't too far off from each other, and still higher than 1.7.

-------

So, the question remains: if I've convinced you that the 1.7 is coincidence, and the 2.09 matters just as much ... then, why do we still have that difference?  I'm going to back off from specific theories, and stick to generalities.

The relationship between OBP/SLG and runs/wins is non-linear in many ways.  One way is that a point of OBP/SLG has different numbers of events depending on how good the offense is.  Another is that a point of OBP has different proportions of walks/hits depending on slugging.  A third is that the different basic events have different run values depending on offense.  And there are probably a lot more.  

So, the answer to the question is: because there is so much non-linearness going on, there's no reason to expect that the coefficient of the average team will equal the coefficient from a regression of all teams.  
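A toy illustration of that last point, with nothing baseball-specific in it.  The "runs" curve (x squared) and the five data points are invented; the only purpose is to show that when a relationship is curved and the sample is lopsided, the slope at the average point and the slope from a regression over all the points can be quite different.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def runs(x):
    """Hypothetical convex curve, not fitted to any real data."""
    return x ** 2

xs = [1.0, 1.0, 1.0, 1.0, 5.0]  # lopsided, made-up sample
ys = [runs(x) for x in xs]

mean_x = sum(xs) / len(xs)
h = 1e-6
# Slope of the curve at the average point (numerical derivative)
local_slope = (runs(mean_x + h) - runs(mean_x - h)) / (2 * h)
# Slope of the regression line through all the points
fitted_slope = ols_slope(xs, ys)

print(f"slope at the average point: {local_slope:.2f}")
print(f"regression slope:           {fitted_slope:.2f}")
```

In this toy case the regression slope is well above the slope at the average point.  With a different sample it could just as easily come out below, which is why the two numbers shouldn't be expected to match.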

-------

Next post, I'm going to make an even stronger argument, one that has nothing to do with baseball: I'm going to argue that the ratio we get from these regressions is almost useless anyway.  Tango's 1.7, for an average team, is meaningful -- but these other regression results that I've been doing, these 1.67s and 2.33s and 2.09s, don't tell us anything useful at all.


