Monday, July 30, 2012

Sabermetrics, scouting, and Joe Morgan

Baseball people have done pretty well for themselves without sabermetrics, all things considered.

Many decades before Bill James, they figured out that power hitters should bat third or fourth.  They knew, long before any decent fielding stats were available, that it's better to let Aurelio Rodriguez play shortstop than, say, Al Kaline.  I bet they even knew that ERA was a better indicator of a pitcher's performance than W-L record.

How did they know?  Just the buildup of conventional wisdom and observation.  Even without scientific studies, some truths have a way of making themselves known, over time.

Even 2,000 years ago, it was known that certain plant extracts, such as willow bark, help relieve pain and fever.  That was despite the absence of peer-reviewed scientific journals, randomized double-blind experiments, and formal analysis.  It was just common knowledge.  How did that knowledge emerge?  I don't know, exactly.  Maybe someone, somehow, randomly discovered it worked for them, and passed their experience on to others, who found it worked for them too.  By word of mouth it would become part of the conventional wisdom.

It wasn't until the mid-1800s that the chemical responsible was synthesized in the laboratory, and it took until 1971 for someone to figure out the biochemistry.  But the ingredient -- salicin, a close chemical relative of acetylsalicylic acid, or aspirin -- worked, and everyone knew it, as early as 400 BC.

-----

There are lots of other areas of knowledge where this happens: people generally know something, even though nobody can prove it.  Agriculture, say.  In order to get a good crop, you have to do a lot of things right.  A thousand years ago, before Gutenberg, how did anyone know what to do?  Communal knowledge, communal wisdom.  Farmers knew, for the most part, which crops needed more water and which needed less, which needed certain types of land and which needed other types, and so on.  I don't know anything about farming, but I bet that, for the most part, they were right.

And there's some of that in the field of normal human experience, what would now be considered psychology.  Take any of the morals of Aesop's fables, or any old proverb.  "Misery loves company"?  Yes, it does.  Not always, but as a general rule.  I don't know of any peer-reviewed journal article that runs a regression and confirms it, but it's true nonetheless.

And there are things in our everyday lives, too -- most of the mundane things.  I know that I have to take the garbage to the curb early in the evening, because if I wait until too late, my experience is that I'm tired and cranky and it becomes a huge chore, much worse than it should be.  Again, I haven't commissioned a study on my behavior to prove it, but I'm pretty sure it's true.

-----

So, coming back to baseball ... it's got to be true that old-school baseball people know things that we sabermetricians don't.  Lots of things -- things that sabermetrics can never find out, and also things that sabermetrics could find out, but hasn't yet. 

There is no doubt in my mind that sabermetricians can learn a lot from scouts.

But there's still some friction between traditional baseball people and sabermetricians.  Recently, Russell Carleton ("Pizza Cutter") described his experiences working for a major league team.  As I interpret it, the old-school baseball guys say something along the lines of, "generations of conventional wisdom, and our own experience and observations, show us that X is true."  And Russell, the sabermetrician, would say something like, "well, we don't believe you, because you're not doing science."

But Russell tells us that now he realizes he underestimated the contributions of the baseball traditionalists:

"... I talked about how numerical models could take vast pools of data into account (entire decades of games!) and how my models could give unbiased estimates of how important various factors really were.  ...  That's a major point in the sabermetric method’s favor, and to this day, I believe it to be its primary strength.

"But had I been a more charitable man, I could have pointed out that the eye test has its benefits too. There are a lot of baseball lifers who have been around the game so long they instinctively— sometimes subconsciously—know to look for things that I don’t even know exist. Not all of their theories and beliefs are going to be right, but I acted as though, by dint of their non-outsider, non-Ph.D-having status, anyone who wasn’t quoting the latest sabermetric research was automatically wrong."

Quoting Russell, Tango echoed the idea that you have to look at both sources:

"My favorite description of the role that performance analysis plays in the front office was the one used by Theo Epstein at the start of his career: performance analysis is one lens, and scouting observation is the other lens, and the GM needs both lenses to see."

And to both Russell and Tango, I say -- yes, I agree.  But ... the fact is that, when science contradicts conventional wisdom, you have to go with science.  That's not a judgment that sabermetricians are smarter than scouts, nor is it a declaration that numbers are always better than intuition.  It's simply a restatement of the idea that evidence matters.  Evidence has to trump preconceptions, no matter how strong and how established those preconceptions are.

Suppose you have the question: is there beer in the fridge?  And you get a bunch of scouts together, with a combined 300 years of experience.  And one of them says, "yeah, of course there's beer in the fridge.  The fridge is in the basement, beside the TV, and there are six single guys there watching the football game.  My experience says that there's always beer in the fridge in those cases.  You have all five tools -- guys, single guys, TV, sports, and man caves." 

The other scouts nod in agreement.  And one of them says, "you'll notice that there's another fridge upstairs, that's full of food.  Obviously, this one is for something else, and, in my experience, it's usually beer."  And another says, "look at all those empties sitting beside the fridge.  That, too, shows that there must be a case or two in the fridge."

And the scouts are showing a pretty good mastery of how basement fridges and beer work.  But ... now, the sabermetrician comes along, and says, "well, let's just look inside the fridge."  And he opens it, and ... it's empty.

That wins, doesn't it?  No matter how expert the scouts are at recognizing beer evidence and fridge types and demographics, no argument can contradict that, when you look inside the fridge, there's nothing there.

It's not a question of, what the scouts think, versus what the sabermetricians think.  It's a question of conventional wisdom, versus data and evidence.  The scouts may not like that, but ... there it is.

The time to ask the scouts is when you don't have enough evidence to decide.  We don't have to ask scouts any more about whether clutch hitting exists, or whether some batters have a "hot hand," because we have studies that answer the question beyond most reasonable doubts.  That doesn't mean baseball people can't give another perspective on the issue, or suggest angles we never thought of.  But it *does* mean that, on those particular issues, the opinions of the scouts no longer have a lot of weight.  We've already opened the fridge.

-----

Which is, then, why I disagree, a little bit, with Russell on this.  Addressing Joe Morgan, Russell says,

"I owe you an apology. ... My goal in writing this letter isn't to say that you were right all along about sabermetrics. In fact, Mr. Morgan, I still disagree with you on plenty of issues, most notably in that I believe sabermetrics can offer a lot to the game of baseball. I’ve come to the conclusion that sabermetrics is a young, toolsy prospect. There’s a lot of potential there to be a game-changer, but maybe, just maybe, there’s something to be gained by sitting down and listening to a wise man who’s been around for a while.

"Mr. Morgan, I was arrogant and believed that I had the power to answer all questions. I indulged in the idea that someone who didn't speak about baseball in the same language that I did was somehow beneath me."

I disagree, as I said, just a little bit.  Of course, baseball lifers have a lot of accumulated knowledge, and we probably need to listen to them more than we do.  But ... the Joe Morgan types need to accept that, no matter how hard-won their expertise, it still can't stand up to actual evidence. 

That's the issue. 

Unfortunately, Joe Morgan doesn't see it that way.  He doesn't agree that sabermetrics is science.  He doesn't try to understand the methods sabermetrics uses, or look at the evidence, or follow the logic.  To Joe Morgan, there are two ways to know things: watching baseball, and doing math.  He thinks they're equal at best.  They're not.  "Doing math" actually means looking at the evidence, looking at all the baseball games that anyone has ever watched.  "Doing math" is opening the fridge.

-----

In a way, I feel bad for the baseball "lifer" types, the Joe Morgans.  For decades, for generations, scouts and old-school analysts were respected professionals, observing the game and distilling their experiences into opinion.  They were considered experts because they were there, on the inside.  They were considered experts because they were the only ones who understood the nuts and bolts.  Who better to advise on the big picture, than the people who know the little picture?

And that didn't work too badly: the best players still got the jobs, and the early draft picks still played better than the late draft picks, and batting orders were close enough to perfect that it didn't matter.  And even though some scouts thought player X could turn it on in the clutch, they could get away with saying it because, in truth, hardly anyone actually acted on it.  It was just their well-considered opinion.

The problem is, now we're not satisfied with opinion.  The goal has changed.  We now want science.  We want to look at evidence, and really find out what's right.  For the first time, we have people telling those scouts they're wrong -- and they can back it up.

Baseball lifers, I bet, never wanted to be scientists.  They wanted to be opinion leaders.  They wanted to be respected no matter what they said -- or at least, no matter what they said that was within the bounds of conventional wisdom among their peers. 

Now they can't. 

Joe Morgan can't utter a sentence without someone running up the stairs from his parents' basement to show he's wrong.  I can see why Joe might not like that.

But the fact is ... if you want to claim to be an expert, you have to be willing to look at evidence, revise your views, and admit when you're wrong.  Lifers never had to do that before ... and now they do.  I feel bad for them: their job has changed into one they hadn't bargained for. 

I agree with Tango that you need two lenses -- the sabermetrician's, and the scout's.  But at the same time: the scout's job is to do the best he can to be as accurate as he can.  To me, it's maybe OK for a lifer to not delve too deeply into sabermetrics, so long as he's able to appreciate and process the evidence that emerges.  But to *deny* the relevance of sabermetric research ... to claim you know better by observation without even understanding how the evidence is relevant ... that's not acceptable.

And that's where I disagree with Russell, that Joe Morgan is owed an apology.  Yes, he knows a lot of things about baseball that we sabermetricians never will.  And perhaps we could pick his brain.  But, so long as he denies the relevance of sabermetrics, and refuses to look at what the analysts are saying ... well, are we really going to make any progress?  Aren't we likely to benefit more by finding someone else's brain to pick, someone who's not afraid to look at the evidence and change his mind?







Monday, July 23, 2012

"Effect of Jon Stewart" -- a reply to Tango

(Non-sports post.)

Tango quotes Jon Stewart arguing that Mitt Romney is "speaking out of both sides of his mouth," by criticizing Obama while saying much the same thing:

Romney’s own remarks to Olympians, offered during the opening ceremonies for the 2002 Winter Games in Salt Lake City that Romney led, hewed closely to Obama’s suggestion that success is communal.

“You Olympians, however, know you didn’t get here solely on your own power. For most of you, loving parents, sisters or brothers encouraged your hopes,” he said after praising the competitors in footage unearthed by NBC News. “Coaches guided, communities built venues in order to organize competitions. All Olympians stand on the shoulders of those who lifted them. We’ve already cheered the Olympians, let’s also cheer the parents, coaches and communities.”

Obama, speaking in Roanoke, Va., on July 13, came to a similar rhetorical conclusion.

“Look, if you’ve been successful, you didn’t get there on your own.… If you were successful, somebody along the line gave you some help,” Obama said. “There was a great teacher somewhere in your life. Somebody helped to create this unbelievable American system that we have that allowed you to thrive. Somebody invested in roads and bridges. If you’ve got a business – you didn’t build that. Somebody else made that happen.”

That quote was from Stewart.  This one is from Tango:

"I’m sure some political hack is going to explain the nuance that the two aren’t comparable, and explain it in a way that no one is going to believe him, but he says it loud, so he thinks people are listening."

Well, I'm not sure I'm a political hack, but I'll try it anyway.  (Tango didn't open his post to comments, which is why I'm doing it here.)

The difference between the two cases is that, for the Olympian, most of the help came from people who were close to the athlete personally, didn't get paid, and performed actions that very specifically benefited him or athletes similar to him.

For the businessman, most of the help came from strangers, who got paid, and performed actions (building roads) that benefited almost everybody.

The businessman depended on communal resources and paid help.  The Olympian, as Romney describes him, depended on unpaid resources targeted directly to him, and benevolent unpaid help.

It is indeed true that everything we do is dependent on the work of others.  In business, those others get paid.  I don't have the cite right now, but I remember reading that corporate profits average 8 percent of sales.  Suppose a company builds a widget.  Obama is correct -- somebody else made that happen.  Many somebodies.  Other people built trucks, and roads, and widget stores, and widget design software, and fork lifts.  Other people created a police force to apprehend those who try to hijack the company's widget trucks. 

And that's why all those people get 92 percent of the widgets.  

The company pays, through salaries and taxes and purchases, for the "somebody else [who] made that happen".  By a large margin -- more than 11 for the others (92 divided by 8), and 1 for them.

On the other hand, I think we'd agree that "loving parents, sisters or brothers" didn't get paid.  I'd imagine that most coaches are volunteers, at least at low levels.  And communities usually ensure that the best Olympic prospects get better-than-average access to the venues they built -- in effect, an Olympian's neighbors subsidize him or her directly.  I'm sure Sidney Crosby got more game time and practice time than lesser players in his hometown ... but Wal-Mart never got its own exclusive lanes on the Interstates.

My view -- and presumably Obama's and Romney's too -- is that if you get something, you should give something back in return.  The widget company gives back with money.  The Olympian, on the other hand, cannot (and should not). He has to pay with love, and appreciation, and recognition, and respect for the people who helped him without any benefit to themselves other than good will.

We understand this intuitively already.  When Dad wakes up at 5 am every day to drive us to the rink, we appreciate it forever, and never forget his sacrifice, and say we couldn't have done it without him, and give him the puck from our first NHL goal.

When Acme Taxi drives us to the arena, we don't. 

That's the difference.  




Friday, July 13, 2012

Are economists bad at statistics?

(Warning: this is a boring "how to interpret a regression" post, not much sports.)

Are economists bad at statistics?

Felix Salmon comments on a paper that presented academic economists with the results of a hypothetical regression, and asked them several questions about the results.  It turned out that most of them got it right when they looked at a scatter plot of the raw data.  But when they were given traditional regression results, as produced by statistical software, they blew it.

Specifically, for three of the four questions, a majority of econometricians got them wrong when looking at only the regression results. 

I'll translate one of the questions into baseball (using unrealistic numbers that I made up). 

A regression finds that each point of OPS (that is, .001) is worth $20,000 in salary.  The regression found an r-squared of 0.5.  In the data, salary had an SD of $4 million, and OPS had an SD of .200. 

1.  What is the minimum OPS for which a player has a 95% chance of earning more than $10 million?

2.  What minimum OPS would give a player a 95% chance of earning more money than a player with an OPS of .600?

3.  Given that the confidence interval for salary-per-point-of-OPS is ($19,500, $20,500), if a player has an OPS of .800, what is the chance he will earn more than $15.6 million (which is $19,500 multiplied by 800)?

4.  If a player has an OPS of .800, what is the chance he will earn more than $16 million (the point estimate)?

You should be able to figure all these out exactly, except #3 (which you can still estimate). 

Here are my answers:

1.  The SD of salary was $4 million.  The r-squared was 0.5.  So, after the regression, the SD of unexplained salary is around $2.8 million ($4 million divided by the square root of 2). 

You need about 2 SDs above the expected $10 million for a 95% chance.  2 SDs above $10 million is about $15.66 million.  That translates into an OPS of about .783.

(Actually, I think you need only 1.65 SDs, because it's one-tailed, but never mind.)

2.  The SD of unexplained salary is $2.8 million.  So the SD of the difference of two of those is $4 million ($2.8 million times the square root of 2).  The .600 guy is expected to make $12 million.  For a 95% chance, we add 2 SDs, giving $20 million.  That works out to an OPS of 1.000.

3.  The confidence interval is a bit of a red herring.  My first reaction was to estimate the chance the *estimate* was greater than $19,500, which is .975.  But that's not what's being asked.  What's being asked is the chance a *player* is over $19,500 per point.

The point estimate is $20,000 per point, and the chance of the player beating the estimate would be exactly 50 percent.  However, we're being asked the chance the player beats $19,500 per point.  That's a bit easier, so the chance is a bit higher.  The $15.6 million target is only $0.4 million below the $16 million point estimate, which is about 0.14 of the $2.8 million SD of unexplained salary.  Assuming normally distributed residuals, that works out to roughly 56 percent.

4.  Since the point estimate is unbiased, the chance of beating it is exactly 50 percent.
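
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the four calculations.  It rests on assumptions that the answers above make only implicitly: the regression line has a zero intercept (so a .600 OPS predicts $12 million), the residuals are normally distributed with the same SD everywhere, and the small uncertainty in the slope itself is ignored.  The function and variable names are mine, for illustration only.

```python
from math import sqrt
from statistics import NormalDist

# Made-up numbers from the baseball version of the quiz.
SLOPE = 20_000          # dollars of salary per point (.001) of OPS
R_SQUARED = 0.5
SD_SALARY = 4_000_000   # SD of salary in the data

# SD of the salary the regression does NOT explain.
resid_sd = SD_SALARY * sqrt(1 - R_SQUARED)


def predicted_salary(ops_points):
    """Point estimate, assuming a zero intercept (an assumption)."""
    return SLOPE * ops_points


def min_ops_for_salary(threshold, z=2.0):
    """Question 1: OPS (in points) whose prediction sits z residual SDs above threshold."""
    return (threshold + z * resid_sd) / SLOPE


def min_ops_to_beat(rival_ops_points, z=2.0):
    """Question 2: both players have unexplained salary, so the SD of the
    difference is resid_sd * sqrt(2)."""
    diff_sd = resid_sd * sqrt(2)
    return (predicted_salary(rival_ops_points) + z * diff_sd) / SLOPE


def chance_of_earning_more(ops_points, salary):
    """Questions 3 and 4: chance a single player beats a given salary,
    assuming normal residuals."""
    dist = NormalDist(mu=predicted_salary(ops_points), sigma=resid_sd)
    return 1 - dist.cdf(salary)


if __name__ == "__main__":
    print(min_ops_for_salary(10_000_000))             # about 783 points
    print(min_ops_to_beat(600))                       # about 1000 points
    print(chance_of_earning_more(800, 15_600_000))    # about 0.556
    print(chance_of_earning_more(800, 16_000_000))    # 0.5
    # One-tailed alternative to the "about 2 SDs" shortcut:
    print(NormalDist().inv_cdf(0.95))                 # about 1.645
```

Passing z=1.645 instead of the default 2.0 gives the one-tailed versions of the first two answers, which come out somewhat lower.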

-----

These questions are much easier if you look at the scatter plot.  I've stolen it from the original post:




The equivalent of question 1 was: what value of X do you need for a 95% chance the value of Y is greater than zero?  That's really easy from the plot -- it looks like it's somewhere between 40 and 50. 

Most of the economists got that. 

-----

But most got the first three questions wrong when they had the numbers.  In their defense, though, those aren't really the kind of questions normally answered in academic papers.  Normally, questions involving "95%" refer to the coefficient estimates, not the individual datapoints. 

So, I'm not convinced that, in every case, the results show a real flaw in their education; rather, I think some of the economists answered a different question, by force of routine. 

My guess is that if you explained to some of the participants why their "red herring" answer was wrong, they'd say, "oh, right," and most of them would come up with the right answer.

But I might just be making excuses, because I fell for the red herring trap myself, at first.

-----

I agree with Salmon that more of the economists should have been able to answer the questions.  But I'm not sure about his conclusion:

 " ... I see a paper demonstrating a statistically significant correlation between one variable and another, and I generally assume that if the experiment were repeated, we'd see the same thing again.  But that's not actually true.

And so it's easy to see, I think, how economists become convinced of things the rest of us aren't sure of at all -- and how the economists often end up being wrong, while the rest of us were right to be dubious.

 ... A lot of papers are written; a few of them have interesting findings.  Those are the papers which tend to get publicity.  But there's a very good chance they don't actually show what the headlines say that they show."

Actually, I don't disagree with these statements: I *agree* with them, very much so.  But I disagree that it has much to do with the economists being wrong about this quiz.  Yes, it's true that the incorrect answers tended to discount the amount of randomness in a single observation, assuming that individual datapoints were clustered much closer to their estimate than they really are.  But, strictly speaking, that has nothing to do with whether the experiment is repeatable, or whether the effect is real.

It's like, a study finds that smokers have a 20 percent chance of getting cancer, plus or minus 2 percent.  And the incorrect economists say, "Joe Smith is a smoker.  There's a 95 percent chance that between 16 and 24 percent of Joe's body will get sick."

The economists have missed the point, sure.  But that doesn't affect how real the link is between cancer and smoking.







Hat tip: anonymous commenter in the previous post.




Friday, July 06, 2012

Why are soccer penalties so harsh?

There are lots of things I don't like about soccer, but one in particular bothers me the most.  It's the way penalties are so harsh.

As I understand it, if you commit a foul while the opposing player is dangerously close to your goal -- that is, in the "box," or "area", an 18-yard by 44-yard rectangle in front of the goal -- he gets a penalty kick from 12 yards in front of the net.  On those penalty kicks, he's almost assured of scoring.  The conversion rate is over 80 percent.

That's the case for *any* foul in the box.  Touching the ball with your arm, even inadvertently, or stepping on the ball handler's heel while challenging, costs you 80 percent of a goal.  Even if it wasn't a serious scoring chance -- maybe the player was at the top of the box, with five defenders still to beat -- it's 80 percent of a goal.

I mentioned this to a friend of mine, a big soccer fan.  His response was: yes, it's true that the penalties are harsh.  But, *because* they're so harsh, players know they have to be extra careful when defending in the box.  Players know that they have to keep their arms fully touching their body (in which case a hand ball is not an offense), and they know that they have to take very, very good care not to make contact with the player they're shadowing. 

Effectively, my friend says, the harsh penalties act to keep the game clean and fair and beautiful.  And, if a defender steps on a foot by accident, it's still his own fault -- he should have been much more cautious.

Well, I don't buy it. 

To me, it's like the death penalty for speeding.  If you execute drivers for doing 66 in a 65 zone, you're going to get very, very careful drivers, and almost no violations.  But, sometimes, a driver will forget for a moment, and exceed the limit, and be executed. 

It seems like a high price to pay, in terms of justice.  Even if the new law actually saves lives, by preventing more fatal accidents than it creates capital offenses, it still doesn't seem right.

One of the most important things, in sports, is that the winner of the game should appear to be determined predominantly by which team played better overall, rather than by which one got lucky.  The "80 percent of a goal" rule violates that principle.

For instance, suppose, in baseball, you painted a two-foot circle on the first deck in right field.  And you changed the rules to say, if a home run hits that target, that team instantly wins the game. 

That wouldn't be good, would it?  Yes, it takes skill to hit a home run, and, yes, it takes skill to aim towards the target in hopes of hitting it.  But ... it still feels like cheating if you win that way.  It's just too random.  Even though, in the long run, the better teams will hit the target more often than the worse teams, on any given occasion, it seems like it's arbitrary and fake.

The soccer penalties feel the same to me. 

Any time your opponent has the ball near your goal, it's a serious situation, and you have to defend.  You have to challenge for the ball as best you can, without actually committing a foul.  But, sometimes you're going to commit the foul anyway, despite your best efforts.  Nobody's perfect. 

It's a game theory situation.  Maybe, by challenging aggressively and carefully, 99 percent of the time, you save .01 goals.  But, 1 percent of the time, you wind up committing a foul and costing your team .80 goals.  It's still worth it: for every 100 challenges, you wind up .19 goals ahead, overall.
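
The arithmetic behind that .19 can be sketched in a few lines of Python.  The 1 percent foul rate, the .01 goals saved, and the .80 cost are the made-up numbers from the paragraph above; the function names and the break-even calculation are my own additions, for illustration only.

```python
def net_goals(challenges=100, foul_rate=0.01, goals_saved=0.01, penalty_cost=0.80):
    """Expected goals gained by challenging aggressively, over a number of challenges.

    Clean challenges each save `goals_saved`; each foul costs `penalty_cost`.
    """
    saved = challenges * (1 - foul_rate) * goals_saved
    lost = challenges * foul_rate * penalty_cost
    return saved - lost


def break_even_foul_rate(goals_saved=0.01, penalty_cost=0.80):
    """Foul rate at which aggressive challenging stops paying off.

    Solves (1 - p) * goals_saved = p * penalty_cost for p.
    """
    return goals_saved / (goals_saved + penalty_cost)


if __name__ == "__main__":
    print(round(net_goals(), 2))                # 0.19 goals ahead per 100 challenges
    print(round(break_even_foul_rate(), 4))     # 0.0123, a bit over 1 foul per 100
```

With these particular numbers the margin is thin: the challenge only pays off as long as the foul rate stays below about 1.2 percent.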

But ... that one time, where you wind up stepping on the guy's foot or putting your elbow on the ball by accident ... well, those come up randomly.  And they change the game.  Eight-tenths of a goal is a lot, especially in soccer, where a 2-2 game is an offensive explosion.

Here's a video from a Euro 2012 match, where an Italian defender had the ball contact his arm.  Yes, it's a foul, and, yes, it was right in front of the net, but ... it just doesn't seem worth an entire goal, which is what the Germans got out of it.  Fortunately, it didn't decide the game; Italy won 2-1 instead of 2-0.  But, it could have, and it just seems ridiculous to me.

-----

In defense of the soccer death penalty, you could argue that, if these fouls are so random, and so important, how come the better team wins so often?  The home field advantage in soccer, which serves as a pretty good proxy for how much skill affects the outcome, is quite high -- between 60 percent (Asia/Africa) and 69.1 percent (USA) (according to "Scorecasting").  That's higher than all four major North American pro sports.  Only NCAA basketball (68.8 percent) and football (63 percent) are in the same league.

That suggests that even though the overharsh penalties are random, they aren't affecting the outcomes much.

-----

There are, of course, other arguments, on both sides.

I might say, there is more to fairness than just that the better team win a certain proportion of the time.  Otherwise, we could eliminate a bunch of games, by taking the Vegas odds, choosing a random number, and not playing the game at all.  That wouldn't do.  We need to feel that the winning team *deserved* to win.  The random fouls upset that expectation. 

To which a critic might say: soccer isn't really that much different from other sports.  Bill Buckner's error, in 1986, was at least as important to that game as a single penalty kick -- and errors by first basemen are even rarer than hand balls.  Why am I ignoring that case, and so many others in other sports?

To which I respond, the problem isn't just the rareness and randomness.  It's the injustice of the rules.  In baseball, we didn't arbitrarily punish Buckner for missing the grounder by declaring the Mets the winners.  What happened was just a natural consequence of the principle that if you don't make the out, a run might score.  Buckner's situation is not like the death penalty for speeding.  It's more like nature's death penalty for losing control of your car and driving off a cliff.

-----

Or, I might argue, maybe many games *are* affected by the absurd punishments, despite the HFA appearing to be so high.  How do we know that the HFA wouldn't go up to 70 percent, or 75 percent, if the punishments were more suited to the crimes? 

A reasonable response might be: the more you're in the other team's box, the more penalties you're going to induce.  Therefore, the stronger team is going to wind up with more opportunities for penalty kicks than the weaker team, just because they have the ball so much more.  Since goals in soccer are so hard to score, the penalties actually provide an extra bonus reward for dominating the play, which might actually reduce randomness of outcome, not increase it.

And, actually, now that I think about it, I think that might be right.  But, still, it's not fair.  It's like drawing a playing card every time a team penetrates the box, and if it comes up the queen of spades, you give them 0.8 goals.  Statistically, it works to the benefit of the stronger team, but, morally, it's not in keeping with the ideals of sport, that the consequences of a foul should be proportionate to the act.

-----

OK, one more counterargument.  It's possible that discouraging fouls so disproportionately is what makes the better team win so often -- not because of goals scored on penalties, but because it discourages aggressive challenges.  This allows the better team to dominate.  If physical contact were allowed, or, at least, accepted (like in hockey, say), it would be too easy to counter the better team's skill, and defend against goals.  If that cut overall scoring in half -- when it's already very low compared to other sports -- the inferior team would have a much better chance of securing a draw, or a freak win.

I think that counterargument is true.  But is it worth sacrificing "justice" in a single game, in exchange for improving statistical "justice" over a season?  Is it worth killing an occasional speeder to keep the roads safer and more enjoyable?

Not to me.  
