Thursday, November 20, 2014

Does inequality make NBA teams lose?

I

Some people believe that income inequality can hurt group performance. They think that people work better together when employees are more likely to see themselves as equal.

I don't know if that's true or not. But it's a coherent hypothesis, one that makes sense in terms of cause and effect.

On the other hand, here's something that doesn't make sense: the idea that when salaries are more unequal, the result is that the total becomes lower. That doesn't work, right? You can tell the CEO, "if you paid your people more equally last year, the company would have done better." But you can't tell the CEO, "if you paid your people more equally last year, they'd have collectively taken home more money."  

Because, the relationship between total pay and individual pay is already known. The total is just the sum of the individuals. Equality can't possibly cause any additional pay, beyond adding up the amounts.

It would be like saying, "You shouldn't carry $50 bills and $1 bills in the same wallet. If you reduced inequality by carrying only $5 bills and $10 bills, you'd have more money."   

That would be silly.

-------

Well, that's almost exactly what's happening in a recent NBA study, by the same poverty researcher who wrote the baseball inequality article I posted about three weeks ago.

The author looked up individual player Win Shares (WS) for the 2013-14 season. He measured Win Share inequality within each team by calculating the Gini Coefficient for the population of players. He then ran a regression to predict team wins from player inequality. He found a strong negative correlation, -0.43. 

In other words, the more equal teams won significantly more games. 

The author suggests this might be evidence of the benefits of equality. On the more equal teams, the better performance might have been created by the "psychological and motivational benefits" of the weaker players having "better opportunity to develop and showcase their skills."

But ... no, that doesn't make sense, for exactly the same reason as the $10 bill example. 

Win Shares is really just a breakdown and allocation of actual team wins. The formulas take the number of games a team won, and apportion that total among the players. In other words, the team totals equal the sum of the individual totals, the same way the total amount in your wallet equals the sum of the individual bills. (*)

Last year, the Spurs won 62 games, while the 76ers won only 19. That can't have anything to do with equality. It's due to the fact that the Spurs had 62 win dollars in their wallets -- say, eleven $5 bills and seven $1 bills -- while the 76ers had only 19 win dollars -- say, a $10 bill and nine $1 bills. 

It's true that the 76ers players' Win Shares were more concentrated among their best players. In fact, the top 5 percent of their players accounted for more than half the team total. But that doesn't matter. They had 19 wins total because they had a total of 19 wins individually.

If you want the Philadelphia 76ers to win 50 games this year, find players who add up to 50 Win Shares. It doesn't matter if you find ten guys with 5 WS each, or one guy with 30 WS and ten guys with 2 WS. 

In fairness to the author, he does explicitly say that the correlation does not necessarily imply causation here. But the point is: he doesn't realize that he's looking at a relationship where correlation CANNOT POSSIBLY imply causation.

And that's what I found so interesting about the study. At first reading, it looks like such a strong finding, that equality may cause teams to win more ... but after a bit of thought, it turns out it's logically impossible!

The only other time I remember seeing that kind of logical impossibility was that study "proving" that listening to children's music makes you older, by retroactively changing your year of birth. And that one had been created deliberately to make a point.

-------

II

As an aside, another thing I found interesting: in his baseball article, the author argued against unequal pay for baseball players because, he believed, pay seemed to have so little to do with actual merit. But, here, by measuring inequality in Win Shares instead of dollars, he seems to be arguing against inequality of merit itself!

Well, that may be just a tiny bit unfair. Reading between the lines, I think the author thinks Win Shares are much more heavily based on opportunity than they actually are. He writes, "maybe top teams, by virtue of their abundance of success, are more willing to share the glory ... Lack of opportunity, by contrast, can lead to despair and diminished performance."

But, actually, the author never demonstrates that the bad teams have more inequality of opportunity (playing time). I suspect that they don't.

In any case, we can see that the 76ers' high Gini isn't caused mainly by differences in opportunity. Even limiting the analysis to "regulars," players with 1,000 minutes or more, the effect remains. On the 76ers, the top two players had 53 percent of the regulars' total Win Shares. On the Spurs, it was only 29 percent.

-------

III

So why is it that the unequal teams tend to be worse? I think it's a combination of (a) the way the Gini coefficient measures inequality, and (b) the mechanism by which NBA performance creates wins. 

Suppose that on a good team, the five regulars have field goal percentages (FG%) of 59, 57, 55, 53, and 51 percent, respectively. On a bad team, the five players are at 49, 47, 45, 43, and 41 percent.

If you measure inequality on the two teams by variance, it comes out equal: a standard deviation of 2.8 on each team. But if you measure it by Gini coefficient, or a similar calculation of "proportion of total wealth," they're different. 

On the good team, the total percentage points add up to 275. The top player, with 59, has 21.5 percent of the total.

On the bad team, the total percentage points add up to 225. The top player, with 49, has 21.8 percent of the total. So, the bad team is equal by SD, but less equal by "percent of total."

The Gini is more than just the top player, of course ... the formula involves every member of the dataset. Using an online calculator, I found:

The Gini of the good team is 0.029. 
The Gini of the bad  team is 0.036.

So, by Gini, the bad team is less equal than the good team. (A higher Gini means less equality.)
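If you'd rather not trust the online calculator, here's a minimal sketch (in Python) that reproduces those two numbers, using the standard mean-absolute-difference form of the Gini:

def gini(values):
    # Gini = mean absolute difference between all pairs, divided by twice the mean
    n = len(values)
    mean = sum(values) / n
    mad = sum(abs(x - y) for x in values for y in values) / (n * n)
    return mad / (2 * mean)

good = [59, 57, 55, 53, 51]   # good team's FG%
bad  = [49, 47, 45, 43, 41]   # bad team's FG%

print(round(gini(good), 3))   # 0.029
print(round(gini(bad), 3))    # 0.036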

Why does this happen, that the Gini is higher but the variance is the same? Because of the way the two measures differ. Variance stays the same when you *add* the same amount to every player. But not the Gini. The Gini stays the same when you *multiply* every player by the same amount. 

If you *add* a positive number to every player instead of multiplying, the Gini drops. (And, if you *subtract* a positive number from every player, it increases.)

That's often what you want to have happen -- for incomes, say. If I make $50K and you make $10K, we're very unequal. But if you give both of us a $100K raise, now we're at $150K and $110K -- much more equal, intuitively.

The Gini confirms that. Before our raise, the Gini is 0.33. Afterwards, it's 0.08. (But if we use the variance instead, we look the same both ways.)
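Here's the same check for the income example, with the little gini helper repeated so the snippet stands on its own (again, just a sketch):

def gini(values):
    # mean absolute difference between all pairs, divided by twice the mean
    n = len(values)
    mean = sum(values) / n
    mad = sum(abs(x - y) for x in values for y in values) / (n * n)
    return mad / (2 * mean)

incomes = [50_000, 10_000]
print(round(gini(incomes), 2))                         # 0.33
print(round(gini([x + 100_000 for x in incomes]), 2))  # 0.08 -- adding a constant shrinks the Gini
print(round(gini([x * 3 for x in incomes]), 2))        # 0.33 -- multiplying leaves it unchanged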

But for Win Shares, is the Gini-type of inequality really what we want? Are two players with 7 WS and 6 WS, respectively, really that much more equal, in an intuitive basketball sense, than two players with 2 WS and 1 WS? What about two players at 0.002 and 0.2 wins? In that case, one player has 100 times the wins of the other. But does "100 times" really give a proper impression of how different they are?

I don't think so. I think it's just an artifact of the way performance translates to wins.

What's wins? It's performance above replacement value. (Well, actually, WS is measuring above zero value, which is lower, but I'll call it "replacement value" anyway since the logic is the same.)  

So, to get wins, you start with performance, and subtract a constant. As we saw, when you subtract the same positive number from every player, the Gini goes up. It's a "negative raise" that makes employees less equal.

Suppose the average FG% is 50 percent. Suppose that 40 percent is "replacement level" that leads to exactly zero wins, the level at which a team is so bad it will never win a game. Conversely, 60 percent is the level at which a team is so good it will never lose a game. 

If the relationship is linear, it's easy to convert player FG% to Win Shares. Actually, I'll convert to "wins per 100 games," because the "out of 100" scale is easier to follow.

On the good team we talked about earlier, the players had FG% of 59, 57, 55, 53, and 51. That corresponds to W100 of 95, 85, 75, 65, and 55.

On the bad team, the players' FG% of 49, 47, 45, 43, and 41 translate to W100s of 45, 35, 25, 15, and 5.

See what happens? The FG% looks a lot more equal than the wins. On the bad team, the best player was only 20 percent better than the worst player in field goal percentage (49 vs. 41). But in wins ... he's 800 percent better! (45/5.)  On the good team, though, there's still enough performance after subtracting that the numbers look reasonably equal. 

The actual Gini coefficients:

FG%:  The Gini of the good team is 0.029. 
FG%:  The Gini of the bad  team is 0.036.

Wins: The Gini of the good team is 0.11.
Wins: The Gini of the bad  team is 0.32.
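Here's that conversion and both sets of Ginis in one sketch, assuming the linear mapping above (40 percent FG = zero wins, 60 percent FG = 100 wins per 100 games):

def gini(values):
    n = len(values)
    mean = sum(values) / n
    mad = sum(abs(x - y) for x in values for y in values) / (n * n)
    return mad / (2 * mean)

def fg_to_w100(fg_pct):
    # linear map: 40% FG -> 0 wins per 100 games, 60% FG -> 100 wins per 100 games
    return (fg_pct - 40) * 5

good_fg = [59, 57, 55, 53, 51]
bad_fg  = [49, 47, 45, 43, 41]

good_w = [fg_to_w100(x) for x in good_fg]   # [95, 85, 75, 65, 55]
bad_w  = [fg_to_w100(x) for x in bad_fg]    # [45, 35, 25, 15, 5]

print(round(gini(good_fg), 3), round(gini(bad_fg), 3))   # 0.029 0.036
print(round(gini(good_w), 2), round(gini(bad_w), 2))     # 0.11 0.32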

That's just how the math works. The Gini coefficient is very sensitive to where you put your "zero". If you measure zero as 0 FG%, inequality looks low. If you measure zero as zero wins (say, FG% of 40 percent), inequality looks higher. If you measure zero as replacement level (say, FG% of 43 percent), inequality looks even higher. And if you measure zero as an NBA average team (say, FG% of 50 percent), it's even more unequal -- the top half of the teams have 100 percent of the wins! (**) 

The higher the threshold that you call zero, the greater the inequality. 

In baseball, a player hitting .304 has only about 1% more hits than a player hitting .301. But he has 300% more "hits above .300". 

In the economy, the top 10% of families may have (say) 45% of the income -- but probably close to 100% of the new Ferraris. 

And a real-life example: In the NHL, over the last ten seasons, the Blackhawks have 13% more standings points than the Maple Leafs -- but 500% more playoff appearances.

-----

One last analogy:

Take a bunch of middle-class workers, and tax them $40,000 each. They become much more unequal, right? Instead of making, say, $50K to $80K, they now take home $10K to $40K. There's a much bigger difference, now, in what they can afford relative to each other.

But if you tax the same $40K away from a bunch of doctors, it matters less. They may have ranged from $200K to $300K, and now it's $160K to $260K. They're a bit less equal than before, but you hardly notice.

Measuring after the $40K tax is measuring "income above $40K," which is like measuring "FG% above replacement level of 40%" -- which is like measuring Win Shares.

So that's why bad teams in the NBA appear more unequal than the good teams -- because "Wins" are what's left of "Performance" after you levy a hefty replacement-level tax. Most of the players on the good teams stay middle-class after paying the tax -- but on the bad teams, while some stay middle class, more of the others drop into poverty.

It has nothing to do with the social effects of equality or inequality.  It's just an artifact of how the Gini Coefficient and basketball interact.


------

* Actually, there's a bit of wiggle room in the particular version of WS the author used, the version from basketball-reference.com. That version doesn't add up perfectly, but it promises to be close, certainly close enough that it doesn't make a difference to this argument. 

** That's if you give the bottom teams zero. If you give them a negative, the Gini actually winds up at infinity. (The overall total has to be zero relative to the average, and you can't divide by zero.)  



Tuesday, November 04, 2014

Corsi, shot quality, and the Toronto Maple Leafs, part VII

In previous posts, I've argued that when it comes to shots, NHL teams might differ in how they choose to trade quantity for quality. That might partly explain why the Toronto Maple Leafs, for the past few seasons now, have had ugly-looking shot stats, but with an above-average shooting percentage.

Skeptics argue that team shooting percentage (SH%) doesn't seem to have predictive value from season to season, which suggests it's luck rather than skill or strategy. But, at the same time, Corsi for teams seems to have a negative correlation to SH%, which is one piece of evidence that shot quality strategy might be a real issue.

Anyway, read the previous six posts for that argument. This is just an anecdote.

It comes from a piece by James Mirtle, the Maple Leafs beat writer for the Globe and Mail. Mirtle notes that the Toronto coaching staff has directed Morgan Rielly to increase his shot attempts:

[In the October 28 game vs. Buffalo,] Rielly rang up two assists – including a beauty cross-crease pass on James van Riemsdyk’s goal – and was all over the puck generally, generating nine shot attempts.

That propensity to shoot has been Rielly’s biggest shift from a year ago. The coaches want him putting more pucks on the net, and he has responded in dramatic fashion, with 2.8 shots a game compared to 1.3 in his rookie year despite similar ice time.

Even more impressively, Rielly leads all NHL defencemen in generating shot attempts, with 21.6 per 60 minutes at even strength, meaning he’s getting a look at the net roughly every 2.5 minutes he’s on the ice.

He’s winding up more frequently than not only every Leafs defenceman but every Leaf, including shot demon Phil Kessel, something that’s helping drive Toronto to respectable totals on the shot clock most nights.

Entering [the October 31] game against the injury-plagued Blue Jackets in Columbus, the Leafs have been outshot, but only by one: 281-280.

"I told myself this year that I would shoot more," Rielly said. 

Well, isn't that exactly the kind of thing Corsi skeptics should be looking for? It's evidence that coaching decisions can affect shot quantity and quality -- in other words, Corsi and SH%.

It's a small sample size -- the Leafs had played only nine games when Mirtle's piece came out -- but let's see what happens if we take Rielly's numbers at face value and make a few estimates.

Assume Rielly gets 20 minutes of ice time per game. If 80 percent of that is at even strength, it's 16 minutes at 5-on-5. Let's call it 15 to make the calculations easier.

Since he's generating 21.6 even-strength Corsis per 60 minutes, that's 5.4 even-strength Corsis per 15-minute game.*  

*I'm assuming that the "shot attempts" in the article refers to Corsi. If it refers to Fenwick, the effect is even larger than what I'm about to calculate, because the denominators are smaller (since Fenwick leaves out blocked shots).

Rielly's shots roughly doubled since last year, so let's assume his Corsis doubled too. That means his increase from last year must be about 2.7 Corsis per game. 

Last year, those extra 2.7 Rielly shot attempts would have been passes or stickhandles. Assuming half those attempts would have eventually resulted in shots by other players, the increase due to Rielly's shooting is down to 1.3. 

How significant is 1.3 Corsis per game?  In 2013-14, the Leafs were out-Corsied 4,342 to 3,259 at even strength, giving them a league-worst 42.9 Corsi percentage. If you add in 107 Corsis to the "for" side (1.3 times 82 games), it's now 4,342 to 3,366. That would bump Toronto to 43.7 percent. Now, only second worst.
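Here's the whole back-of-envelope calculation in one place. It's just a sketch -- every input is one of the estimates above, not a measured value:

# Estimates from the text -- all assumptions, not measured values
es_minutes = 15                     # Rielly's even-strength minutes per game (rounded down)
corsi_per_60 = 21.6                 # his shot attempts per 60 minutes at even strength
per_game = corsi_per_60 * es_minutes / 60    # 5.4 attempts per game
increase = per_game / 2                      # ~2.7, since his attempts roughly doubled
net_increase = 1.3                           # half of those would have become teammates' shots anyway

leafs_for, leafs_against = 3259, 4342        # Leafs' 2013-14 even-strength Corsi
old_pct = 100 * leafs_for / (leafs_for + leafs_against)
new_for = leafs_for + round(net_increase * 82)          # about 107 extra attempts over a season
new_pct = 100 * new_for / (new_for + leafs_against)

print(round(old_pct, 1), round(new_pct, 1))             # 42.9 43.7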

It's not huge, but it's something that would indeed show up in the stats. And, according to Mirtle, it's something that's due to a deliberate coaching decision. 

How big would the effect be if the coaches decided everyone should shoot more, instead of just one defenseman whose minutes comprise only about six percent of total player ice-time?

-------

Also, you would think those extra shots would have to result in a reduction in shooting percentage, right?  Last year, when Rielly wasn't shooting as much, it was probably because he thought he could set up a better quality shot some other way. And, I would assume, Rielly's shots are taken farther from the net than average, since defensemen usually play the point. 

You could come up with a scenario where shot quality wouldn't drop ... maybe shots from the point lead to a lot of juicy rebounds, so long shots lead to a certain number of extra dangerous shots. Sure, that's possible. But I doubt if that effect, or any other, would make up the quality difference completely. If there were *never* a tradeoff between quantity and quality, every team would be shooting all the time. So, there must be some level of "dangerousness" above which a point shot is a good idea, and below which a pass is better. For shot quality to stay the same when Rielly shoots more, all his new shots would have to come in situations where not only was the shot the best move, but the shot was SO dangerous from the point that it would even be higher quality than the best alternative from closer in.

That's unlikely to be happening if Rielly now leads the league in shot attempts by defensemen. There aren't that many ultra-super-dangerous shot opportunities, never mind ultra-super-dangerous shot opportunities that Rielly wouldn't have taken advantage of last year.

-----

As I write this, it's only eleven Leaf games into the season, which is a very small sample size.  But I checked anyway.  (Here's a YTD link that may be outdated if you're reading this later than today.)

In those 11 games, the Leafs have an above-average Corsi at 50.9% in 5-on-5 tied situations. But they've scored only 5 goals in 121 shots. That's a shooting percentage of 4.13%, dead last in the league.  

There's not enough data for that to really be meaningful, but it's interesting nonetheless.




Wednesday, October 29, 2014

Do baseball salaries have "precious little" to do with ability?

Could MLB player salaries be almost completely unrelated to performance? 

That's the claim of social-science researcher Mike Cassidy, in a recent post at the online magazine "US News and World Report."

It offers an "economics lesson from America's favorite pastime." Specifically: How can it be true that high salaries are earned by merit in America, when it's not even the case in baseball -- one of the few fields in which we have an objective record of employee performance?

The problem is, though, that baseball players *are* paid according to ability. The author's own data shows that, despite his claims to the contrary. 

------

Cassidy starts by charting the 20 highest-paid players in baseball last year, from Alex Rodriguez ($29 million) to Ryan Howard ($20 million). He notes that only two of the twenty ranked in the top 35 in Wins Above Replacement (WAR). The players in his list average only 2.2 WAR. That's not exceptional: it's "about what you need to be an everyday starter."  

It sounds indeed like those players were overpaid. But it's not quite so conclusive as it seems.

WAR is a measure of bulk contribution, not a rate stat. So it depends heavily on playing time. A player who misses most of the season will have a WAR near zero. 

In 2013, Mark Teixeira played only 15 games with a wrist injury before undergoing surgery and losing the rest of the season. He hit only .151 in those games, which would explain his negative (-0.2) WAR. However, it's only 53 AB -- even if Teixeira had hit .351, his WAR would still have been close to zero.

A-Rod missed most of the year with hip problems. Roy Halladay pitched only 62 innings as he struggled with shoulder and back problems, and retired at the end of the season.

If we take out those three, the remaining 17 players average out to around 2.6 WAR, at an average salary of $22 million. It works out to about $8.4 million per win. That's still expensive -- well above the presumed willingness-to-pay of $5 to $6 million per expected win.

If we *don't* take out those three, it's about $10 million per win. Even more expensive, but hardly suggestive of a wide disconnect between pay and performance. At best, it suggests that the one year's group of highest-paid players performed worse than anticipated, but still better than their lower-paid peers.

Furthermore: as the author acknowledges, many of these players have back-loaded contracts, where they are "underpaid" relative to their expected year's talent earlier in the contract, and "overpaid" relative to their expected year's talent later in the contract. 

Even a contract at a constant salary is back-loaded in terms of talent, since older players tend to decline in value as they age. I'm sure the Yankees didn't expect Alex Rodriguez to perform at age 37 nearly as well as he did at 33, even though his salary was comparable ($28MM to $33MM).

All things considered, the top-20 data is very good evidence of a strong link between pay and performance in baseball. Not as strong as I would have expected, but still pretty strong.

-------

As further evidence that pay is divorced from performance, the author notes that, even limiting the analysis to players who have free-agent status, "performance explains just 13 percent of salary."  It's not just a one-year fluke. For each of the past 30 years, the r-squared has consistently hovered in a narrow band between 10 and 20 percent.

That sounds damning, but, as is often the case, it's based on a misinterpretation of what the r-squared means. 

Taking the square root of .13 gives a correlation of .36. That's not too bad: it means that 36 percent of a player's salary (above or below average) is reflected in (above- or  below-average) performance.

Still, you do have to regress salary almost 64 percent to the mean to get performance. Doesn't that show that almost two-thirds of a player's salary is unrelated to merit?

No. It shows most of a player's salary is unrelated to *performance,* not that it's unrelated to *merit*. Performance is based on merit, but with lots of randomness piled on top that tends to dilute the relationship.

You might underestimate the amount of randomness relative to talent, especially if you're still thinking of those top-20 players. But most players in MLB are not far from the league minimum, both in salary and talent.

According to the article, the 358 lowest-paid players in baseball in 2013 made an average $534,000 each. 

With a league minimum of $500,000, those 358 players must be clustered very tightly together in pay. And the range of their talent is probably also fairly narrow. But the range of their performance will be wide, since they'll vary in how much playing time they get, and whether they have a lucky or unlucky year. 

For those 358 players alone, the correlation between pay and performance is going to be very close to zero, even if pay and talent correlate perfectly. (Actually, the author's numbers are based on only players with 6+ seasons in MLB, so it's a smaller sample size than 358 -- but the logic is the same.)

When you add in the rest of the players, and the correlation rises to 0.36 ... that's pretty good evidence that there's a strong link between pay and performance overall. And when you take into account that there's also significant randomness in the performances of the highly-paid players, it must be that the link between pay and *merit* is even higher.
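A tiny simulation makes the point concrete. This is just a sketch under made-up assumptions -- pay tracking talent exactly, an exponential talent distribution, and a luck SD of one "win" -- not the author's data:

import random

random.seed(1)

# Made-up model: pay tracks talent exactly; performance = talent plus one season of luck
players = []
for _ in range(5000):
    talent = random.expovariate(1.0)              # most players near the bottom, a few stars
    pay = 0.5 + 6.0 * talent                      # pay exactly proportional to talent
    performance = talent + random.gauss(0, 1.0)   # luck swamps talent for the low-talent cluster
    players.append((pay, performance))

def corr(pairs):
    n = len(pairs)
    mx = sum(p for p, _ in pairs) / n
    my = sum(q for _, q in pairs) / n
    sxy = sum((p - mx) * (q - my) for p, q in pairs)
    sxx = sum((p - mx) ** 2 for p, _ in pairs)
    syy = sum((q - my) ** 2 for _, q in pairs)
    return sxy / (sxx * syy) ** 0.5

near_minimum = [pq for pq in players if pq[0] < 1.5]   # pay clustered near the minimum

print(round(corr(players), 2))        # well below 1.0, even though pay tracks talent exactly
print(round(corr(near_minimum), 2))   # close to zero within the clustered group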

------

The author has demonstrated the "low r-squared" fallacy -- the idea that if the number looks low enough, the relationship must be weak enough to dismiss. As I have argued many times, that's not necessarily the case. Without context or argument, the "13 percent" figure could mean anything at all.

In fact, here's a situation where you have an r-squared much lower than .13, but a strong relationship between pay and performance.

Suppose that player salary were somehow exactly proportional to performance. That is, at the end of the season, the r-squared turned out to be 100 percent, instead of 13 percent. (Or some number close enough to 100 percent to satisfy the author.)

In baseball, as in life, people don't perform exactly the same every day. Some days, Mike Trout will be the highest-paid player in baseball, but he'll still wind up going 0-for-4 with three strikeouts.

So even if the correlation between season pay and season performance is 100% perfect, the correlation between *single game* pay and *single game* performance will be lower.

How much lower?  I ran a test with real data. I compiled batter stats for every game of the 2008 season, and ran a regression between the player's on-base percentage (OBP) for that single game, versus his OBP for the season. 

The correlation was .016. That's an r-squared of .000265.

The r-squared of .13 the article found between pay and performance is almost *five hundred times* as large as the one I found between single-game performance and "pay" -- season OBP, in my hypothetical.

Even though my r-squared is tiny, we can agree that Mike Trout is still paid on merit, right? It would be hard to argue that there was a fundamental inequity in MLB pay practices for April 11, just because Mike Trout didn't produce that day.

Well, I suppose, on a technicality, you could argue that pay isn't based on merit for a game, but *is* based on merit for a season. But if you make that argument for game vs. season, you can make the same argument for season vs. expectation, or season vs. career. 

The r-squared might be only 13 percent for a single season, but higher for groups of seasons. Furthermore, if you could play the same season a million times over, luck would even out, performance would converge on merit, and the r-squared would move much closer to 100%.

And the article provides evidence of that! When the author repeated his regression by using the average of three seasons instead of one, the r-squared doubled -- now explaining "just a quarter of pay." An r-squared of 0.25 is a correlation of 0.5 -- half of performance now reflected in salary.

Half is a lot, considering the amount of luck in batting records, and taking into account that luck is much more important than talent for the bunch of players clustered at the bottom of the salary scale. 

Again, the article's own evidence is enough to refute its argument.

-------

I think we can quantify the amount of luck in a batter's season WAR. 

A couple of years ago, I calculated the theoretical SD of a team's season Linear Weights batting line that's due to luck. It came out to 31.9 runs. 

Assuming a regular player gets one-ninth of a team's plate appearances, his own SD would be 1/3 the team's (the square root of 1/9). So, that's about 10.6 runs. Let's call it 10 runs, or 1.0 WAR. 

That one-win figure, though, counts only the kind of luck that results from over- or undershooting talent. It doesn't consider injuries, suspensions, or sudden unexpected changes in talent itself. 

Going back to the top 20 players in the chart ... we saw that three of those had injuries. Another three, it appears, had sudden drops in ability after they were signed (Vernon Wells, Tim Lincecum, and Barry Zito). 

Removing those six players from the list (which might be unfair selective sampling, but never mind for now), the remainder averaged 3.4 WAR. That's about $6.4 million per win -- very close to the consensus number. It would be even lower if we adjusted for back-loaded contracts.

At an SD of 1 WAR per player, the SD of the average of 14 players is 0.27 WAR. Actually, that's the minimum; it would be higher if any of the 14 were less than full-time. Also, the list includes starting pitchers -- I don't know if the luck SD of 1 win is reasonable for starters as well as batters, but I suspect it's close enough.

So, let's go with 0.27. We'll add and subtract 2 SD -- 0.54 -- from the observed average of 3.4. That gives us a confidence interval of 2.9 to 3.9 WAR.

At 3.9 WAR, we get $5.6 million per win: almost exactly the amount sabermetricians (and probably front offices) have calculated based on the assumption that teams want to pay exactly what the talent is worth.

That is: it appears the results are not statistically significantly different from a pure "pay for performance" situation.
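Here's that arithmetic laid out, as a sketch. The only number not already quoted above is the average salary, which I back out from "3.4 WAR at about $6.4 million per win":

team_sd_runs = 31.9                   # theoretical luck SD of a team's season batting line
player_sd_runs = team_sd_runs / 3     # one-ninth of the PAs means one-third of the SD (~10.6 runs)
player_sd_war = 1.0                   # call ~10.6 runs roughly ten runs, or one win

n_players = 14
sd_of_average = player_sd_war / n_players ** 0.5      # ~0.27 WAR

observed_avg_war = 3.4
low = observed_avg_war - 2 * sd_of_average            # ~2.9
high = observed_avg_war + 2 * sd_of_average           # ~3.9
print(round(low, 1), round(high, 1))

avg_salary = 3.4 * 6.4                # implied average salary, in $MM
print(round(avg_salary / 3.9, 1))     # ~5.6 ($MM per win at the top of the interval)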

------

When the US News article talks about luck, it's different from the kind of luck I'm calculating here. The author isn't actually complaining that the overpaid players got unlucky and underperformed their pay. Instead, he believes that the highly-paid players were overpaid for their true ability, because they were "lucky" enough to fool everyone by having a career year at exactly the right time:


"In America, we tend to think of income as a reward for skill and hard work. ...

"But baseball shows us this view of the world is demonstrably flawed. 
Pay has preciously little to do with performance. Instead, being a top earner means having a good season immediately preceding free agency in a year where desperate, rich teams are willing to award outsized long-term contracts. ... 

"In other words, while ability and effort matter, it’s also about good luck."

Paraphrased, I think he's saying something like: "I've shown that pay is barely related to performance. Why, then, are some players paid huge sums of money, while others make the minimum?  It can't be merit. It must be that some players have a lucky year at a lucky time, and GMs don't realize the player doesn't deserve the money."

In other words: baseball executives are not capable of evaluating players well enough to realize that they're throwing away millions of dollars unnecessarily.   

The article gives no evidence to support that; and, furthermore, it appears that the author doesn't try, himself, to evaluate players and factor out luck. Otherwise, he wouldn't say this:


"But among average players, salaries vary enormously. For every Francisco Cervelli (Yankees catcher, $523,000 salary, 0.8 WAR), there is a CC Sabathia (Yankees pitcher, $24.7 million salary, 0.3 WAR). Both contribute about the same to the Yankees’ success (or lack thereof), but Sabathia earns roughly 50 times more."

Does he really believe that Sabathia and Cervelli should have been paid as equal talents?  Isn't it obvious that their 2013 records are similar only because of luck and circumstance?

Francisco Cervelli earned his +0.8 WAR in 61 plate appearances. That's about one-and-a-half SDs above +0.3, his then-career average per 61 PA.

Sabathia's salary took a jump after the 2009 season, at a time where he was averaging around 4 WAR per season. From 2010 to 2012, he actually improved that trend, creating +15.6 WAR total in those three years. It wasn't until 2013 that he suddenly lost effectiveness, dropping to 0.3 as reported. 

So it's not that Sabathia was just lucky to be in the right place at the right time. It's that he was an excellent player before and after signing his contract, but he suffered some kind of unexpected setback as he aged. (Too, his contract was structured to defer some of his peak years' value to his declining years.)

And it's not that Cervelli was unlucky to be in the wrong place at the wrong time, unable to find a "desperate" team otherwise willing to pay him $20 million. He's just a player recognized as not that much better than replacement, who had a good season in 2013 -- a "season" of 61 plate appearances where he was somewhat lucky.

-------

In his bio, the author is described as "a policy associate at The Century Foundation working on issues of income inequality." That's really what the article is getting at: income inequality. The argument that MLB pay is divorced from performance is there to support the broader argument that inequality of income is caused by highly-paid employees who don't deserve it.

Here's his argument summarized in his own words:


"The first thing to appreciate is just how unequal baseball is. During the 2013 season, the eight players in baseball's 'top 1 percent' took home $197 million, or $6 million more than the 358 lowest-paid players combined. The typical, or 'median,' major league player would need to play 20 seasons to earn as much as a top player makes in one. ...

"But ... pay has preciously little to do with performance. ...

"In other words, while ability and effort matter, it’s also about good luck. And if that’s true of a domain where every aspect of performance is meticulously measured, scrutinized and endlessly debated, how much more true is it of our society in general?

"We end up with CEOs that make 300 times the average worker and 45 million poor people in a country with $17 trillion in GDP. And we accept it as fair."

Paraphrasing again, the argument seems to be: "Salary inequality in baseball is bad because it's caused by teams rewarding ability that isn't really there. If baseball players were paid according to performance instead of circumstance, those disturbing levels of inequality would drop substantially, and the top 1% would no longer dominate."

It sounds reasonable, but it's totally backwards. If the correlation between pay and performance were higher, players' pay would become MORE unequal.

Suppose salaries were based directly on WAR. At the end of the season, the teams pay every free agent $6 million for every win above zero, plus the $500,000 minimum. (That's roughly what they're paying now, on expectation. Since expected wins equal actual wins, that would keep the overall MLB free-agent payroll roughly the same.)

Well, if they did that, the top salary would take a huge, huge jump.

Among the top 20 in the chart, the top two WAR figures are 7.5 (Miguel Cabrera) and 7.3 (Cliff Lee).

Under the new salary scale, both players would get sharp increases. Cabrera would jump to $45 million, and Lee to $44 million. The highest salary in MLB would go to Carlos Gomez, whose 2013 season was worth 8.9 WAR (4.6 of that from defense). Under the new system, Gomez would earn some $53 million. 

Under pay-for-performance, it would take only around 4.8 WAR to earn more than the current real-life highest salary, A-Rod's $29.4 million. In 2013, that would have been accomplished by 32 players.

Carlos Gomez's salary would exceed the real-life A-Rod by 82 percent. Meanwhile, replacement players would still be making the minimum $500K. And Barry Zito, with his negative 2.6 wins, would *owe* the Giants $15 million. 
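The hypothetical pay scale is easy enough to write down -- a sketch of the "$6 million per win above zero, plus the minimum" rule, using the 2013 WAR figures quoted above:

def performance_salary(war, per_war=6.0, minimum=0.5):
    # $6 million per win above zero, plus the $500K minimum (everything in $MM)
    return minimum + per_war * war

for name, war in [("Carlos Gomez", 8.9), ("Miguel Cabrera", 7.5),
                  ("Cliff Lee", 7.3), ("Barry Zito", -2.6)]:
    print(name, round(performance_salary(war), 1))
# Carlos Gomez 53.9, Miguel Cabrera 45.5, Cliff Lee 44.3, Barry Zito -15.1

# WAR needed to beat the real-life top salary, A-Rod's $29.4 million:
print(round((29.4 - 0.5) / 6.0, 1))   # about 4.8 wins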

Clearly, inequality would increase, not decrease, if the connection between pay and performance became stronger. 

Mathematically, that *has* to happen. When luck is involved, and applies equally to everyone, the set of outcomes always has a wider range than the set of talents. As usual,

var(outcomes) = var(talent) + var(luck)

Since var(luck) is always positive, outcomes always have a wider range than expectations based on talent. 
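A quick numerical check of that identity, as a sketch with arbitrary made-up spreads (talent SD of 2 wins, luck SD of 1 win):

import random

random.seed(2)

talent = [random.gauss(3.0, 2.0) for _ in range(100_000)]
outcomes = [t + random.gauss(0.0, 1.0) for t in talent]   # outcome = talent + independent luck

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

print(round(var(talent), 1), round(var(outcomes), 1))   # roughly 4.0 and 5.0: outcomes spread wider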

In fairness to the author, he doesn't think players are paid by talent. As we saw, he believes teams pay by misinterpreting random circumstances, a "right place right time" or "team likes me" kind of good luck.

If that's really happening, and you eliminate it by basing pay directly on measurable performance, then, yes, it's indeed possible for inequality to go down. If Francisco Cervelli were being paid $100 million per season, because he was Brian Cashman's favorite, then instituting straight pay-by-performance would lower the top salary from $100 million to $53 million, and inequality would decrease.

But, as we saw, that's not the case: the real-life top salaries are much lower than the "pay-by-performance" top salaries. That means that teams aren't systematically overpaying. Or, at least, that they're not overpaying by anything near as much as 82 percent.

-------

Imagine an alternate universe in which players have always been paid under the "new" system, $6 million per WAR. In that universe, as we have seen, the ratio between the top and median salaries is much higher than it is now, maybe 50 times instead of 20.

Then, someone comes along and presents a case for more equality:


"MLB salaries aren't as fair as they could be. They're based on outcomes, where they should be based on talent. Francisco Cervelli gets credit for 0.8 wins in 61 PA, even though we know he's not that good, and he just happened to guess right on a couple of pitches. 

"Players should be paid based on their established and expected performance, by selling their services to the highest bidder, before the season starts. That eliminates luck from the picture, and salaries will be based more on merit. The salary ratio will drop from 50 to 20, the range will compress, and the top players will earn only what they merit, not what they produce by luck."

Isn't THAT the situation that you'd expect someone to advocate if they were concerned about (a) rewarding on merit, (b) not rewarding on luck, and (c) reducing inequality of salaries?

Why, then, is this author advocating a move in the exact opposite direction?




(Hat tip: T.M.)



Tuesday, October 14, 2014

Corsi, shot quality, and the Toronto Maple Leafs, part VI

A year ago, I wrote about how I wasn't completely sold on Corsi and Fenwick as unbiased indicators of future NHL success. In a series of five posts (one two three four five), I argued that it did appear that "shot quality" issues could be a big factor -- if not for all teams, then maybe at least for some of them, like, perhaps, the Toronto Maple Leafs.

I haven't kept up with hockey sabermetrics as much as I should have, but, as far as I know, the issue of how much shot quality impacts Corsi remains unresolved.

In that light, and in hopes that I haven't rediscovered the wheel, here's some more evidence I came across that seems to suggest shot quality might be a bigger issue than even I had suspected.

It's from a post at Hockey-Graphs, where Garret Hohl looked at some shot quality statistics for every NHL team, for approximately the first 30 road games of last season (2013-14). 

His data came from Greg Sinclair's "Super Shot Search," which plots every shot on goal on a map of the ice surface. Sinclair's site allows you to restrict your search to what he calls "scoring chances," which are shots taken from closer in. Specifically, a "scoring chance" is defined as a shot on goal taken from within the pentagon formed by the midpoint of the goal line, the faceoff dots, and the tops of the two circles.

Hohl calculated, for every team, what percentage of opposing shots were close-in shots. (He limited the count to 5-on-5 situations in road games, in order to reduce power-play and home-scorer biases.)  Data in hand, he then ran a regression to see how well a team's "regular" Fenwick corresponded to its "scoring chances only" Fenwick. His chart shows what appears to be a strong relationship, with a correlation of 0.83. 

However: the biggest outlier was ... Toronto. 

Just as in the previous two seasons, the Leafs continued to outperform their Fenwick in 2013-14. What Hohl has done is to produce some data showing that the effect resulted, at least in part, from their opposition taking lower-quality shots.

----

Anyway, the Leafs are really just a side point. What struck me as much more important are some of the other implications of the data Hohl unearthed. Specifically, how teams varied so much in those opponent scoring chances. The differences were much, much larger than I expected.

I'll steal Hohl's chart:




The Minnesota Wild defense was the best at limiting their opponents to weaker shots: only 32.3 percent of their shots allowed were from in close (206 of 637). The New York Islanders were the worst, at 61.4 percent (475 of 773). 

Shot for shot, the Islanders gave up twice as many close-in chances as the Wild. 

Could this be luck?  No way. The average number of shots in Hohl's table is around 750. If the average scoring-chance ratio is 44 percent, the SD from binomial luck should be around 1.8 percentage points. That would put the Islanders around 9 SD from the mean, and the Wild 7 SD from the mean. 

The observed SD in the chart is 5.6 percentage points. That means the breakdown, with the SDs combining in quadrature, is:

1.8 SD of theoretical luck
5.3 SD of real differences
--------------------------
5.6 SD as observed
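Here's the arithmetic behind that breakdown, as a sketch using the rough 750-shot and 44-percent averages from above:

shots = 750                 # rough average shots against in Hohl's table
p = 0.44                    # rough average scoring-chance ratio

luck_sd = 100 * (p * (1 - p) / shots) ** 0.5         # ~1.8 percentage points of binomial luck
observed_sd = 5.6                                    # SD actually observed across teams
real_sd = (observed_sd ** 2 - luck_sd ** 2) ** 0.5   # the SDs combine in quadrature

print(round(luck_sd, 1), round(real_sd, 1))          # 1.8 5.3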

Now, the "real" differences might be score effects: shooting percentages rise when a team is ahead, presumably because they take more chances and give up more odd-man rushes, and such. Those effects are large enough that they screw up a lot of analyses, and I wish more of those little studies you find on the web would limit themselves to 5-on-5 tied to avoid those biases.

But, in this case, the differences are too big to just be caused by score effects.

In 5-on-5 situations from 2007-2013, the league shooting percentage was 7.52 percent when teams were tied, but 9.19 percent for teams ahead by 2 goals or more. As big a difference as that is, the Islanders can't have trailed by 2+ goals often enough for it to make such a huge difference in scoring chances.

From my calculations, the difference between the Islanders and Wild is something that would happen naturally only if the Islanders were *always* down 2+ goals, and the Wild were *always* up 2+ goals.** But that obviously isn't the case. In fact, the Islanders were down 2+ goals only about 10 percent more often than the Wild last year, and up 2+ goals only 21 percent less often. The two differences add up to about eight periods out of a full 5-on-5 road season.

(** How did I figure that?  Suppose the shooting percentage on close shots is 13%, and 4% on far shots. At 45 percent close and 55 percent far, you get a shooting percentage of 8.1 percent. At 65 percent close, and 35 percent far, shooting percentage rises to 9.8 percent. That's a little bigger than the difference between up 2+ and tied.

So, it seems like, when you're up 2+ goals, 60 to 65 percent of your shots are scoring chances, compared to 35 to 40 percent when you're down 2+ goals.)

------

As for the Leafs: they were fourth-best in the league in the percentage of opposition shots that were scoring chances, at 38.2%. That's despite -- or because of? -- allowing the most shots, by far, of any team in the sample, at 926. (The second highest was Washington, at 843.)

It seems to me like this is significant evidence that teams vary in the quality of shots they allow -- in a huge way. The score effects can't be THAT large.

The only possibility that I can think of is biased scorers. But Hohl confirms that each team had an assortment of opposition home team scorers and rinks, so that shouldn't be happening.

-----

Here's some additional evidence that the scoring chance data is meaningful. 

I ran a correlation between team scoring chance percentage and goalie save percentage. If scoring chance percentage didn't matter, the correlation would be low. If it did matter, it would be high. (For save percentage, I used 5-on-5, tie score, both home and road.)

The correlation turned out to be ... -0.44. That's pretty high. (Especially considering that the scoring chance percentage was based on only 30 road games per team.)  

The SD of save percentage was 0.96 percentage points. The SD of scoring chance percentage (after 3/4 of the season) was 5.6 points. 

That means for every excess percentage point of scoring chance percentage, you have to adjust save percentage by 0.075 percentage points. 

The Los Angeles Kings' opponents took close-in shots at a rate a bit more than 3 percentage points below normal. That had the effect of inflating their goalies' save percentage by about 0.25 points. So, we can estimate that their "true talent" was closer to 93.45 than 93.7.

If you like, think of it as two or three points of PDO: the Kings move from 1000 to 997.5 on this adjustment. 

For Toronto, it's five points: they drop from 1019 to 1014. 

The Rangers, for one more example, went the other way -- they gave their opponents 8 percentage points more close-in shots than average. Adjusting for that would boost their adjusted save percentage from 91.6 to 92.2, and their PDO from 974 to 980.
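Spelled out as a calculation -- a sketch, where the 0.075 slope is the regression estimate above, the 44 percent league average comes from earlier, and the team figures are the approximate ones quoted here:

SLOPE = 0.075   # save-percentage points per excess point of opponent scoring-chance percentage

def adjusted_save_pct(sv_pct, chance_pct_against, league_avg=44.0):
    # dock goalies who faced an easier-than-average shot mix; credit the reverse
    return sv_pct + SLOPE * (chance_pct_against - league_avg)

print(round(adjusted_save_pct(93.7, 40.7), 2))   # Kings: ~93.45 (about 0.25 points of inflation removed)
print(round(adjusted_save_pct(91.6, 52.0), 2))   # Rangers: ~92.2 (they faced 8 points more close-in shots)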

-----

OK, one more bit of evidence, this time subjective.

Recently, a survey from nhl.com ranked the best goalies in the league, from 1 to 14, with 15-18 mentioned in the footnotes. (I'm leaving out John Gibson, who only played one regular-season game, and I'm considering goalies not mentioned to have a ranking of 19.)

I checked the correlation between team goalie ranking and save percentage. It was -0.45. Again, that's pretty strong, considering how subjective the rankings are. 

Of course, some goalies were probably ranked high *because* of their save percentage. So cause and effect are partly mixed up here (but I think that will actually strengthen this argument).

For the next step, I adjusted each goalie's save percentages to give credit for the quality of the shots their team faced. That is, I raised or lowered their SV% for the shot quality percentages listed in Hohl's post, at the rate of 0.075 points we discovered earlier. 

What happened?  The correlation between ranking and SV% got *stronger* -- moving from -0.45 to -0.50. 

It looks like the voters "saw through" the illusion in save percentage caused by differing shot quality. Well, that might be giving them too much credit: they might have ignored save percentage entirely, and just concentrated on what they saw with their eyes. Actually, I'm probably giving them too *little* credit: they're no doubt basing their evaluations on a full career, not just one season, and maybe team shot quality evens out somewhat in the long run.

Either way, when the voters differed from SV%, it was in the direction of the goalies who faced tougher tasks.  I think that's reasonable evidence that differences in shot quality are real. 

Oh, and one more thing: the highest correlation seems to occur almost exactly at the theoretical adjustment the regression picked out, 0.075. When I drop the adjustment in half (to 0.0375), the correlation drops a bit (-0.48, I think). When I double the adjustment to 0.15, the correlation drops to -0.44. 

Now, that *has* to be coincidence; the voters can't be that well calibrated, can they? And ranking numbers of 1 to 19 are kind of arbitrary.

Still, it does work out nicely, that the voters do seem to agree with the regression.

------

I think all this casts serious doubt on the idea that PDO (the sum of team shooting percentage and save percentage) is essentially random. The Islanders had a league-worst PDO of 982, but that's probably because their opponents took 61.4% of their shots from close-in, compared to the Islanders' own 42.8%. In other words, if you calculate a "shot quality PDO", the Islanders come in at 814. (That's calculated as 428 + (1000-614).)

The Leafs had the league's fourth best PDO, at 1019. But their shots were much higher quality than their opponents', 47.2% to 38.2%. So their "shot quality PDO" was 1090. 

For all 30 teams, the correlation between PDO and "shot quality PDO" was 0.43 -- significantly high. The coefficient works out to approximately a 1:10 ratio. The Islanders' -186 point "shot quality PDO" difference translates to around -19 points of PDO. The Leafs' +90 works out to about +9.
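For the record, here's the "shot quality PDO" arithmetic as a sketch, using the percentages quoted above:

def shot_quality_pdo(own_chance_pct, opp_chance_pct):
    # analogous to PDO: own scoring-chance rate plus (1000 minus the opponents' rate), in PDO-style units
    return round(own_chance_pct * 10 + (1000 - opp_chance_pct * 10))

isles = shot_quality_pdo(42.8, 61.4)   # 814
leafs = shot_quality_pdo(47.2, 38.2)   # 1090

# The regression coefficient works out to roughly a 1:10 ratio
print((isles - 1000) / 10)             # about -19 points of PDO for the Islanders
print((leafs - 1000) / 10)             # about +9 for the Leafs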

I'll show data and work out more details in a future post (probably next week, I'm out of town for a few days starting tomorrow). 

(One thing that's interesting, that I want to look into, is that the SD of team quality shot percentage *for* is only about half of the SD of quality shot percentage *against* (2.7 versus 5.6). Does that mean that defenses vary more than offenses? Hmmm...)

------

So I think all of this comprises strong evidence that teams differ non-randomly in the quality of shots they allow. That doesn't invalidate the hypothesis that Corsi is still a better predictor of future success than goals scored. But it *does* suggest that you can likely improve Corsi by adjusting it for shot quality. And it *does* suggest that PDO isn't random after all.

In other words: Corsi might be misleading for teams with extreme shot quality differences.

A baseball analogy: using Corsi to evaluate NHL teams is like using on-base percentage to evaluate MLB teams. Some baseball teams will do much better or worse than their "OBP Corsi", for non-random reasons -- specifically, if they have high "hit quality" by hitting lots of home runs, or low "hit quality" by building their "OBP Corsi" on "lower quality" walks.

In 2014, the Orioles were fifth-worst in the American League with an OBP of only .311. But they were above average in runs scored. Why?  Mostly because they hit more home runs than any other team, by a wide margin.

Might the Toronto Maple Leafs be the Baltimore Orioles of the NHL?



Sunday, September 28, 2014

Experts

Bill James doesn't like to be called an "expert." In the "Hey Bill" column of his website, he occasionally corrects readers who refer to him that way. And, often, Bill will argue against the special status given to "experts" and "expertise."

This, perhaps understandably, puzzles some of us readers. After all, isn't Bill's expertise the reason we buy his books and pay for his website?  In other fields, too, most of what we know has been told to us by "experts" -- teachers, professors, noted authors. Do we want to give quacks and ignoramuses the same respect as Ph.Ds?

What Bill is actually arguing, I think, is not that expertise is useless -- it's that in practice, it's used to fend off argument about what the "expert" is saying.  In other words, titles like "expert" are a gateway to the fallacy of "argument from authority."

On November 8, 2011 (subscription required), Bill replied to a reader's question this way:


"I've devoted my whole career to battling AGAINST the concept of expertise. The first point of my work is that it DOESN'T depend on expertise. I am constantly reminding the readers not to regard me an expert, because that doesn't have anything to do with whether what I have to say is true or is not true."

In other words: don't believe something because an "expert" is saying it. Believe it because of the evidence. 

(It's worth reading Bill's other comments on the subject; I wasn't able to find links to everything I remember, but check out the "Hey Bill" pages for November, 2011; April 18, 2012; and August/September, 2014.)

Anyway, I'd been thinking about this stuff lately, for my "second-hand knowledge" post, and Bill's responses got me thinking again. Some of my thoughts on the subject echo Bill's, but all opinions here are mine.

-------

I think believing "experts" is useful when you're looking for the standard, established scientific answer.  If you want to know how far it is from the earth to the sun, an astronomer has the kind of "expertise" you can probably accept.

We grow up constantly learning things from "experts," people who know more than we do -- namely, parents and teachers. Then, as adults, if we go to college, we learn from Ph.D. professors. 

Almost all of our formal education comes from learning from experts. Maybe that's why it seems weird to hear that you shouldn't believe them. How else are you going to figure out the earth/sun distance if you're not willing to rely on the people who have studied astronomy?

As I wrote in that previous post, it's nice to be able to know things on your own, directly from the evidence. But there's a limit to how much we can know that way. For most factual questions, we have to rely on other people who have done the science that we can't do.

-------

The problem is: in our adult, non-academic lives, the people we call "experts" are rarely used that way, to resolve issues of fact. Few of the questions in "Ask Bill" are about basic information like that. Most of them are asking for opinion, or understanding, or analysis. They want to pick Bill's brain.

From 1/31/2011: "Would you have any problem going with a 4-man rotation today?"

From 10/7/2013: "Bill, you wrote in an early Abstract that no one can learn to hit at the major league level. Do you still believe that?"

From 10/29/2012: "Do you think baseball teams sacrifice bunt too much?"

In those cases, sure, you're better off asking Bill than asking almost anyone else, in my opinion. Even so, you shouldn't be arguing that Bill is right because he's an "expert."  

Why?  Because those are questions that don't have an established, scientific answer based on evidence. In all three cases, you're just getting Bill's opinion. 

Moreover: all three of those issues have been debated forever, and there's *still* no established answer. That means there are opinions on both sides. What makes you think the expert you're currently asking is on the correct side? Bill James doesn't think a four-man rotation is a bad idea, but any number of other "experts" believe the opposite. 

Subject-matter experts should agree on the basic canon, sure. It should be rare that a physics "expert" picks up a textbook and has serious disagreements with anything inside.

But: they can only agree on answers that are known. In real life, most interesting questions don't have an answer yet. That's what makes them so interesting!

When will we cure cancer? What's the best way to fight crime? When should baseball teams bunt? Will the Seahawks beat the spread?

Even the expertest expert doesn't know the answer to those questions. Some of them are unknowable. If anyone was "expert" enough to predict the outcome of football games, he'd be the world's richest gambler. 

-----

All you can really expect from an expert is that he or she knows the state of the science.  An expert is an encyclopedia of established knowledge, with enough understanding and experience to draw inferences from it in established ways.

Expertise is not the same as intelligence. It is not the same as wisdom. It is not the same as insight, or freedom from bias, or prescience, or rationality.

And that's why you can get different "experts" with completely different views on the exact same question, each of them thinking the other is a complete moron. That's especially true on controversial issues. (Maybe it's not that controversial issues are less likely to have real answers, but that issues that have real answers are no longer controversial.)

On those kinds of issues, where you know there are experts on both sides, you might as well flip a coin as rely on any given expert.

And hot-button issues are where you find most of the "experts" in the media or on the internet, aren't they?  I mean, you don't hear experts on the radio talking how many neutrons are in an atom of vanadium. You hear them talking about what should be done to revive the sagging economy. Well, there's no consensus answer for that. If there were, the Fed would have implemented it long ago, and the economy would no longer be sagging. 

Indeed, the fact that nobody is taking the expert's advice is proof that there must be other experts that think he's wrong.

Sometimes, still, I find myself reading something an expert says, and nodding my head and absorbing it without realizing that I'm only hearing one side. We don't always consciously notice the difference, in real time, between consensus knowledge and the "expert's" own assertions.

Part of the reason is that they're said in the same, authoritative tone, most of the time. Listen to baseball commentators. "Jeter is hitting .302." "Pitching is 75 percent of baseball." You really have to be paying attention to notice the difference. And, if you don't know baseball, you have no way of knowing that "75 percent of baseball" isn't established fact! At least, until you hear someone dispute it.

Also, I think we're just not used to the idea that "experts" are so often wrong. For our entire formal education, we absorb what they teach us about science as unquestionably true. Even though we understand, in theory, that knowledge comes from the scientific method ... well, in practice, we have found that knowledge comes from experts telling us things and punishing us for not absorbing them.  It's a hard habit to break.

------

The fact is: for every expert opinion, you can find an equal and opposite expert opinion. 

In that case, if you can't just assume someone's right just because he's an expert, can you maybe figure out who's right by *counting* experts?  

Maybe, but not necessarily. As Bill James wrote (9/8/14),


"An endless list of experts testifying to falsehood is no more impressive than one."

It used to be that an "endless list" of experts believed that W-L record was the best indication of a pitcher's performance. It used to be that almost all experts believed homosexuality was a disease. It used to be that almost no experts believed that gastritis was caused by bacteria -- until a dissenting researcher proved it by drinking a beaker of the offending strain. 

Each of those examples (they're mine, not Bill's) illustrates a different way experts can be wrong. 

In the first case, pitcher wins, the expert conventional wisdom never had any scientific basis -- it just evolved, somehow, and the "experts" resisted efforts to test it. 

In the second case, homosexuality, I suspect a big part of it was the experts interpreting the evidence to conform to their pre-existing bias, knowing that it would hurt their reputations to challenge it. 

In the third case ... that's just the scientific method working as promised. The existing hypothesis about gastritis was refuted by new evidence, so the experts changed their minds. 

Bill has a fourth case, the case of psychiatric "expert witnesses" who just made stuff up, and it was accepted because of their credentials. From "Hey Bill," 11/10/2011 and 11/11/2011:


"Whenever and wherever someone is convicted of a crime he did not commit, there's an expert witness in the middle of it, testifying to something that he doesn't actually know a damned thing about.  In the 1970s expert witnesses would testify to the insanity of anybody who could afford to pay them to do so."

"Expert witnesses are PRAISED by professional expert witnesses for the cleverness with which they discuss psychological concepts that simply don't exist."

In none of those cases would you have got the right answer by counting experts. (Well, maybe in the third case, if you counted after the evidence came out.)  

Actually, I'm cheating here. I haven't shown that the majority isn't USUALLY right. I've only shown that the majority isn't ALWAYS right. 

It's quite possible that those four cases were rare exceptions: that, most of the time, when the majority of experts agree, they're right. Actually, I think that's true, that the majority is usually right -- but I'm only willing to grant that for the "established knowledge" cases, the "distance from the earth to the sun" issues. 

For issues that are legitimately in dispute, does a majority matter?  And does the size of the majority matter?  Does an 80/20 split among experts really mean significantly more reliability than a 70/30 split?  

Maybe. But if you go by that, it's not *knowing*, right?  It's just handicapping. 

Suppose 70 percent of doctors believe X. And suppose that, looking back at all the times 70 percent of doctors agreed on something, 9 out of 10 of those consensus beliefs turned out to be true. In that case, you can't say, "you must trust the majority of experts."  You have to say, at best, "there's a 9 out of 10 chance that X is true."
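Just to make the handicapping arithmetic concrete, here's a minimal sketch in Python. It's not part of the original argument -- the function name and the track record are hypothetical, chosen to match the 70-percent / 9-in-10 numbers above:

    # Treat expert consensus as a base rate: how often has a consensus
    # of this strength turned out to be right in the past?
    def consensus_base_rate(track_record):
        """Fraction of past consensus beliefs that turned out to be true."""
        return sum(track_record) / len(track_record)

    # Ten hypothetical past cases where ~70 percent of doctors agreed:
    # nine beliefs held up (True), one didn't (False).
    history = [True] * 9 + [False]

    p_x = consensus_base_rate(history)
    print(f"Chance that X is true, knowing only the consensus: {p_x:.0%}")
    # Prints 90% -- a handicapping number, not knowledge.

That 90 percent is all the consensus itself buys you. It tells you nothing about whether this particular X happens to be the exception.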

But maybe I can say more, if I actually examine the arguments and evidence.

I can say, "well, I've examined the data, and I've looked at the studies, and I have to conclude that this is the 1 time out of 10 that the majority is dead wrong, and here is the evidence that shows why."  

And you have no reply to that, because you're just quoting odds.

And that's why evidence trumps experts. 

Here's Bill James on climate scientists, 9/9/2014 and 9/10/2014:


"[You should not believe climate scientists] because they are experts, no. You should believe them if they produce information or arguments that you find persuasive. But to believe them BECAUSE THEY ARE EXPERTS -- absolutely not.

"It isn't "consensus" that settles scientific disputes; it is clear and convincing evidence. An issue is settled in science when evidence is brought forward which is so clear and compelling that everyone who looks at the evidence comes to the same conclusion. ... The issue is NOT whether scientists agree; it is whether the evidence is compelling."

If you want to argue that something is true, you have two choices. You can argue from the evidence. Or, you can argue from the secondhand evidence of what the experts believe. 

But: the firsthand evidence ALWAYS trumps the secondhand evidence. Always. That's the basis of the entire scientific method: that new evidence can drive out an old theory, no matter how many experts and Popes believe in the old theory, and no matter how strongly they believe it.

You're talking to Bob, a "denier" who doesn't believe in climate change. You say to Bob, "how can you believe what you believe, when the scientists who study this stuff totally disagree with you?"

If Bob replies, "I have this one expert who says they're wrong" ... well, in that case, you have the stronger argument: you have, maybe, twenty opposing experts to his one. By Bob's own logic -- "trust experts" -- the probabilities must be on your side. You haven't proven climate change is real, but you've convincingly destroyed Bob's argument. 

However: if Bob replies, "I think your twenty experts are wrong, and here's my logic and evidence" -- well, in that case, you have to stop arguing. He's looking at firsthand evidence, and you're not. Your experts might still be right, because maybe he's got bad data, or he's misinterpreting his evidence, or his worthless logic comes out of the pages of the Miss America Pageant. Still, your argument has been rendered worthless because he's talking evidence, which you're not willing or able to look at directly.

As I wrote in 2010,


"Disbelieving solely because of experts is NOT the result of a fallacy. The fallacy only happens when you try to use the experts as evidence. Experts are a substitute for evidence. 

"You get your choice: experts or evidence. If you choose evidence, you can't cite the experts. If you choose experts, you can't claim to be impartially evaluating the evidence, at least that part of the evidence on which you're deferring to the experts. 

"The experts are your agents -- if you look to them, it's because you are trusting them to evaluate the evidence in your stead. You're saying, "you know, your UFO arguments are extraordinary and weird. They might be absolutely correct, because you might have extraordinary evidence that refutes everyone else. But I don't have the time or inclination to bother weighing the evidence. So I'm going to just defer to the scientists who *have* looked at the evidence and decided you're wrong. Work on convincing them, and maybe I'll follow."  

In other words: it's perfectly legitimate to believe in climate change because the scientific consensus is so strong. It is also legitimate to argue with people who haven't looked at the evidence and have no firsthand arguments. But it is NOT legitimate to argue with people who ARE arguing from the evidence, when you aren't. 

That they're arguing firsthand, and you're not, doesn't necessarily mean you're wrong. It just means that you have no argument or evidence to bring to the table. And if you have no evidence in a scientific debate, you're not doing science, so you need to just ... well, at that point, you really need to just shut up.

The climate change debate is interesting that way, because most of the activist non-scientists who believe it's real haven't actually looked at the science enough to debate it. A large number have *no* firsthand arguments at all, except the number of scientists who believe it. 

As a result, it's kind of fun to watch their frustration. Someone comes up with a real argument about why the data doesn't show what the scientists think it does, and ... the activists can't really respond. Like me, most have no real understanding of the evidence whatsoever. They could say, like I do to the UFO people, "prove it to the scientists and then I'll listen," but they don't. (I suspect they think that would sound too much like taking the deniers seriously.)

So, they've taken to ridiculing and name-calling and attacking the deniers' motivations. 

To a certain extent, I can't blame them. I'm in the same situation when I read about Holocaust deniers. I mean the serious ones, the "expert" deniers, the ones who post blueprints of the death camps and prepare engineering and logistics arguments about how it wasn't possible to kill that many people in that short a time. And what can I do?  I let other expert historians argue their evidence (which, fortunately, they do quite vigorously), and I gnash my teeth and maybe rant to my friends.

That's just the way it has to be. You want to argue, you have to argue the evidence. You don't bring a knife to a gunfight, and you don't bring an opinion poll to a scientific debate.
