Sabermetrics vs. second-hand knowledge
Does the earth revolve around the sun, or does the sun revolve around the earth?
The earth revolves around the sun, of course. I know that, and you know that.
But do we really?
If you know the earth revolves around the sun, you should be able to prove it, or at least show evidence for it. Confronted by a skeptic, what would you argue? I'd be at a loss. Honestly, I can't think of a single observable fact that I could use to make a case.
I say that I "know" the earth orbits the sun, but what I really mean by that is, certain people told me that's how it is, and I believe them.
Not all knowledge is like that. I truly *do* know that the sun rises in the east, because I've seen it every day. If a skeptic claimed otherwise, it would be easy to show evidence: I'd make sure he shared my definition of "east," and then I'd wake him up at 6 am and take him outside.
But that sun/earth thing? I only say I "know" it because I believe that astronomers *truly* know it, from direct evidence.
It occurred to me that almost all of our "knowledge" of scientific theories comes from that kind of hearsay. I couldn't give you evidence that atoms consist, roughly, of electrons orbiting a nucleus. I couldn't prove that every action has an equal and opposite reaction. There's no way I could come close to figuring out why and how e=mc^2, or that something called "insulin" exists and is produced by the pancreas. And I couldn't give you one bit of scientific evidence for why evolution is correct and not creationism.
That doesn't stop us from believing, really, really strongly, that we DO know these things. We go and take a couple of undergraduate courses in, say, geology, and we write down what the professors tell us, and we repeat it on exams, and we solve mathematical problems based on formulas and principles we are told are true. And we get our credits, and we say we're "knowledgeable" in geology.
But it's a different kind of knowledge. It's not knowledge that we have by our own experience or understanding. It's knowledge that we have by our own experience of how to evaluate what we're told -- how and when to believe other people. We extrapolate from our social knowledge. We believe that there are indeed people, "geologists," who have firsthand evidence. We believe that evidence gets disseminated among those geologists, who interact to reliably determine which hypotheses are supported and which ones are not. We believe that, in general, the experts are keeping enough of a watchful eye on what gets put in textbooks and taught at universities, that if Geology 101 was teaching us falsehoods, they'd get exposed in a hurry.
In other words, we believe that the system of scientists and professors and Ph.D.s and provosts and deans and journals and textbook publishers is a reliable separator of truth from falsehood. We believe that, if the earth really were only 6,000 years old, that's what scientists would be telling us.
Most of the time, it doesn't matter that our knowledge is secondhand. We don't need to be able to prove that swallowing arsenic is fatal; we just need to know not to do it. And, we can marvel at Einstein's discovery that matter and energy are the same thing, even if we can't explain why.
But it's still kind of unsatisfying.
That's one of the reasons I like math. With math, you don't have to take anyone's word for anything. You start with a few axioms, and then it's all straight logic. You don't need geology labs and test tubes and chemicals. You don't need drills and excavators. You don't actually have to believe anyone on indirect evidence. You can prove everything for yourself.
The supply of primes is infinite. No matter how large a prime you find, there will always be one larger. That's a fact. If you like, you can look it up on the internet, or ask your math teacher, or find it in a textbook. It's a fact, like the earth revolving around the sun.
If you do it that way, you know it, but you don't really KNOW it. You can't defend it. In a sense, you're believing it on faith.
On the other hand, you can look at a proof. Euclid's proof that there is no largest prime number is considered one of the most elegant in mathematics. The versions I found on the internet use a lot of math notation, so I'll paraphrase.
Suppose you have a really big prime number, X. The question is: is there always a prime bigger than X?
Try this: take all the numbers from 1 to X, and multiply them together: 1 times 2 times 3 .... times X. Now, add 1. Call that really huge number N. That huge N is either prime, or is the product of some number of primes.
But N can't be divisible by X, or by any number from 2 up to X. Each of those numbers divides 1 times 2 times 3 .... times X evenly, so dividing N by any of them always leaves a remainder of 1. Therefore: either N is prime, or, when you factor N into primes, they're all bigger than X.
Either way, there is a prime bigger than X.
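If you'd rather check the construction than take my word for it, here is a small Python sketch of the same argument. The function names (smallest_prime_factor, prime_beyond) are made up for this illustration, and the trial division is only practical for small X, because 1 times 2 times ... times X gets huge very quickly.

```python
from math import factorial


def smallest_prime_factor(n):
    """Return the smallest prime factor of n (for n >= 2), by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime


def prime_beyond(x):
    """Build N = (1 * 2 * ... * x) + 1 and return a prime factor of N.

    Every number from 2 to x leaves a remainder of 1 when divided into N,
    so whatever prime factor we find has to be bigger than x.
    """
    n = factorial(x) + 1
    return smallest_prime_factor(n)


for x in (2, 3, 5, 7, 11):
    p = prime_beyond(x)
    print(f"X = {x}: found the prime {p}, which is bigger than X")
```

For X = 5, N is 121, which is 11 times 11. So N itself isn't prime, but its prime factor, 11, is bigger than 5, just as the argument says it has to be.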
I may not have explained that very well. But, if you get it ... now you know that there is no highest prime. If you read it in a book, you "know" it, but if you understand the proof, you KNOW it, in the sense that you can explain it and prove it to others.
In fact ... if you read it in a textbook, and someone tells you the textbook is wrong, you may have some doubt. But once you see the proof, you will *never* have doubt (except in your own logic). Even if the greatest mathematician in the world tells you there's a largest prime, you still know he's wrong.
In theory, everything in math is like that, provable from axioms. In practice ... not so much. The proofs get complicated pretty quickly. (When Andrew Wiles announced his proof of Fermat's Last Theorem in 1993, it ran to about 200 pages.) Still, there are significant mathematical results that we can all say we know from our own efforts. For years, I wondered why it was that multiplication goes both ways -- why 8 x 7 has to equal 7 x 8. Then it hit me -- if you draw eight rows of seven dots, and turn it sideways, you get seven rows of eight dots.
There are other fields like math that way ... you and I can know things on our own, fairly easily, in economics, and finance, and computer science. Other sciences, like physics and chemistry, take more time and equipment. I can probably prove to myself, with a stopwatch and ruler, that gravitational acceleration on earth is 9.8 m/s/s, but there's no way I could find evidence of what it is on the moon.
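The stopwatch-and-ruler experiment comes down to the standard formula for an object dropped from rest: distance = (1/2) g t^2, which rearranges to g = 2 x distance / t^2. Here is a minimal Python sketch, assuming air resistance is negligible. The function name and the sample timings are hypothetical, not real measurements.

```python
def estimate_g(height_m, drop_times_s):
    """Estimate gravitational acceleration (m/s^2) from a drop experiment.

    height_m:     measured drop height in metres (the ruler)
    drop_times_s: repeated timings of the fall in seconds (the stopwatch)
    """
    mean_t = sum(drop_times_s) / len(drop_times_s)
    # From distance = (1/2) * g * t^2, solved for g.
    return 2.0 * height_m / mean_t ** 2


# Hypothetical timings for a 2.0 m drop -- invented numbers, for illustration only.
print(round(estimate_g(2.0, [0.63, 0.65, 0.64]), 2))
```

Averaging several timings matters here, because the fall is short and a hand-held stopwatch is easily off by a few hundredths of a second.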
But: sabermetrics. What started me on all this is realizing that the stuff we know about sabermetrics is more like infinite primes than like the earth revolving around the sun. Active researchers don't know sabermetrics just because Bill James and Pete Palmer told us. We know because we actually see how to replicate their work, and we see, all the way back to first principles, where everything came from.
I can't defend "e equals mc squared," but I can defend Linear Weights. It's not that hard, and all I need is play-by-play data and a simple argument. Same with Runs Created: I can pull out publicly-available data and show that it's roughly unbiased and reasonably accurate. (I can even go further ... I can take partial derivatives of Runs Created and show that the values of the individual events are roughly in line with Linear Weights.)
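Here is roughly what the partial-derivative check looks like, as a Python sketch. It assumes the basic version of Runs Created, (H + BB) x TB / (AB + BB), and it approximates each derivative by adding one event to a team's season line and seeing how much Runs Created changes. The team totals in BASE are invented for illustration, not real data, and the names (runs_created, marginal_value, BASE, EVENTS) are made up for this example.

```python
def runs_created(ab, h, doubles, triples, hr, bb):
    """Basic Runs Created: (H + BB) * TB / (AB + BB)."""
    singles = h - doubles - triples - hr
    tb = singles + 2 * doubles + 3 * triples + 4 * hr
    return (h + bb) * tb / (ab + bb)


# Hypothetical team season totals -- invented numbers, not real data.
BASE = dict(ab=5500, h=1450, doubles=280, triples=30, hr=160, bb=520)

# What one additional event of each type adds to the stat line.
EVENTS = {
    "single":   dict(ab=1, h=1),
    "double":   dict(ab=1, h=1, doubles=1),
    "triple":   dict(ab=1, h=1, triples=1),
    "home run": dict(ab=1, h=1, hr=1),
    "walk":     dict(bb=1),
    "out":      dict(ab=1),
}


def marginal_value(event):
    """Change in Runs Created from adding one `event` to the BASE line."""
    line = dict(BASE)
    for stat, amount in EVENTS[event].items():
        line[stat] += amount
    return runs_created(**line) - runs_created(**BASE)


for event in EVENTS:
    print(f"{event:>8}: {marginal_value(event):+.3f} runs")
```

With these made-up totals, a triple comes out at roughly twice the value of a single, and an out comes out slightly negative. The exact figures shift with the team line you start from, which is one reason the match to Linear Weights is only rough.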
DIPS? No problem, I know what the evidence is, there, and I can generate it myself. On-base percentage more important than batting average? Geez, you don't even need data for that, but you can still do it formally if you need to without too much difficulty.
For my own part -- and, again, many of you active analysts reading this would be able to say the same thing -- I don't think I could come up with a single major result in sabermetrics that I couldn't prove, from scratch, if I had to. Even the ones from advanced data, or proprietary data, I'm confident I could reproduce if you gave me the database.
For all the established principles that are based on, say, Retrosheet-level data ... honestly, I can't think of a single thing in sabermetrics that I "know" where I would need to rely on other people to tell me it's true. That might change: if something significant comes out of some new technique -- neural nets, "soft" sabermetrics, biomechanics -- I might have to start "knowing" things secondhand. But for now, I can't think of anything.
If you come to me and say, "I have geological proof that the earth is only 6,000 years old," I'm just going to shrug and say, "whatever." But if you come to me and say, "I have proof that a single is worth only 1/3 of a triple" ... well, in that case, I can meet you head on and prove that you're wrong.
I don't really know that creationism isn't right -- I only know what others have told me. But I *do* know firsthand what a triple is worth, just as I *do* know firsthand that there is no highest prime.
And that, I think, is why I love sabermetrics so much -- it's the only chance I've ever had to actually be a scientist, to truly know things directly, from evidence rather than authority.
I have a degree in statistics, but if nuclear war wiped out all the statistics books, how much of that science could I restore from my own mind? Maybe, a first-year probability course, at best. I could describe the Central Limit Theorem in general terms, but I have no idea how to prove it ... one of the most fundamental results in statistics, one they teach you in your first statistics class, and I still only know it from hearsay.
But if nuclear war wipes out all the sabermetrics books ... as long as someone finds me a copy of the Retrosheet database, I can probably reestablish everything. Nowhere near as eloquently as Bill James and Palmer/Thorn, and I probably wouldn't think of certain methods that Tango/MGL/Dolphin did, but ... yeah, I'm pretty sure I could restore almost all of it.
To me, that's a big deal. It's the difference between knowing something, and only knowing that other people know it. Not to put down the benefits of getting knowledge from others -- after all, that's where most of our useful education comes from. It's just that, for me, knowing stuff on my own ... it's much more fulfilling, a completely different state of mind. As good as it may be to get the Ten Commandments from Moses, it's even better to get them directly from God.