(Another half-baked attempt to see if I can understand something by writing it out. Riffing on a recurring theme in Mike Flynn’s blog in light of some current misunderstandings of what data can tell you. You’d be better off reading him, frankly.)
Mike Flynn has pointed out on several occasions (for the purposes of this post, here is a good one) that a ‘fact’ is, in its Latin root, a ‘thing made’. While things may BE simply, to be UNDERSTOOD they need to be, in a sense, made. The facts, in this sense, don’t speak for themselves – it would be more truthful to say that the facts confess under the duress of a theory.
What this means in practice is that one must understand the theory and practice under which the facts are made – or, in a word evocative of a harvest at the end of a season of work, ‘gathered’* – in order to understand what the facts *are*. Then, we next must understand the explanation (or theory or chain of causality) under which the facts ‘speak’. We’ll confine ourselves to the observational data and fact-making here.

To take an egregious example from an alarmingly large pool of such examples: infant mortality. It might seem that the tragic death of a baby is a pretty eloquent fact in itself. But we are not here talking directly about infant death – we are talking about the subtly different concept of mortality, which is an impersonal rate of death over some time and (usually) place.
Often, the infant mortality rates of different countries are compared. Seems fair, but it relies on two assumptions that are far from certain:
– that infant death means the same thing always and everywhere;
– that the methods of gathering the data are the same, or at least give effectively the same results.
Neither of these assumptions holds in the real world. In America, an anencephalic baby who dies minutes after birth is counted in the infant mortality statistics; in most other places, that child would be considered a stillbirth. In America, traditionally, heroic efforts are made to save even very premature babies, and a baby who dies despite those efforts is counted in the infant mortality numbers; most places in the world do not make such heroic efforts, and would count the child as stillborn. American hospitals report each infant death separately and immediately; in much of the rest of the world, infant death counts are obtained not from hospitals, but from household survey data.
And so on. Note that here I’m making no judgement about which method of counting is best, merely pointing out that these differences make a simple comparison of one country’s reported infant mortality rate with another’s meaningless.
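To make the point concrete, here is a toy sketch in Python. Every number and cutoff in it is invented purely for illustration; the point is only that the very same set of underlying events, run through two different counting rules, yields two different “infant mortality rates.”

```python
# Toy illustration (all numbers invented): the same underlying events,
# counted under two reporting conventions, give two different rates.

# Each record: minutes the baby lived after delivery. None of this is real data.
deaths = [
    {"minutes_lived": 5},     # e.g. a severe birth defect, death minutes after birth
    {"minutes_lived": 90},    # extreme preemie, heroic efforts failed
    {"minutes_lived": 4000},  # roughly three days
]
live_births = 1000  # total live births in this toy population

def rate_inclusive(deaths, live_births):
    """Count every death after any sign of life as an infant death (US-style)."""
    return len(deaths) / live_births * 1000

def rate_exclusive(deaths, live_births, cutoff_minutes=60):
    """Count very short-lived babies as stillbirths, not infant deaths
    (a hypothetical stand-in for stricter reporting conventions)."""
    counted = [d for d in deaths if d["minutes_lived"] >= cutoff_minutes]
    return len(counted) / live_births * 1000

print(rate_inclusive(deaths, live_births))  # 3.0 per 1,000 live births
print(rate_exclusive(deaths, live_births))  # 2.0 per 1,000 live births
```

Same babies, same tragedies, different “facts” — a 50% difference in the reported rate produced entirely by the counting rule.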
The danger is when some number is taken as a data point – say, infant mortality rate in India in 1998 – and then thrown in the hopper with other numbers – say, the infant mortality rate in England in 2003 – and treated as if they are both the same sort of thing and, in fact, more real than the deaths and method of counting that underlie them. Then people try to say things about these numbers that the facts that underlie them cannot be made to say. Bad science. **
Because of these considerations alone, ANY comparison of statistics gathered from different places and times is immediately suspect. Anyone doing such a thing would need to begin his paper by addressing:
1. the definitions of what is being counted;
2. the methods under which whatever is being counted is counted;
3. how the research that follows allows for these differences.
Any research that doesn’t do that right out of the chute is Cargo Cult Science, full stop. Even if these steps are taken, it still remains for the reader to decide whether, in fact, the results can be relied upon to tell us anything. Remember, the best answer in science is most often: we don’t know.
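The discipline the checklist above demands can be sketched in code: imagine a statistic that carries its operational definition and gathering method along with it, and a comparison that simply refuses to run unless they match. The class and field names here are my own invention, not anyone’s actual practice.

```python
# A sketch, not a real system: a statistic that remembers how it was 'made',
# and a comparison that refuses incommensurable facts.

from dataclasses import dataclass

@dataclass(frozen=True)
class Statistic:
    value: float
    definition: str  # what exactly was counted
    method: str      # how it was gathered

def compare(a: Statistic, b: Statistic) -> float:
    """Return the difference only if the two facts were made the same way."""
    if a.definition != b.definition or a.method != b.method:
        raise ValueError("incommensurable: definitions or methods differ")
    return a.value - b.value

# Invented example values:
imr_a = Statistic(6.8, "death within 1 year of any sign of life", "hospital reports")
imr_b = Statistic(5.1, "death within 1 year, excluding <500g births", "household survey")

try:
    compare(imr_a, imr_b)
except ValueError as e:
    print(e)  # incommensurable: definitions or methods differ
```

Most published comparisons, in effect, skip that `if` check.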
And it gets worse: sometimes the researchers themselves will change the definitions in mid-stream: ignoring what the data gatherers were gathering, they apply another definition, thereby completely obscuring the true relationship between the underlying observations, facts, and theory.
Finally, there’s always a judgement call or 6 in any scientific observation. One example among many: In the physical sciences, great care needs to be taken not to intentionally or unintentionally exclude observations. The classic example is from Millikan’s famous efforts to measure the charge of the electron: when he got ‘bad’ results from any particular observation, he seems to have assumed he must have just screwed up that run, and threw the results out. Perfectly understandable human behavior. Nope – too easy for confirmation bias to sneak in and start defining what a ‘bad’ observation is: any that doesn’t support my theory.
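A toy simulation makes the trap visible. Every number below is invented: a “true” value of 10, an experimenter who expects 9, and a rule that throws out any run more than one unit away from the expectation. Discarding the “bad” runs drags the estimate toward what the experimenter expected all along.

```python
# Toy simulation (invented numbers) of the Millikan-style trap:
# discarding runs that look 'bad' relative to your expectation
# biases the estimate toward that expectation.

import random

random.seed(0)
true_value = 10.0
expected = 9.0  # the experimenter's prior belief

# 1,000 honest measurement runs scattered around the true value
runs = [random.gauss(true_value, 1.0) for _ in range(1000)]

honest = sum(runs) / len(runs)

# 'Bad run' = more than 1 unit from what I expected; throw it out.
kept = [r for r in runs if abs(r - expected) <= 1.0]
filtered = sum(kept) / len(kept)

print(round(honest, 2))    # close to the true value, 10.0
print(round(filtered, 2))  # dragged toward the expected 9.0
```

No fraud anywhere in that loop — just a perfectly understandable judgement about which runs were “bad.”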
So, a baby’s tragic death, any particular observation of an oil drop in an electric field, any scientific observation or classification becomes a fact only by virtue of being defined by the method and theory under which the fact is considered. If one is to compare facts, those facts must have been made under the same theory and method, or, failing that, the commensurability of the methods and theories must be established.***
* It’s lovely to think of scientific facts as the fruits of a harvest. The ground is prepared so that the natural thing – or, more precisely, the nature of the natural thing – can reveal itself. Thus, the wheat and vine yield their grain and grapes as the actualization of their natures. Man does not create those natures, but rather creates the conditions under which they can reveal themselves. Similarly, we often say that researchers are teasing out the data – we build a supercollider to provide the right conditions for the subatomic particles to reveal their natures; we put telescopes in orbit so that the nature of the cosmos can reveal itself to us. But the farmer would not farm, nor the scientist experiment, if they did not already have a good working theory of the general nature of plants and of natural things, respectively.
** Take these same definitional and reporting problems, spread them across all health issues – and now imagine how much valid information is contained in an across-the-board comparison of healthcare across countries. For example, healthcare in England is often held up as the ideal toward which American healthcare should strive. And, anecdotally, my sister gave birth in England twice to healthy babies, and has nothing but praise for the care she received. But reports are different for 70-year-old smokers needing care. Even in America, people who can afford it go to nice state-of-the-art research hospitals, not County General. Why? Is healthcare really fungible, no matter what kind of care it is and regardless of where it is delivered?
Those heroic efforts to save premature babies – if it’s your kid, it makes a huge difference to you that he be delivered in America and not somewhere else. Yet, because of this practice, Americans spend much more money on care for preemies than other countries do. The care is not fungible.
*** This is one of the reasons real science is such a great game, and why real scientists deserve the honor we give them. And why bad science, meaning science that doesn’t recognize and uphold these conditions, needs to be called out and held up for ridicule. The price of good science is eternal vigilance. In the hard sciences, my impression is that the policing is internal and vigorous, except in those cases where the practitioners step outside their expertise. It’s in those cases, and in the firm yet not *hard* hard sciences – medicine, biology and all their related fields – that we educated peons need to step up. The soft ‘sciences’ seem beyond hope, but at least we can point out the Hegelian nature of their methods and claims.
In statistics, we speak of “operational definitions.” A measurement is defined by the method used to produce it. I have personally seen cases where different, well-calibrated instruments have produced significantly different measurements on the same part. E.g., two ASTM-approved methods for measuring coefficient of friction of a package — the inclined plane and the dynamic pull — have been observed to give different results; two dial indicators with different fixtures to hold the part gave different results on the same dimension; etc.
http://tofspot.blogspot.com/2011/12/fun-with-statistics.html
This is why they call it hard science, right? It’s so darn hard!
I’ve really enjoyed your posts on these topics, although my feeble and rusty math chops can’t keep up very far.
I just reread that post you linked to – there were a couple of insightful comments by some guy named Joseph M. 😉 evidently, I liked it then, too.