The “Statisticians: ‘Global Cooling’ a Myth” story

By William M. Briggs, professional statistician

Your statistical model!

“J’accuse! A statistician may prove anything with his nefarious methods. He may even say a negative number is positive! You cannot trust anything he says.”

Sigh. Unfortunately, this oft-hurled charge is all too true. I and my fellow statisticians must bear its sad burden, knowing it is caused by our more zealous brethren (and sisthren). But, you know, it really isn’t their fault, for they are victims of loving not wisely but too well their own creations.

First, a fact. It is true that, based on the observed satellite data, average global temperatures since about 1998 have not continued the rough year-by-year increase that had been noticed in the decade or so before that date. The temperatures since about 1998 have increased in some years, but more often have they decreased. For example, last year was cooler than the year before last. These statements, barring unknown errors in the measurement of that data, are taken as true by everybody, even statisticians.

The AP gave this data—concealing its source—to “several independent statisticians” who said they “found no true temperature declines over time.”

How can this be? Why would a statistician say that the observed cooling is not “scientifically legitimate”; and why would another state that noticing the cooling “is a case of ‘people coming at the data with preconceived notions’”?

Are these statisticians, since they are concluding the opposite of what has been observed, insane? This is impossible: statisticians are highly lucid individuals, its male members exceedingly handsome and charming. Perhaps they are rabid environmentalists who care nothing for truth? No, because none of them knew the source of the data they were analyzing. What can account for this preposterous situation!

Love. The keen pleasures of their own handiwork. That is, the adoration of lovingly crafted models.

Let me teach you to be a classical statistician. Go to your favorite climate site and download a time series picture of the satellite-derived temperature (so that we have no complications from mixing of different data sources); any will do. Here’s one from our pal Anthony Watts.

Now fetch a ruler—a straight edge—preferably one with which you have an emotional attachment. Perhaps the one your daughter used in kindergarten. The only proviso is that you must love the ruler.

Place the ruler on the temperature plot and orient it along the data so that it most pleases your eye. Grab a pencil and draw a line along its edge. Then, if you can, erase all the original temperature points so that all you are left with is the line you drew.

If a reporter calls and asks if the temperature was warmer or colder last year, do not use the original data, which of course you cannot since you erased it, but use instead your line. According to that very objective line the temperature has obviously increased. Insist on the scientificity of that line—say that according to its sophisticated inner-methodology, the pronouncement must be that the temperature has gone up! Even though, in fact, it has gone down.

Don’t laugh yet, dear ones. That analogy is too close to the truth. The only twist is that statisticians don’t use a ruler to draw their lines—some use a hockey stick. Just kidding! (Now you can laugh.) Instead, they use the mathematical equivalent of rulers and other flexible lines.
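For the curious, here is a minimal sketch of the mathematical ruler, an ordinary least squares line. The numbers are invented for illustration and are not the satellite record; the names `temps` and `fit_line` are my own. The point is only that a fitted line can slope upward while the most recent observed change is downward.

```python
# Invented anomalies for illustration only; these are NOT real temperature data.
temps = [0.10, 0.18, 0.25, 0.33, 0.45, 0.40, 0.42, 0.38, 0.41, 0.36]

def fit_line(ys):
    """Ordinary least squares line through the points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

slope, intercept = fit_line(temps)

# What actually happened between the last two years of the (invented) data.
observed_change = temps[-1] - temps[-2]

# What the ruler says happens every year: its slope.
print(f"observed change last year: {observed_change:+.3f}")
print(f"change according to line:  {slope:+.3f}")
```

On these made-up numbers the observed change is negative and the slope of the line is positive. Ask the line and it went up; ask the data and it went down.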

Your ruler is a model

Statisticians are taught—their entire training stresses—that data isn’t data until it is modeled. Those temperatures don’t attain significance until a model can be laid over the top of them. Further, it is our credo to, in the end, ignore the data and talk solely of the model and its properties. We love models!

All this would be OK, except for one fact that is always forgotten. For any set of data, there are always an infinite number of possible models. Which is the correct one? Which indeed!

Many of these models will say the temperature has gone down, just as others will say that it has gone up. The AP statisticians used the models most familiar to them, like “moving averages of about 10 years” (the moving average is the most used method of replacing actual data with a model in time series) or “trend” models, which are distant cousins to rulers.
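A small sketch of how a 10-year moving average behaves, again on invented numbers rather than the data the AP circulated. I assume a trailing (not centered) window here; the quoted statisticians did not say which kind they used. The helper name `moving_average` is mine.

```python
# Invented series: a steady rise, a peak, then five years of decline.
# These are NOT real temperature data.
temps = [0.00, 0.03, 0.06, 0.09, 0.12, 0.15, 0.18, 0.21, 0.24, 0.27,
         0.30, 0.33, 0.36, 0.39, 0.45, 0.43, 0.42, 0.40, 0.39, 0.37]

def moving_average(ys, window):
    """Trailing moving average: each value is the mean of the last `window` points."""
    return [sum(ys[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(ys))]

smoothed = moving_average(temps, window=10)

raw_change = temps[-1] - temps[-2]            # last observed year-to-year change
smoothed_change = smoothed[-1] - smoothed[-2]  # same change, according to the model

print(f"raw change in final year:      {raw_change:+.3f}")
print(f"smoothed change in final year: {smoothed_change:+.3f}")
```

In this invented series the raw values fall in each of the last five years, yet the smoothed series keeps rising, because each new window drops a low early value and picks up a higher recent one. Nothing here says what the real data do; it only shows what the smoothing can do.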

Since we are free to choose from an infinite bag, all of our models are suspect and should not be trusted until they have proven their worth by skillfully predicting data that has not yet been seen. None of the models in the AP study have done so. Even stronger, since they said temperatures were higher when they were in fact lower, they must predict higher temperatures in the coming years, a forecast which few are making.
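Here is one hedged way to check whether a model has proven its worth: fit it on early data, hold back the later data, and compare its predictions with those of a naive baseline. The choice of "persistence" (next year equals the last observed year) as the baseline is my assumption, as are the invented numbers and the names `fit_line` and `mean_abs_error`. This is not what the AP statisticians did.

```python
# Invented series for illustration only; these are NOT real temperature data.
temps = [0.00, 0.03, 0.06, 0.09, 0.12, 0.15, 0.18, 0.21, 0.24, 0.27,
         0.30, 0.33, 0.36, 0.39, 0.45, 0.43, 0.42, 0.40, 0.39, 0.37]

def fit_line(ys):
    """Ordinary least squares line through the points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def mean_abs_error(predicted, actual):
    """Average absolute difference between predictions and observations."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

n_train = 15
train, unseen = temps[:n_train], temps[n_train:]

# Model 1: extend the fitted trend line into the years it has not seen.
slope, intercept = fit_line(train)
line_forecast = [intercept + slope * x for x in range(n_train, len(temps))]

# Model 2 (assumed baseline): persistence, i.e. repeat the last seen value.
persistence_forecast = [train[-1]] * len(unseen)

mae_line = mean_abs_error(line_forecast, unseen)
mae_persistence = mean_abs_error(persistence_forecast, unseen)

print(f"trend line error on unseen data:  {mae_line:.3f}")
print(f"persistence error on unseen data: {mae_persistence:.3f}")
```

On these invented numbers the trend line does worse on the unseen years than simply repeating the last value, so it has shown no skill. A model that cannot beat so simple a rival on data it has not seen has not earned our trust, however lovely its line.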

We are too comfortable with this old way of doing things. We really can prove anything we want with careful choice of models.

132 Comments

mercurior

October 28, 2009 8:35 am

Lies, damn lies and statistics.

Freddy

October 28, 2009 8:37 am

Applause.

J. Bob

October 28, 2009 8:41 am

About 40+ years ago, a book came out “How to Lie with Statistics”. It’s a great book, and you can still get it from Amazon.

wobble

October 28, 2009 8:42 am

Has anyone seen the data provided by the AP?

Otter

October 28, 2009 8:42 am

‘He is the very model of a modern climate statistician.’

theduke

October 28, 2009 8:45 am

Most excellent. In other words,
“The devil can cite scripture for his purpose.”

Gary

October 28, 2009 8:47 am

“Th AP gave this data—concealing its source…”
Seth Borenstein told me he got the data that he gave to the statisticians from Dr. Christy. Wouldn’t Dr. Christy be able to tell us exactly what it was?

gary gulrud

October 28, 2009 8:50 am

Nicely done.

tmtisfree

October 28, 2009 8:50 am

So true.

Henry chance

October 28, 2009 8:54 am

Figgers lie and liars figger.

Leon Brozyna

October 28, 2009 8:58 am

The moving average of x-years is a nice tool, telling us about what has happened, but not about what will happen. To understand that, you need to understand all the various factors that impact the climate and how they interact and affect each other. Lots a luck in that department! We’ve still a ways to go before we can get a handle on that messy mass of factors. Looking again at the graph of temperature data for a number of years, the moving average becomes useless toward the end of the data series; it is still a work in progress.

Mark Hugoson

October 28, 2009 9:02 am

Again, “Standard Deviation”? Is it legitimate to call a least squares linear fit valid, if the SD is high enough? Can we only compare GROUPS of data (say, 1950-1970, 1970-1990, 1990 to 2010), take the SD of each data set, and THEN see if there is a STATISTICALLY SIGNIFICANT DIFFERENCE?
Where the H-E-double toothpicks is “basic statistical practice” in this?
Too bad “statisticians” are not licensed like Medical Doctors. We could then remove some licenses now!

jaypan

October 28, 2009 9:02 am

Thank you for giving such a pleasant intro into statistics.
Have to explain my wife now the loud laughing here.

Mark Hugoson

October 28, 2009 9:03 am

Moderator: Note I became too zealous and added my own carriage returns instead of allowing automatic work (in previous post). Sorry.

Juraj V.

October 28, 2009 9:03 am

I wanna see those graphs and linear trends in.
http://www.woodfortrees.org/plot/gistemp/from:1998/plot/gistemp/from:2002/trend/plot/gistemp/from:1998/to:2000/trend/plot/gistemp/from:2008/trend/plot/gistemp/from:2000/to:2002/trend/plot/gistemp/from:2007/to:2008/trend

Frederick Michael

October 28, 2009 9:09 am

“How to Lie with Statistics” by Darrell Huff is a classic, wonderfully illustrated by Irving Geis. I didn’t know it was still available.
However, the tricks it teaches have nothing to do with the current issue of replacing data with models.

Jimmy Haigh

October 28, 2009 9:11 am

I think this short article on models says it all.
Here’s Monty Python’s take on a particular model…

Bruckner8

October 28, 2009 9:13 am

How true. “fitting” a curve to a dataset, however, IS the #1 objective of any data modeler. Even a randomly-generated dataset may have “fittable” (is that a word?) curves along most of its plot. (19th century mathematicians even imagined curves that had a different function at every point….very complex system indeed.)
Anna is always harping on this: Why do we humans insist on linear regression? The answer might be “cuz one of the basic tenets of science is that it be simple.” The problem with that is (and this goes back to the article here), when handed a random set of data (might even look more like a cloud than any line or curve), a simple straight-line fit is possible in ANY DIRECTION.
OK, so maybe we need polynomials (curves)…not straight lines. But how many degrees? Again, any degree will do, as long as it fits your pre-conceived notion. The more degrees, the more ups-n-downs. This is basic calculus. What if there are some points of discontinuity (Ha! missing data!)? What if the FUNCTION ITSELF is actually changing at every point?
I don’t think those possibilities are even discussed by today’s modelers.
Any one of us could play with data-fit using polynomials in MS Excel. Make a chart, and then choose “Add Trendline.” You’ll see all kinds of simple methodologies that assume continuous function data. This is as far as modern modeling goes, I swear! And your “guess” is as good as the Statistician’s, IMO.

Richvs

October 28, 2009 9:21 am

As a professional in process risk analysis, the old saying goes: If you torture data it will tell you anything you want to know! Works every time.

Ken Hall

October 28, 2009 9:27 am

Is temperature rising? Or cooling?
I cannot believe the amount of time wasted on this entirely subjective question. The answer is “it depends.” It fully depends on what two dates are used to measure the time period over which the trend is to be determined and the length of time measured. Are we looking at a 5 year timescale? 10? 50? 100? 800? 12,000? 400,000? 100,000,000?
For ANY graph of global temperature one can pick numerous start points and end points that will show either warming or cooling. Even during the last decade I can show rapid warming, OR rapid cooling, depending on the start and end dates.
Picking a start date during 1998 and an end during 2008, one can show a strong cooling trend, however using the date range from 2001 – 2007 would show a significant warming trend.
The one thing that cannot be found is any time where climate was in stasis. The default nature of climate is one of change. And that has always been the case.

Sean

October 28, 2009 9:27 am

What’s the best kind of ruler to use when you want to draw a roller coaster?

mondo

October 28, 2009 9:30 am

AP?

John Nicklin

October 28, 2009 9:35 am

“It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.” Sherlock Holmes.
In the case at hand it seems that the AP statisticians were able to twist the facts to suit their theories even though they had the data. All of which evokes another Doyle quote “His ignorance was as remarkable as his knowledge.”
Thanks to Dr. Briggs for putting it in perspective, much appreciated.

Jeff in Ctown (Canada)

October 28, 2009 9:35 am

I have a ruler I love that I inherited from my Grandmother. It was made by Nabob Coffee in 1912 (or thereabouts).
Good analysis. I imagine a 10 year moving average would indeed hide the last 7 years’ cooling fairly well.

Alan Cheetham

October 28, 2009 9:39 am

The global temperature was not increasing in the satellite era prior to the 1997/98 El Nino. And it has not been increasing since that El Nino. The El Nino resulted in a net change of about 0.3 degrees.
Larger image here: Global Temp Since 79
A four-year warming / cooling cycle of about +/- 0.3 degrees can be seen in the data over the last 30 years.