To Tell the Truth: Will the Real Global Average Temperature Trend Please Rise? Part 2


A guest post by Basil Copeland

Before proceeding, I want to thank Anthony for allowing me to guest blog at Watts Up With That?  Anthony is doing some remarkable work in trying to ensure the integrity and quality of the surface record, and it is an honor to be able to use his blog for my modest contribution to the debate over climate change and global warming.

In Part I we looked at seasonal differences in the four global average temperature metrics Anthony has recently been blogging about, and demonstrated that since around the end of 2001 there has been no “net” global warming: positive seasonal differences have been offset by negative seasonal differences.  More recently, negative seasonal differences have dominated, suggesting the possibility of a recent negative trend in global average temperatures.

Reader comments to Part I were interesting.  It was obvious from many that they were struggling to understand what I was getting at, and that this was a different perspective on the data than usual.  Others quickly raised the specter of cherry picking the data, or suggested a hidden agenda of some kind.  That some would jump to such conclusions without giving me the courtesy of waiting until I was finished is a sad commentary on what’s happening to the field of climate science.  Science is supposed to be all about the freedom to engage in critical inquiry without being impugned with false motives, the freedom to hold scientific consensus up to the critical scrutiny of falsifiable hypotheses.  When voices immediately seek to shut off avenues of inquiry, or impugn motives for questioning scientific consensus, I don’t know what that is, but I know that it is not science.

Resuming where we left off with Part I, if there is evidence of a recent negative trend in global average temperature, is it “statistically significant,” and if so, in what sense?  That’s the question I left hanging at the end of Part I, and it is the question we will address in Part II.  There are various ways we might go about investigating the matter.  I chose one that comes from my particular field of experience and expertise (economics, though it is perhaps worth noting that my training was in environmental and resource economics): the Chow test.  The Chow test is used to test for “structural breaks” in time series data.  Just as correlation does not prove causation, a “structural break” doesn’t necessarily prove anything.  It merely suggests that things were different in some way before the “break” than afterward.  It doesn’t answer the question of “why” things changed.  Or, given the venue, we might say that it doesn’t answer the question Watts Up With That?  But it does answer the question of whether the change is “statistically significant.”  And if it is, then perhaps inquiring minds might want to know about it, and consider whether it makes any difference to the matter under discussion.

The Chow test involves fitting a regression to the sub parts, and comparing the sum of the mean square error (MSE) of the sub parts to the mean square error of a regression fitted to the entire time period.  If the sub parts come from sufficiently different regimes or circumstances, splitting the time series into two parts will reduce the total MSE, compared to the MSE of a single regression fitted to the entire time period.  The Chow test follows the F distribution, and is a test of the null hypothesis of no change, or difference.
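For readers who want to see the mechanics, here is a minimal sketch of the test on an invented series, using only plain Python. The data, break point, and slopes below are placeholders for illustration; they are not the actual temperature metrics discussed in this post.

```python
# Minimal Chow test sketch on synthetic data (illustration only).

def ssr(xs, ys):
    """Sum of squared residuals from an OLS straight-line fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

def chow_f(xs, ys, split, k=2):
    """Chow F statistic for a break at index `split`.
    k = parameters per regression (slope + intercept)."""
    pooled = ssr(xs, ys)
    split_ssr = ssr(xs[:split], ys[:split]) + ssr(xs[split:], ys[split:])
    n = len(xs)
    return ((pooled - split_ssr) / k) / (split_ssr / (n - 2 * k))

# Synthetic monthly series: rising trend, then a downturn at observation 80,
# with a small deterministic wiggle standing in for noise.
xs = list(range(120))
ys = [0.01 * x if x < 80 else 0.8 - 0.02 * (x - 80) for x in xs]
ys = [y + (0.03 if i % 2 == 0 else -0.03) for i, y in enumerate(ys)]

f_stat = chow_f(xs, ys, split=80)
```

The statistic is referred to the F distribution with k and n - 2k degrees of freedom; with a genuine break built into this toy series, f_stat comfortably exceeds the 5% critical value (roughly 3.07 for F(2, 116)), so the null hypothesis of “no break” is rejected.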

[Table 1]

Table 1 summarizes the Chow test for each of the four metrics under consideration, for a structural break at 2002:01.  The Chow test was statistically significant in all four cases, though in varying degrees.  In Table 1 I describe the level of statistical significance using the same likelihood terminology used by the IPCC.  Evidence for a structural break is “very likely” from the UAH satellite dataset, “extremely likely” from the GISS and RSS datasets, and “virtually certain” from the HadCRUT land-sea dataset.

I cannot say that, though, without remarking on how silly it is.  I do not know of any other field where statistical significance is interpreted this way.  In my field, anything less than a 95% level of confidence is considered weak support for a tested hypothesis.  Instead of “very likely,” for support at the 90% level of confidence I’d say “probably.”  Instead of “extremely likely” at the 95% level of confidence, I’d say “likely.”  And instead of “virtually certain” at the 99% level of confidence, I’d say “very likely.”  In other words, to my way of thinking, the IPCC likelihood terminology is shifted about two levels in the direction of overstating the likelihood of something.  But even with my more cautious approach to characterizing the results, the evidence is somewhere between “probably” and “very likely” that a structural break occurs in the data after 2002:01.

However we choose to put it, there is statistical support for modeling the trends with a break at 2002:01.  This is done, statistically, with dummy slope and constant variables, and the results are shown graphically in Figures 1, 2, 3, and 4.  In each figure, there are three “trends” noted.  The first, to the left and above the data, is the trend for 1979-2001.  The third, to the right and below the data, is the trend for 2002 through 2008:01.  In the middle, labeled “dT,” is a trend for the entire period derived from the delta, or difference, in the end points of the trend lines, with a number in parentheses representing the decadal rate of change from fitting a single trend line to the data.  This overall trend, based on the difference in end points of the trend lines, is a “best estimate” of the overall trend using all 29 years of data (thus refuting any notion of cherry picking).
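As a rough illustration of this kind of dummy-variable model (assuming numpy is available; the break index, slopes, and wiggle below are invented placeholders, not the actual series or the exact specification used in the figures):

```python
import numpy as np

# Monthly index standing in for 1979:01..2008:01 (349 months)
t = np.arange(349.0)
brk = 276.0                          # stand-in for 2002:01
d = (t >= brk).astype(float)         # regime dummy: 0 before break, 1 after

# Synthetic anomalies: +0.015/month before the break, -0.015/month after,
# plus a small sinusoidal wiggle standing in for noise
y = 0.015 * t - 0.03 * d * (t - brk) + 0.05 * np.sin(t / 6.0)

# Design matrix: intercept, trend, constant dummy, slope dummy
X = np.column_stack([np.ones_like(t), t, d, d * t])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
fit = X @ coef

# "dT": overall trend from the difference in the fitted end points,
# versus the slope of a single straight line through all the data
dT = (fit[-1] - fit[0]) / (t[-1] - t[0])
single_slope = np.polyfit(t, y, 1)[0]
```

With a downturn built in after the break, the endpoint-based dT comes out below the slope of a single straight line fitted through all the data, which is the pattern described above.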


Figure 1 – click for larger image


Figure 2 – click for larger image


Figure 3 – click for larger image


Figure 4 – click for larger image

Many readers will probably be familiar with the use of 30 years as a basis for a “climatological norm.”  While we do not have 30 years of data here, we’re close, close enough to refer to the overall trends as a climatological normal for the past three decades.  As I look at the results shown in the four figures, two things stand out. 

First, the dT of the final “best estimate” is 0.025C/decade (UAH_MSU) to 0.047C/decade (HadCRUT) lower than what we’d expect from fitting a straight trend line through the data.  That is perhaps the major point I’m trying to make in all this: that over the period for which we have satellite data to compare to land-sea data, the rise in global average temperature is not quite as great as one would think from fitting straight trend lines through the data. 

Incidentally, this is not entirely owing to fitting a downward trend through the data since 2001.  Separate slope and constant dummy variables are also included for the 1998 El Nino, and this accounts for some of the difference.  In fact, somewhat surprisingly, when a constant dummy is added for the 1998 El Nino, it reduces the slope (trend) for the non-El Nino part of the time series through 2001.  We usually expect a constant dummy to affect the model constant term, not the slope.  But in every case here it reduces the slope in a significant way as well, so some of the difference between the “dT” and the result we’d get from a straight trend line owes to the effect of controlling for the 1998 El Nino.

The second thing that stands out, of course, is the downturn since 2001.  Whether this downturn will continue or not, only time will tell.  But if it continues, then the “dT” will likely decline further.

Other things may stand out to other observers.  The differences within the two types of metrics are notable.  GISS implies more warming than HadCRUT, and RSS_MSU implies more warming than UAH_MSU, with the latter showing quite a bit less warming in the period up to 2001 (given the way we’ve modeled the data).  In the case of GISS vs. HadCRUT, the trends are actually quite similar in the period up to 2001; it is after that that the difference emerges, making one wonder if something has changed in recent years in the way one or the other is taking its measure of the earth’s temperature.

Just a final comment, as a way of putting this all in some perspective.  In AR4, the IPCC projects warming of 0.2C per decade for the next two decades in a variety of its climate change scenarios.  That will take a lot more warming than we’ve seen in recent decades.  And with the leveling off of the trend in recent years, even if an upward trend resumes, at present it seems highly unlikely that we will see a rise of 0.4C over the next two decades.  Of course, the future has a way of humbling all forecasts.  But perhaps the apocalypse is not as near at hand as some fear.

kim
March 14, 2008 3:35 am

Well, I’m pretty grateful there was a breakpoint six years ago. Had there not been, and had the ‘Post Hoc, Ergo Propter Hoc’ logical fallacy of temperature rise coinciding with CO2 rise run successfully for a few more years, then we would have been tragically bound to the irrelevant solutions of carbon capping. As it is, the biofuel delusion has raised the price of food worldwide, utterly unnecessarily.
When the general public understands that they are kicking in an extra dollar a gallon every time they buy milk because of the ‘madness of crowds’ and the dishonesty of a few scientists, there will be Hell to pay.
‘Buy a gallon of ethanol, starve a dozen children’
==============================

March 14, 2008 5:22 am

fwiw. from 1880 to 2007 there were 48 7 year periods with a negative trend.
22 of them stayed negative until 10 years

Basil
Editor
March 14, 2008 5:35 am

Nick Stokes,
On Menne’s paper, it is worth noting that he is analyzing annual data, and that the time frame is only through 2003. There’s nothing wrong with using annual data per se, but because of the smoothing performed by annualizing the data, the results are obviously going to be different than if monthly data are used. For instance, if you look at the charts in the link to Menne’s paper you posted, you can still see the effect of the 98 El Nino (if you know to look for it), but it doesn’t stand out to the same degree that it does when looking at monthly data over a shorter period of time. As for Menne not finding a break in the mid 1970’s, just eyeballing Menne’s data, I wouldn’t expect to find one either. That doesn’t mean one wasn’t there, were one to look at monthly data. In other words, applying the technique to monthly data, one would expect to find more change points, with smaller orders of magnitude, than one is going to find in annualized data.
As for Menne’s technique being superior because it infers the change points from the data, rather than postulating them a priori, I would argue that from a philosophical point of view (the philosophy being the philosophy of science) that’s nonsense. There’s no disputing that Menne’s technique is more sophisticated (for want of a better way to say it) than the Chow test. But from the standpoint of science, it matters not how we come up with our hypotheses, it matters only that they be falsifiable. I could throw darts at the data, and say “let’s see if there was a change point” where the dart hit, and that would be perfectly valid.
If, after successive dart throws, and continually NOT rejecting the null hypothesis of no change, the dart landed on 2002, and I found that I COULD reject the null hypothesis of no change, would it matter how I found the change point? In truth, Menne’s approach isn’t much more than that — mining the data, looking for change points. I just mined the data differently. Having done so, the hypothesis is out there for anybody to disprove. If you want to run Menne’s methodology on the data sets we are looking at, go for it!
As for not knowing how to post a plot of the running average, just upload it to http://www.tinypic.com, and post the URL here, and we can see it that way.

tommoriarty
March 14, 2008 7:18 am

Basil,
While I am sympathetic with your conclusion that there appears to be a downturn, or at least a leveling off, since 2001, there is something about your analysis technique that bothers me. You say:

The Chow test involves fitting a regression to the sub parts, and comparing the sum of the mean square error (MSE) of the sub parts to the mean square error of a regression fitted to the entire time period. If the sub parts come from sufficiently different regimes or circumstances, splitting the time series into two parts will reduce the total MSE, compared to the MSE of a single regression fitted to the entire time period.

It seems inevitable that the more subparts you choose to divide the data into, the smaller the total MSE will be. Take the absurd case of dividing n points into n-1 subparts; then the total MSE will be zero.
Best Regards,
Tom Moriarty
ClimateSanity

davidsmith1
March 14, 2008 8:16 am

Hi, Anthony. One of the temperature plots I watch has to do with the Indo-Pacific Warm Pool (“IPWP”). That’s the giant region of very warm tropical water which serves as Earth’s boiler room, supplying heat to the rest of the planet.
It is rather well-correlated with global temperature, as one might imagine. A map which shows the correlation is this one
http://davidsmith1.files.wordpress.com/2008/03/0313081.jpg
(Readers who aren’t familiar with these maps should simply note that the orange and red colors indicate that, as sea surface temperature changes there, so does the global air temperature.)
How has the IPWP behaved recently? I use a somewhat unusual plot to get a feel for the answer
http://davidsmith1.files.wordpress.com/2008/03/0314081.jpg
This plot shows the IPWP temperature anomaly and how it compares with the same month in prior years (1948 to the present). For example, the final month (February 2008) is about the 22nd warmest (or 38th coolest) of the last 60 years.
What it shows is that the IPWP has been undergoing a general cooling trend since about 2001. That’s important, because this is Earth’s boiler room.

randomengineer
March 14, 2008 8:19 am

(So, until you point to the article, I’m just going to assume that, as far as we know, Tamino pulled 1975 out of a hat.)
I would bet that you could get 1975 by plotting 2nd derivatives of temps looking for the part where the spikes are more up than down and then go +/- 5 years or so on either side of that, curve fit it, and wind up with 1975. This seems the easy way to do this since you go from cooling (40’s-70’s) to warming (70’s – present) somewhere in that 70’s data range anyway. All this does is narrow down where you can pick the transition change point.
While I’ll bet that the method used was probably more involved than that and used fancier math, I’m borrowing from Dr. Pielke’s example and doing this the simple way. But that’s my guess as to how that date was reached, and it’s not pulling it out of a hat.
Lucia — does this sound all that farfetched to you?
Regardless of how this was reached, my problem with this is that this method of hinges etc seems to be more of a low freq detector of PDO or some other cycle (avg 30 years or so) and this seems disingenuous at best to detect the cycle start point and claim warming.

Basil
Editor
March 14, 2008 8:33 am

Tom,
The Chow test only envisions subdividing into two parts. I would imagine that were it extended to the kind of case you are considering, the degrees of freedom would rapidly decline with each successive partitioning of the data, and that in your reductio ad absurdum case the degrees of freedom would be zero, which is just another way of saying that the confidence limits would be plus or minus infinity! IAC, the Chow test, limited as it is, is well established as a valid hypothesis test.
Basil

Raven
March 14, 2008 8:49 am

steven mosher says:
“fwiw. from 1880 to 2007 there were 48 7 year periods with a negative trend.
22 of them stayed negative until 10 years”
And we did not have 2 degC/century warming either.
The issue is not: ‘can we have decadal declines in temps as part of a long term rising trend?’ (answer: of course).
The issue is: ‘how much weight should be placed on IPCC projections of >2 degC/century warming given a 10 year decline while CO2 levels continued to rise?’ (answer: not much).
That said, we are still dealing with probabilities, and there is a 1 in 20 chance that the 2 degC/century trend is real even if it is outside the 95% confidence limits given the data since 2002.

Jsoh
March 14, 2008 8:57 am

Why does everyone use linear methods to analyze the climate system? Its natural component is almost surely cyclical, which would make a Fourier-like analysis more relevant (this is not a knock against this blog post per se; it’s done everywhere in climate studies that I can see). What I mean is, if you take, e.g., from 0 to pi/2 (1/4 period) of a sine wave, it has a strong (and highly significant) linear trend, but no one would argue that you’ve found something meaningful simply because your linear fit was at the 99.999% confidence level.
I’m not savvy enough in statistics to know what this leads to, but I do know that in general you can’t simply take data that is fundamentally non-linear and fit a line to it. All the statistical tools that measure significance, etc. are only valid on fundamentally linear data. Part of the problem is that there aren’t good techniques (or at least not any that they teach in undergrad stats) that test significance of e.g. a sinusoidal fit. This is probably why no one does it, but still, I can’t help but wonder whether this sort of thing doesn’t make a HUGE difference when, e.g. the IPCC says they are 95% confident that the recent trend is not natural.
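The quarter-cycle point is easy to demonstrate numerically. A small sketch (illustrative only, plain Python):

```python
import math

# Fit a straight line to one quarter-cycle of a pure sine wave and
# measure how much variance the line "explains".
n = 100
xs = [i * (math.pi / 2) / (n - 1) for i in range(n)]
ys = [math.sin(x) for x in xs]

# Ordinary least squares fit
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx

# Coefficient of determination (R^2)
ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - my) ** 2 for y in ys)
r_squared = 1 - ss_res / ss_tot
```

The line “explains” well over 90% of the variance, yet the underlying sine wave has no long-run trend at all, which is exactly the trap the commenter describes.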

Robert in Calgary
March 14, 2008 9:44 am

A gentleman would have quietly made the effort to get the link, even asking Tammy directly for it.
Instead we’ve got someone who has time for post after post…….almost all with a huge chip on the shoulder.

March 14, 2008 9:49 am

Compare Tammy’s method (unpublished, not peer reviewed; I checked his publication history and he has nothing published on this) with any other method for change point analysis. I think you’ll find Tammy wanting. Essentially he fit two lines to the series to minimize the regression error. Why two? Put another way, choosing to find ONE hinge point in the temp series will guide you to the 1970s. A more robust method would find all the “hinge” points, all the change points. So in a way the undocumented Tammy method just data mines for the exact year in the 70’s that minimizes the error of two lines of regression.
I searched in vain to find the Tamino method of change point analysis in consensus time series statistical science. I find he has no published papers here. The Chow test is akin to the Tammy test except it is more general and not restricted, as his was, to finding one break point. It and other methods find multiple break points. That’s the point! Anyway. Be that as it may, perhaps somebody should publish the Tammy test as a new method of change point analysis in a statistics journal.
Basil’s selection of 2001 is fair enough. The real issue is what to make of a seven year period of negative trend? Not much. Just deal with the fact that you have a 7 year negative trend. WEATHER HAPPENS. Like I said, if you randomly selected a start year since 1880 you would have a 38% chance of picking a start year that had a seven year negative slope. So, it’s not at all rare; neither is it likely to last.
Now, given that you pick a 7 year period with a negative slope, you have a roughly 50% chance it will last 10 years, a 30% chance it will last 14 years, and almost no chance it will last 30 years. SO BUCK UP, the cold will end. In fact I think the coldists are overplaying this, and just tweaking the noses of the warmists, who also overplay short term fluctuations.

March 14, 2008 9:53 am

Off Topic, but please enjoy:
The winner for Most Ironic Comment is…

Basil
Editor
March 14, 2008 10:29 am

Steve Mosher is most certainly correct that we cannot “make [here I would add ‘too’] much” out of the downturn since 2001. So let’s be clear about what I made of it. All I made of it is that it is part of the 29 year period of history since satellite measurements began, and that if ignored, as in the case of fitting a straight line regression through the entire period to measure the trend, it will lead us to somewhat overstate the “best estimate” of what the “average” trend was during the entire period.
Let’s take RSS_MSU as an example to clear this up. Nowhere do I say anything about the likelihood of the -0.336 since 2001 continuing indefinitely. Rather, what I’m saying is if we do not properly model the in sample data, and simply fit a straight line through the data, we end up concluding that the average trend over 29 years was 0.169C/decade, when in truth a better estimate of the average trend over the past 29 years was on the order of 0.131C/decade.
In other words, if we were to use the last 29 years as climatological norm to predict what will happen over the next 29 years, what I’m saying is that 0.131C/decade is a better estimate than 0.169C/decade.
And nothing anybody has said, in comments, materially undermines any of this. In fact, most of it completely misses the point.
Basil

Basil
Editor
March 14, 2008 11:55 am

I doubt that anything I could say at this point will change the minds of those bothered most by what I’ve done, and I need to get busy on Part III (which will be more of a wrap up, than anything new or exciting), but if it will help in the least little bit, what I’m doing is something analogous to this:
Imagine a 30 year period in which a number is constant at a value of 20 for the first 15 years, and is constant at a value of 10 for the last 15 years. Who here thinks they can compute the “best estimate” of what the average will be for the following 30 years by fitting a trend line through the data? If this pattern repeats every 30 years, the trend is zero! Fitting a straight line regression through the data leads us to imagine that the trend is negative. The only way to accurately determine the “trend” would be to separate the series into two parts, where we would find that in the first part the “regression” is a constant 20, with a slope of zero; the “regression” in the second part is a constant of 10, with a slope of zero; and the best estimate for the next 30 years is a constant of 15 and a slope of zero.
Is that too hard to fathom?
Fitting straight lines through cyclical data creates all sorts of problems, of which correcting for serial correlation is only one. Controlling for discontinuities is another.
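The arithmetic of that thought experiment is easy to check. A toy sketch in plain Python:

```python
# Fifteen "years" constant at 20, then fifteen constant at 10
ys = [20.0] * 15 + [10.0] * 15
xs = list(range(30))

# Single straight-line (OLS) fit through all 30 points
mx, my = sum(xs) / 30, sum(ys) / 30
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)

# Each regime is flat, and if the pattern repeats every 30 years the
# right forecast is the overall mean, with zero long-run trend
regime_means = (sum(ys[:15]) / 15, sum(ys[15:]) / 15)
forecast = my
```

The single-line fit reports a trend of about -0.5 per “year” even though neither regime has any trend at all, while the regime means (20 and 10) and the overall mean (15) tell the real story.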
Yet supposed experts or authorities use straight line regression all the time, like here:
http://www.remss.com/data/msu/graphics/plots/sc_Rss_compare_TS_channel_tlt.png
Well, excuse me for having the temerity to suggest that there might be other, and better, ways than this, to estimate what the real trend has been for the past 29 years.

Sean Houlihane
March 14, 2008 12:35 pm

Interesting that no one seems to take the step-change proposition seriously. Given a 10 degree local daily variance, and a 0.5 annual global average variance, the thermal inertia argument seems flawed (oceans have massive capacity, but limited flux; for the atmosphere, I have my doubts). For me, step changes suggest positive feedback, short term perturbations resulting in transition from one regime to another. Who is claiming that there are not localised climate systems which sit close to a tipping point in their behaviour?

March 14, 2008 12:56 pm

Lucia,
Menne specifically leaves open that the newer model may not be more accurate than previous descriptions – if you read the very start of the conclusions in the paper, it says so.
“Yes. Someone, somewhere, once found a hinge point at 1975.”
Not someone somewhere, but several people referenced in that paper. Is there a reason to make them sound like random clowns on the Internet when they are published statisticians (who support Lee’s claim)?
By the way the support for Lee’s claim is easy to find – just search for 1975 in that paper. Or ask Lee to do it for you.
“But Menne specifically shows that disappears on further analysis! ”
He showed that it might disappear on further analysis.

Obsessive Ponderer
March 14, 2008 1:31 pm

Basil
Could you comment on the seemingly 30 year periodicity that both GISS and HADCRUT appear to show all the way back to 1880 and 1850? Start at 2004 (I use this as Briggs’ wavelet analysis indicates a cooling started then, and it fits pretty good :] - cherry picking!!). Dell commented on this in part one, but no one picked up on it.
During the period 1944 to 1974 Hadcrut shows a small warming and GISS a small cooling. This was interesting because Dell indicated (in the latest thread) that “the infamous 1975 Newsweek article on “The Cooling World’ shows a temp drop of about .65 degrees during that period of time.”
We could be just naturally going into another cool period if this periodicity has any legs.

John M
March 14, 2008 3:40 pm

Obsessive,
Regardless of what statistical test is used to assign a “hinge point”, any analysis of 20th century temperature trends, IMHO, has to start with three little letters: PDO. Here’s a comment and graph I posted at CA. I’m sure there’s a more thorough treatment someplace, I just haven’t found it yet.
http://www.climateaudit.org/?p=2223#comment-222189
David Smith has also done some extensive looking at other Oceanic trends that can impact global temperature trends.

old construction worker
March 14, 2008 3:42 pm

Chris: you said
“Ask the analyst if the technical trends suggest that the stock will go higher or lower. Guys, Wall St. have been doing this type of analysis for decades. Why re-invent the wheel?”
If you were to go back to GW’s IPO (about 10,000 years ago), I would have sold this loser a long time ago. The only way to play this stock would be to short sell it about every 1500 or 500 or 60 years.

nick stokes
March 14, 2008 4:31 pm

Basil,
OK, I followed your advice re tinypic. Here is the HADCRUT3 monthly data since 1997, plotted raw, and with its 12-mth moving average.
here
And here is the 12-mth curve, from the previous plot, inverted. Since I can’t preview here, I’m just hoping it works. You’ll see it looks exactly the same as the second plot in your part 1 thread.
On the general philosophy of curve fitting, I refer back to what I said early in this thread. You have a set of data, and want to see if it can be explained by some model that makes sense to you. So you check the differences, and see if they look like noise. If so, you say the model fits. It won’t be the only one that does.
You can always find something that is closer to the original data that fits better. It’s a compromise between fit and meaningfulness. And I think you have gone too far away from meaningfulness, especially when you have abrupt temperature changes, which imply infinite heat fluxes.
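For anyone who wants to reproduce this sort of smoothing, a trailing 12-month moving average takes only a few lines (toy data below, not the HADCRUT3 series):

```python
# Trailing 12-month moving average (each point averages itself with the
# preceding 11 months); a centered window is another common choice.
def moving_average(vals, window=12):
    return [sum(vals[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(vals))]

monthly = [float(m % 12) for m in range(36)]   # toy series with a 12-month cycle
smoothed = moving_average(monthly)             # the annual cycle averages out
```

With a pure 12-month cycle as input, every smoothed value collapses to the cycle mean, which is why a 12-month average is the natural window for removing the seasonal signal.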

nick stokes
March 14, 2008 4:36 pm

Basil,
No, that tinypic didn’t work, though I did what they said. I’m hoping this will:
Here is the HADCRUT3 monthly data since 1997, plotted raw, and with its 12-mth moving average.
here
And here is the 12-mth curve, from the previous plot, inverted.

nick stokes
March 14, 2008 4:40 pm

Basil,
OK, nearly there. The plots are actually linked in the last post, but it isn’t clear. Wish I could preview. Hope this works:
Here is the HADCRUT3 monthly data since 1997, plotted raw, and with its 12-mth moving average.
And here is the 12-mth curve, from the previous plot, inverted.

Basil
Editor
March 14, 2008 4:57 pm

Correction:
in describing the Chow test, where I said “mean square error,” I should have said “sum of the squared residuals.” The Chow test calculations are correct, just not my representation of them.

Obsessive Ponderer
March 14, 2008 5:23 pm

John M
Thanks for that. I have seen graphs like that before. Are there any records or ways of knowing what the PDO was doing back to 1850? The more I look at the data and the correlations of the sun, PDO, and other parameters (and I realize that correlation is not definitive of anything) with temperature, the less likely I am to believe that increased CO2 has any overriding effect on climate, and the more concerned I get about the headlong rush to “control climate change”.

John West
March 14, 2008 6:28 pm

The Gorites can’t lose this one. While we all get on here and discuss the fine points in the science of global warming vs global cooling, the left are now calling for protection against any and all climate change. Got that? “Climate Change”.
This debate and movement is not about the actual climate; it’s about the take down of the democratic free enterprise societies on planet earth. They mean to replace them with a Marxist-Leninist brand of socialism in a world of serfs under the auspices of the United Nations.
I don’t mean to belittle the work being done to disprove the loons who are selling us this load of crap, but it simply is not the issue. The issue is global slavery vs freedom.
We need to get a lot more political about this war with the Climatites and elect people who will not fall for it. Canada has Stephen Harper, who has stated that he knows it’s a lot of crap and that it will destroy our economy if we do what the Kyoto Accord demands. He will have to do something to appease the zealots who insist we ‘save the planet’ or he will lose too many votes. My own provincial government in British Columbia has just levied a seven cent per liter tax on gas to “save the planet”. That would amount to about thirty cents per gallon.
Add to that the diversion of foodstuffs in favor of growing fuel for cars and trucks and we will have a worldwide famine soon. All food will go up exponentially. The poor will starve and the more well off will go broke trying to keep up.
This is a global extortion the likes of which we have never seen before.