New paper makes a hockey sticky wicket of Mann et al 98/99/08

NOTE: This has been running for two weeks at the top of WUWT; discussion has slowed, so I’m placing it back in the regular queue. – Anthony

UPDATES:

Statistician William Briggs weighs in here

Eduardo Zorita weighs in here

Anonymous blogger “Deep Climate” weighs in with what he/she calls a “deeply flawed study” here

After a week of being “preoccupied”, Real Climate finally breaks radio silence here. It appears to be a prelude to a dismissal with a “wave of the hand”.

Supplementary Info now available: All data and code used in this paper are available at the Annals of Applied Statistics supplementary materials website:

http://www.imstat.org/aoas/supplements/default.htm

=========================================

Sticky Wicket – phrase, meaning: “A difficult situation”.

Oh, my. There is a new and important study on temperature proxy reconstructions (McShane and Wyner 2010) that has been submitted to the Annals of Applied Statistics and is listed for publication in the next issue. According to Steve McIntyre, this is one of the “top statistical journals”. This paper is a direct and serious rebuttal to the proxy reconstructions of Mann. It seems watertight on the surface because, instead of attacking the quality of the proxy data, the authors assumed the proxy data was accurate for their purpose and then created a Bayesian backcast method. Then, using the proxy data, they demonstrate that it fails to reproduce the sharp 20th century uptick.

Now, there’s a new look to the familiar “hockey stick”.

Before:

McShane-Wyner-Fig1 — Multiproxy reconstruction of Northern Hemisphere surface temperature variations over the past millennium (blue), along with 50-year average (black), a measure of the statistical uncertainty associated with the reconstruction (gray), and instrumental surface temperature data for the last 150 years (red), based on the work by Mann et al. (1999). This figure has sometimes been referred to as the hockey stick. Source: IPCC (2001).

After:

McShane-Wyner-Fig16 — FIG 16. Backcast from Bayesian Model of Section 5. CRU Northern Hemisphere annual mean land temperature is given by the thin black line and a smoothed version is given by the thick black line. The forecast is given by the thin red line and a smoothed version is given by the thick red line. The model is fit on 1850-1998 AD and backcasts 998-1849 AD. The cyan region indicates uncertainty due to ε_t, the green region indicates uncertainty due to β, and the gray region indicates total uncertainty.

Not only are the results stunning, but the paper is highly readable, written in a sensible style that most laymen can absorb, even if they don’t understand some of the finer points of Bayesian methods, loess filters, or principal components. On top of that, this paper is a confirmation of McIntyre and McKitrick’s work, with a strong nod to Wegman. I highly recommend reading this and distributing this story widely.

Here’s the submitted paper:

A Statistical Analysis of Multiple Temperature Proxies: Are Reconstructions of Surface Temperatures Over the Last 1000 Years Reliable?

(PDF, 2.5 MB. Backup download available here: McShane and Wyner 2010)

It states in its abstract:

We find that the proxies do not predict temperature significantly better than random series generated independently of temperature. Furthermore, various model specifications that perform similarly at predicting temperature produce extremely different historical backcasts. Finally, the proxies seem unable to forecast the high levels of and sharp run-up in temperature in the 1990s either in-sample or from contiguous holdout blocks, thus casting doubt on their ability to predict such phenomena if in fact they occurred several hundred years ago.

Here are some excerpts from the paper (emphasis in paragraphs mine):

This one shows that M&M hit the mark, because it is independent validation:

In other words, our model performs better when using highly autocorrelated noise rather than proxies to “predict” temperature. The real proxies are less predictive than our “fake” data. While the Lasso generated reconstructions using the proxies are highly statistically significant compared to simple null models, they do not achieve statistical significance against sophisticated null models.

We are not the first to observe this effect. It was shown, in McIntyre and McKitrick (2005a,c), that random sequences with complex local dependence structures can predict temperatures. Their approach has been roundly dismissed in the climate science literature:

To generate “random” noise series, MM05c apply the full autoregressive structure of the real world proxy series. In this way, they in fact train their stochastic engine with significant (if not dominant) low frequency climate signal rather than purely non-climatic noise and its persistence. [Emphasis in original]

Ammann and Wahl (2007)
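For readers who want to see the general effect for themselves, here is a rough toy sketch. It is not the paper's Lasso procedure and uses no real data: it only compares how strongly white noise and highly autocorrelated AR(1) noise correlate, by chance, with an autocorrelated stand-in for "temperature". The function names, the AR coefficient of 0.9, the 149-year length (matching 1850-1998), and the 500 fake series are all assumptions made for illustration.

```python
import random
import statistics


def ar1_series(n, phi, rng):
    """Generate n values of an AR(1) process x[t] = phi * x[t-1] + e[t]."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def pearson(a, b):
    """Plain Pearson correlation between two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def mean_abs_corr(phi, n_years=149, n_series=500, seed=0):
    """Average |r| between fake 'proxies' and an independent toy target."""
    rng = random.Random(seed)
    # Toy "temperature": autocorrelated and unrelated to the fake proxies.
    # (Assumed AR coefficient 0.9; the same seed gives the same target each call.)
    target = ar1_series(n_years, 0.9, rng)
    corrs = [abs(pearson(ar1_series(n_years, phi, rng), target))
             for _ in range(n_series)]
    return statistics.fmean(corrs)


white = mean_abs_corr(phi=0.0)   # fake proxies with no memory
red = mean_abs_corr(phi=0.9)     # fake proxies with strong memory
print(f"mean |r|, white noise:      {white:.3f}")
print(f"mean |r|, AR(1), phi = 0.9: {red:.3f}")
```

In this toy setup the autocorrelated noise shows a noticeably larger chance correlation with the target than white noise does, which is the intuition behind testing proxies against "sophisticated" null models rather than simple ones.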

…

On the power of the proxy data to actually detect climate change:

This is disturbing: if a model cannot predict the occurrence of a sharp run-up in an out-of-sample block which is contiguous with the in-sample training set, then it seems highly unlikely that it has power to detect such levels or run-ups in the more distant past. It is even more discouraging when one recalls Figure 15: the model cannot capture the sharp run-up even in-sample. In sum, these results suggest that the ninety-three sequences that comprise the 1,000 year old proxy record simply lack power to detect a sharp increase in temperature. See Footnote 12.

Footnote 12:

On the other hand, perhaps our model is unable to detect the high level of and sharp run-up in recent temperatures because anthropogenic factors have, for example, caused a regime change in the relation between temperatures and proxies. While this is certainly a consistent line of reasoning, it is also fraught with peril for, once one admits the possibility of regime changes in the instrumental period, it raises the question of whether such changes exist elsewhere over the past 1,000 years. Furthermore, it implies that up to half of the already short instrumental record is corrupted by anthropogenic factors, thus undermining paleoclimatology as a statistical enterprise.
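The idea of a contiguous holdout block mentioned in the excerpt above can be sketched as follows. This only shows how the instrumental years might be split; it does not fit any model. The function name and the 30-year block length are assumptions for illustration, not something taken from the excerpts quoted here.

```python
def holdout_blocks(first_year, last_year, block_len):
    """Yield (train_years, holdout_years) for every contiguous block of years."""
    years = list(range(first_year, last_year + 1))
    for start in range(len(years) - block_len + 1):
        # Hold out one unbroken run of years; train on everything else.
        holdout = years[start:start + block_len]
        train = years[:start] + years[start + block_len:]
        yield train, holdout


# Assumed setup: instrumental period 1850-1998, 30-year holdout blocks.
splits = list(holdout_blocks(1850, 1998, 30))
print("number of splits:", len(splits))
print("first holdout:", splits[0][1][0], "to", splits[0][1][-1])
print("last holdout: ", splits[-1][1][0], "to", splits[-1][1][-1])
```

The last split holds out the final block of years, so a model trained on the earlier years has to forecast the most recent period without having seen it.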

…

McShane-Wyner-Fig15 — FIG 15. In-sample Backcast from Bayesian Model of Section 5. CRU Northern Hemisphere annual mean land temperature is given by the thin black line and a smoothed version is given by the thick black line. The forecast is given by the thin red line and a smoothed version is given by the thick red line. The model is fit on 1850-1998 AD.

We plot the in-sample portion of this backcast (1850-1998 AD) in Figure 15. Not surprisingly, the model tracks CRU reasonably well because it is in-sample. However, despite the fact that the backcast is both in-sample and initialized with the high true temperatures from 1999 AD and 2000 AD, it still cannot capture either the high level of or the sharp run-up in temperatures of the 1990s. It is substantially biased low. That the model cannot capture run-up even in-sample does not portend well for its ability to capture similar levels and run-ups if they exist out-of-sample.

…

Conclusion.

Research on multi-proxy temperature reconstructions of the earth’s temperature is now entering its second decade. While the literature is large, there has been very little collaboration with university-level, professional statisticians (Wegman et al., 2006; Wegman, 2006). Our paper is an effort to apply some modern statistical methods to these problems. While our results agree with the climate scientists’ findings in some respects, our methods of estimating model uncertainty and accuracy are in sharp disagreement.

On the one hand, we conclude unequivocally that the evidence for a “long-handled” hockey stick (where the shaft of the hockey stick extends to the year 1000 AD) is lacking in the data. The fundamental problem is that there is a limited amount of proxy data which dates back to 1000 AD; what is available is weakly predictive of global annual temperature. Our backcasting methods, which track quite closely the methods applied most recently in Mann (2008) to the same data, are unable to catch the sharp run up in temperatures recorded in the 1990s, even in-sample.

As can be seen in Figure 15, our estimate of the run up in temperature in the 1990s has a much smaller slope than the actual temperature series. Furthermore, the lower frame of Figure 18 clearly reveals that the proxy model is not at all able to track the high gradient segment. Consequently, the long flat handle of the hockey stick is best understood to be a feature of regression and less a reflection of our knowledge of the truth. Nevertheless, the temperatures of the last few decades have been relatively warm compared to many of the thousand year temperature curves sampled from the posterior distribution of our model.

Our main contribution is our efforts to seriously grapple with the uncertainty involved in paleoclimatological reconstructions. Regression of high dimensional time series is always a complex problem with many traps. In our case, the particular challenges include (i) a short sequence of training data, (ii) more predictors than observations, (iii) a very weak signal, and (iv) response and predictor variables which are both strongly autocorrelated.

The final point is particularly troublesome: since the data is not easily modeled by a simple autoregressive process it follows that the number of truly independent observations (i.e., the effective sample size) may be just too small for accurate reconstruction.

Climate scientists have greatly underestimated the uncertainty of proxy based reconstructions and hence have been overconfident in their models. We have shown that time dependence in the temperature series is sufficiently strong to permit complex sequences of random numbers to forecast out-of-sample reasonably well fairly frequently (see, for example, Figure 9). Furthermore, even proxy based models with approximately the same amount of reconstructive skill (Figures 11,12, and 13), produce strikingly dissimilar historical backcasts: some of these look like hockey sticks but most do not (Figure 14).

Natural climate variability is not well understood and is probably quite large. It is not clear that the proxies currently used to predict temperature are even predictive of it at the scale of several decades let alone over many centuries. Nonetheless, paleoclimatological reconstructions constitute only one source of evidence in the AGW debate. Our work stands entirely on the shoulders of those environmental scientists who labored untold years to assemble the vast network of natural proxies. Although we assume the reliability of their data for our purposes here, there still remains a considerable number of outstanding questions that can only be answered with a free and open inquiry and a great deal of replication.
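The effective sample size point in the conclusion above can be made concrete with a back-of-the-envelope sketch. It uses the common AR(1) rule of thumb n_eff = n(1 - ρ)/(1 + ρ), where ρ is the lag-1 autocorrelation. That is a simplifying assumption: the authors say the data is not easily modeled by a simple autoregressive process, so treat this purely as an illustration of the direction of the effect. The function names and the 149-year length are likewise assumptions.

```python
def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[t] - m) * (x[t + 1] - m) for t in range(n - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den


def effective_sample_size(n, rho):
    """AR(1) rule of thumb: n_eff = n * (1 - rho) / (1 + rho)."""
    return n * (1.0 - rho) / (1.0 + rho)


# 149 annual values, as in an 1850-1998 calibration period (assumed).
for rho in (0.0, 0.5, 0.9):
    print(f"rho = {rho}: n_eff = {effective_sample_size(149, rho):.1f}")
```

With no autocorrelation all 149 years count as independent; under this rule of thumb, a lag-1 autocorrelation of 0.9 leaves fewer than 8 effectively independent observations.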

===============================================================

Commenters on WUWT report that Tamino and Romm are deleting comments that even mention this paper from their blogs’ comment sections. Their refusal to even acknowledge it tells you it has squarely hit the target, and the fat lady has sung – loudly.

(h/t to WUWT reader “thechuckr”)


GrantB

August 14, 2010 9:32 pm

Nick Stokes @ 9:04pm
Oh dear Nick, a quotation from page 2 of the introduction putting the background in context and quoting from the IPCC. Is that the best you can do? There are another 43 pages after that or did you stop there?
Mind you, Blakeley McShane is from the Kellogg School of Management and is obviously funded by big corn.

Jimmy Haigh

August 14, 2010 9:34 pm

It’s nice to see a publication in one of the “top statistical journals” even after all these years which agrees with what the vast majority of us here have known for so long: that the hockey team’s work is pure mince.
(“Mince”: A Scottish term which roughly translates as “garbage”.)

Jimmy Haigh

August 14, 2010 9:35 pm

I can just picture Gavin Schmidt’s grandfather writing on his blog in the early 20th century: “Arrhenius disappoints”.

cohenite

August 14, 2010 9:38 pm

Oh Nick, let it go; CO2, in sufficient quantities “can force temperature increases”. We know that:
http://wattsupwiththat.files.wordpress.com/2009/07/co2_temperature_curve_saturation.png
The late S. Schneider knew that:
http://www.sciencemag.org/cgi/content/abstract/173/3992/138
It’s just that those forcings diminish to statistical errors, as does the temperature response.
The point about this [final] nail in the HS is that it shows whatever is happening today is not exceptional; that was the point about Mann’s HS and the basis of AGW; it was wrong. Find another cause; how about asteroid collisions? That’s a real issue.

AlanG

August 14, 2010 9:45 pm

To misquote Julius Henry (Groucho) Marx, amateurs should stick to brain surgery. Lightweight Math Mann was clearly out of his depth.
Looking at Fig. 16 above, it reinforces my belief that the descent into the next ice age started about 3500 years ago at the end of the Minoan warming. The GISP2 ice core shows a long term downslope during the last ice age of about 0.14C per 1000 years. The initial descent from the peak of the last interglacial was about 0.4C per 1000 years. Fig. 16 is steeper than that.

Ric Werme

Editor

August 14, 2010 9:46 pm

Michael says:
August 14, 2010 at 7:26 pm

This WUWT blog should create its own Hurricane prediction poll on the side line. I bet we could predict hurricane activity much closer than NOAA’s current prediction accuracy.

Easily done – instead of one forecast, we’d have 100. That greatly increases the chance one is more accurate. Or do you propose getting everyone to agree on a single forecast? (Consensus forecasting?)

This blog’s Hurricane prediction forecast poll may be the one in the future that financial institutions rely on to make actuarial plans, set premiums, and is used to make preparedness plans.

Somehow I have trouble visualizing a Board of Directors meeting discussing the relative merits of people who have spent 1000s of hours looking into many details over a group of mostly nameless and uncontactable people.

I predicted zero hurricanes last year, this year and was 100% accurate.

There were three. None hit the US mainland, but your statement is 1000% wrong.
All right, infinitely wrong. 3 / 0 does not compute.

Ron House

August 14, 2010 9:47 pm

As can be seen in Figure 15, our estimate of the run up in temperature in the 1990s has a much smaller slope than the actual temperature series. Furthermore, the lower frame of Figure 18 clearly reveals that the proxy model is not at all able to track the high gradient segment. Consequently, the long flat handle of the hockey stick is best understood to be a feature of regression and less a reflection of our knowledge of the truth.

Poetic justice. The alarmists fiddle the temperature record to introduce a spurious temperature rise, which these statisticians trust as real, and so it becomes evidence that the other alarmist fiddle, the hockey stick, is ‘not robust’. That means, of course, that on the one hand those of us who seek truth rather than ideology must therefore have reservations about some of this paper’s results until the consequences of the temperature fiddle have been incorporated properly. On the other hand the shysters cannot consistently agree with our reservations! The irony of it!

CRS, Dr.P.H.

August 14, 2010 9:48 pm

Why should the community of climatologists object to this peer-reviewed publication? After all, they stood up & cheered on RC etc. when the Oxburgh inquiry exonerated the Hockey Team of professional malfeasance.
However, they could have used a few undergraduate classes in linear regression!
—
The panel found that the statistical tools that CRU scientists employed were not always the most cutting-edge, or most appropriate.
“We cannot help remarking that it is very surprising that research in an area that depends so heavily on statistical methods has not been carried out in close collaboration with professional statisticians,” reads the inquiry’s conclusions.
However, “it is not clear that better methods would have produced significantly different results,” the panel adds.
http://www.newscientist.com/article/dn18776-climategate-scientists-chastised-over-statistics.html
——
This latest publication seems to indicate that, yes, better statistical methods DO produce significantly different results!
This paper is huge, thanks for posting, Anthony!

Rockyroad

August 14, 2010 9:49 pm

duckster says:
August 14, 2010 at 9:20 pm
(…)
OK. So your job now would be to show consistency by fitting it into the available evidence so that it doesn’t contradict the other points you have made against CAGW. There is no point at all in destroying Mann if you have to throw out half of the all the other things that have been said on this blog in order to do so.
—–Reply:
No, nothing else needs to be said–this is a refutation of Mann’s statistical methods; as such there is NO requirement to include anything else. Your request is simply an obvious attempt of deflecting a very damning rebuttal of Mann’s mathematical acumen. His authority is over; he IS destroyed and with him goes CAGW. Gone; done; kaput.
Why? He lied. Or he was stupid. Your choice.

Honest ABE

August 14, 2010 9:55 pm

I wonder when the Real Climate team will slam out a response without an answer so their lemmings have a url to point to and declare this paper debunked.
Fortunately many of us realize that reality isn’t a function of assertion.

James Sexton

August 14, 2010 9:58 pm

While some may not see the humor in this, to me, it is side splitting, a final (b)slap……….
Our work stands entirely on the shoulders of those environmental scientists who labored untold years to assemble the vast network of natural proxies. Although we assume the reliability of their data for our purposes here, there still remains a considerable number of outstanding questions that can only be answered with a free and open inquiry and a great deal of replication. <---- lol, Phil's greatest fears realized.
In other words, they seem to be saying, “YOU’RE DOING IT WRONG!!!” And, “you had your chance, now the grown-ups have to do it.” “Now, run along and bring me back the thermometer readings and we’ll show you how to interpret them.”

Dave F

August 14, 2010 10:02 pm

Well, in light of all this, I am open to comments on why deriving climate sensitivity from the LGM is ok.
I also would like to reiterate that we can predict temperature just fine using the tools given us by meteorology and would like to know why it is necessary to throw out any of the data used in weather prediction when it comes to climate models. Anyone?

chris y

August 14, 2010 10:04 pm

Anthony- I think Figure 17 from the paper is actually very telling, since it overlays the newly estimated error bands on top of the archival hockey stick spaghetti graphs.
Really stunning. Going back more than a few centuries, the error bands fill up the entire vertical extent!

JDN

August 14, 2010 10:04 pm

Annals of Applied Statistics is the sixth rated statistics journal (impact factor, of course, = 2.57). Of course, it could be tops in its specialty. It looks like it has a heavy representation of Japanese sponsors and some major statistics departments. The editor-in-chief is a Bush-era National Science medal award winner. The editor for physical & environmental statistics actually looks like an environmentalist from his listed interests. So, fine journal with mixed viewpoints.
This is an interesting development because it leaves the alarmist professors an out that will allow them to suspend their claims and still receive further funding. You may have won this battle if they take the offer. I don’t think you’ve won the war. That will resume when the cold snap is over and people have forgotten about scandals and such.

John F. Hultquist

August 14, 2010 10:07 pm

I always like to know who wrote what I’m asked to read. I don’t like to feel like I am part of the mushroom syndrome. A new name gets a couple of chances – if I think the comments are untracked I first try to find out whether the writer has any respectability because my lack of understanding could be the problem. Then I could do some research and reading and better assimilate the new information. After reading Mike Roddy’s statements I felt the need to check on him. The second of the listings below seems the most likely to be knowledgeable about his trade. Henceforth, I will only read Jaguar related posts by a Mike Roddy. Climate related comments by Mike Roddy – I don’t think so.
Okay, will the real Mike Roddy please stand up.
Mike Roddy is a long-time CP commenter. A UC Berkeley graduate, he has pursued many careers, including solar manufacturing, writing and research, and managing social housing projects on four continents.
OR
Mike Roddy Motors’ The Independent Jaguar Specialists’
The Leader in all aspects of Servicing, Repairs, Restorations and Improvements for all makes of Jaguar cars.

August 14, 2010 10:12 pm

That poor graph – it’s suffered a death by a thousand cuts and yet it still has stalwart defenders. I hope Mann has other successes on which to ride to comfortable retirement because this horse is finished. I can’t help but think his circle of peers is becoming a close knit bunch whose objectivity is certainly now open to wonderment.

And in the master’s chambers,
They gathered for the feast
They stab it with their steely knives,
But they just can’t kill the beast

I’m probably going to wish there was a preview function here…

CRS, Dr.P.H.

August 14, 2010 10:15 pm

GrantB says:
August 14, 2010 at 9:32 pm
Nick Stokes @ 9:04pm
Oh dear Nick, a quotation from page 2 of the introduction putting the background in context and quoting from the IPCC. Is that the best you can do? There are another 43 pages after that or did you stop there?
Mind you, Blakeley McShane is from the Kellogg School of Management and is obviously funded by big corn.
————-
REPLY:
Sorry, mate, I’m at University of Illinois and I’M funded by big corn!! And big cheese, big meat packer, big pandemic etc.
Here’s McShane’s website:
http://www.blakemcshane.com
I’ve never met him, but I’ve lectured a bit over at Kellogg & they are usually considered one of the top graduate schools of business in the USA. His resume is very impressive.
This publication is a serious shot across the bow of the Hockey Team crowd, let’s see how they react to it.

Poptech

August 14, 2010 10:17 pm

Annals of Applied Statistics Editors better prepare for the incoming wave of team science comments on the paper. The good news is the team now has to argue their statistical “methods” with professional statisticians.
Game over man, game over.

CRS, Dr.P.H.

August 14, 2010 10:17 pm

(Mods, would you please change “school’s” to “schools” for me in the preceding post? I hate stupid grammatical errors! Thanks much, Chuck the DrPH)
[REPLY – I looked and looked. Can’t find the durn thing. Please accept a te absolve in lieu of correction. ~ Evan]
[Reply: Fixe’d! By the undercover grammar sleuth ~…]

Amino Acids in Meteorites

August 14, 2010 10:18 pm

Mike Roddy says:
August 14, 2010 at 7:13 pm
The authors of the 20-odd studies that confirmed Mann’s….
The NAS (National Academy of Science) did not affirm Mann’s conclusions:
“Even less confidence can be placed in the original conclusions by Mann et al. (1999) that “the 1990s are likely the warmest decade, and 1998 the warmest year, in at least a millennium” ”
National Academy of Science
“Surface Temperature Reconstructions for the Last 2,000 Years”
-page 4
http://www.nap.edu/catalog.php?record_id=11676

maksimovich

August 14, 2010 10:21 pm

G. Burger 2010
By avoiding the (calibrating) instrumental period, and by using a fairly robust spectral measure for low-frequency performance, the above coherence analysis has uncovered several inconsistencies among the group of millennial reconstructions that figured prominently in the latest IPCC report and elsewhere. An immediate lesson from this is that simple visual inspection of smoothed time series, grouped and overlaid into a single graph, can be very misleading. For example, the two reconstructions Ma99 and Ma08L, which have previously been described to be in “striking agreement” (cf. Mann et al., 2008), turned out to be the most incoherent of all in our analysis.
incoherent [ˌɪnkəʊˈhɪərənt]
adj
1. lacking in clarity or organization; disordered
2. unable to express oneself clearly; inarticulate
3. (Physics / General Physics) Physics (of two or more waves) having the same frequency but not the same phase

Amino Acids in Meteorites

August 14, 2010 10:22 pm

Tamino and Romm are deleting comments even mentioning this paper
How bloody scientific of them. 😉

James Sexton

August 14, 2010 10:24 pm

duckster says:
August 14, 2010 at 8:55 pm
duckster says:
August 14, 2010 at 9:20 pm
“So is this how you get around the fact that McShane and Wyner is showing almost 2 degrees of warming since 1850? This is way beyond what Mann et al show – and would be truly unprecedented, wouldn’t it?
……
OK. So your job now would be to show consistency by fitting it into the available evidence so that it doesn’t contradict the other points you have made against CAGW. There is no point at all in destroying Mann if you have to throw out half of the all the other things that have been said on this blog in order to do so.”
Sorry, I’ve been away, duckster. I’ll try and help explain things.
The graph that you’re looking at is a reconstruction of data using one of several statistical techniques employed by the paper in an “attempt” to determine whether the proxy data has any predictive value. The conclusion was that it doesn’t. From the paper:
“This is disturbing: if a model cannot predict the occurrence of a sharp run-up in an out-of-sample block which is contiguous with the in-sample training set, then it seems highly unlikely that it has power to detect such levels or run-ups in the more distant past. It is even more discouraging when one recalls Figure 15: the model cannot capture the sharp run-up even in-sample. In sum, these results suggest that the ninety-three sequences that comprise the 1,000 year old proxy record simply lack power to detect a sharp increase in temperature.”
I’ll interpret. It is saying, because it couldn’t detect the sharp increase in temperatures, as seen in the 1990s, there is no reason to believe it would detect sharp increases or decreases of the past.
duckster, I know this is hard, it’s probably like the time my first wife……..well, never mind that. But, I know where you’re coming from. Remember, these are reconstructions from proxies which the paper concluded were not of the quality necessary to have predictive (or retro) value. They use the graphs to show you why they are not of good value. They are not using them to illustrate some perceived view of reality.
You could try actually reading the darn thing. If you gloss over the statistical formulas, it is a fairly nice read.

Amino Acids in Meteorites

August 14, 2010 10:25 pm

It’s late Saturday night, this post has been here 4 1/2 hours, and there are 93 comments. Busy night for Anthony and the Moderators.
[REPLY – We, er, live for, um, danger. ~ Evan]

CRS, Dr.P.H.

August 14, 2010 10:25 pm

OK, the Real Climate guys are reacting to it!
From their Comments section:
There’s apparently a paper forthcoming from McShane and Wyner in Annals of Applied Statistics to the effect (in my inexpert paraphrase) that proxies can’t say anything useful about climate. Regardless of whether CO2 produces heat, I’ll bet that this paper will.
[Response: The M&W paper will likely take some time to look through (especially since it isn’t fully published and the SI does not seem to be available yet), but I’m sure people will indeed be looking. I note that one of their conclusions “If we consider rolling decades, 1997-2006 is the warmest on record; our model gives an 80% chance that it was the warmest in the past thousand years” is completely in line with the analogous IPCC AR4 statement. But this isn’t the thread for this, so let’s leave discussion for when there is a fuller appreciation for what’s been done. – gavin]
