NOTE: This has been running for two weeks at the top of WUWT and discussion has slowed, so I’m placing it back in the regular queue. – Anthony
UPDATES:
Statistician William Briggs weighs in here
Eduardo Zorita weighs in here
Anonymous blogger “Deep Climate” weighs in with what he/she calls a “deeply flawed study” here
After a week of being “preoccupied” Real Climate finally breaks radio silence here. It appears to be a prelude to a dismissal with a “wave of the hand”
Supplementary Info now available: All data and code used in this paper are available at the Annals of Applied Statistics supplementary materials website:
http://www.imstat.org/aoas/supplements/default.htm
=========================================
Sticky Wicket – phrase, meaning: “A difficult situation”.
Oh, my. There is a new and important study on temperature proxy reconstructions (McShane and Wyner 2010) that has been submitted to the Annals of Applied Statistics and is listed for publication in the next issue. According to Steve McIntyre, this is one of the “top statistical journals”. This paper is a direct and serious rebuttal to the proxy reconstructions of Mann. It seems watertight on the surface because, instead of attacking the proxy data quality issues, the authors assumed the proxy data was accurate for their purpose and then created a Bayesian backcast method. Then, using the proxy data, they demonstrate that it fails to reproduce the sharp 20th century uptick.
Now, there’s a new look to the familiar “hockey stick”.
Before:

After:

Not only are the results stunning, but the paper is highly readable, written in a sensible style that most laymen can absorb, even if they don’t understand some of the finer points of Bayesian methods, loess filters, or principal components. Not only that, this paper is a confirmation of McIntyre and McKitrick’s work, with a strong nod to Wegman. I highly recommend reading this and distributing this story widely.
Here’s the submitted paper:
(PDF, 2.5 MB. Backup download available here: McShane and Wyner 2010 )
It states in its abstract:
We find that the proxies do not predict temperature significantly better than random series generated independently of temperature. Furthermore, various model specifications that perform similarly at predicting temperature produce extremely different historical backcasts. Finally, the proxies seem unable to forecast the high levels of and sharp run-up in temperature in the 1990s either in-sample or from contiguous holdout blocks, thus casting doubt on their ability to predict such phenomena if in fact they occurred several hundred years ago.
Here are some excerpts from the paper (emphasis in paragraphs mine):
This one shows that M&M hit the mark, because it is independent validation:
In other words, our model performs better when using highly autocorrelated noise rather than proxies to “predict” temperature. The real proxies are less predictive than our “fake” data. While the Lasso generated reconstructions using the proxies are highly statistically significant compared to simple null models, they do not achieve statistical significance against sophisticated null models.
We are not the first to observe this effect. It was shown, in McIntyre and McKitrick (2005a,c), that random sequences with complex local dependence structures can predict temperatures. Their approach has been roundly dismissed in the climate science literature:
To generate “random” noise series, MM05c apply the full autoregressive structure of the real world proxy series. In this way, they in fact train their stochastic engine with significant (if not dominant) low frequency climate signal rather than purely non-climatic noise and its persistence. [Emphasis in original]
Ammann and Wahl (2007)
…
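For readers who want to see what “autocorrelated noise as a fake proxy” means in practice, here is a minimal sketch. It is not the paper's method: the authors used the Lasso on the real proxy network, while this toy swaps in ordinary least squares on simulated data. The function names, the AR(1) coefficient of 0.9, the ten noise series, and the 30-year holdout block are all assumptions chosen for illustration.

```python
import numpy as np

def ar1_series(n, phi, rng):
    """Simulate an AR(1) series x[t] = phi * x[t-1] + e[t], starting at zero."""
    x = np.zeros(n)
    e = rng.standard_normal(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + e[t]
    return x

def holdout_rmse(y, X, start, length):
    """Fit least squares outside a contiguous block, return RMSE inside it."""
    test = np.zeros(len(y), dtype=bool)
    test[start:start + length] = True
    # Add an intercept column, then fit on the training years only.
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A[~test], y[~test], rcond=None)
    pred = A[test] @ coef
    return float(np.sqrt(np.mean((y[test] - pred) ** 2)))

rng = np.random.default_rng(0)
n_years, n_series = 149, 10  # hypothetical sizes, not the paper's setup
temperature = ar1_series(n_years, 0.9, rng)  # stand-in for an instrumental record

# "Fake proxies": autocorrelated noise generated independently of temperature.
fake_proxies = np.column_stack(
    [ar1_series(n_years, 0.9, rng) for _ in range(n_series)]
)
# Simple null model: no predictors, so only the intercept (training mean) is fitted.
no_predictors = np.empty((n_years, 0))

rmse_fake = holdout_rmse(temperature, fake_proxies, start=60, length=30)
rmse_null = holdout_rmse(temperature, no_predictors, start=60, length=30)
print(f"fake proxies: {rmse_fake:.3f}, intercept only: {rmse_null:.3f}")
```

Rerunning with different seeds and block positions gives a feel for how often pure noise appears to have “skill”, which is the kind of null comparison the excerpt describes.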
On the power of the proxy data to actually detect climate change:
This is disturbing: if a model cannot predict the occurrence of a sharp run-up in an out-of-sample block which is contiguous with the in-sample training set, then it seems highly unlikely that it has power to detect such levels or run-ups in the more distant past. It is even more discouraging when one recalls Figure 15: the model cannot capture the sharp run-up even in-sample. In sum, these results suggest that the ninety-three sequences that comprise the 1,000 year old proxy record simply lack power to detect a sharp increase in temperature. See Footnote 12
Footnote 12:
On the other hand, perhaps our model is unable to detect the high level of and sharp run-up in recent temperatures because anthropogenic factors have, for example, caused a regime change in the relation between temperatures and proxies. While this is certainly a consistent line of reasoning, it is also fraught with peril for, once one admits the possibility of regime changes in the instrumental period, it raises the question of whether such changes exist elsewhere over the past 1,000 years. Furthermore, it implies that up to half of the already short instrumental record is corrupted by anthropogenic factors, thus undermining paleoclimatology as a statistical enterprise.
…
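The “contiguous holdout block” idea in the excerpt above can be sketched in a few lines. This is only an illustration of the splitting scheme, assuming a 30-year block length (the excerpt does not state one) and the 1850-1998 instrumental period mentioned in the paper; the function name is hypothetical.

```python
def contiguous_holdout_blocks(n_years, block_length):
    """Yield (train_indices, test_indices) for every contiguous holdout block."""
    for start in range(n_years - block_length + 1):
        end = start + block_length
        test = list(range(start, end))
        # Everything outside the held-out block is used for training.
        train = [i for i in range(n_years) if i < start or i >= end]
        yield train, test

# 1850-1998 is 149 years; the 30-year block length is an assumption.
years = list(range(1850, 1999))
blocks = list(contiguous_holdout_blocks(len(years), 30))
train, test = blocks[-1]
print(len(blocks), years[test[0]], years[test[-1]])
```

With these assumed values there are 120 possible blocks, and the last one holds out 1969 through 1998, so the model is trained without seeing the final run-up and then asked to predict it.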

We plot the in-sample portion of this backcast (1850-1998 AD) in Figure 15. Not surprisingly, the model tracks CRU reasonably well because it is in-sample. However, despite the fact that the backcast is both in-sample and initialized with the high true temperatures from 1999 AD and 2000 AD, it still cannot capture either the high level of or the sharp run-up in temperatures of the 1990s. It is substantially biased low. That the model cannot capture run-up even in-sample does not portend well for its ability to capture similar levels and run-ups if they exist out-of-sample.
…
Conclusion.
Research on multi-proxy temperature reconstructions of the earth’s temperature is now entering its second decade. While the literature is large, there has been very little collaboration with university-level, professional statisticians (Wegman et al., 2006; Wegman, 2006). Our paper is an effort to apply some modern statistical methods to these problems. While our results agree with the climate scientists findings in some respects, our methods of estimating model uncertainty and accuracy are in sharp disagreement.
On the one hand, we conclude unequivocally that the evidence for a ”long-handled” hockey stick (where the shaft of the hockey stick extends to the year 1000 AD) is lacking in the data. The fundamental problem is that there is a limited amount of proxy data which dates back to 1000 AD; what is available is weakly predictive of global annual temperature. Our backcasting methods, which track quite closely the methods applied most recently in Mann (2008) to the same data, are unable to catch the sharp run up in temperatures recorded in the 1990s, even in-sample.
As can be seen in Figure 15, our estimate of the run up in temperature in the 1990s has a much smaller slope than the actual temperature series. Furthermore, the lower frame of Figure 18 clearly reveals that the proxy model is not at all able to track the high gradient segment. Consequently, the long flat handle of the hockey stick is best understood to be a feature of regression and less a reflection of our knowledge of the truth. Nevertheless, the temperatures of the last few decades have been relatively warm compared to many of the thousand year temperature curves sampled from the posterior distribution of our model.
Our main contribution is our efforts to seriously grapple with the uncertainty involved in paleoclimatological reconstructions. Regression of high dimensional time series is always a complex problem with many traps. In our case, the particular challenges include (i) a short sequence of training data, (ii) more predictors than observations, (iii) a very weak signal, and (iv) response and predictor variables which are both strongly autocorrelated.
The final point is particularly troublesome: since the data is not easily modeled by a simple autoregressive process it follows that the number of truly independent observations (i.e., the effective sample size) may be just too small for accurate reconstruction.
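To give a rough sense of what “effective sample size” means here, the sketch below uses the textbook large-sample approximation for an AR(1) series, n_eff = n(1 - rho)/(1 + rho), where rho is the lag-1 autocorrelation. This is an assumption for illustration only: the excerpt itself says the data are not easily modeled by a simple autoregressive process, so the real situation may be more complicated than this formula suggests. The function names are hypothetical.

```python
import numpy as np

def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation of a one-dimensional series."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return float(np.sum(d[1:] * d[:-1]) / np.sum(d * d))

def effective_sample_size_ar1(n, rho):
    """Approximate number of independent observations for an AR(1) series.

    Uses n_eff = n * (1 - rho) / (1 + rho), the usual large-n approximation
    for the variance of the sample mean.
    """
    return n * (1.0 - rho) / (1.0 + rho)

# 149 annual values (1850-1998); rho = 0.9 is an assumed, not estimated, value.
print(effective_sample_size_ar1(149, 0.0))
print(effective_sample_size_ar1(149, 0.9))
```

Under this simplification, 149 strongly autocorrelated annual values carry about as much independent information as roughly eight uncorrelated ones, which is the flavor of the concern raised in point (iv).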
Climate scientists have greatly underestimated the uncertainty of proxy based reconstructions and hence have been overconfident in their models. We have shown that time dependence in the temperature series is sufficiently strong to permit complex sequences of random numbers to forecast out-of-sample reasonably well fairly frequently (see, for example, Figure 9). Furthermore, even proxy based models with approximately the same amount of reconstructive skill (Figures 11,12, and 13), produce strikingly dissimilar historical backcasts: some of these look like hockey sticks but most do not (Figure 14).
Natural climate variability is not well understood and is probably quite large. It is not clear that the proxies currently used to predict temperature are even predictive of it at the scale of several decades let alone over many centuries. Nonetheless, paleoclimatological reconstructions constitute only one source of evidence in the AGW debate. Our work stands entirely on the shoulders of those environmental scientists who labored untold years to assemble the vast network of natural proxies. Although we assume the reliability of their data for our purposes here, there still remains a considerable number of outstanding questions that can only be answered with a free and open inquiry and a great deal of replication.
===============================================================
Commenters on WUWT report that Tamino and Romm are deleting comments even mentioning this paper on their blog comment forum. Their refusal to even acknowledge it tells you it has squarely hit the target, and the fat lady has sung – loudly.
(h/t to WUWT reader “thechuckr”)

Nick Stokes @ 9:04pm
Oh dear Nick, a quotation from page 2 of the introduction putting the background in context and quoting from the IPCC. Is that the best you can do? There are another 43 pages after that or did you stop there?
Mind you, Blakeley McShane is from the Kellogg School of Management and is obviously funded by big corn.
It’s nice to see a publication in one of the “top statistical journals” even after all these years which agrees with what the vast majority of us here have known for so long: that the hockey team’s work is pure mince.
(“Mince”: A Scottish term which roughly translates as “garbage”.)
I can just picture Gavin Schmidt’s grandfather writing on his blog in the early 20th century: “Arrhenius disappoints”.
Oh Nick, let it go; CO2, in sufficient quantities “can force temperature increases”. We know that:
http://wattsupwiththat.files.wordpress.com/2009/07/co2_temperature_curve_saturation.png
The late S. Schneider knew that:
http://www.sciencemag.org/cgi/content/abstract/173/3992/138
It’s just that those forcings diminish to statistical errors, as does the temperature response.
The point about this [final] nail in the HS is that it shows whatever is happening today is not exceptional; that was the point about Mann’s HS and the basis of AGW; it was wrong. Find another cause; how about asteroid collisions? That’s a real issue.
To misquote Julius Henry (Groucho) Marx, amateurs should stick to brain surgery. Lightweight Math Mann was clearly out of his depth.
Looking at Fig. 16 above, it reinforces my belief that the descent into the next ice age started about 3500 years ago at the end of the Minoan warming. The GISP2 ice core shows a long term downslope during the last ice age of about 0.14C per 1000 years. The initial descent from the peak of the last interglacial was about 0.4C per 1000 years. Fig. 16 is steeper than that.
Michael says:
August 14, 2010 at 7:26 pm
Easily done – instead of one forecast, we’d have 100. that greatly increases the chance one is more accurate. Or do you propose getting everyone to agree on a single forecast. (Consensus forecasting?)
Somehow I have trouble visualizing a Board of Directors meeting discussing the relative merits of people who have spent 1000s of hours looking into many details over a group of mostly nameless, and uncontactable people.
There were three. None hit the US mainland, but your statement is 1000% wrong.
All right, infinitely wrong. 3 / 0 does not compute.
Poetic justice. The alarmists fiddle the temperature record to introduce a spurious temperature rise, which these statisticians trust as real, and so it becomes evidence that the other alarmist fiddle, the hockey stick, is ‘not robust’. That means, of course, that on the one hand those of us who seek truth rather than ideology must therefore have reservations about some of this paper’s results until the consequences of the temperature fiddle have been incorporated properly. On the other hand the shysters cannot consistently agree with our reservations! The irony of it!
Why should the community of climatologists object to this peer-reviewed publication? After all, they stood up & cheered on RC etc. when the Oxburgh inquiry exonerated the Hockey Team of professional malfeasance.
However, they could have used a few undergraduate classes in linear regression!
—
The panel found that the statistical tools that CRU scientists employed were not always the most cutting-edge, or most appropriate.
“We cannot help remarking that it is very surprising that research in an area that depends so heavily on statistical methods has not been carried out in close collaboration with professional statisticians,” reads the inquiry’s conclusions.
However, “it is not clear that better methods would have produced significantly different results,” the panel adds.
http://www.newscientist.com/article/dn18776-climategate-scientists-chastised-over-statistics.html
——
This latest publication seems to indicate that, yes, better statistical methods DO produce significantly different results!
This paper is huge, thanks for posting, Anthony!
duckster says:
August 14, 2010 at 9:20 pm
(…)
OK. So your job now would be to show consistency by fitting it into the available evidence so that it doesn’t contradict the other points you have made against CAGW. There is no point at all in destroying Mann if you have to throw out half of all the other things that have been said on this blog in order to do so.
—–Reply:
No, nothing else needs to be said–this is a refutation of Mann’s statistical methods; as such there is NO requirement to include anything else. Your request is simply an obvious attempt to deflect a very damning rebuttal of Mann’s mathematical acumen. His authority is over; he IS destroyed and with him goes CAGW. Gone; done; kaput.
Why? He lied. Or he was stupid. Your choice.
I wonder when the Real Climate team will slam out a response without an answer so their lemmings have a url to point to and declare this paper debunked.
Fortunately many of us realize that reality isn’t a function of assertion.
While some may not see the humor in this, to me, it is side splitting, a final (b)slap……….
Our work stands entirely on the shoulders of those environmental scientists who labored untold years to assemble the vast network of natural proxies. Although we assume the reliability of their data for our purposes here, there still remains a considerable number of outstanding questions that can only be answered with a free and open inquiry and a great deal of replication. <———— lol, Phil's greatest fears realized.
In other words, they seem to be saying, “YOU’RE DOING IT WRONG!!!” And, “you had your chance, now the grown-ups have to do it.” “Now, run along and bring me back the thermometer readings and we’ll show you how to interpret them.”
Well, in light of all this, I am open to comments on why deriving climate sensitivity from the LGM is ok.
I also would like to reiterate that we can predict temperature just fine using the tools given us by meteorology and would like to know why it is necessary to throw out any of the data used in weather prediction when it comes to climate models. Anyone?
Anthony- I think Figure 17 from the paper is actually very telling, since it overlays the newly estimated error bands on top of the archival hockey stick spaghetti graphs.
Really stunning. Going back more than a few centuries, the error bands fill up the entire vertical extent!
Annals of Applied Statistics is the sixth rated statistics journal (impact factor, of course, = 2.57). Of course, it could be tops in its specialty. It looks like it has a heavy representation of Japanese sponsors and some major statistics departments. The editor-in-chief is a Bush-era National Science medal award winner. The editor for physical & environmental statistics actually looks like an environmentalist from his listed interests. So, fine journal with mixed viewpoints.
This is an interesting development because it leaves the alarmist professors an out that will allow them to suspend their claims and still receive further funding. You may have won this battle if they take the offer. I don’t think you’ve won the war. That will resume when the cold snap is over and people have forgotten about scandals and such.
I always like to know who wrote what I’m asked to read. I don’t like to feel like I am part of the mushroom syndrome. A new name gets a couple of chances – if I think the comments are untracked I first try to find out whether the writer has any respectability because my lack of understanding could be the problem. Then I could do some research and reading and better assimilate the new information. After reading Mike Roddy’s statements I felt the need to check on him. The second of the listings below seems the most likely to be knowledgeable about his trade. Henceforth, I will only read Jaguar related posts by a Mike Roddy. Climate related comments by Mike Roddy – I don’t think so.
Okay, will the real Mike Roddy please stand up.
Mike Roddy is a long-time CP commenter. A UC Berkeley graduate, he has pursued many careers, including solar manufacturing, writing and research, and managing social housing projects on four continents.
OR
Mike Roddy Motors’ The Independent Jaguar Specialists’
The Leader in all aspects of Servicing, Repairs, Restorations and Improvements for all makes of Jaguar cars.
That poor graph – it’s suffered a death by a thousand cuts and yet it still has stalwart defenders. I hope Mann has other successes on which to ride to comfortable retirement because this horse is finished. I can’t help but think his circle of peers is becoming a close knit bunch whose objectivity is certainly now open to wonderment.
I’m probably going to wish there was a preview function here…
GrantB says:
August 14, 2010 at 9:32 pm
Nick Stokes @ 9:04pm
Oh dear Nick, a quotation from page 2 of the introduction putting the background in context and quoting from the IPCC. Is that the best you can do? There are another 43 pages after that or did you stop there?
Mind you, Blakeley McShane is from the Kellogg School of Management and is obviously funded by big corn.
————-
REPLY:
Sorry, mate, I’m at University of Illinois and I’M funded by big corn!! And big cheese, big meat packer, big pandemic etc.
Here’s McShane’s website:
http://www.blakemcshane.com
I’ve never met him, but I’ve lectured a bit over at Kellogg & they are usually considered one of the top graduate schools of business in the USA. His resume is very impressive.
This publication is a serious shot across the bow of the Hockey Team crowd, let’s see how they react to it.
Annals of Applied Statistics Editors better prepare for the incoming wave of team science comments on the paper. The good news is the team now has to argue their statistical “methods” with professional statisticians.
Game over man, game over.
(Mods, would you please change “school’s” to “schools” for me in the preceding post? I hate stupid grammatical errors! Thanks much, Chuck the DrPH)
[REPLY – I looked and looked. Can’t find the durn thing. Please accept a te absolve in lieu of correction. ~ Evan]
[Reply: Fixe’d! By the undercover grammar sleuth ~…]
Mike Roddy says:
August 14, 2010 at 7:13 pm
The authors of the 20-odd studies that confirmed Mann’s….
The NAS (National Academy of Science) did not affirm Mann’s conclusions:
“Even less confidence can be placed in the original conclusions by Mann et al. (1999) that “the 1990s are likely the warmest decade, and 1998 the warmest year, in at least a millennium” ”
National Academy of Science
“Surface Temperature Reconstructions for the Last 2,000 Years”
-page 4
http://www.nap.edu/catalog.php?record_id=11676
G. Burger 2010
By avoiding the (calibrating) instrumental period, and by using a fairly robust spectral measure for low-frequency performance, the above coherence analysis has uncovered several inconsistencies among the group of millennial reconstructions that figured prominently in the latest IPCC report and elsewhere. An immediate lesson from this is that simple visual inspection of smoothed time series, grouped and overlaid into a single graph, can be very misleading. For example, the two reconstructions Ma99 and Ma08L, which have previously been described to be in “striking agreement” (cf. Mann et al., 2008), turned out to be the most incoherent of all in our analysis.
incoherent [ˌɪnkəʊˈhɪərənt]
adj
1. lacking in clarity or organization; disordered
2. unable to express oneself clearly; inarticulate
3. (Physics / General Physics) Physics (of two or more waves) having the same frequency but not the same phase
Tamino and Romm are deleting comments even mentioning this paper
How bloody scientific of them. 😉
duckster says:
August 14, 2010 at 8:55 pm
duckster says:
August 14, 2010 at 9:20 pm
“So is this how you get around the fact that McShane and Wyner is showing almost 2 degrees of warming since 1850? This is way beyond what Mann et al show – and would be truly unprecedented, wouldn’t it?
……
OK. So your job now would be to show consistency by fitting it into the available evidence so that it doesn’t contradict the other points you have made against CAGW. There is no point at all in destroying Mann if you have to throw out half of all the other things that have been said on this blog in order to do so.”
Sorry, I’ve been away, duckster. I’ll try and help explain things.
The graph that you’re looking at is a reconstruction of data using one of several statistical techniques employed by the paper in an “attempt” to determine whether the proxy data has any predictive value. The conclusion was that it doesn’t. From the paper:
“This is disturbing: if a model cannot predict the occurrence of a sharp run-up in an out-of-sample block which is contiguous with the in-sample training set, then it seems highly unlikely that it has power to detect such levels or run-ups in the more distant past. It is even more discouraging when one recalls Figure 15: the model cannot capture the sharp run-up even in-sample. In sum, these results suggest that the ninety-three sequences that comprise the 1,000 year old proxy record simply lack power to detect a sharp increase in temperature.”
I’ll interpret. It is saying, because it couldn’t detect the sharp increase in temperatures, as seen in the 1990s, there is no reason to believe it would detect sharp increases or decreases of the past.
duckster, I know this is hard, it’s probably like the time my first wife……..well, never mind that. But, I know where you’re coming from. Remember, these are reconstructions from proxies which the paper concluded were not of the quality necessary to have predictive (or retro) value. They use the graphs to show you why they are not of good value. They are not using them to illustrate some perceived view of reality.
You could try actually reading the darn thing. If you gloss over the statistical formulas, it is a fairly nice read.
It’s late Saturday night, this post has been here 4 1/2 hours, and there are 93 comments. Busy night for Anthony and the Moderators.
[REPLY – We, er, live for, um, danger. ~ Evan]
OK, the Real Climate guys are reacting to it!
From their Comments section:
There’s apparently a paper forthcoming from McShane and Wyner in Annals of Applied Statistics to the effect (in my inexpert paraphrase) that proxies can’t say anything useful about climate. Regardless of whether CO2 produces heat, I’ll bet that this paper will.
[Response: The M&W paper will likely take some time to look through (especially since it isn’t fully published and the SI does not seem to be available yet), but I’m sure people will indeed be looking. I note that one of their conclusions “If we consider rolling decades, 1997-2006 is the warmest on record; our model gives an 80% chance that it was the warmest in the past thousand years” is completely in line with the analogous IPCC AR4 statement. But this isn’t the thread for this, so let’s leave discussion for when there is a fuller appreciation for what’s been done. – gavin]