NOTE: This has been running two weeks at the top of WUWT, discussion has slowed, so I’m placing it back in the regular queue. – Anthony
UPDATES:
Statistician William Briggs weighs in here
Eduardo Zorita weighs in here
Anonymous blogger “Deep Climate” weighs in with what he/she calls a “deeply flawed study” here
After a week of being “preoccupied” Real Climate finally breaks radio silence here. It appears to be a prelude to a dismissal with a “wave of the hand”
Supplementary Info now available: All data and code used in this paper are available at the Annals of Applied Statistics supplementary materials website:
http://www.imstat.org/aoas/supplements/default.htm
=========================================
Sticky Wicket – phrase, meaning: “A difficult situation”.
Oh, my. There is a new and important study on temperature proxy reconstructions (McShane and Wyner 2010) submitted to the Annals of Applied Statistics and listed to be published in the next issue. According to Steve McIntyre, this is one of the “top statistical journals”. This paper is a direct and serious rebuttal to the proxy reconstructions of Mann. It seems watertight on the surface, because instead of trying to attack the proxy data quality issues, the authors assumed the proxy data was accurate for their purpose, then created a Bayesian backcast method. Then, using the proxy data, they demonstrate that it fails to reproduce the sharp 20th century uptick.
Now, there’s a new look to the familiar “hockey stick”.
Before:

After:

Not only are the results stunning, but the paper is highly readable, written in a sensible style that most laymen can absorb, even if they don’t understand some of the finer points of Bayesian and loess filters, or principal components. Not only that, this paper is a confirmation of McIntyre and McKitrick’s work, with a strong nod to Wegman. I highly recommend reading this and distributing this story widely.
Here’s the submitted paper:
(PDF, 2.5 MB. Backup download available here: McShane and Wyner 2010 )
It states in its abstract:
We find that the proxies do not predict temperature significantly better than random series generated independently of temperature. Furthermore, various model specifications that perform similarly at predicting temperature produce extremely different historical backcasts. Finally, the proxies seem unable to forecast the high levels of and sharp run-up in temperature in the 1990s either in-sample or from contiguous holdout blocks, thus casting doubt on their ability to predict such phenomena if in fact they occurred several hundred years ago.
Here are some excerpts from the paper (emphasis in paragraphs mine):
This one shows that M&M hit the mark, because it is independent validation:
In other words, our model performs better when using highly autocorrelated noise rather than proxies to “predict” temperature. The real proxies are less predictive than our “fake” data. While the Lasso generated reconstructions using the proxies are highly statistically significant compared to simple null models, they do not achieve statistical significance against sophisticated null models.
We are not the first to observe this effect. It was shown, in McIntyre and McKitrick (2005a,c), that random sequences with complex local dependence structures can predict temperatures. Their approach has been roundly dismissed in the climate science literature:
To generate “random” noise series, MM05c apply the full autoregressive structure of the real world proxy series. In this way, they in fact train their stochastic engine with significant (if not dominant) low frequency climate signal rather than purely non-climatic noise and its persistence. [Emphasis in original]
Ammann and Wahl (2007)
…
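The comparison against “sophisticated null models” described in the excerpt above can be illustrated with a toy sketch. This is not the authors’ code: the paper uses the Lasso on many proxies, while this sketch swaps in a single-predictor least-squares fit and simple AR(1) noise as the null. The function names, the parameter `phi`, and the train/test split are all assumptions made for illustration.

```python
import math
import random


def ar1_series(n, phi, rng):
    """Simulate n steps of AR(1) noise: x[t] = phi * x[t-1] + e[t]."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def fit_line(x, y):
    """Ordinary least squares for y ~ a + b*x; returns (a, b)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b


def holdout_rmse(predictor, target, n_train):
    """Fit on the first n_train points, score RMSE on the remainder."""
    a, b = fit_line(predictor[:n_train], target[:n_train])
    errs = [(a + b * p - t) ** 2
            for p, t in zip(predictor[n_train:], target[n_train:])]
    return math.sqrt(sum(errs) / len(errs))


def null_rmse_distribution(target, n_train, phi, n_sims, seed=0):
    """Holdout RMSEs from AR(1) noise generated independently of the target."""
    rng = random.Random(seed)
    return [holdout_rmse(ar1_series(len(target), phi, rng), target, n_train)
            for _ in range(n_sims)]


def fraction_of_noise_at_least_as_good(proxy_rmse, null_rmses):
    """Share of noise series whose holdout RMSE is <= the proxy's RMSE."""
    return sum(r <= proxy_rmse for r in null_rmses) / len(null_rmses)
```

If a real proxy’s holdout RMSE sits well inside the distribution returned by `null_rmse_distribution`, then by this toy measure the proxy is not doing significantly better than temperature-independent noise, which is the kind of result the excerpt describes.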
On the power of the proxy data to actually detect climate change:
This is disturbing: if a model cannot predict the occurrence of a sharp run-up in an out-of-sample block which is contiguous with the in-sample training set, then it seems highly unlikely that it has power to detect such levels or run-ups in the more distant past. It is even more discouraging when one recalls Figure 15: the model cannot capture the sharp run-up even in-sample. In sum, these results suggest that the ninety-three sequences that comprise the 1,000 year old proxy record simply lack power to detect a sharp increase in temperature. See Footnote 12
Footnote 12:
On the other hand, perhaps our model is unable to detect the high level of and sharp run-up in recent temperatures because anthropogenic factors have, for example, caused a regime change in the relation between temperatures and proxies. While this is certainly a consistent line of reasoning, it is also fraught with peril for, once one admits the possibility of regime changes in the instrumental period, it raises the question of whether such changes exist elsewhere over the past 1,000 years. Furthermore, it implies that up to half of the already short instrumental record is corrupted by anthropogenic factors, thus undermining paleoclimatology as a statistical enterprise.
…

We plot the in-sample portion of this backcast (1850-1998 AD) in Figure 15. Not surprisingly, the model tracks CRU reasonably well because it is in-sample. However, despite the fact that the backcast is both in-sample and initialized with the high true temperatures from 1999 AD and 2000 AD, it still cannot capture either the high level of or the sharp run-up in temperatures of the 1990s. It is substantially biased low. That the model cannot capture run-up even in-sample does not portend well for its ability to capture similar levels and run-ups if they exist out-of-sample.
…
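The “contiguous holdout block” idea in the excerpts above can be sketched roughly as follows. This is a simplified illustration, not the paper’s procedure: it uses one predictor and a straight-line fit, and `block_len` and the function names are hypothetical. Each block of consecutive years is withheld in turn, the model is fitted on the remaining years, and the withheld block is scored.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y ~ a + b*x; returns (a, b)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b


def block_holdout_rmses(predictor, target, block_len):
    """For each contiguous block, fit on all other years and score the block.

    predictor and target are equal-length lists ordered by year.
    Returns a list of (block_start_index, rmse) pairs.
    """
    n = len(target)
    results = []
    for start in range(0, n - block_len + 1):
        stop = start + block_len
        # Training set: every year outside the withheld block.
        train_x = predictor[:start] + predictor[stop:]
        train_y = target[:start] + target[stop:]
        a, b = fit_line(train_x, train_y)
        sq = [(a + b * predictor[i] - target[i]) ** 2
              for i in range(start, stop)]
        results.append((start, math.sqrt(sum(sq) / block_len)))
    return results
```

A block covering a sharp run-up at the end of the record is the hard case: the fit never sees those years, so a large RMSE on that final block is the sort of failure the authors call disturbing.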
Conclusion.
Research on multi-proxy temperature reconstructions of the earth’s temperature is now entering its second decade. While the literature is large, there has been very little collaboration with university-level, professional statisticians (Wegman et al., 2006; Wegman, 2006). Our paper is an effort to apply some modern statistical methods to these problems. While our results agree with the climate scientists’ findings in some respects, our methods of estimating model uncertainty and accuracy are in sharp disagreement.
On the one hand, we conclude unequivocally that the evidence for a “long-handled” hockey stick (where the shaft of the hockey stick extends to the year 1000 AD) is lacking in the data. The fundamental problem is that there is a limited amount of proxy data which dates back to 1000 AD; what is available is weakly predictive of global annual temperature. Our backcasting methods, which track quite closely the methods applied most recently in Mann (2008) to the same data, are unable to catch the sharp run up in temperatures recorded in the 1990s, even in-sample.
As can be seen in Figure 15, our estimate of the run up in temperature in the 1990s has a much smaller slope than the actual temperature series. Furthermore, the lower frame of Figure 18 clearly reveals that the proxy model is not at all able to track the high gradient segment. Consequently, the long flat handle of the hockey stick is best understood to be a feature of regression and less a reflection of our knowledge of the truth. Nevertheless, the temperatures of the last few decades have been relatively warm compared to many of the thousand year temperature curves sampled from the posterior distribution of our model.
Our main contribution is our efforts to seriously grapple with the uncertainty involved in paleoclimatological reconstructions. Regression of high dimensional time series is always a complex problem with many traps. In our case, the particular challenges include (i) a short sequence of training data, (ii) more predictors than observations, (iii) a very weak signal, and (iv) response and predictor variables which are both strongly autocorrelated.
The final point is particularly troublesome: since the data is not easily modeled by a simple autoregressive process it follows that the number of truly independent observations (i.e., the effective sample size) may be just too small for accurate reconstruction.
Climate scientists have greatly underestimated the uncertainty of proxy based reconstructions and hence have been overconfident in their models. We have shown that time dependence in the temperature series is sufficiently strong to permit complex sequences of random numbers to forecast out-of-sample reasonably well fairly frequently (see, for example, Figure 9). Furthermore, even proxy based models with approximately the same amount of reconstructive skill (Figures 11,12, and 13), produce strikingly dissimilar historical backcasts: some of these look like hockey sticks but most do not (Figure 14).
Natural climate variability is not well understood and is probably quite large. It is not clear that the proxies currently used to predict temperature are even predictive of it at the scale of several decades let alone over many centuries. Nonetheless, paleoclimatological reconstructions constitute only one source of evidence in the AGW debate. Our work stands entirely on the shoulders of those environmental scientists who labored untold years to assemble the vast network of natural proxies. Although we assume the reliability of their data for our purposes here, there still remains a considerable number of outstanding questions that can only be answered with a free and open inquiry and a great deal of replication.
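The “effective sample size” remark in the conclusion above can be made concrete with a standard textbook approximation. For estimating the mean of an AR(1) series with lag-1 autocorrelation rho, n observations carry roughly n(1 - rho)/(1 + rho) independent observations’ worth of information. This is only a sketch: the authors say explicitly that the data are not easily modeled by a simple autoregressive process, so the AR(1) form is an assumption made here for illustration, and the function names are hypothetical.

```python
def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence of numbers."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(n - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den if den > 0 else 0.0


def effective_sample_size(n, rho):
    """Approximate independent-sample equivalent of n AR(1) observations.

    Assumes an AR(1) process with lag-1 autocorrelation rho in (-1, 1);
    this is an illustrative simplification, not the paper's model.
    """
    return n * (1.0 - rho) / (1.0 + rho)
```

With rho = 0.9, for instance, a 150-point series behaves like roughly 8 independent points under this approximation, which gives a feel for why strong autocorrelation makes a short training record even shorter in practice.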
===============================================================
Commenters on WUWT report that Tamino and Romm are deleting comments even mentioning this paper on their blog comment forum. Their refusal to even acknowledge it tells you it has squarely hit the target, and the fat lady has sung – loudly.
(h/t to WUWT reader “thechuckr”)

“If we consider rolling decades, 1997-2006 is the warmest on record; our model gives an 80% chance that it was the warmest in the past thousand years” is completely in line with the analogous IPCC AR4 statement. But this isn’t the thread for this, so let’s leave discussion for when there is a fuller appreciation for what’s been done. – gavin]”
===============================================
This paper is not saying what Gavin wants it to say.
This paper has absolutely nothing to do with proving or disproving temps and has absolutely nothing to do with being in line with anything the IPCC says about temps.
The IPCC statement is based on assuming that certain temp prox are accurate and that modeling of that data are also correct.
This paper is assuming that Mann’s temp prox are accurate.
This paper is just showing what Mann did with his data.
It is showing that of all the models runs that Mann did, he had to pick the one that showed what he wanted it to show.
This paper is showing that even using Mann’s own numbers, they could not reproduce his results.
As I understood the M&W paper (and I’m willing to be corrected, as IANAS), it describes a sophisticated method for analyzing this kind of data. They use what is presumably one of the most comprehensive data sets out there, and proceed to demonstrate what their method can do. The predictions are made simply to demonstrate that their method is capable of making them. The backcasts, similarly, are done because that is an appropriate and important step.
They verify that their results are in several ways consistent with other methods – this is also a necessary step when describing a new method. The big difference compared to previous work is that M&W’s analysis dramatically increases the error bars, showing that the data set in question has no predictive value to speak of.
If it holds up, it is a great contribution. Future work on proxy reconstructions could apply this method and produce analyses with much better predictive force. The thing that is bound to happen is that you double back and re-assess your data and underlying assumptions, when your sophisticated statistical analysis tells you that your results do not match reality.
Generally speaking, this sort of advance does not necessarily cast previous work in disrepute, even though it may overturn their conclusions. Authors of previous work can, OTOH, cast themselves in disrepute by refusing to accept that their results were wrong, even if confronted with convincing evidence.
Dang. Wish I’d seen this earlier. I hate to be at the end of a few hundred comments. Oh well. I do have a couple thoughts.
First, the new graph showing what Mann’s data turns out when the math is done correctly is still a hockey stick. The blade looks like it lost its size enhancer. The shaft is now tilted up from being flat. But it still looks like a hockey stick to me. The end of the shaft at year 1000 appears to be higher than the short blade.
If we keep in mind that this graph shows bad statistics and not reality, we can still have a bit of fun with it. At the bottom of the LIA, if we take the warmist view that the industrial revolution accounts for the upturn, then we might be able to posit that CO2 saved us from a developing ice age. Alternatively, the uptick that is the blade is simply temperatures returning to normal, not the effect of trace amounts of CO2. Another point to make about the graph is that it only shows 1000 years which, in geologic time frames is an extremely tiny period. I’m hoping these guys take on the data selection next as I’d really like to see what they have to say about that.
At this point I would avoid triumphalism. Just as I’d like to see evidence from the warmists that is replicable, I’d like to see what other statisticians have to say about this work. While I’m a skeptic of CAWG because of all the bad science and politics pretending to be science, it still could be true. If so, the so called scientists have really hurt the cause they claim to be supporting by losing the trust of the public with their dishonest techniques for both getting the results they got as well as trying to pass it off as credible. There will be attacks on this new paper, and it will be interesting to see what they are and whether or not they have any credibility. Watching the fat lady sing would be fun, but I’m not sure the CAGW crowd doesn’t have an encore or two first.
And last, I think it’s worth contemplating that if this new paper holds up, it will be fun to point out that all the papers that advocate for warming were peer reviewed extensively and nobody caught the problems. It won’t help the credibility of peer review any. The next little while should be lots of fun.
@Latimer Alder says August 15, 2010 at 6:32 am:
This paper shows 2 things:
1. The amount of manipulation done to the numbers has created a(n artificial) dataset that is unusable. This is really remarkable in science. Millions of datapoints, and the data cannot be used to extrapolate anything. The buggery part essentially, according to the paper, is post-1990 period. They point to it again and again as something they simply cannot get to work. This should be running up red flags about the instrument data from after 1990.
2. Mann did not know what he was doing. YES, the “science” at the CRU/Mann level IS 100% statistics, yet – as pointed out in the paper – there are not enough scientific statisticians working in the field of global warming. Climatologists should not be doing their own statistics. (Pay attention to HARRY_READ_ME.txt) Statisticians should not be out collecting tree rings or ice cores.
Much of the heat energy from AGW is in the oceans.
Yet SST has increased less than land surface. Sea level rise is perhaps 8 inches over the last century. (Uplifting/subsiding, eroding/silting areas make this difficult to calculate. And coral atolls tend to go with the flow, literally rising with the tide, as it were.)
The “big six” and other oceanic/atmospheric cycles (PDO, NAO, etc., etc.) appear to be much involved.
I tend to be more of a sea witch than a sun worshiper, myself. So be careful when you look at the trend since the late 1970s. All six (and more) of those cycles were simultaneously in cool phase. From 1979 – 2001, they all went from cool to warm, one at a time. On (natural) schedule. And now one or two are beginning to stagger and revert to cool, the PDO being pack leader.
So the next couple of decades are going to tell us a lot. (But I will also keep an eye on solar cycle 24, just in case!)
Don’t forget the implications this has for co2 records. They are much less certain.
henry@Evan Jones
On what measurements do you base your belief that CO2 is a greenhouse gas, i.e. that its warming properties are greater than its cooling properties?
Mike Roddy says at 7:44 am:
Thanks, Mike, for your uninformed opinion. The fact is, however, that Dr Wegman is an internationally recognized statistician. His C.V. [click on his name] lists his personal interests at the end — none of which is related to climate issues. Dr Wegman is neutral on the subject. But he is not neutral on the improper use of statistics.
One of the central criticisms of Michael Mann’s CAGW clique is their amateurish, incompetent and self-serving use of statistics. They do not understand statistics. Mann refuses to use R because it does not validate his hokey stick chart. He programs in Fortran, which is akin to an English major writing in ancient Sumerian cuneiform.
The fact that Mike Roddy tries to excuse Mann’s shenanigans by referring to the climate pal review system that Mann controls only shows how thoroughly corrupt the climate peer review system and the Michael Mann clique are.
Without proper statistical verification, tree ring proxy studies are not worth the pixels on a computer screen — and that is why Mann and this tax-sucking clique run and hide out from real statisticians, and why the UN/IPCC refuses to allow any unbiased statisticians to review its CAGW sales brochures.
GeoFlynx says:
August 15, 2010 at 9:40 am
“The paper is referred to as McShane and Wyner 2010, but the data on their graphs end at the year 2000. Has the “hottest decade on record” been omitted?”
Geo, remember that this paper’s purpose was to detect if the proxies had any predictive capabilities. They used the instrumental data (CRU N.H.) to determine if the proxy data held true to the temps. The reason for omitting the data beyond 2000 is because there is almost no proxy data after 2000, so one can’t compare instrumental data to proxy data that doesn’t exist. Geo, you and others should note, the paper isn’t stating what was or wasn’t the temps of the past, they were only checking if the proxy data could predict or, conversely, detect temperatures if the proper statistical methods were applied. Apparently the answer is no. This is a pretty innocuous statement. The implications, however, are not innocuous. Specifically, if your name is Mann. But he’s not the only one caught in the “lasso” (heh, I made a punny!). Any modeling made from the conclusions of the paleo-science specific to recent climatology is in question. So, as our friend Mike Roddy has pointed out, there are about 20 other scientists whose work is called into question. Mostly because they believed in the validity of statistical methods they employed. At least I hope they believed in them. They probably should have taken some of their work to a statistician. But then, it may have invalidated their studies, so they didn’t. Recollect, one of the hallmarks of a pseudo-science is “Lack of openness to testing by other experts.” There are several other hallmarks, and current CAGW climatology seems to fit perfectly.
Andrew has summarized many of the salient points of this paper. Nevertheless, there is one point that I feel deserves a little more emphasis (from page 38)
“…the fact that the proxies seem unable to capture the sharp run-up in temperature of the 1990s.”
The overall goal of a proxy is to estimate the temperature series in years before direct records of temperature exist. Once one reconstructs such a series, one can look at it to answer a number of questions. One such question:
“Is there evidence in the (reconstructed) temperature series of examples in the past of sharp run-ups in temperature, similar to what has been observed in the last half century?”
Looking at the series for such evidence implicitly accepts that if those run-ups occurred, they would be evident in the reconstruction.
Is that assumption valid?
We have only one period where it is known that such a temperature run-up occurred, and the authors tell us that the proxy measures don’t identify it.
If the only known example of a temperature run-up isn’t manifested in the proxy data, why on earth would you assume that past temperature run-ups would be captured in the proxy data?
Anyone using the proxy data to reject the assumption that there were temperature spikes in the past is guilty of making an assumption expressly rejected by the data.
Now, now, smokes, be nice.
But he’s right about the statistics, Mike.
Wegman is tops in his field. And at the Wegman hearings, Mann (IIRC; might have been one of the others) proudly declaimed he was not a statistician.
That does not bode well for what amounts to an involved statistical study (‘way out of my league).
The simple and acknowledged failure of the proxies to match observed temperature changes from the 60’s onward should have been quite sufficient on its own to demonstrate that the proxies were unsuitable for the purpose of comparing the present with the past. It is sad that it has taken so much time and effort to unravel the deceit.
This gives a whole new meaning to ‘hide the decline’ and the ‘nature trick’.
Those strategies were clearly intended to avoid the clear implication that the proxies were an unsuitable starting point from which to assess the significance of current ongoing temperature variations.
If they had then accepted the obvious then their careers and the whole concept of AGW would have ended at that point because without using the available proxy evidence no recent temperature measurements could ever have been said to be in any way unusual.
The truth always gets out and here it is.
sorry but OT.
Niwa sued over data accuracy
NZPA Last updated 16:09 15/08/2010
The country’s state-owned weather and atmospheric research body is being taken to court in a challenge over the accuracy of its data used to calculate global warming.
The New Zealand Climate Science Coalition said it had lodged papers with the High Court asking the court to invalidate the official temperatures record of the National Institute of Water and Atmospheric Research (Niwa).
The lobby of climate sceptics and ACT Party have long criticised Niwa over its temperature data, which Niwa says is mainstream science and not controversial, and the raw data publicly available.
The coalition said the New Zealand Temperature Records (NZTR) were the historical base of NIWA’s advice to the Government on issues relating to climate change.
Coalition spokesman Bryan Leyland said many scientists believed although the earth had been warming for 150 years, it had not heated as much as Government archives claimed.
He said the New Zealand Meteorological Service had shown no warming during the past century but Niwa had adjusted its records to show a warming trend of 1degC. The warming figure was high and almost 50 percent above the global average, said Mr Leyland.
The coalition said the 1degC warming during the 20th century was based on adjustments taken by Niwa from a 1981 student thesis by then student Jim Salinger, a Niwa employee who was later sacked after talking to the media without permission.
The Salinger thesis was subjective and untested and meteorologists more senior to Dr Salinger did not consider the temperature data should be adjusted, it said.
The coalition would ask the court to find Niwa’s New Zealand Temperature Record invalid.
It would also seek a court declaration preventing Niwa from using the NZTR when it advised the Government or any other body on global climate issues. It would also ask the court to order Niwa to produce a full and accurate NZTR.
Mr Leyland said Niwa was refusing to repudiate the NZTR to avoid political embarrassment and loss of public confidence.
A substantive hearing was expected later this year.
http://www.stuff.co.nz/national/4026330/Niwa-sued-over-data-accuracy
Gavin now has a live link to the .pdf download of this paper on Real Climate. I think we’ve caught his attention! I’ve been watching awareness of this paper evolve over there for the past day or so, from “We’ve heard about the paper” to “here’s the link”:
[Response: The M&W paper will likely take some time to look through (especially since it isn’t fully published and the SI does not seem to be available yet), but I’m sure people will indeed be looking. I note that one of their conclusions “If we consider rolling decades, 1997-2006 is the warmest on record; our model gives an 80% chance that it was the warmest in the past thousand years” is completely in line with the analogous IPCC AR4 statement. But this isn’t the thread for this, so let’s leave discussion for when there is a fuller appreciation for what’s been done. – gavin]
See this posting at “Expert Credibility in Climate Change – Responses to Comments”
Filed under: Climate Science skeptics – group @ 3 August 2010
Statisticians now emphasize the importance of involving them more
in e.g. proxy reconstructions. Quite rightly.
But the accusation that Mann and others neglected to do so, just to be
able to manipulate and distort results: I don’t believe it.
It is more a question of tradition in routine science: you would
perhaps consult a statistics expert for general advice, but mostly not really
integrate him/her in the team. For a variety of reasons: (1) you don’t see
all risks of faulty application, not being an expert, (2) you may not have
the funds reserved in the project budget, (3) you can see that the expert
is bugged by so many other teams (personal experience) etc.
I think you have to look at it historically, the science projects have
grown in the past decades, both in complexity, scope and also concerning
the stakes from a societal viewpoint.
So to sum up:
The almighty Hockey Stick was derived from:
1. Manipulated, mangled, cherry picked data, and
2. The statistical methodology it uses is somewhere between highly suspect and very wrong.
As a scientist, I strenuously object to use of the term “climate scientist”, as it suggests these people actually practice real science.
On CA 1, Patrick Hadley had this very interesting comment:
“Posted Aug 15, 2010 at 10:16 AM | Permalink | Reply
Professor Wyner http://climateaudit.org/2010/08/14/mcshane-and-wyner-2010/#comment-239212 tells us that The paper has been accepted, but publication is still a bit into the future as it is likely to be accompanied by invited discussants and comment.
It seems likely that Michael Mann would be one of the invited discussants, and hence that the Hockey Team have been well aware of this paper for some time. If that is the case then one can understand why Gavin et al have been so uninterested in discussions about the proxies recently, and have been playing down the importance of the hockey stick.”
Smokey says:
“You don’t understand. Nothing was ‘omitted.’ The data used was the exact same data that Mann used. This paper corrects the bogus, self-serving ‘statistics’ that Mann has been spoon feeding the credulous believers in CAGW.”
GeoFlynx – Actually I understand quite well. The title of the paper is “A Statistical Analysis of Multiple Temperature Proxies: Are Reconstructions of Surface Temperatures Over the Last 1000 Years Reliable?” and the work addresses “hockey stick” graphs from a variety of NORTH AMERICAN (not global) reconstructions, many with more modern dates than the Mann graph you refer to.
When this paper concludes, “Nevertheless, the temperatures of the last few decades have been relatively warm compared to many of the thousand year temperature curves sampled from the posterior distribution of our model. “, one can only question why the most recent decade was omitted. Given that graphs, where the 2000 data limit occurs, are not direct comparison with the Mann 1998 data and that the change would be slight, I again raise the question.
I was thinking along similar lines that sandyinderby was thinking. That is that Duckster does not know when the medieval warm period (MWP) was. Duckster’s comments on the MWP seemed as if he were directing us to look at the Little Ice Age as the MWP. Maybe history is not interesting to Duckster.
Stephen Wilde says:
August 15, 2010 at 11:10 am
The simple and acknowledged failure of the proxies to match observed temperature changes from the 60’s onward should have been quite sufficient on its own to demonstrate that the proxies were unsuitable for the purpose of comparing the present with the past. It is sad that it has taken so much time and effort to unravel the deceit.
================================
Stephen, I agree.
Weren’t tree rings used up until 1960? Then, because tree rings showed cooling after 1960, the tree ring data was replaced with thermometers at airports.
And 1960 is where temps show a rapid jump up.
The only questions I have are:
1) Will anyone besides us pay any attention?
2) How long until RC says “it doesn’t matter.”
Is it Christmas again? This week has been just like when Climategate broke last November. Now we have MMH2010 followed by MW2010.
Mark.r says:
August 15, 2010 at 11:18 am
I wish you guys well in that endeavor. We haven’t had much luck in the courts in the U.S. thus far, but that may change shortly.
Anthony!!!!!! Can you put an explanation about the graphs posted? People are coming here looking at the graphs and concluding the authors are validating the hockey stick!!! (While obviously not bothering to read the paper.) Specifically figure 16. Apparently, figure 15 is only visible to people that actually read the paper. We should do a study on that phenomenon. Never mind, it’s already been done over and over again. People see what they wish to see.
It should not be forgotten that this paper is not only an indictment of Mann’s original papers, it’s also an indictment of the “peer-review” process that allowed such rubbish to be printed. Any publication that relies heavily on complex statistics where the authors are not themselves trained statisticians should be reviewed by one both before submission and as part of the review process. Clearly, this did not happen and the journals responsible, their editorial boards and reviewer panels should hang their heads in shame and consider their positions. The reviewers were clearly making judgements way outside their areas of expertise which any competent editor should have spotted.
Oh, oh, solar science look out, your wasteland is going to peer over the horizon sometime soon.