New paper makes a hockey sticky wicket of Mann et al 98/99/08

NOTE: This has been running for two weeks at the top of WUWT and discussion has slowed, so I’m placing it back in the regular queue. – Anthony

UPDATES:

Statistician William Briggs weighs in here

Eduardo Zorita weighs in here

Anonymous blogger “Deep Climate” weighs in with what he/she calls a “deeply flawed study” here

After a week of being “preoccupied”, Real Climate finally breaks radio silence here. It appears to be a prelude to a dismissal with a “wave of the hand”.

Supplementary Info now available: All data and code used in this paper are available at the Annals of Applied Statistics supplementary materials website:

http://www.imstat.org/aoas/supplements/default.htm

=========================================

Sticky Wicket – phrase, meaning: “A difficult situation”.

Oh, my. There is a new and important study on temperature proxy reconstructions (McShane and Wyner 2010), submitted to the Annals of Applied Statistics and listed for publication in the next issue. According to Steve McIntyre, this is one of the “top statistical journals”. The paper is a direct and serious rebuttal to Mann’s proxy reconstructions. It seems watertight because, instead of attacking the proxy data quality issues, the authors assumed the proxy data were accurate for their purposes and built a Bayesian backcast method; using the proxy data, they then demonstrate that it fails to reproduce the sharp 20th-century uptick.
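For readers who like to tinker, here is a minimal sketch of that validation idea in code. It is emphatically not the authors’ model or data – just synthetic numbers and ordinary least squares – but it shows the shape of the test: calibrate a proxy-to-temperature regression on part of the instrumental record, then see whether it reproduces a sharp run-up held out of the fit.

```python
# Toy illustration (synthetic data, NOT the paper's model or proxies):
# calibrate a proxy->temperature regression, then test it on a held-out run-up.
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1850, 1999)
# Toy temperature: slow trend plus a sharp late-century run-up, plus noise.
temp = (0.002 * (years - 1850)
        + 0.3 * (years > 1970) * (years - 1970) / 28
        + 0.1 * rng.standard_normal(years.size))
# Ten noisy "proxies", each weakly tracking temperature.
proxies = np.column_stack([temp + 0.5 * rng.standard_normal(years.size)
                           for _ in range(10)])

train = years < 1970                       # calibration period
coef, *_ = np.linalg.lstsq(proxies[train], temp[train], rcond=None)
pred = proxies @ coef                      # predict everywhere, score post-1970

rmse = np.sqrt(np.mean((pred[~train] - temp[~train]) ** 2))
print(f"held-out (1970-1998) RMSE: {rmse:.3f}")
```

Because regression on noisy predictors attenuates the fitted coefficients, the held-out prediction typically flattens the run-up – which, in miniature, is the failure mode the paper documents.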

Now, there’s a new look to the familiar “hockey stick”.

Before:

Multiproxy reconstruction of Northern Hemisphere surface temperature variations over the past millennium (blue), along with 50-year average (black), a measure of the statistical uncertainty associated with the reconstruction (gray), and instrumental surface temperature data for the last 150 years (red), based on the work by Mann et al. (1999). This figure has sometimes been referred to as the hockey stick. Source: IPCC (2001).

After:

FIG 16. Backcast from Bayesian Model of Section 5. CRU Northern Hemisphere annual mean land temperature is given by the thin black line and a smoothed version is given by the thick black line. The forecast is given by the thin red line and a smoothed version is given by the thick red line. The model is fit on 1850-1998 AD and backcasts 998-1849 AD. The cyan region indicates uncertainty due to εt, the green region indicates uncertainty due to β, and the gray region indicates total uncertainty.

Not only are the results stunning, but the paper is highly readable, written in a sensible style that most laymen can absorb, even if they don’t understand some of the finer points of Bayesian inference, loess filters, or principal components. Not only that, the paper is a confirmation of McIntyre and McKitrick’s work, with a strong nod to Wegman. I highly recommend reading it and distributing this story widely.

Here’s the submitted paper:

A Statistical Analysis of Multiple Temperature Proxies: Are Reconstructions of Surface Temperatures Over the Last 1000 Years Reliable?

(PDF, 2.5 MB. Backup download available here: McShane and Wyner 2010)

It states in its abstract:

We find that the proxies do not predict temperature significantly better than random series generated independently of temperature. Furthermore, various model specifications that perform similarly at predicting temperature produce extremely different historical backcasts. Finally, the proxies seem unable to forecast the high levels of and sharp run-up in temperature in the 1990s either in-sample or from contiguous holdout blocks, thus casting doubt on their ability to predict such phenomena if in fact they occurred several hundred years ago.

Here are some excerpts from the paper (emphasis in paragraphs mine):

This one shows that M&M hit the mark, because it provides independent validation:

In other words, our model performs better when using highly autocorrelated noise rather than proxies to “predict” temperature. The real proxies are less predictive than our “fake” data. While the Lasso-generated reconstructions using the proxies are highly statistically significant compared to simple null models, they do not achieve statistical significance against sophisticated null models.

We are not the first to observe this effect. It was shown, in McIntyre and McKitrick (2005a,c), that random sequences with complex local dependence structures can predict temperatures. Their approach has been roundly dismissed in the climate science literature:

To generate “random” noise series, MM05c apply the full autoregressive structure of the real world proxy series. In this way, they in fact train their stochastic engine with significant (if not dominant) low frequency climate signal rather than purely non-climatic noise and its persistence. [Emphasis in original]

Ammann and Wahl (2007)
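To make the “sophisticated null model” comparison concrete, here is a small sketch. This is not the paper’s code: the “temperature” and “proxies” are synthetic, and the AR(1) persistence is picked by hand rather than estimated from real proxy series, but the structure of the test is the one described above.

```python
# Sketch of the null-model test (synthetic data, not the paper's code):
# do pure-noise predictors with realistic persistence rival the "proxies"?
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, p = 149, 30                             # ~instrumental length, 30 predictors

def ar1(rho, n):
    """AR(1) series with lag-one autocorrelation rho and unit variance."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + np.sqrt(1 - rho**2) * rng.standard_normal()
    return x

temp = ar1(0.95, n)                        # persistent toy "temperature"
proxies = np.column_stack([0.3 * temp + ar1(0.7, n) for _ in range(p)])
fake = np.column_stack([ar1(0.7, n) for _ in range(p)])    # pure-noise nulls

def holdout_rmse(X, y, k=30):
    """Fit a Lasso on all but the final k years; score the held-out block."""
    model = Lasso(alpha=0.05).fit(X[:-k], y[:-k])
    return np.sqrt(np.mean((model.predict(X[-k:]) - y[-k:]) ** 2))

print("proxies    :", round(holdout_rmse(proxies, temp), 3))
print("AR(1) noise:", round(holdout_rmse(fake, temp), 3))
```

Because both the target and the null predictors are highly persistent, the noise-only columns can score surprisingly close to the informative ones on a short holdout block – exactly the effect M&W (and MM05 before them) are pointing at.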

On the power of the proxy data to actually detect climate change:

This is disturbing: if a model cannot predict the occurrence of a sharp run-up in an out-of-sample block which is contiguous with the in-sample training set, then it seems highly unlikely that it has power to detect such levels or run-ups in the more distant past. It is even more discouraging when one recalls Figure 15: the model cannot capture the sharp run-up even in-sample. In sum, these results suggest that the ninety-three sequences that comprise the 1,000-year-old proxy record simply lack power to detect a sharp increase in temperature. See Footnote 12.

Footnote 12:

On the other hand, perhaps our model is unable to detect the high level of and sharp run-up in recent temperatures because anthropogenic factors have, for example, caused a regime change in the relation between temperatures and proxies. While this is certainly a consistent line of reasoning, it is also fraught with peril for, once one admits the possibility of regime changes in the instrumental period, it raises the question of whether such changes exist elsewhere over the past 1,000 years. Furthermore, it implies that up to half of the already short instrumental record is corrupted by anthropogenic factors, thus undermining paleoclimatology as a statistical enterprise.

FIG 15. In-sample Backcast from Bayesian Model of Section 5. CRU Northern Hemisphere annual mean land temperature is given by the thin black line and a smoothed version is given by the thick black line. The forecast is given by the thin red line and a smoothed version is given by the thick red line. The model is fit on 1850-1998 AD.

We plot the in-sample portion of this backcast (1850-1998 AD) in Figure 15. Not surprisingly, the model tracks CRU reasonably well because it is in-sample. However, despite the fact that the backcast is both in-sample and initialized with the high true temperatures from 1999 AD and 2000 AD, it still cannot capture either the high level of or the sharp run-up in temperatures of the 1990s. It is substantially biased low. That the model cannot capture the run-up even in-sample does not portend well for its ability to capture similar levels and run-ups if they exist out-of-sample.
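The “contiguous holdout block” test is easy to sketch as well. The code below is our illustration with synthetic inputs, not the paper’s implementation: slide a 30-year block through the record, train on everything else, and score inside the block. When the trend is concentrated at one end of the record, the end blocks tend to validate worst.

```python
# Sketch of contiguous holdout-block validation (synthetic, illustrative only).
import numpy as np
from sklearn.linear_model import LinearRegression

def block_scores(X, y, k=30, start_year=1850):
    """Train outside each contiguous k-year block; report RMSE inside it."""
    n = len(y)
    out = []
    for s in range(0, n - k + 1, k):
        test = np.zeros(n, dtype=bool)
        test[s:s + k] = True
        model = LinearRegression().fit(X[~test], y[~test])
        rmse = np.sqrt(np.mean((model.predict(X[test]) - y[test]) ** 2))
        out.append((start_year + s, start_year + s + k - 1, rmse))
    return out

rng = np.random.default_rng(2)
y = np.cumsum(0.05 * rng.standard_normal(150))         # persistent toy target
X = y[:, None] + 0.8 * rng.standard_normal((150, 5))   # five noisy "proxies"
for a, b, r in block_scores(X, y):
    print(f"{a}-{b}: RMSE {r:.3f}")
```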

Conclusion.

Research on multi-proxy temperature reconstructions of the earth’s temperature is now entering its second decade. While the literature is large, there has been very little collaboration with university-level, professional statisticians (Wegman et al., 2006; Wegman, 2006). Our paper is an effort to apply some modern statistical methods to these problems. While our results agree with the climate scientists’ findings in some respects, our methods of estimating model uncertainty and accuracy are in sharp disagreement.

On the one hand, we conclude unequivocally that the evidence for a “long-handled” hockey stick (where the shaft of the hockey stick extends to the year 1000 AD) is lacking in the data. The fundamental problem is that there is a limited amount of proxy data which dates back to 1000 AD; what is available is weakly predictive of global annual temperature. Our backcasting methods, which track quite closely the methods applied most recently in Mann (2008) to the same data, are unable to catch the sharp run-up in temperatures recorded in the 1990s, even in-sample.

As can be seen in Figure 15, our estimate of the run-up in temperature in the 1990s has a much smaller slope than the actual temperature series. Furthermore, the lower frame of Figure 18 clearly reveals that the proxy model is not at all able to track the high gradient segment. Consequently, the long flat handle of the hockey stick is best understood to be a feature of regression and less a reflection of our knowledge of the truth. Nevertheless, the temperatures of the last few decades have been relatively warm compared to many of the thousand year temperature curves sampled from the posterior distribution of our model.

Our main contribution is our efforts to seriously grapple with the uncertainty involved in paleoclimatological reconstructions. Regression of high dimensional time series is always a complex problem with many traps. In our case, the particular challenges include (i) a short sequence of training data, (ii) more predictors than observations, (iii) a very weak signal, and (iv) response and predictor variables which are both strongly autocorrelated.

The final point is particularly troublesome: since the data is not easily modeled by a simple autoregressive process it follows that the number of truly independent observations (i.e., the effective sample size) may be just too small for accurate reconstruction.
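The paper does not give a formula at this point, but to put a rough number on it: under the textbook assumption that a series behaves like an AR(1) process with lag-one autocorrelation ρ (our simplification, not the authors’), the effective sample size is

$$ n_{\text{eff}} \approx n \cdot \frac{1 - \rho}{1 + \rho} $$

so 149 years of annual data with ρ = 0.9 carry the information of only about 149 × 0.1 / 1.9 ≈ 8 independent observations – which is the sense in which the effective sample size may be “just too small”.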

Climate scientists have greatly underestimated the uncertainty of proxy based reconstructions and hence have been overconfident in their models. We have shown that time dependence in the temperature series is sufficiently strong to permit complex sequences of random numbers to forecast out-of-sample reasonably well fairly frequently (see, for example, Figure 9). Furthermore, even proxy based models with approximately the same amount of reconstructive skill (Figures 11, 12, and 13) produce strikingly dissimilar historical backcasts: some of these look like hockey sticks but most do not (Figure 14).

Natural climate variability is not well understood and is probably quite large. It is not clear that the proxies currently used to predict temperature are even predictive of it at the scale of several decades, let alone over many centuries. Nonetheless, paleoclimatological reconstructions constitute only one source of evidence in the AGW debate. Our work stands entirely on the shoulders of those environmental scientists who labored untold years to assemble the vast network of natural proxies. Although we assume the reliability of their data for our purposes here, there still remains a considerable number of outstanding questions that can only be answered with a free and open inquiry and a great deal of replication.

===============================================================

Commenters on WUWT report that Tamino and Romm are deleting comments that even mention this paper on their blog comment forums. Their refusal even to acknowledge it tells you it has squarely hit the target – and the fat lady has sung, loudly.

(h/t to WUWT reader “thechuckr”)

August 15, 2010 12:37 am

It’s so nice to see a study that says what everyone can see, yet still takes it to a higher level.
Some of the problems with comparing modern temperatures to medieval temperatures were discussed here at WUWT back in April:
http://wattsupwiththat.com/2010/04/04/ipcc-how-not-to-compare-temperatures/

August 15, 2010 12:40 am

Can anyone here show me a calibration certificate of a thermometer that is 150 years old?

August 15, 2010 12:50 am

And you expect a global warmer to understand this? It is flying well above their heads!
Global warming is true because a friend of a friend who is a very eminent “scientist” said that he knew someone who did some statistics at University in their first year, and they said that they thought the stats were sound … so it’s got to be true!

Ben G
August 15, 2010 12:52 am

Finally, the proxies seem unable to forecast the high levels of and sharp run-up in temperature in the 1990s either in-sample or from contiguous holdout blocks, thus casting doubt on their ability to predict such phenomena if in fact they occurred several hundred years ago.
Given the big problems with the quality of the surface temperature data over the last century, after all the adjustments and land use changes, it’s no wonder the proxies struggle to forecast such sharp warming. 😉

Spector
August 15, 2010 12:54 am

This new curve looks more like a scimitar — perfect for slicing sticks in two.

eudoxus
August 15, 2010 12:54 am

McShane and Wyner 2010, Figure 16, illustrates an absolutely (in terms of the sign of the slope) unprecedented (over the last millennium) rate of increase in global temperature timed with the onset of the industrial revolution and its associated CO2 release. It displays no evidence of a medieval warm period during the range 1000-1200 CE but seems, rather, to predict a dip in temperature during that range of years. The backcast of modern data over the last 1000 years predicts that no previous years were warmer than the last few observed. Some interested observers are also curious to know the Bayesian forecast of future temperatures based on the “thin black line” of modern observations. Figure 16 illustrates the remarkable feature that, at the onset of the industrial revolution, the increase in the Earth’s temperature was so great it created a reversal in its slope. Fascinating.

singularian
August 15, 2010 1:03 am

It’s late Saturday night, this post has been here 4 1/2 hours, and there are 93 comments. Busy night for Anthony and the Moderators.
Sunday evening here – want to know what the weather’s like tomorrow?

Christopher Hanley
August 15, 2010 1:04 am

I’m no mathematician, but the CO2 trend at Mauna Loa over the 1960-2010 period does not look linear to me.
http://woodfortrees.org/plot/esrl-co2/from:1960/to:1970/trend/plot/esrl-co2/from:1960/to:1980/trend/offset:1.5/plot/esrl-co2/from:1960/to:1990/trend/offset:%203/plot/esrl-co2/from:1960/to:2000/trend/offset:%204/plot/esrl-co2/from:1960/to:2010/trend/offset:%206
Linear or exponential, it does seem like a debating exercise, not unlike asking how many angels can dance on the head of a pin.
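For anyone who would rather compute than eyeball it, here is a quick sketch. The CO2 values below are approximate Mauna Loa annual means typed in for illustration only; consult the ESRL record for the real numbers.

```python
# Quick check: does a straight line or an exponential fit the CO2 rise better?
# Values are approximate Mauna Loa annual means (ppm), for illustration only.
import numpy as np

years = np.array([1960, 1970, 1980, 1990, 2000, 2010], dtype=float)
co2 = np.array([316.9, 325.7, 338.7, 354.4, 369.5, 389.9])

lin = np.polyfit(years, co2, 1)             # straight-line fit
exp = np.polyfit(years, np.log(co2), 1)     # exponential fit, done in log space

lin_rms = np.sqrt(np.mean((co2 - np.polyval(lin, years)) ** 2))
exp_rms = np.sqrt(np.mean((co2 - np.exp(np.polyval(exp, years))) ** 2))
print(f"linear RMS residual:      {lin_rms:.2f} ppm")
print(f"exponential RMS residual: {exp_rms:.2f} ppm")
```

On numbers like these the exponential leaves somewhat smaller residuals, though neither fit is perfect – the growth rate itself has been increasing, which is the curvature visible in the woodfortrees plot.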

sandyinderby
August 15, 2010 1:08 am

duckster says:
August 14, 2010 at 8:18 pm
@Smokey:
“Duckster, are you friggin’ blind??”
So where exactly would you place a medieval warming period here? Asking me to accept a medieval warming period (which is what I have been asked to do here) means showing how and where it got warmer, and then how and when it got cooler. A steady downward temperature trend is not a warming period.
Duckster, if you are from the USA/Canada: medieval isn’t the 18th century.
The definition is: “The Middle Ages (adjectival form: medieval or mediaeval) is a period of European history from the 5th century to the 15th century.” Looking at the graph in question, it starts slap bang in the middle of that time frame.
Sandy

baffled24
August 15, 2010 1:11 am

The Hockey Stick: named, debunked, resurrected, debunked, resurrected, debunked, and now reincarnated; it sounds like fiction about something that never quite died. Is this an argument about what a hockey stick looks like, or about how much it can deviate from the basic shape? Who determines how much it can deviate in order to qualify (or not) for the hockey stick shape? Fig 16 still looks somewhat like a hockey stick to me, albeit a little more curvaceous; there’s no denying the upward temperature trend.

Feet2theFire
August 15, 2010 1:12 am

Apologies for writing before I have read the entire post or paper. Wanted to get these thoughts out there, for what they are worth…
…All in all, this was an inevitability: someone would eventually get around to a second round of multiproxy reconstructions.
I have from the beginning given Michael Mann credit for doing the first one. Consider how monumental a task it is, after all.
That said, anyone thinking that the first one done would be the final word had to be an idiot. Mann, especially. The man’s lack of humility is in itself monumental. There is not one whit of common sense in him believing he had done it perfectly – especially when he had played such games with the data in his homogenizations. EVEN IF THEY WERE EVENTUALLY FOUND TO BE CORRECT, he should have known that they would be challenged, sooner or later. Once challenged, the cat fight would begin. WHAT ABOUT THIS DID HE NOT UNDERSTAND?
From this vantage point, Mann appears to have thought that if he bullied enough people it would all stand forever. To put it bluntly, what a d***wad. [my censoring – f2f]
But WHAT a relief! To finally have another peer-reviewed AND TRULY INDEPENDENT multiproxy reconstruction.
And now, on to reading the entire article…

August 15, 2010 1:17 am

James Sexton says: August 14, 2010 at 8:10 pm
While breaking from the reading, mainly because Adobe isn’t responding at the moment…

I cannot get the PDF page 21 to show up without disrupting Adobe. Unfortunately it’s the nice graphs page. Had to whisk past it. Anyone else had problems?

Robert
August 15, 2010 1:18 am

I notice they didn’t discuss wavelet analysis or Moberg 2005, which primarily uses low frequency proxies to do the heavy lifting. From what I’ve seen analysis-wise, this is the best of the reconstructions either way. Plus, what does it matter anyway… we know the MWP was caused by increased TSI, low volcanic activity and persistent AMO conditions, unlike the current warming (1975-present), which is anthropogenically forced.
I wanna see these statisticians tell me that Büntgen et al. 2008’s tree rings had a weak signal too…
Either way, it doesn’t disprove that we are warmer than the MWP, just that the methods of Mann et al were inaccurate.

Feet2theFire
August 15, 2010 1:25 am

Oh. One more point, now that I’ve read the article and part of the paper:
A few questions that arise:
Was this an outcome of Climategate?
Or of M&M’s efforts, which woke some other qualified people up to DOING such a reconstruction?
Did any of the FOIs contribute to this paper, in freeing up the data?

Alexander K
August 15, 2010 1:25 am

This may be OT, but papers have just been submitted to the High Court in New Zealand by the NZ Climate Science Coalition to obtain a hearing in the matter of the ‘upwardly adjusted’ instrumental climate record for the past century by NIWA, the National Institute of Water and Atmospheric Research. This was reported seriously by the MSM there, with an actual headline over a leading article, so times may be a’changing!

August 15, 2010 1:26 am

Mike Roddy wrote, “Similarly, climate scientists are getting bored with arguments from untrained individuals that the “trace gas” CO2 does not play the major role in the recent and rapid temperature increases. This role was proven in a laboratory in the 19th century by Arrhenius, and has not been seriously disputed since.”
The laboratory experiment did not include oceans. Long wave radiation only impacts the top few millimeters of the oceans and, therefore, cannot explain the rise in sea surface temperature and ocean heat content.

August 15, 2010 1:27 am

I just luuuuuuuuuuuuurve their graph fig. 8 page 15.
If I’ve understood this graph aright, we have here another diamond hidden in broad daylight… a ready-made decade-by-decade calibration for UHI, using tree-ring data properly, for once, to flag up the surface stations anomaly. Certainly the anomaly here is about 0.5 degC, the same as suggested by McKitrick et al in their very recent (and again, stunning) paper.
Now if this is the next line of investigation, correcting the basic temperature record, then not only does the recent warming shrink still further, but we have the way clear to re-connect with the solar correlations (yes, Leif, thanks, the causations are still eluding us as yet…)

Mike Edwards
August 15, 2010 1:28 am

duckster says:
August 14, 2010 at 8:18 pm:

…So where exactly would you place a medieval warming period here? Asking me to accept a medieval warming period…

It isn’t just a question of a medieval warm period – there is a whole series of periods in the past 500,000 years that appear to have been warmer than the present. Four of the previous interglacials were considerably warmer than the current one, as shown by ice core data and by significantly higher sea levels (~6 metres higher in the last interglacial, for example).
If you don’t believe me, Wikipedia has a good discussion and links to much of the data here:
http://en.wikipedia.org/wiki/Ice_core
The question for the climate modellers is whether their models can account for this behaviour of the Earth’s climate WITHOUT resorting to CO2, since the ice cores don’t show CO2 above pre-industrial levels.

Alexej Buergin
August 15, 2010 1:43 am

Sticky wicket:
Some people call the cricket pitch the “wicket”, and a sticky wicket would, e.g., be a wet pitch, which makes for a difficult situation for the batsman.
Another interpretation: the real wicket is the construction of three stumps with two bails that the bowler is trying to hit. If the bails are sticking to the stumps (in climatology, “because of some chewing gum” would come to my mind), it will be difficult for the bowler to make them fall.
PS. Bails can be used for a long time; England and Australia play for bails that are almost 130 years old (and burned, too).

Feet2theFire
August 15, 2010 1:45 am

Honesty in science:
M&W2010 (from the abstract):
“We propose our own reconstruction of Northern Hemisphere average annual land temperature over the last millenium [sic], assess its reliability, and compare it to those from the climate science literature. Our model provides a similar reconstruction but has much wider standard errors, reflecting the weak signal and large uncertainty encountered in this setting.”
See? Tell people that you recognize the weaknesses of your study, and even statistically assess your own statistics.
Having read a good bit on Richard Feynman recently, here is this from his 1974 Caltech commencement address (you will all love this):
“…there is one feature I notice is missing in Cargo Cult Science [the topic of his speech]. That is the idea that we all hope you have learned in studying science in school – we never explicitly say what this is, but just hope that you catch on by all the examples of scientific investigation. It is interesting, therefore, to bring it out now and speak of it explicitly. It’s a kind of scientific integrity, a principle of scientific thought that corresponds to a kind of utter honesty – a kind of leaning over backwards. For example, if you’re doing an experiment, you should report everything you think might make it invalid – not only what you think is right about it: other causes that could possibly explain your results; and things you thought of that you’ve eliminated by some other experiment, and how they worked – to make sure the other fellow can tell they’ve been eliminated.
Details that could throw doubt on your interpretation must be given, if you know them. You must do the best you can – if you know anything at all wrong, or possibly wrong – to explain it. If you make a theory, for example, and advertise it, or put it out, then you must also put down all the facts that disagree with it, as well as those that agree with it. There is also a more subtle problem. When you have put together a lot of ideas into an elaborate theory, you want to make sure, when explaining what it fits, that those things are not just the things that gave you the idea for the theory; but that the finished theory makes something else come out right, in addition.
In summary, the idea is to try to give all the information to help others to judge the value of your contribution; not just the information that leads to judgement in one particular direction or another….”
Nobel laureate Feynman would have been proud of McShane and Wyner.

tonyb
Editor
August 15, 2010 1:48 am

Smokey said:
August 14, 2010 at 8:33 pm
“Rocky Road, here is the Phil Jones chart…”
To your other natural cycles of rapid temperature rise can be added this one, which Phil Jones is very well aware of. It happened from around 1700, is captured in CET, and is alluded to in records from the time in other countries, such as those from the botanic gardens in Uppsala, Sweden – the home town of Arrhenius.
http://i45.tinypic.com/125rs3m.jpg
As can be seen from the next chart, there seems to have been a rise in temperatures commencing from 1690 rather than 1880 – James Hansen merely plugged into the latter stages of a centuries-old trend, which itself appears to have peaked around 1250, with the LIA intervening.
http://homepage.ntlworld.com/jdrake/Questioning_Climate/_sgg/m2_1.htm
Perhaps greater credence will be placed in the future on the actual records that people such as myself post, which get dismissed as ‘anecdotal’ and therefore unreliable.
Funny really, isn’t it: an actual observation made at the time is ‘anecdotal’, but silly and tortured proxies have become so ‘reliable’ that they have become the basis for an attempt to break the world’s economy.
Tonyb

Mikael Pihlström
August 15, 2010 1:49 am

James Sexton says:
August 14, 2010 at 8:24 pm
duckster says:
August 14, 2010 at 7:48 pm
What they are stating is: even if the data are correct, Mann et al. did it wrong (along with a long list of other statistician wannabes), and further, the proxies have no predictive power. Now, work backwards from that. If you require further explanations, just ask; I’d be happy to provide them to you.
……………………….
If proxies have no predictive value, why do the authors persist in doing their own reconstruction? If paleo reconstructions are universally dead (I am OK with that), they are dead for everyone. You have to forget your MWP argument too.

August 15, 2010 1:49 am

James Sexton says: August 14, 2010 at 7:44 pm
This paper doesn’t simply break a hockey stick, it breaks an entire sub-specialty of climatology, specifically paleoclimatology. They will either have to reprint all the textbooks or throw the pseudo-science out the window…

Right on, James

James Sexton says: August 14, 2010 at 7:44 pm
…throw the pseudo-science out the window, onto the trash heap, to lie alongside phrenology, numerology, and astrology…

Not right on, James. Just as we’ve been saying all along at WUWT, CA and the rest: you need to examine BOTH sides of the argument, not just rely on the “official” “debunks”. I did a fair bit of research into CSICOP’s supposed debunking of astrology, and it was not a pretty story; in fact, the tactics I saw there were remarkably similar to RealClimate et al. And you need to read up about Kepler and Newton too, as seen from the “other side” – fascinating, and good science.
If you’re interested to follow this one up, email me – click my name, etc.

londo
August 15, 2010 1:49 am

It is quite a joy reading this paper, if for no other reason than its educational value and the authors’ striving to illuminate the complexity of this problem. It has the clear signature of a paper that wants to explain something to the reader, instead of trying to overrun cautious readers by attacking them with numbers and terminology, as e.g. MBH98 does. It has the potential of becoming a classic paper that is handed out to students early in their careers, perhaps even in the study of paleoclimate. At least once fame, fortune and politics leave this discipline of science.

martyn
August 15, 2010 1:51 am

UNCORRECTED TRANSCRIPT OF ORAL EVIDENCE To be published as HC 369-ii
SCIENCE AND TECHNOLOGY COMMITTEE
Setting the scene
Tuesday 27 July 2010
LORD REES OF LUDLOW
Q78 Graham Stringer: None of them really looked at the science, and where they stepped over the science, as Oxburgh did, he said that he was rather surprised that methods that depended on advanced statistics had not used advanced statisticians; he said that they had also used subjective methods. So I think David Willetts was wrong to say that somehow these had validated the science, because the science was not looked at. One, do you think the science should be looked at? If it was to be looked at, how would it be done?
Lord Rees of Ludlow : I would, to some extent, contest what you have just said. These papers were refereed, but the key thing which the Oxburgh Committee did was to actually go and sit with the scientists and see what they actually did and how they analysed the data. As regards the statistics, Professor Hand from Imperial College, who is one of the UK’s leading statisticians, was put on the Oxburgh Panel precisely because he had that expertise. What the report said was that indeed they had not used the optimum sophisticated techniques but he thought it would not have made any difference to the results. So, again, I do not think the science from that group is severely under question from the techniques they used.
http://www.publications.parliament.uk/pa/cm201011/cmselect/cmsctech/uc369-ii/uc36901.htm
