New paper makes a hockey sticky wicket of Mann et al 98/99/08

NOTE: This has been running for two weeks at the top of WUWT and discussion has slowed, so I’m placing it back in the regular queue. – Anthony

UPDATES:

Statistician William Briggs weighs in here

Eduardo Zorita weighs in here

Anonymous blogger “Deep Climate” weighs in with what he/she calls a “deeply flawed study” here

After a week of being “preoccupied” Real Climate finally breaks radio silence here. It appears to be a prelude to a dismissal with a “wave of the hand”

Supplementary Info now available: All data and code used in this paper are available at the Annals of Applied Statistics supplementary materials website:

http://www.imstat.org/aoas/supplements/default.htm

=========================================

Sticky Wicket – phrase, meaning: “A difficult situation”.

Oh, my. There is a new and important study on temperature proxy reconstructions (McShane and Wyner 2010), submitted to the Annals of Applied Statistics and listed for publication in the next issue. According to Steve McIntyre, this is one of the “top statistical journals”. This paper is a direct and serious rebuttal to the proxy reconstructions of Mann. It seems watertight on the surface because, instead of attacking the quality of the proxy data, the authors assumed the proxy data was accurate for their purpose and then created a Bayesian backcast method. Then, using the proxy data, they demonstrate that it fails to reproduce the sharp 20th century uptick.

Now, there’s a new look to the familiar “hockey stick”.

Before:

Multiproxy reconstruction of Northern Hemisphere surface temperature variations over the past millennium (blue), along with 50-year average (black), a measure of the statistical uncertainty associated with the reconstruction (gray), and instrumental surface temperature data for the last 150 years (red), based on the work by Mann et al. (1999). This figure has sometimes been referred to as the hockey stick. Source: IPCC (2001).

After:

FIG 16. Backcast from Bayesian Model of Section 5. CRU Northern Hemisphere annual mean land temperature is given by the thin black line and a smoothed version is given by the thick black line. The forecast is given by the thin red line and a smoothed version is given by the thick red line. The model is fit on 1850-1998 AD and backcasts 998-1849 AD. The cyan region indicates uncertainty due to ε_t, the green region indicates uncertainty due to β, and the gray region indicates total uncertainty.

Not only are the results stunning, but the paper is highly readable, written in a sensible style that most laymen can absorb, even if they don’t understand some of the finer points of Bayesian methods, loess filters, or principal components. Beyond that, this paper is a confirmation of McIntyre and McKitrick’s work, with a strong nod to Wegman. I highly recommend reading this and distributing this story widely.

Here’s the submitted paper:

A Statistical Analysis of Multiple Temperature Proxies: Are Reconstructions of Surface Temperatures Over the Last 1000 Years Reliable?

(PDF, 2.5 MB. Backup download available here: McShane and Wyner 2010 )

It states in its abstract:

We find that the proxies do not predict temperature significantly better than random series generated independently of temperature. Furthermore, various model specifications that perform similarly at predicting temperature produce extremely different historical backcasts. Finally, the proxies seem unable to forecast the high levels of and sharp run-up in temperature in the 1990s either in-sample or from contiguous holdout blocks, thus casting doubt on their ability to predict such phenomena if in fact they occurred several hundred years ago.

Here are some excerpts from the paper (emphasis in paragraphs mine):

This one shows that M&M hit the mark, because it is independent validation:

In other words, our model performs better when using highly autocorrelated noise rather than proxies to “predict” temperature. The real proxies are less predictive than our “fake” data. While the Lasso generated reconstructions using the proxies are highly statistically significant compared to simple null models, they do not achieve statistical significance against sophisticated null models.

We are not the first to observe this effect. It was shown, in McIntyre and McKitrick (2005a,c), that random sequences with complex local dependence structures can predict temperatures. Their approach has been roundly dismissed in the climate science literature:

To generate ”random” noise series, MM05c apply the full autoregressive structure of the real world proxy series. In this way, they in fact train their stochastic engine with significant (if not dominant) low frequency climate signal rather than purely non-climatic noise and its persistence. [Emphasis in original]

Ammann and Wahl (2007)
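To make the autocorrelated-noise point concrete, here is a minimal sketch. It is not the Lasso procedure from the paper; it only compares how strongly pairs of completely independent series correlate when they are white noise versus when they are random walks (a discrete stand-in for Brownian motion). The function names, the series length of 149 (the number of years in 1850-1998), and the trial count are illustrative assumptions, not values taken from the paper's code.

```python
import random
import statistics


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def white_noise(n, rng):
    """Independent Gaussian draws: no memory from one step to the next."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def random_walk(n, rng):
    """Cumulative sum of Gaussian steps: a strongly autocorrelated series."""
    total, path = 0.0, []
    for _ in range(n):
        total += rng.gauss(0.0, 1.0)
        path.append(total)
    return path


def mean_abs_corr(make_series, n=149, trials=500, seed=0):
    """Average |correlation| between pairs of independently generated series."""
    rng = random.Random(seed)
    values = [
        abs(pearson(make_series(n, rng), make_series(n, rng)))
        for _ in range(trials)
    ]
    return statistics.fmean(values)


if __name__ == "__main__":
    print("white noise :", round(mean_abs_corr(white_noise), 3))
    print("random walks:", round(mean_abs_corr(random_walk), 3))
```

The two series in each pair share nothing by construction, yet the random-walk pairs typically show far larger correlations than the white-noise pairs. That is the general reason a null model built from autocorrelated noise is a much harder benchmark to beat than one built from simple noise.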

On the power of the proxy data to actually detect climate change:

This is disturbing: if a model cannot predict the occurrence of a sharp run-up in an out-of-sample block which is contiguous with the in-sample training set, then it seems highly unlikely that it has power to detect such levels or run-ups in the more distant past. It is even more discouraging when one recalls Figure 15: the model cannot capture the sharp run-up even in-sample. In sum, these results suggest that the ninety-three sequences that comprise the 1,000 year old proxy record simply lack power to detect a sharp increase in temperature. See Footnote 12.

Footnote 12:

On the other hand, perhaps our model is unable to detect the high level of and sharp run-up in recent temperatures because anthropogenic factors have, for example, caused a regime change in the relation between temperatures and proxies. While this is certainly a consistent line of reasoning, it is also fraught with peril for, once one admits the possibility of regime changes in the instrumental period, it raises the question of whether such changes exist elsewhere over the past 1,000 years. Furthermore, it implies that up to half of the already short instrumental record is corrupted by anthropogenic factors, thus undermining paleoclimatology as a statistical enterprise.
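The “contiguous holdout block” idea in the excerpt above can be sketched in a few lines. This is only an illustration of the splitting scheme, assuming a hypothetical block length of 30 years over the 1850-1998 instrumental period; the function name and the block length are assumptions for this sketch, and no model fitting is shown.

```python
def contiguous_holdout_splits(years, block_len):
    """Yield (train_years, holdout_years) for every contiguous block.

    Each holdout is a run of `block_len` consecutive years; the model
    would be fit on the remaining years and asked to predict the block.
    """
    years = list(years)
    for start in range(len(years) - block_len + 1):
        holdout = years[start:start + block_len]
        train = years[:start] + years[start + block_len:]
        yield train, holdout


if __name__ == "__main__":
    # block_len=30 is an illustrative assumption, not a quoted value.
    splits = list(contiguous_holdout_splits(range(1850, 1999), block_len=30))
    print(len(splits), "splits")
    print("last holdout:", splits[-1][1][0], "to", splits[-1][1][-1])
```

The last split holds out the final block of the record, which is the case the excerpt is concerned with: the sharp run-up sits in the holdout and the model only sees the years before it.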

FIG 15. In-sample Backcast from Bayesian Model of Section 5. CRU Northern Hemisphere annual mean land temperature is given by the thin black line and a smoothed version is given by the thick black line. The forecast is given by the thin red line and a smoothed version is given by the thick red line. The model is fit on 1850-1998 AD.

We plot the in-sample portion of this backcast (1850-1998 AD) in Figure 15. Not surprisingly, the model tracks CRU reasonably well because it is in-sample. However, despite the fact that the backcast is both in-sample and initialized with the high true temperatures from 1999 AD and 2000 AD, it still cannot capture either the high level of or the sharp run-up in temperatures of the 1990s. It is substantially biased low. That the model cannot capture run-up even in-sample does not portend well for its ability to capture similar levels and run-ups if they exist out-of-sample.

Conclusion.

Research on multi-proxy temperature reconstructions of the earth’s temperature is now entering its second decade. While the literature is large, there has been very little collaboration with university-level, professional statisticians (Wegman et al., 2006; Wegman, 2006). Our paper is an effort to apply some modern statistical methods to these problems. While our results agree with the climate scientists findings in some respects, our methods of estimating model uncertainty and accuracy are in sharp disagreement.

On the one hand, we conclude unequivocally that the evidence for a ”long-handled” hockey stick (where the shaft of the hockey stick extends to the year 1000 AD) is lacking in the data. The fundamental problem is that there is a limited amount of proxy data which dates back to 1000 AD; what is available is weakly predictive of global annual temperature. Our backcasting methods, which track quite closely the methods applied most recently in Mann (2008) to the same data, are unable to catch the sharp run up in temperatures recorded in the 1990s, even in-sample.

As can be seen in Figure 15, our estimate of the run up in temperature in the 1990s has a much smaller slope than the actual temperature series. Furthermore, the lower frame of Figure 18 clearly reveals that the proxy model is not at all able to track the high gradient segment. Consequently, the long flat handle of the hockey stick is best understood to be a feature of regression and less a reflection of our knowledge of the truth. Nevertheless, the temperatures of the last few decades have been relatively warm compared to many of the thousand year temperature curves sampled from the posterior distribution of our model.

Our main contribution is our efforts to seriously grapple with the uncertainty involved in paleoclimatological reconstructions. Regression of high dimensional time series is always a complex problem with many traps. In our case, the particular challenges include (i) a short sequence of training data, (ii) more predictors than observations, (iii) a very weak signal, and (iv) response and predictor variables which are both strongly autocorrelated.

The final point is particularly troublesome: since the data is not easily modeled by a simple autoregressive process it follows that the number of truly independent observations (i.e., the effective sample size) may be just too small for accurate reconstruction.

Climate scientists have greatly underestimated the uncertainty of proxy based reconstructions and hence have been overconfident in their models. We have shown that time dependence in the temperature series is sufficiently strong to permit complex sequences of random numbers to forecast out-of-sample reasonably well fairly frequently (see, for example, Figure 9). Furthermore, even proxy based models with approximately the same amount of reconstructive skill (Figures 11,12, and 13), produce strikingly dissimilar historical backcasts: some of these look like hockey sticks but most do not (Figure 14).

Natural climate variability is not well understood and is probably quite large. It is not clear that the proxies currently used to predict temperature are even predictive of it at the scale of several decades let alone over many centuries. Nonetheless, paleoclimatological reconstructions constitute only one source of evidence in the AGW debate. Our work stands entirely on the shoulders of those environmental scientists who labored untold years to assemble the vast network of natural proxies. Although we assume the reliability of their data for our purposes here, there still remains a considerable number of outstanding questions that can only be answered with a free and open inquiry and a great deal of replication.
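The effective-sample-size point in the conclusion can be given a rough number. The sketch below uses the standard large-sample approximation for the mean of an AR(1) series, n_eff ≈ n(1 − ρ)/(1 + ρ). The authors say explicitly that the data is not easily modeled by a simple autoregressive process, so this is only a back-of-the-envelope guide, and the ρ values used here are hypothetical, not estimates from the paper.

```python
def effective_sample_size(n, rho):
    """Approximate number of independent observations in an AR(1) series.

    n   : number of observations
    rho : lag-1 autocorrelation, strictly between -1 and 1
    Uses the large-sample approximation n * (1 - rho) / (1 + rho).
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must be strictly between -1 and 1")
    return n * (1.0 - rho) / (1.0 + rho)


if __name__ == "__main__":
    n_years = 149  # 1850-1998 inclusive
    for rho in (0.0, 0.5, 0.9):  # hypothetical autocorrelation values
        print(rho, round(effective_sample_size(n_years, rho), 1))
```

With no autocorrelation all 149 years count; under this approximation a lag-1 autocorrelation of 0.9 would leave the equivalent of fewer than ten independent observations, which gives a feel for why strong autocorrelation makes the instrumental record so short for calibration purposes.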

===============================================================

Commenters on WUWT report that Tamino and Romm are deleting comments even mentioning this paper on their blog comment forum. Their refusal to even acknowledge it tells you it has squarely hit the target, and the fat lady has sung – loudly.

(h/t to WUWT reader “thechuckr”)

geronimo
August 15, 2010 1:52 am

Roddy; “Besides… Species are migrating north. Glaciers and Arctic ice are melting at unheard of rates. The ocean is becoming more acidic, and has experienced a 40% decline in fish biomass since 1950 due to CO2′s effect on phytoplankton. ”
Welcome Mike, your thoughts are appreciated, but if you make statements such as the above it is traditional to cite your sources. Where did you get this information from? I should add that reports by the WWF and Greenpeace aren’t seen as citable evidence.
@Duckster: Welcome to you too Duckster, again your input is appreciated, although you seem to have totally misunderstood the papers objectives. Still it’s good to have someone testing the (naturally) self-congratulationary tone of the many posts on here. A word to the wise though, you’re probably better waiting until realclimate prepares an answer that can be parrotted continuously than diving in the deep end with the posters on WUWT, they are a pretty knowledgeable bunch by any standards.
As for the paper, in my view it will be buried by the MSM, we are dealing with religious fervour here and no amount of evidence will prove to the faithful that we aren’t experiencing AGW and even if we are it won’t be catastrophic. However, like the hole in the Titanic the water is slowly filling the hold and it will sink. I have little doubt that in a couple of decades from now people will look back on this time and wonder how anyone could have taken this mumbo jumbo seriously.

joshua corning
August 15, 2010 1:53 am

The real fun will be watching the next IPCC panel doing back flips to keep this out of their next report.

old construction worker
August 15, 2010 2:07 am

Another blow to the EPA. More ammunition for the State of Virginia investigation.
[REPLY – We, er, live for, um, danger. ~ Evan]
Thanks I needed a Sunday morning chuckle.

August 15, 2010 2:11 am

This poor world fears in vain
That fresh ill o’er it lowers;
Let thunder growl again;
Go, crown yourselves with flowers!
Pierre-Jean de Béranger

Mikael Pihlström
August 15, 2010 2:22 am

Hmm. The conclusions seem to twist and bend a lot on the road
from article to these posts.
McShane and Wyner say:
“We see that our model gives a backcast which is very similar to those
in the literature, particularly from 1300 AD to the present.
In fact, our backcast very closely traces the Mann et al. (2008) EIV land
backcast,considered by climate scientists to be among the most skilled.
Though our model provides slightly warmer backcasts for the years
1000-1300 AD,we note it falls within or just outside the uncertainty bands of
the Mann et al. (2008) EIV land backcast even in that period. Hence, our
backcast matches their backcasts reasonably well.”
———————
So Mann et al (2008) is actually the most skilled until now, not e.g.
McIntyre and McKitrick?
BTW Smokey, if you want to link a figure from the article, why not
use fig. 17, which brings it alltogether: the warming of the last decades
is bigger than any backcast, H&W 2010 included.
Having said that, as far as I can judge, H&W 2010 is an important
contribution.

Niels A Nielsen
August 15, 2010 2:39 am

[Response: The M&W paper will likely take some time to look through (especially since it isn’t fully published and the SI does not seem to be available yet), but I’m sure people will indeed be looking. I note that one of their conclusions “If we consider rolling decades, 1997-2006 is the warmest on record; our model gives an 80% chance that it was the warmest in the past thousand years” is completely in line with the analogous IPCC AR4 statement. But this isn’t the thread for this, so let’s leave discussion for when there is a fuller appreciation for what’s been done. – gavin]
Oh yes, Gavin “..our model gives an 80% chance that it was the warmest in the past thousand years”
But..
“our model does not pass ‘statistical significance’ thresholds against savvy null models. Ultimately, what these tests essentially show is that the 1,000 year old proxy record has little power given the limited temperature record” (p. 41)
And then we have the proxy selection and orientation issues..

Jimbo
August 15, 2010 2:42 am

For attention of Anthony / Moderators
Suggestion: Will you consider creating a “Hockey Stick” page under your Categories pull down menu on the right side of the page?
It would make it easier to find rebuttals to the hockey stick before this page and follow up pages become buried in the site.

Philemon
August 15, 2010 2:44 am

Mikael Pihlström says:
August 15, 2010 at 2:22 am
“…why not use fig. 17, which brings it alltogether: the warming of the last decades
is bigger than any backcast, H&W 2010 included.”
Look at the uncertainty bands.
“In fact, our uncertainty bands are so wide that they envelop all of the other backcasts in the literature. Given their ample width, it is difficult to say that recent warming is an extraordinary event compared to the last 1,000 years. For example, according to our uncertainty bands, it is possible that it was as warm in the year 1200 AD as it is today.” (McShane and Wyner, AOAS 2010, p. 37)

Invariant
August 15, 2010 2:46 am

1. Natural temperature variability may be large.
2. It’s not the sun (thanks Leif!).
3. Increased CO2 increase temperature.
Is it possible to tell magnitude of 3 given 1?

nevket240
August 15, 2010 2:49 am

Smokey says:
August 14, 2010 at 8:12 pm
duckster says:
“Looking at the paper above… No medieval warming period, I see. ”
Duckster, are you friggin’ blind?? ))
Wilfully, I’d suggest.
I wonder if OBummer is going to bailout the CCE after the paper is more widely read & debated?? After all he was a leading light in its formation.
regards

Espen
August 15, 2010 2:51 am

What a relief! Ever since the first time I tried to understand mannian statistics, I thought it was so wrong that I must have missed something. As a mathematician with some statistics experience (it’s not my main branch of math), I’m so relieved that the statisticians are finally entering the scene and give SteveM et al the credit they deserve.
About the sharp 1990s uptick: we know that the temperature record gets highly unreliable just at that point…

jim hogg
August 15, 2010 2:52 am

An important issue here is the accuracy of the recent temperature record. If it isn’t accurate then it wouldn’t be a surprise that the proxy record didn’t predict it . . . Numerous instances of upward bias have been identified on here . . . . Maybe it’s time to go right back to basics and look at only the raw reliable data from non-contaminated sites with equipment known to be accurate – so far as that’s possible. Time will sort this whole mess out and I don’t think the judgement will reflect too well on many of the players.

James Sexton
August 15, 2010 2:53 am

Lucy Skywalker says:
August 15, 2010 at 1:17 am
James Sexton says: August 14, 2010 at 8:10 pm
While breaking from the reading, mainly because Adobe isn’t responding at the moment…
“I cannot get the pdf page 21 to show up without disrupting Adobe. Unfortunately it’s the nice graphs page. Had to whisk past it. Anyone else had probs??”
That’s exactly where mine had problems…… btw, not prepared to argue astrology either way, hope I didn’t offend.

nevket240
August 15, 2010 2:56 am

Mikael Pihlström says:
August 15, 2010 at 1:49 am
James Sexton says:
August 14, 2010 at 8:24 pm
Whooa Neddy. There are plenty of written descriptions of the living conditions at that time to give a reasonable indication of how warm/cool it was. Plant types etc.
regards

Ken Hall
August 15, 2010 3:01 am

Mike Roddy is right. Mann et al do not care about mathematics and statistics, likewise the 20 odd other climatologists who confirm the hockey stick.
That is why they failed to spot the confirmation bias which ruins their science.
They are a bunch of incestuous peers who are seeking to confirm their faith. So their weak statistical analysis renders their science into a wishlist.

Jimbo
August 15, 2010 3:19 am

If you stretch the above graph to include the Roman Warm Period 2 thoudand years ago then what do you see?
10 March 2010
New technique shows Roman Warm Period Warmer than Present Day
A promising new technique to reconstruct past temperatures has been developed by scientists at the University of Saskatchewan, Canada and Durham University, England, using the shells of bivalve mollusks. Writing in the Proceedings of the National Academy of Science the scientists say that oxygen isotopes in their shells are a good proxy measurement of temperature and may provide the most detailed record yet of global climate change.”
http://www.thegwpf.org/the-observatory/653-new-technique-shows-roman-warm-period-warmer-than-present-day.html
http://www.pnas.org/content/early/2010/03/02/0902522107.full.pdf

Jimbo
August 15, 2010 3:20 am

Typo:
…2 thou[s] and years…

Margaret
August 15, 2010 3:24 am

I cannot get the pdf page 21 to show up without disrupting Adobe. Unfortunately it’s the nice graphs page. Had to whisk past it. Anyone else had probs??
My Adobe kept freezing up on page 21 also — but I found that if I used the sidebar to scroll past page 21 and then inched back I could actually get there in the end (about 3 restarts later!).

M White
August 15, 2010 3:26 am

Ah yes graphs. Its all about presentation

and

For those who do not know him, David Attenborough is a much-loved naturalist and TV personality in the UK.
http://en.wikipedia.org/wiki/David_Attenborough

August 15, 2010 3:34 am

Mikael Pihlström: August 15, 2010 at 2:22 am
So Mann et al (2008) is actually the most skilled until now, not e.g. McIntyre and McKitrick?
Hardly.
“Our backcasting methods, which track quite closely the methods applied most recently in Mann (2008) to the same data, are unable to catch the sharp run up in temperatures recorded in the 1990s, even in-sample. As can be seen in Figure 15, our estimate of the run up in temperature in the 1990s has a much smaller slope than the actual temperature series. Furthermore, the lower frame of Figure 18 clearly reveals that the proxy model is not at all able to track the high gradient segment. Consequently, the long flat handle of the hockey stick is best understood to be a feature of regression and less a reflection of our knowledge of the truth.”
My emphasis.
Their statement doesn’t jibe with yours, that “Models are always abstractions
of the truth. The question is whether they give the general picture,
well e.g. Hansens scenarios seem to perform in this respect.”
http://wattsupwiththat.com/2010/08/12/target-monckton/#comment-456258
BTW, you still owe me the answer to this: “…since the AGW theory predicts an upper atmospheric tropical hot spot, and since the models predict the existence of that upper atmospheric hot spot, kindly tell us where it is to be found. In the real world, please, not in the truthy abstraction.”

August 15, 2010 3:43 am

Mikael Pihlström: August 15, 2010 at 1:49 am
If paleo reconstructions are universally dead (I am OK with that) they are dead for everyone. You have to forget your MWP argument to.
Hardly. You’re ignoring the fact that the existence of the MWP isn’t based on proxies, it’s based on the evidence of archaeological and geological findings, as well as written records.

DirkH
August 15, 2010 3:48 am

Mikael Pihlström says:
August 15, 2010 at 1:49 am
“[…] If proxies have no predictive value, why do the authors persist in
doing their own reconstruction? If paleo reconstructions are universally
dead (I am OK with that) they are dead for everyone. You have to forget
your MWP argument to. […]”
Not so. We still have historical accounts, for instance of wine grown in England and Greenland being settled by Vikings.

James Sexton
August 15, 2010 3:53 am

eudoxus says:
August 15, 2010 at 12:54 am
Mikael Pihlström says:
August 15, 2010 at 1:49 am
OMG!!! How is it you obviously bright guys can read something and totally, completely miss the point of the exercise? I’m a bit tired, but I’ll try to explain for you guys, too.
Mikael, “If proxies have no predictive value, why do the authors persist in
doing their own reconstruction? If paleo reconstructions are universally
dead (I am OK with that) they are dead for everyone. You have to forget
your MWP argument to.”
First, they did “their own” reconstruction to check the validity of the proxy data to see if it could predict reality. The conclusion is, it can’t. Similarly, (if you’d note figure 15 up in the posted article) you’ll see how the proxies totally missed the significant uptick in temps as we saw in the 1990’s. The authors concluded, (and I believe correctly so) that if the proxies can’t detect this significant warming, there is no reason to believe they could detect significant warming going back in time either. They used these graphs to illustrate the errors in the graphs. They did not use the graphs to attempt to illustrate their perception of reality. Here is what our friends had to say about figure 16…….“We decompose the uncertainty of our model’s backcast by plotting the curves drawn using each of the methods outlined in the previous three paragraphs in Figure 16. As can be seen, in the modern instrumental period the residual variance (in cyan) dominates the uncertainty in the backcast. However, the variance due to β uncertainty (in green) propagates through time and becomes the dominant portion of the overall error for earlier periods. The primary conclusion is that failure to account for parameter uncertainty results in overly confident model predictions.”
………ok, guys, did you read that? Specifically the last sentence. Our friends McShane and Wyner are not particularly happy with figure 16. They have problems with the results. They don’t perceive it as valid.
The whole paper was really about 2 questions. One would be to see if the statistical methods used by paleo-climatologists were sufficient. It appears they were not. Two, with proper statistical methods could one reconstruct historical temps using the available proxy data. Apparently not.
I know you guys don’t like the answers the paper gives, and I’m sure the team will respond with a rebuttal. But you shouldn’t try to read into the paper something that the paper clearly doesn’t say or imply. To me, the paper looks like it is very well done, but I’m not a statistician. Personally, I had no idea what a Brownian Motion pseudo-proxy was prior to this paper.
Mikael, I don’t think they are killing all paleo (or proxy) reconstructions. I believe what they are saying is one can get a general idea about certain things with stuff like tree rings etc. But, when one gets into specifics and details, such as 1 or 2 notches on a thermometer, proxies lack the ability to retrieve that kind of detailed information. They used the proxy data from Mann 2008 because it was the most comprehensive. When using measured temps, they used CRU data, only up to 2000, because proxy data is virtually non-existent after 2000. The question of the MWP really only arose since the hockey-stick reconstructions left it out. Prior to that the MWP was generally accepted. While some have tried to discern the MWP from proxy data, most of the evidence for the MWP is anecdotal from historians and things like unearthed farms in Greenland that were previously under a few feet of ice for a few hundred years or so.

Curious Canuck
August 15, 2010 3:56 am

Excellent input from people clearly prepared to separate the work from the motivations. Congratulations to the authors and a reader’s thanks for its production and distribution to all involved.
Mike Roddy writes, among other drivel, that “The ocean is becoming more acidic, and has experienced a 40% decline in fish biomass since 1950 due to CO2′s effect on phytoplankton.”
There is no proof to your claim that ‘ocean acidification’ has led to a 40% decline in fish biomass due to CO2. There’s some evidence, but this evidence is sketchy and piecemeal and nothing conclusive.
On the other hand, the preponderance of evidence on fish stock decline points to overfishing, primarily the introduction of ‘mobile gear’ (purse seine and trawling) technologies and their use in large scale fishing operations. It’s been both practical and scientific lore for a long time that these methods of harvest far outstrip recruitment in the affected species as well as trawling’s (the larger the worse) ability to annihilate bottom habitats.
Absolutely nothing in the research refutes the role of overfishing in biomass depletion and this is further indicated by the selective nature of collapse of target and by-catch species.
Shame on your attempt to mitigate the damaging effects that ‘big steel’ bottom trawling and mobile gear has had on global fish stocks. This is the antithesis of fact and reality and the sort of fluff that has been used as an excuse to systematically destroy so much of the inshore (small boat) cod, hake, herring, redfish and countless other fisheries that made up the lifeblood of so many communities here (in Canada)and abroad.
Your opinions on fisheries management and marine ecosystems demonstrate a blindness to reality. This perception is further reinforced by your attack on ‘untrained’ comment (which you gave example of in your above-mentioned paragraph). Just because Mann et al. say 2+2=5 it does not require (logically or academically) another Climate Modeller to explain that the answer is four.
Receding ice? Another matter of debate, with ample evidence for ANY opinion taken as fact. When the science you love matures and discovers its fallibility and roots out its own charlatans and rogues, then it will be a proper science. Until then, you had best keep playing in the echo-chamber over at RC, where two plus two still equals five.

Stanislav Lem
August 15, 2010 4:09 am

These two authors have rock solid reputations. Essentially they say you can achieve similar results with auto-correlated noise. In light of this conclusion it’s pretty irrelevant which is the most skilled reconstruction, as simple as that.
