New paper makes a hockey sticky wicket of Mann et al 98/99/08

NOTE: This has been running two weeks at the top of WUWT, discussion has slowed, so I’m placing it back in the regular queue.  – Anthony

UPDATES:

Statistician William Briggs weighs in here

Eduardo Zorita weighs in here

Anonymous blogger “Deep Climate” weighs in with what he/she calls a “deeply flawed study” here

After a week of being “preoccupied,” RealClimate finally breaks radio silence here. It appears to be a prelude to a dismissal with a “wave of the hand”

Supplementary Info now available: All data and code used in this paper are available at the Annals of Applied Statistics supplementary materials website:

http://www.imstat.org/aoas/supplements/default.htm

=========================================

Sticky Wicket – phrase, meaning: “A difficult situation”.

Oh, my. There is a new and important study on temperature proxy reconstructions (McShane and Wyner 2010), submitted to the Annals of Applied Statistics and slated for publication in the next issue. According to Steve McIntyre, this is one of the “top statistical journals”. This paper is a direct and serious rebuttal to the proxy reconstructions of Mann. It seems watertight on the surface because, instead of attacking the proxy data quality issues, the authors assumed the proxy data were accurate for their purpose, then built a Bayesian backcast method. Using that proxy data, they demonstrate the method fails to reproduce the sharp 20th-century uptick.

Now, there’s a new look to the familiar “hockey stick”.

Before:

Multiproxy reconstruction of Northern Hemisphere surface temperature variations over the past millennium (blue), along with 50-year average (black), a measure of the statistical uncertainty associated with the reconstruction (gray), and instrumental surface temperature data for the last 150 years (red), based on the work by Mann et al. (1999). This figure has sometimes been referred to as the hockey stick. Source: IPCC (2001).

After:

FIG 16. Backcast from Bayesian Model of Section 5. CRU Northern Hemisphere annual mean land temperature is given by the thin black line and a smoothed version is given by the thick black line. The forecast is given by the thin red line and a smoothed version is given by the thick red line. The model is fit on 1850-1998 AD and backcasts 998-1849 AD. The cyan region indicates uncertainty due to t, the green region indicates uncertainty due to β, and the gray region indicates total uncertainty.
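For readers curious how a caption like this separates uncertainty "due to β" from total uncertainty, here is a minimal, purely illustrative sketch of a flat-prior Bayesian linear calibration and backcast. All of the data and the one-proxy setup are invented for illustration; this is not the paper's actual model or code.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "calibration" period: temperature related to one autocorrelated proxy
n = 149                                   # length of an 1850-1998 style record
proxy = np.cumsum(rng.normal(size=n))     # random walk stands in for a proxy
temp = 0.3 * proxy + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), proxy])
beta_hat, *_ = np.linalg.lstsq(X, temp, rcond=None)
resid = temp - X @ beta_hat
sigma2 = resid @ resid / (n - 2)                  # residual variance
cov_beta = sigma2 * np.linalg.inv(X.T @ X)        # posterior cov of beta (flat prior)

# Backcast at one "pre-instrumental" proxy value, separating the two bands
x_new = np.array([1.0, proxy.min()])
draws = rng.multivariate_normal(beta_hat, cov_beta, size=5000)
band_beta = draws @ x_new                                   # beta uncertainty only
band_total = band_beta + rng.normal(scale=np.sqrt(sigma2), size=5000)

print(band_beta.std() < band_total.std())   # total band is necessarily wider
```

The design point the figure makes visually: the parameter-uncertainty band (green in Fig 16) is nested inside the total band (gray), because the total adds residual noise on top of posterior spread in β.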

Not only are the results stunning, but the paper is highly readable, written in a sensible style that most laymen can absorb, even if they don’t understand some of the finer points of Bayesian methods, loess filters, or principal components. Moreover, this paper is a confirmation of McIntyre and McKitrick’s work, with a strong nod to Wegman. I highly recommend reading this and distributing this story widely.

Here’s the submitted paper:

A Statistical Analysis of Multiple Temperature Proxies: Are Reconstructions of Surface Temperatures Over the Last 1000 Years Reliable?

(PDF, 2.5 MB. Backup download available here: McShane and Wyner 2010 )

It states in its abstract:

We find that the proxies do not predict temperature significantly better than random series generated independently of temperature. Furthermore, various model specifications that perform similarly at predicting temperature produce extremely different historical backcasts. Finally, the proxies seem unable to forecast the high levels of and sharp run-up in temperature in the 1990s either in-sample or from contiguous holdout blocks, thus casting doubt on their ability to predict such phenomena if in fact they occurred several hundred years ago.

Here are some excerpts from the paper (emphasis in paragraphs mine):

This one shows that M&M hit the mark, because it is independent validation:

In other words, our model performs better when using highly autocorrelated noise rather than proxies to “predict” temperature. The real proxies are less predictive than our “fake” data. While the Lasso generated reconstructions using the proxies are highly statistically significant compared to simple null models, they do not achieve statistical significance against sophisticated null models.
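The "fake proxies" experiment the excerpt describes can be mimicked in a few lines. The sketch below is a hedged toy version, not the paper's Lasso setup: it only shows the mechanics of fitting AR(1) noise, generated independently of a trending target series, and scoring it against a simple climatological null on a holdout block. All numbers and series here are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

def ar1(n, phi, rng):
    """AR(1) series: x[t] = phi * x[t-1] + noise."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
    return x

n = 149                                          # an 1850-1998 style record
temp = ar1(n, 0.9, rng) + 0.02 * np.arange(n)    # toy temperature with a trend

# Ten pseudo-proxies: AR(1) noise generated *independently* of temperature
fake = np.column_stack([ar1(n, 0.9, rng) for _ in range(10)])

# Fit on the early years, score on the final 30-year holdout block
train, test = slice(0, n - 30), slice(n - 30, n)
X = np.column_stack([np.ones(n), fake])
beta, *_ = np.linalg.lstsq(X[train], temp[train], rcond=None)

rmse_fake = np.sqrt(np.mean((X[test] @ beta - temp[test]) ** 2))
rmse_null = np.sqrt(np.mean((temp[train].mean() - temp[test]) ** 2))
print(rmse_fake, rmse_null)
```

Run over many random draws, fits like this can rival a simple mean-based null purely through shared autocorrelation, which is the "sophisticated null model" point the authors are making.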

We are not the first to observe this effect. It was shown, in McIntyre and McKitrick (2005a,c), that random sequences with complex local dependence structures can predict temperatures. Their approach has been roundly dismissed in the climate science literature:

To generate “random” noise series, MM05c apply the full autoregressive structure of the real world proxy series. In this way, they in fact train their stochastic engine with significant (if not dominant) low frequency climate signal rather than purely non-climatic noise and its persistence. [Emphasis in original]

Ammann and Wahl (2007)

On the power of the proxy data to actually detect climate change:

This is disturbing: if a model cannot predict the occurrence of a sharp run-up in an out-of-sample block which is contiguous with the in-sample training set, then it seems highly unlikely that it has power to detect such levels or run-ups in the more distant past. It is even more discouraging when one recalls Figure 15: the model cannot capture the sharp run-up even in-sample. In sum, these results suggest that the ninety-three sequences that comprise the 1,000 year old proxy record simply lack power to detect a sharp increase in temperature. See Footnote 12
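The "contiguous holdout block" test the authors describe is straightforward to set up in principle: drop a consecutive block of years, fit on the rest, and score the fitted model inside the block. Here is a minimal illustrative sketch with ordinary least squares standing in for the paper's far more elaborate models; the proxies, temperatures, and run-up are all invented.

```python
import numpy as np

rng = np.random.default_rng(2)

def block_holdout_rmse(proxies, temp, block):
    """Fit OLS outside a contiguous block of years, score inside it."""
    n = len(temp)
    mask = np.ones(n, dtype=bool)
    mask[block] = False
    X = np.column_stack([np.ones(n), proxies])
    beta, *_ = np.linalg.lstsq(X[mask], temp[mask], rcond=None)
    return np.sqrt(np.mean((X[block] @ beta - temp[block]) ** 2))

n = 149
proxies = np.cumsum(rng.normal(size=(n, 5)), axis=0)   # 5 toy autocorrelated proxies
# Flat noise for most of the record, then a sharp 30-year run-up at the end
temp = np.concatenate([rng.normal(scale=0.3, size=n - 30),
                       np.linspace(0.0, 2.0, 30)])

# Score the block containing the run-up, then an interior flat block
print(block_holdout_rmse(proxies, temp, slice(n - 30, n)))
print(block_holdout_rmse(proxies, temp, slice(40, 70)))
```

The paper's argument is that errors on blocks containing sharp run-ups stay large no matter which model is used, which is what "lack power to detect a sharp increase" means operationally.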

Footnote 12:

On the other hand, perhaps our model is unable to detect the high level of and sharp run-up in recent temperatures because anthropogenic factors have, for example, caused a regime change in the relation between temperatures and proxies. While this is certainly a consistent line of reasoning, it is also fraught with peril for, once one admits the possibility of regime changes in the instrumental period, it raises the question of whether such changes exist elsewhere over the past 1,000 years. Furthermore, it implies that up to half of the already short instrumental record is corrupted by anthropogenic factors, thus undermining paleoclimatology as a statistical enterprise.

FIG 15. In-sample Backcast from Bayesian Model of Section 5. CRU Northern Hemisphere annual mean land temperature is given by the thin black line and a smoothed version is given by the thick black line. The forecast is given by the thin red line and a smoothed version is given by the thick red line. The model is fit on 1850-1998 AD.

We plot the in-sample portion of this backcast (1850-1998 AD) in Figure 15. Not surprisingly, the model tracks CRU reasonably well because it is in-sample. However, despite the fact that the backcast is both in-sample and initialized with the high true temperatures from 1999 AD and 2000 AD, it still cannot capture either the high level of or the sharp run-up in temperatures of the 1990s. It is substantially biased low. That the model cannot capture run-up even in-sample does not portend well for its ability to capture similar levels and run-ups if they exist out-of-sample.

Conclusion.

Research on multi-proxy temperature reconstructions of the earth’s temperature is now entering its second decade. While the literature is large, there has been very little collaboration with university-level, professional statisticians (Wegman et al., 2006; Wegman, 2006). Our paper is an effort to apply some modern statistical methods to these problems. While our results agree with the climate scientists’ findings in some respects, our methods of estimating model uncertainty and accuracy are in sharp disagreement.

On the one hand, we conclude unequivocally that the evidence for a “long-handled” hockey stick (where the shaft of the hockey stick extends to the year 1000 AD) is lacking in the data. The fundamental problem is that there is a limited amount of proxy data which dates back to 1000 AD; what is available is weakly predictive of global annual temperature. Our backcasting methods, which track quite closely the methods applied most recently in Mann (2008) to the same data, are unable to catch the sharp run up in temperatures recorded in the 1990s, even in-sample.

As can be seen in Figure 15, our estimate of the run up in temperature in the 1990s has a much smaller slope than the actual temperature series. Furthermore, the lower frame of Figure 18 clearly reveals that the proxy model is not at all able to track the high gradient segment. Consequently, the long flat handle of the hockey stick is best understood to be a feature of regression and less a reflection of our knowledge of the truth. Nevertheless, the temperatures of the last few decades have been relatively warm compared to many of the thousand year temperature curves sampled from the posterior distribution of our model.

Our main contribution is our efforts to seriously grapple with the uncertainty involved in paleoclimatological reconstructions. Regression of high dimensional time series is always a complex problem with many traps. In our case, the particular challenges include (i) a short sequence of training data, (ii) more predictors than observations, (iii) a very weak signal, and (iv) response and predictor variables which are both strongly autocorrelated.

The final point is particularly troublesome: since the data is not easily modeled by a simple autoregressive process it follows that the number of truly independent observations (i.e., the effective sample size) may be just too small for accurate reconstruction.
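The effective-sample-size point can be made concrete. For an AR(1)-like series with lag-1 autocorrelation ρ, a standard approximation is n_eff ≈ n(1 − ρ)/(1 + ρ). The sketch below is an illustration of that formula on invented data, not the paper's calculation.

```python
import numpy as np

def effective_sample_size(x):
    """n_eff ~= n * (1 - rho) / (1 + rho), rho = lag-1 autocorrelation."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    rho = (x[:-1] @ x[1:]) / (x @ x)
    return len(x) * (1 - rho) / (1 + rho)

rng = np.random.default_rng(3)
n = 149                                    # an 1850-1998 style record

white = rng.normal(size=n)                 # independent observations
ar = np.zeros(n)                           # strongly autocorrelated series
for t in range(1, n):
    ar[t] = 0.95 * ar[t - 1] + rng.normal()

print(effective_sample_size(white))        # close to 149
print(effective_sample_size(ar))           # a small fraction of 149
```

With ρ near 0.9, a 149-year calibration record carries only a handful of effectively independent observations, which is exactly why the authors call the short, autocorrelated training period "particularly troublesome."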

Climate scientists have greatly underestimated the uncertainty of proxy based reconstructions and hence have been overconfident in their models. We have shown that time dependence in the temperature series is sufficiently strong to permit complex sequences of random numbers to forecast out-of-sample reasonably well fairly frequently (see, for example, Figure 9). Furthermore, even proxy based models with approximately the same amount of reconstructive skill (Figures 11,12, and 13), produce strikingly dissimilar historical backcasts: some of these look like hockey sticks but most do not (Figure 14).

Natural climate variability is not well understood and is probably quite large. It is not clear that the proxies currently used to predict temperature are even predictive of it at the scale of several decades let alone over many centuries. Nonetheless, paleoclimatological reconstructions constitute only one source of evidence in the AGW debate. Our work stands entirely on the shoulders of those environmental scientists who labored untold years to assemble the vast network of natural proxies. Although we assume the reliability of their data for our purposes here, there still remains a considerable number of outstanding questions that can only be answered with a free and open inquiry and a great deal of replication.

===============================================================

Commenters on WUWT report that Tamino and Romm are deleting comments that so much as mention this paper on their blog comment forums. Their refusal to even acknowledge it tells you it has squarely hit the target, and the fat lady has sung – loudly.

(h/t to WUWT reader “thechuckr”)

Dr. Dave
August 15, 2010 12:12 pm

GeoFlynx,
You still don’t get it, do you? The purpose of this paper was not to infer anything from Mann’s data; it was to demonstrate that Mann et al. employed faulty statistical methods. They used the same (probably corrupted and cherry-picked) data that Mann used, only they applied the correct statistical analysis and got startlingly different results. Their results are irrelevant, but they have proven that Mann’s results are, at best, invalid.
——————————-
Jimbo,
I followed your link over to RC. I don’t go there often because I always feel like I need to shower after I leave. They’re not quite yet foaming at the mouth but they’re getting a little frothy around the lips.
——————————-
Smokey and James Sexton,
It’s worth it to read the comments just to read your eloquent smackdown of climate trolls. Thank you and well done, gentlemen.

James Sexton
August 15, 2010 12:20 pm

Paul K2 says:
August 15, 2010 at 10:24 am
“Please clear up some confusion on my part:
The graphs above only cover the Northern Hemisphere proxy data. The authors decided NOT to include Southern Hemisphere proxy data in their analysis. Why? How can they reach this conclusion on global annual temperatures (in their Conclusions section) without looking at global proxy data? :”
Paul, they weren’t seeking to reach a conclusion of the global annual temperatures, they were seeking to know if one could with the proxy data. It is a fine distinction, but an important one. What they were stating was the proxy data isn’t useful in that regard.

August 15, 2010 12:20 pm

TerryS: August 15, 2010 at 8:15 am
Please get your quotes right. I did not say that, I was quoting eudoxus
Mea maxima culpa. I did a cut-paste and snipped the wrong tag.
Ummmm — when I was distracted by a camel spider running across my keyboard.
Yup. Camel spider. That’s the ticket…

Patrik
August 15, 2010 12:20 pm

Mikael Pihlström>> Sounds like you reject the hypothesis of neglect (from Mann et al) and want to substitute it with nonchalance, is that correct?

RoyFOMR
August 15, 2010 12:35 pm

This post worries me. Aren’t we laying our flanks exposed to the danger of double-dipped and robust recession?
If it wasn’t bad enough to discover that the arithmetical skills of fiscal logicians, although considerably greater than their strategic judgement, still registered an F, we are now facing another crisis of confidence!
OK, I could handle the issue of upside down temperature proxies. We’re only Human after all.
I even managed to swallow those informative, albeit time consuming, interludes with Gav and Secular Animist and DoughBoyo and all the rest of the RC stalwarts. PS, guys, sorry if I didn’t mention you by name, you’re all still lovingly ‘membered.
Nope, what really stuck in my craw was that, despite the overwhelmingly over-stocked war-chests, the grateful acceptance of your findings by tax-hungry western politicians and the crusading zeal of belief-blinded journalists, you got shafted by part-time, curious, indefatigable and gifted amateurs.
And their arsenal comprised of what exactly?
Truth, scepticism, science, for sure, but when spiced and flavoured by an inherent distrust of hubristic certainty and garnished with an appreciation that a talented scientist who came up with the physical principles of the GHE, later modified his findings.
Indeed. ‘Tis chastening that his most ardent supporters conveniently ignore his more recent caveats, the inconvenience of poor data collection, the statistical prestidigitations of the most senior in the field of climate science that created the belief-driven Procrustean fit that resulted in the unprecedented HS.
Guys, you came in as big, hungry sharks and you got shredded by minnows. Your backers must be losing patience. Take care.

James Sexton
August 15, 2010 12:36 pm

Dr. Dave says:
August 15, 2010 at 12:12 pm
Thanks, I appreciate the compliment! And you’re not bad yourself. And, I agree, Smokey does a great job!
Really, it’s usually a pretty easy and fun endeavor. The trolls don’t attack the paper in a valid scientific manner. (Most probably can’t and probably haven’t read the paper.)

August 15, 2010 12:38 pm

Prediction: Mann will claim this paper has already been debunked, and is part of a fossil-fuel funded conspiracy to weaken the public’s confidence in the overwhelming consensus of credible scientists.
In short, without even needing to read it, it’s already wrong.
By the way, McShane and Wyner are about to have their past checked for any faint evidence of oil, tobacco or right-wing opinions on any subject (“right-wing” in this case being anything to the right of Trotsky). Because as everyone knows, it’s not the math that makes the paper correct, it’s the purity of heart of the person doing the math that counts.
[REPLY – It’s the vast wight-ring conspiracy. ~ Sauron]

Brad
August 15, 2010 12:43 pm

The conclusion shows that hockey stick sucks, and the data is horrendous historically. Watch out, solar science is the next to go down based on historical crap data.
“6. Conclusion. Research on multi-proxy temperature reconstructions of the earth’s temperature is now entering its second decade. While the literature is large, there has been very little collaboration with university-level, professional statisticians (Wegman et al., 2006; Wegman, 2006). Our paper is an effort to apply some modern statistical methods to these problems. While our results agree with the climate scientists’ findings in some respects, our methods of estimating model uncertainty and accuracy are in sharp disagreement.
On the one hand, we conclude unequivocally that the evidence for a “long-handled” hockey stick (where the shaft of the hockey stick extends to the year 1000 AD) is lacking in the data. The fundamental problem is that there is a limited amount of proxy data which dates back to 1000 AD; what is available is weakly predictive of global annual temperature. Our backcasting methods, which track quite closely the methods applied most recently in Mann (2008) to the same data, are unable to catch the sharp run up in temperatures recorded in the 1990s, even in-sample.”

LearDog
August 15, 2010 12:51 pm

Am really glad to see this effort – but hope that it is a beginning – not an end of the discussion.
I particularly look forward to the next steps (in light of Gavin’s comments) and Professors McShane and Wyner’s reply to the comments (‘stay tuned….’). Get ready gents – they’re going to ‘bring it’.
I would also LOVE to see a McPaper from McIntyre, McShane and McKitrick with an analysis WITHOUT bristlecones, upside-down lake sediments or Gaspe series. And thereby nail the entire enterprise.
[Reply – Try Loehle, McCulloch (2008) ~ Evan]

Anders L.
August 15, 2010 12:53 pm

The first sentence of the paper states: “Predicting historic temperatures based on tree rings, ice cores, and other natural proxies is a difficult endeavor.”
If predicting the historic temperatures is difficult, I guess it is really difficult to predict the future.

Stephen Wilde
August 15, 2010 12:55 pm

Brad said:
“Watch out, solar science is the next to go down based on historical crap data.”
Oo,er. What about Leif’s ‘reconstructions’ then ?

Jimbo
August 15, 2010 1:00 pm

Mikael Pihlström: August 15, 2010 at 1:49 am
If paleo reconstructions are universally dead (I am OK with that) they are dead for everyone. You have to forget your MWP argument too.
—————–
There are farms and tree trunks in the permafrost of Greenland.

latitude
August 15, 2010 1:03 pm

Chris H says:
August 15, 2010 at 11:58 am
It should not be forgotten that this paper is not only an indictment of Mann’s original papers, it’s also an indictment of the “peer-review” process that allowed such rubbish to be printed.
=========================================
Chris, I don’t look at the peer review process that way at all.
In my field, it’s little more than spellcheck and does the paper have merit. Does it bring up anything new, a different angle, etc.
We don’t look at peer review as a grade.
Peer review puts the paper out there, where it either gets trashed, or stands on its own merit.
I think a lot of people are confused about that.
After the paper is published, then everyone has access to it.
That’s when it’s debated, tested, run through the wringer.
If it stands, it stands until someone else comes up with something better.
If it’s proven wrong, everyone benefits from that, and someone else will publish something until they are proven wrong.
But just being reviewed and published really means nothing.

geo
August 15, 2010 1:08 pm

I’ve seen some people at Steve’s arguing this study basically leaves “the blade” intact and suggests the 1990s are still likely the hottest decade of the last millennium.
However, what that argument misses is that the power of the hockey stick was always in the handle, not in the blade. If the handle disappears into lumpiness, then the AGWers lose a major piece of their arsenal for arguing near 100% causation for CO2 in modern warming. Now they will have to admit that natural variability could play a larger role than they have been willing to admit up to now.
The 0% vs 100% argument for CO2 has always been a barren exercise for all but the hardest of the hard core on either side. Everyone else has intuitively understood reality was almost certain to be more nuanced than that. Now that the “hockey stick handle” is dead, perhaps we can get on with a more realistic argument about the real % causation for CO2 in modern warming.
The Hockey Stick handle is dead, so now the supposed non-contribution of UHI moves front and center as the biggest dragon remaining to be slain before we enter a new phase of realistic debates about CO2’s contributions to warming in the past, and the future.

Huub Bakker
August 15, 2010 1:13 pm

Having read the error analysis and seen how wide the confidence limits are, I wonder what such an analysis of the instrumental temperature record would show. After all, many large-scale adjustments seem to have been required over the years and no plot I’ve ever seen includes any confidence limits at all.

jason
August 15, 2010 1:17 pm

Another hole below the waterline. Shame I burned my lifeboat to keep warm in the unseasonal 16 degree UK summer……

John Baltutis
August 15, 2010 1:26 pm
Slabadang
August 15, 2010 1:28 pm

They sure have one extreme skill back at RealClimate!
Deleting!! They have special extra delete-button replacement kits. They are investing in a special delete robot to cut costs.

August 15, 2010 1:28 pm

Over at Tamino’s I just had this exchange about an ad hominem there against McShane and Wyner :

Commenter A – “One is disappoointed to see that some well known denialists, McShane and Wyner, have managed to scrape a paper through the peer review process which is critical of Michael Mann’s work.
Bayseianism as employed here is the last refuge of statistical scoundrels, mostly ferocious right wing neo liberals.”
John Whitman replied to Commenter A – “I am interested to hear about the track record of denialism which you say that McShane and Wyner have.”
Commenter B replied to me – “I’m smelling a rat, John. Just look at the idiotic comment about bayesian statistics.”

————–
I didn’t know what “idiotic comment about Bayesian statistics” Commenter B was talking about, but I am assuming it is the use of Bayesian modeling by McShane and Wyner in their new paper. Note Commenter A implies use of Bayesian statistics is a moral/political issue.
So, I take from these two commenters that they think the very use of Bayesian statistics on Mann’s work implies denier status for McShane and Wyner.
But I am sincerely interested in further responses to my questions about the ad hominem on McShane and Wyner at Tamino’s place. So I will try to inquire more.
John

Grumbler
August 15, 2010 1:31 pm

“August 15, 2010 at 10:31 am
duckster…
You need a theory to explain what is happening now.”
Erm, no we don’t. This paper could be our ‘black swan’.
cheers David

Evan Jones
Editor
August 15, 2010 1:43 pm

on what measurements do you base your belief that CO2 is a greenhouse gas, i.e. that its warming properties are greater than its cooling properties?
So far, not on a lot. Too many unknowns. There is behavior under lab conditions. And there has been some measurable warming. But I am guessing the trend is exaggerated by a factor of two between spurious adjustments and various site biases (UHI, microsite, TOBS, what have you).
Then there is natural recovery from the LIA and non-CO2 anthropogenic issues such as land use and particulates (i.e., “dirty snow”).
Not to mention the mysteries of how the atmosphere behaves in practice and all the oceanic and interactive variables (clouds, pressure variables, what have you).
To say nothing of radiation, which is what CO2 GH theory is all about.
Then there are all the unknown factors. Since we don’t know them, we can’t list them.
Between all that, there is still room for CO2, though not a heck of a lot. Possibly the raw effect is significant, but damped down by negative feedback.
Thank goodness we have microwave proxies for lower troposphere or we’d not only be shooting in the dark, but aiming at a raindrop while standing on a revolving platform.

Wijnand
August 15, 2010 1:45 pm

*takes another handful of popcorn*

Grumbler
August 15, 2010 1:45 pm

“Mike Roddy says:
August 15, 2010 at 7:44 am
A reader questioned my comment that the oceans have 40% less fish biomass. This is actually only a logical assumption, since it’s impossible to measure fish biomass, due to their dispersion. The study in question measures phytoplankton, which form the basis of the oceanic food chain. I should have noted that in my comment. Here is the study:
http://www.cleveland.com/world/index.ssf/2010/07/oceans_phytoplankton_drops_40.html”

You’re talking tosh mate. So you can’t measure fish mass but you can measure phytoplankton that accurately? You better tell all the fisheries authorities who seem to know exactly how many fish there are. The other fallacy is that it’s a ‘logical assumption’ that fish reduce at the same rate as the food. What if there was excess food to begin with? i.e. 40% more plankton than they needed? Also the article is about phytoplankton. Fish eat zooplankton as well and there is probably as much of that as ever. Think critically please.
cheers David.

August 15, 2010 1:46 pm

John Whitman says:
August 15, 2010 at 1:28 pm
My reading of it was that Commenter A was having a bit of fun at Tamino’s – and his followers’ – expense, and that Commenter B had got it.
Surely Commenter A was not being serious???

Ben U.
August 15, 2010 1:48 pm

Jobnls says:
August 15, 2010 at 9:32 am
“What he is really saying is quite astonishing. In his opinion you cannot simply say that the data and the analysis are crap since this would be unscientific. You have to try and find a better way of massaging the crap data in order to produce science.”
Actually it might make some sense if he were talking only about trying to progress in theoretical research. Even a bad (but non-frivolous) theory can be better than having no theory if one’s purpose is theoretical progress – one needs to start from somewhere, even if only to move to somewhere else. It’s not unusual in science to work on a theory that one knows not to be in full accordance with reality, if one has no alternative theories nearly as good. The least bad scientific theory often gets worked on. People work both inside the box and outside the box of the theory, in hopes of ending up with a better theory.
But it’s crazily wrong to cram common practice into the box of a bad scientific theory, even if the other scientific theories are worse.
The yawning fallacy is to hold that the least bad theory is automatically, by magic default, ipso facto, willy-nilly, pell-mell sufficient basis for practical action and harsh choices and, say, revolutionizing the world under grand central government controlling all means of production and making us all poor and the poor among us even poorer. This is the fallacy that the least bad scientific theory automatically gets to steer common practice into places no matter how strange or destructive.
To the contrary, one does not need to present an alternative theory in order to show that a given theory is too weak or too contrary to observations to be a basis for forcing massive changes in practice.
I suspect that we’ll hear plenty of it (which is why I’m going on at this length), of how we must act (and massively) on the basis of the lousy scientific theory because of lack of a better one, as if that were the same thing as working theoretically on a bad scientific theory for lack of a better one (work that often involves trying to improve the theory by, umm, changing it).
