Guest Post by Willis Eschenbach
As Anthony discussed here at WUWT, we have yet another effort to re-animate the long-dead “hockeystick” of Michael Mann. This time, it’s “Recent temperature extremes at high northern latitudes unprecedented in the past 600 years”, by Martin P. Tingley and Peter Huybers (paywalled), hereinafter TH2013.
Here’s their claim from the abstract.
Here, using a hierarchical Bayesian analysis of instrumental, tree-ring, ice-core and lake-sediment records, we show that the magnitude and frequency of recent warm temperature extremes at high northern latitudes are unprecedented in the past 600 years. The summers of 2005, 2007, 2010 and 2011 were warmer than those of all prior years back to 1400 (probability P > 0.95), in terms of the spatial average. The summer of 2010 was the warmest in the previous 600 years in western Russia (P > 0.99) and probably the warmest in western Greenland and the Canadian Arctic as well (P > 0.90). These and other recent extremes greatly exceed those expected from a stationary climate, but can be understood as resulting from constant space–time variability about an increased mean temperature.
Now, Steve McIntyre has found some lovely problems with their claims over at ClimateAudit. I thought I’d take a look at their lake-sediment records. Here’s the raw data itself, before any analysis:
So what’s not to like? Well, a number of things.
To start with, there’s the infamous Korttajarvi record. Steve McIntyre describes this one well:
In keeping with the total and complete stubbornness of the paleoclimate community, they use the most famous series of Mann et al 2008: the contaminated Korttajarvi sediments, the problems with which are well known in skeptic blogs and which were reported in a comment at PNAS by Ross and I at the time. The original author, Mia Tiljander, warned against use of the modern portion of this data, as the sediments had been contaminated by modern bridgebuilding and farming. Although the defects of this series as a proxy are well known to readers of “skeptical” blogs, peer reviewers at Nature were obviously untroubled by the inclusion of this proxy in a temperature reconstruction.
Let me stop here a moment and talk about lake proxies. Down at the bottom of most every lake, a new layer of sediment is laid down every year. This sediment contains a very informative mix of whatever was washed into the lake during a given year. You can identify the changes in the local vegetation, for example, by changes in the plant pollens that are laid down as part of the sediment. There’s a lot of information that can be mined from the mud at the bottom of lakes.
One piece of information we can look at is the rate at which the sediment accumulates. This is called “varve thickness”, with a “varve” meaning a pair of thin layers of sediment, one for summer and one for winter, that comprise a single year’s sediment. Obviously, this thickness can vary quite a bit. And in some cases, it’s correlated in some sense with temperature.
However, in one important way lake proxies are unlike, say, ice core proxies. The daily activities of human beings don’t change the thickness of the layers of ice that get laid down. But everything from road construction to changes in farming methods can radically change the amount of sediment in the local watercourses and lakes. That’s the problem with Korttajarvi.
And in addition, changes in the surrounding natural landscape can also change the sediment levels. Many things, from burning of local vegetation to insect infestation to changes in local water flow can radically change the amount of sediment in a particular part of a particular lake.
Look, for example, at the Soper data in Figure 1. It is more than obvious that we are looking at some significant changes in the sedimentation rate during the first half of the 20th Century. After four centuries of one regime, something happened. We don’t know what, but it seems doubtful a gradual change in temperature would cause a sudden step change in the amount of sediment combined with a change in variability.
Now, let me stop right here and say that, ignoring the obvious madness of including Korttajarvi, the inclusion of this proxy alone should totally disqualify the whole paper. There is no justification for claiming that it is temperature related. Yes, I know it gets log transformed further on in the story, but get real. This is not a representation of temperature.
But Korttajarvi and Soper are not the only problem. Look at Iceberg, three separate records. It’s like one of those second grade quizzes: “Which of these three records is unlike the other two?” How can that possibly be considered a valid proxy?
How does one end up with this kind of garbage? Here’s the authors’ explanation:
All varve thickness records publicly available from the NOAA Paleolimnology Data Archive as of January 2012 are incorporated, provided they meet the following criteria:
• extend back at least 200 years,
• are at annual resolution,
• are reported in length units, and
• the original publication or other references indicate or argue for a positive association with summer temperature.
Well, that all sounds good, but these guys are so classic … take a look at Devon Lake in Figure 1, it’s DV09. Notice how far back it goes? 1843, which is 170 years ago … so much for their 200-year criterion.
Want to know the funny part? I might never have noticed, but when I read the criteria, I thought “Why a 200-year criterion?” It struck me as special pleading, so I looked more closely at the only one it applied to and said huh? Didn’t look like 200 years. So I checked the data here … 1843, not 200 years ago, only 170.
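As a minimal sketch of that check: the 1843 start year for DV09 is the figure quoted above, while the reference year of 2013 and the function name are assumptions made only for illustration.

```python
def extends_back(first_year, reference_year, min_years=200):
    """Return True if a record starting in first_year covers at least
    min_years counted back from reference_year."""
    return (reference_year - first_year) >= min_years


# DV09 starts in 1843 (from the post); 2013 is an assumed reference year.
span = 2013 - 1843
print(span, extends_back(1843, 2013))
```

Using the archive date of January 2012 as the reference year instead would make the span a year shorter, so the record still falls short of 200 years either way.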
Man, the more I look, the more I find. In that regard, both Sawtooth and Murray have little short separate portions at the end of their main data. Perhaps by chance, both of them will add to whatever spurious hockeystick has been formed by Korttajarvi and Soper and the main players.
So that’s the first look, at the raw data. Now, let’s follow what they actually do with the data. From the paper:
As is common, varve thicknesses are logarithmically transformed before analysis, giving distributions that are more nearly normally distributed and in agreement with the assumptions characterizing our analysis (see subsequent section).
I’m not entirely at ease with this log transformation. I don’t understand the underlying justification or logic for doing that. If the varve thickness is proportional in some way to temperature, and it may well be, why would temperature be proportional to the logarithm of the thickness?
In any case, let’s see how much “more nearly normally distributed” we’re talking about. Here are the distributions of the same records, after log transformation and standardization. I use a “violin plot” to examine the shape of a distribution. The width at any point indicates the smoothed number of data points with that value. The white dot shows the median value of the data. The black box shows the interquartile range, which contains half of the data. The vertical “whiskers” extend 1.5 times the interquartile distance at top and bottom of the black box.
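For readers who want the numbers behind the box and whiskers, here is a rough sketch of how they can be computed with NumPy. The function name and the toy input are hypothetical, and the whisker ends are taken as plain 1.5 × IQR fences; some plotting packages instead clip the whiskers to the most extreme data point inside those fences.

```python
import numpy as np


def box_summary(values):
    """Median, quartiles, IQR and 1.5*IQR fences for a 1-D dataset."""
    values = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1  # the black box spans q1..q3
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "lower_fence": q1 - 1.5 * iqr,  # bottom whisker limit
        "upper_fence": q3 + 1.5 * iqr,  # top whisker limit
    }


# Toy input, not varve data.
summary = box_summary([1, 2, 3, 4, 5])
print(summary)
```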
Note the very large variation between the different varve thickness datasets. You can see the problems with the Soper dataset. Some datasets have a fairly normal distribution after the log transform, like Big Round and Donard. Others, like DV09 and Soper, are far from normal in distribution even after transformation. Many of them are strongly asymmetrical, with excursions of four standard deviations being common in the positive direction. By contrast, often they only vary by half of that in the negative direction, two standard deviations. When the underlying dataset is that far from normal, it’s always a good reason for further investigation in my world. And if you are going to include them, the differences in which way they swing from normal (excess positive over negative excursions) affect both the results and their uncertainty.
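One hedged way to put a number on that asymmetry is to compare the largest positive and negative excursions in standard deviation units, along with the sample skewness. The sketch below uses a synthetic right-skewed sample, not any of the TH2013 records, and the function name is hypothetical.

```python
import numpy as np


def excursion_asymmetry(values):
    """Largest positive and negative excursions (in standard deviations)
    and the sample skewness of a 1-D dataset."""
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std()  # standardize
    return {
        "max_positive_sd": z.max(),
        "max_negative_sd": -z.min(),
        "skewness": np.mean(z ** 3),  # zero for a symmetric sample
    }


# Synthetic right-skewed sample (an assumption, for illustration only).
rng = np.random.default_rng(0)
skewed = excursion_asymmetry(rng.exponential(size=500))
symmetric = excursion_asymmetry([-1.0, 0.0, 1.0])
print(skewed)
print(symmetric)
```

A roughly normal record should give similar positive and negative excursions and a skewness near zero; a record with a long positive tail will not.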
In any case, after the log transformation and standardization to a mean of zero and a standard deviation of one, the datasets and their average are shown in Figure 3.
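The transformation itself is simple enough to sketch. This is only an illustration of a log transform followed by standardization, not the authors’ code: the thickness values are made up, the natural log is assumed, and the population standard deviation (NumPy’s default) is assumed for the scaling.

```python
import numpy as np


def log_standardize(thickness):
    """Log-transform positive thickness values, then scale the result
    to a mean of zero and a standard deviation of one."""
    thickness = np.asarray(thickness, dtype=float)
    logged = np.log(thickness)  # requires strictly positive values
    return (logged - logged.mean()) / logged.std()


# Hypothetical thicknesses that double at each step.
example = log_standardize([0.5, 1.0, 2.0, 4.0, 8.0])
print(example)
```

Note what the log does here: each doubling of thickness becomes an equal step, so the transform treats proportional changes, rather than absolute changes, as equivalent.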
As you can see, the log transform doesn’t change the problems with e.g. the Soper or the Iceberg records. They still do not have internal consistency. As a result of the inclusion of these problematic records, all of which contain visible irregularities in the recent data, even a simple average shows an entirely spurious hockeystick.
In fact, the average shows a typical shape for this kind of spurious hockeystick. In the “shaft” part of the hockeystick, the random variations in the chosen proxies tend to cancel each other out. Then in the “blade”, the random proxies still cancel each other out, and all that’s left are the few proxies that show rises in the most recent section.
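That mechanism can be illustrated with a small simulation. Everything in it is an assumption chosen for the illustration: the proxies are pure white noise, there are twelve of them, and three are given an artificial ramp over the final 50 years. None of it is TH2013 data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_years, n_proxies, n_rising, blade_len = 600, 12, 3, 50  # arbitrary choices

# Every synthetic proxy is white noise with no signal at all.
proxies = rng.standard_normal((n_proxies, n_years))

# A few proxies get an artificial rise in the most recent section.
ramp = np.linspace(0.0, 3.0, blade_len)
proxies[:n_rising, -blade_len:] += ramp

# Simple average across proxies, year by year.
average = proxies.mean(axis=0)
shaft_mean = average[:-blade_len].mean()
blade_mean = average[-blade_len:].mean()
print(round(shaft_mean, 2), round(blade_mean, 2))
```

In the shaft the noise averages toward zero. In the blade the noise still averages toward zero, but the ramps in the three altered series do not, so the average turns upward even though most of the series contain nothing at all.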
My conclusions, in no particular order, are:
• The authors are to be congratulated for being clear about the sources of their data. It makes for easy analysis of their work.
• They are also to be congratulated for the clear statement of the criteria for inclusion of the proxies.
• Sadly, they did not follow their own criteria.
• The main conclusion, however, is that clear, bright-line criteria of the type that they used are a necessary but not sufficient part of the process. There are more steps that need to be followed.
The second step is the use of the source documents and the literature to see if there are problems with using some parts of the data. For them to include Korttajarvi is a particularly egregious oversight. Michael Mann used it upside-down in his 2008 analysis. He subsequently argued it “didn’t matter”. It is used upside-down again here, and the original investigators said don’t use it after 1750 or so. It is absolutely pathetic that after all of the discussion in the literature and on the web, including a published letter to PNAS, once again Korttajarvi is being used in a proxy reconstruction, and once again it is being used upside-down. That’s inexcusable.
The third part of the proxy selection process is the use of the Mark I eyeball to see if there are gaps, jumps in amplitude, changes in variability, or other signs of problems with the data.
The next part is to investigate the effect of the questionable data on the final result.
And the final part is to discuss the reasons for the inclusion or the exclusion of the questionable data, and its effects on the outcome of the study.
Unfortunately, they only did the first part, establishing the bright-line criteria.
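The fourth step, checking how much the questionable records move the final result, can be sketched as a simple leave-out comparison. The record names, the synthetic data, and the `recent_rise` helper are all hypothetical; a real check would use the actual proxy series and the study’s own reconstruction method in place of a plain average.

```python
import numpy as np


def recent_rise(records, recent_len, exclude=()):
    """Mean of the proxy average over the last recent_len years minus its
    mean over the earlier years, skipping any records named in exclude."""
    kept = [series for name, series in records.items() if name not in exclude]
    average = np.mean(kept, axis=0)
    return average[-recent_len:].mean() - average[:-recent_len].mean()


# Synthetic records: ten noise-only series plus two with a late step change.
rng = np.random.default_rng(1)
records = {f"proxy_{i}": rng.standard_normal(600) for i in range(10)}
for name in ("flagged_a", "flagged_b"):
    series = rng.standard_normal(600)
    series[-50:] += 3.0  # artificial jump in the final 50 years
    records[name] = series

with_all = recent_rise(records, 50)
without_flagged = recent_rise(records, 50, exclude={"flagged_a", "flagged_b"})
print(round(with_all, 2), round(without_flagged, 2))
```

If the recent rise largely disappears when the flagged records are left out, the result is being driven by those records, and that is exactly what a paper should report and discuss.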
Look, you can’t just grab a bunch of proxies and average them, no matter if you use Bayesian methods or not. The paleoproxy crowd has shown over and over that you can artfully construct a hockeystick by doing that, just pick the right proxies …
So what? All that proves is yes indeed, if you put garbage in, you will assuredly get garbage out. If you are careful when you pack the proxy selection process, you can get any results you want.
Man, I’m tired of rooting through this kind of garbage, faux studies by faux scientists.