Guest Post by Willis Eschenbach
The discussion of the 1998 Mann “Hockeystick” seems like it will never die. (The “Hockeystick” was Dr. Michael Mann’s famous graph showing flatline historical temperatures followed by a huge modern rise.) Claims of the Hockeystick’s veracity continue apace, with people doggedly wanting to believe that the results are “robust”. I thought I’d revisit something I first posted and then expanded on at ClimateAudit a few years ago: an analysis of the proxies in Michael Mann et al.’s 2008 paper, “Proxy-based reconstructions of hemispheric and global surface temperature variations over the past two millennia” (M2008). This was another salvo in Mann’s unending attempt to revive his fatally flawed 1998 “Hockeystick” paper. I used what is called “Cluster Analysis” to look at the proxies. Cluster analysis creates a tree-shaped structure called a “dendrogram” that shows the similarity between the individual datasets involved. Figure 1 shows the dendrogram of the 95 full-length proxies used in the M2008 study:
Figure 1. Cluster Dendrogram of the 95 proxies in the Mann 2008 dataset which extend from the year 1001 to 1980. The closer together two proxies are in the dendrogram, the more similar they are. Absolute similarity is indicated by the left-right position of the fork connecting two datasets. The names give the dataset abbreviation as used by Mann2008, the type (e.g. tree ring, ice core), the location as lat/long, the name of the principal investigator, and, for tree rings, the species abbreviation (e.g. PIBA, PILO).
What can we learn from this dendrogram showing the results of the cluster analysis of the Mann 2008 proxies?
First let me start by describing how the dendrogram is made. The program compares all possible pairs of proxies, and measures their similarity. It selects the most similar pair, and draws a “fork” that connects the two.
Take a look at the “forks” in the dendrogram. The further to the left the fork occurs, the more similar the pair it connects. The two most similar proxy datasets in the whole bunch are the ones whose fork is furthest to the left. They turn out to be the Tiljander “lightsum” and “thicknessmm” datasets.
Once these two are identified, they are then averaged. The individual proxy datasets are replaced by the average of the two. Then the procedure is repeated. This time it compares all possible remaining pairs, including the average of the first two as a single dataset. Again the most similar pair is selected, marked with a “fork” (slightly to the right of the first fork), and averaged. In the dataset above, the most similar pair is again among the Tiljander proxies. In this case, the pair consists of the “darksum” proxy on the one hand, and the average of the two Tiljander proxies from the first step on the other hand. These two are then removed and replaced with their average.
This procedure is repeated over and over again, until all of the available proxies have been averaged together and added to the dendrogram and it is complete.
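For anyone who wants to try the procedure for themselves, here is a minimal sketch of the pair-and-average steps described above. It rests on assumptions: similarity is measured here as Pearson correlation (plotted as a distance of one minus the correlation), which may not be the measure used for Figure 1; the function names `correlation` and `build_dendrogram` are made up for illustration; and the four short series are invented toy numbers, not proxy data.

```python
from itertools import combinations
from math import sqrt

def correlation(a, b):
    """Pearson correlation between two equal-length, non-constant series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def build_dendrogram(series):
    """Repeatedly merge the most similar pair, replacing it with its average.

    `series` maps a name to a list of values. Returns the merges in order
    as (name_a, name_b, distance) tuples, where distance = 1 - correlation
    (an assumed similarity measure).
    """
    active = dict(series)
    merges = []
    while len(active) > 1:
        # Compare all possible remaining pairs and pick the most similar.
        a, b = min(
            combinations(active, 2),
            key=lambda pair: 1 - correlation(active[pair[0]], active[pair[1]]),
        )
        dist = 1 - correlation(active[a], active[b])
        # Remove the two datasets and replace them with their average.
        merged = [(x + y) / 2 for x, y in zip(active.pop(a), active.pop(b))]
        active[f"({a}+{b})"] = merged
        merges.append((a, b, dist))
    return merges

# Invented toy series, for illustration only.
toy = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [1.1, 2.1, 2.9, 4.2, 5.0],
    "c": [1.0, 2.5, 2.5, 4.0, 5.5],
    "d": [3.0, 1.0, 4.0, 1.0, 5.0],
}
merges = build_dendrogram(toy)
for name_a, name_b, dist in merges:
    print(f"fork joins {name_a} and {name_b} at distance {dist:.4f}")
```

Each printed line corresponds to one fork in a dendrogram, with the distance giving its left-right position. In the toy data the two near-identical series pair up first, the third series then joins their average, and the odd one out is added last, much as the bristlecones are in Figure 1.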
In this case, the clustering is clearly not random. In general a cluster is composed of measurements of similar things in a single geographical area (e.g. Argentinian Cypress tree rings). In addition, the proxies tend to cluster by proxy type (e.g. speleothems and sediments vs. tree rings).
Next, the dendrogram can be read from the bottom up to show which groups of proxies are most dissimilar to the others. The more outlying and unusual a group is, the nearer it is to the top of the dendrogram.
Next, note that many of the groups of proxies are much more similar to each other than they are to any of the other proxies. In particular the bristlecone “stripbark pines” end up right at the top of the dendrogram, because they are the most atypical group of the lot. Only when there is absolutely no other choice are the bristlecones joined to the rest of the dendrogram.
So how does this type of analysis clarify whether the “Hockeystick” is real? The question at issue all along has been, is the “hockeystick” shape something that can be seen in a majority of the proxies, or is it limited to a few proxies? This is usually phrased as whether the results are “robust” to the removal of subsets of the proxies. And as usual in climate science, there are several backstories to this question of “robustness”.
The first backstory on this question is that well prior to this study, the National Research Council (2006) “Surface Temperature Reconstructions for the Last 2,000 Years” recommended that the bristlecone and related “stripbark” pines not be used in paleotemperature reconstructions. This recommendation had also been made previously by other experts in the field. The problem for Mann, of course, is that the hockeystick signal doesn’t show up much when one leaves out the bristlecones. So like a junkie unable to resist going back for one last fix, Dr. Mann and his adherents have found it almost impossible to give up the bristlecones.
The next backstory is that a number of the bristlecone proxy records collected by Graybill have failed replication, as shown by the Ababneh Thesis. Not only that, but one of the authors of M2008 (Malcolm Hughes) had to have known that, because he was on her PhD committee … so the M2008 study used proxies that were not only not recommended for use, but proxies not recommended for use that they knew had failed replication. Bad scientists, no cookies.
The final backstory is that the Tiljander proxies a) were said by the original authors to be hopelessly compromised in recent times, and those authors advised against their use as temperature proxies, and b) were used upside-down by Mann (what he called warming, the proxies actually showed as cooling).
With all of that as prologue, Figure 2 shows the average signals of the clusters of normalized proxies (averaged after each proxy is normalized to an average of zero and a standard deviation of one). See if you can tell where the Hockeystick shaped signal is located …
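The normalize-then-average step can be sketched in a few lines. This is an illustration under stated assumptions, not the code behind Figure 2: the function names `normalize` and `cluster_average` are hypothetical, the population standard deviation is an assumed choice (the sample version would serve equally well), and the two short series are invented numbers.

```python
from statistics import mean, pstdev

def normalize(series):
    """Rescale a series to an average of zero and a standard deviation of one.

    Uses the population standard deviation (an assumed choice).
    """
    m, s = mean(series), pstdev(series)
    return [(x - m) / s for x in series]

def cluster_average(cluster):
    """Normalize every series in a cluster, then average them year by year."""
    normalized = [normalize(s) for s in cluster]
    return [mean(values) for values in zip(*normalized)]

# Invented toy cluster: two series on very different scales.
toy_cluster = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 30.0, 20.0, 40.0],
]
signal = cluster_average(toy_cluster)
print([round(v, 3) for v in signal])
```

Normalizing first means a proxy measured in millimetres and one measured in isotope ratios carry equal weight in the cluster average, so no single series dominates simply because its numbers are bigger.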
You can see the problems with the various Tiljander series, which are obviously contaminated … they go off the charts in the latter part of the record. In addition, if the Tiljander data were real it would be saying record cold, not record hot, but the computational method of Mann et al. flipped it over.
The reason for the unending addiction of Mann and his adherents to certain groups of proxies becomes obvious in this analysis. The hockeystick shape is entirely contained in a few clusters: the Graybill bristlecones and related stripbark species, the upside-down Tiljander proxies, and a few Asian tree ring records. The speleothems and lake sediments tell a very different story, one of falling temperatures … and in most of the clusters, there’s not much of a common signal at all. Which is why the attempts to rescue the original 1998 “hockeystick” have re-used, and refuse to stop re-using, those few proxies, proxies which are known to be unsuitable for use in paleotemperature reconstructions. They refuse to stop recycling them for a simple reason … you can’t make hockeysticks without those few proxies.
To sum up. Is the mining of “hockeystick” shaped climate reconstructions from this dataset a “robust” finding?
Not for me, not one bit. While you can get a hockeystick if you waterboard this data long enough, the result is a chimera, a false result of improper analysis. The hockeystick shaped signal is far too localized, and occurs in far too few of the clusters, to call it “robust” in any sense of the word.
PS – The entire saga of the Ababneh Thesis, along with lots and lots of other interesting information, is available over at ClimateAudit. People who want to improve their knowledge about things like the proxy records and the Climategate FOI requests and the whole climate saga should certainly do their homework at ClimateAudit first … because in the marvelous world of Climate Science, things are rarely what they seem.
[UPDATE] Some commenters asked for the data, my apologies for not providing it. It is located at the NOAA Paleoclimate repository here.