Guest Post by Willis Eschenbach
The discussion of the 1998 Mann “Hockeystick” seems like it will never die. (The “Hockeystick” was Dr. Michael Mann’s famous graph showing flatline historical temperatures followed by a huge modern rise.) Claims of the Hockeystick’s veracity continue apace, with people doggedly wanting to believe that the results are “robust”. I thought I’d revisit something I first posted and then expanded on at ClimateAudit a few years ago, namely an analysis of the proxies in Michael Mann et al.’s 2008 paper, “Proxy-based reconstructions of hemispheric and global surface temperature variations over the past two millennia” (M2008). This was another salvo in Mann’s unending attempt to revive his fatally flawed 1998 “Hockeystick” paper. I used what is called “Cluster Analysis” to look at the proxies. Cluster analysis creates a tree-shaped structure called a “dendrogram” that shows the similarity between the individual datasets involved. Figure 1 shows the dendrogram of the 95 full-length proxies used in the M2008 study:
Figure 1. Cluster Dendrogram of the 95 proxies in the Mann 2008 dataset which extend from the year 1001 to 1980. The closer together two proxies are in the dendrogram, the more similar they are. Absolute similarity is indicated by the left-right position of the fork connecting two datasets. The names give the dataset abbreviation as used by Mann2008, the type (e.g. tree ring, ice core), the location as lat/long, the name of the principal investigator, and if tree rings the species abbreviation (e.g. PIBA, PILO).
What can we learn from this dendrogram showing the results of the cluster analysis of the Mann 2008 proxies?
First let me start by describing how the dendrogram is made. The program compares all possible pairs of proxies, and measures their similarity. It selects the most similar pair, and draws a “fork” that connects the two.
Take a look at the “forks” in the dendrogram. The further to the left the fork occurs, the more similar are the pairs. The two most similar proxy datasets in the whole bunch are ones that are furthest to the left. They turn out to be the Tiljander “lightsum” and “thicknessmm” datasets.
Once these two are identified, they are then averaged. The individual proxy datasets are replaced by the average of the two. Then the procedure is repeated. This time it compares all possible remaining pairs, including the average of the first two as a single dataset. Again the most similar pair is selected, marked with a “fork” (slightly to the right of the first fork), and averaged. In the dataset above, the most similar pair is again among the Tiljander proxies. In this case, the pair consists of the “darksum” proxy on the one hand, and the average of the two Tiljander proxies from the first step on the other hand. These two are then removed and replaced with their average.
This procedure is repeated over and over again, until all of the available proxies have been averaged together and added to the dendrogram and it is complete.
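For readers who want to try the procedure, here is a minimal sketch of the steps just described: compare all pairs, fork the most similar pair, replace it with its average, and repeat. It is written in Python rather than the R used for the figures, it is not the code that produced Figure 1, and the function names and toy series are made up for illustration. The distance used is 1 minus the absolute correlation.

```python
import numpy as np

def standardize(x):
    """Rescale a series to mean zero and standard deviation one."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def distance(a, b):
    """Dissimilarity of two series: 1 - |correlation|."""
    return 1.0 - abs(np.corrcoef(a, b)[0, 1])

def cluster_by_averaging(series):
    """Repeatedly merge the most similar pair and replace it with its average.

    `series` maps a name to a 1-D array. Returns the merges as
    (name_a, name_b, distance) tuples in the order they were made.
    """
    active = {name: standardize(values) for name, values in series.items()}
    merges = []
    while len(active) > 1:
        names = list(active)
        # Compare all possible remaining pairs and keep the closest one.
        best = None
        for i in range(len(names)):
            for j in range(i + 1, len(names)):
                d = distance(active[names[i]], active[names[j]])
                if best is None or d < best[2]:
                    best = (names[i], names[j], d)
        a, b, d = best
        merges.append(best)
        # Remove the two members and put their average in their place.
        merged = (active.pop(a) + active.pop(b)) / 2.0
        active["(" + a + "+" + b + ")"] = merged
    return merges

# Hypothetical toy series, not proxy data.
t = np.arange(50)
toy = {
    "ramp_a": t + 0.0,
    "ramp_b": 2.0 * t + np.sin(t),  # nearly the same shape as ramp_a
    "wave": np.sin(t / 3.0),
}
merges = cluster_by_averaging(toy)
```

Each recorded distance corresponds to the left-right position of a fork: the two ramps are joined first at a very small distance, and the wave is joined to their average last.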
In this case, the clustering is clearly not random. In general a cluster is composed of measurements of similar things in a single geographical area (e.g. Argentinian Cypress tree rings). In addition, the proxies tend to cluster by proxy type (e.g. speleothems and sediments vs. tree rings).
Next, the dendrogram can be read from the bottom up to show which groups of proxies are most dissimilar to the others. The more outlying and unusual a group is, the nearer it is to the top of the dendrogram.
Next, note that many of the groups of proxies are much more similar to each other than they are to any of the other proxies. In particular the bristlecone “stripbark pines” end up right at the top of the dendrogram, because they are the most atypical group of the lot. The bristlecones at the top are added to the dendrogram only when there is absolutely no other choice.
So how does this type of analysis clarify whether the “Hockeystick” is real? The question at issue all along has been, is the “hockeystick” shape something that can be seen in a majority of the proxies, or is it limited to a few proxies? This is usually phrased as whether the results are “robust” to the removal of subsets of the proxies. And as usual in climate science, there are several backstories to this question of “robustness”.
The first backstory on this question is that well prior to this study, the National Research Council (2006) “Surface Temperature Reconstructions for the Last 2,000 Years” recommended that the bristlecone and related “stripbark” pines not be used in paleotemperature reconstructions. This recommendation had also been made previously by other experts in the field. The problem for Mann, of course, is that the hockeystick signal doesn’t show up much when one leaves out the bristlecones. So like a junkie unable to resist going back for one last fix, Dr. Mann and his adherents have found it almost impossible to give up the bristlecones.
The next backstory is that a number of the bristlecone proxy records collected by Graybill have failed replication, as shown by the Ababneh Thesis. Not only that, but one of the authors of M2008 (Malcolm Hughes) had to have known that, because he was on her PhD committee … so the M2008 study used proxies that were not only not recommended for use, but proxies not recommended for use that they knew had failed replication. Bad scientists, no cookies.
The final backstory is that the Tiljander proxies a) were said by the original authors to be hopelessly compromised in recent times, and those authors advised against their use as temperature proxies, and b) were used upside-down by Mann (what he called warming, the proxies actually showed as cooling).
With all of that as prologue, Figure 2 shows the average signals of the clusters of normalized proxies (averaged after each proxy is normalized to an average of zero and a standard deviation of one). See if you can tell where the Hockeystick shaped signal is located …
Figure 2. Left column shows average signals of the clusters of proxies shown in Figure 1, from the year 1001 to 1980. Averages are of the cluster to which each is connected by a short black line.
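The normalization and averaging behind each small graph can be sketched as follows. This is an illustration in Python with made-up numbers, not the R code used for Figure 2; the function names and the layout (one proxy per column, one year per row) are assumptions.

```python
import numpy as np

def standardize_columns(data):
    """Give every column (one proxy per column) mean 0 and std dev 1."""
    data = np.asarray(data, dtype=float)
    return (data - data.mean(axis=0)) / data.std(axis=0)

def cluster_average(data, members):
    """Average the standardized proxies listed in `members`, year by year."""
    z = standardize_columns(data)
    return z[:, members].mean(axis=1)

# Hypothetical toy data: 5 "years" (rows) by 3 "proxies" (columns),
# deliberately recorded in very different units.
toy = np.array([
    [1.0, 100.0, 0.5],
    [2.0, 300.0, 0.4],
    [3.0, 200.0, 0.3],
    [4.0, 500.0, 0.2],
    [5.0, 400.0, 0.1],
])
z = standardize_columns(toy)
signal = cluster_average(toy, [0, 1])  # average of the first two proxies
```

After standardizing, every series is in units of its own standard deviation, so proxies measured in millimetres, isotope ratios or ring widths can be averaged directly.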
You can see the problems with the various Tiljander series, which are obviously contaminated … they go off the charts in the latter part of the record. In addition, if the Tiljander data were real it would be saying record cold, not record hot, but the computational method of Mann et al. flipped it over.
The reason for the unending addiction of Mann and his adherents to certain groups of proxies becomes obvious in this analysis. The hockeystick shape is entirely contained in a few clusters: the Graybill bristlecones and related stripbark species, the upside-down Tiljander proxies, and a few Asian tree ring records. The speleothems and lake sediments tell a very different story, one of falling temperatures … and in most of the clusters, there’s not much of a common signal at all. Which is why the attempts to rescue the original 1998 “hockeystick” have re-used and refuse to stop re-using those few proxies, proxies which are known to be unsuitable for use in paleotemperature reconstructions. They refuse to stop recycling them for a simple reason … you can’t make hockeysticks without those few proxies.
To sum up. Is the mining of “hockeystick” shaped climate reconstructions from this dataset a “robust” finding?
Not for me, not one bit. While you can get a hockeystick if you waterboard this data long enough, the result is a chimera, a false result of improper analysis. The hockeystick shaped signal is far too localized, and occurs in far too few of the clusters, to call it “robust” in any sense of the word.
w.
PS – The entire saga of the Ababneh Thesis, along with lots and lots of other interesting information, is available over at ClimateAudit. People who want to improve their knowledge about things like the proxy records and the Climategate FOI requests and the whole climate saga should certainly do their homework at ClimateAudit first … because in the marvelous world of Climate Science, things are rarely what they seem.
[UPDATE] Some commenters asked for the data, my apologies for not providing it. It is located at the NOAA Paleoclimate repository here.
DesertYote says:
May 30, 2011 at 1:46 pm
You’re welcome. I did the analysis in R, based on the R code provided by a commenter on my earlier thread cited in the opening of the head post.
w.
MrX says:
May 30, 2011 at 2:11 pm
Thanks for asking. I’ve put an update at the end of the head post with a link to the data.
It’s annual data 1001-1980. To compare them I standardized them (set their mean to zero and standard deviation to one). Then you can compare them and average them directly.
The scale is in units of standard deviation. I created the whole graphic of Figure 2 as kind of a kludge, because I don’t speak “R” (the computer language I use) very well. So I made up the dendrogram and each of the individual graphs separately in R, and then screenshot each one. I then assembled them in my graphics program of choice (VectorWorks) and added the lines from each graph to the corresponding cluster and the like.
w.
It’s very inappropriate to associate waterboarding with such tortured data. Ask William F. Buckley what torture is.
What amazes me most in debating AGW enthusiasts is something they don’t seem to comprehend that cannot really be forcefully argued since it’s basically common sense and pure logic: that if a statistical mash up of a bunch of, say, tide gauge records shows a recent surge then that result is meaningless if it relies on a few outliers (or lots of records too short to contain historical trend information) when in fact the vast majority of tide gauge records fail to show any trend change at all, worldwide. If the claimed effect is a sudden global surge in T and sea level, then it’s amazing how they can glibly assert that “it’s global not local, you idiot!” when I ask them why such a huge surge in two variables doesn’t show up in long single-site records scattered all over the place.
I believe they simply can’t think clearly.
It’s also amazing how they keep pushing hockey sticks while pompously asserting that commenters cannot possibly provide proxy studies that show a hotter MWP. When I link to a hundred or so of them, they twirl back around and claim they are all just “local” even though they are from all over the globe (http://www.co2science.org/data/timemap/mwpmap.html).
In Cook’s book on “denial” he bastardizes the term “cherry picking” to describe the nefarious practice of harping on research papers that fail to support AGW. He writes: “They cherry-pick one contradictory study and they promote it relentlessly.” In other words he has turned Feynman era scientific rigor into something to mock.
He also writes: “Just because there a professor of something denying climate change does not mean it is not true, it just that the professor is in denial. This is why one must make use of the preponderance of evidence in science, the collective view.”
So here we have a group of very vocal AGW supporters who downright reject the very idea that a single result can topple a whole theory. They really do believe that science works by paper counting and academy proclamations rather than according to ideas. Thus there is no reasoning with them using specific results. They describe numerous falsifications of CAGW as merely being “some inconsistencies”.
They are proudly promoting a false view of the very nature of science itself. I wonder how this factor could be better exposed as being a form of corruption? What started for me as climate homework has turned into a psychology puzzle.
-=NikFromNYC=- Ph.D. in Chemistry (Columbia/Harvard)
Rob R says:
May 30, 2011 at 2:49 pm
Dunno, but Malcolm Hughes and the rest of her committee signed off on her thesis in July of 2006.
w.
Tilo Reber says:
May 30, 2011 at 4:36 pm
Hasn’t given one as far as I know.
Of course, Steve McIntyre, who else. More sleight of hand going on there as well.
w.
johnnythelowery says:
May 30, 2011 at 7:05 pm
Y’know, Johnny, I had a lovely afternoon. My sweet wife Ellie told me she wanted to take me somewhere as a surprise. We got in the car, and she drove me to Armstrong Woods Redwood State Park. Despite living in the redwoods and also living nearby, I’d never been there. We went, and it was a living cathedral, a symphony of green. Awesome.
Then we drove over to her dad’s place. Billy’s 82 and effectively blind. I did some work around his place while Ellie read his bills and letters to him and wrote checks. Then we all got in the car and drove about four blocks to the nursing home where Ellie’s mom is. She’s a sweet lady, half her body paralyzed from a stroke. Ellie rubbed some cream on her face. Billy and I stood around and cracked jokes. All in all, it was a lovely afternoon.
Then we came home, and what do I find? … someone’s upset because I haven’t answered a question yet.
In any case, the answer is that the distance measure I used was (1 – absolute(covariance)). Or you could use (1 – absolute(correlation)) and get the same answer. Or 1 – r^2, for that matter.
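A quick way to check that the three measures agree is the sketch below, in Python with made-up series. It assumes the series are standardized first and that the covariance and the standard deviation use the same denominator (here, n). Under those assumptions the covariance equals the correlation, and since 1 – r^2 falls as |r| rises, it ranks the pairs in the same order.

```python
import numpy as np

def standardize(x):
    """Rescale a series to mean zero and standard deviation one."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def cov(x, y):
    """Population covariance (divide by n), matching np.std's default."""
    return np.mean((x - x.mean()) * (y - y.mean()))

def corr(x, y):
    """Pearson correlation coefficient."""
    return np.corrcoef(x, y)[0, 1]

def rank(distances):
    """Pair names ordered from most similar to least similar."""
    return sorted(distances, key=distances.get)

# Hypothetical toy series, not proxy data.
t = np.arange(40, dtype=float)
a = standardize(t)
b = standardize(t + 5.0 * np.sin(t))
c = standardize(np.cos(t / 4.0))

pairs = {"a-b": (a, b), "a-c": (a, c), "b-c": (b, c)}
d_cov = {k: 1 - abs(cov(x, y)) for k, (x, y) in pairs.items()}
d_cor = {k: 1 - abs(corr(x, y)) for k, (x, y) in pairs.items()}
d_r2 = {k: 1 - corr(x, y) ** 2 for k, (x, y) in pairs.items()}
```

The first two distances are numerically identical and the third gives the same ordering, so the same pair is picked as “most similar” at each step. The fork positions differ under 1 – r^2, and with some linkage rules in standard clustering routines the tree itself could differ as well.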
w.
Willis,
I can’t describe how useful a non-expert like me has found your posting, especially figure 2. It brings so many things together that so far hadn’t quite gelled in my mind. This post has gone into my bookmarks.
The Asian tree rings include Yamal and Jacoby’s Mongolia series, both used over and over. Briffa’s Yamal series is, oddly, included in the series labeled “Tornetrask”, which isn’t Tornetrask but an average of Tornetrask, Yamal and Taymir – something that is impossible to guess, but if you know the data, you watch for Mannianisms.
From eyeballing the graphs, what stands out to me is that if you throw away the discredited proxies, what you’re left with looks like a bunch of meaningless noise. It’s difficult to see any underlying pattern that might represent “global temperature”.
Is there any reason to believe that any of these proxies are up to the job?
Another one out of the park. If Willis doesn’t yet hold the all-time home run record, he will soon. 🙂
This is the clearest and most concise presentation of the issue I have seen. Very well put together.
I was aware of all the issues but having it on one sheet with the mini plots on the left makes it instantly assimilable and very clear how much depends on use of unsuitable bristlecone and inverted Tiljander including the damaged part of the record.
It would be interesting to see if any kind of signal can be dug out of all this noise once the bristlecone is removed and Tiljander is cropped to conserve the valid data and used the right way up !
It’s hard to see with that level of s/n ratio but it looks like MWP and LIA may well emerge from the mud.
Has no-one ever done this?
Re
James Evans says:
May 30, 2011 at 11:19 pm
“Is there any reason to believe that any of these proxies are up to the job?”
No.
There’s some theoretical stuff, involving Oxygen isotopes, but sometimes these are cited as “evidence” of increased precipitation, other times as temperature proxies.
These isotopes are found in both ice cores and speleothems (stalactites & stalagmites & their kin).
The idea of lake sediments, is that in warmer waters, there’s more plant growth & thus more to decay & form thicker/denser layers of mud.
The concept behind “treemometers”, is that (all other things equal), trees on the polewards extremes of their ranges, will make the most of a warmer & longer summer to put on growth.
Big, nay enormous, holes can be picked in all of these theories!
Re
Alex says:
May 30, 2011 at 1:43 pm
“Did they use the data upside down? How come this is not widely known?”
Yes, they did & for those who’ve read the posts on the subject at “Climate Audit” & Andrew Montford’s book “The Hockey Stick Illusion”, it is well-known.
It’s yet another area, where the inconvenient truth was raised & the incestuous climatology community closed ranks, either went “La la la, we’re not listening” or denied it.
http://wattsupwiththat.com/2009/11/27/told-ya-so-more-upside-down-mann-in-his-latest-paper/
It is quite obvious that the signal to noise ratio in this tree-ring data is quite low, especially if someone goes out of their way to select noise in preference to the signal.
It is so easy to assume that everyone is aware of the lack of good evidence supporting the CAGW theory that we often forget how many people still believe that it is true and that they have a duty to stop it. (cf. Bill Nye) Even the President recently spoke of forging an international agreement to control carbon emissions.
He says “The sign doesn’t matter.”
So let me get this right–the world is in an uproar, spending billions and billions of dollars all because some ego-saturated would-be “scientist” fudged the data, got himself a cushy way to grab grants by doing it, and can’t even be honest enough about it to say he was wrong? Sheesh–this Mann character is really something. Somebody needs to yank his credentials. He’s one dangerous dude! (The people granting him all that money are just as guilty as Mann, by the way–they should be taken to task too.)
Thanks boss, that seems to take care of that little bit of chicanery.
Thanks for this it’s very insightful, and good timing too as Anthony has just posted a picture of Al Gore scaling a ladder next to the giant stick. Looks like Al could have used a foot stool if it was done right.
Thanks for keeping this afloat Willis. I thought the thorough gutting of this beast by McIntyre was sufficient. The fact that the hockey stick is still propped up seems to prove this point from NikFromNYC:
NikFromNYC says: May 30, 2011 at 7:48 pm […]
So here we have a group of very vocal AGW supporters who downright reject the very idea that a single result can topple a whole theory. They really do believe that science works by paper counting and academy proclamations rather than according to ideas. Thus there is no reasoning with them using specific results. They describe numerous falsifications of CAGW as merely being “some inconsistencies”.
I got a little lost at CA trying to understand all of this, but you laid it out in a very plain, and easy way to understand !!!
Another great post Willis !!!!
to recap why bristlecones are not good to use: the extreme hockey stick shape is found mostly in “strip bark” trees. These are trees that have been damaged on one side by frost, drought, fire or something. The bark on the other side of the trunk starts to grow extra fast to compensate for the lost bark on the damaged side. The extra growth lasts for a century or so and has nothing to do with climate–it is a healing response. The Ababneh thesis showed that if you pick trees that are not damaged, you get no hockey stick shape. No cookies indeed. This is so simple and obvious that even a cave man should be able to understand. Using these trees is like trying to predict running speed using only a population of amputees, without mentioning that fact in the study.
I have just started to dig in to the hockey stick controversy in depth. This was a very helpful summary post for me on Mann’s proxies.
I don’t want to create extra work for you, but it would be very helpful to create a similar analysis for all the proxies used on the spaghetti graphs. I find that Mann’s defenders, when backed into a corner, inevitably try to switch topics to all the other “independent” studies that reach the same conclusions as Mann. It would be nice to have a succinct summary chart such as the one you provide here that could demonstrate that they all rely on the same flawed proxies for the hockey stick shapes.
@PAUL DELLEVIGNE
I think you’ll find that the so-called “independent” studies aren’t independent in any important sense of the word. They are all conducted by co-authors of Mann, or by co-authors of his co-authors, use the same defective proxies and largely the same defective statistical methods. They just shuffle things around and hope no one will notice.
@- Willis Eschenbach
I understand you can find innumerable caveats about the dendro-proxy data; it is noisy and unreliable over the most recent century. The variance over past centuries does seem to imply a rather flat climate but with big uncertainty/error bars – exactly as in the original MBH98 graph.
But in circumstances when one source of data is uncertain the usual response is to look for alternative proxy sources and try and achieve a consilience.
Or at least see if an alternative proxy measure gives a radically different, or largely similar result.
How do climate reconstructions based on all the other paleoclimate proxies found at –
http://www.ncdc.noaa.gov/paleo/data.html
compare?
There have been a fair number of paleoclimate reconstructions, not all are based on tree ring data, but AFAIK none refute the pattern of some past variability with a rapid rise at the end during the last century.
http://www.ncdc.noaa.gov/paleo/recons.html
Tilo Reber asks: ” What was Mann’s explanation for using the Tiljander data upside down?”
The response by Mann that I am aware of is simply this: “The claim that “upside down” data were used is bizarre. Multivariate regression methods are insensitive to the sign of predictors.” I don’t have the link, but Mann published this in response to a published comment by M&M.
This response is completely inadequate. Here is an excellent explanation of the issue that I found in the comments on another blog:
“The sort of multivariate regression techniques that Mann used are effectively data mining: you take a bunch of datasets that you have some vague reason for believing might be proxies for temperature, and then regress them against an instrumental record for some calibration period. This reveals the small number of “proxies” which did in fact correlate well over that period, and you then attempt to reconstruct other periods by assuming that the same datasets are similarly good proxies over other periods, effectively weighting them by their correlation coefficients.”
“Like any data mining approach this is fraught with dangers; in essence you can just pick out a bunch of random noise sequences which happen to correlate by chance. This can be partly guarded against by checking the reconstruction against a verification period, though you have to be extremely careful how you do this, and many criticisms of Mann’s verification statistics have been made.”
“There is, however, a simpler check, which is just to look at the sign of the correlation. If a “proxy” anti-correlates with the instrumental temperature, going down when temperature goes up, then the regression program will give a negative correlation coefficient, and so it contributes to the reconstruction with a negative weight. Nothing wrong with that. However for most plausible proxies it is possible to assign the sign of the correlation coefficient a priori (indeed some people would argue that if you don’t even know the sign a priori, then the chance of something being a genuine proxy rather than just fortuitous noise is low).”
“Now we turn to Tiljander’s sediment data. This data does have a well defined a priori sign, but it turns out that when placed into Mann’s program it comes out with the wrong sign for the correlation coefficient, so that in the reconstructions it is turned upside down. Why did this happen? Because Mann’s correlation is run against the modern period, where Tiljander’s data is badly contaminated by the effects of bridge building, and this modern contamination correlates well with temperature IF you turn the dataset upside down!”
“So yes, multivariate regression methods are insensitive to the sign of the predictors, but no this doesn’t mean that it can’t end up using datasets “upside down”; in the case of Tiljander’s data it can and it did.” http://andyrussell.wordpress.com/2010/06/15/the-hockey-stick-evolution/#comment-540
From me: This point seems obvious. But go over and read the comments at climateaudit.org on this very issue and read some of the comments on some of the “team” blogs. It is absolutely amazing how confused some scientists are on this elementary point and it is amazing that Mann and his team won’t concede this point.
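The mechanism described in the quoted comment can be illustrated with a toy calculation. Everything in the sketch below is synthetic and hypothetical (the calibration window, the size of the contamination, the early warm spell); it is not the Tiljander data and not Mann’s method, just an ordinary least-squares fit of temperature on one proxy. The proxy is built so that its true response to temperature is positive, and a large downward drift is then added during the calibration period.

```python
import numpy as np

years = np.arange(1001, 1981)
calib = years >= 1850                    # hypothetical calibration window
bump = (years >= 1100) & (years < 1200)  # hypothetical early warm spell

# Years elapsed since the start of the calibration window (0 before it).
x = np.where(calib, years - 1850, 0).astype(float)

# Synthetic "temperature": a modern rise plus an early warm spell.
temp = 0.01 * x + 0.5 * bump

# Synthetic proxy: responds positively to temperature (a priori sign +),
# but a non-climatic drift pushes it steeply downward in the modern period.
contamination = -0.05 * x
proxy = temp + contamination

# Calibrate temperature against the proxy over the modern period only.
slope, intercept = np.polyfit(proxy[calib], temp[calib], 1)

# Apply the fitted relationship to the whole record.
recon = slope * proxy + intercept
```

In this toy case the fitted slope comes out negative because the contamination dominates the calibration period, so the reconstruction turns the early warm spell into an apparent cold spell. The regression is indifferent to the sign, but the proxy still ends up used in the opposite orientation from its physical meaning.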