More broken hockey stick fallout: Audit of an Audit of an Auditor

For those that don’t read a lot of the WUWT comments closely, there has been a scholarly argument going on between Tom P of the UK and several WUWT commentators over the methodology Steve McIntyre used to illustrate the “breathtaking difference” between the plot of the hand-picked set of 12 Yamal trees and the larger Schweingruber tree ring data set, also from Yamal. Tom P reworked Steve’s R-code script (which he posted on WUWT) to include both the 12 excluded trees and the Schweingruber data, and thought he found “insensitivity to additional data”, saying “There is no broken hockeystick”.

Jeff Id audited the auditor of an auditor and found that Steve’s work still holds up “robustly”. – Anthony


Jeff Id writes on The Air Vent

Just a short post tonight, I hope. Tom P, an apparent believer in the hockey stick methods, posted an entertaining reply to Steve McIntyre’s recent discoveries on Yamal. He used R code to demonstrate a flaw in SteveM’s method. His post on WUWT, where he declares victory over Steve, was brought to my attention by Charles the moderator and is copied here.

Tom P writes on WUWT:

===========

Steve McIntryre’s [sic] reconstructions above are based on adding an established dataset, the Schweingruber Yamal sample instead of the “12 trees used in the CRU archive”. Steve has given no justification for removing these 12 trees. In fact they probably predate Briffa’s CRU analysis, being in the original Russian dataset established by Hantemirov and Shiyatov in 2002.

One of Steve’s major complaints about the CRU dataset was that it used few recent trees, hence the need to add the Schweingruber series. It was therefore rather strange that towards the end of the reconstruction the 12 living trees were excluded only to be replaced by 9 trees with earlier end dates.

I asked Steve what the chronology would look like if these twelve trees were merged back in, but no plot was forthcoming. So I downloaded R, his favoured statistical package, and tweaked Steve’s published code to include the twelve trees back in myself. Below is the chronology I posted on ClimateAudit a few hours ago.

Tom P’s plot. Source: http://img80.yfrog.com/img80/1808/schweingruberandcrud.png

The red line is the RCS chronology calculated from the CRU archive; black is the chronology calculated using the Schweingruber Yamal sample and the complete CRU archive. Both plots are smoothed with 21-year gaussian, as before. The y-axis is in dimensionless chronology units centered on 1.

It looks like the Yamal reconstruction published by Briffa is rather insensitive to the inclusion of the additional data. There is no broken hockeystick.

=============
Jeff Id writes:
He did a fantastic job in reworking R code to create an improved hockey stick graph. To see his code the link is here.

Jeff Id’s version of Tom P’s graph.

I spent some time tonight looking at his results, time I had planned to spend analyzing Antarctic sea ice. I found that essentially the only difference in the working parts of the code is the following line.

Steve M: tree=rbind(yamal[!temp,],russ035)
Tom P: tree=rbind(yamal,russ035)

The !temp in Steve’s line removes 12 series of Yamal from the average, while Tom’s version includes them. I’m all for inclusion of all data, but I firmly believe that Briffa’s data is probably a cherry-picked set of trees chosen to match temperature or something similar. By including the sorted Briffa Yamal version, we automatically exclude data which would otherwise balance the huge trend. However, this is not the problem with Tom’s result. The problem lies in this plot, also created by Tom P’s code.
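
For readers who don’t use R, here is a rough sketch of what that one-line difference does. The data frames below are tiny invented stand-ins, and the way `temp` is built here is only an assumption for illustration; Steve’s script defines its own `temp`.

```r
# Toy stand-ins for the real objects; every value here is invented.
yamal <- data.frame(
  id   = c("POR011", "YAD041", "M00011", "L00021"),
  year = c(1990, 1990, 1700, 1650),
  rw   = c(1.8, 2.1, 0.9, 1.0),
  stringsAsFactors = FALSE
)
russ035 <- data.frame(
  id   = c("S01", "S02"),
  year = c(1990, 1989),
  rw   = c(0.8, 1.1),
  stringsAsFactors = FALSE
)

# Hypothetical flag: TRUE for rows that belong to the living-tree cores.
temp <- yamal$id %in% c("POR011", "YAD041")

# Steve M's form: drop the flagged rows, then stack the Schweingruber rows.
tree_steve <- rbind(yamal[!temp, ], russ035)

# Tom P's form: keep every Yamal row, then stack the Schweingruber rows.
tree_tom <- rbind(yamal, russ035)

# Rows present only in Tom P's version are exactly the flagged ones.
only_in_tom <- setdiff(tree_tom$id, tree_steve$id)
print(only_in_tom)
```

The only thing `!temp` changes is which Yamal rows reach `rbind`; everything downstream of `tree` is the same in both versions.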

Tom P’s Yamal Reconstruction: Count per Year.

Here is the zoomed in version:

Tom P’s Yamal Reconstruction: Count per Year, zoomed in.

Above we can see that everything in TomP’s curve after 1990 is actually 100% Briffa Yamal data.
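
A count-per-year check of this kind can be sketched in a few lines of R. This is not Tom P’s or Steve’s code; the `source` column and all of the values are invented to show the idea.

```r
# One row per core-year measurement; values are invented.
tree <- data.frame(
  source = c("CRU", "CRU", "CRU", "Schweingruber", "Schweingruber", "CRU", "CRU"),
  year   = c(1989, 1990, 1991, 1989, 1990, 1992, 1991),
  stringsAsFactors = FALSE
)

# Cross-tabulate: rows are years, columns are data sources.
counts <- table(tree$year, tree$source)
print(counts)

# Last year with any Schweingruber measurement.
last_schw <- max(tree$year[tree$source == "Schweingruber"])

# Years where the merged chronology rests on CRU data alone.
only_cru <- rownames(counts)[counts[, "Schweingruber"] == 0]
print(only_cru)
```

In this toy table the Schweingruber column drops to zero after 1990, so any average taken over the later years is an average of CRU cores only.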

So the question becomes: what does the series look like if the Yamal data doesn’t create the ridiculous spike at the end of the curve?

I truncated the black line at 1990 below.

Tom P’s chronology with the black line truncated at 1990.

The black line is truncated at the end of the Schweingruber data, and it looks pretty similar to the green line in Steve McIntyre’s graph, shown again below.
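
The truncation itself is a one-liner. As a minimal sketch with an invented chronology (the `chron` data frame and its values are assumptions, not output from either script), masking everything after the cutoff with NA makes a plotted line stop at 1990:

```r
# Invented chronology values, for illustration only.
chron <- data.frame(
  year  = 1985:1995,
  value = c(1.0, 1.1, 0.9, 1.0, 1.2, 1.1, 1.8, 2.0, 2.3, 2.5, 2.6)
)

# Last year covered by the Schweingruber data, per the count plot.
cutoff <- 1990

# Blank out values after the cutoff; lines() and plot() skip NA points.
chron_trunc <- chron
chron_trunc$value[chron_trunc$year > cutoff] <- NA

# Last year that still carries a value.
last_kept <- max(chron_trunc$year[!is.na(chron_trunc$value)])
print(last_kept)
```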

Steve McIntyre’s merged RCS chronology.

Don’t be too hard on Tom P. He honestly did a great job and took the time to work with the R script, which is more than most are willing to do. Steve is a very careful worker, though, and it’s damn near impossible to catch him making mistakes. Trust a serious skeptic: it’s not easy to find mistakes in his work, and some of us check him just as I spent over an hour checking Tom’s work. In my opinion Tom deserves congratulations for his efforts and checking; this way we all learn.

I’ve now been all the way through SteveM’s scripts from beginning to end and can’t find any problems with the script, maybe others can!


Steve McIntyre adds in WUWT comments:

Steve McIntyre (21:35:13)

Here is some conclusive evidence in respect to the following misrepresentation by Tom:

Steve McIntyre said they may well have been just the most recent part of Hantemirov and Shiyatov’s dataset and no selection would have been made.

In my first post in this sequence http://www.climateaudit.org/?p=7142, I identified a common pattern to the IDs for cores and observed:

There are 252 distinct series in the CRU archive. There are 12 IDs consisting of a 3-letter prefix, a 2-digit tree # and 1-digit core#. All 12 end in 1988 or later and presumably come from the living tree samples. The nomenclature of these core IDs (POR01…POR11; YAD04…YAD12; JAH14…JAH16 – excluding the last digit of the ID here as it is a core #) suggests to me that there were at least 11 POR cores, 12 YAD cores and 16 JAH cores.

It is “possible” that they skipped ID numbers, but this is a farfetched theory even for Tom. As surmised here, the missing ID numbers are “evidence” of at least 39 cores and that the present archive is not only too small, but incomplete.
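
The arithmetic behind the 39 can be sketched in R by taking the highest tree number under each prefix. The IDs below are hypothetical: they use only the end points of the ranges quoted above, with an assumed core digit of 1, and are not the actual 12 IDs in the archive.

```r
# Hypothetical IDs: 3-letter prefix, 2-digit tree #, 1-digit core #.
ids <- c("POR011", "POR111", "YAD041", "YAD121", "JAH141", "JAH161")

prefix  <- substr(ids, 1, 3)              # site prefix
tree_no <- as.integer(substr(ids, 4, 5))  # tree number, core digit ignored

# Highest tree number seen under each prefix.
max_tree <- tapply(tree_no, prefix, max)
print(max_tree)

# If numbering started at 1 with no gaps, this is the minimum count implied.
implied_total <- sum(max_tree)
print(implied_total)
```

The assumption doing the work is that numbers were assigned consecutively from 1; if IDs were skipped, the implied total would be lower.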

=========

and also this comment:

Steve McIntyre (20:13:22)

I am online too much, but I am not online 24/7. I’ve been out playing squash. Surely I’m allowed to be offline occasionally without a poster commenting adversely on this.

While I was out, CA crashed as well. Thus, it was “quiet.”

Contrary to Tom’s speculations and misrepresentation of my statements, it is my opinion that there is considerable evidence that the 12 cores are not a complete population, i.e. that they have been picked from a larger population. Rather than quote from actual text, Tom puts the following words in my mouth that I did not say:

Steve McIntyre said they may well have been just the most recent part of Hantemirov and Shiyatov’s dataset and no selection would have been made.

This is not my view.

The balance of Tom’s argument is:

No, they are the twelve most recent cores. There’s been no evidence provided to suggest they are in any way suspect. ..There is no obvious reason to exclude them.

I disagree. I do not believe that they constitute a complete population of recent cores. As a result, I believe that the archive is suspect. There is every reason to exclude them in order to carry out a sensitivity as I did. The sensitivity study showed very different results. I do not suggest that the sensitivity run be used as an alternative temperature history. Right now, there are far too many questions attached to this data set to propose any solution to the sampling conundrum. It’s only been a couple of days since the lamentable size of the CRU sample became known and it will take a little more time yet to assess things.

Reasons why I “suspect” that a selection was made from a larger population include the following. A field dendro could take 12 cores in an hour. We took a lot more than that at Mt Allegre and a field dendro could be far more efficient. Thus, it seems very unlikely that the entire population of cores from the Yamal program is only 12 cores and on this basis, it is my surmise that a selection was taken from the cores. Standard dendro procedures use all crossdated cores and definitely use more than 10 cores if they are available.

This doesn’t “prove” that a selection was made, but it is reasonable to “suspect” that a selection was made and to ask CRU and their Russian associates to provide a clear statement of their protocols. There’s no urgency to do anything prior to receiving a statement of their sampling protocols. For this purpose, it doesn’t matter a whit whether the selection was made by the Russians or at CRU or a combination. In my first post on this matter – which Tom appears not to have read – I canvass the limited evidence for and against. There is certainly evidence supporting the idea that the 12 cores were among 17 selected by the Russians, but in other parts of the data set, the CRU population is larger than that used in the Hantemirov and Shiyatov chronology. The construction of the CRU data set is not described in any literature; the description in Hantemirov and Shiyatov has something to do with it, but doesn’t yield the CRU data set. Some sort of reconciliation is required.

In addition, the age distribution of the CRU 12 is very different than the age distribution from the nearby Schweingruber population. In my opinion, the uniformly high age of the CRU12 relative to the Schweingruber population is suggestive of selection – in this respect, perhaps and even probably by the Russians. Again, this isn’t proof. Maybe they were just lucky 12 straight times and, unlike Schweingruber, they got very long-lived trees with every core. Without documentation, no one knows. In any event, this doesn’t help the Briffa situation. If these things are temperature proxies, the results from two different nearby populations should not be so different, and protocols need to be established for ensuring that the age distribution of the modern sample is relatively homogeneous with the subfossil samples (and they aren’t).

The prevailing dendro view is that an RCS chronology requires a much larger population than a “conventional” standardization. Thus, even if the data set had been winnowed down to 10 cores in 1990 and 5 cores at the end, this is an absurdly low population for modern cores, which are relatively easily obtained. Use of such small replication is inconsistent with Briffa’s own methodological statements.

Tom also misses a hugely important context. There is a nearby site (Polar Urals) with an ample supply of modern core. Indeed, at one time, Briffa used Polar Urals to represent this region. My original question was whether there was a valid reason for substituting Yamal for Polar Urals. The microscopic size of the modern record suggests that there was not a valid reason. However, this tiny sample size was not known to third parties until recently due to Briffa’s withholding of data, not just from me, but also from D’Arrigo, Wilson et al.

Until details of the Yamal selection process are known, my sense right now is that one cannot blindly assume – as Tom does – that what we see is a population. Maybe this will prove to be the case, but personally I rather doubt it. A better approach is to use the Polar Urals data set as a building block.

As to Tom’s argument that none of this “matters”, the Yamal data set has a bristlecone-like function in a number of reconstructions. While the differences between the versions may not seem like a lot to Tom, as someone with considerable experience with this data, it is my opinion that the revisions will have a material impact on the medieval-modern difference in the multiproxy studies that do not depend on strip bark bristlecones.

126 Comments
kim
October 5, 2009 4:10 am

Pay attention to Jean S’s comment at the new Tom P Climate Audit thread.
=============================================
