More broken hockey stick fallout: Audit of an Audit of an Auditor

For those that don’t read a lot of the WUWT comments closely, there has been a scholarly argument going on between Tom P of the UK and several WUWT commentators over the methodology Steve McIntyre used to illustrate the “breathtaking difference” between the plot of the hand-picked set of 12 Yamal trees and the larger Schweingruber tree ring data set, also from Yamal. Tom P reworked Steve’s R-code script (which he posted on WUWT) to include both the 12 excluded trees and the Schweingruber data, and thought he had found “insensitivity to additional data”, saying “There is no broken hockeystick”.

Jeff Id audited the auditor of an auditor and found that Steve’s work still holds up “robustly”. – Anthony


Jeff Id writes on The Air Vent

Just a short post tonight, I hope. Tom P, an apparent believer in the hockey stick methods, posted an entertaining reply to Steve McIntyre’s recent discoveries on Yamal. He used R code to demonstrate what he saw as a flaw in SteveM’s method. His post was on WUWT, brought to my attention by Charles the moderator, and is copied here, where he declares victory over Steve.

Tom P writes on WUWT:

===========

Steve McIntryre’s [sic] reconstructions above are based on adding an established dataset, the Schweingruber Yamal sample instead of the “12 trees used in the CRU archive”. Steve has given no justification for removing these 12 trees. In fact they probably predate Briffa’s CRU analysis, being in the original Russian dataset established by Hantemirov and Shiyatov in 2002.

One of Steve’s major complaints about the CRU dataset was that it used few recent trees, hence the need to add the Schweingruber series. It was therefore rather strange that towards the end of the reconstruction the 12 living trees were excluded, only to be replaced by 9 trees with earlier end dates.

I asked Steve what the chronology would look like if these twelve trees were merged back in, but no plot was forthcoming. So I downloaded R, his favoured statistical package, and tweaked Steve’s published code to include the twelve trees back in myself. Below is the chronology I posted on ClimateAudit a few hours ago.

[Figure: Tom P’s plot. Source: http://img80.yfrog.com/img80/1808/schweingruberandcrud.png]

The red line is the RCS chronology calculated from the CRU archive; black is the chronology calculated using the Schweingruber Yamal sample and the complete CRU archive. Both plots are smoothed with 21-year gaussian, as before. The y-axis is in dimensionless chronology units centered on 1.

It looks like the Yamal reconstruction published by Briffa is rather insensitive to the inclusion of the additional data. There is no broken hockeystick.

=============
Jeff Id writes:
He did a fantastic job in reworking R code to create an improved hockey stick graph. To see his code the link is here.
[Figure: Jeff Id’s version of Tom P’s graph]
I spent some time tonight looking at his results, time I had planned to spend analyzing Antarctic sea ice. I found that essentially the only difference in the operating functions of the code is the following line.
Steve M: tree=rbind(yamal[!temp,],russ035)
Tom P: tree=rbind(yamal,russ035)

The !temp in Steve’s line removes the 12 Yamal series from the average, while Tom’s version includes them. I’m all for inclusion of all data, but I am a firm believer that Briffa’s data is probably a cherry-picked set of trees chosen to match temp or something. Therefore, by including the sorted Briffa Yamal version, we automatically exclude data that would otherwise balance the huge trend. However, this is not the problem with Tom’s result. The problem lies in this plot, also created by Tom P’s code.

[Figure: Tom P’s Yamal Reconstruction – Count per Year]

Here is the zoomed in version:

[Figure: zoomed-in view of the count per year]

Above we can see that everything in TomP’s curve after 1990 is actually 100% Briffa Yamal data.
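The point about the count plot can be sketched with toy numbers. This is a Python illustration, not the R from either script, and every core, date range and helper name in it is made up for the example. It mimics the two merge variants (with and without the live CRU cores) and counts cores per year by source:

```python
from collections import Counter

# Hypothetical toy cores: (source, first_year, last_year). Not the real archive.
CRU_SUBFOSSIL = [("cru", 1700, 1900), ("cru", 1750, 1960)]
CRU_LIVE = [("cru", 1800, 1996), ("cru", 1820, 1994)]
SCHWEINGRUBER = [("russ035", 1780, 1990), ("russ035", 1850, 1990)]


def merge(include_live):
    """Mimic the two merge variants: with or without the live CRU cores."""
    cores = list(CRU_SUBFOSSIL)
    if include_live:
        cores += CRU_LIVE
    return cores + SCHWEINGRUBER


def count_by_source(cores):
    """Return {year: Counter({source: number of cores covering that year})}."""
    counts = {}
    for source, first, last in cores:
        for year in range(first, last + 1):
            counts.setdefault(year, Counter())[source] += 1
    return counts


def first_single_source_year(counts, source):
    """First year from which all remaining years hold cores from `source` only."""
    result = None
    for year in sorted(counts, reverse=True):
        if set(counts[year]) != {source}:
            break
        result = year
    return result


tom_p = count_by_source(merge(include_live=True))
steve = count_by_source(merge(include_live=False))
print(first_single_source_year(tom_p, "cru"))  # 1991 with this toy data
print(max(steve))                              # 1990 with this toy data
```

In the toy data, the merged series that keeps the live cores is single-source from 1991 onward, which is the same shape of problem the count plot shows.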

So the question becomes: what does the series look like if the Yamal data doesn’t create the ridiculous spike at the end of the curve?

I truncated the black line at 1990 below.

[Figure: Tom P’s chronology with the black line truncated at 1990]

The black line is truncated at the end of the Schweingruber data, and it looks pretty similar to the green line in Steve McIntyre’s graph, shown again below.

[Figure: Steve McIntyre’s graph, with the green line]
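For anyone who wants to try the truncation themselves, here is a minimal Python sketch with made-up chronology values. The post does not say whether the cut came before or after smoothing, so the sketch shows both orders. A simple centered 3-year mean stands in for the 21-year gaussian, which is an assumption made purely to keep the example short:

```python
# Hypothetical chronology values by year (dimensionless, centered near 1).
chronology = {1986: 1.1, 1987: 1.0, 1988: 1.2, 1989: 1.1, 1990: 1.3,
              1991: 2.4, 1992: 2.6, 1993: 2.9}
CUTOFF = 1990  # last year with Schweingruber data, per the post


def truncate(series, last_year):
    """Keep only the years up to and including last_year."""
    return {y: v for y, v in series.items() if y <= last_year}


def smooth3(series):
    """Centered 3-year mean, using whichever neighbours exist."""
    out = {}
    for year in series:
        window = [series[y] for y in (year - 1, year, year + 1) if y in series]
        out[year] = sum(window) / len(window)
    return out


cut_then_smooth = smooth3(truncate(chronology, CUTOFF))
smooth_then_cut = truncate(smooth3(chronology), CUTOFF)

print(round(cut_then_smooth[CUTOFF], 2))  # 1.2: only pre-cutoff years contribute
print(round(smooth_then_cut[CUTOFF], 2))  # 1.6: the 1991 value leaks into 1990
```

With these toy numbers, smoothing first and cutting afterwards lets the post-1990 values pull the endpoint up, so the order of the two steps matters near the cutoff.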

Don’t be too hard on Tom P. He honestly did a great job and took the time to work with the R script, which is more than most are willing to do. Steve is a very careful worker, though, and it’s damn near impossible to catch him making mistakes. Trust a serious skeptic: it’s not easy to find mistakes in his work, and some of us check him just as I spent over an hour checking Tom’s work. In my opinion Tom deserves congratulations for his efforts and checking; this way we all learn.

I’ve now been all the way through SteveM’s scripts from beginning to end and can’t find any problems with the script, maybe others can!


Steve McIntyre adds in WUWT comments:

Steve McIntyre (21:35:13)

Here is some conclusive evidence in respect to the following misrepresentation by Tom:

Steve McIntyre said they may well have been just the most recent part of Hantemirov and Shiyatov’s dataset and no selection would have been made.

In my first post in this sequence http://www.climateaudit.org/?p=7142, I identified a common pattern to the IDs for cores and observed:

There are 252 distinct series in the CRU archive. There are 12 IDs consisting of a 3-letter prefix, a 2-digit tree # and 1-digit core#. All 12 end in 1988 or later and presumably come from the living tree samples. The nomenclature of these core IDs (POR01…POR11; YAD04…YAD12; JAH14…JAH16 – excluding the last digit of the ID here as it is a core #) suggests to me that there were at least 11 POR cores, 12 YAD cores and 16 JAH cores.

It is “possible” that they skipped ID numbers, but this is a farfetched theory even for Tom. As surmised here, the missing ID numbers are “evidence” of at least 39 cores and that the present archive is not only too small, but incomplete.
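The arithmetic behind the figure of 39 can be sketched in a few lines of Python. The IDs below are illustrative only: they are the endpoints of the ranges quoted above with a made-up core digit of 1 appended, not the actual 12 IDs in the archive, and the pattern simply encodes the prefix/tree/core layout described in the quote.

```python
import re

# Illustrative IDs only: range endpoints from the quote, with a made-up
# core digit of 1 appended. The real archive lists 12 such IDs.
core_ids = ["POR011", "POR111", "YAD041", "YAD121", "JAH141", "JAH161"]

# 3-letter prefix, 2-digit tree number, 1-digit core number.
ID_PATTERN = re.compile(r"^([A-Z]{3})(\d{2})(\d)$")


def highest_tree_number(ids):
    """Map each 3-letter prefix to the highest 2-digit tree number seen."""
    highest = {}
    for core_id in ids:
        match = ID_PATTERN.match(core_id)
        if match is None:
            continue  # skip IDs that are not in the prefix/tree/core form
        prefix, tree = match.group(1), int(match.group(2))
        highest[prefix] = max(highest.get(prefix, 0), tree)
    return highest


highest = highest_tree_number(core_ids)
print(highest)                # {'POR': 11, 'YAD': 12, 'JAH': 16}
print(sum(highest.values()))  # 39
```

The sum is a lower bound only under the assumption that numbering ran consecutively from 01 within each prefix, which is exactly the assumption being argued over.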

=========

and also this comment:

Steve McIntyre (20:13:22)

I am online too much, but I am not online 24/7. I’ve been out playing squash. Surely I’m allowed to be offline occasionally without a poster commenting adversely on this.

While I was out, CA crashed as well. Thus, it was “quiet”.

Contrary to Tom’s speculations and misrepresentation of my statements, it is my opinion that there is considerable evidence that the 12 cores are not a complete population, i.e. that they have been picked from a larger population. Rather than quote from actual text, Tom puts the following words in my mouth that I did not say:

Steve McIntyre said they may well have been just the most recent part of Hantemirov and Shiyatov’s dataset and no selection would have been made.

This is not my view.

The balance of Tom’s argument is:

No, they are the twelve most recent cores. There’s been no evidence provided to suggest they are in any way suspect. ..There is no obvious reason to exclude them.

I disagree. I do not believe that they constitute a complete population of recent cores. As a result, I believe that the archive is suspect. There is every reason to exclude them in order to carry out a sensitivity as I did. The sensitivity study showed very different results. I do not suggest that the sensitivity run be used as an alternative temperature history. Right now, there are far too many questions attached to this data set to propose any solution to the sampling conundrum. It’s only been a couple of days since the lamentable size of the CRU sample became known and it will take a little more time yet to assess things.

Reasons why I “suspect” that a selection was made from a larger population include the following. A field dendro could take 12 cores in an hour. We took a lot more than that at Mt Allegre and a field dendro could be far more efficient. Thus, it seems very unlikely that the entire population of cores from the Yamal program is only 12 cores and on this basis, it is my surmise that a selection was taken from the cores. Standard dendro procedures use all crossdated cores and definitely use more than 10 cores if they are available.

This doesn’t “prove” that a selection was made, but it is reasonable to “suspect” that a selection was made and to ask CRU and their Russian associates to provide a clear statement of their protocols. There’s no urgency to do anything prior to receiving a statement of their sampling protocols. For this purpose, it doesn’t matter a whit whether the selection was made by the Russians or at CRU or a combination. In my first post on this matter – which Tom appears not to have read – I canvass the limited evidence for and against. There is certainly evidence supporting the idea that the 12 cores were among 17 selected by the Russians, but in other parts of the data set, the CRU population is larger than that used in the Hantemirov and Shiyatov chronology. The construction of the CRU data set is not described in any literature; the description in Hantemirov and Shiyatov has something to do with it, but doesn’t yield the CRU data set. Some sort of reconciliation is required.

In addition, the age distribution of the CRU 12 is very different than the age distribution from the nearby Schweingruber population. In my opinion, the uniformly high age of the CRU12 relative to the Schweingruber population is suggestive of selection – in this respect, perhaps and even probably by the Russians. Again this isn’t proof. Maybe they were just lucky 12 straight times and, unlike Schweingruber, they got very long-lived trees with every core. Without documentation, no one knows. In any event, this doesn’t help the Briffa situation. If these things are temperature proxies, the results from two different nearby populations should not be so different and protocols need to be established for ensuring that the age distribution of the modern sample is relatively homogeneous with the subfossil samples (and they aren’t).

The prevailing dendro view is that an RCS chronology requires a much larger population than a “conventional” standardization. Thus, even if the data set had been winnowed down to 10 cores in 1990 and 5 cores at the end, this is an absurdly low population for modern cores, which are relatively easily obtained. Use of such small replication is inconsistent with Briffa’s own methodological statements.

Tom also misses a hugely important context. There is a nearby site (Polar Urals) with an ample supply of modern core. Indeed, at one time, Briffa used Polar Urals to represent this region. My original question was whether there was a valid reason for substituting Yamal for Polar Urals. The microscopic size of the modern record suggests that there was not a valid reason. However, this tiny sample size was not known to third parties until recently due to Briffa’s withholding of data, not just from me, but also to D’Arrigo, Wilson et al.

Until details of the Yamal selection process are known, my sense right now is that one cannot blindly assume – as Tom does – that what we see is a population. Maybe this will prove to be the case, but personally I rather doubt it. A better approach is to use the Polar Urals data set as a building block.

As to Tom’s argument that none of this “matters”, the Yamal data set has a bristlecone-like function in a number of reconstructions. While the differences between the versions may not seem like a lot to Tom, as someone with considerable experience with this data, it is my opinion that the revisions will have a material impact on the medieval-modern difference in the multiproxy studies that do not depend on strip bark bristlecones.

126 Comments
TomLama
September 30, 2009 7:10 am

Buying a snow blower
5 Tips: Finding the snow blower that’s right for you.
http://money.cnn.com/2004/11/22/pf/saving/willis_tips/index.htm

Person of Choler
September 30, 2009 7:14 am

Step back from the minutiae a bit. The arguments about statistical arcana are interesting, but the fact remains that politicians are pushing for huge dislocations in the world economy, justified in large part by plugs drilled from some trees somewhere in Siberia.
Does this not seem a bit strange?

September 30, 2009 7:20 am

P Wilson (06:21:37) :
“Tom P. The concept of using tree rings to show nothing more than temperature is like showing how much coke there is in someone’s fridge to indicate how wealthy they are.
Lets not be silly”
this is the quote of the year as far as i am concerned

P Wilson
September 30, 2009 7:24 am

So let’s take it from the CO2 thesis standpoint. Trees respond very well to elevated CO2, and not necessarily temperature. Trees grow better with more CO2 as they absorb it. If CO2 levels were at lower, pre-industrial levels during the MWP, then no matter how high the temp, CO2 wasn’t plentiful enough to cause extra growth. So since the correlation between CO2 and temp is loose (often contrary to each other), higher temp with lower CO2 is possible, whereas today we have higher CO2 and lower temp.
bang goes the tree proxy

September 30, 2009 7:27 am

Tom P (06:12:58) :
Please tell me you’re kidding!!

Editor
September 30, 2009 7:41 am

Regarding modern era dendro proxies, IMHO every scientific publication should require ANY dendro paper to publish the actual modern thermometer temp record plot on any dendro proxy chart and compare the degree of correlation before the proposed dendro record can be accepted as a proxy.

Henry chance
September 30, 2009 7:45 am

Now is time to take the high road. Be fair. Has Mann been contacted to share his explanation?
What about the most recent contact with Briffa?
Stalling and delay is a defense mechanism. But we still need their prompt response.

Dodgy Geezer
September 30, 2009 7:53 am

I would like to congratulate Tom P (and Steve too, though that goes without saying) for actually doing science as opposed to politics. If we could have relied on the world’s scientists to do science as opposed to politics, we would not be in the mess we are in today.
Having said that, my understanding of ‘robustness’ in statistical calculation is that a final answer should not critically depend on the inclusion of one set of data. Obviously one set of data will contain the largest signal, but even if you take that set out, the remainder should still show the signal you are searching for, though less obviously.
If this is true (and I wait to be corrected) then all that Tom P has done is show that, so long as the ‘Briffa 12’ are included, the result is a hockey stick. If they are excluded, the hockey stick completely disappears. So the finding is not robust with reference to the Briffa 12. This is enough to render these chronologies suspicious, and should have been prima facie grounds for exclusion….

September 30, 2009 8:01 am

If the Yamal cores “prove” one thing and the Polar Urals “prove” another quite different thing and they are nearby one another, one has to begin to think that there is some “science” involved with tree ring analysis for temperature reconstruction that has not been thought through.
There are a number of variables that affect tree ring width and they all vary in different ways on different sites at different times (after all these are living organisms and “the time they are a changin'”– on a continual basis). All of these myriads of changes give first one variable and then another precedence in influence on the width of a ring and take place in an organism that lives for 10’s to 100’s of years.
This leads to an immensely complex pathway of influences akin to the web of vegetation successional pathways described by botanist Henry Chandler Cowles in 1901 as “a variable converging on a variable.”
Put me in the column with the non-believers.

Bill P
September 30, 2009 8:07 am

Therefore by inclusion of the sorted Briffa Yamal version, we have an automatic exclusion of data which would otherwise balance the huge trend.

Huh?

September 30, 2009 8:07 am

Tim Clark (06:59:46) :
I just left an ugly comment on CA about this very disingenuous reply by Tom. He’s rescaled the vertical axis and made sure the noise level is high enough to confuse people. He then followed it with the claim that SteveM is wrong while not admitting his first errors or distortions.
The original graph was much much higher at the end than this and my guess is that pre-1800 is much higher.
He’s just playing games now, this is not an honest post in my opinion.

September 30, 2009 8:32 am

Bill P (08:07:05) :
Tree rings have a typical variance. The briffa series has very likely been sorted to extract a high variance signal – too few trees sampled for a normal study, high variance, non-consecutive core sample numbering. If there’s enough time and information in the data, I’m planning a post on this tonight.
Consider that it would be a very unusual set of trees if all samples agreed with this increased growth rate. It would be equally unusual for agreeing trees not to be included to enhance the robustness of a climate study.
Therefore it’s pretty simple to understand that by including these sorted trees, we are adding a signal into the reconstruction which would otherwise be muted by the pretty clearly missing Yamal trees.
Trees make lousy thermometers.

Tom P
September 30, 2009 8:42 am

Jeff Id (08:07:36) :
Your tirade is groundless:
The “noise” is a direct result of reducing the smoothing window to avoid contamination from the post 1990 data.
I have not changed the vertical scale.
I have not removed the x axis.
I show a plot of the pre-1800 data above.
I presume by mistake, you mean not terminating the series at 1990. That wasn’t a mistake – as I made clear the reconstruction is built on the CRU and Schweingruber dataset in its entirety. Of course it is identical to the CRU reconstruction after 1990.
But why should the CRU dataset be disregarded after 1990 if it was valid before then? There seem to be no strong a priori reasons to exclude either the CRU or Schweingruber datasets at all. It was Steve who discarded the live cores from the CRU dataset, replacing them with the Schweingruber series, but if the reason is incomplete label sequences then the Schweingruber live cores are out as well.
I have invoked no selection criteria in including data from either series. If you, Steve or anyone else wish to deselect part of any series, please make the criteria plain so they can be validated. Anything else is cherry picking.

MattN
September 30, 2009 8:59 am

So what really has Tom P accomplished? It appears to me from his comments both here and at CA he’s clearly trying to justify the Team practice of ignoring valid chronology data that doesn’t correlate to temperature.

Jeff in Ctown (Canada)
September 30, 2009 9:14 am

Considering recent insight into plant growth and all the related variables, 12 tree ring cores is hopelessly inadequate, even if they weren’t cherry picked. I think that to have any hope of accuracy, you would need 100s or 1000s of cores from many locations around the globe with knowledge of all the other variables.

September 30, 2009 9:19 am

Tom has done a few disingenuous things now and I’m not a patient guy. First, he has failed to admit that you cannot debunk SteveM’s sensitivity study by using original data only. A clear mistake in his first post.
For his first post to be correct the logic reads like this:
The hockey stick is correct, because when we use the original data only after 1990 it looks almost the same as the original data only after 1990.
Insanity.
His latest post has other unique properties where apparently his 3 year filter pulls in data from the yamal only series to make the endpoint stand up for only a couple of years. — Equally insane.
If people request I’ll spend some time tonight bashing on these latest changes. Otherwise I prefer to ignore it.

September 30, 2009 9:19 am

tallbloke (01:37:40) :
Michael (23:41:49) :
It should be called the “Yamal Briffa Affair”. Maybe for the movie?
“The Dendro Dozen”

No No. I have it – “The Blade of Briffa!”

September 30, 2009 9:23 am

SteveM got this one for me.
From CA
While Tom may think that his code accomplished a 1990 cutoff, it didn’t. As I originally observed, the closing portion of Tom’s curve looks like Briffa’s data because it is Briffa’s data. I had avoided this error in my original Figure 3.

Don S.
September 30, 2009 9:24 am

Rhys Jaggar (01:02:40) :
“I have to say that the world can wait 50 years for 100 years of reliable direct measurements, be they by satellite or by land-based measurement (which should happen anyway to validate/detect deviations in the sensor-based satellite approach), rather than tear itself to pieces when the evidence, viewed dispassionately, of runaway global warming, simply is not sufficiently clear to justify such a premature battle between a jungle lioness and a pack of wildebeest, being eagerly awaited, anticipated and fed upon by a pack of rapacious vultures.
Does anyone else agree with me??”
I do. When can we get started? I think this is where I came in. (see USHCN project)

Frederick Michael
September 30, 2009 9:32 am

To all,
History is being made here this week and all comments will be part of the record. Tom P deserves a medal for being willing to stand alone, disagreeing with a room full of intellectuals. That is very difficult to do and any hostility interferes with full function. Even the most minor personal comment is counter-productive.
Please treat him the way you would if you were a student and he were a famous professor with control over your grade.

September 30, 2009 9:34 am

Tom: I am not well versed in some of the technical aspects used to plot these graphs, so I hope my question makes some sense. On this graph you posted, does the smoothing you used take a percentage of the data trends from previous years and include that rise in the last year (i.e. immediate past trends affect the outcome of a selected year on the graph) therefore making the rise in the last two years a product of the inclusion previous years rise?

DR
September 30, 2009 9:52 am

ROM, Mosh, jeff id and anyone else seeing the obvious…..
Tom P = Beaker incognito 🙂 JK
It’s time to ‘move on’.

Michael
September 30, 2009 10:07 am

For those of you who are not informed David Rockefeller is a US Senator.

richcar
September 30, 2009 10:25 am

Even if the AGW cherrypickers are right about elevated atmospheric temps in the arctic with respect to the MWP, what does this have to do with a paleo global temperature reconstruction? It may just demonstrate their sought-after local polar amplification. Surely we must be looking at new reconstructions like the Woods Hole reconstruction of SSTs, which demonstrates that current SSTs are similar to the MWP. After all, we know that global heat must reside primarily in the ocean.
http://www.whoi.edu/page.do?pid=7545&tid=282&cid=59106&ct=162

dearieme
September 30, 2009 11:38 am

Long ago, on another blog, I tried to have a scientific discussion with a Warmmonger. Then I realised that he was not arguing in good faith. Still, he was a “Stuart” not a “Tom”.