Note: Steve McIntyre is also quite baffled by the Marcott et al paper, finding it unreproducible given the information currently available. I’ve added some comments from him at the bottom of this post – Anthony
Guest Post by Willis Eschenbach
I don’t know what it is about proxies that makes normal scientists lose their senses. The recent paper in Science (paywalled, of course) entitled “A Reconstruction of Regional and Global Temperature for the Past 11,300 Years” (hereinafter M2013) is a good example. It has been touted as the latest hockey stick paper. It is similar to the previous ones … but as far as I can see it’s only similar in how bizarre the proxies are.
Nowhere in the paper do they show you the raw data, although it’s available in their Supplement. I hate it when people don’t show me their starting point. So let me start by remedying that oversight:

Figure 1. All of the proxies from M2013. The colors are only to distinguish individual records; they have no meaning otherwise.
I do love the fact that from that collection of temperature records they draw the conclusion that:
Current global temperatures of the past decade have not yet exceeded peak interglacial values but are warmer than during ~75% of the Holocene temperature history.
Really? Current global temperature is about 14°C … and from those proxies they can say what the past and present global average temperatures are? Well, let’s let that claim go for a moment and take a look at the individual records.
Here’s the first 25 of them:
Figure 2. M2013 proxies 1 to 25. Colors as in Figure 1. Note that each panel has its own vertical axis. Numbers to the left of each title are row/column.
Well … I’d start by saying that it seems doubtful that all of those are measuring the same thing. Panel 3/1 (row 3, column 1) shows the temperature decreasing for the last ten thousand years. Panels 4/4 and 4/5 show the opposite, warming for the last ten thousand years. Panel 4/3 shows four thousand years of warming and the remainder cooling.
Let’s move on to the next 25 contestants:
Figure 3. M2013 proxies 26 to 50. Colors as in Figure 1. Note that each panel has its own vertical axis. Numbers to the left of each title are row/column.
Here we see the same thing. Panels 1/1 and 4/1 show five thousand years of warming followed by five thousand years of cooling. Panel 1/5 shows the exact opposite, five thousand years of cooling followed by five thousand of warming. Panel 4/5 shows steady warming, panel 5/2 shows steady cooling, and panel 2/2 has something badly wrong near the start. Panel 2/4 also contains visible bad data.
Onwards, we near the finish line …
Figure 4. M2013 proxies 51 to 73. Colors as in Figure 1. Note that each panel has its own vertical axis. Numbers to the left of each title are row/column.
Panel 2/1 shows steadily rising temperatures for ten thousand years, as does panel 3/4. Panels 4/1 and 5/1, on the other hand, show steadily decreasing temperatures. Panel 4/2 has a hump in the middle, but panel 1/2 shows a valley in the middle.
Finally, here’s all the proxies, with each one shown as anomalies about the average of its last 2,000 years of data:

Figure 5. All Marcott proxies, expressed as anomalies about their most recent 2,000 years of record. Black line shows 401-point Gaussian average. N=9,288.
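For anyone who wants to poke at Figure 5 themselves, it boils down to two mechanical operations: express each proxy as anomalies about the mean of its most recent 2,000 years, then run a 401-point Gaussian-weighted average over the pooled, age-sorted points. Here’s a minimal sketch of that logic in Python; the function names are mine, and the exact kernel width (window edge near three sigma) is my assumption, not something stated in the caption:

```python
import numpy as np

def to_anomaly(years_bp, temps):
    """Express one proxy as anomalies about the mean of its most recent 2,000 years."""
    recent = temps[years_bp <= years_bp.min() + 2000]  # youngest 2 kyr of this record
    return temps - recent.mean()

def gaussian_smooth(values, n_points=401):
    """Gaussian-weighted running average over n_points neighbours."""
    half = n_points // 2
    x = np.arange(-half, half + 1)
    weights = np.exp(-0.5 * (x / (half / 3.0)) ** 2)  # assumed: window edge near 3 sigma
    weights /= weights.sum()
    return np.convolve(values, weights, mode="same")
```

Pool the anomalies from all 73 proxies, sort the 9,288 points by age, and gaussian_smooth() of that pooled series gives (presumably) the black line.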
A fine example of their choice of proxies can be seen in the fact that they’ve included a proxy which claims a cooling of about nine degrees over the last 10,000 years … although to be fair, they’ve also included some proxies that show seven degrees of warming over the same period …
I’m sorry, guys, but I’m simply not buying the claim that we can tell anything at all about the global temperatures from these proxies. We’re deep into the GIGO range here. When one proxy shows rising temperatures for ten thousand years and another shows dropping temperatures for ten thousand years, what does any kind of average of those two tell us? That the temperature was rising seven degrees while it was falling nine degrees?
And finally, their claim of turning that dog’s breakfast shown in Figure 1 into an absolute global temperature and comparing it to the current 14°C average temperature estimate?
Don’t make me laugh.
I say the reviewers of this paper didn’t use their Mark I eyeball. The first thing to do when dealing with a multi-proxy study is to establish ex-ante criteria for the selection of the proxies (“ex-ante” meaning choose your criteria before looking at the proxies). Here are their claimed criteria …
This study is based on the following data selection criteria:
• Sampling resolution is typically better than ~300 yr.
• At least four age-control points span or closely bracket the full measured interval.
• Chronological control is derived from the site itself and not primarily based on tuning to other sites. Layer counting is permitted if annual resolution is plausibly confirmed (e.g., ice-core chronologies). Core tops are assumed to be 1950 AD unless otherwise indicated in original publication.
• Each time series spans greater than 6500 years in duration and spans the entire 4500 – 5500 yr B.P. reference period.
• Established, quantitative temperature proxies
• Data are publicly available (PANGAEA, NOAA-Paleoclimate) or were provided directly by the original authors in non-proprietary form.
• All datasets included the original sampling depth and proxy measurement for complete error analysis and for consistent calibration of age models (Calib 6.0.1 using INTCAL09 (1)).
Now, that all sounds very reasonable … except that, unfortunately, more than ten percent of the proxies don’t meet the very first criterion: they don’t have a sampling resolution better than one sample per 300 years. Nice try, but eight of the proxies fail their own test.
I must say … when a study puts up its ex-ante proxy criteria and 10% of its own proxies fail the very first test … well, I don’t know what to say.
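Checking that first criterion is mechanical enough that a few lines of code will do it. A sketch, assuming each proxy is just a sorted array of sample ages in years BP; using the median spacing as the “typical” resolution is my reading of their criterion, not necessarily theirs:

```python
import numpy as np

def fails_resolution(ages_bp, cutoff=300):
    """Flag a proxy whose typical sampling interval is coarser than the cutoff (years)."""
    spacing = np.diff(np.sort(ages_bp))  # gaps between successive samples
    return np.median(spacing) > cutoff

# e.g.: failures = [name for name, ages in proxies.items() if fails_resolution(ages)]
```

Run that over the 73 records in the Supplement and count the flags.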
In any case, you then need to LOOK AT EACH AND EVERY PROXY. Only then can you begin to see if the choices make any sense at all. And in this case … not so much. Some of them are obviously bogus. Others, well, you’d have to check them one by one.
Final summary?
Bad proxies, bad scientists, no cookies for anyone.
Regards,
w.
==============================================================
Steve McIntyre writes in a post at CA today:
Marcott et al 2013 has received lots of publicity, mainly because of its supposed vindication of the Stick. A number of commenters have observed that they are unable to figure out how Marcott got the Stick portion of his graph from his data set. Add me to that group.
The uptick occurs in the final plot-point of his graphic (1940) and is a singleton. I wrote to Marcott asking him for further details of how he actually obtained the uptick, noting that the enormous 1920-to-1940 uptick is not characteristic of the underlying data. Marcott’s response was unhelpful: instead of explaining how he got the result, Marcott stated that they had “clearly” stated that the 1890-on portion of their reconstruction was “not robust”. I agree that the 20th century portion of their reconstruction is “not robust”, but do not feel that merely describing the recent portion as “not robust” does full justice to the issues. Nor does it provide an explanation.
Read Steve’s preliminary analysis here:
[UPDATE] In the comments, Steve McIntyre suggested dividing the proxies by latitude bands. Here are those results:
Note that there may be some interesting things buried in there … just not what Marcott says.
Also, regarding the reliability of his recent data, he describes it as “not robust”. It is also scarce. Only 0.6% of the data points are post 1900, for example. This raises the question of how he compared modern temperatures to the proxies, since there is so little overlap.
Finally, about a fifth of the proxies (14 of 73) have the most recent date as exactly 1950 … they said:
Core tops are assumed to be 1950 AD unless otherwise indicated in original publication.
Seems like an assumption that is almost assuredly wrong. I don’t know if that’s a difference that makes a difference; it depends on how wrong it is. If we take the error as half the distance to the next data point for each affected proxy, it averages about ninety years … pushing 1950 back to 1860 … yeah, I’ll go with “not robust” for that.
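For the record, that error estimate is nothing fancier than this (a sketch; the age arrays for the 14 affected proxies would come from the Supplement):

```python
import numpy as np

def coretop_error(ages_bp):
    """Half the gap from the core top (youngest sample) to the next sample down."""
    ages = np.sort(ages_bp)
    return (ages[1] - ages[0]) / 2.0

# Averaging coretop_error() over the 14 proxies pinned at 1950 gives roughly
# ninety years, i.e. a nominal 1950 core top could plausibly date from ~1860.
```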
[UPDATE 2] Yes, I am shoveling gravel, one ton down, six to go … and I do get to take breaks. Here’s the result of my break, the Marcott proxies by type:
And here’s a picture of yr. unbending author playing what we used to call the “Swedish Banjo”.
Best to all,
w.


@Nick
In other words they are guessing.
How can you compare “accurate” instrumental temperature with a temperature reconstructed via a proxy, unless you can verify the accuracy of said reconstructed temperature?
You are guessing!
Lesley,
“How can you compare ‘accurate’ instrumental temperature with a temperature reconstructed via a proxy, unless you can verify the accuracy of said reconstructed temperature?”
They say:
“Unlike the reconstructions of the past millennium, our proxy data are converted quantitatively to temperature before stacking, using independent core-top or laboratory culture calibrations with no post-hoc adjustments in variability.”
Willis,
I think I see a few bars of a Beethoven piano sonata in one of those green lines. Are these graphs drawn by a chimp, or an orangutan? Sometimes it’s hard to know the difference.
This has to be one of your all-time classic posts. These people must think we are all stupid.
Converted, based on what?
What are those “laboratory culture calibrations” based on?
If you know please answer.
Anyone can make up a standard and calibrate to that.
Notice I prefer not to say they cheated, but I’d like to know how they arrived at the temperatures they arrived at.
I’ve read as much as is available, and it is not clear to me.
Skiphil says (March 13, 2013 at 11:55 pm):

R says (March 13, 2013 at 11:43 pm): “… Yes, there are regions on the planet that have experienced large amplitudes of warming/cooling in the last 10,000 years. This is particularly plausible at high latitudes …”

R, how do you know? And how do you know that Marcott et al. obtained a statistically sound sampling of all the earth’s surface?
So just what the hey does a “statistically sound sampling” mean? There is nothing even remotely statistical about sampling.
Sampling means you read the value of some continuous function at a specific “point”; i.e., at some specific values of all of the variables the function has. For “climate” related things, this would typically be at least time and location.
The only requirement the samples must satisfy is the Nyquist sampling theorem. If they don’t do that, then they aren’t valid data of anything, and no amount of statistication will extract meaningful information from them.
Statistics is a process of throwing away information, to derive pseudo information that was never observed by anybody, anywhere, anytime.
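The Nyquist point is easy to demonstrate for yourself: sample a record more coarsely than half the period of a fluctuation and the fluctuation either vanishes or aliases into a spurious slow wobble. A toy sketch (the 300-year spacing matches the paper’s stated resolution criterion; the spike width and timing are arbitrary):

```python
import numpy as np

t_fine = np.arange(0, 10001, 10)                            # "true" record at 10-yr steps
spike = 2.0 * np.exp(-0.5 * ((t_fine - 5000) / 60.0) ** 2)  # a century-scale warm spike

t_coarse = np.arange(0, 10001, 300)                         # 300-yr sampling, per the criteria
sampled = np.interp(t_coarse, t_fine, spike)

print(spike.max(), sampled.max())  # ~2.0 in the fine record; only a fraction of that
                                   # survives at 300-yr spacing -- the spike mostly vanishes
```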
Nick writes “That doesn’t mean they aren’t measuring temps way back, and we do know what temperatures are now. Of course they can be compared – why not?”
The question is why they should be included and compared if they can’t/don’t reproduce modern temperatures, not the other way around.
I don’t have access to the paper, but were the individual proxies detrended for whatever temperature-based selection criteria were applied?
PS: and if the independent core-top or laboratory culture calibrations are accurate, why can’t they use recent actual sediments?
Nick Stokes says: March 14, 2013 at 2:22 am “The discussion on p 6-7 of the SI of the cited reference seems to focus on “age model error” limiting the resolution.”
My apologies here. I’m commenting before finishing reading. Question is, Nick, if the problem is as stated, surely the visual effect would be a series of curves of similar shape merely displaced to left or right?
At this stage I have no confidence that sufficient of the measurements are related in a direct way to temperature with enough confidence to reconstruct temperature.
As for the spike, in the absence of a direct explanation and the presence of an excuse about non-robust data, it would not be unfair to call it invented. Would you agree?
If you were a reviewer, would you pass this paper? I suspect I would not, on what I have read to date, especially after a dish of spaghetti marinara with a strong odour of fish.
“Mount Honey is the highest point on Campbell Island, one of New Zealand’s subantarctic outlying islands”
Hmmm, so what kind of proxy were they using there? Because Google images show it to be a pretty barren place.
@Skiphil: “He strikes me (whatever the weaknesses of the 2013 article) as a real human who should be amenable to sincere scientific discussion”
I think anybody who takes the data that Willis has presented here and then claims it has anything useful to tell us about climate, other than “trees don’t grow well on ice sheets”, whether the aim is getting those all-important letters “Dr” in front of their name or claiming government funding for research, is an irredeemable liar; and if he says something you agree with to your face, he will say the exact opposite the next second if it suits him.
I realise since my last post that the proxies are from plankton samples, which begs the question: how do you get plankton from a lake that is frozen? Several of the proxies seem to be close to the poles or above the snow line. How reliable is a proxy that can only grow in the summer when compared against a proxy nearer the tropics that can grow all year?
Is there any science in this paper at all?
Clive,
“Finally you make a global area weighted average on a 5×5 grid. Did they do that ? “
As I understand it, each proxy is converted to °C without reference to other proxies or to air measurements. Then anomalizing is essentially the same as for thermometer readings, but the actual base interval, 1961-90, is approached indirectly:
“To compare our Standard5×5 reconstruction with modern climatology, we aligned the stack’s mean for the interval 510 to 1450 yr B.P. (where yr B.P. is years before 1950 CE) with the same interval’s mean of the global Climate Research Unit error-in-variables (CRU-EIV) composite temperature record (2), which is, in turn, referenced to the 1961–1990 CE instrumental mean (Fig. 1A).”
So it’s the same process to get an anomaly base. There is some loss of accuracy in aligning it exactly to 1961-90 because of the two-step process, but the main thing is to get the base the same.
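If I follow that description, the alignment is just an offset computed over the shared 510-1450 yr B.P. window, with CRU-EIV already carrying the 1961-1990 reference. A sketch of that logic; the function and array names, and the assumption of comparable time grids, are mine:

```python
import numpy as np

def align_to_modern(stack_t, stack_bp, cru_t, cru_bp):
    """Shift the proxy stack so its 510-1450 yr B.P. mean matches CRU-EIV's.

    CRU-EIV is assumed to be referenced already to the 1961-1990 instrumental
    mean, so after this shift the stack inherits that base (plus alignment error).
    """
    in_win = (stack_bp >= 510) & (stack_bp <= 1450)    # stack points in the window
    in_cwin = (cru_bp >= 510) & (cru_bp <= 1450)       # CRU-EIV points in the window
    offset = cru_t[in_cwin].mean() - stack_t[in_win].mean()
    return stack_t + offset
```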
Geoff S,
“If you were a reviewer, would you pass this paper?”
I probably would have objected to the spiky part, on the basis of what I understand so far. I just think it’s unhelpful. The proxies don’t have the resolution to be reliable, and we have much better information from thermometers. Leave it out.
But there are many good things about the paper. Low-frequency info is bad on a decadal scale, but useful on millennial scales.
Ok, so now I see how this works. We have a spike that comes only from relatively high-resolution data from another source. This spike is actually shown in only one of the proxies in Marcott, but it has been stitched onto the proxy data from Marcott. In reality, the Marcott proxies, such as they are, when treated as the same resolution throughout, show a more or less flat line throughout the entire period 0-10,000 yrs BP.
The same trick is being used as before. Take the adjusted temperature data from 1950 to 1990, which happens to show a spike because it was adjusted to give a spike. Then stitch the tree-ring data onto the lowest part of that to get an estimate from 1950 back to 1450. Then stitch the Marcott proxies onto the lowest part of that data to extend the same load of old cobblers back a further 10,000 yrs. But still the reality is that the only data that actually shows an uptick is the adjusted thermometer data from 1950 to 1990. Everything else has been simply stuck on the bottom end of that gradient to give the impression they know what temps were going back 10,000 years. The reality is the Marcott proxies don’t have the resolution to tell you anything useful about temperature trends today compared to the last 10,000 yrs. Short-duration spikes could be quite common for all we know; what we are living through now (if it is real) could be just one of them. It is yet another mathematical falsehood dressed up to look like real science.
You would only believe that Marcott tells you anything if you are absolutely certain already that temperatures since 1950 have been hotter than at any time for 10,000 yrs. If you are a true believer, then the Marcott process of simply projecting their data backwards from the 1950 temperature record is a perfectly reasonable thing to do. And that is how this big pile of BS is getting through review. It is a religion based on circular arguments. Anybody else can see this pile of poop proves nothing, other than how corrupt publicly funded science has become.
The only conclusions that I can make from this are
A: At least half of the proxies are completely invalid.
and/or
B: The concept of global temperature is fundamentally flawed.
Since the basic trend of each proxy is so different, either global average temperature is not a dominant player in the local or regional temperatures measured by the proxies, or they’re a steaming pile of manure.
Of course averaging these proxies will get a flat line, just as averaging white noise or red noise will get a straight line.
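That last intuition is easy to check: average enough independent red-noise series and the result flattens toward a line. A quick sketch, with 73 series to match the proxy count (the AR(1) coefficient is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n_series, n_steps, phi = 73, 2000, 0.95  # 73 series to match the proxy count

series = np.zeros((n_series, n_steps))
for t in range(1, n_steps):
    # AR(1) "red noise": each step remembers the previous one plus a fresh shock
    series[:, t] = phi * series[:, t - 1] + rng.normal(size=n_series)

stack = series.mean(axis=0)
print(np.std(series), np.std(stack))  # the stack wiggles roughly sqrt(73) times less
```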
I was working with the appendix proxy data, and ran into these problems. Are they normal?
Proxy 46 – TN057-17
The published temperature is the mean of the warm and cold seasons only down to 236 cm depth (7,182 years BP). Deeper (and older) samples have an added step of -1.34 ± 0.15.
Proxy 40 – BJ8 13GGC
There is some noise in the relation between the published proxy and the published temperature; in any case it doesn’t seem to have a great influence.
Proxy 20 – 74KL (TEX86)
I cannot see any relation between the proxy values and the temperature values. Do the values in the Excel file really correspond to those of the original publication?
Nick Stokes says:
March 14, 2013 at 10:30 pm
They say:
“Unlike the reconstructions of the past millennium, our proxy data are converted quantitatively to temperature before stacking, using independent core-top or laboratory culture calibrations with no post-hoc adjustments in variability.”
You must be joking. What they had in the proxy data was showing totally different trends for current times:
http://suyts.wordpress.com/2013/03/14/hockey-stick-found-in-marcott-data/
The splicing of some thermometer data is giving a totally different calibration and totally different trends.
I’m descending to earth in Apollo 11. All is fine and dandy. Only 50 feet till landing. But then suddenly I recall: NASA said the data for the last 50 feet was not ‘robust’…
“aboard Apollo 11” for Pete’s sake… been living in France far too long.
Willis,
great detailed work as usual.
I want to take another tack with this: I have looked at the 73 individual proxies, and only 4 or 5 have an uptick in the most recent data (Radiolaria dominated?) Any equally weighted combination of the 73 cannot result in the published Marcott graph, unless one of the uptick proxies is given very heavy weighting, probably >50%.
Therefore I think the published graph has been fabricated.
The single diatom record does not show a variation for the end of the ice age, the start and end of the Younger Dryas, or any other climate event since then. How is this a temperature proxy?
Apologies for the offending word, but that was not my subject. I wrote about how confirmation bias prompted historians–along with popular media and the most prestigious historical publications–to champion a charlatan whose revisionist ‘research’ was later proved to be a complete fabrication. Unlike climate scientists, the cabal of historians admitted their bad judgment and threw their lying fellow under the proverbial bus. But perhaps they wouldn’t have been so relatively virtuous if endless grant money had been offered to promote the historical revisionism.
Maybe not relevant enough to this thread and worthy of snipping, but it was not an attempted hijacking.
Best regards,
nutso
What is REALLY interesting is that realclimate (Gavin and Co.) appear not to have even discussed this paper on their website. (To be corrected if they posted on it earlier, etc., but it appears not.)
Dave R: “Therefore I think the published graph has been fabricated.” There is a site where fraudulent publications are listed:
http://retractionwatch.wordpress.com/ Maybe it should be reported there.