Throwing down the gauntlet on reproducibility in Climate Science – Forest et al. (2006)

After spending a year trying to get the data from the author without success, Nic Lewis has sent a letter to the editor of Geophysical Research Letters (GRL) and has written to me to ask that I bring attention to his letter published at Judith Curry’s website, and I am happy to do so.  He writes:

I would much appreciate it if you could post a link at WUWT to an article of mine (as attached) that has just been published at Climate Etc. It concerns the alteration of data used in an important climate sensitivity study, Forest 2006, with a radical effect on the resulting climate sensitivity estimated PDF.

I’m including the foreword here (bolding mine) and there is a link to the entire letter to the editor of GRL.

Questioning the Forest et al. (2006) sensitivity study

By Nicholas Lewis

Re: Data inconsistencies in Forest, Stone and Sokolov (2006) GRL paper 2005GL023977 ‘Estimated PDFs of climate system properties including natural and anthropogenic forcings’

In recent years one of the most important methods of estimating probability distributions for key properties of the climate system has been comparison of observations with multiple model simulations, run at varying settings for climate parameters.  Usually such studies are formulated in Bayesian terms and involve ‘optimal fingerprints’. In particular, equilibrium climate sensitivity (S), effective vertical deep ocean diffusivity (Kv) and total aerosol forcing (Faer) have been estimated in this way. Although such methods estimate climate system properties indirectly, the models concerned, unlike AOGCMs, have adjustable parameters controlling those properties that, at least in principle, are calibrated in terms of those properties and which enable the entire parameter space to be explored.

In the IPCC’s Fourth Assessment Report (AR4), an appendix to WGI Chapter 9, ‘Understanding and attributing climate change’[i], was devoted to these methods, which provided six of the chapter’s eight estimated probability density functions (PDFs) for S inferred from observed changes in climate. Estimates of climate properties derived from those studies have been widely cited and used as an input to other climate science work. The PDFs for S were set out in Figure 9.20 of AR4 WG1, reproduced below.

The results of Forest 2006 and its predecessor study Forest 2002 are particularly important since, unlike all other studies utilising model simulations, they were based on direct comparisons thereof with a wide range of instrumental data observations – surface, upper air and deep-ocean temperature changes – and they provided simultaneous estimates for Kv and Faer as well as S. Jointly estimating Kv and Faer together with S is important, as it avoids dependence on existing very uncertain estimates of those parameters. Reflecting their importance, the IPCC featured both Forest studies in Figure 9.20. The Forest 2006 PDF has a strong peak which is in line with the IPCC’s central estimate of S = 3, but the PDF is poorly constrained at high S.

I have been trying for over a year, without success, to obtain from Dr Forest the data used in Forest 2006. However, I have been able to obtain without any difficulty the data used in two related studies that were stated to be based on the Forest 2006 data. It appears that Dr Forest only provided pre-processed data for use in those studies, which is understandable as the raw model dataset is very large.

Unfortunately, Dr Forest reports that the raw model data is now lost. Worse, the sets of pre-processed model data that he provided for use in the two related studies, while both apparently deriving from the same set of model simulation runs, were very different. One dataset appears to correspond to what was actually used in Forest 2006, although I have only been able to approximate the Forest 2006 results using it. In the absence of computer code and related ancillary data, replication of the Forest 2006 results is problematical. However, that dataset is compatible, when using the surface, upper air and deep-ocean data in combination, with a central estimate for climate sensitivity close to S = 3, in line with the Forest 2006 results.

The other set of data, however, supports a central estimate of S = 1, with a well constrained PDF.

I have written the below letter to the editor-in-chief of the journal in which Forest 2006 was published, seeking his assistance in resolving this mystery. Until and unless Dr Forest demonstrates that the model data used in Forest 2006 was correctly processed from the raw model simulation run data, I cannot see that much confidence can be placed in the validity of the Forest 2006 results. The difficulty is that, with the raw model data lost, there is no simple way of proving which version of the processed model data, if either, is correct. However, so far as I can see, the evidence points to the CSF 2005 version of the key surface temperature model data, at least, being the correct one. If I am right, then correct processing of the data used in Forest 2006 would lead to the conclusion that equilibrium climate sensitivity (to a doubling of CO2 in the atmosphere) is close to 1°C, not 3°C, implying that likely future warming has been grossly overestimated by the IPCC.

This sad state of affairs would not have arisen if Dr Forest had been required to place all the data and computer code used for the study in a public archive at the time of publication. Imposition by journals of such a requirement, and its enforcement, is in my view an important step in restoring trust in climate science amongst people who base their beliefs on empirical, verifiable, evidence.

Nic Lewis

==============================================================

Just let me say that there’s movement afoot to address the issues brought up about reproducibility in journal publications in the last paragraph. I’ll have more on this at a future date.

Here’s the foreword and letter to the GRL editor in PDF form:  Post on Forest 2006 GRL letter final

This figure from that letter by Lewis suggests a lower climate sensitivity to a doubling of CO2 than the original:

-Anthony

I’ve been doing my best to help this along. This should be brought to the attention of Tom Hammond on the House Science Committee. We had a long discussion about exactly this sort of thing, and since the US government almost always pays or helps pay for the work, it isn’t crazy to insist on it.
rgb

Doug

The Journal of Irreprocible Results was always one of my favorites!

Does this mean GRL paper 2005GL023977 should now be considered “grey literature”?

Doug

Make that Irreproducible. It still exists!
http://www.jir.com/

Kaboom

You can’t show the data means you don’t have a paper.

Interstellar Bill

As if there was such a thing as global climate sensitivity,
calculated by a mere spatial average of local sensitivities,
but valid for predicting ‘global average temperature’ (whatever that is)
under future emission scenarios.
Talk about far-fetched.
Worse yet, they average various bogus ‘sensitivities’ to get a ‘likely’ sensitivity.
This garbage is as totally removed from climatic reality
as Keynesian economics is from economic reality.

timetochooseagain

Hehe, looks like the high sensitivity “fat tail” is a phantom.
Keep in mind that the scariest climate scenarios are dependent on the “fat tail” for their plausibility, that is, they require a non-negligible probability of sensitivities greater than 4 K per doubling of CO2.

I no longer bother to read or try to understand the information published in these journals, as you simply can’t believe anything they tell you anymore. I make a point of telling this to everyone I know and, since it is known that I’m always interested in the science end of things, people I know are also doubtful of the science published in the msm. So I do get some revenge on the liars after all. But it is a sad and bad time for scientific advancement now.

Kaboom
You can’t show the data means you don’t have a paper.
… as a last resort – the paper challenged use the inside of the toilet roll!

Louis Hooffstetter

“Dr Forest reports that the raw model data is now lost.”
The dog ate my homework defense! – A classic ‘Team’ response!:
http://rogerpielkejr.blogspot.com/2009/08/we-lost-original-data.html
Dr. Forest is obviously a ‘Team’ player!

Baa Humbug

Unfortunately, Dr Forest reports that the raw model data is now lost.

Quite so.

Taphonomic

“Unfortunately, Dr Forest reports that the raw model data is now lost.”
Makes me wonder if the “dog ate my homework” excuse worked for Forest in grammar school, or if it now comes down to “…the accumulation of the raw model data is left as an exercise for the reader.”

Alex Heyworth

Lost in the Forest?

timetochooseagain

If I understand the IPCC’s chart, the “lines” at the bottom are meant to be the PDFs collapsed into central estimates with confidence intervals. Notice how the central estimates are pretty much all to the right of (at higher sensitivity than) the peaks of the PDFs. It seems to me that they may have used inappropriate methods for determining central estimates, given “fat tailed” (i.e. skewed) distributions.

Kaboom

I’d feel for those who used Forest’s work as a key ingredient for their own studies as this pretty much invalidates anything they’ve come up with. But alas they’re likely to be cut from the same cloth of preconcept-dictates-research academics and thus have not added to the body of science with their papers anyway.

Pamela Gray

Once again I don’t get it with the lost data. I am just a podunk nothin in terms of research and have only a decades-old master’s-level research endeavor archived at Oregon State University, plus an article on that research in a major journal. I have no Ph.D. attached to my name, just a bachelor’s and two master’s degrees, and my resume does not come with a vitae. Yet I still have my raw data. I have kept it to this day. I still have a drawing of the electrical components used to generate the stimulus I used. I still have a Polaroid of the stimulus captured on a spectral analyser. I still have the original master’s article, typed on a Wang computer, that was then copied into the archive volume at OSU. And I no longer practice in that field.
Did this guy skip research 101 class?

Dodgy Geezer

What happens in the academic world if you accuse someone of altering data?

Geoff

I note in passing that Chris Forest is now a colleague of Michael Mann at Penn State. Maybe they can undertake a joint project on data management.

Nic Lewis

timetochooseagain:
The filled circles on the 5-95% ranges (which aren’t true confidence intervals) in the bottom section of the IPCC figure are the medians, which as you say are to the right of the peaks of the PDFs. That is actually to be expected, because errors in estimating changes in forcings and ocean heat uptake greatly exceed errors in temperature data. But almost all the distributions are more skewed than that effect would account for: their tails are indeed too fat. In the case of Forster/Gregory 06, that is because the IPCC changed the distribution onto an incorrect basis. And the very bumpy and other strange shaped distributions are self-evidently flawed.

Jason Calley

Losing your data is simply the most extreme version of altering your data.

RobW

“Team Dictionary”
Lost- under no circumstances release raw data to anyone not on the team. All they want to do is find something wrong with it.
And real science takes another hit from the team.

Luther Wu

What, no trolls?

more soylent green!

Reproducibility? Every time we run the same model (ie, computer program) with the same starting points and same data, we reproduce the same results.
/sarc

Hot under the collar

At least climategate didn’t show anyone losing raw data eh!
If one of Dr Forests students lost their coursework data would it be a fail?

Alex

It just amazes me that “scientists” don’t use any source control programs. Try that in any software company nowadays and you would be viewed as a clueless n00b.

tonyb

Pamela Gray
Phil Jones also seemed to have lost his data, so Forest is in good company.
A journalist described Hansen’s office as ‘comically cluttered’ and he was concerned enough to email her saying it was much better than it used to be.
It seems the higher up the food chain, the more haphazard the treatment of the data. Personally I’m not sure I could rely on a paper produced by someone who tries to work in a ‘comically cluttered’ office.
tonyb

G. Karst

“Dr Forest reports that the raw model data is now lost.”

Someone has been reading Climategate E-mails, as an instruction manual. GK

I continue to be amazed that neither the journals nor their peers mandate the release of the data used to support a climate researcher’s or group’s resulting paper. The golden rule of auditing (in any discipline) is that if you don’t write it down, it never happened. In today’s over-hyped Information Age, this presumption that the journal or peer should trust the researcher or group is antiquated to the point of naiveté.
I’m reminded of what Stephen J. Gould wrote in his own (controversial) book “The Mismeasure of Man”, which says it best: “Phony psychics like Uri Geller have had particular success in bamboozling scientists with ordinary stage magic, because only scientists are arrogant enough to think that they always observe with rigorous and objective scrutiny, and therefore could never be so fooled – while ordinary mortals know perfectly well that good performers can always find a way to trick people.” The same, I think, could be said of some climate scientists.
They seemingly and desperately want to believe that humankind is solely responsible for this “catastrophe,” which has been predicted by the models. At day’s end, it’s really rather sad to witness such educated persons failing so publicly, while they remain “eyes wide shut” to the last.

Data that have been lost, destroyed, secreted or are otherwise unavailable are no different from data that are non-existent. Conclusions based on non-existent data are useless.

What a surprise! How utterly original!
Phil can’t figure out what he did with the original data.
UVA claims they lost the emails, oops, unfortunately (for UVA) they were found. It appears that delete key doesn’t always work.
Now the trees (data) can’t be seen and the forest got lost. One does wonder if the data ever really existed. If it did exist, one then wonders if the original data suffered from a reaction to delete key pressing or if it got lost in the recycling (strictly paper bound).
The trouble with the data going the recycle route, just what/when was it electronic data so that computer manipulation was possible? I bet those darn backup servers still have copies.

RHS

I don’t think the dog ate the homework, I think his virus ate the data…

kakatoa

I assume that Dr. Forest, et al. will be using their models to predict, make that simulate via a few scenarios, the effect of CO2 levels for AR5. I hope that more robust means of data management will be followed this time around.
I can’t imagine Mr. Putin agreeing to modify his country’s behavior in regard to CO2 if the scientific experts in his government can’t review the details…

Calm down everyone, data is right here.

timetochooseagain

Nic Lewis- Is there a reason to prefer the median of these distributions as a “central estimate” to the mode?
Well, could be worse, they could have gone with the mean, which would really skew right.

björn

You do not lose your data, period!
It is impossible to tell when and if or why you want to use them again!
Besides, it feels really good to have a huge stack of data, makes you proud of your efforts.
I would feel terrible losing all that work, even if only for sentimental reasons.

Nic Lewis

timetochooseagain: ‘Is there a reason to prefer the median of these distributions as a “central estimate” to the mode?’
I suppose that the median reflects the full distribution to a greater extent than the mode does. But I’m not sure that any ‘central estimate’ is that useful with wide, skewed distributions like these. I prefer to see the full PDF. That has an added advantage: if its shape is peculiar, it warns you to regard the study involved with some suspicion.
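To put rough numbers on the mode/median/mean question for a right-skewed distribution, here is a minimal sketch using a lognormal purely as a stand-in. The lognormal shape and the parameters (median 3.0, sigma 0.5) are illustrative assumptions and are not taken from any of the studies in the IPCC figure.

```python
import math

def lognormal_summaries(mu: float, sigma: float) -> dict:
    """Closed-form mode, median and mean of a lognormal distribution."""
    return {
        "mode": math.exp(mu - sigma ** 2),      # location of the PDF peak
        "median": math.exp(mu),                 # 50% of probability on each side
        "mean": math.exp(mu + sigma ** 2 / 2),  # pulled furthest into the tail
    }

# Hypothetical parameters, chosen only so that the median equals 3.0.
summaries = lognormal_summaries(mu=math.log(3.0), sigma=0.5)
for name, value in summaries.items():
    print(f"{name:>6}: {value:.2f}")
```

For any right-skewed distribution of this kind the ordering is mode < median < mean, which is why the choice of ‘central estimate’ matters so much when the tail is long.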

Follow the Money

Don’t be so harsh, people. They lose and forget lots of things at Penn State.

So Dr Forest faffs for a year… and after a year of faffing, admits (or rather claims) he’s lost the data…
What did he gain by his paper? Quoting in IPCC and consequent kudos…
Dog-Ate-My-DataGate

Mike Jowsey

Excellent research Mr. Lewis. Clearly you have put an enormous amount of (unpaid) work into this study. We await with interest a response from the GRL editor.

timetochooseagain

Nic Lewis says: “I prefer to see the full PDF. That has an added advantage: if its shape is peculiar, it warns you to regard the study involved with some suspicion.”
The shape of the distributions is not surprising, or suspicious, in and of itself, I think. Sensitivity scales with the feedback factor as 1/(1-f), so if the estimate of f is normally distributed, and the mean is greater than zero, you inevitably get the fat tail in the distribution for estimated sensitivity. But even if we aren’t suspicious, we should check studies to see if the distributions that can be derived from the data really meet the conditions which would lead to a fat tail (a mean of estimated f greater than zero is, I think, the crucial condition, or close enough, but it also depends on the variance of the estimates of f), and if they don’t, the fat tail should disappear.
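The 1/(1-f) argument can be checked with a quick simulation. This is only a sketch under stated assumptions: the no-feedback sensitivity S0 = 1.2 and the normal distribution for f (mean 0.6, standard deviation 0.13) are hypothetical values picked so that the median sensitivity comes out near 3, and draws with f >= 1 are simply discarded. None of these numbers come from the studies discussed.

```python
import random
import statistics

def sample_sensitivity(f_mean, f_sd, s0, n, seed=0):
    """Draw n values of S = s0 / (1 - f), with f ~ Normal(f_mean, f_sd)."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        f = rng.gauss(f_mean, f_sd)
        if f < 1.0:  # discard f >= 1, where s0 / (1 - f) is undefined or negative
            samples.append(s0 / (1.0 - f))
    return samples

# Hypothetical inputs, for illustration only.
S0, F_MEAN, F_SD = 1.2, 0.6, 0.13

samples = sample_sensitivity(F_MEAN, F_SD, S0, n=100_000)
cuts = statistics.quantiles(samples, n=20)   # 19 cut points: 5%, 10%, ..., 95%
p05, p50, p95 = cuts[0], cuts[9], cuts[18]

print(f"5%: {p05:.2f}   median: {p50:.2f}   95%: {p95:.2f}")
print(f"lower half-width: {p50 - p05:.2f}   upper half-width: {p95 - p50:.2f}")
```

Although f is symmetric about its mean, the 5-95% range for S stretches much further above the median than below it. Quantiles are used here rather than the sample mean because, with f allowed to approach 1, the mean of 1/(1-f) is not well behaved.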

Latimer Alder

It’s probably just slipped down the back of the sofa and will turn up soon. Or maybe Forest put it somewhere safe and forgot where it was.
After all he’s not expected to be a well-organised and analytical professional scientist or anything is he? Anybody can lose the data associated with the most important paper they’ll ever write. It’s just so forgettable. And you only remember you’ve lost it when somebody asks to see it…….

from Judith Curry’s comment:
Nic Lewis’ academic background is mathematics, with a minor in physics, at Cambridge University (UK). His career has been outside academia. Two or three years ago, he returned to his original scientific and mathematical interests and, being interested in the controversy surrounding AGW, started to learn about climate science. He is co-author of the paper that rebutted Steig et al. Antarctic temperature reconstruction (Ryan O’Donnell, Nicholas Lewis, Steve McIntyre and Jeff Condon, 2011)…
I have been discussing this issue with Nic over the past two weeks. Particularly based upon his past track record of careful investigation, I take seriously any such issue that Nic raises. Forest et al. (2006) has been an important paper, cited over 100 times and included in the IPCC AR4...
This particular situation raises some thorny issues, that are of particular interest especially in light of the recent report on Open Science from the Royal Society:
.. assuming for the sake of argument that there is a serious error in the paper: should a paper be withdrawn from a journal, after it has already been heavily cited?..

Nic Lewis

timetochooseagain: “The shape of the distributions is not surprising, or suspicious, in and of itself”
I agree that a fat tail distribution is to be expected. I don’t regard a fat tail in itself as a peculiarity, but I do regard multiple peaks and strange bumps and shoulders in the PDF as being peculiar.
Only one of the distributions in the IPCC figure, Gregory 02, is genuinely consistent with a normally distributed estimate of f – and the Gregory 02 PDF is missing nearly half of its probability mass, due to being cut off at f=1. The Forster/Gregory 06 PDF represents a normally distributed estimate for f, but the IPCC experts decided to multiply the resulting climate sensitivity PDF by sensitivity squared – supposedly to make it comparable to the other PDFs!
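For readers wondering what multiplying a sensitivity PDF by sensitivity squared does, here is a rough numerical sketch. It assumes S = S0/(1-f) with a normally distributed f, so that the change of variables gives p_S(S) = p_f(1 - S0/S) * S0/S^2. The values S0 = 1.2, mean 0.6 and standard deviation 0.13, the grid upper limit of 10 and the threshold of 6 are all hypothetical choices for illustration, not the numbers used by Forster/Gregory 06 or the IPCC.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def tail_fraction(weights, grid, threshold):
    """Share of the total weight lying above threshold (uniform grid assumed)."""
    total = sum(weights)
    tail = sum(w for w, s in zip(weights, grid) if s > threshold)
    return tail / total

# Hypothetical inputs, for illustration only.
S0, F_MEAN, F_SD = 1.2, 0.6, 0.13
grid = [0.01 * i for i in range(1, 1001)]  # S from 0.01 to 10.00

# PDF for S implied by a normal f, via the Jacobian |df/dS| = S0 / S**2.
original = [normal_pdf(1.0 - S0 / s, F_MEAN, F_SD) * S0 / s ** 2 for s in grid]

# The same curve multiplied by S**2 (renormalised inside tail_fraction).
rescaled = [p * s ** 2 for p, s in zip(original, grid)]

tail_original = tail_fraction(original, grid, 6.0)
tail_rescaled = tail_fraction(rescaled, grid, 6.0)
print(f"P(S > 6) before: {tail_original:.3f}   after: {tail_rescaled:.3f}")
```

Multiplying by S squared cancels the 1/S^2 factor in the Jacobian, so the weight assigned to high sensitivities no longer falls away as quickly and the share of probability above the threshold rises.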

Berényi Péter

“Unfortunately, Dr Forest reports that the raw model data is now lost.”
Unfortunate indeed. For Dr. Forest the honest course of action to follow at this point is
1. withdraw the paper from GRL immediately, as results described in it are irreproducible
2. remove all references to it from the IPCC AR4 report retroactively
3. have all other researchers withdraw their papers, who have relied on it
4. pay back all grant money gained for this and subsequent research
5. serve proper jail term for animal abuse, letting the dog eat raw data instead of cooked ones

Stacey

Is it worth trying his co authors Messrs Stone and Sokolov surely they must have a copy of the data?

I work with volumes of clinical research data all the time. In the course of a research project I might have several subsets of the original data as queries of the original data produce output that looks at how the experimental variable(s) affect different stratifications of the sample group. Each dataset must be properly validated, versioned, systematically stored according to “best practices.” Further, all data is mirrored and stored in two data centers and warehoused with a data vault company. Such data is considered so precious that such controls are an absolute requirement.
Anyone who outright loses the original data has such bad organization and lack of controls in place that any results of their work must be called into question. I continue to be astounded at the shoddy research practices of these climatologists and even more astounded that their work is not thrown in the waste bin by the publishing journal when such gross negligence is discovered.

Manfred

Small planet
Dr. Forest is now with the Department of Meteorology at the Pennsylvania State University.
Before that, he was with MIT, his thesis advisors were Kerry A. Emanuel and Peter Molnar.
http://ploneprod.met.psu.edu/people/cef13/

Nic Lewis

Stacey: “Is it worth trying his co authors Messrs Stone and Sokolov surely they must have a copy of the data?”
I have tried. I understand Dr Stone was seriously ill when I emailed last year, so I have let him be, poor chap.
I have failed to obtain any response from Dr Sokolov, who is the expert on the MIT 2D climate model. Maybe he thinks that it is entirely Dr Forest’s responsibility to respond, or perhaps he doesn’t like a non-academic poking his nose in.

timetochooseagain

Nic Lewis-Yes, you are correct, I should have said I don’t regard the fat tail itself as suspicious, but like you I do find the odd shoulders or extra local maxima (secondary “modes”) as curious and suspicious. In this regard the worst offender appears to be “Knutti 02” which gives the most outrageous estimate for sensitivity of all of them, surely!

Hot under the collar

@Stacey says,
I suspect the dog’s paw accidentally hit the delete button on the co-author’s computer.