CRUTEM3 "…code did not adhere to standards one might find in professional software engineering"

Those of us who have looked at GISS and CRU code have been saying this for months. Now John Graham-Cumming has submitted a statement to the UK Parliament about the quality and completeness of the CRU code that has been released, saying "they have not released everything".

http://popfile.sourceforge.net/jgrahamc.gif

I found this line most interesting:

“I have never been a climate change skeptic and until the release of emails from UEA/CRU I had paid little attention to the science surrounding it.”

Here is his statement as can be seen at:

http://www.publications.parliament.uk/pa/cm200910/cmselect/cmsctech/memo/climatedata/uc5502.htm

=================================

Memorandum submitted by John Graham-Cumming (CRU 55)

I am writing at this late juncture regarding this matter because I have now seen that two separate pieces of written evidence to your committee mention me (without using my name) and I feel it is appropriate to provide you with some further information. I am a professional computer programmer who started programming almost 30 years ago. I have a BA in Mathematics and Computation from Oxford University and a DPhil in Computer Security also from Oxford. My entire career has been spent in computer software in the UK, US and France.

I am also a frequent blogger on science topics (my blog was recently named by The Times as one of its top 30 science blogs). Shortly after the release of emails from UEA/CRU I looked at them out of curiosity and found that there was a large amount of software along with the messages. Looking at the software itself I was surprised to see that it was of poor quality. This resulted in my appearance on BBC Newsnight criticizing the quality of the UEA/CRU code in early December 2009 (see http://news.bbc.co.uk/1/hi/programmes/newsnight/8395514.stm).

That appearance and subsequent errors I have found in both the data provided by the Met Office and the code used to process that data are referenced in two submissions. I had not previously planned to submit anything to your committee, as I felt that I had nothing relevant to say, but the two submissions which reference me warrant some clarification directly from me, the source.

I have never been a climate change skeptic and until the release of emails from UEA/CRU I had paid little attention to the science surrounding it.

In the written submission by Professor Hans von Storch and Dr. Myles R. Allen there are three paragraphs that concern me:

“3.1 An allegation aired on BBC’s “Newsnight” that software used in the production of this dataset was unreliable. It emerged on investigation that neither of the two pieces of software produced in support of this allegation was anything to do with the HadCRUT instrumental temperature record. Newsnight have declined to answer the question of whether they were aware of this at the time their allegations were made.

3.2 A problem identified by an amateur computer analyst with estimates of average climate (not climate trends) affecting less than 1% of the HadCRUT data, mostly in Australasia, and some station identifiers being incorrect. These, it appears, were genuine issues with some of the input data (not analysis software) of HadCRUT which have been acknowledged by the Met Office and corrected. They do not affect trends estimated from the data, and hence have no bearing on conclusions regarding the detection and attribution of external influence on climate.

4. It is possible, of course, that further scrutiny will reveal more serious problems, but given the intensity of the scrutiny to date, we do not think this is particularly likely. The close correspondence between the HadCRUT data and the other two internationally recognised surface temperature datasets suggests that key conclusions, such as the unequivocal warming over the past century, are not sensitive to the analysis procedure.”

I am the ‘computer analyst’ mentioned in 3.2 who found the errors mentioned. I am also the person mentioned in 3.1 who looked at the code on Newsnight.

In paragraph 4 the authors write “It is possible, of course, that further scrutiny will reveal more serious problems, but given the intensity of the scrutiny to date, we do not think this is particularly likely.” This has turned out to be incorrect. On February 7, 2010 I emailed the Met Office to tell them that I believed that I had found a wide ranging problem in the data (and by extension the code used to generate the data) concerning error estimates surrounding the global warming trend. On February 24, 2010 the Met Office confirmed via their press office to Newsnight that I had found a genuine problem with the generation of ‘station errors’ (part of the global warming error estimate).

In the written submission by Sir Edward Acton there are two paragraphs that concern the things I have looked at:

“3.4.7 CRU has been accused of the effective, if not deliberate, falsification of findings through deployment of “substandard” computer programs and documentation. But the criticized computer programs were not used to produce CRUTEM3 data, nor were they written for third-party users. They were written for/by researchers who understand their limitations and who inspect intermediate results to identify and solve errors.

3.4.8 The different computer program used to produce the CRUTEM3 dataset has now been released by the MOHC with the support of CRU.”

My points:

1. Although the code I criticized on Newsnight was not the CRUTEM3 code the fact that the other code written at CRU was of low standard is relevant. My point on Newsnight was that it appeared that the organization writing the code did not adhere to standards one might find in professional software engineering. The code had easily identified bugs, no visible test mechanism, was not apparently under version control and was poorly documented. It would not be surprising to find that other code written at the same organization was of similar quality. And the fact that I subsequently found a bug in the actual CRUTEM3 code only reinforces my opinion.

2. I would urge the committee to look into whether statement 3.4.8 is accurate. The Met Office has released code for calculating CRUTEM3 but they have not released everything (for example, they have not released the code for ‘station errors’ in which I identified a wide-ranging bug, or the code for generating the error range based on the station coverage), and when they released the code they did not indicate that it was the program normally used for CRUTEM3 (as implied by 3.4.8) but stated “[the code] takes the station data files and makes gridded fields in the same way as used in CRUTEM3.” Whether 3.4.8 is accurate or not probably rests on the interpretation of “in the same way as”. My reading is that this implies that the released code is not the actual code used for CRUTEM3. It would be worrying to discover that 3.4.8 is inaccurate, but I believe it should be clarified.

I rest at your disposition for further information, or to appear personally if necessary.

John Graham-Cumming

March 2010

=================================

162 Comments
John Wright
March 4, 2010 10:01 am

“Mr Lynn (09:31:39) :
I am the ‘computer analyst’ mentioned in 3.2 . . .
Dr. Graham-Cumming diplomatically leaves out the word ‘amateur’ with which Professor Hans von Storch and Dr. Myles R. Allen denigrated his analysis. ‘Nuff said.”
There’s a world of difference between diligent analysis carried out by an amateur and the amateurish practice of state-funded professionals we’ve seen at CRU.

Jim
March 4, 2010 10:01 am

This isn’t “post-normal” science – it is abnormal science. I never liked that “post-normal” BS anyway.

Jim
March 4, 2010 10:04 am

The term “CRU standards” is an oxymoron.

Jeremy Poynton
March 4, 2010 10:06 am

Seems there was no QA, no source control, no independent auditing; all of which one would expect in any commercial environment of any size. I spent 25 years in IT, in a company which grew from small to huge. When I started, we programmed by the seat of our pants, documentation was scarce, commenting was scarce, and what little testing was done, we did ourselves (and maybe the customer did some when they went live!).
By the time I left, we had full and comprehensive inter-office (global) source code control, QA controlled documentation, a requirement for ISO9001 accreditation, and regular visits from an independent auditing company.
Frankly, what went on at UEA would be laughable were it not so appalling.

Wren
March 4, 2010 10:09 am

Mike (09:27:15) :
I worked as a programmer for a physics research group as an undergraduate and in industry before going to grad school in the 1980’s. It never occurred to the physicists to publish or make original code or even the data available. If someone wanted to replicate their work they would do the experiment themselves and write their own code. The papers described the mathematical methods used so that anyone could write a program to do the number crunching. In industry I often had to deal with very poorly documented code. The standards in industry have likely improved, but a lot of valid work was done in the bad old days.
Since climatology is such a hot area, I too would like to see greater openness. This will cost more money. Granting agencies will need to provide budget lines for professional programmers (rather than cheap students) and web servers for data archives. But, there is no basis for demonizing researchers who have been following the standards in their field or ignoring past work. If you want to redo someone’s chemistry experiment should you be able to demand the use of their test tubes? Maybe some day chemists will be required to archive all their old test tubes. But, that won’t invalidate all the chemistry that was done before.
========
I would prefer people go over my work before I present it as finished. If I show everything as I progress, it would give people the opportunity to be helpful. But I doubt transparency would put an end to FOI requests.

DirkH
March 4, 2010 10:10 am

I wonder whether Professor Hans von Storch and Dr. Myles R. Allen bothered to try to find out anything about the person they called an “amateur computer analyst”. And whether they are always so pedantic in their work.

Larry Geiger
March 4, 2010 10:14 am

Mike (09:27:15) :
This is different.
To rerun the experiment you need the data and the code.
In a physical experiment, the experiment is described, the methods described, and then others may perform the same steps. If lots of others perform the same experiment and get different results, then the original result may not be valid for some reason not described in the process.
I recently read a book about 20th century astronomy. It described the process by which astronomers shared their plates (photos). The astronomers replicated each others work by inspecting the exposed plates. They shared the exact data that was used to produce the results.
In another post, someone suggested that if people didn’t agree with a certain finding, they could just go to Antarctica and drill their own cores. That’s all fine and good if you do the research on your own nickel, but if the US or British government is funding the research, then not just the results, but the data and the code as well, belong to all of us. We should be able to get as much mileage as possible out of every research project we fund, including results by “amateurs”.
Lastly, in this day and age it’s cheap and easy to publish all the data and code. Some of the blogs referenced from WUWT do that every day.

RockyRoad
March 4, 2010 10:14 am

Mike (09:27:15) :
(…)
“If you want to redo someone’s chemistry experiment should you be able to demand the use of their test tubes? Maybe some day chemists will be required to archive all their old test tubes. But, that won’t invalidate all the chemistry that was done before.”
———
Reply:
We don’t need their computers (aka “test tubes”); others are available. What we want is their UNadjusted raw data and their final numbers. Then we’ll see how they tortured the data. If the new results are different than “climate science” results, we’ll certainly show them (and the courts) our algorithms!
But will “climate scientists”, barring some subpoena, ever show us their algorithms?
No worries… subpoenas are on the way.

TIm
March 4, 2010 10:17 am

In light of all that has been revealed about the fallacy of AGW and experience with alternative energy sources in the EU, and Obama is still preaching wind turbines and solar cells!! I’d pull my hair out if I had any!!

March 4, 2010 10:20 am

Graham-Cumming is such an ‘amateur’, with those degrees and real-world experience. I agree (having architected, designed and managed large software systems for NASA and the DoD) that the code we have seen is garbage. And it is clear from the testimony that the code used to generate CRU’s products HAS NOT been released. When it is, with the data, it MUST produce the same numbers as before.
It won’t of course, because there is no way that buggy code could replicate an error message.

Gareth Phillips
March 4, 2010 10:20 am

I note that the Met Office have stated that this year will be the 5th warmest on record. This is interesting considering the northern hemisphere’s winter.
However they qualify their claim by adding in small print, “the warmest taking into account:
El Nino
La Nina
Increased greenhouse gas concentrations
the cooling influence of aerosol particles
Solar effects
Volcanic cooling (if known)
and natural variation in the oceans.
And in conjunction with the University of East Anglia.
Well there we are, all facts piled into a questionable computer model which generates the headline.
This is all highly amusing and very sweet in a strange way, but I suspect in reality the behaviour of my cat is about as valid a predictor as the above.

Rebivore
March 4, 2010 10:22 am

Hotrod asked: “Is there a UK based software developers professional association, or similar group?”
There’s the British Computer Society (BCS) – http://www.bcs.org.

max
March 4, 2010 10:30 am

Mike @9:27,
depends on what physics (or other field) you are doing. When it is practical to reproduce the experiment, it is expected that people wishing to reproduce the work will do so, and it serves as a good check against experimental error; but information about how the experiment was conducted must be provided so that the experiment can be replicated. When it is impractical to reproduce the experiment (a need to book the CERN supercollider to run the experiment, for example), it is expected that the raw data the experiment is based upon will be made available to those wishing to attempt to reproduce the results. When the “experiment” starts with large and multiple data sets and consists of drawing conclusions from the results of manipulating some of those data sets (climate science), it is incumbent upon the “experimenter” to identify the data sets used and the manipulations applied to them, so that the “experiment” can be replicated.

March 4, 2010 10:32 am

wind turbines and solar cells are costly and impractical for most countries.
the code was ‘..unreliable’, hmm now why does that not surprise me?

Eric
March 4, 2010 10:32 am

hmm
Having written code myself in an academic environment (a well-respected research university that shall remain unnamed), I am not a bit surprised that the quality of the software engineering is poor… nor should anybody else be.
I was in an applied mathematics department modeling very complex but much better understood (than climate science) physics problems. Basically, the complex, long-term code was modified and maintained by a bunch of undergrads and first-year grad students who were studying applied mathematics and NOT software engineering, and who spent a semester or a year on it and then moved on. Poor quality code ensued.
“Climate science” being even further from engineering than applied mathematics, it is only to be expected that there would be significant issues with the code at UEA.
This is not a scandal; it is just the way it is with code in academia. BUT it is another good reason to be very skeptical of anybody who sells the results of their academic computer models as being definitive.

Mike H
March 4, 2010 10:33 am

Here’s the logical problem with Dr Jones’s testimony, in which he claimed it was not ‘standard practice’ to release data and computer models so other scientists could check and challenge research.
There are three pieces to the work of Dr Jones.
First, we know and he argues that almost all the weather data is posted all over the world on the Internet so that anyone can get at it. Second, he argues that his papers contain all of the methodology that demonstrates that his number crunching code is valid. Sandwiched in the middle is the third fact that the code itself has not been made publicly available. But we all know how easy it is to put that code up and make it available just like the other data and papers are.
So there is an established precedent for two of the three items, to wit, the publishing of the data and the papers. And, since they can evidently afford to put that information on the Internet, there is simply NO valid reason to hide the code (by not publishing it) unless Dr Jones is hiding something. Enough with the copyright and proprietary ownership red herrings.
Put the three data sets out there and let the chips fall where they may, Dr Jones.

Rob uk
March 4, 2010 10:47 am

Mike (09:27:15) :
I worked as a programmer for a physics research group as an undergraduate and in industry before going to grad school in the 1980’s. It never occurred to the physicists to publish or make original code or even the data available. If someone wanted to replicate their work they would do the experiment themselves and write their own code. The papers described the mathematical methods used so that anyone could write a program to do the number crunching. In industry I often had to deal with very poorly documented code. The standards in industry have likely improved, but a lot of valid work was done in the bad old days.
Since climatology is such a hot area, I too would like to see greater openness. This will cost more money. Granting agencies will need to provide budget lines for professional programmers (rather than cheap students) and web servers for data archives. But, there is no basis for demonizing researchers who have been following the standards in their field or ignoring past work. If you want to redo someone’s chemistry experiment should you be able to demand the use of their test tubes? Maybe some day chemists will be required to archive all their old test tubes. But, that won’t invalidate all the chemistry that was done before.
Mike, would you say that if your life depended on it?

Vincent
March 4, 2010 10:51 am

Dr Graham-Cumming has opened a can of worms here. I’m not referring specifically to the bugs in the code, but the complete lack of even the most basic code quality procedures.
Having worked in commercial applications program development for 30 years, I can only reiterate the disbelief that others have mentioned. In the institutions where I have worked (and they are many), all program code must go through a rigorous process of unit testing, with an audit log kept of all bugs and fixes and the test plans filed away for future reference. The same goes for system testing. This way, if a bug is found, you can check the test plan to see whether a particular logic pathway has actually been tested, or tested with all relevant conditions. However, once the code “goes live” the process is tightened up even more.
While pre-live some corners may be cut in documenting fixes, post-live every change is logged in stone. The source code is usually kept under some kind of development management system so that it can’t be checked out by more than one programmer at the same time. Any fix must have an audit trail that ties it to a documented problem. Another test plan is created to test the fix, which is documented in the code and cross-referenced to the original problem document. Once the fixed program is re-released live, it becomes the new version. This is the basic version control which Dr Graham-Cumming says was missing. There would normally be an archive of each version of the program code. This is the only way you can deal with unexpected problems caused by the fixes themselves, and it allows you to go back to earlier versions to see what effect those fixes have had on the code.
The situation here seems to be closer to what one would expect from a home coder with no controls or quality management at all.
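For readers who have never seen such a process in action, here is a minimal Python sketch of the kind of unit test and test-plan cross-reference described above. It is purely illustrative: the grid_average() function, the sample values and the test-plan IDs are hypothetical, and this is not CRU or Met Office code.

import unittest

def grid_average(station_values):
    """Average a list of station anomalies for one grid cell, skipping missing values (None)."""
    valid = [v for v in station_values if v is not None]
    if not valid:
        return None  # no usable data for this cell
    return sum(valid) / len(valid)

class TestGridAverage(unittest.TestCase):
    # Test plan entry TP-001: missing values must be excluded, not treated as zero.
    def test_missing_values_excluded(self):
        self.assertAlmostEqual(grid_average([0.5, None, 1.5]), 1.0)

    # Test plan entry TP-002: an all-missing cell must return None, not raise an error.
    def test_all_missing(self):
        self.assertIsNone(grid_average([None, None]))

if __name__ == "__main__":
    unittest.main()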

regeya
March 4, 2010 10:53 am

“In light of all that has been revealed about the fallacy of AGW and experience with alternative energy sources in the EU, and Obama is still preaching wind turbines and solar cells!!”
FAIL

George Turner
March 4, 2010 10:58 am

John Graham-Cumming,
I’ve been a professional programmer in the industrial control sector for almost 30 years. I don’t know if the terminology is different between the US and UK, but over here a piece of squirrely code that nets you a couple billion in government dollars is a feature, not a bug. 😉
Of course, all this code should just boil down to Tglobal = f(CO2), where f(CO2) is a fairly simple formula or a look-up table. Even a bad programmer should be able to pound it out in an hour, but instead everyone wants to milk the problem for a couple of decades.

Methow Ken
March 4, 2010 10:58 am

Great job by Dr. Graham-Cumming.
As someone who spent much of a 30-year career doing software engineering on an in-house data system project (>100,000 lines of 4GL code), what most jumped out at me from the start of this thread was this:
“The code had easily identified bugs, no visible test mechanism, was not apparently under version control and was poorly documented.”
Easy ID bugs, NO test system, poor documentation, and NO VERSION CONTROL ?!?!
For a software system of any size and complexity that undergoes many revisions by multiple programmers over time:
Even if all you know about it are these 4 deficiencies, you can pretty much conclude that a lot of it is junk spaghetti-code; and that the results you get from it are largely trash.
Yup: Starting to feel like ”ClimateGate: The Sequel”. . . .

wayne
March 4, 2010 11:00 am

I will be very curious where this branch of climate science, in the area of software, leads. John Graham-Cumming says he has almost 30 years of experience in software; I have written software since 1976. Anyone who has spent that amount of time in the intricacies of proper software generally knows what they are talking about, degree or not. However, there are multiple branches in software science and design that immediately bring in certain flavors of expertise; you can no longer know it all.
Often, the most readable and beautiful code is not necessarily the best software. It usually hinges on efficiency or speed. If you don’t care how slow the software runs, object-oriented design will usually produce the most readable and logical code, but I have seen it cost a 10, 100, or even 1,000 times slowdown. However, if huge amounts of data are involved, especially when the data’s influence spreads across many sub-areas, structured and logical design tends to no longer apply; you get into the area of spatial integrity and locality. This mainly has to do with the internal layout of memory within the computer and jumps directly into multiple-layer cache layouts and pipeline lengths and stalls.
Inside purely efficient code, usually even the base equations are broken into multiple pieces, especially when integrals are concerned, when performing numerical analysis. A long equation will deal with one mathematical operation across the arrays at a time. This can cause the software to look rather bizarre, but it doesn’t necessarily mean the software is sub-par or blatantly bad. I hope software specialists such as John Graham-Cumming are truly trained in this area of expertise, for those used to business and financial software are going to get a rude shock when jumping across to scientific software design; they are two different worlds.
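To make “one mathematical operation across the arrays at a time” concrete, here is a minimal, generic sketch (illustrative only, and nothing to do with the actual CRU code): the same expression written element by element and then as whole-array passes. Both give identical numbers, but the second form is what heavily optimised scientific code tends to look like.

import numpy as np

rng = np.random.default_rng(0)
a, b, c, d = (rng.standard_normal(100_000) for _ in range(4))

# Readable form: evaluate the whole expression one element at a time.
readable = np.empty_like(a)
for i in range(a.size):
    readable[i] = a[i] * b[i] + c[i] * d[i]

# Efficient form: one mathematical operation across whole arrays at a time.
# Each line is a single pass over memory; it looks stranger but runs far faster.
tmp1 = a * b
tmp2 = c * d
fast = tmp1 + tmp2

assert np.allclose(readable, fast)  # same numbers, different organisation of the work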
Still, there are times when bad design is just bad design and bad coding is just bad coding, and any time logic flaws are embedded in the code, it is bad. Error flow is usually one of the first areas to look at for purely bad design.

March 4, 2010 11:00 am

TerryBixler (08:22:24) : Why has no one asked about version control for the data? Code and data go hand in hand.
I’ve wondered the same thing. Most of their data is stored in flat files (ASCII-format files); it is as if they designed it to use version control and then didn’t do it. The check-in metadata would have been so insightful that it almost makes one think they purposely obfuscated their methods.
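As an illustration of how cheap “version control for the data” would have been, here is a minimal sketch that fingerprints a directory of flat ASCII files so that any later change to the inputs is detectable and attributable to a specific run. The directory and file names are hypothetical, not actual CRU or Met Office paths.

import datetime
import hashlib
import json
import pathlib

def data_manifest(data_dir: str) -> dict:
    """Record a SHA-256 hash and size for every flat data file in data_dir."""
    entries = {}
    for path in sorted(pathlib.Path(data_dir).glob("*.txt")):
        entries[path.name] = {
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
            "bytes": path.stat().st_size,
        }
    return {
        "created": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "files": entries,
    }

if __name__ == "__main__":
    # Hypothetical directory of station data files; the printed manifest could be
    # archived alongside every run of the analysis code.
    print(json.dumps(data_manifest("station_data"), indent=2))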

debreuil
March 4, 2010 11:01 am

As a programmer, that is also what surprised me the most. Not just the quality, but the techniques, and the faith in the accuracy of them.
I’m still not sure I believe they didn’t use version control. In the back of my mind I think they have to say that in order not to release the code (and, more, so that no one would see the versions; it would probably be easy to show manipulation with those). If that is true, though, then it shows they know almost nothing about software, and only slightly more about computers.

Eric
March 4, 2010 11:02 am

Rob uk (10:47:09)
There is a fundamental difference between the examples you cite and the climate science case.
In chemistry, experiments are well documented. To reproduce another’s work you don’t need their test tubes, but you do need to know the precise series of steps they took.
In general, climate science has failed to provide the lab notes. In the case of a published temperature reconstruction, they are not providing sufficient detail of how the reconstruction was made for it to be duplicated. Basic things are unreported:
What measuring stations were used
How temperature is assigned to areas with no stations
How the actual temperature measurements are adjusted, and why
These are the basic inputs required to duplicate a study; these are lab notes, not test tubes.
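As a minimal sketch of what such “lab notes” make possible (this is not the actual CRUTEM3 algorithm, and the station identifiers and anomaly values below are hypothetical): once the station list and the gridding rule are documented, anyone can recompute the gridded averages and compare them with the published ones.

from collections import defaultdict

# Documented inputs (hypothetical): station id, latitude, longitude, anomaly in deg C.
stations = [
    ("STN001", 52.6, 1.3, 0.42),
    ("STN002", 51.5, -0.1, 0.35),
    ("STN003", -33.9, 151.2, -0.10),
]

def grid_cell(lat, lon, size=5.0):
    """Documented rule: assign each station to a size-degree by size-degree grid cell."""
    return (int(lat // size), int(lon // size))

def gridded_anomalies(stations, size=5.0):
    """Average the anomalies of all stations falling in each grid cell."""
    sums = defaultdict(lambda: [0.0, 0])
    for _sid, lat, lon, anom in stations:
        cell = grid_cell(lat, lon, size)
        sums[cell][0] += anom
        sums[cell][1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}

print(gridded_anomalies(stations))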