NOTE: Part 2 of this story has been posted: see The Smoking Code, part 2
The Proof Behind the CRU Climategate Debacle: Because Computers Do Lie When Humans Tell Them To
From Cube Antics, by Robert Greiner
I’m coming to you today as a scientist and engineer with an agnostic stance on global warming.
If you don’t know anything about “Climategate” (does anyone else hate that name?), go ahead and read up on it before you check out this post. I’ll wait.
Back? Let’s get started.
First, let’s get this out of the way: emails prove nothing. Sure, you can look like an unethical asshole who may have committed a felony using government-funded money, but email is just talk, and talk is cheap.
Now, here is some actual proof that the CRU was deliberately tampering with their data. Unfortunately for readability, this code is written in Interactive Data Language (IDL) and is a pain to go through.
NOTE: This is an actual snippet of code from the CRU contained in the source file: briffa_Sep98_d.pro
[sourcecode language="text"]
;
; Apply a VERY ARTIFICAL correction for decline!!
;
yrloc=[1400,findgen(19)*5.+1904]
valadj=[0.,0.,0.,0.,0.,-0.1,-0.25,-0.3,0.,-0.1,0.3,0.8,1.2,1.7,2.5,2.6,2.6,2.6,2.6,2.6]*0.75 ; fudge factor
if n_elements(yrloc) ne n_elements(valadj) then message,'Oooops!'
yearlyadj=interpol(valadj,yrloc,timey)
[/sourcecode]
What does this Mean? A review of the code line-by-line
Starting off Easy
Lines 1-3 are comments
Line 4
yrloc is a 20 element array containing:
the value 1400, followed by 19 years from 1904 to 1994 in increments of 5 years…
yrloc = [1400, 1904, 1909, 1914, 1919, 1924, 1929, … , 1964, 1969, 1974, 1979, 1984, 1989, 1994]
findgen() creates a floating-point array of the specified dimension. Each element of the array is set to the value of its one-dimensional subscript
F = findgen(6) ; F[0] is 0.0, F[1] is 1.0, ..., F[5] is 5.0
Pretty straightforward, right?
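To make the construction concrete, here is a quick sketch in Python with NumPy standing in for IDL. This is an illustration of the same array-building idiom, not the CRU code itself:

```python
import numpy as np

# IDL: yrloc = [1400, findgen(19)*5. + 1904]
# findgen(19) yields [0.0, 1.0, ..., 18.0], so the expression produces
# the years 1904, 1909, ..., 1994, with a lone 1400 prepended.
yrloc = np.concatenate(([1400.0], np.arange(19, dtype=float) * 5.0 + 1904.0))

print(len(yrloc))   # 20
print(yrloc[:3])    # [1400. 1904. 1909.]
```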
Line 5
valadj, or the “fudge factor” array as some arrogant programmer likes to call it, is the foundation for the manipulated temperature readings. It contains twenty seemingly random values. We’ll get back to this later.
Line 6
Just a check to make sure that yrloc and valadj have the same number of elements. This is important for line 8.
Line 8
This is where the magic happens. Remember the array of years (yrloc) from line 4? And remember that random array of numbers (valadj) from line 5? Well, in line 8, those two arrays are interpolated together over the time axis of the temperature series (timey).
The interpol() function takes the known values (valadj at the years in yrloc) and “guesses” at the points in between them, creating a smooth adjustment curve across every year. This technique is often used when dealing with natural data points, just not quite in this manner.
The main thing to realize here is that the resulting yearlyadj array supplies an adjustment for every year, so when it is applied, the valid temperature readings get skewed toward the valadj values.
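A rough Python sketch of what the interpolation step does, with np.interp standing in for IDL’s interpol(). Note that timey here is an assumed yearly axis purely for illustration; in the real program timey is defined elsewhere in the file:

```python
import numpy as np

yrloc = np.concatenate(([1400.0], np.arange(19, dtype=float) * 5.0 + 1904.0))
valadj = np.array([0., 0., 0., 0., 0., -0.1, -0.25, -0.3, 0., -0.1,
                   0.3, 0.8, 1.2, 1.7, 2.5, 2.6, 2.6, 2.6, 2.6, 2.6]) * 0.75

# Assumed yearly time axis for illustration only.
timey = np.arange(1400.0, 1995.0)

# Linearly interpolate the 20 adjustment values onto every year.
yearlyadj = np.interp(timey, yrloc, valadj)

print(float(yearlyadj[0]))    # adjustment in 1400: 0.0
print(float(yearlyadj[-1]))   # adjustment in 1994: about 1.95 (2.6 * 0.75)
```

The result is an adjustment of zero for the early centuries that dips slightly in the 1930s–40s and then ramps up sharply toward the present, i.e. the shape of the valadj array itself.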
What the heck does all of this mean?
Well, I’m glad you asked. First, let’s plot the values in the valadj array.

Look familiar? This closely resembles the infamous hockey stick graph that Michael Mann came up with about a decade ago. By the way, did I mention Michael Mann is one of the “scientists” (and I use that word loosely) caught up in this scandal?
Here is Mann’s graph from 1999
As you can see, (potentially) valid temperature station readings were taken and skewed to fabricate the results the “scientists” at the CRU wanted to believe, not what actually occurred.
Where do we go from here?
It’s not as cut-and-try as one might think. First and foremost, this doesn’t necessarily prove anything about global warming as science. It just shows that the data serving as the chief basis for most of the environmental legislation created over the last decade was a farce.
This means that all of those billions of dollars we spent as a global community to combat global warming may have been for nothing.
If news anchors and politicians were trained as engineers, they would be able to find real proof instead of just speculating about the meaning of emails that only made it appear as if something illegal had happened.
Conclusion
I tried to write this post in a manner that transcends politics. I really haven’t taken much of an interest in the whole global warming debate and don’t have a strong opinion on the matter. However, being part of the science community (I have a degree in Physics) and having done scientific research myself makes me very worried when arrogant jerks who call themselves “scientists” work outside of ethics and ignore the truth to fit their preconceived notions of the world. That is not science; that is religion with math equations.
What do you think?
Now that you have the facts, you can come to your own conclusion!
Be sure to leave me a comment, it gets lonely in here sometimes.
hat tip to WUWT commenter “Disquisitive”
========================
NOTE: While there are some interesting points raised here, it is important to note a couple of caveats. First, the adjustment shown above is applied to the tree ring proxy data (proxy for temperature) not the actual instrumental temperature data. Second, we don’t know the use context of this code. It may be a test procedure of some sort, it may be something that was tried and then discarded, or it may be part of final production output. We simply don’t know. This is why a complete disclosure and open accounting is needed, so that the process can be fully traced and debugged. Hopefully, one of the official investigations will bring the complete collection of code out so that this can be fully examined in the complete context. – Anthony
Anyone see this?
http://www.nationalpost.com/news/canada/story.html?id=2300282
Apparently Climategate was only one of several organized efforts to break into universities and steal data.
JohnSpace nails it:
The “smoking gun” is that CRU, et al did not make their raw data, metadata, code, etc freely available for others to examine, and supposedly “prestigious” “scientific” journals accepted their papers without requiring the supporting materials to be archived. That is not real science, it’s cargo-cult science.
James: To elaborate a little further… for this code to mean anything, it needs to be shown how it affects the output/final result. Mr. Greiner has not done that. I would not bother posting this stuff unless you can determine a little more precisely what effect the suspect code has on the output. If the answer is “none”, then there really isn’t necessarily a problem.
Well, James, if you’ll provide us the actual data and code used for CRU and Mann’s various papers, I’ll bet we can determine a lot of things “a little more precisely”. Anyone who believes Mr Greiner has got it wrong, feel free to provide him with the actual materials. Since CRU at the very least suppressed their data, we are free to draw whatever conclusions we can from what IS available. If CRU (or you, James) objects, hand over the data and code which will refute Mr. Greiner’s and other commenters’ conclusions.
I’m playing Devil’s Advocate a bit here. This code fragment in isolation isn’t a smoking gun, though suspicious. As other blog comments say or imply, it needs to be set clearly in context. Was this code actually run in producing the deceptive output? When? Is it just test code? What is the meaning of the fudge factors (e.g. definite intent to mislead, or reasonable calibration)? Was the code repeatedly re-run with different fudge factors until it converged on a desired result? Is it just somebody playing around with an idea? Is it something left over from a “sand-pit” (i.e. experimental non-production code not intended for release) which, perhaps, got “left in” accidentally in later versions? This is the sort of thing for investigators to ferret out by tracing builds and deliveries: you can’t work this out just by looking at snippets of code or even whole programs, and it would be easy, given the current hysteria, to leap to unwarranted conclusions.
Just a summary: the code was commented out in the fragment posted in the middle of this thread, but it has been noted by others that it was used in other runs. Burch also noted that it was used in another area. As a note, when I spend a lot of time on a piece of code, I may save it in comments in the file. The really dumb part of this is commenting the use of the adjustment, but not commenting the code to generate it. If you don’t use the calculation, normally you comment it out too, as it wastes CPU cycles and storage. Only if you plan to use it elsewhere would you leave it running. Hmm.
To address – “translate to VB… it would be just as obscure.” Arrays in VB can be even funkier, so it would not do any good, especially without the code for Interpol().
The fudge factor… this was done with the adjustment hard-coded. Each element in the array is statically defined. It is zero for the first five elements, dips negative for the next few, then jumps up steeply.
=[0.,0.,0.,0.,0.,-0.1,-0.25,-0.3,0.,-0.1,0.3,0.8,1.2,1.7,2.5,2.6,2.6,2.6,2.6,2.6]*0.75
The multiplier gives the coder a single number to change in order to adjust the slope of the correction: *1.0 is steeper, *0.5 shallower. You can see the results by changing it. A real developer would have used a variable, so there would be no need to scroll into the code to change it (just change the declaration in the header or definition file).
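A quick illustrative sketch in Python (not the original IDL) of how that trailing multiplier scales the whole adjustment curve with one number:

```python
# The hard-coded adjustment values from the CRU snippet, before scaling.
base = [0., 0., 0., 0., 0., -0.1, -0.25, -0.3, 0., -0.1,
        0.3, 0.8, 1.2, 1.7, 2.5, 2.6, 2.6, 2.6, 2.6, 2.6]

# One multiplier controls the slope of the entire "correction":
# a larger factor makes the late-century jump steeper.
for factor in (0.5, 0.75, 1.0):
    valadj = [v * factor for v in base]
    print(factor, max(valadj))  # peak adjustment scales linearly with factor
```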
Nice post, Burch (07:44:25)!
The fact that adding the “fudge” is done in additional places makes it even more suspicious.
While the curve of the adjustments does correspond to the shape of Mann Hockey Stick, that is an incorrect comparison. The adjustment code shown appears to adjust smoothed temperatures from the 1930s onwards, not from hundreds of years ago.
Thus, this article is misleading.
No one yet knows the reason this code fragment exists (or similar code fragments in other files). The fragment raises good questions – but does not yet give a definitive answer.
Please be cautious in interpretation.
(I am also a s/w engineer.)
Just a general comment on the code and comments from various links above.
I saw the numbers in the Excel file for grant dollars. If this crap data was in such a state, why the heck did they not use some of the grant money to hire a grad student, even in Computer Sciences, to get them a real database? They could have created a SQL database from all these crap flat files and made life so much easier. Guess I have been in the corporate enterprise world too long and just do not understand academia at all. It does not address the raw data issue, but it could have preserved it and made it easier. Sigh.
My great fear is that the investigation at UEA will focus on who got the files out and not on the conduct of the “team”.
Whether or not this particular snippet of code is a smoking gun or not, I say
there is definitely the stench of burned black powder in the air.
My friends, these are but the opening battles in a war against media fiction and political rhetoric. As much as you or I understand the falsehoods that the fiction CRU had fed to the alarmists, the moderates in this world, who don’t understand science and scientific peer-review process, tend to understand the stench of corruption.
All we can do as scientists is hammer away in any public forum you can find to bring the corruption into the uncomfortable public glare. The politicians are starting to notice that Climategate can provide an opportunity to garner more votes, and it is our job to persuade the moderate or disinterested voter with facts that get at the truth (or expose the falsehoods). As more voters become persuaded, they will register their views on the opinion polls that the politicians watch like hawks.
For me, the war will be won when all major political parties in democratic states begin to discard climate change policy planks in their election campaigns. As you can see, with elections as much as years away in some countries, the war will be long.
Eric Raymond, a guru blogger of programmers and of the open source movement exposed this code on Nov 24. http://esr.ibiblio.org/?p=1447.
I’m told he had not commented on climate change in his years of blogging before this point. But it’s clear now, he had been watching and knew exactly what to look for.
Anthony, I’m sure Eric has been reading your blog and CA.
Eric (ESR) responds to questions on his site.
———-
“>The “blatant data cooking” is to use the actual thermometer data where it’s available, which, of course, shows no decline over those decades …
esr: Oh? “Apply a VERY ARTIFICAL correction for decline!!”
That’s a misspelling of “artificial”, for those of you slow on the uptake. As in, “unconnected to any f…. data at all”. As in “pulled out of someone’s ass”. You’re arguing against the programmer’s own description, fool!
In fact, I’m quite familiar with the “divergence problem”. If AGW were science rather than a chiliastic religion, it would be treated as evidence that the theory is broken.”
——
“>The program you are puzzling over was used to produce a nice smooth curve for a nice clean piece of COVER ART.
esr: Supposing we accept your premise, it is not even remotely clear how that makes it OK to cook the data. They lied to everyone who saw that graphic.”
——
“>I’m sure we’ll see a correction or retraction here any minute.
esr: As others have repeatedly pointed out, that code was written to be used for some kind of presentation that was false. The fact that the deceptive parts are commented out now does not change that at all.
It might get them off the hook if we knew — for certain — that it had never been shown to anyone who didn’t know beforehand how the data was cooked and why. But since these people have conveniently lost or destroyed primary datasets and evaded FOIA requests, they don’t deserve the benefit of that doubt. We already know there’s a pattern of evasion and probable cause for criminal conspiracy charges from their own words.”
———
This is damning code.
I toyed with it here.
http://joannenova.com.au/2009/11/cru-data-cooking-recipe-exposed/
Imagine what kind of reasons you might come up with for that VERY ARTIFICIAL adjustment.
“Why not just make up a bunch of numbers? Obviously, there is some method to the apparent (to some) madness of the program in question, otherwise why bother?” Billy
Do you think evil is that cut and dried? A computer program is an apparent attempt at proper procedure, and this one probably started that way. At first glance, it appears that due process is followed. And to those unfamiliar with GIGO (garbage in, garbage out), a computer adds legitimacy to the result. I would assume that the code was initially created honestly and then “improved” as was thought necessary.
Computers are great tools for ANY purpose.
This is not smoking code.
This is a fresh pile of steaming code.
This is the pile they were trying to rub our noses in, to break our awful habit of using the most cost-efficient means of energy to heat our homes, produce goods and services, and transport us to work.
I look at this, and I know they know all about “peer-review” but don’t know the first thing about “code-review”.
A nit perhaps: “cut-and-try” means experimental; “cut-and-dry” OTOH means ordinary and routine almost meaning straight-forward. Which was meant? I see I’m not the only one asking.
I think that TKI’s comment points to an area where further investigation is warranted.
I agree with others that it is certainly possible for a programmer to leave code in a module that never gets called. It just doesn’t seem very likely that it wouldn’t have all kinds of “DO NOT USE THIS FUNCTION IN PRODUCTION!” type warnings around it.
WAG (10:00:06),
Notice that the second word in your posted article is “alleged”. Everything in the article is alleged by the IPCC guy. The comments following the article easily debunk the spin.
The guy claims that a hacker “attempted” to break into their computers. FYI, attempts to hack into every computer connected to the internet happen constantly. That’s why people use anti-virus programs.
If the alleged basis to this story was true, it would have been reported by the mainstream media as soon as the emails were leaked. But to make this a talking point more than two weeks later is just another attempt at damage control: “Hey, look over there! I think I saw a hacker!”
This is simply deliberate misdirection. The issue isn’t even the CRU insider who in all probability leaked the info — it’s the obvious fact that at the very least, now no one can rely on any conclusions based on the climate pseudo-science code or the temperature data endlessly massaged and beaten into submission by the CRU.
The CRU’s siege mentality, with the ongoing strategy sessions about how to thwart legal FOI requests are the real story, not this speculation. Conspiring to evade the law is way more serious than this IPCC guy inventing his speculation about a presumed hacking attempt.
If it’s in there but commented out or whatever, that’s standard programmer stuff. Just because it’s been commented out doesn’t mean it hasn’t been run, and since THE ORIGINAL DATA HAS BEEN LOST, I am not willing to give them the benefit of the doubt.
Anyone who thinks this piece of code is The Smoking Gun had better read this blog post, which is linked above under “Possibly Related Posts.”:
http://amanwithaphd.wordpress.com/2009/12/01/i-do-not-think-that-means-what-you-think-it-means/
Yep, there is such a thing as a fudge factor in the code. But the only part of the code that appears to use it is commented out by the semicolons (the computer ignores any line that begins with a semicolon).
So, it looks like this fudge factor might only have been used to check out some code and make sure the arrays worked right. It was used in an assignment (yearlyadj) but not in the actual printout. It was then commented out and not used.
I agree with “Bill” and a few other lonely voices in this thread who believe this piece of code is not a smoking gun. A more likely explanation (as others have suggested) is that it is merely a test of one of the computer models.
To all of those saying Anthony should take down this post:
Can we definitively prove that this code snippet was used to produce anything? No, we can’t, because in all likelihood CRU can’t even reproduce their own products, and they refuse to release their “real” code. And that’s assuming this isn’t “real” code, which is just as specious an argument as saying it’s not…
This has created a vacuum in which the leaked release is the only thing we have to go on, despite repeated requests and, IMO, improper denials through FOIA to see what is actually under-the-hood (a.k.a. access to allow the most basic aspects of the scientific method to be observed)
Self-censoring this post is not the answer, and I think Anthony’s note is more than adequate to explain the context of this code and analysis
I think it would be a good idea to append a “?” to the title of this thread.
RE: Optimizer (09:43:39)
You gave an excellent clarification.
This interpretation seems to be the most reasonable.
http://www.jgc.org/blog/2009/11/very-artificial-correction-flap-looks.html
He even identifies the published paper where this section of code would have applied.
The following text is copied from the JunkScience.com web site. I have not tried to independently verify the claims, but it makes sense.
——————————————
Update: It has become fairly obvious this archive was not “hacked” or “stolen” but rather is a file assembled by CRU staff in preparation for complying with a freedom of information request. Whether it was carelessly left in a publicly accessible portion of the CRU computer system or was “leaked” by staff believing the FOIA request was improperly rejected may never be known but is not really that important. What is important is that:
There was no “security breach” at CRU that “stole” these files
The files appear genuine and to have been prepared by CRU staff, not edited by malicious hackers
The information was accidentally or deliberately released by CRU staff
Selection criteria appear to be compliance with one or several FOIA requests
With some reluctance we have decided to host compressed archives of the hacked files and uncompressed directories you can browse online. Both are linked from the menu or you can simply point your browser to http://junkscience.com/FOIA/
Note, in briffa_Sep98_d.pro this line is never actually used (it is commented out). It is used in another directory, in briffa_Sep98_e.pro, though. There it is used to plot graphs titled:
‘Age-banded MXD from all sites’
and
‘Hugershoff-standardised MXD from all sites’
I’m not sure what these graphs are for. It is important to follow the code all the way through and be accurate, though. ‘Code doesn’t lie’, so nothing is exposed until the analysis is correct.
Imo (accounting for comments from Gavin I can’t verify so take at face value), I think this may have been an abandoned effort.
I suspect they were looking at what it would take to get a hockey stick from the data and saw it wasn’t going to happen that way. At that point they either did the biased selections or clipping and grafting of temp data. (this is speculation on my part, waiting for someone more familiar with the graphs to figure that out).
Bill:
It may be commented out in briffa_sep98_d.pro, but, as has been pointed out more than once here, it is definitely not commented out in briffa_sep98_e.pro (in the harris-tree directory). In that file it is used more than once.
MangoChutney (05:54:06) :
“Just a thought: if the emails and code were obtained by a hacker and not released by a whistleblower, would the evidence be inadmissible in a court of law?”
Fourth Amendment protections apply to actions by police or their agents (cops tell the hotel maid to look for the dope), not to actions by private citizens. Assuming the source isn’t acting at the behest of law enforcement, suppression is unlikely. Even if law enforcement were somehow involved, the “inevitable discovery” exception might apply. Courts are also much less apt to apply the exclusionary rule in a civil action, the most likely venue.
Besides, most important court here is public opinion – excluding “smoking gun” evidence of fraud would have same net effect as suppressing evidence of infidelity in Tiger’s car wreck case.