NOTE: Part 2 of this story has been posted: see The Smoking Code, part 2
The Proof Behind the CRU Climategate Debacle: Because Computers Do Lie When Humans Tell Them To
From Cube Antics, by Robert Greiner
I’m coming to you today as a scientist and engineer with an agnostic stance on global warming.
If you don’t know anything about “Climategate” (does anyone else hate that name?), go ahead and read up on it before you check out this post. I’ll wait.
Back? Let’s get started.
First, let’s get this out of the way: emails prove nothing. Sure, you can look like an unethical asshole who may have committed a felony using government-funded money, but email is just talk, and talk is cheap.
Now, here is some actual proof that the CRU was deliberately tampering with their data. Unfortunately for readability, this code was written in Interactive Data Language (IDL), which makes it a pain to go through.
NOTE: This is an actual snippet of code from the CRU contained in the source file: briffa_Sep98_d.pro
[sourcecode language="text"]
;
; Apply a VERY ARTIFICAL correction for decline!!
;
yrloc=[1400,findgen(19)*5.+1904]
valadj=[0.,0.,0.,0.,0.,-0.1,-0.25,-0.3,0.,-0.1,0.3,0.8,1.2,1.7,2.5,2.6,2.6,2.6,2.6,2.6]*0.75 ; fudge factor
if n_elements(yrloc) ne n_elements(valadj) then message,'Oooops!'
yearlyadj=interpol(valadj,yrloc,timey)
[/sourcecode]
What does this mean? A line-by-line review of the code
Starting off Easy
Lines 1-3 are comments
Line 4
yrloc is a 20 element array containing:
1400, followed by 19 years from 1904 to 1994 in increments of 5 years:
yrloc = [1400, 1904, 1909, 1914, 1919, 1924, 1929, … , 1964, 1969, 1974, 1979, 1984, 1989, 1994]
findgen() creates a floating-point array of the specified dimension. Each element of the array is set to the value of its one-dimensional subscript
F = findgen(6) ; F[0] is 0.0, F[1] is 1.0, ..., F[5] is 5.0
Pretty straightforward, right?
Line 5
valadj, or the “fudge factor” array as some arrogant programmer likes to call it, is the foundation for the manipulated temperature readings. It contains twenty seemingly random values. We’ll get back to this later.
Line 6
Just a check to make sure that yrloc and valadj have the same number of elements. This is important for line 8.
Line 8
This is where the magic happens. Remember that array we have of valid temperature readings? And, remember that array of “random” numbers from line 5? Well, in this line, those two arrays are interpolated together.
The interpol() function takes the valadj values defined at the yrloc years and linearly interpolates (“guesses”) the points in between, producing an adjustment for every year in timey. This technique is often used when dealing with natural data points, just not quite in this manner.
The main thing to realize here is that interpol() produces a yearly adjustment series (yearlyadj) that follows the shape of the valadj values, ready to skew whatever data it is later applied to.
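If you want to see what that call actually produces, here’s a short standalone IDL snippet. It’s my own sketch, not code from the CRU file – in particular, the yearly timey axis is an assumption, since timey is defined elsewhere in briffa_Sep98_d.pro:
[sourcecode language="text"]
; Standalone sketch of the interpol() call above (my code, not CRU's).
; timey is assumed here to be a yearly axis from 1400 to 1994.
yrloc  = [1400, findgen(19)*5. + 1904]
valadj = [0.,0.,0.,0.,0.,-0.1,-0.25,-0.3,0.,-0.1,0.3,0.8,1.2,1.7,$
          2.5,2.6,2.6,2.6,2.6,2.6]*0.75
timey  = findgen(595) + 1400.
yearlyadj = interpol(valadj, yrloc, timey)   ; one adjustment value per year
print, yearlyadj[where(timey eq 1994.)]      ; prints ~1.95, i.e. 2.6 * 0.75
[/sourcecode]
By the end of the series the interpolated adjustment has climbed to 1.95 (2.6 × 0.75), in whatever units the adjusted data carries.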
What the heck does all of this mean?
Well, I’m glad you asked. First, let’s plot the values in the valadj array.
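If you have IDL handy, here’s a quick sketch of my own that plots them (I’ve kept the 0.75 scaling from line 5 and dropped the lone 1400 point so it doesn’t dominate the x-axis):
[sourcecode language="text"]
; Quick plot of the valadj "fudge factor" values (my code, not CRU's)
yrloc  = [1400, findgen(19)*5. + 1904]
valadj = [0.,0.,0.,0.,0.,-0.1,-0.25,-0.3,0.,-0.1,0.3,0.8,1.2,1.7,$
          2.5,2.6,2.6,2.6,2.6,2.6]*0.75
plot, yrloc[1:*], valadj[1:*], psym=-4, $
      xtitle='Year', ytitle='Adjustment', title='valadj values, 1904-1994'
[/sourcecode]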

Look familiar? This closely resembles the infamous hockey stick graph that Michael Mann came up with about a decade ago. By the way, did I mention Michael Mann is one of the “scientists” (and I use that word loosely) caught up in this scandal?
Here is Mann’s graph from 1999
As you can see, (potentially) valid temperature station readings were taken and skewed to fabricate the results the “scientists” at the CRU wanted to believe, not what actually occurred.
Where do we go from here?
It’s not as cut-and-dried as one might think. First and foremost, this doesn’t necessarily prove anything about global warming as science. It just shows that all of the data that was the chief basis for most of the environmental legislation created over the last decade was a farce.
This means that all of those billions of dollars we spent as a global community to combat global warming may have been for nothing.
If news station anchors and politicians were trained as engineers, they would be able to find real proof and not just speculate about the meaning of emails that only made it appear as if something illegal happened.
Conclusion
I tried to write this post in a manner that transcends politics. I really haven’t taken much of an interest in the whole global warming debate and don’t really have a strong opinion on the matter. However, being part of the Science Community (I have a degree in Physics) and having done scientific research myself makes me very worried when arrogant jerks who call themselves “scientists” work outside of ethics and ignore the truth to fit their pre-conceived notions of the world. That is not science, that is religion with math equations.
What do you think?
Now that you have the facts, you can come to your own conclusion!
Be sure to leave me a comment, it gets lonely in here sometimes.
hat tip to WUWT commenter “Disquisitive”
========================
NOTE: While there are some interesting points raised here, it is important to note a couple of caveats. First, the adjustment shown above is applied to the tree ring proxy data (proxy for temperature) not the actual instrumental temperature data. Second, we don’t know the use context of this code. It may be a test procedure of some sort, it may be something that was tried and then discarded, or it may be part of final production output. We simply don’t know. This is why a complete disclosure and open accounting is needed, so that the process can be fully traced and debugged. Hopefully, one of the official investigations will bring the complete collection of code out so that this can be fully examined in the complete context. – Anthony


HAL 9000 did not malfunction. It was an error in the human programming and human direction that created the unfortunate problems.
Mr. Jones knew the raw data was lost, for whatever reason. Further, it happened before he headed up CRU, which would be all the more reason for him to have presented this fact early on and provided his “adjusted” data for review as such. He had many years to say “we lost data and here is how we approached the problem”. But he and CRU did not take that path.
The emails and code show us that humans were involved, that programs were adjusted, and that bad direction (decisions) was given – no different than HAL 9000.
The smoking gun is the agenda that these actions underwrite.
I’m not convinced that this microscopic tearing apart of lines of code trawling for comments, semicolons and arrays is helping to clarify matters, so I may move to a different thread which looks at the whole cadaver, not just the odd suspicious skin mark. What I want to know is:
(1) what is the basic structure of the model? i.e.:
– How are the real physical processes simplified mathematically;
– How are a priori unknown parameters in the model given values to make the model match reality in the form of past data?
– Does the resulting model behave (a) deterministically (b) chaotically (c) periodically (d) some mixture of these (e) chaotically but with bounds (f) “exponential” runaway (g) something new?
– Is the model sensitive to small errors in data (linked to b and e above)?
(2) How was this model encoded in the form of a computer algorithm?
– Has it been tested?
– How?
(3) Where did the actual underlying raw data come from and how was it sanitized (possibly quite legitimately) for use in this model?
(4) Is the model intended for forward extrapolation?
– Is the extrapolation valid?
– Does the extrapolation inherently contain show-stopping flaws (see below)?
(5) Did the output of the preceding steps actually get supplied to IPCC?
At the moment all I see in this thread is commentary on fragments of code which appear to be merely attempting to do a complex best-fit of disparate data and trying different techniques: I see no physical modelling at all! Yet I can hardly believe that this fiasco all boils down to mathematical artefacts and inappropriate curve fitting and extrapolation, such as (i) extending a series of spline curves beyond the actual data range, or (ii) using second- or greater-degree polynomials to smooth data points. Such techniques would be fine for presentation, to show smoothed and interpolated data in order to emphasise trend instead of random errors, but they would obviously generate hockey-sticks just outside the range of the data (at both ends). Such an error would be so elementary it should not get past the first stage of a peer review. My limited understanding is that in any case some error like this had already been flushed out by demonstrating that random data caused hockey sticks. Is this model really so opaque and spaghetti-like that one cannot demonstrate conclusively whether or not it is inherently a hockey-stick generator, regardless of data?
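To illustrate the end-effect I mean, here is a toy IDL sketch of my own with made-up, trendless data (nothing to do with CRU’s actual code or data):
x = findgen(50)                 ; 50 "years" of synthetic data
y = 0.1*randomn(seed, 50)       ; trendless noise
c = poly_fit(x, y, 3)           ; third-degree polynomial fit
xext = findgen(70)              ; evaluate 20 steps past the data range
plot, xext, poly(xext, c)       ; the fitted curve runs away beyond the data
oplot, x, y, psym=4             ; original points for comparison
The fit is perfectly respectable inside the data range and meaningless just outside it – exactly the sort of artefact that should never survive review.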
I’m not a climatologist, warmist, or coolist. I’m only a software engineer who, in the past, modelled simple physical systems; I have only come into this issue, as a spectator, in the last week. I cannot believe that the questions above have not been asked before, and so I plan to look for them and their answers in the existing material.
Aside from this: have there been any individual or mass suicides, apocalyptic Heaven’s Gate style, resulting from the interaction of the global warming hypothesis on panicky over-suggestible people?
The following code is from
FOIA\documents\osborn-tree6\summer_modes\data4alps.pro
This file is dated 11.08.2008. (dd.mm.yyyy)
doinfill=0 ; use PCR-infilled data or not?
doabd=1 ; use ABD-adjusted data or not?
docorr=1 ; use corrected version or not? (uncorrected only available
; for doinfill=doabd=0)
…
printf,1,'IMPORTANT NOTE:'
printf,1,'The data after 1960 should not be used. The tree-ring density'
printf,1,'records tend to show a decline after 1960 relative to the summer'
printf,1,'temperature in many high-latitude locations. In this data set'
printf,1,'this "decline" has been artificially removed in an ad-hoc way, and'
printf,1,'this means that data after 1960 no longer represent tree-ring'
printf,1,'density variations, but have been modified to look more like the'
printf,1,'observed temperatures.'
It seems they had a pre-corrected version of the data, aka ‘value added data’,
because there is no code actually implementing a correction.
Same thing in
FOIA\documents\osborn-tree6\summer_modes\data4sweden.pro
dated 28.11.2006 and
FOIA\documents\osborn-tree6\summer_modes\hovmueller_lon.pro
dated 28.02.2007
Anthony: Thanks for the caveat; I’m not sure all readers distinguish between your views and your guest posters’, and I know that you wouldn’t be so extreme as to say “that all of the data that was the chief basis for most of the environmental legislation created over the last decade was a farce” as the result of one bit of out-of-context code. Perhaps the caveat could be moved to the top?
I do have a theory about the context here, echoing “Morgan” and “TKI” above: I think this was done as input to the tree ring calibration. We know from other code (like that posted by TKI) above and the Harry file that the recent “decline” in the tree-ring data was a problem when trying to calibrate the earlier data against the instrumental series – such a known (but unexplained) divergence would throw off the whole thing; at least by reducing the correlation but also most likely by offsetting the whole reconstructed tree data upwards.
Assuming for a moment one believes that the tree ring decline happened for some valid measurement reason, as opposed to trees “peak clipping” the temperature signal, then the logical thing to do here would be to truncate both series at the point where the decline starts, and only correlate before that. Indeed, several code snippets indicate doing just that. You then can at least validly say that the trees correlate with R^2=xx up to 1960, or something like that.
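As a rough sketch of that idea, with purely synthetic numbers of my own (not CRU’s data or code):
years = findgen(100) + 1900.                           ; 1900-1999
temps = 0.01*(years - 1900.) + 0.05*randomn(seed, 100) ; synthetic "instrumental" series
proxy = temps + 0.05*randomn(seed, 100)                ; proxy tracks temperature...
post  = where(years gt 1960.)
proxy[post] = proxy[post] - 0.02*(years[post] - 1960.) ; ...then diverges after 1960
pre   = where(years le 1960.)
print, 'r up to 1960:  ', correlate(proxy[pre], temps[pre])
print, 'r, full period:', correlate(proxy, temps)
The truncated correlation is one you can defend; the full-span one is dragged down by the post-1960 divergence – which is precisely the temptation to start fudging.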
Another, much more dodgy way, would be to fudge the more recent values so they track recent temperatures more closely before trying the calibration. This obviously produces a completely bogus correlation overall, but it might help you get a rough idea of what the calibration offset needs to be for earlier years.
From the comments Morgan posted above, they obviously believed that they could then regenerate a valid correlation/calibration using the real data afterwards. Maybe that’s why the code was commented out – the first run was with the fudge, to seed it, the second without. Obviously one would wish for some kind of parameter to the code to switch it on or off, but I guess if this is only going to be done once, that’s pretty typical programmer behaviour.
(Actually this idea reminds me of bootstrapping a compiler, if that makes sense to any other (former) compiler-heads. You don’t expect the first output to be any good, it just has to be enough to compile the real thing)
But whether it’s valid or not, without any evidence of this actually being used in anger, any idea that any published reconstruction, let alone current data, was “fudged” or “falsified” using it is pure speculation. You simply cannot generalise this to invalidity of all climate data, not least because the CRU data tracks other independent series extremely closely for recent years:
http://www.woodfortrees.org/plot/hadcrut3vgl/from:1979/offset:-0.15/mean:12/plot/gistemp/from:1979/offset:-0.24/mean:12/plot/rss/mean:12/plot/hadcrut3vgl/from:1979/offset:-0.15/trend/plot/gistemp/from:1979/offset:-0.24/trend/plot/rss/trend
(UAH left out because of known divergence between it and RSS)
But a caveat of my own: Many eyes make all bugs (and dodgy algorithms) shallow. Clearly the whole process needs to be thrown out in the open – and maybe even make use of the power of the Open Source movement to actually assist with the coding? I’m already doing this to a small degree with WFT in the technologies I know, but which aren’t really ideal for statistical work; but I’m sure there are lots of R, IDL etc. experts out there who would love to help!
@woodfortrees (Paul Clark) (04:01:43) :
“Perhaps the caveat could be moved to the top?”
Seconded
woodfortrees (Paul Clark) (04:01:43) :
Thanks for that level headed comment!
TKl (01:05:58) :
It seems, they had a pre-corrected version of data, aka ‘value added data’,
because there is no code actually implementing a correction.
If you care to read what you copied:
“printf,1,'IMPORTANT NOTE:'
printf,1,'The data after 1960 should not be used.' ”
THE DATA SHOULD NOT BE USED
ie TRUNCATE the data at 1960
It does not say use the fudged data after 1960
I couldn’t resist a satirical poke at Monbiot (Dec 4) in the UK’s Guardian.
( see http://www.guardian.co.uk/environment/blog/2009/dec/04/debate-climate-sceptics?showallcomments=true#end-of-comments )
Mr Monbiot, bravo to you, sir!
You made a mug of that old fart, Lawson by ridiculing him with HadCRUT3 temperature series wheeze! I literally wet myself when I read your ‘Guardian’ piece where you say, “What it actually shows is that eight out of the 10 warmest years since records began have occurred since 2001.” Corker! Mum’s the word now on that ‘reconstructed’ 1000 year record set ; )
No one came back at you with the 12 Oct 2009 email, either. You know the part – where that dullard, Trenberth says to Mann, “The fact is that we can’t account for the lack of warming at the moment and it is a travesty that we can’t…. Our observing system is inadequate.” Idiot!
Trenberth and Jones are too much of a liability now. I’m starting to like that ‘apology’ you made more and more. I think I see where you’re taking this one ( thinking: sacrifices for the cause). The team talk in the locker room is Jones and Trenberth are plum scapegoats – throw them out and keep the integrity of the team intact, right? We may do something about this on RC. MM was wondering if you’d be up for more flim-flam in case someone does another ‘Trenberth’?
Btw, SM and his team of holocaust deniers over on CA and WUWT haven’t yet chewed over the lost 800+ ground- based climate-measuring stations from the official GIStemp. We might want to cull another set of ‘cold’ ground-based stations and augment the HadCRUT3 with a slew from China near some power stations (UHI?). Any thoughts?
Regards,
GB
mkurbo (00:05:32) :
The raw data is not lost – it just does not reside in uncorrected form at CRU.
The majority of this early and later data will be recorded on paper reports. To retrieve it is not a job for a computer but for manual eyeball and brain.
An example is ships’ log books:
http://www.corral.org.uk/Home
Costs:
http://www.corral.org.uk/Home/project-meetings/team-meeting-28th-september-2009-1
These would be much more ordered and readable than a pile of photocopied papers from around the world.
I believe these logbooks from the 40s have been used by CRU to provide a more accurate SST record from that date.
Having got your paper copies you then would have to assess each site manually or by comparison to near neighbours.
Any duplicates will have to be removed. Any out of tolerance readings killed or perhaps the whole site or period for a site removed. Other errors due to manual data transfer and entry will also have to be corrected (misreading a 2 instead of a 7 etc)
Having done this, other adjustments would be needed for site relocation (manual?), recording times and UHI (partially manual – just because, say, “oswalds creek” has a population of 20 does not mean that the measurements are not affected by building work).
Having done this you have value-added results, which will eventually be made available (when national met offices give permission).
If you want raw data, what you will get would be the paper copies. How on earth will you retrieve valid data from these unless you spend the next 30 years investigating and adjusting just as CRU has? Why will yours be more acceptable? Can we afford to wait 40 years for the results?
Even the Surface Station project (last updated 2008/04/18) with all its helpers has not managed to produce an output!
This graph showing solar activity is reversed left to right otherwise most people would recognize the “hockey stick”.
http://en.wikipedia.org/wiki/File:Carbon14_with_activity_labels.svg
Based on this graph, it could be “scientifically” argued that C14, not CO2, is what is driving climate change.
bill (07:13:28) :
“Having done this, other adjustments would be needed for site relocation (manual?), recording times and UHI (partially manual – just because, say, “oswalds creek” has a population of 20 does not mean that the measurements are not affected by building work).”
You claim that adjusting for the UHI of a site with a metropolitan population of 20 is necessary, and that such adjustment will have to be site specific and capable of capturing such effects as ‘building work’, and thus need to be performed manually.
Please document that this necessary procedure has been performed in the CRU dataset for all sites with a nearby population of 20 or more, and that it has been performed properly. Ditto any other alleged temperature datasets.
“If you want raw data, what you will get would be the paper copies. How on earth will you retrieve valid data from these unless you spend the next 30 years investigating and adjusting just as CRU has?”
Please explicitly define what is meant by ‘just as CRU has’? Really, a lot of people are very interested to know what that is, and have gone so far as to file FOIA requests to learn. If you know, please out with it.
“Why will yours be more acceptable?”
Because it will include proper UHI adjustment, be open source, and not have been produced by secretive tribalists intent on hiding the decline.
“Can we afford to wait 40 years for the results?”
Absolutely. If that is what it takes, we can’t afford not to.
That said, the UK Met seems to think they can get it done in three. They can probably do it in less than ten. As the current ‘no global warming’ period is already at least that long, we can wait.
JJ
bill (06:30:21) :
> If you care to read what you copied:
> “printf,1,’IMPORTANT NOTE:’
> printf,1,’The data after 1960 should not be used. ”
>
> THE DATA SHOULD NOT BE USED
>
> ie TRUNCATE the data at 1960
>
> It does not say use the fudged data after 1960
I did read what I copied. Where did I advise to use the data?
But if one doesn’t want to use the data why “adjust” it?
This is a very sloppy, over-reaching post.
“””Remember that array we have of valid temperature readings?”””
No… I remember an array of years, not temperatures
“””valadj, or, the “fudge factor” array as some arrogant programmer likes to call it is the foundation for the manipulated temperature readings. It contains twenty values of seemingly random numbers. We’ll get back to this later.”””
Is the whole array a fudge factor, or the .75 it is being multiplied by (which you left out of your graph by the way)?
It is not at all clear that the interpol() function is doing what you claim. It seems more likely it is “interpolating” values into the valadj array for the missing years. Do you have more evidence that it is doing something else?
This is embarrassingly bad. Throwing up softballs like this just gives the warmists ammunition to blow us off.
Bill – I guess I’m not being clear enough, sorry…
It’s the conduct that is the smoking gun here – there was ample opportunity for CRU to NOT have ended up in this position.
For this to truly unravel, the two dots that need firm connection are conduct and agenda.
The code means nothing by itself. We have to know where it changed conclusions and publications.
And we don’t know that. And cannot know it yet. Probably no one does, the matter is very complex and the proper records may never have been kept. Or if made, may no longer exist.
Believing Jones or anyone else can explain what was done is fantasy. Some of those involved know more, some less, but no one will have the entire picture, purpose, data, conversion, adjustment, code, storage, and every other step before publication.
Clearly reassessment is the proper path. The alternative, attempting step-by-step correction, will prove a bottomless swamp.
Governments and stations all over the world will have to provide raw data again. And they may no longer have it. If they thought CRU had safely preserved that data for all time then bureaus around the world may have discarded their originals.
And reassessment must be done honestly and independently. Jones and others may or may not have been honest but they are certainly not independent and disinterested. So the original gang must be excluded from influencing any review.
The law will ponder matters for a long time. Violating Freedom of Information Acts is probably the only violation that can be proved. University discipline, if any, is rare and totally unpredictable.
There’s a lot more to this ClimateGate story. This small (2 to 3 dozen) cabal of climate scientists could not have possibly gotten to this point without extraordinary funding, political support at virtually all levels of government, especially at the national level and unparalleled cooperation from the national and world media. This wide-spread networked support continues even as we the people puzzle over what this is all about. I ask you, “What are you seeing and hearing from our national media on the subject?” Anything? What are you seeing and hearing from all levels of our government, local and regional newspapers and media outlets? Anything of substance? At all of these levels the chatter has remained remarkably quite on the subject, wouldn’t you say? Why? What points and positions are you beginning to hear on the radio and see on the television? This cabal of scientists has an unprecedented level of support given the revelations contained in the emails, documented in the computer software code and elaborated in the associated programmer remarks (REM) within the code. And —- this has gone on for years, AND continues even in the presence of the most damning evidence one could imagine, or even hope for. Watergate pales in comparison, given the trillions of dollars in carbon offset taxes, cap & trade fees hanging in the balance and the unimaginable political control over people’s lives this all implies. The mainstream media’s conspiracy of silence proves the point. Their continued cover-up is as much a part of this crime as the actual scientific fraud. ABC, CBS and NBC are simply co-conspirators exercising their 5th Amendment rights.
Anthony, I made another post regarding the CRU’s source code analysis that I did on December 3rd.
http://cubeantics.com/2009/12/climategate-code-analysis-part-2/
I thought you might be interested since you linked to the previous article.
-Robert
I am just an everyday sort of guy, but I am appalled at this lack of respect for the scientific method and the arrogant skewing of data to suit a political agenda. Full disclosure… You bet your ass, full disclosure. I shall continue to follow this as I am totally PISSED OFF about it.
I’m wondering if you have any idea what this code was used for.
I’m also wondering if you’ve ever heard of “simulations”.
Do I need to say more?
Do you know what simulations are?
Do you understand that scientists often generate random data?
Do you have any evidence that this code was used to falsify data, as opposed to simply used to simulate random data according to a certain model?
This blog post is irresponsible. If you want to know what the code was used for, why not ask the coder? Presuming that it was used for nefarious ends is entirely unjustified.
someone mentioned monbiot still claiming this decade has continued the ‘warming’.
well, bbc world service radio is endlessly playing a promo for a series of programs next year looking at the first decade of the 21st century:
it goes:
10 years of cyber technology
10 years of blah blah (forget what the second one is)
10 years of warming of the planet
bbc had pachauri for half an hour last nite with the most inane host, inserting stuff like ‘what EVERYONE’S worried about is the tipping point’ and the like. the audience got to ask reverential questions and appeared to all make money from the AGW industry. climategate or anything associated with it was not mentioned. shame on the media.
where again do the codes come in?
the data in the images appears to be wrong just with a glance. Where art thou, Gyres created from the Earth’s spin? As we go zippin round the sun in orbit, the Earth (sort of like an addict) is spun. There should be circular patterns counter clock-wise in the Northern hemisphere and clock-wise in the Southern.
a quick disclaimer though, I am not a weather person
mkurbo (08:47:16) :
“It’s the conduct that is the smoking gun here – there was ample opportunity for CRU to NOT have ended up in this position.”
Correct. Focus on the unscientific, unprofessional, secretive, manipulative behavior and machinations and corruptions of the scientific process by the team and their allies.
Don’t focus on the code, which can’t be properly analyzed from the outside. The most we can do is find possibly suspicious segments, not yet a smoking gun. And don’t focus on the temperature record. The fudgings that have been done on it will not affect it much. The globe is clearly warming, and we know the general shape of the major up-and-down trends since 1850.
Focus rather on the partisanship and dishonesty that is implied by ANY amount of fudging, data-concealment, and manipulativeness, and on the wider web of bias and corruption that must exist in the field for these spiders to have operated so successfully for so long.
The key point of Climategate is that it raises a strong suspicion that “climate science” amounts to “advocacy research,” not that this or that of its findings is incorrect or exaggerated. Its findings and logic MAY be correct, but now we can’t trust them enough to pass ruinously expensive legislation based on them. We would be nuts to “buy” anything from the team—or its allies. They’ve forfeited our trust. There must be a reality check by neutral panels of non-climatologists—a reexamination of everything, and a complete exposure of everything that’s been going on under the rock.
bill (09:47:02) :
“As someone else said – why bother with the code why not just draw the line you want?”
That’s a silly comment. Obviously if your intent is to deceive, you would want as much legitimate data behind the result as possible. You would want to absolutely minimize divergence from the actual data while producing the desired effect.
Here we have seen a willingness to deceive, to “hide the decline”. It’s all about designing the right visual impact for the masses.
It’s quite simply propaganda, not science. If that doesn’t disturb you, it should.
You’re right, e-mails are just talk, but when coupled with their refusal to release research under the FOIA, you have to wonder what they are hiding. It doesn’t help their credibility when they write about destroying data rather than give it up. Actions speak so much louder than words.
“The two MMs have been after the CRU station data for years. If they ever hear there is a Freedom of Information Act now in the UK, I think I’ll delete the file rather than send to anyone.” – Phil Jones, 2 Feb 2005