Comprehensive network analysis shows Climategate likely to be a leak

This lends credence to WUWT’s previous analysis by our own Charles the moderator: The CRUtape Letters™, an Alternative Explanation.

Climate-Gate: Leaked

by Lance Levsen, Network Analyst – courtesy of Small Dead Animals


Introduction

Starting some time in mid-November 2009, ten million teletypes began their deet-ditta-dot chatter, reeling off the following headline: “Hackers broke into the University of East Anglia’s Climate Research Unit….”

I hate that. It annoys me because just like everything else about climate-gate it’s been ‘value-added’; simplified and distilled. The contents of FOIA2009.zip demand more attention to this detail and as someone once heard Professor Jones mutter darkly, “The devil is in the details…so average it out monthly using TMax!”

The details of the files tell a story that FOIA2009.zip was compiled internally and most likely released by an internal source.

The contents of the zip file hold one top-level directory, ./FOIA. Inside that it is broken into two main directories, ./mail and ./documents. Inside ./mail are 1073 text files ordered by date. The files are named in order with increasing but not sequential numbers. Each file holds the body and only the body of an email.

In comparison, ./documents is highly disorganized. MS Word documents, FORTRAN, IDL and other computer code, Adobe Acrobat PDFs and data are sprinkled in the top directory and through several sub-directories. It’s the kind of thing that makes a co-worker’s disorganized desk look like the spit and polish of a boot camp floor.

What people are missing entirely is that these emails and files tell a story themselves.

The Emails

Proponents of the hacker meme are saying that s/he broke into East Anglia’s network and took emails. Let’s entertain that idea and see where it goes.

There is no such thing as a private email. Collecting all of the incoming and outgoing email is simple on a mail server. With Postfix, the configuration is always_bcc = <email address>, and the same can be configured for Sendmail and for Exim. Those are the three main mail servers in use in the Unix environment. Two of them, Sendmail and Exim, are or were in use as the external mail gateways and internal mail servers at the University of East Anglia (UEA).

When a mail server receives an email for someone@domain.net, it checks that it is authoritative for that domain. This means that a server for domain.net will not accept email for domain.ca. The mail server will usually then check the email for spam and viruses and run other filters. It will then decide whether to route the email to another server or to drop the email in a user’s mailbox on that server. In all examples examined in the released emails, the mail gateway forwarded the emails to another server.
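The decision described above can be sketched in a few lines of Python. This is a toy model, not UEA's configuration: the domain set, the forwarding table and the internal host name are all made up for illustration.

```python
# Toy model of the gateway's decision; names and tables are hypothetical.
LOCAL_DOMAINS = {"domain.net"}                        # domains this server is authoritative for
FORWARD_TABLE = {"someone": "internal.domain.net"}   # user -> internal server (made up)


def route(recipient: str) -> str:
    """Return what the gateway would do with a message for `recipient`."""
    user, _, domain = recipient.rpartition("@")
    if domain.lower() not in LOCAL_DOMAINS:
        return "reject: not authoritative for " + domain
    # Spam, virus and other filters would run at this point.
    if user in FORWARD_TABLE:
        return "forward to " + FORWARD_TABLE[user]
    return "deliver to local mailbox of " + user


print(route("someone@domain.net"))
print(route("someone@domain.ca"))
```

A real mail server does the same thing with lookup tables and transport maps rather than Python dictionaries, but the order of the checks is the point here.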

The user then has a mail client that s/he uses to read email. Outlook Express, Eudora, Apple Mail, Outlook, Thunderbird, mutt, pine and many more are all mail clients.

Mail clients use one of two methods of reading email. The first is called POP, which stands for Post Office Protocol. A mail client reading email with POP logs into the mail server, downloads the email to the machine running the mail client and will then delete the original email from the user’s spool file on the mail server.

The second protocol is called IMAP, Internet Message Access Protocol. IMAP works by accessing the mailboxes on the mail server and doing most of the actions there. The messages stay on the server rather than being moved to the client machine. Only email that is deleted and purged by the mail client is gone. Either protocol gives the user the opportunity to delete the email completely.
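To make the difference concrete, here is a small sketch in which a Python dictionary stands in for the mailbox on the server. It does not talk to a real server; the function names and the `leave_on_server` option are illustrative only.

```python
# Toy model only: a dict stands in for the server-side mailbox.
def pop_fetch(server_spool: dict, leave_on_server: bool = False) -> dict:
    """POP-style: copy everything to the client, then delete it from the server."""
    local = dict(server_spool)
    if not leave_on_server:
        server_spool.clear()
    return local


def imap_delete(server_mailbox: dict, msg_id: int) -> None:
    """IMAP-style: messages stay on the server until explicitly deleted and purged."""
    server_mailbox.pop(msg_id, None)


spool = {1: "first message", 2: "second message"}
downloaded = pop_fetch(spool)
print(len(downloaded), len(spool))   # the client holds the messages, the server spool is empty

mailbox = {1: "first message", 2: "second message"}
imap_delete(mailbox, 1)
print(sorted(mailbox))               # message 2 is still on the server
```

With the default POP behaviour modelled here, the only surviving copy of a message is the one on the client machine, which is what matters for the argument that follows.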

Most email clients are set up for reading emails with POP by default and POP is more popular than IMAP for reading email.

The released emails are a gold mine for a system administrator or network administrator to map. While none of the emails released contained headers, several included replies that contained the headers of the original emails. An experienced administrator can create an accurate map of the email topography to and from the CRU over the time period in question, 1998 thru 2009.
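As a rough illustration of how quoted headers can be pulled out of reply text, the sketch below scans a message body for header lines that survive behind reply quoting. The header names chosen and the sample body are assumptions for the example; the host names are placeholders, not taken from the archive.

```python
import re

# Matches a header line such as "Received: from ..." even when it is
# prefixed by reply quoting ("> ") inside the body of a later email.
QUOTED_HEADER = re.compile(
    r"^[> \t]*(Received|X-Mailer|Message-Id):\s*(.+)$",
    re.IGNORECASE | re.MULTILINE,
)


def quoted_headers(body: str) -> list[tuple[str, str]]:
    """Return (header name, value) pairs found in quoted text."""
    return [(m.group(1), m.group(2).strip()) for m in QUOTED_HEADER.finditer(body)]


# Made-up sample body; the host names are placeholders.
sample = (
    "Thanks, will do.\n"
    "> Received: from gateway.example.org by mailhost.example.org\n"
    "> X-Mailer: ExampleMailer 1.0\n"
    "> Subject: RE: data\n"
)
for name, value in quoted_headers(sample):
    print(name, "->", value)
```

Collecting the Received lines by date is enough to see which hosts handled mail in which period.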

Over the course of time, UEA’s systems administrators made several changes to the way email flows through their systems. The users also made changes to the way they accessed and sent email.

The Users

Using a fairly simple grep (see footnote 1) we can see that from the start of the time-frame, 1999, until at least 2005 the CRU accessed their email on a server called pop.uea.ac.uk. Each user was assigned a username on that server. From the released emails, we can link usernames to people as such:

The same grep yields some more useful information. For instance, we know that Professor Davies was using QUALCOMM Windows Eudora Light Version 3.0.3 (32) in September of 1999 (ref. email 0937153268.txt). If you look at the README.txt for that version you can see that it requires a POP account and doesn’t support IMAP.
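For readers who want to repeat the search without grep, here is a minimal Python sketch of the same idea. The host name comes from the text above; the pattern, the function name and the assumption that the archive is unpacked under ./FOIA/mail are mine, and this is not the command behind footnote 1.

```python
import re
from pathlib import Path

# The host name comes from the article; the rest of the pattern is assumed.
PATTERN = re.compile(r"pop\.uea\.ac\.uk|^[> \t]*X-Mailer:", re.IGNORECASE)


def grep_mail(mail_dir: str) -> list[tuple[str, str]]:
    """Return (file name, matching line) pairs, much like `grep -i` over ./mail."""
    hits = []
    for path in sorted(Path(mail_dir).glob("*.txt")):
        text = path.read_text(encoding="latin-1", errors="replace")
        for line in text.splitlines():
            if PATTERN.search(line):
                hits.append((path.name, line.strip()))
    return hits


if __name__ == "__main__":
    for name, line in grep_mail("./FOIA/mail"):
        print(f"{name}: {line}")
```

Matching on X-Mailer as well as the POP host is what surfaces details such as the mail client version.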

As mentioned previously, POP deletes email on the server usually after it is downloaded. Modern POP clients do have an option to save the email on the server for some number of days, but Eudora Light 3.0.3 did not. We can say that Professor Davies’ emails were definitely removed from the server as soon as “Send/Recv” was finished.

This revelation leaves only two scenarios for the hacker:

  1. Professor Davies’ email was archived on a server and the hacker was able to crack into it, or
  2. Professor Davies kept all of his email from 1999 and he kept his computer when he was promoted to Pro-Vice Chancellor for Research and Knowledge Transfer in 2004 from his position as Dean of the School of Environmental Sciences.

The latter scenario requires that the hacker would have had to know how to break into Prof. Davies’ computer and would have had to get into that computer to retrieve those early emails. If that were true, then the hacker would have had to get into every other uea.ac.uk computer involved to retrieve the emails on those systems. Given that many mail clients use a binary format for email storage and given the number of machines the hacker would have to break into to collect all of the emails, I find this scenario very improbable.

This means that the mail servers at uea.ac.uk were configured to collect all incoming and outgoing email into a single account. As that account built up, the administrator would naturally want to archive it off to a file server where it could be saved.

This is a simple evolution. You just add a crontab entry to run a shell script that stops the mail server, moves the mail spool file into a file somewhere else, nulls the live spool and restarts the mail server. The account would reside on the mail server; the file could be on any server.
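The rotation step itself is short. The sketch below shows one way it might look in Python rather than shell; the function name and the dated file naming are illustrative, and stopping and restarting the mail server is deliberately left out because it depends on the server in use.

```python
import shutil
import time
from pathlib import Path


def rotate_spool(spool: Path, archive_dir: Path) -> Path:
    """Copy the current spool into a dated archive file, then empty the spool.

    Stopping and restarting the mail server around this step is not shown.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = archive_dir / f"{spool.name}.{stamp}"
    shutil.copy2(spool, target)   # copy first so the live file keeps its ownership and permissions
    spool.write_bytes(b"")        # "null" the live spool
    return target
```

Run from cron against the archive account's spool file, something like this would build up one archive file per run on whatever file server `archive_dir` points at.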

Alternatively you could use a procmail recipe to process the email as it comes in, but that may be a bit too much processing power for a very busy account.

This also helps to explain the general order of the ./mail directory. Only a computer would be able to reliably export bodies of email into numbered files in the FOIA archive. As the numbers are in order not just numerically but also by date, the logical reasoning is that a computer program is numbering emails as they are processed for storage. This is extremely easy to do with Perl and the Mail::Box modules.
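The text mentions Perl and Mail::Box; as a stand-in, here is a sketch of the same kind of export using Python's standard `mailbox` module. It assumes the collected mail sits in an mbox file, and the zero-padded counter is an illustrative naming scheme, not the one used in FOIA2009.zip.

```python
import mailbox
from pathlib import Path


def export_bodies(mbox_path: str, out_dir: str, start: int = 1) -> int:
    """Write each message body to out_dir/<counter>.txt in archive order.

    Returns the number of messages written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    box = mailbox.mbox(mbox_path)
    count = 0
    try:
        for number, message in enumerate(box, start=start):
            # Multipart messages are skipped to keep the sketch short.
            if message.is_multipart():
                continue
            body = message.get_payload(decode=True) or b""
            (out / f"{number:010d}.txt").write_bytes(body)
            count += 1
    finally:
        box.close()
    return count
```

Because the counter only ever increases, files written this way sort in the same order in which the messages were processed.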

The Email Servers

I’ve created a Dia diagram (see footnote 2) of the network topography regarding email only as demonstrated in the released emails. Here’s a jpeg of it:

[Figure: CRU's network for email from 1998 thru 2009.]

The first thing that springs to mind is that the admins did a lot of fiddling with their email servers over the course of ten years. 🙂 The second thing is the anomaly. Right in the middle of 2006-2009 there is a Microsoft Exchange Server. Normally, this wouldn’t be that big of a blip except we’ve already demonstrated that the servers at UEA were keeping a copy of all email in and out of the network. Admins familiar with MS Exchange know that it too is a mail server of sorts.

It is my opinion that the MS Exchange server was working in conjunction with ueams2.uea.ac.uk and I base this opinion on the fact that ueams2.uea.ac.uk appears both before and after the MS Exchange Server. It doesn’t change its IP address nor does it change the type of mail server that is installed on it. There is a minor version update from 4.51 to 4.69. You can see Debian’s changelog between the Exim versions here.

I’ve shown that the emails were collected from the servers rather than from the users’ accounts and workstations, but I haven’t shown which servers were doing the collection. There are two options, the mail gateway or the departmental mail servers.

As demonstrated above, I believe that the numbers of the filenames correspond to the order that the emails were archived. If so, the numbers that are missing represent other emails not captured in FOIA2009.zip.

I wrote a short Bash program (see footnote 3) to calculate the variances between the numbering system of the email filenames. The result is staggering; that’s a lot of email outside of what was released. Here’s a graph of the variances in order as well as a graph with the variances numerically sorted. Graph info down below.

[Figure: Variance from email number to the last email number.]
[Figure: Variances sorted and plotted.]

The first graph is a little hard to read, but that’s mostly because the first variance is 8,805,971. To see a little better, just lop off the first variance and rerun gnuplot. For simplicity, that graph is here. The mean of the variances is 402839.36, so the average number of emails between each released email is 402,839. While not really applicable, it is useful to note that the standard deviation is 736228.56, and you can visualize that from the second graph.

I realize that variance without a reference is useless; in this instance the reference is the number of days between emails. Here is a grep of the emails with their dates of origin.
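The "variances" here are simply the differences between consecutive file numbers. The sketch below shows that arithmetic in Python, not the Bash program from footnote 3. It uses only the three file names quoted in this article; they are not adjacent files in the archive, so the numbers it prints illustrate the method and do not reproduce the figures above.

```python
import statistics

# File name stems quoted in this article. They are NOT adjacent files in the
# archive, so the gaps below only illustrate the arithmetic.
stems = ["0937153268", "1106338806", "1219239172"]


def gaps(names: list[str]) -> list[int]:
    """Differences between consecutive file numbers, in sorted order."""
    numbers = sorted(int(n) for n in names)
    return [b - a for a, b in zip(numbers, numbers[1:])]


differences = gaps(stems)
print(differences)
print("mean:", statistics.mean(differences))
print("sample standard deviation:", statistics.stdev(differences))
```

Feeding the function every file name in ./mail instead of these three would give the full list of differences to plot.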

I do not see the administrators copying the email at the departmental level, but rather at the mail gateway level. This is logical for a few reasons:

  • The machine name ueams2.uea.ac.uk implies that there are other departmental mail servers with names like ueams1.uea.ac.uk (or even ueams.uea.ac.uk), maybe a ueams3.uea.ac.uk. If true, then you would need to copy email from at least one other server with the same scripts. This duplication of effort is inelegant.
  • There is a second machine that you have to copy emails from and that is the MS Exchange server so you would need a third set of scripts to create a copy of email. Again, this would be unlike an Administrator.
  • Departmental machines can be outside the purview of Administration staff or allow non-Administrative staff access. This is not where you want to be placing copies of emails for the purposes of Institutional protection.
  • As shown with the email number variances, and if they are representative of the email number as it passed through UEA’s email systems, that’s a lot of email for a departmental mail server; the volume looks more like that of an institutional mail gateway.

So given the assumptions listed above, the hacker would have to have access to the gateway mail server and/or the Administration file server where the emails were archived. This machine would most likely be an Administrative file server. It would not be optimal for an Administrator to clutter up a production server open to the Internet with sensitive archives.

The Documents

The ./FOIA/documents directory is a complete mess. There are documents from Professor Hulme, Professor Briffa, the now famous HARRY_READ_ME.txt, and many others. There seems to be no order at all.

One file in particular, ./FOIA/documents/mkhadcrut is only three lines long and contains:

	  tail +13021 hadcrut-1851-1996.dat | head -n 359352 | ./twistglob > hadcrut.dat
	  # nb. 1994- data is already dateline-aligned
	  cat hadcrut-1994-2001.dat >> hadcrut.dat

Pretty simple stuff: get everything in hadcrut-1851-1996.dat starting at the 13021st line. From that, take only the first 359352 lines, run them through a program called twistglob in this directory and dump the results into hadcrut.dat. Then append all of the information in hadcrut-1994-2001.dat to the bottom of hadcrut.dat.
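For anyone who doesn't read shell pipelines, the same steps can be sketched in Python. The helper names are mine, the data is a small stand-in, and the identity function stands in for twistglob, whose behaviour is unknown.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator


def tail_from(lines: Iterable[str], first: int) -> Iterator[str]:
    """Like `tail +N`: yield lines starting at 1-based line number `first`."""
    return islice(lines, first - 1, None)


def head(lines: Iterable[str], count: int) -> Iterator[str]:
    """Like `head -n N`: yield at most `count` lines."""
    return islice(lines, count)


def build(old: Iterable[str], new: Iterable[str],
          transform: Callable[[Iterable[str]], Iterable[str]],
          first: int = 13021, count: int = 359352) -> list[str]:
    """Transformed slice of the old file followed by the whole new file."""
    return list(transform(head(tail_from(old, first), count))) + list(new)


# Small stand-in data; the identity function stands in for twistglob.
old = [f"old {i}\n" for i in range(1, 11)]
new = ["new 1\n", "new 2\n"]
print(build(old, new, transform=lambda rows: rows, first=4, count=3))
```

With the script's real numbers, `build` would skip the first 13020 lines of the older file and keep the next 359352.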

….Except there isn’t a program called twistglob in the ./FOIA/documents/ directory. Nor is there the resultant hadcrut.dat or the source files hadcrut-1851-1996.dat and hadcrut-1994-2001.dat.

This tells me that the collection of files and directories in ./documents isn’t so much a shared directory on a server, but a dump directory for someone who collected all of these files. The originals would be from shared folders, home directories, desktop machines, workstations, profiles and the like.

Remember the reason that the Freedom of Information requests were denied? In email 1106338806.txt, Jan 21, 2005 Professor Phil Jones states that he will be using IPR (Intellectual Property Rights) to shelter the data from Freedom of Information requests. In email 1219239172.txt, on August 20th 2008, Prof. Jones says “The FOI line we’re all using is this. IPCC is exempt from any countries FOI – the skeptics have been told this. Even though we (MOHC, CRU/UEA) possibly hold relevant info the IPCC is not part our remit (mission statement, aims etc) therefore we don’t have an obligation to pass it on.”

Is that why the data files, the result files and the ‘twistglob’ program aren’t in the ./documents directory? I think this is a likely possibility.

If Prof. Jones and the UEA FOI Officer used IPR and the IPCC to shelter certain things from the FOIA then it makes sense that things are missing from the ./documents directory. Secondly, it supports the idea that ./documents is in such disarray because it was a dump folder, one explicitly used to collect information for the purpose of release pursuant to a FOI request.

Conclusion

I suggest that it isn’t feasible for the emails in their tightly ordered format to have been kept at the departmental level or on the workstations of the parties. I suggest that the contents of ./documents didn’t originate from a single monolithic share, but from a compendium of various sources.

For the hacker to have collected all of this information s/he would have required extraordinary capabilities. The hacker would have to crack an Administrative file server to get to the emails and crack numerous workstations, desktops, and servers to get the documents. The hacker would have to map the complete UEA network to find out who was at what station and what services that station offered. S/he would have had to develop or implement exploits for each machine and operating system without knowing beforehand whether there was anything good on the machine worth collecting.

The only reasonable explanation for the archive being in this state is that the FOI Officer at the University was practising due diligence. The UEA was collecting data that couldn’t be sheltered and they created FOIA2009.zip.

It is most likely that the FOI Officer at the University put it on an anonymous ftp server or that it resided on a shared folder that many people had access to and some curious individual looked at it.

If as some say, this was a targeted crack, then the cracker would have had to have back-doors and access to every machine at UEA and not just the CRU. It simply isn’t reasonable for the FOI Officer to have kept the collection on a CRU system where CRU people had access, but rather used a UEA system.

Occam’s razor holds that “the simplest explanation or strategy tends to be the best one”. The simplest explanation in this case is that someone at UEA found it and released it to the wild, and that the release of FOIA2009.zip wasn’t because of some hacker, but because of a leak from UEA by a person with scruples.

Footnotes

1 See file ./popaccounts.txt

2 See file ./email_topography.dia

3 See file ./email_variance.sh

4 See file ./gnuplotcmds

Notes

Graph Information

Graphs created with gnuplot using a simple command file (see footnote 4) for input. I use a stripped down version of the variants_results_verbose.txt file; it’s the same, just stripped of comments and the filenames. The second graph is a numerically sorted version, $> sort -n ./variance_results.txt > variance_sorted_numerically.txt.

Assigned Network Numbers for UEA from RIPE.NET

RIPE.NET has assigned 139.222.0.0 – 139.222.255.255, 193.62.92.0 – 193.62.92.255, and 193.63.195.0 – 193.63.195.255 to the University of East Anglia for Internet IP addresses.

RIPE.NET Admin contact for the University of East Anglia: Peter Andrews, Msc, Bsc (hons) – Head of Networking at University of East Anglia. (Linked In, Peter isn’t in the UEA directory anymore so I assume he is no longer at UEA.)

RIPE.NET Tech Contact for the University of East Anglia: Andrew Paxton

Current Mail Servers at UEA

A dig for the MX record of uea.ac.uk (email servers responsible for the domain uea.ac.uk) results in the following:

	  $> dig mx uea.ac.uk
	  ; <<>> DiG 9.6.1-P2 <<>> mx uea.ac.uk
	  ;; global options: +cmd
	  ;; Got answer:
	  ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 737
	  ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 13, ADDITIONAL: 13

	  ;; QUESTION SECTION:
	  ;uea.ac.uk.			IN	MX

	  ;; ANSWER SECTION:
	  uea.ac.uk.		50935	IN	MX	2 ueamailgate01.uea.ac.uk.
	  uea.ac.uk.		50935	IN	MX	2 ueamailgate02.uea.ac.uk.

The IP addresses for the two UEA email servers are:

ueamailgate01.uea.ac.uk. 28000 IN A 139.222.131.184

ueamailgate02.uea.ac.uk. 28000 IN A 139.222.131.185
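A quick way to confirm that the two mail gateway addresses fall inside the ranges listed under the RIPE.NET heading is Python's standard `ipaddress` module. The ranges are written here in CIDR notation; the function name is my own.

```python
import ipaddress

# Ranges from the RIPE.NET note above, written in CIDR notation.
UEA_NETWORKS = [
    ipaddress.ip_network("139.222.0.0/16"),
    ipaddress.ip_network("193.62.92.0/24"),
    ipaddress.ip_network("193.63.195.0/24"),
]


def in_uea_range(address: str) -> bool:
    """True if `address` falls inside one of the listed ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in UEA_NETWORKS)


for host in ("139.222.131.184", "139.222.131.185"):
    print(host, in_uea_range(host))
```

The same check can be run against any IP address that turns up in the quoted Received headers.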

Test connections to UEA’s current mailservers:

	  $> telnet ueamailgate01.uea.ac.uk 25
	  Trying 139.222.131.184...
	  Connected to ueamailgate01.uea.ac.uk.
	  Escape character is '^]'.
	  220 ueamailgate01.uea.ac.uk ESMTP Sendmail 8.13.1/8.13.1; Mon, 7 Dec 2009 01:45:42 GMT
	  quit
	  221 2.0.0 ueamailgate01.uea.ac.uk closing connection
	  Connection closed by foreign host.

	  $> telnet ueamailgate02.uea.ac.uk 25
	  Trying 139.222.131.185...
	  Connected to ueamailgate02.uea.ac.uk.
	  Escape character is '^]'.
	  220 ueamailgate02.uea.ac.uk ESMTP Sendmail 8.13.1/8.13.1; Mon, 7 Dec 2009 01:45:49 GMT
	  quit
	  221 2.0.0 ueamailgate02.uea.ac.uk closing connection
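The 220 greeting line is where the server software and version show up. As a small sketch, the banner from the first session above can be split into its parts with a regular expression. The layout after the host name is a convention rather than a standard, so the pattern is an assumption that happens to fit these Sendmail banners.

```python
import re

# Assumed layout: "220 <host> <ESMTP|SMTP> <software>; <date>".
BANNER = re.compile(
    r"^220[ -](?P<host>\S+)\s+(?P<proto>E?SMTP)\s+(?P<software>.*?);\s*(?P<date>.+)$"
)


def parse_banner(line: str) -> dict:
    """Split an SMTP greeting into host, protocol, software and date (empty dict if no match)."""
    match = BANNER.match(line.strip())
    return match.groupdict() if match else {}


banner = "220 ueamailgate01.uea.ac.uk ESMTP Sendmail 8.13.1/8.13.1; Mon, 7 Dec 2009 01:45:42 GMT"
print(parse_banner(banner))
```

Servers that greet with a different layout simply produce an empty result here.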

About Me

I’ve been a Unix, Windows, OS X and Linux systems and network administrator for 15 years. I’ve compiled, configured, and maintained everything from mail servers to single-signon encrypted authentication systems. I run lines, build machines and tinker with code for fun. You can contact me via: lance@catprint.ca.

Lance Levsen,

December, 2009




256 Comments
Editor
December 7, 2009 10:54 am

hidemydecline (09:06:23) :
> I think Harry the programmer is the source.
I don’t. I think Harry is too busy, probably not vindictive? altruistic? undedicated? enough to do it. Also, he would have done a much better job organizing the documents directory. It’s such an eclectic mishmash that at first I thought it might have been all the attachments to the Emails but I quickly found that wasn’t the case.
—–
I completely missed looking at the real mail headers in responses, how could I have missed that? OTOH, I instantly recognized that the mail file names were just unix times. While that’s a glaring flaw in the analysis, everything else looks very good to me.

crosspatch
December 7, 2009 11:00 am

And things like 1134526470.txt would not have inspired a lot of confidence 🙂

December 7, 2009 11:06 am

Was that all the “leaker” has or is a more juicy t-bit to come? Is NASA next?
Tune in tomorrow for another episode of “FOOl THE PEOPLE”.

December 7, 2009 11:07 am
crosspatch
December 7, 2009 11:08 am

Whoever accumulated the emails had access to more than just Phil Jones’ email.

$ more 0926947295.txt
From: Dave Schimel \
To: Shrikant Jagtap \
Subject: RE: CO2
Date: Mon, 17 May 1999 09:21:35 -0600 (MDT)
Cc: franci \, Benjamin Felzer \, Mike Hulme \, schimel@ucar.edu,
wigley@ucar.edu, kittel@ucar.edu, nanr@ucar.edu, Mike MacCracken \

Not sure if the above will format correctly.

Ian
December 7, 2009 11:08 am

Re: Cold Englishman.
I think you need to read the BBC weatherman’s account of this a bit more closely. As I understand it, what he received was the string of emails that related directly to the discussion of his article (which had asked where global warming had gone), not the entire 61.9 MB zip file.
It’s also not clear in his account as to whether the string arrived “out of the blue” (i.e., someone sent him a copy of the text versions extractable from the “Mail” file), or whether he received those emails from someone involved in the exchange.

Nick
December 7, 2009 11:08 am

The really interesting part is that I think the emails are a selection. Someone has done quite a considerable amount of work editing out interesting emails. That’s a lot of work.
What the CRU might be slowly realising is that the hacker/whistle blower has all the emails. If they tell a porkie to the inquiry, or the inquiry comes to an odd conclusion, more emails will be leaked.
Nick

P Gosselin
December 7, 2009 11:09 am

This will lift your spirits!
Lord Monckton goes right down the list of charlatans (he calls them crooks) starting at about 9 minutes… Jones/ Hansen / Santer/ Mann, etc.
http://www.cfact.tv/2009/12/07/lord-monckton-on-climategate-at-the-2nd-international-climate-conference/
Hat Tip: EIKE
Anthony’s work is mentioned quite prominently starting at about 11:56.

Ken Hall
December 7, 2009 11:10 am

So the “Russian FSB/KGB did it” is a tinfoil-hat conspiracy theory then!
Good to know we have conspiracy nut cases running the climate “science” upon which will be levied trillions of dollars of policy.
I suppose they will be dissing the moon landings and claiming that Elvis is alive and well and has an enormous carbon footprint too!
[sarc] No, really. Fan-bloody-tastic! [/sarc]

P Gosselin
December 7, 2009 11:10 am

Anyone watching Monckton’s presentation can only conclude the whole thing is a FRAUD.

P Gosselin
December 7, 2009 11:11 am

I also say they are crooks,
The IPCC is gang of mobsters.
Period no need to discuss the source of e-mails.
The leaker is a hero.

Yarmy
December 7, 2009 11:11 am

Leif Svalgaard (09:41:36) :
Third Party (08:52:03) :
The atmosphere contains from 4-percent water vapor in the troposphere to 40-percent near the surface.
Get the numbers straight. At 100% relative humidity at 30C [tropics] the concentration of water vapor is 30 gram/cubic meter. Considering that 1 cubic meter of air at the surface weighs 1234 gram, the water concentration can at most be 30/1234 = 2.4%.
Your [or your source’s] numbers are 10-20 times too high.

Moreover, what’s any of it got to do with leaked emails?

Stacey
December 7, 2009 11:14 am

Dear Lance
Very good post and thanks.
You said:-
“The only reasonable explanation for the archive being in this state is that the FOI Officer at the University was practising due diligence. The UEA was collecting data that couldn’t be sheltered and they created FOIA2009.zip”
Just playing devil’s advocate. Is it plausible the FOI computer was hacked and the zip file downloaded?

Dave
December 7, 2009 11:17 am

With the EPA announcement about trying to regulate facilities that put out more than 25,000 tons of CO2 per year, I wonder how many people realize what other industries that will affect. Like I just looked up Coors (who is supposed to be environmentally friendly) and they put out over 1 million tons per year:
http://www.molsoncoors.com/responsibility/data/performance
I wonder how people will feel when their beer/wine/soda is attacked. I think there’s a ton of industries aside from oil/energy that will be impacted that people don’t realize. This for instance talks about CO2 fire extinguisher use in a variety of industries such as steel, and some marine vessels have CO2 fire extinguishers that are gigantic:
http://www.epa.gov/ozone/snap/fire/co2/co2report.html
Oh and in the latest of New Scientist they had an article that Copenhagen wouldn’t cost people too much, yet in the article trying to make it sound like no big deal made it sound like a big deal – airfare doubling in cost and electricity bills going up 15% along with many other expenses rising 1% or more. Also in back of the magazine they had a new phrase called “Ockham’s Broom” which is where scientists sweep inconvenient facts under the rug, yet they made no mention of AGW when introducing the word.

crosspatch
December 7, 2009 11:20 am

“Emails are not held in clear form on a server”
Uhm, yes they are in the majority of cases. Most emails are also transmitted between servers in clear form. SMTP is NOT a secure communications format. NEVER put financial information, for example, in an email unless YOU encrypt the email yourself.
Some email server software will store the mail body in a binary format but by far the majority of SMTP mail transmission is done “in the clear” with mail spools that are also plain text.
Also, most large organizations these days never delete email until a long period of time (several years) has elapsed. This is done to protect themselves from litigation or to allow them to litigate. Your work email belongs to your employer and there may be several other employees that have access to it.
Consider every single email you send/receive though your work account to be visible by others.

Jim Carson
December 7, 2009 11:25 am

I can’t agree that this is a good analysis. The fact that Lance Levsen claims,

the average amount of emails between each released email is 402,839.

tells me that this analysis was hastily made and not even scanned for credulity before publishing.
Anthony, I appreciate the speed of breaking news and the difficulty of staying ahead of it, but this warranted a smell test, or better yet, a call for volunteers to review it.
At minimum, you should introduce these things with a disclaimer that you haven’t verified the results. This blog is starting to look like a bandwagon.

P Gosselin
December 7, 2009 11:28 am
dearieme
December 7, 2009 11:30 am

Jones dunnit. That’s my guess. He’s undergone a quasi-religious conversion to truth-telling and this was his chosen route to distance himself from his corrupt past.
No more fudging the data! No more hiding the decline!
He denies it? What more proof could you want?

Michael
December 7, 2009 11:30 am

[snip – sorry, but schoolbuses are wayyy off topic of this thread]

John Galt
December 7, 2009 11:30 am

BBC: UK Climate Code May Be Scrapped
One of the first victims of the Climategate scandal may by the very computer code that is supposed to track global temperature records.
According to a report by the BBC, a computer software expert says the source code used by the Climatic Research Unit at the University of East Anglia is “below the standard in any commercial software.” Thousands of e-mails and documents were stolen from the CRU, and published on the Internet.
[more] http://www.newsmax.com/insidecover/climategate_code_computer/2009/12/05/294861.html
Original Source: http://news.bbc.co.uk/2/hi/programmes/newsnight/8395514.stm

Michael
December 7, 2009 11:32 am

Moderator, please change “They told them” to “I told them”, thanks.

JustPassing
December 7, 2009 11:34 am

Am I now seeing right?
Google now produces 280,000,000 results for climategate.
WOW

PhilW
December 7, 2009 11:37 am

That Monckton video should be nailed to the top of the WUWT posts

Carrick
December 7, 2009 11:41 am

Followup on crosspatch’s comment:
Here is a simple Unix program for converting UNIX epoch time into “ctime”. The hours will be wrong unless you are in the same time zone as the one in which the times were converted into UNIX epoch time.
#include <stdio.h>
#include <time.h>
int main(void)
{
    long secs; /* read as long: time_t has no portable scanf format */
    while (scanf("%ld", &secs) == 1) { time_t tm = (time_t)secs; printf("%s", ctime(&tm)); }
    return 0;
}

Julian in Wales
December 7, 2009 11:41 am

So it looks like an inside job?
I seem to remember that the person who released this information claimed it was only part of what S/he had collected?
If it really was a whistle blower, is it reasonable to expect that S/he was in a position to collect other valuable files? Let’s hope for more instalments!