by Willis Eschenbach
People keep saying “Yes, the Climategate scientists behaved badly. But that doesn’t mean the data is bad. That doesn’t mean the earth is not warming.”

Darwin Airport – by Dominic Perrin via Panoramio
Let me start with the second objection first. The earth has generally been warming since the Little Ice Age, around 1650, and there is general agreement that it has warmed since then. See e.g. Akasofu. Climategate doesn’t affect that.
The other question, the integrity of the data, is different. People say “Yes, they destroyed emails, and hid from Freedom of Information requests, and messed with proxies, and fought to keep other scientists’ papers out of the journals … but that doesn’t affect the data, the data is still good.” Which sounds reasonable.
There are three main global temperature datasets. One is at the CRU, Climate Research Unit of the University of East Anglia, where we’ve been trying to get access to the raw numbers. One is at NOAA/GHCN, the Global Historical Climate Network. The final one is at NASA/GISS, the Goddard Institute for Space Studies. The three groups take raw data, and they “homogenize” it to remove things like when a station was moved to a warmer location and there’s a 2C jump in the temperature. The three global temperature records are usually called CRU, GISS, and GHCN. Both GISS and CRU, however, get almost all of their raw data from GHCN. All three produce very similar global historical temperature records from the raw data.
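To make concrete what “homogenizing” is supposed to do, here is a minimal sketch. The numbers are my own invention and this is not the GHCN algorithm, just the textbook idea: a synthetic station with a documented move that read 2C cooler before the move, and the obvious fix of shifting the earlier segment to line up with the later one.

```python
# Hypothetical illustration (not the GHCN method): a station record with a
# documented move in 1980 that leaves a spurious 2 C offset in the raw data,
# and the obvious correction of shifting the earlier segment to match.
years = list(range(1950, 2011))
true_temp = [25.0 + 0.005 * (y - 1950) for y in years]   # gentle real trend
raw = [t - 2.0 if y < 1980 else t for y, t in zip(years, true_temp)]

move = years.index(1980)
# Estimate the jump from short means on either side of the documented move.
jump = sum(raw[move:move + 5]) / 5 - sum(raw[move - 5:move]) / 5
adjusted = [t + jump if i < move else t for i, t in enumerate(raw)]

print(round(jump, 1))   # recovers the ~2 C offset
```

Done honestly, an adjustment like this removes a non-climatic jump without inventing a trend; the question in the rest of this post is whether that is what actually happened.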
So I’m still on my multi-year quest to understand the climate data. You never know where this data chase will lead. This time, it has landed me in Australia. I got to thinking about Professor Wibjorn Karlen’s statement about Australia that I quoted here:
Another example is Australia. NASA [GHCN] only presents 3 stations covering the period 1897-1992. What kind of data is the IPCC Australia diagram based on?
If any trend it is a slight cooling. However, if a shorter period (1949-2005) is used, the temperature has increased substantially. The Australians have many stations and have published more detailed maps of changes and trends.
The folks at CRU told Wibjorn that he was just plain wrong. Here’s what they said is right, the record that Wibjorn was talking about, Fig. 9.12 in the UN IPCC Fourth Assessment Report, showing Northern Australia:

Figure 1. Temperature trends and model results in Northern Australia. Black line is observations (from Fig. 9.12 of the UN IPCC Fourth Assessment Report). Covers the area from 110E to 155E and from 30S to 11S. Based on the CRU land temperature data.
One of the things that was revealed in the released CRU emails is that the CRU basically uses the Global Historical Climate Network (GHCN) dataset for its raw data. So I looked at the GHCN dataset. There, I find three stations in North Australia, as Wibjorn had said, and nine stations in all of Australia, that cover the period 1900-2000. Here is the average of the GHCN unadjusted data for those three Northern stations, from AIS:

Figure 2. GHCN Raw Data, All 100-yr stations in IPCC area above.
So once again Wibjorn is correct: this looks nothing like the corresponding IPCC temperature record for Australia. But it’s too soon to tell. Professor Karlen is only showing three stations. Three is not a lot of stations, but that’s all of the century-long Australian records we have in the IPCC-specified region. OK, we’ve seen the longest station records, so let’s throw more records into the mix. Here’s every station in the UN IPCC-specified region with temperature records that extend up to the year 2000, no matter when they started, which is 30 stations:

Figure 3. GHCN Raw Data, All stations extending to 2000 in IPCC area above.
Still no similarity with IPCC. So I looked at every station in the area. That’s 222 stations. Here’s that result:

Figure 4. GHCN Raw Data, all 222 stations in the IPCC area above.
So you can see why Wibjorn was concerned. This looks nothing like the UN IPCC data, which came from the CRU, which was based on the GHCN data. Why the difference?
The answer is, these graphs all use the raw GHCN data. But the IPCC uses the “adjusted” data. GHCN adjusts the data to remove what it calls “inhomogeneities”. So on a whim I thought I’d take a look at the first station on the list, Darwin Airport, so I could see what an inhomogeneity might look like when it was at home. And I could find out how large the GHCN adjustment for Darwin inhomogeneities was.
First, what is an “inhomogeneity”? I can do no better than quote from GHCN:
Most long-term climate stations have undergone changes that make a time series of their observations inhomogeneous. There are many causes for the discontinuities, including changes in instruments, shelters, the environment around the shelter, the location of the station, the time of observation, and the method used to calculate mean temperature. Often several of these occur at the same time, as is often the case with the introduction of automatic weather stations that is occurring in many parts of the world. Before one can reliably use such climate data for analysis of long-term climate change, adjustments are needed to compensate for the non-climatic discontinuities.
That makes sense. The raw data will have jumps from station moves and the like. We don’t want to think it’s warming just because the thermometer was moved to a warmer location. Unpleasant as it may seem, we have to adjust for those as best we can.
I always like to start with the rawest data, so I can understand the adjustments. At Darwin there are five separate individual station records that are combined to make up the final Darwin record. These are the individual records of stations in the area, which are numbered from zero to four:

Figure 5. Five individual temperature records for Darwin, plus station count (green line). This raw data is downloaded from GISS, but GISS use the GHCN raw data as the starting point for their analysis.
Darwin does have a few advantages over other stations with multiple records. There is a continuous record from 1941 to the present (Station 1). There is also a continuous record covering a century. Finally, the stations are in very close agreement over the entire period of the record. In fact, where there are multiple stations in operation they are so close that you can’t see the records behind Station Zero.
This is an ideal station, because it also illustrates many of the problems with the raw temperature station data.
- There is no one record that covers the whole period.
- The shortest record is only nine years long.
- There are gaps of a month and more in almost all of the records.
- It looks like there are problems with the data at around 1941.
- Most of the datasets are missing months.
- For most of the period there are few nearby stations.
- There is no one year covered by all five records.
- The temperature dropped over a six year period, from a high in 1936 to a low in 1941. The station did move in 1941 … but what happened in the previous six years?
In resolving station records, it’s a judgment call. First off, you have to decide if what you are looking at needs any changes at all. In Darwin’s case, it’s a close call. The record seems to be screwed up around 1941, but not in the year of the move.
Also, although the 1941 temperature shift seems large, I see a similar-sized shift from 1992 to 1999. Looking at the whole picture, I think I’d vote to leave it as it is; that’s always the best option when you don’t have other evidence. First, do no harm.
However, there’s a case to be made for adjusting it, particularly given the 1941 station move. If I decided to adjust Darwin, I’d do it like this:

Figure 6. A possible adjustment for Darwin. Black line shows the total amount of the adjustment, on the right scale, and shows the timing of the change.
I shifted the pre-1941 data down by about 0.6C. We end up with little change end to end in my “adjusted” data (shown in red); it’s neither warming nor cooling, though it does reduce the apparent cooling in the raw data. Post-1941, where the other records overlap, they are very close, so I wouldn’t adjust them in any way. Why should we adjust those? They all show exactly the same thing.
OK, so that’s how I’d homogenize the data if I had to, but I vote against adjusting it at all. It only changes one station record (Darwin Zero), and the rest are left untouched.
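For what it’s worth, the one adjustment I described above is trivial to express. Here’s a toy sketch with invented numbers (a clean 0.6C step, not the actual Darwin record) of shifting the pre-1941 data down to meet the post-move data:

```python
# Toy sketch of the single adjustment described above: shift everything
# before the 1941 move down by 0.6 C and leave the post-move data alone.
# The series is invented for illustration; it is not the Darwin data.
years = list(range(1900, 2000))
temps = [29.0 if y < 1941 else 28.4 for y in years]   # a 0.6 C step at 1941

adjusted = [t - 0.6 if y < 1941 else t for y, t in zip(years, temps)]

# End to end there is now essentially no change: neither warming nor cooling.
print(round(abs(adjusted[-1] - adjusted[0]), 6))   # 0.0
```

The point of the sketch is that a defensible adjustment touches only the segment with the documented problem; it does not manufacture a trend in the part of the record where the stations already agree.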
Then I went to look at what happens when the GHCN removes the “in-homogeneities” to “adjust” the data. Of the five raw datasets, the GHCN discards two, likely because they are short and duplicate existing longer records. The three remaining records are first “homogenized” and then averaged to give the “GHCN Adjusted” temperature record for Darwin.
To my great surprise, here’s what I found. To explain the full effect, I am showing this with both datasets starting at the same point (rather than ending at the same point as they are often shown).

Figure 7. GHCN homogeneity adjustments to Darwin Airport combined record
YIKES! Before getting homogenized, temperatures in Darwin were falling at 0.7 Celsius per century … but after the homogenization, they were warming at 1.2 Celsius per century. And the adjustment that they made was over two degrees per century … when those guys “adjust”, they don’t mess around. The adjustment is an odd shape, too, first going stepwise, then climbing roughly to stop at 2.4C.
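For anyone who wants to check figures like these, a “degrees per century” trend is just an ordinary least-squares slope scaled to 100 years. Here’s a sketch on a synthetic series with a built-in 1.2C/century warming (invented data, purely to show the arithmetic):

```python
# How a "degrees per century" trend is computed: the ordinary least-squares
# slope of the annual series, scaled to 100 years. The series is synthetic,
# noise-free, and built with a 1.2 C/century trend so the answer is known.
years = list(range(1900, 2000))
temps = [28.0 + 0.012 * (y - 1900) for y in years]

n = len(years)
mean_y = sum(years) / n
mean_t = sum(temps) / n
slope = (sum((y - mean_y) * (t - mean_t) for y, t in zip(years, temps))
         / sum((y - mean_y) ** 2 for y in years))    # C per year
print(round(100 * slope, 2))   # 1.2, the built-in C-per-century trend
```

Run the same calculation on the raw and the adjusted Darwin series and you get the before-and-after trends quoted above; the difference between the two slopes is the trend the adjustment itself contributes.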
Of course, that led me to look at exactly how the GHCN “adjusts” the temperature data. Here’s what they say:
GHCN temperature data include two different datasets: the original data and a homogeneity-adjusted dataset. All homogeneity testing was done on annual time series. The homogeneity-adjustment technique used two steps.
The first step was creating a homogeneous reference series for each station (Peterson and Easterling 1994). Building a completely homogeneous reference series using data with unknown inhomogeneities may be impossible, but we used several techniques to minimize any potential inhomogeneities in the reference series.
…
In creating each year’s first difference reference series, we used the five most highly correlated neighboring stations that had enough data to accurately model the candidate station.
…
The final technique we used to minimize inhomogeneities in the reference series used the mean of the central three values (of the five neighboring station values) to create the first difference reference series.
Fair enough, that all sounds good. They pick five neighboring stations, and average them. Then they compare the average to the station in question. If it looks wonky compared to the average of the reference five, they check any historical records for changes, and if necessary, they homogenize the poor data mercilessly. I have some problems with what they do to homogenize it, but that’s how they identify the inhomogeneous stations.
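The recipe they describe can be sketched in a few lines. This is my simplified reading of it, with synthetic neighbor records standing in for real stations and the station-selection and correlation steps skipped; it is not their actual code:

```python
import random

# Simplified sketch of the quoted recipe: take the year-on-year first
# difference of five neighboring stations, and for each year average the
# central three of the five values (a trimmed mean that drops the highest
# and lowest). Summing those differences rebuilds a reference series that
# a candidate station can be compared against.
random.seed(1)
years = list(range(1950, 2000))
neighbors = [[27.0 + 0.01 * (y - 1950) + random.gauss(0, 0.3) for y in years]
             for _ in range(5)]

ref_diffs = []
for i in range(1, len(years)):
    diffs = sorted(s[i] - s[i - 1] for s in neighbors)  # five first differences
    ref_diffs.append(sum(diffs[1:4]) / 3)               # mean of central three

reference = [0.0]
for d in ref_diffs:
    reference.append(reference[-1] + d)                 # cumulative sum

# A candidate station whose jumps are absent from `reference` gets flagged
# as inhomogeneous.
print(len(reference) == len(years))   # True
```

Note what the method needs to work: several neighbors close enough, and well enough correlated, to model the candidate station. That requirement matters in the next paragraph.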
OK … but given the scarcity of stations in Australia, I wondered how they would find five “neighboring stations” in 1941 …
So I looked it up. The nearest station that covers the year 1941 is 500 km away from Darwin. Not only is it 500 km away, it is the only station within 750 km of Darwin that covers the 1941 time period. (It’s also a pub, Daly Waters Pub to be exact, but hey, it’s Australia, good on ya.) So there simply aren’t five stations to make a “reference series” out of to check the 1936-1941 drop at Darwin.
Intrigued by the curious shape of the average of the homogenized Darwin records, I then went to see how they had homogenized each of the individual station records. What made up that strange average shown in Fig. 7? I started at zero with the earliest record. Here is Station Zero at Darwin, showing the raw and the homogenized versions.

Figure 8 Darwin Zero Homogeneity Adjustments. Black line shows amount and timing of adjustments.
Yikes again, double yikes! What on earth justifies that adjustment? How can they do that? We have five different records covering Darwin from 1941 on. They all agree almost exactly. Why adjust them at all? They’ve just added a huge artificial totally imaginary trend to the last half of the raw data! Now it looks like the IPCC diagram in Figure 1, all right … but a six degree per century trend? And in the shape of a regular stepped pyramid climbing to heaven? What’s up with that?
Those, dear friends, are the clumsy fingerprints of someone messing with the data Egyptian style … they are indisputable evidence that the “homogenized” data has been changed to fit someone’s preconceptions about whether the earth is warming.
One thing is clear from this. People who say that “Climategate was only about scientists behaving badly, but the data is OK” are wrong. At least one part of the data is bad, too. The Smoking Gun for that statement is at Darwin Zero.
So once again, I’m left with an unsolved mystery. How and why did the GHCN “adjust” Darwin’s historical temperature to show radical warming? Why did they adjust it stepwise? Do Phil Jones and the CRU folks use the “adjusted” or the raw GHCN dataset? My guess is the adjusted one since it shows warming, but of course we still don’t know … because despite all of this, the CRU still hasn’t released the list of data that they actually use, just the station list.
Another odd fact, the GHCN adjusted Station 1 to match Darwin Zero’s strange adjustment, but they left Station 2 (which covers much of the same period, and as per Fig. 5 is in excellent agreement with Station Zero and Station 1) totally untouched. They only homogenized two of the three. Then they averaged them.
That way, you get an average that looks kinda real, I guess, it “hides the decline”.
Oh, and for what it’s worth, care to know the way that GISS deals with this problem? Well, they only use the Darwin data after 1963, a fine way of neatly avoiding the question … and also a fine way to throw away all of the inconveniently colder data prior to 1941. It’s likely a better choice than the GHCN monstrosity, but it’s a hard one to justify.
Now, I want to be clear here. The blatantly bogus GHCN adjustment for this one station does NOT mean that the earth is not warming. It also does NOT mean that the three records (CRU, GISS, and GHCN) are generally wrong either. This may be an isolated incident; we don’t know. But every time the data gets revised and homogenized, the trends keep increasing. Now GISS does their own adjustments. However, as they keep telling us, they get the same answer as GHCN gets … which makes their numbers suspicious as well.
And CRU? Who knows what they use? We’re still waiting on that one, no data yet …
What this does show is that there is at least one temperature station where the trend has been artificially increased to give a false warming where the raw data shows cooling. In addition, the average raw data for Northern Australia is quite different from the adjusted, so there must be a number of … mmm … let me say “interesting” adjustments in Northern Australia other than just Darwin.
And with the Latin saying “Falsus in uno, falsus in omnibus” (false in one, false in all) as our guide, until all of the station “adjustments” are examined, adjustments of CRU, GHCN, and GISS alike, we can’t trust anyone using homogenized numbers.
Regards to all, keep fighting the good fight,
w.
FURTHER READING:
My previous post on this subject.
The late and much missed John Daly, irrepressible as always.
More on Darwin history, it wasn’t Stevenson Screens.
NOTE: Figures 7 and 8 updated to fix a typo in the titles. 8:30PM PST 12/8 – Anthony
MattB (08:32:59) asks:
You are seemingly very incurious.
UHI was advanced as one explanation for the sharp decline in the 1939/1941 timeframe. Yet, that explanation is not compatible with the record for the prior 60 years. I can think of no measurement or mechanical reason for the prior decline.
Ryan Stephenson (02:46:32) :
“Can I please correct you. You keep using the phrase ‘raw data’. Averaged figures are not ‘raw data’. Stevenson screens record maximum and minimum DAILY temperatures. This is the real RAW data.
When you do an analysis of temperature data over one year then you should always show it as a distribution. It will have a mean and a standard deviation. Take the UK. It may have a mean annual temperature of 15 Celsius with a standard deviation of 20 Celsius.”
So Ryan: if a Stevenson screen records the maximum and minimum daily temperature (the RAW data), just what is the purpose of showing that as a distribution?
The RAW data is a record of DIFFERENT observations. Suppose I go to the London Zoo, and record what creature I find in each cage or enclosure or display area. Should I then show this RAW data as a distribution and calculate its mean and standard deviation? Would I perhaps find that the mean animal in the London Zoo is a Wolverine, and the standard deviation is a Lady Amherst pheasant?
Why all this rush to try and convert real RAW data into something totally false and manufactured?
The temperature is different every place on earth and changes with time in a way that is different at every place; so why try to replace all of that real RAW information with two numbers that at best are total misinformation.
It seems to me that statisticians, having run out of useful things to calculate, gravitate towards climatology and start applying their methodologies to disguising what the instruments tell us the weather and climate is, or has been.
GISStemp and HadCRUT are simply that; mathematical creations of arbitrary AlGorythms applied to recorded information of historic weather data; the results of which have no real scientific significance, as far as planet earth is concerned. They certainly don’t tell us whether living conditions on earth are getting better or worse; or even how good they might have been at some past epoch.
Thank you Sir !
Let’s hope the idiots don’t put the world into reverse gear. I have posted my Representative, the prime reader (for 5-year-olds), with the WUWT link.
When the Stevenson screens came in, it would be interesting to see what they did to “adjust” for that change…. That, I’d say, is why there is a step change for Darwin around 1940-41 in record zero.
bill (08:44:07) :
Ryan Stephenson (04:46:20) :
A uk spaghetti graph for you
Figures unadjusted met office
http://img410.imageshack.us/img410/8996/ukspaghetti.jpg
—
Did I notice “De Bilt” in there? That’s a village in The Netherlands, and it’s clearly not in the UK. The Dutch met office KNMI is sited there.
It happens that we (the US) have a meteorological site on Darwin proper. Coordinates are 12.4246°S 130.891597°E. I have no idea how that site relates to the sixty-seven-year-old thermometer at Darwin in terms of position, but it’s got enough instrumentation to corroborate or invalidate the adjustments to Darwin’s thermal record, I’d guess.
Website is http://www.arm.gov/sites/twp/c3
I just don’t know how to access the data. It’s only about seven years old, though, so it’d only be able to serve to validate recent data.
I’d guess the AGW community has already done some work there, if I were in the habit of guessing.
I’d also guess that it’s within a few kilometers of the Darwin weather station, which should be CEFGW.
“When the Stevenson screens came in, it would be interesting to see what they did to “adjust” for that change….”
From my read of the homogenization methods, they did not do anything specific to adjust for any specific issue (such as Stevenson screens, TOB, etc) for stations outside the US.
For USHCN stations, they did apply a metadata-based homogenization. If the station metadata documented a site change or an instrument change etc, then they applied a specific correction to the data in an attempt to account for it.
For stations outside the US, they did not do that. Instead, they homogenized each station to a reference station, using a first difference series. This does not apply an explicit, defined correction to a discontinuity of known source…
Perhaps it is a mere coincidence, but there is a tremendous amount of similarity between the homogeneity adjustments shown in Figure 8 and the plot of the valadj array at http://cubeantics.com/2009/12/climategate-code-analysis-part-2/comment-page-4/?rcommentid=989&rerror=incorrect-captcha-sol&rchash=6b6ceabae8a0407d386b125f133a74ff.
I mean, the general shape is about the same, the magnitude is almost the same, and the time scale is close.
Figure 7 more so than Figure 8
We’ve got to stop preaching to the choir and get on over to http://www.washingtonpost.com/wp-dyn/content/article/2009/12/08/AR2009120803402.html
The warmers are out in force, calling Palin a Bimbo for speaking up, and excoriating the WashPo for running an Op they don’t like.
A couple of things occurred to me:
1) The reasonable adjustments to stations should have a natural distribution centered on zero. Adjustments for station location or instrumentation changes should be equally positive and negative. The only decent reason for a positive bias among many stations would be moving stations from poor sites (beside an AC) to a good site (in a park), but even then, an adjustment would result in ‘locking in’ the bias of the poor site to all future measurements from the new site.
2) If #1 is true of the net result of all adjustments, then global temperatures do not need adjustment, only gridding. Adjustment is only required for continuity in examining individual stations.
3) The exception in this process would be any form of econometric adjustment for heat island effect. However, this would almost always be a negative adjustment.
Bottom Line: If the sum of all global adjustments is positive, it’s a biased adjustment.
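That bottom line can be turned into a simple statistical check. Here’s a sketch, with invented adjustment values, of testing whether a set of station adjustments is centered on zero or detectably biased (the mean compared against its standard error):

```python
import random
import statistics

# The commenter's point as a quick check: if adjustments are genuinely
# non-climatic corrections, their mean across many stations should be near
# zero. The two sets of per-station adjustments below are invented: one
# zero-centered, one with a built-in 0.25 C warm bias.
random.seed(2)
n = 500
unbiased = [random.gauss(0.0, 0.2) for _ in range(n)]
biased = [random.gauss(0.25, 0.2) for _ in range(n)]

def is_biased(adjustments):
    """True if the mean adjustment is more than two standard errors from 0."""
    mean = statistics.fmean(adjustments)
    se = statistics.stdev(adjustments) / len(adjustments) ** 0.5
    return abs(mean) > 2 * se

print(is_biased(biased))   # True: this set is detectably warm-biased
```

Given a table of raw-minus-adjusted trends for real stations, the same few lines would answer the commenter’s question directly.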
My first thought on this is that what is needed is a project similar to the surfacestations.org project. A grass-roots volunteer project: volunteers do the same type of analysis as Willis did for Darwin for as many of the GHCN stations as possible (the limiting factor will certainly be raw data) and have them compiled for the world to see on a website. It could be eye-opening. I am guessing, with the number of scientifically trained readers on WUWT, that this is potentially an achievable community science project.
Any takers? I’ll pitch in with some station analysis.
Can there be anything more upsetting and unsettling to a scientific evidence-based argument than to show that the evidence as presented is not only untrustworthy but perhaps intentionally so? And can we agree that if the temperature record has been massaged, twisted, encouraged, and otherwise manipulated to minimize inconvenient facts, that what we have is scientific fraud even if done in sincerity, with the best of intentions? Finally, why should the peoples of developing countries have to suffer more and longer and be denied the wonders and comforts of modern life just because of the faulty work of well-intentioned zealots who lost their ability to claim objectivity many years back?
Jeff L (11:53:41) :
Jeff, it’s already happening. I’ve formed surfacetemps.org, email me if you are interested, willis [at] surfacetemps.org
First, yes, I read the text you quoted. I know that huge adjustments are sometimes made to individual stations. I’ve looked at them. I’ve looked at a lot of stations. The adjustments to Darwin Zero are in a class all their own.
And yes, that is possible, it may all just be innocent fun and perfectly scientifically valid. And if someone steps up to the plate and lists why those adjustments were made, and the scientific reasons for each one, I’ll look like a huge fool.
Still waiting …
And I’m racking my brains trying to think what could possibly justify the amazing adjustments made to the record at Brisbane (Eagle Farm) International Airport over the latter part of the period 1950 – 2008. Any ideas?
http://thedogatemydata.blogspot.com/2009/12/raw-v-adjusted-ghcn-data.html
JJ (06:51:00) :
Be careful on the 0.6 figure for the cumulative adjustments — they are using F, not C.
Willis;
I went to surfacetemps.org – Seems to be selling park equipment. WUWT?
I have been analyzing South Texas temperature data, attempting to determine San Antonio UHI using 5 surrounding rural sites (using raw data). Since the SA site had a major move in June 1942 (from a rapidly growing downtown to a (then) rural airport), I looked to see how NOAA had handled the move:
In the 5 years before the move, NOAA subtracted 2°F (1.97 to 2.07) annually from the raw data. In the 5 years after the move, they subtracted 1°F (1.01 to 1.11) from the raw data.
I’d be willing to repeat your exercise using the whole NOAA dataset for South Texas once surfacetemps is up and running.
BTW, how do you decompress a UNIX .Z file using Windows? I want to check that what NOAA posts as raw data is truly raw.
Ryan Stephenson (04:46:20) : …Anyway, the means for the annual distribution show a 0.35Celsius increase in the last 10 years compared to the first 10 years. Problem is the averaged monthly show a variation ranging from -0.03 to +1.0 Celsius taken over the same period.
If you want to see a baffling wide variation of trends, check out averaged monthly Tmax & Tmin instead of Tmean.
I think Willis is not correct to assume CRU have used the GHCN Darwin Zero. I also think it is wrong to assume Jones / CRU have simply used the GHCN station data. See my take over at:
http://www.warwickhughes.com/blog/?p=357
willis
really spectacular article. i nominate you for the nobel prize!
this has set me thinking. as a pure novice in all this, i know that in scotland virtually all the recording stations are at or near airports.
our local recordings are from Leuchars military airport just across the water from Carnoustie here. that is a very busy military airport. for security reasons, presumably, we cannot gain access to verify the position of the measuring point.
i wonder if anyone has investigated the observatory in Armagh (pronounced arm – ahh) in ireland.
this is sited in a small city which is very rural and has, to my knowledge, been recording for over 150 years. it should be a good source of unadulterated data.
perhaps you or steve could have a look at this.
Makes me wonder. I can’t honestly judge this in anyway, I’m not qualified. I can’t buy AGW because I would have to buy it on faith. I can’t deny it either because I would have to deny it on faith. Bottom line, I have no opinion.
If they’re going to devastate economies to “correct” the climate, they better “fix” things in such a way that it will help us whether they’re right or not. For example, nuclear power, solar thermal, and so on, done right, could help us to fight GW and to secure our domestic energy future (both providing jobs and security). It would help us either way. Those are the kinds of answers we need. What I hope doesn’t happen is answers that are too specific to mitigating climate change.
WinRar does it…
Seems to me it would be best to use some process that reconciles temperature readings from the various stations by associating each with all the other “nearby” readings (i.e., not varying from some fixed time by more than a couple of minutes). The average of such a closely associated group would provide a global temperature reading at a specific time of day. …
There would seem to be numerous ways to aggregate such groupings to rule out real outliers without having to make arbitrary changes to individual readings.
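One standard way to do that reconciliation, sketched here with invented readings, is a median-absolute-deviation filter: pool the nearby readings for one time slot and discard outliers by their distance from the median, rather than hand-editing individual values. To be clear, this is a generic robust-statistics technique, not anything the temperature groups are known to use.

```python
import statistics

# Pool the "nearby" readings for one time slot and reject outliers by their
# distance from the median, using the median absolute deviation (MAD) as a
# robust yardstick. The group of readings below is invented.
def robust_group_mean(readings, k=3.5):
    """Mean of the readings after dropping points > k MADs from the median."""
    med = statistics.median(readings)
    mad = statistics.median(abs(r - med) for r in readings)
    if mad == 0:
        return statistics.fmean(readings)
    kept = [r for r in readings if abs(r - med) <= k * mad]
    return statistics.fmean(kept)

group = [28.1, 28.3, 27.9, 28.2, 35.0]   # one reading is clearly an outlier
print(round(robust_group_mean(group), 3))   # the 35.0 reading is excluded
```

The appeal of this approach is exactly what the comment suggests: outliers are ruled out by a transparent, symmetric rule applied to every group, with no arbitrary per-station changes.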
THANK YOU!!! This breakdown puts some meat on the bones of the manipulation, and gives us direct questions which can be either answered directly, or obfuscated. I cannot express my thanks enough for your and others’ efforts to dig down through the available data, for the rest of us just getting up to speed on the raw datasets (such as remain).
More please, it is appreciated by all seeking truth.