Darwin Zero Before and After

Guest Post by Willis Eschenbach

Recapping the story begun at WUWT here and continued at WUWT here: the record from the temperature station Darwin Zero in northern Australia was found to have been radically adjusted, with the adjusted data (red line) showing strong warming while the unadjusted data (blue line) showed Darwin Zero actually cooling over the period of the record. Here is the adjustment to Darwin Zero:

Figure 1. The GHCN adjustments to the Darwin Zero temperature record.

Many people have written in with questions about my analysis. I thank everyone for their interest. I’m answering them as fast as I can. I cannot answer them all, so I am trying to pick the relevant ones. This post is to answer a few.

• First, there has been some confusion about the data. I am using solely GHCN numbers and methods. They will not match the GISS or the CRU or the HadCRUT numbers.

• Next, some people have said that these are not separate temperature stations. However, GHCN adjusts them and uses them as separate temperature stations, so you’ll have to take that question up with GHCN.

• Next, a number of people have claimed that the Darwin adjustment is simply the result of the standard homogenization done by GHCN, based on comparison with other neighboring station records. This homogenization procedure is described here (PDF).

While it sounds plausible that Darwin was adjusted as the GHCN claims, if that were the case the GHCN algorithm would have adjusted all five of the Darwin records in the same way. Instead, they have been adjusted differently (see below). This argues strongly that the adjustments were not made by the listed GHCN homogenization process. Any process that changed one of them would change all of them in the same way, since they are nearly identical.

• Next, there are no “neighboring records” for a number of the Darwin adjustments simply because in the early part of the century there were no suitable neighboring stations. It’s not enough to have a random reference station somewhere a thousand km away from Darwin in the middle of the desert. You can’t adjust Darwin based on that. The GHCN homogenization method requires five well correlated neighboring “reference stations” to work.

From the reference cited above:

“In creating each year’s first difference reference series, we used the five most highly correlated neighboring stations that had enough data to accurately model the candidate station.”

and  “Also, not all stations could be adjusted. Remote stations for which we could not produce an adequate reference series (the correlation between first-difference station time series and its reference time series must be 0.80 or greater) were not adjusted.”

As I mentioned in my original article, the hard part is not finding five neighboring stations, particularly if you consider a station 1,500 km away as “neighboring”. The hard part is finding similar stations within that distance: stations whose first difference has a 0.80 correlation with the Darwin station first difference.

(A “first difference” is a list of the changes from year to year of the data. For example, if the data is “31, 32, 33, 35, 34”, the first differences are “1, 1, 2, -1”. It is often useful to examine first differences rather than the actual data. See Peterson (PDF) for a discussion of the use of the “first-difference method” in climate science.)
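
To make the test concrete, here is a minimal sketch in Python of the first-difference correlation check described above. The station values are invented purely for illustration, and this is not the GHCN code, only the idea of the test.

```python
import numpy as np

# Hypothetical annual mean temperatures (deg C) for two stations; these are
# invented numbers, not the real Darwin or neighbor data.
darwin   = np.array([29.1, 29.3, 28.9, 29.5, 29.2, 28.8, 29.0, 29.4])
neighbor = np.array([27.0, 27.4, 26.9, 27.2, 27.5, 26.8, 27.1, 27.3])

# First differences: the year-to-year changes (e.g. 31, 32, 33, 35, 34 -> 1, 1, 2, -1).
fd_darwin   = np.diff(darwin)
fd_neighbor = np.diff(neighbor)

# Correlation of the two first-difference series.
r = np.corrcoef(fd_darwin, fd_neighbor)[0, 1]
print(f"first-difference correlation: {r:.2f}")

# Under the GHCN rule quoted above, a neighbor would only contribute to the
# "first difference reference series" if this correlation were 0.80 or greater.
```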

Accordingly, I’ve been looking at the candidate stations. For the 1920 adjustment we need stations starting in 1915 or earlier. Here are all of the candidate stations within 1,500 km of Darwin that start in 1915 or before, along with the correlation of their first difference with the Darwin first difference:

WYNDHAM_(WYNDHAM_PORT) = -0.14

DERBY = -0.10

BURKETOWN = -0.40

CAMOOWEAL = -0.21

NORMANTON = 0.35

DONORS_HILL = 0.35

MT_ISA_AIRPORT = -0.20

ALICE_SPRINGS = 0.06

COEN_(POST_OFFICE) = -0.01

CROYDON = -0.23

CLONCURRY = -0.2

MUSGRAVE_STATION = -0.43

FAIRVIEW = -0.29

As you can see, not one of them is even remotely like Darwin. None of them are adequate for inclusion in a “first-difference reference time series” according to the GHCN. The Economist excoriated me for not including Wyndham in the “neighboring stations” (I had overlooked it in the list). However, the problem is that even if we include Wyndham, Derby, and every other station out to 1,500 km, we still don’t have a single station with a high enough correlation to use the GHCN method for the 1920 adjustment.

Now I suppose you could argue that you can adjust 1920 Darwin records based on stations 2,000 km away, but even 1,500 km seems too far away to do a reliable job. So while it is theoretically possible that the GHCN described method was used on Darwin, you’ll be a long, long ways from Darwin before you find your five candidates.

• Next, the GHCN does use a good method to detect inhomogeneities. Here’s their description of their method.

To look for such a change point, a simple linear regression was fitted to the part of the difference series before the year being tested and another after the year being tested. This test is repeated for all years of the time series (with a minimum of 5 yr in each section), and the year with the lowest residual sum of the squares was considered the year with a potential discontinuity.

This is a valid method, so I applied it to the Darwin data itself. Here’s that result:

Figure 2. Possible inhomogeneities in the Darwin Zero record, as indicated by the GHCN algorithm.

As you can see by the upper thin red line, the method indicates a possible discontinuity centered at 1939. However, once that discontinuity is removed, the rest of the record does not indicate any discontinuity (thick red line). By contrast, the GHCN adjusted data (see Fig. 1 above) do not find any discontinuity in 1941. Instead, they claim that there are discontinuities around 1920, 1930, 1950, 1960, and 1980 … doubtful.
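
For readers who want to see how such a test behaves, here is a minimal sketch, in Python, of the change-point search quoted above: fit one straight line to the data before each candidate year and another from that year on, and flag the split that leaves the lowest residual sum of squares. The series below is invented, and this is not the GHCN code, only an illustration of the described test.

```python
import numpy as np

def residual_ss(y):
    """Residual sum of squares of a least-squares straight line fitted to y."""
    x = np.arange(len(y))
    slope, intercept = np.polyfit(x, y, 1)
    return np.sum((y - (slope * x + intercept)) ** 2)

def find_changepoint(series, min_len=5):
    """Return the split index whose before/after line fits leave the least residual."""
    best_idx, best_rss = None, np.inf
    for i in range(min_len, len(series) - min_len + 1):
        rss = residual_ss(series[:i]) + residual_ss(series[i:])
        if rss < best_rss:
            best_idx, best_rss = i, rss
    return best_idx

# Hypothetical annual series with a step drop part-way through.
rng = np.random.default_rng(0)
years = np.arange(1900, 1960)
temps = np.where(years < 1939, 29.0, 28.2) + rng.normal(0, 0.2, len(years))
print("candidate discontinuity year:", years[find_changepoint(temps)])
```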

• Finally, the main recurring question is, why do I think the adjustments were made manually rather than by the procedure described by the GHCN? There are a number of totally independent lines of evidence that all lead to my conclusion:

1. It is highly improbable that a station would suddenly start warming at 6 C per century for fifty years, no matter what legitimate adjustment method was used (see Fig. 1).

2. There are no neighboring stations that are sufficiently similar to the Darwin station to be used in the listed GHCN homogenization procedure (see above).

3. The Darwin Zero raw data does not contain visible inhomogeneities (as determined by the GHCN’s own algorithm) other than the 1936-1941 drop (see Fig. 2).

4. There are a number of adjustments to individual years. The listed GHCN method does not make individual year adjustments (see Fig. 1).

5. The “Before” and “After” pictures of the adjustment don’t make any sense at all. Here are those pictures:

Figure 3. Darwin station data before and after GHCN adjustments. Upper panel shows unadjusted Darwin data, lower panel shows the same data after adjustments.

Before the adjustments we had the station Darwin Zero (blue line with diamonds), along with four other nearby temperature records from Darwin. They all agreed with each other quite closely. Hardly a whisper of dissent among them, only small differences.

While GHCN were making the adjustment, two stations (Unadj 3 and 4, green and purple) vanished. I don’t know why. GHCN says they don’t use records under 20 years in length, which applies to Darwin 4, but Darwin 3 is twenty years in length. In any case, after removing those two series, the remaining three temperature records were then adjusted into submission.

In the “after” picture, Darwin Zero looks like it was adjusted with Sildenafil. Darwin 2 gets bent down almost to match Darwin Zero. Strangely, Darwin 1 is mostly untouched. It loses the low 1967 temperature, which seems odd, and the central section is moved up a little.

Call me crazy, but from where I stand, that looks like an un-adjustment of the data. They take five very similar datasets, throw two away, wrench the remainder apart, and then average them to get back to the “adjusted” value? Seems to me you’d be better off picking any one of the originals, because they all agree with each other.

The reason you adjust is because records don’t agree, not to make them disagree. And in particular, if you apply an adjustment algorithm to nearly identical datasets, the results should be nearly identical as well.

So that’s why I don’t believe the Darwin records were adjusted in the way that GHCN claims. I’m happy to be proven wrong, and I hope that someone from the GHCN shows up to post whatever method they actually used, the method that could produce such an unusual result.

Until someone can point out that mystery method, however, I maintain that the Darwin Zero record was adjusted manually, and that it is not a coincidence that it shows (highly improbable) warming.





303 Comments
Willis Eschenbach
December 22, 2009 2:32 pm

Gail Combs (05:44:57), you raise an interesting and often misunderstood point:

Ryan Stephenson (02:45:27) :
I certainly agree with your approach.
The idea of using stations up to 1500 km or more away “to adjust” a site seems very “dodgy” at best. I do not know Australia, but the whole idea that weather in one location is similar to that in another does not pass the smell test. Within 100 miles of where I live we have oceans, lakes, the piedmont and sand hills. Five miles away ALWAYS gets more rain than I or the airport weather station down the street does. The idea might work OK on a flat plain but not on mountains and ocean front areas. And yes I do understand a significance test for correlations. As a test to determine if the station may have a problem that needs investigation – fine. As the basis for “adjustments” NO.
The whole thing reeks of lazy data handling techniques.

What is used to adjust the data is not the absolute value of the data. It is the change in the data. As you point out, someplace five miles away may get very different temperatures from where you are.
However, what is related is the change in temperature. If you are having a hot month, it is very likely that someplace five miles away is having a hot month as well. It is this relationship that is used for the adjustment.
However, that doesn’t mean that the adjustment is valid. As someone pointed out above, what it can do is re-institute a UHI signal on a station which has been moved.
Someone else pointed out that Darwin is an outlier. While this is true, such outliers should be a huge warning flag to whoever is using the algorithm. People often say “the exception proves the rule” without realizing that “prove” in this expression has the meaning used in the phrase “proving ground”, as in “testing ground”. The real meaning of the phrase is “the exception tests the rule”. In the current case, the rule seems to be failing the test …

Willis Eschenbach
December 22, 2009 2:52 pm

wobble (06:53:46) :

Willis Eschenbach (22:57:54) :

“”our only test based solely on distance was limiting the list of potential reference stations to the 100 stations nearest to each candidate station.””

Great work, Willis.
1. I think you’ve shown that the Peterson algorithm couldn’t have been followed. (Can you just tell us where you read about the 0.8 correlation requirement? Isn’t it possible that they accepted a correlation of 0.4 as long as it was the “highest correlated neighboring station?”)

Not according to the GHCN document, An Overview of the Global Historical Climatology Network Temperature Database, which says (emphasis mine):

Our approach to adjusting historical data is to make them homogeneous with present-day observations, so that new data points can easily be added to homogeneity-adjusted time series. Since the primary purpose of homogeneity-adjusted data is long-term climate analysis, we only adjusted time series that had at least 20 yr of data. Also, not all stations could be adjusted. Remote stations for which we could not produce an adequate reference series (the correlation between first-difference station time series and its reference time series must be 0.80 or greater) were not adjusted.

2. I think you’ve also shown at least one flaw in the Peterson algorithm even if it is utilized objectively. Testing 100 stations for adequate correlation doesn’t seem reasonable. As you hinted at before, if 1,000 adjustments use correlations accepted at p = 0.05, that implies around 50 of those 1,000 correlations arose by coincidence.

There are many flaws in the Peterson algorithm, both theoretical and practical.

Overall, I think JJ did you a favor. He was definitely condescending, but he did push you to strengthen your argument quite a bit. However, I agree that it would be nice for him to help out with some of these efforts. He’s obviously smart enough. Unless he’s working on something of his own right now. JJ, do you care to share?

I’m kinda weird, in that I do all the scientific work that I do simply because I’m passionate about climate science. I have persevered and gone forward with my analysis not because of JJ, but in spite of him. These kinds of studies take time, as unlike Peterson and Hanson and their ilk, I have a regular job that does not involve climate science.
I don’t think JJ is a bad guy, I just get upset by constant cavilling with no corresponding productive input. Lead, follow, or get out of the way.

Willis Eschenbach
December 22, 2009 3:10 pm

Ryan Stephenson (08:27:36) :

@Willis: Have you tried taking the approach you/we suggest? Take the data between the discontinuities detected by the algorithm, then check the gradient of that data? Maybe it is difficult because there are only 10 or so datapoints and it is dependent on where you start and stop exactly, but it seems that the gradients between the discontinuities are about 0Celsius/Century compared to the gradient imposed by the corrections. Excel can plot a line of regression between the points – e.g. for the last 12 years we seem to have stable data, so Excel can fit a regression line to the last 12 points of the raw data and the gradient of that line can be derived. Sorry to dump the extra work on you, but if you have the data tabulated already it shouldn’t take you long.
That would really blow this whole thing out of the water. After all, what the algorithm itself is telling us is the site changed about three times after 1940 but was stable as a measuring site between those changes. So that is where the quality data is – the raw data between the site changes. Measure the gradient between the changes and the job is done. It sure isn’t 6Celsius/century. So the algorithm will be shown to fail the sanity check badly.

The problem is that the method you suggest assumes that there really were some kind of problems needing adjustment in those years. I find nothing, either in the record itself or in the station metadata, indicating any problem in those years. The Aussies make adjustments based (very loosely, it appears) on the metadata … but they don’t make a single adjustment in any of the years adjusted by GHCN.

Willis Eschenbach
December 22, 2009 3:36 pm

temp (12:33:00) :

As I said originally, I am assuming he is giving correlations over the entire time series because he isn’t giving starting and ending years for producing the correlation, which isn’t what they did, I think.
If that is wrong, then fine, but it isn’t clear from what he has written.

My apologies for the lack of clarity, temp, you raise an important point. For the 1920 step, I used the correlation for the period 1900-1940, or as much of that period as the station covered. There is nothing I could find in their documentation which describes what period they actually used.
Nor does their documentation specify a couple of other important things. One is whether the correlation is statistically significant. If we flip two coins, and then flip two other coins, they may come up all heads (correlation = 1.0 between the two sets of two). However, because of the length of the dataset (N=2), the correlation is not statistically significant. A ten year temperature dataset is generally not long enough to establish significance.
The other is the effect of autocorrelation on the significance. (Autocorrelation measures how much this year’s data resembles last year’s data.) Temperature records tend to have fairly high autocorrelation, a warm year is likely to be followed by a warm year and vice versa. As a result, we are more likely to get a spurious correlation between temperature series.
I chose a forty-year period because of those two factors, which make a shorter time period more likely to produce spurious correlations. I did not use the entire period of the overlap between two stations because the correlation on that is likely to be lower than on the 40 year period, and I wanted to give the algorithm every chance. However, I still found nothing resembling Darwin’s record in the “neighboring” stations.
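
As an illustration of those two checks, here is a rough sketch of testing a correlation for significance with the sample size reduced to allow for autocorrelation. The annual series are invented, and the effective-sample-size formula is a common rule of thumb of my own choosing, not anything taken from the GHCN documentation.

```python
import numpy as np
from scipy import stats

def lag1_autocorr(x):
    """Lag-1 autocorrelation: how much each year resembles the year before."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]

def correlation_significance(a, b):
    """Correlation of a and b, with a p-value based on an effective sample size
    n_eff = n * (1 - r1a*r1b) / (1 + r1a*r1b) to allow for autocorrelation."""
    r = np.corrcoef(a, b)[0, 1]
    r1a, r1b = lag1_autocorr(a), lag1_autocorr(b)
    n_eff = len(a) * (1 - r1a * r1b) / (1 + r1a * r1b)
    t = r * np.sqrt((n_eff - 2) / (1 - r ** 2))     # t statistic for a correlation
    p = 2 * stats.t.sf(abs(t), df=n_eff - 2)        # two-sided p-value
    return r, n_eff, p

# Two invented 40-year annual series with no real relationship between them.
rng = np.random.default_rng(1)
a = rng.normal(28.0, 0.5, 40)
b = rng.normal(27.0, 0.5, 40)
print(correlation_significance(a, b))
```
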
Which brings me back to my point that nature is not homogeneous. It is freckled and fractured, a “ghost of rags and patches”. Trying to force it into a homogeneous strait-jacket can be done … but what we end up with, almost by definition, will be “un-natural”. Nature specializes in jumps and changes, in instant inversions and retroversions, and not in bland homogeneity. As the poet said:

Pied Beauty
GLORY be to God for dappled things—
For skies of couple-colour as a brinded cow;
For rose-moles all in stipple upon trout that swim;
Fresh-firecoal chestnut-falls; finches’ wings;
Landscape plotted and pieced—fold, fallow, and plough;
And áll trádes, their gear and tackle and trim.
All things counter, original, spare, strange;
Whatever is fickle, freckled (who knows how?)
With swift, slow; sweet, sour; adazzle, dim;
He fathers-forth whose beauty is past change:
Praise him.
Gerard Manley Hopkins (1844–89). Poems. 1918

Homogenize that and see how it all looks … not pretty.

wobble
December 22, 2009 3:37 pm

Willis Eschenbach (14:52:53) :
“”according to the GHCN document,…the correlation between first-difference station time series and its reference time series must be 0.80 or greater””
Thanks! Again, great job.

Ryan Stephenson
December 22, 2009 4:38 pm

@Willis: I think maybe you misunderstand me. I am not suggesting that the algorithm that detects the discontinuities was right or wrong. That doesn’t matter to me. What matters to me is what adjustment was made, automatically or manually, and whether that adjustment was reasonable. Up till now you have been asking “the Team” to justify their adjustments, and waiting … and waiting … But I have suggested an approach that will indicate whether their adjustments were wrong or not without their help.
So let’s assume the algorithm that detects the discontinuities is doing so “willy-nilly”. So what? The Team’s algorithm tells us that between 1961 and 1980 there was no need for any adjustment – because the black line is perfectly flat here. This means that the algorithm is telling us that the site was perfectly stable during that time and that the temperature measurements were valid. So let us look at the raw data between those dates. We can pick out the specific years that are between the adjustments that were made – about 17 years altogether – then we can plot a line of regression between them. The gradient of that line of regression will give us the rate of change of temperature. I haven’t done the plot, but just looking at it suggests about 2Celsius per century during that period*. But the gradient of the adjusted data shows a gradient of 6Celsius per century. These two facts are in direct contradiction. The site is detected to have been apparently stable for 17 years, by the Team’s own algorithm, so the measured rate of change during those 17 years should be valid, by the Team’s own criteria. But their own adjustment method comes up with a rate of change three times higher than this. The adjustment method fails this simple sanity check. There is far more adjustment than is necessary.
*naturally the measured rate of change is not attributable to any specific source. Could be CO2, could be UHI, could be something else.

Ryan Stephenson
December 22, 2009 5:10 pm

@Willis: OK, I tried to do it myself, but GISS/GHCN are both unable to deliver me the raw data 😉
I read the results from the graph you have shown of the raw data during about 1962 to 1979 and stuffed them into Excel, which was then used to add a linear line of regression. The gradient of that line came out at 3.5Celsius per century. Very high, certainly, but definitely not 6Celsius per century. The algorithm used to make the adjustment fails the sanity check – the adjustments used at Darwin are not reasonable.
Before anybody says “wow, 3.5Celsius per century – we’re all going to fry”, remember this is data over only a 17 year period, it is preceded by a long cooling period of about 1Celsius per century lasting some 50 years or more, and we aren’t able to identify the source of the warming trend – it may not be CO2. I think if you did the same analysis of the full dataset using this approach then you would probably see that temperatures today are about the same as they were in 1880. Shame we can’t get hold of the raw data as it seems to have been, um, taken off line.
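
For anyone who wants to repeat the arithmetic, here is a minimal sketch of the trend calculation Ryan describes, with invented values standing in for the digitized 1962–1979 raw data:

```python
import numpy as np

# Invented annual values standing in for the digitized 1962-1979 raw data;
# the point is only the slope arithmetic, not a reproduction of Darwin.
rng = np.random.default_rng(2)
years = np.arange(1962, 1980)
temps = 28.3 + 0.035 * (years - 1962) + rng.normal(0, 0.3, len(years))

slope, intercept = np.polyfit(years, temps, 1)   # slope in deg C per year
print(f"least-squares trend: {slope * 100:.1f} C per century")
```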

3x2
December 22, 2009 5:21 pm

Phil. (23:26:41) :
This is shown by the data, the record low at the airport is 10.4, the record at the PO was 13.4.
Which data? I can’t find any 10.4 or 13.4 for Darwin in v2.mean.

temp
December 22, 2009 6:44 pm

My reply to Willis Eschenbach (15:36:39).
Well, I’m pretty sure that they didn’t use a significance measure or they’d certainly have mentioned that. It seems like they used a straight cut-off of 0.8. You could certainly argue not using a p-value cut-off is a weakness.
I’m also pretty sure your 40 year period is likely to be wrong. Remember in total, they are only looking for 25 years or so of data from the station (certainly not 40). You’ve linked the pdf to one of their new papers, but if you go back and look at the refs in that, I’m pretty sure that they talk about a 5 year window analysis, and at the average global level that has given us something that looks similar to what they’ve produced (not exactly).
No homogenization method is going to be perfect or “natural”, but the fact of the matter is that the data does need to be homogenized (though I will point out that if you simply take all of the data and average it and ignore everything you still get a nice bi-step warming this century, including a late 20th century warming trend, resulting in now being as warm as any other time this century).
For your point to be relevant:
1. You have to claim no use of the data is “good” and that we should essentially ignore all of the surface data.
or
2. You need to suggest a method that produces a more “natural” result that doesn’t show warming.

temp
December 22, 2009 6:47 pm

My reply to wobble (12:57:25).
If he REALLY wants the details about this particular station, I’d suggest he’d do what has already been suggested- that he contact GHCN and ask them what they did.

December 22, 2009 6:54 pm

Ryan Stephenson (17:10:25)
“Shame we can’t get hold of the raw data as it seems to have been, um, taken off line”
If you are after the NOAA GHCN V2 data you can find it and download it from here.
NOAA GHCN V2 data
and it might be worth having a look at the links in this thread.
Reproducing Willis Eschenbach’s WUWT Darwin analysis
KevinUK

December 22, 2009 8:07 pm

Ryan Stephenson (17:10:25)
If you (and anyone else) are interested in getting access to the NOAA GHCN data using a very friendly user interface that amongst other things enables you to easily select and then export raw and/or adjusted data for a selected WMO station to an Excel spreadsheet/CSV file, then click the link below to the TEKTemp system (as I’ve called it).
You’ll need to register first before you can log in and use it, but it’s simple as you just need to supply a valid email address (to be used as your username) and a password on the TEKTemp registration page.
The TEKTemp system
Note this system is currently running on my own test web server and so may not always be on-line. I hope to port it to a proper web hosting account shortly
KevinUK

Nick Stokes
December 22, 2009 8:34 pm

KevinUK (13:23:59) :
In total, I have found 194 instances of WMO stations where “cooling” has been turned into “warming” by virtue of the adjustments made by NOAA to the raw data.

And how many instances of turning warming into cooling?

Willis Eschenbach
December 22, 2009 8:34 pm

temp (18:47:44) :

My reply to wobble (12:57:25).
If he REALLY wants the details about this particular station, I’d suggest he’d do what has already been suggested- that he contact GHCN and ask them what they did.

BWA-HA-HA-HA …
Sorry, temp, that just slipped out, couldn’t help it …
But heck, why not? I’ve tried this path many times without any joy, but this time might be different. I just sent the following to the address given on the NCDC web site for questions about GHCN, ncdc.info@noaa.gov .

Dear Friends:
I am a student of the climate, and I have been struggling to understand the GHCN homogenization method used for adjusting the GHCN raw temperature data. I am trying to reproduce your adjustments to the Darwin Australia climate station, and I have been unable to do so. Unfortunately, your published information is inadequate to answer the questions I have about the procedure. In particular, I would like to know:
1. In the published information about the procedure, you describe setting up a “first difference reference series” comprised of stations which have a correlation of 0.80 with Darwin. For the 1920 adjustment to the Darwin station, I cannot find any such stations within the first hundred stations nearest to Darwin. Which stations were actually used to form the reference series for Darwin? I am particularly interested in those stations used for the 1920 adjustment.
2. In the published information, you do not specify the width of the window you are using for the correlation calculation. I have tried windows of various widths, but have not been able to reproduce your results. What time period (in years) on either side of the target year are you using for your correlation calculations?
3. I see nothing in the procedure about determining whether the correlation between Darwin and another station is statistically significant, or is occurring purely by chance. Did you do such a test, and if so, what was your cutoff (e.g. p<0.05, p<0.01)?
4. The statistical significance of correlation calculations is known to be greatly affected by autocorrelation. If you did do significance calculations for the construction of the first-difference reference series, did you adjust for autocorrelation, and if so, how?
5. How many reference stations were actually used for the calculation of the adjustment made to Darwin in 1920?
6. All of these questions could be easily answered, without you having to do any investigation, by an examination of the computer code that was used for the procedure. Is that code published on the web somewhere? If not, could you put it on the web so that these kinds of questions can be answered? And if not, could I get a copy of the code?
Many thanks for your assistance in this matter, and have a marvellous Holidays,
Willis Eschenbach

So … we’ll see. I’m not holding my breath. I’ll report back with the answer. My bet is on the polite brushoff, but since we’re now in the post-Climategate era, who knows? I’d be happy to be surprised.

Willis Eschenbach
December 22, 2009 8:45 pm

temp (18:44:00) :

… (though I will point out that if you simply take all of the data and average it and ignore everything you still get a nice bi-step warming this century, including a late 20th century warming trend, resulting in now being as warm as any other time this century).

Do you have a citation for this? Is it gridded, or simply an average? How is the averaging done?
For the record, I should say that I do think that the planet warmed in the 20th century … and the 19th century … and the 18th century …
I also think that we are no warmer than the earlier warm period in the 1930s-40s.

Willis Eschenbach
December 22, 2009 10:29 pm

And the ride through the fun house begins … in reply to my email of questions for GHCN, I just got the following:

********** THIS IS AN AUTOMATED REPLY **********
*— IMPORTANT —*
DO NOT REPLY TO THIS MESSAGE. The email address
used here is for automated email only. Messages
sent to this address are not read.
Thank you for contacting the National Climatic
Data Center (NCDC).
In order to serve you as efficiently as possible,
a self-help system has been developed containing
some of the more common questions about NCDC’s
online systems, available data, and products.
Please check the links below and see if they
answer your question(s). If you are still
having a problem, and you are unable to find
what you are looking for after checking these
links, please redirect your question(s) to:
ncdc.orders@noaa.gov
NCDC is required by Public Law 100-685 (Title IV,
Section 409) to assess fees, based on fair market
value, to recover all monies expended by the
U. S. Government to provide the data. More
information can be found at
http://www.ncdc.noaa.gov/ol/about/ncdchelp.html#FREE.
NCDC does provide free access to some data at
www4.ncdc.noaa.gov/cgi-win/wwcgi.dll?wwAW~MP~F
*—-
Common Questions about NCDC Systems:
*—-
Check Status of Existing Orders –
nndc.noaa.gov/?orderstatus.shtml
Question about a Web Order-
http://www.ncdc.noaa.gov/ol/about/ncdchelp.html#ON
Question about On-Line Subscriptions –
nndc.noaa.gov/?subscriptions.shtml
General OnLine Help –
http://www.ncdc.noaa.gov/ol/about/ncdchelp.html
*—-
Common Questions about Locating NCDC Data and Products:
*—-
Most Popular Products – http://www.ncdc.noaa.gov/mpp.html
Climate Data – http://www.ncdc.noaa.gov/climatedata.html
Radar Data – http://www.ncdc.noaa.gov/radardata.html
Satellite Data – http://www.ncdc.noaa.gov/satellitedata.html
Online Store – http://www.ncdc.noaa.gov/store.html
Weather Stations – http://www.ncdc.noaa.gov/stations.html
The Most Popular Products page lists our most
requested products and provides links to those
products. Please check this web page since
many data and product questions can be answered
here. Additional products can be found in our
Online Store (including some available for FREE).
For ASCII data files related to stations of your
choice, our Climate Data Online system contains
data for the full period of record. A STRONG
NOTE OF CAUTION is advised here if you do not
know what an ASCII file is. ASCII files need to
be loaded into spreadsheets or databases to be
further enhanced for analysis and reporting.
They are not easily readable in ASCII. NCDC
receives a number of calls from people who have
paid for and downloaded data only to find out
they do not really need ASCII files. If you
know you want ASCII files, the URL is
http://www.ncdc.noaa.gov/cdo.html
Although NCDC does not have sunrise/sunset
information, we do receive a lot of requests
for it. If you are looking for sunrise/sunset
information, you should visit the U.S. Naval
Observatory Data Services web site at
http://aa.usno.navy.mil/AA/data/
Again, thank you for contacting NCDC. We hope
that this self help page will answer your
questions. If it does not, we apologize for
any inconvenience this may cause and ask that
you redirect your question(s) to:
ncdc.orders@noaa.gov
National Climatic Data Center
Federal Building
151 Patton Avenue
Asheville NC 28801-5001
http://www.ncdc.noaa.gov
Phone: 828-271-4800
FAX: 828-271-4876

I’ve resent the email to the address in the automated reply, ncdc.orders …

MartinGAtkins
December 22, 2009 10:46 pm

I’m sorry this is a bit late; it may not be relevant to Willis Eschenbach’s work or may have already been covered. WUWT has been so busy lately it’s hard to keep abreast of things. I do feel, however, that it would be of interest to Australian station researchers. The late John L. Daly examined the Darwin station, and here is what he found.
But there is something peculiar about Darwin, a tropical site. It shows an overall cooling. But that cooling was mostly done in a period of only 3 years between 1939 and 1942.
http://www.john-daly.com/darwin.htm

Ryan Stephenson
December 23, 2009 2:35 am

@temp 18:44.
“No homogenization method is going to be perfect or “natural”, but the fact of the matter is that the data does need to be homogenized ”
Unfortunately the homogenization DOES need to be perfect. We are looking to adjust data that only persists for about 15 years between significant site changes, and the underlying trend we are looking for during those periods is only 1Celsius per century – i.e. 0.15Celsius when the difference in annual means can be anything up to 1.5Celsius. So a very small error in the homogenisation process will kill the accuracy of that data.
So is the total homogenisation process perfect? Well, the approach I have used to do a sanity check of the data at Darwin uses the Team’s own algorithm to detect when the site is perceived to be “stable” and not requiring any homogenisation. The problem is that these periods are only about 15 years long, and during that time the data can be going up and down more or less randomly by 1.5Celsius. Fortunately, at Darwin between 1962 and 1979 the rate of rise is very steep. Therefore we can plot a line of regression between the points and, even though there is considerable uncertainty in exactly where the line should run, we can see that it is certainly closer to 3.5Celsius/century than 6Celsius per century. No amount of fiddling will allow you to impose a 6Celsius per century rate of rise on the raw data between 1962 and 1979, during which time the Team’s own algorithm tells you the site was stable.
So Darwin fails this simple sanity check – the algorithm as applied here is not working. Period. OK, so now you say “ah, but Darwin is unusual – the algorithm works in ‘usual’ cases”. Problem is – how can you prove that? In the “usual” cases the rate of rise in the adjusted data is much smaller. If I applied the same sanity check approach to a “normal” station I couldn’t say anything with any certainty because there would be so much uncertainty about where exactly the line of regression should be fitted, so I wouldn’t even try. So we can’t sanity check the algorithm in the “usual” cases – we have to take it on faith that the algorithm is working. Most mathematicians would throw out the algorithm at this point – they would say it fails one sanity check and those it might pass are highly disputable. They would take a look at the algorithm again and make sure it was doing something sensible with the Darwin data. They would use the Darwin data to expose the flaws in their algorithm.
“the fact of the matter is that the data does need to be homogenized ”
Well, why is there a need? We can see from the Team’s own algorithm that the sites are really only stable for about 15 years before some massive adjustment is necessary. We can see that doing a line of regression to find the rate of change of the temperature during those 15 year periods is problematic because there isn’t enough data in 15 years to be sure that the gradient we get is correct. But is replacing the data with data from another station a valid method of getting a long, unbiased trend?
Well, remember what we have just said. The sites are changing every 15 years according to the Team’s own algorithm. But the algorithm looks for correlation between several sites in order to backfill dubious data in the site under inspection. But how can the algorithm reliably do that? If all the sites are likely to be changing every 15 years, how can you reliably correlate one site to another? Remember that both the site under investigation and the sites used for homogenisation are changing very often – if this wasn’t the case then you could just throw out the data from the sites that had changed and stick with the sites that show no site changes. So the algorithm is probably backfilling data between all the sites in a given region, taking data from site 1 and giving it to site 2, which gives data to site 3 and also to site 4, in some crazy, difficult-to-follow method that might very well result in some circular changes being made (unless of course they are using the method “all sites are created equal, but some are more equal than others”!). The data may match over some specific 15 year period to an extent, but is that sufficiently reliable? Is it any more reliable than fitting a line of regression to 15 years of stable data? Well, the algorithm doesn’t seem to look for perfect correlation anyway, and the correlation it is looking for is based on the annual changes in the means, i.e. a weather signal, not a climate signal. This of course sounds rather dubious as a process, and then we have the sanity check performed above which shows us that the algorithm is known to fail anyway.
There is another underlying problem with the homogenisation process. It tends to be biased in favour of stations that suffer the worst UHI effect. Most sites at the beginning of the 19th century would have been edge of town locations with low-rise buildings, little tarmac and no heavy traffic. During the 20th century all these factors would have changed, contributing to significant UHI (which is why simply averaging all the data only proves there was warming of the measurement kit, not that the warming was specifically related to CO2). Those sites that were moved during the 20th century were almost certainly moved from urban areas (where they got in the way of development) to rural areas. However, the Team’s algorithm automatically detects these sites because of the discontinuities in the readings caused by the site change, and seeks to replace the data from elsewhere. Where is it most likely to get that data? From those sites that have not changed much. Those are sites that were on the edge of town, have seen massive development around them, but then have not been moved. The site then suffers from UHI – but the Team’s algorithm can’t detect that because it can only see sudden changes in the means either side of data from a given year. So the algorithm detects these urbanised sites as “good” sites and then uses these to replace the data at the “bad” rural sites. This is exactly the opposite of what logic would tell you should be the case.
Fundamentally Willis has shown that the algorithm is definitely wrong, that the output of the algorithm cannot be relied upon in any cases and that the underlying logic of the algorithm is faulty. We can go on to say that the high frequency of site changes and the unknown precise impact of UHI at each site makes it impossible to rely on the surface temperature measurements as a means of detecting climate change of 1Celsius per century reliably and should be abandoned. The best we can do is to learn from these mistakes and set up true climate monitoring stations to give us data reliably for the next 100 years unimpacted by UHI or site changes.

December 23, 2009 2:46 am

Nick Stokes (20:34:00) :
“KevinUK (13:23:59) :
In total, I have found 194 instances of WMO stations where “cooling” has been turned into “warming” by virtue of the adjustments made by NOAA to the raw data.
And how many instances of turning warming into cooling”
Nick, why don’t you visit my thread on ‘diggingintheclay’ and find out?
The link to the thread on ‘Physically unjustifiable NOAA GHCN adjustments’ is here
As you’ll see I found 216 instances in which NOAA’s GHCN adjustments resulted in ‘warming being turned into cooling’ and I list the Top 30 worst cases. The worst case is Mayo Airport, Yukon, Canada. As with Darwin and many other stations there is no physically justifiable reason as to why its raw data needs to be adjusted. It is an almost complete set of raw data spanning 1925 to 1990 with (according to NOAA’s GHCN station inventory data) no station moves and no duplicate series. Its raw data trend looks relatively ‘flat’ but is in fact slightly positive at +1.3 deg. C/century. For some reason NOAA feels the need to adjust this raw data so that all data prior to 1983 is increased using a ‘staircase’ type step function (see the thread for the charts) similar to Darwin (and Edson – see thread) except in this case the adjustments result in a strong cooling trend of -4.5 deg. C/century.
I’ve checked the raw data for Mayo Airport on the Environment Canada web site here and it shows that raw data is available from 1984 to 2009 and it all looks fine with occasional missing data so why did it ‘drop out’ of NOAA’s adjustment analysis after 1983?
Perhaps you’d like to have a look at some of the other stations on my two lists? It’s easy to extract the data for individual WMO stations using my TEKTemp admin system (see earlier post for link).
KevinUK

Janis B
December 23, 2009 5:51 am

KevinUK (02:46:24) :

“For some reason NOAA feels the need to adjust this raw data so that all data prior to 1983 is increased using a ’staircase’ type step function”

I loaded Mayo A data from the Climate Explorer data into Excel, plotted the adjustments and see what you mean by ’staircase’.
Jesus… there must be some kind of ‘statistical explanation’ – or whatever the hell scientists call it – to substantiate why the algorithm does not introduce spurious trends, and does, in fact, qualitatively improve on the ‘original’ data. And no – that the adjustment trends largely cancel themselves out over whole dataset is not a sufficient explanation, I think.
Actually, would it be possible and not too difficult to group and at least average the trends (or the resulting anomalies) somehow – without weighting them and not within grid but, say, by some of the countries?

temp
December 23, 2009 6:58 am

“Do you have a citation for this? Is it gridded, or simply an average? How is the averaging done?”
No reference. I did it myself. Took 10 minutes. It is an easy thing to do, and I doubt it is publishable in and of itself, but we are working on doing something with it (essentially taking an opposite direction with respect to constructing global temp trends).
If you don’t believe me, do it yourself, or ask somebody that you trust to do it (ask a regular contributor to this blog to do it). Doing the simple average is really, really easy.
For averaging w/ no gridding, I simply took the values for all the stations available at any time point and averaged them.
Gridding adds layers of complexity, but I did it as simply as I could.
With the completely raw data just averaged, I don’t get the late 1930s to be especially warm. The other “warm” period in this century shifts by about 30 years or so. I attribute this mostly to changes in the distribution of stations. In early years, they are mostly European and N. American. Post-WWII, more stations from other parts of the world come in, and you see warming (there might be an urban heat island and/or greenhouse gas effect contributing). From there, the average decreases pretty substantially, but then it goes back up.
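
Here is a minimal sketch of the kind of unweighted, ungridded average temp describes, with a hypothetical station-by-year table standing in for the real GHCN data:

```python
import numpy as np

# Hypothetical station-by-year table (rows = stations, columns = years);
# NaN marks years a station did not report.
data = np.array([
    [28.9, 29.0, np.nan, 29.2],
    [27.5, np.nan, 27.8, 27.9],
    [np.nan, 26.9, 27.0, 27.1],
])

# Unweighted, ungridded mean: average whatever stations report in each year.
yearly_mean = np.nanmean(data, axis=0)
print(yearly_mean)
```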

temp
December 23, 2009 7:07 am

Ryan Stephenson (02:35:00) – I just want to point out that I did offer another option. That is to simply throw out all of the surface data.
Generally, though, I disagree with you. Through different data analysis methods people routinely pull good information out of data that has serious flaws.
Possibly, this homogenization method isn’t the best and has issues, which is why I suggested from the beginning that something that would generate real interest would be a “better” homogenization method that doesn’t show global warming.

Ryan Stephenson
December 23, 2009 7:20 am

I spotted another logical flaw in these homogenisation approaches.
The temperature data is only measured to the nearest 0.1Celsius. In order to detect a correlation between two sites, you would have to allow variations of +/-0.1Celsius to allow for variations that are purely due to the random uncertainty in the measurement due to the low resolution of thermometers. This means that to allow correlation between two sites we must allow discrepancies of (at least) 0.1Celsius that occur between the two sets of data.
However, the climate signal we are normally looking for (not at Darwin, which seems to be a special case!) is of the order of 0.1Celsius per decade. So if we use a correlation algorithm which says “oh, if the sites don’t correlate right down to 0.1Celsius we don’t care” and we use this to correct for sites that have moved every ten years then we are permitting uncertainties in the correlation which are as big as the alleged climate change signal that we are looking for.
That isn’t mathematically sound.
Probably the correlation looks okay, because the charts go up and down in the right places, but the precision of the correlation can’t possibly be sufficient to reliably extract a climate trend signal from the merged data.

temp
December 23, 2009 7:28 am

KevinUK (02:46:24) : Read your site. That is actually good work, though I’m not sure your conclusions are right.
You seem to see just about what you’d expect: “corrections” lead to as much warming as cooling.
I don’t think, for example, you can claim there is no reason for the GHCN to correct the data in the manner they have. There are legitimate issues with stations where you’d expect there to be artificially induced warming and cooling, and if those cases can be identified, then a correction should be made.
You could make the argument that over long periods of time they should cancel out (which is why I thought it would be interesting to just look at absolute averages), which is what your results seem to suggest.

December 23, 2009 7:57 am

Janis B (05:51:10) :
“Actually, would it be possible and not too difficult to group and at least average the trends (or the resulting anomalies) somehow – without weighting them and not within grid but, say, by some of the countries”
I’m working on this right now. I’ll be following up with a further series of threads on diggingintheclay over the next few days. You’ll be very surprised by the number of WMO stations that drop out of the NOAA GHCN analysis in the last two decades or so.
If you aren’t aware of them, I’ll be reproducing the analyses of Giorgio Gillestri and RomanM very shortly. I had hoped GG would be prepared to extend his interesting analysis but it seems he comes from the Eric Steig school of the scientific method where you give up on your analysis once you’ve reached your pre-conceived conclusion. Pity, but I’m happy to pick up where he left off and do a somewhat more detailed analysis which will include disaggregating his analysis into NH/SH and high, mid and low latitudes in each hemisphere.
KevinUK
