Comparing GHCN V1 and V3

Much Ado About Very Little

Guest post by Zeke Hausfather and Steve Mosher

E.M. Smith has claimed (see his full post, “Summary Report on v1 vs v3 GHCN”) to find numerous differences between GHCN version 1 and version 3, differences that, in his words, constitute “a degree of shift of the input data of roughly the same order of scale as the reputed Global Warming”. His analysis is flawed, however, as the raw data in GHCN v1 and v3 are nearly identical, and trends in the globally gridded raw data for both are effectively the same as those found in the published NCDC and GISTemp land records.


Figure 1: Comparison of station-months of data over time between GHCN v1 and GHCN v3.

First, a little background on the Global Historical Climatology Network (GHCN). GHCN was created in the late 1980s after a large effort by the World Meteorological Organization (WMO) to collect all available temperature data from member countries. Many of these were in the form of logbooks or other non-digital records (this being the 1980s), and many man-hours were required to process them into a digital form.

Meanwhile, the WMO set up a process to automate the submission of data going forward, setting up a network of around 1,200 geographically distributed stations that would provide monthly updates via CLIMAT reports. Periodically NCDC undertakes efforts to collect more historical monthly data not submitted via CLIMAT reports, and more recently has set up a daily product with automated updates from tens of thousands of stations (GHCN-Daily). This structure of GHCN as a periodically updated retroactive compilation with a subset of automatically reporting stations has in the past led to some confusion over “station die-offs”.

GHCN has gone through three major iterations. V1 was released in 1992 and included around 6,000 stations with only mean temperatures available and no adjustments or homogenization. Version 2 was released in 1997 and added a number of new stations, minimum and maximum temperatures, and manually homogenized data. V3 was released last year and added many new stations (both in the distant past and post-1992, where Version 2 showed a sharp drop-off in available records), and switched the homogenization process to the Menne and Williams Pairwise Homogenization Algorithm (PHA) previously used in USHCN. Figure 1, above, shows the number of station records available for each month in GHCN v1 and v3.

We can perform a number of tests to see if GHCN v1 and v3 differ. The simplest one is to compare the observations in both data files for the same stations. This is somewhat complicated by the fact that station identity numbers changed between v1 and v3, and we have been unable to locate a translation table between the two. We can, however, match stations between the two sets using their latitude and longitude coordinates. This gives us 1,267,763 station-months of data whose stations match between the two sets with a precision of two decimal places.
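The coordinate matching described above can be sketched in a few lines of Python. This is only an illustration, not the STATA code used for the post: the function names, the input layout (a dict of station id to latitude and longitude), and the decision to skip coordinates shared by more than one station are all assumptions made for the example.

```python
from collections import defaultdict

def coord_key(lat, lon, places=2):
    """Round coordinates to a fixed number of decimal places to form a join key."""
    return (round(lat, places), round(lon, places))

def match_stations(v1_stations, v3_stations, places=2):
    """Return sorted (v1_id, v3_id) pairs whose rounded coordinates agree.

    Each input maps station id -> (lat, lon). Coordinates shared by more
    than one station within either version are skipped, because the match
    would be ambiguous (an assumption of this sketch).
    """
    v1_by_key = defaultdict(list)
    for sid, (lat, lon) in v1_stations.items():
        v1_by_key[coord_key(lat, lon, places)].append(sid)

    v3_by_key = defaultdict(list)
    for sid, (lat, lon) in v3_stations.items():
        v3_by_key[coord_key(lat, lon, places)].append(sid)

    pairs = []
    for key, v1_ids in v1_by_key.items():
        v3_ids = v3_by_key.get(key, [])
        if len(v1_ids) == 1 and len(v3_ids) == 1:
            pairs.append((v1_ids[0], v3_ids[0]))
    return sorted(pairs)
```

A station whose coordinates were refined between versions by more than the rounding precision will not match under this scheme, which is one reason a lat/lon join recovers only a subset of the stations.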

When we calculate the difference between the two sets and plot the distribution, we get Figure 2, below:


Figure 2: Difference between GHCN v1 and GHCN v3 records matched by station lat/lon.

The vast majority of observations are identical between GHCN v1 and v3. If we exclude identical observations and just look at the distribution of non-zero differences, we get Figure 3:


Figure 3: Difference between GHCN v1 and GHCN v3 records matched by station lat/lon, excluding cases of zero difference.

This shows that while the raw data in GHCN v1 and v3 is not identical (at least via this method of station matching), there is little bias in the mean. Differences between the two might be explained by the resolution of duplicate measurements in the same location (called imods in GHCN version 2), by updates to the data from various national MET offices, or by refinements in station lat/lon over time.
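For readers who want to repeat the comparison behind Figures 2 and 3 without STATA, a rough Python sketch follows. It assumes the matched observations have already been keyed by (station, year, month); the function name and the returned fields are invented for the example.

```python
def compare_observations(v1_obs, v3_obs, tol=1e-9):
    """Compare monthly values present in both versions.

    v1_obs and v3_obs map (station, year, month) -> temperature.
    Returns the number of common station-months, the share that are
    identical, and the mean of the non-zero differences (v3 minus v1).
    """
    common = sorted(v1_obs.keys() & v3_obs.keys())
    diffs = [v3_obs[k] - v1_obs[k] for k in common]
    nonzero = [d for d in diffs if abs(d) > tol]
    share_identical = 1 - len(nonzero) / len(diffs) if diffs else float("nan")
    mean_nonzero = sum(nonzero) / len(nonzero) if nonzero else 0.0
    return {
        "n_common": len(diffs),
        "share_identical": share_identical,
        "mean_nonzero_diff": mean_nonzero,
    }
```

A mean non-zero difference close to zero is what "little bias in the mean" looks like in this summary: individual values differ, but the differences do not lean warm or cold.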

Another way to test if GHCN v1 and GHCN v3 differ is to convert the data of each into anomalies (with baseline years of 1960-1989 chosen to maximize overlap in the common anomaly period), assign each to a 5 by 5 lat/lon grid cell, average anomalies in each grid cell, and create a land-area weighted global temperature estimate. This is similar to the method that NCDC uses in their reconstruction.
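The steps in that paragraph (station anomalies against a 1960-1989 baseline, 5 by 5 degree cells, cell averages, area-weighted global mean) can be sketched roughly as below. This is a simplified stand-in, not the code behind the figures: the record layout and function names are assumptions, and cells are weighted by the cosine of their central latitude rather than by actual land area, which would additionally require a land-fraction mask.

```python
import math
from collections import defaultdict

BASE_START, BASE_END = 1960, 1989  # baseline period named in the post
CELL = 5.0                         # grid cell size in degrees

def station_anomalies(records, min_years=1):
    """records: list of (station, lat, lon, year, month, temp_c).

    Returns a list of (lat, lon, year, month, anomaly), where the anomaly
    is relative to that station's own baseline mean for the calendar month.
    Stations with fewer than min_years baseline values are dropped.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for st, lat, lon, yr, mo, t in records:
        if BASE_START <= yr <= BASE_END:
            entry = sums[(st, mo)]
            entry[0] += t
            entry[1] += 1
    out = []
    for st, lat, lon, yr, mo, t in records:
        total, n = sums.get((st, mo), (0.0, 0))
        if n >= min_years:
            out.append((lat, lon, yr, mo, t - total / n))
    return out

def cell_of(lat, lon):
    """Row and column index of the 5x5 degree cell containing the point."""
    row = min(math.floor(lat / CELL), int(90 / CELL) - 1)  # keep lat = 90 in the top row
    return (row, math.floor(lon / CELL))

def global_mean(anoms):
    """Average anomalies within each cell, then combine cells using
    cos(central latitude) weights. Returns {(year, month): mean anomaly}."""
    cells = defaultdict(list)
    for lat, lon, yr, mo, a in anoms:
        cells[(yr, mo, cell_of(lat, lon))].append(a)
    num = defaultdict(float)
    den = defaultdict(float)
    for (yr, mo, (row, _col)), vals in cells.items():
        weight = math.cos(math.radians((row + 0.5) * CELL))
        num[(yr, mo)] += weight * sum(vals) / len(vals)
        den[(yr, mo)] += weight
    return {key: num[key] / den[key] for key in num}
```

Averaging within cells first is what keeps densely sampled regions from dominating the result: a cell with fifty stations counts the same as a cell with one.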


Figure 4: Comparison of GHCN v1 and GHCN v3 spatially gridded anomalies. Note that GHCN v1 ends in 1990 because that is the last year of available data.

When we do this for both GHCN v1 and GHCN v3 raw data, we get the figure above. While we would expect some differences simply because GHCN v3 includes a number of stations not included in GHCN v1, the similarities are pretty remarkable. Over the century scale the trends in the two are nearly identical. This differs significantly from the picture painted by E.M. Smith; indeed, instead of the shift in input data being equivalent to 50% of the trend, as he suggests, we see that differences amount to a mere 1.5% difference in trend.

Now, astute skeptics might agree with me that the raw data files are, if not identical, overwhelmingly similar, but point out that there is one difference I did not address: GHCN v1 had only raw data with no adjustments, while GHCN v3 has both adjusted and raw versions. Perhaps the warming that E.M. Smith attributed to changes in input data might in fact be due to changes in adjustment method?

This is not the case, as GHCN v3 adjustments have little impact on the global-scale trend vis-à-vis the raw data. We can see this in Figure 5 below, where both GHCN v1 and GHCN v3 are compared to published NCDC and GISTemp land records:


Figure 5: Comparison of GHCN v1 and GHCN v3 spatially gridded anomalies with NCDC and GISTemp published land reconstructions.

If we look at the trends over the 1880-1990 period, we find that both GHCN v1 and GHCN v3 are quite similar, and lie between the trends shown in GISTemp and NCDC records.

1880-1990 trends

GHCN v1 raw: 0.04845 C (0.03661 to 0.06024)

GHCN v3 raw: 0.04919 C (0.03737 to 0.06100)

NCDC adjusted: 0.05394 C (0.04418 to 0.06370)

GISTemp adjusted: 0.04676 C (0.03620 to 0.05731)
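The post does not say how the intervals in parentheses were computed, so the following is only one conventional way to get a trend with an uncertainty range: an ordinary least squares slope with an approximate 95% interval of plus or minus 1.96 standard errors. It ignores autocorrelation in the residuals, which would widen the interval, and the function name is made up for the example.

```python
import math

def ols_trend(years, values, z=1.96):
    """Ordinary least squares slope of values against years.

    Returns (slope, lower, upper), where the bounds are slope +/- z
    standard errors. The slope is in units of the values per year.
    """
    n = len(years)
    if n < 3:
        raise ValueError("need at least three points")
    xbar = sum(years) / n
    ybar = sum(values) / n
    sxx = sum((x - xbar) ** 2 for x in years)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    resid_ss = sum((y - intercept - slope * x) ** 2 for x, y in zip(years, values))
    se = math.sqrt(resid_ss / (n - 2) / sxx)  # standard error of the slope
    return slope, slope - z * se, slope + z * se
```

Two series can be said to have statistically similar trends when each slope falls inside the other's interval, which is the case for the v1 and v3 raw values listed above.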

This analysis should make it abundantly clear that the change in raw input data (if any) between GHCN version 1 and GHCN version 3 had little to no effect on global temperature trends. The exact cause of Smith’s mistaken conclusion is unknown; however, a review of his code does indicate a few areas that seem problematic. They are:

1. An apparent reliance on station IDs to match stations. Station IDs can differ between versions of GHCN.

2. Use of First Differences. Smith uses first differences, but he has made idiosyncratic changes to the method, especially in cases where there are temporal lacunae in the data. The method, which used to be used by NCDC, has known issues and biases, detailed by Jeff Id. Smith’s implementation and his method of handling gaps in the data are unproven and may be the cause.

3. It’s unclear from the code which version of GHCN v3 Smith used.
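To make the second point concrete, here is a small sketch of a first-difference combination and of how the treatment of gaps can change the answer. It is not Smith's code and not NCDC's former implementation. The `bridge_gaps` switch is a hypothetical illustration of one possible gap-handling choice: either only consecutive years are differenced, or the jump across a gap is assigned to the first year after it.

```python
from collections import defaultdict

def first_differences(series, bridge_gaps=False):
    """series: {year: value} for one station and one calendar month.

    Returns {year: value minus the previous available value}. With
    bridge_gaps=False only consecutive years are differenced; with
    bridge_gaps=True the change across a gap is assigned to the later year.
    """
    out = {}
    years = sorted(series)
    for prev, cur in zip(years, years[1:]):
        if cur - prev == 1 or bridge_gaps:
            out[cur] = series[cur] - series[prev]
    return out

def combine(stations, bridge_gaps=False):
    """Average the first differences across stations for each year, then
    cumulate them into a relative temperature series."""
    by_year = defaultdict(list)
    for series in stations:
        for yr, d in first_differences(series, bridge_gaps).items():
            by_year[yr].append(d)
    total, result = 0.0, {}
    for yr in sorted(by_year):
        total += sum(by_year[yr]) / len(by_year[yr])
        result[yr] = total
    return result
```

With one station reporting 10.0, 10.5 and 11.5 in 1950, 1951 and 1953, and a second reporting 5.0, 5.5, 5.5 and 6.0 in 1950 through 1953, the combined 1953 value is 1.0 without bridging and 1.25 with it. Because the differences are cumulated, any such choice carries forward into every later year.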

STATA code and data used in creating the figures in this post can be found here: https://www.dropbox.com/sh/b9rz83cu7ds9lq8/IKUGoHk5qc

Playing around with it is strongly encouraged for those interested.

gallopingcamel
June 24, 2012 9:38 pm

Just when I thought that Mosher had mellowed a little he starts ranting.
Methinks Steve doth protest too much. Calm down and listen just this once.

phlogiston
June 24, 2012 9:44 pm

“Hit ‘EM where they aint” I guess must be Steve Mosher’s principle in, astonishingly, refusing point blank to reply to the detailed rebuttal by EM Smith while filling the thread with musings about life in general. This looks like a mixture of cowardice and arrogance – in any case there is no doubt that it constitutes an admission of defeat.
This spectacular intellectual defeat of Mosher and Hausfather by EM Smith is reminiscent of the outcome of the famous Oxford debate on evolution by natural selection between Thomas Huxley and Samuel Wilberforce at the Royal Institution in 1860 (http://en.wikipedia.org/wiki/1860_Oxford_evolution_debate) – paradoxically perhaps considering the Chiefio’s faith and presumed atheism on the other side. Such is history. This is what happens when true intellect and honesty confront establishment dogma and a dishonest defence of special interest.

Carrick
June 24, 2012 9:44 pm

sunshine:

NOAA: “The Little Ice Age (or LIA) refers to a period between 1350 and 1900

1900 is the extreme outer edge, and I think you would have to admit that an interval that extends beyond this isn’t typical of LIA weather.
Regarding the interval selection, if you’ll listen to and absorb what is being said regarding the issues about reduced geographic regions, you’ll sharpen your arguments as a result. You may or may not get the conclusions you prefer, but that’s not the point of your analysis, right? It’s to learn the truth.

June 24, 2012 9:48 pm

Sunshine
“And for 3rd or 4th time you dodge the key point: Spurious data did appear in BEST.”
Let me explain this to you yet again.
The QC data file goes through ADDITIONAL checks. Those checks happen in code.
I can use GHCN daily as an example. For the GHCN daily data set every day of data
has a spatial consistency check
Want a link?
ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt
Too lazy to read? Here is the text:
QFLAG1 is the quality flag for the first day of the month. There are
fourteen possible values:
Blank = did not fail any quality assurance check
D = failed duplicate check
G = failed gap check
I = failed internal consistency check
K = failed streak/frequent-value check
L = failed check on length of multiday period
M = failed megaconsistency check
N = failed naught check
O = failed climatological outlier check
R = failed lagged range check
S = failed spatial consistency check
T = failed temporal consistency check
W = temperature too warm for snow
X = failed bounds check
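As an illustration of how these flags are used in practice, the sketch below reads one line of a GHCN-Daily `.dly` file and keeps only values that passed quality assurance. The column layout (11-character station id, year, month, 4-character element, then 31 day fields of 8 characters each holding a 5-character value and the measurement, quality and source flags, with -9999 for missing) is taken from the readme linked above as best recalled, so check it against the readme before relying on it. The `ignore_flags` argument is a hypothetical convenience for leaving some checks unapplied.

```python
def parse_dly_line(line):
    """Split one fixed-width GHCN-Daily record into its parts.

    Returns (station, year, month, element, days), where days is a list
    of (day_of_month, value, qflag) for all 31 day slots.
    """
    station = line[0:11]
    year = int(line[11:15])
    month = int(line[15:17])
    element = line[17:21]
    days = []
    for d in range(31):
        start = 21 + 8 * d
        value = int(line[start:start + 5])  # e.g. tenths of a degree C for TMAX
        qflag = line[start + 6]             # blank means no QA check failed
        days.append((d + 1, value, qflag))
    return station, year, month, element, days

def passing_values(days, ignore_flags=""):
    """Keep non-missing values whose quality flag is blank, or is one of
    the flags the caller chooses not to apply."""
    return {
        day: value
        for day, value, qflag in days
        if value != -9999 and (qflag == " " or qflag in ignore_flags)
    }
```

For example, `passing_values(days, ignore_flags="S")` keeps values that failed only the spatial consistency check while still dropping everything else that was flagged.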
Now, when Berkeley Earth reads this data in, it applies MOST OF but not ALL OF
these QA flags.
Do you want to see what QA flags have been applied to the data in question?
1. get the package BerkeleyEarth
2. download the QC data
3. read the flags.txt file and see what flags have been applied
Now, the spatial quality flags are typically not applied. WHY?
A. The Berkeley data includes more sources.
B. If you have more sources your spatial consistency test can be better.
Spatial consistency is determined not for just GHCN daily but for all the data around that site.
That process happens as a part of the kriging process, where the weather error is minimized in an iterative process.
At some point in the future I hope to add two datasets that will allow you to see the final data
1. post scalpel data: this is like 170K station segments
2. final total field values.
One of the drawbacks of using this kriging process is that you can’t easily
see what happens to outliers and you can’t easily see what the final weighted values are.
So for example, if you have a value that is -13.7 and all the surrounding stations
have other values, the weighting process will deweight unreliable data. It’s an iterative process
that searches to minimize the weather noise.
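The general idea of iterative deweighting can be shown with a toy example. To be clear, this is not the Berkeley Earth algorithm and involves no kriging: it is a plain iteratively reweighted mean with Cauchy-style weights, and the function name, the `scale` parameter and the sample values are all invented for illustration.

```python
def robust_mean(values, scale=1.0, iterations=20):
    """Iteratively reweighted mean.

    Each pass computes a weighted mean, then gives every value the weight
    1 / (1 + (residual / scale)^2), so values far from the consensus
    contribute less on the next pass. Returns (mean, weights).
    """
    weights = [1.0] * len(values)
    mean = sum(values) / len(values)
    for _ in range(iterations):
        mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        weights = [1.0 / (1.0 + ((v - mean) / scale) ** 2) for v in values]
    return mean, weights
```

With neighbouring values of 10.2, 9.8, 10.0 and 10.1 and one reading of -13.7, the plain average is 5.28, while the reweighted mean settles near 10 and the outlier ends up with a weight below 0.01. Nothing is deleted, which is why the effect on an individual outlier is hard to see from the final field alone.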

June 24, 2012 9:49 pm

Carrick
You give the impression BEST is untouched raw data. But they had their method of handling the data. And the data is not untouched and pure.
Also, who were the peer-reviewers of Mueller’s BEST? Why do I say Mueller? He is the face of it. We all know it. Why distract like that.

June 24, 2012 9:56 pm

mosher, thanks for the link on the LIA. One of the links (http://notalotofpeopleknowthat.wordpress.com/2011/11/11/what-was-life-like-in-the-little-ice-agepart-ii/) in that article had this to say:
“The late 1870’s were equally cold in China and India , where up to 18 million died from famines caused by cold, drought and monsoon failure.
The cold snap persisted into the 1880’s and 1890’s when large ice floes formed on the Thames.”
Remember mosher, stuff did happen before 1950, even if BEST disagrees.
And if people do think I tried to hi-jack the thread, I apologize, but my first post on this thread did start with: “(Moderator — feel free to snip, but I think this is relevant)”

Carrick
June 24, 2012 10:01 pm

Amino:

Again, and really, for the last time, you are biased. You prefer Meuller and Hansen. My goodness man, look around you.

I have absolutely no idea what you are talking about. I prefer Mueller and Hansen. As in “Gentlemen prefer Meuller [sic] and Hansen?”
What???
And OK I’m looking around. Now what?
Look, this all started out as a discussion between sunshine and myself (which is how I got you and him confused) over Mueller’s data starting here. I think you’ve lost the thread somewhere because you are adding a lot of content to this thread that isn’t there.
I wrote something critical of Mueller that hardly shows a preference for him, I didn’t make any particular claims about GISTEMP other than a cautionary note about the need to compare apples to apples. (As you add more stations, if the new stations are at higher latitude, you expect a small as in *yawn* upwards shift in temperature, just like what has been seen.)
Beyond that, of course I’m biased. We all have biases. But what does that have to do with anything we’re discussing here? How are my biases manifesting themselves in a way that distracts from the original thread? If you think I prefer (or am biased in favor of) the BEST temperature reconstruction or the GISTEMP (land-only) ones, that would be a mistake on your part.
Try and stay on track.

Carrick
June 24, 2012 10:02 pm

Amino, I’m looking at GHCN not BEST.

Carrick
June 24, 2012 10:05 pm

And I was criticizing a finding of Mueller and BEST in my comment to sunshine. And I’ve consistently said that BEST is based on GHCN. As to where I’ve “given the impression that BEST is untouched raw data,” I’ll have to ask for a link or a retraction there.
You keep making wild claims, you need to ante up now with factual examples or admit you were mistaken.

June 24, 2012 10:06 pm

Carrick
What’s up with all the arm waving and smoke screens? I did say it wrong. I should have said the data set Mueller has identified himself with. Why be so petty? But I think that’s in your nature. You didn’t do name calling this time. But the same condescension of name calling comes through anyway.
If you want to talk cherry picking, as you did in a comment above where you told me to compare one area of data in BEST to the same area in another data set that’s fine. It is cherry picking. Not sure why you can’t see it is.
I think you are not open to any real discussion. It is clear you have your beliefs. You are not here to discuss but to tell us why you think people who don’t agree with you are wrong.

June 24, 2012 10:08 pm

Carrick: “1900 is the extreme outer edge, and I think you would have to admit that an interval that extends beyond this isn’t typical of LIA weather.”
So some people say. But HADCRUT3 NH says it cooled from ~1875 to 1912 or so.
http://www.woodfortrees.org/plot/hadcrut3nh/from:1876/to:1912/plot/hadcrut3nh/from:1876/to:1912/trend
Again, I think there are climate myths that suggest late 1800 to early 1900 temperatures were a cold starting point and it was even colder before 1895, when in fact going back in time it might have been warmer all the way back to the 1870s and might well have been significantly warmer than today.
As for you not liking the size of the regions I am looking at, I disagree. I think a climate signal should be as visible in a place as big as BC or as small as California or Washington or Oregon. And if the past is warmer then we can then question why BEST ignores pre-1950 data.
The massaging of data that EM Smith is trying to highlight is essential to understanding whether modern warming is a myth caused by adjustments or just a small blip in a long history of climate ups and downs.

June 24, 2012 10:09 pm

Carrick
not really sure what you are looking for.
Are you saying BEST is raw untouched data?
Are you saying it was altered?
Can’t be sure why you want some kind of retraction.

June 24, 2012 10:10 pm

Carrick
you do give a clear impression of BEST that it is true to real world temperature. That is where you give the impression it is raw data.

June 24, 2012 10:19 pm

Carrick
this is what you said:
“When you adjust for differences in the “land-only” algorithms, BEST and GISTEMP get very similar findings, since they have the largest geographical coverage, so this is believable.”
http://wattsupwiththat.com/2012/06/22/comparing-ghcn-v1-and-v3/#comment-1016898
I can’t stay with you any longer Carrick. You’re inconsistent. Maybe someone else would like to dance in circles with you.

Carrick
June 24, 2012 10:21 pm

Amino, I’m asking for a url link to where I said what you are characterizing me as saying. Or an admission that your characterizations are in error.
I will claim I never said “BEST is raw untouched data”, in fact I never said anything particular about the BEST reconstruction (other than a comparison of it to GISTEMP).
In which comment, for example, do I “give a clear impression of BEST that it is true to real world temperature”? Can you point this out to me and others?

Carrick
June 24, 2012 10:25 pm

Sunshine:

As for you not liking the size of the regions I am looking at, I disagree. I think a climate signal should be as visible in a place as big as BC or as small as California or Washington or Oregon.

I didn’t say I “didn’t like it”. I said you had to be careful because the smaller the geographic region, the larger the effect of regional scale variability on the temperature record, and that *you* need to consider this when making your arguments.
And I know you aren’t going to like this one:

But HADCRUT3 NH says it cooled from ~1875 to 1912 or so.

I don’t think HADCRUT is reliable before 1950. I have a factual basis for making that decision, it could be flawed, it’s something that Steven Mosher and I both separately want to look at, and I can go into if you are interested.
This is back to the issue… if you don’t think 1970-2010 is reliable, why do you trust 1850-1912, where the global geographical coverage was much worse, and the quality controls in place and instrumentation were much more primitive?
Being skeptical means being skeptical all of the time, even at times when it appears to hurt your arguments. As I said, the objective should be about arriving at the truth, not about who can toss out the superior rhetoric.

Carrick
June 24, 2012 10:53 pm

Amino:

I can’t stay with you any longer Carrick. You’re inconsistent. Maybe someone else would like to dance in circles with you.

All I’m asking you to do is be honest about how you characterize my positions, and if you mischaracterized them through misunderstanding, admit to that, and not continue to attack me with mischaracterizations of my views.
This appears to be as close as you can get to substantiating your somewhat wild claims. I did say:

“When you adjust for differences in the “land-only” algorithms, BEST and GISTEMP get very similar findings, since they have the largest geographical coverage, so this is believable.”

Same geographical coverage=same answers equals the algorithms give consistent results when they are expected to. You’d expect this (my point) and this is what you find (my point). What this doesn’t seem to imply is that GISTEMP is up to some “funny stuff” in their adjustments. Nor does it imply that the temperature series is necessarily better or worse than any other.
What this statement does not say is any of these claims of yours:
1) BEST uses raw temperatures,
2) that raw temperatures are “true” temperatures,
3) that I (like most gentlemen apparently ;-)) prefer Hansen and Mueller,
4) fill in the blank(s); I’ve lost track of all of the silly claims you’ve made about me at this point.
You can’t even honestly admit that you overgeneralized my comments so now having accused me of slinking off (when I TOLD you I was going to remodel my kitchen and might be back assuming I didn’t perish e.g. running the wet saw), you now slink off without even that admission?
It’s all good, on your way then.

June 24, 2012 10:53 pm

Carrick, let’s put it this way. Do I think the LIA ended in 1850 on the dot? No.
Do I trust HADCRUT? Not necessarily. But it does coincide with some of the Greenland data I’ve seen.
Upernavik: Warmest November 1878, warmest Dec 1873 etc
http://www.arctic.noaa.gov/reportcard/greenland_1873-2011_stats_vs_1981-2010_table_htc3.pdf
I suspect it was warmer in the 1870s in many regions than it is now.
.
“the smaller the geographic region, the larger the effect of regional scale variability on the temperature record, and that *you* need to consider this when making your arguments.”
My argument is simple. A GAT is like a sausage maker attempting to mash all the climate records into one number … say 42, and make it appear that global temperatures are this relatively stable upwardly rising 2d graph.
The number 42 says nothing about climate. There is nothing unusual about our current climate. If anything the unusual part is how calm it appears and how small the fluctuations are from month to month.
Climate is chaotic and in many US states the last 5 years aren’t #1 or #2 and in some states it isn’t #3, or #4 either.
Why? It isn’t global warming.

Carrick
June 24, 2012 10:55 pm

Amino, if you wanted to know which series I thought were most credible as actual representations of global mean temperature, I’d put NCDC at the top and ECMWF close behind. GISTEMP has too many ad hoc steps for me, BEST is land only so it’s not even a global temperature reconstruction and anyway I’m not a big fan of the way they implemented kriging (assumption of radial symmetry is neither proven in their paper, nor likely to be true).

June 24, 2012 10:56 pm

Amino Acids in Meteorites says:
June 24, 2012 at 9:49 pm (Edit)
Carrick
You give the impression BEST is untouched raw data. But they had their method of handling the data. And the data is not untouched and pure.
##############################################
Let me see if I can help
using google.
You can use it to find things like this
http://berkeleyearth.org/data/
and at the bottom of this page is a piece of English
“Source files
The source files we used to create the Berkeley Earth database are available in a common format here.”
That word “here” is a link.
it magically brings you to another place
http://berkeleyearth.org/source-files/
These are the source files. that means, these are the sources used to compile the dataset.
Now if you look at all of those datasets you will see some ( like CRUTEM4) that contain data series that are adjusted. and if you look at the sources for crutem4
http://www.cru.uea.ac.uk/cru/data/temperature/crutem4/station-data.htm
you will see other sources… and if you follow those sources down you find things like this
http://www.ec.gc.ca/dccha-ahccd/default.asp?lang=en&n=70E82601-1
which is derived from data like this
http://www.climate.weatheroffice.gc.ca/climateData/monthlydata_e.html?timeframe=3&Prov=BC&StationID=65&mlyRange=1920-01-01|2005-04-01&Year=1920&Month=01&Day=01
Got that?
Now the really hard part is taking all these disparate sources and creating a master file.
Because… that same source for CRU can also be a source for GHCN daily..
and that same source could be a source for GHCN v2 monthly back and GHCN v3
The process of resolving all the sources happens in the merge step. trust me
merging different datasets is not a fun job.
Maybe I’ll do a post on that. In the end there are a few sources that contribute huge
portions of the data. GHCN daily is one of those. Since it’s daily data, it’s about as “raw” as you get. I think it may be helpful to folks to do an entire post on the various sources
used and how the final dataset is built. That’s a fair amount of work and I’ve already given months of time away writing and testing software that allows people to do that for themselves if they are truly interested.
All that said there is another initiative underway to compile another comprehensive dataset
http://www.surfacetemperatures.org/

June 24, 2012 11:00 pm

yes amino the link please..
link nanny, can you come and make amino link?

Carrick
June 24, 2012 11:04 pm

Amino:

If you want to talk cherry picking, as you did in a comment above where you told me to compare one area of data in BEST to the same area in another data set that’s fine. It is cherry picking. Not sure why you can’t see it is.

My final comment here, unless something interesting enough pops up.
Again, this is not cherry picking… comparing the same latitudes is an apples to apples comparison. (This allows us to compare how the algorithms changed the reconstruction, instead of how changes in geographical distribution changed the reconstruction.)

I think you are not open to any real discussion. It is clear you have your beliefs. You are not here to discuss but to tell us why you think people who don’t agree with you are wrong.

I am actually, anybody who knows me would tell you that. Don’t expect me to meekly agree that I’m wrong though, when you haven’t proven it or (for the last time) agree to inaccurate characterizations of my views, especially ones like the laundry list you’ve given above pulled apparently from thin air.
Bye.

Carrick
June 24, 2012 11:12 pm

Steven, as any astute observer will notice, BEST uses adjusted rather than raw data.
And as anybody who understands the issues with the data would recognize, properly adjusted data will give a more accurate picture of climate than unadjusted data. That doesn’t imply we take the adjustments carte blanche, but of course you haven’t. You’ve spent considerable effort (as has Zeke) understanding the effects of the adjustments on the global reconstructed temperature series.
I do not claim BEST uses raw data, and I would not claim that raw data is necessarily a better representation of climate than adjusted data. Because BEST is land only, I would never mistake it for a global temperature series, and as I’ve said, while it has some good ideas, its implementation has some details that need to be tested and possibly tweaked.

June 24, 2012 11:20 pm

EM
‘I noted one where I found 144 C for a station in the USA.
Now Steve is happy to say ~’but we can find that 144 C and change it to something valid’. I think “Hmmm… So we catch the 144 C, but do we catch the 40C that ought to be 38 C?””
wrong. To quote willis, where did I ever say that?
If I find 144C that value is DROPPED.
For the 40C that should be 38C? those errors can be caught as well. There is extensive literature on this where temperature series are “corrupted” with bogus values and then the algorithm is tested to see if it can find the error.

June 24, 2012 11:32 pm

Bruce
“The number 42 says nothing about climate. There is nothing unusual about our current climate. If anything the unusual part is how calm it appears and how small the fluctuations are from month to month.
Climate is chaotic and in many US states the last 5 years aren’t #1 or #2 and in some states it isn’t #3, or #4 either.”
The climate does not exist. Climate is a word people use to point to LONG TERM STATISTICS about weather.
“Global warming” is not spatially uniform, and we don’t expect it to be spatially uniform. We don’t expect the whole globe to warm up at the same time or cool down at the same time.
That is WHY the LIA doesn’t end on a given day. That is why it is more severe in some places and less severe in others. You might even find a spot today that for the past 10 years was the same temp as it was in 1850, but generally over space and time the planet has been warming.
Is there anything unusual about the warming we have seen? First, that’s an ill-posed question.
Second, it’s really not interesting. The question is: can we explain the warming we see?
“Natural variation” is not an explanation. It’s renaming the phenomena.
It’s the sun! is an explanation.
Its the sum of all forcing! is an explanation
Its not unusual, is not an explanation.
Its aerosols. is an explanation