The Big Valley: Altitude Bias in GHCN

Foreword: The focus of this essay is strictly altitude placement/change of GHCN stations. While challenge and debate of the topic is encouraged, please don’t let the discussion drift into other side issues. As noted in the conclusion, there remain two significant issues that have not been fully addressed in GHCN. I believe a focus on those issues (particularly UHI) will best serve to advance the science and understanding of what GHCN in its current form is measuring and presenting, post processing. – Anthony

Tibet valley, China. Image from Asiagrace.com

By Steven Mosher, Zeke Hausfather, and Nick Stokes

Recently on WUWT Dr. McKitrick raised several issues with regard to the quality of the GHCN temperature database. However, McKitrick does note that the methods of computing a global anomaly average are sound. That is essentially what Zeke Hausfather and I showed in our last WUWT post. Several independent researchers are able to calculate the Global Anomaly Average with very little difference between them.

GISS, NCDC, CRU, JeffId/RomanM, Tamino, ClearClimateCode,  Zeke Hausfather, Chad Herman, Ron Broberg,  Residual Analysis, and MoshTemp all generally agree. Given the GHCN data, the answer one gets about the pace of global warming is not in serious dispute. Whether one extrapolates as GISS does or not, whether one uses a least squares approach or a spatial averaging approach, whether one selects a 2 degree bin or a 5 degree bin, whether one uses an anomaly period of 1961-90 or 1953-1982, the answer is the same for virtually all practical purposes. Debates about methodology are either a distraction from the global warming issues at hand or they are specialist questions that entertain a few of us. Those specialist discussions may refine the answer or express our confidence in the result more explicitly, but the methods all work and agree to a high degree.

As we noted before, the discussion should therefore turn and remain focused on the data issues. How good is GHCN as a database and how serious are its shortcomings? As with any dataset, those of us who analyze data for a living look for several things. We look for errors, we look for bias, we look at the sampling characteristics, and we look at adjustments. Dr. McKitrick’s recent paper covers several topics relative to the makeup of, and changes in, GHCN temperature data. In particular he covers changes over time in the sampling of GHCN stations. He repeats a familiar note: over time the stations representing the temperature data set have changed. There is, as most people know, a fall off in stations reporting shortly after 1990 and then again in 2005. To be sure there are other issues that he raises as well. Those issues, such as UHI, will not be addressed here. Instead, the focus will be on one particular issue: altitude. We confine our discussion to that narrow point in order to remove misunderstandings and refocus the issue where it rightly belongs.

McKitrick writes:

Figure 1-8 shows the mean altitude above sea level in the GHCN record. The steady increase is consistent with a move inland of the network coverage, and also increased sampling in mountainous locations. The sample collapse in 1990 is clearly visible as a drop not only in numbers but also in altitude, implying the remote high-altitude sites tended to be lost in favour of sites in valley and coastal locations. This happened a second time in 2005. Since low-altitude sites tend to be more influenced by agriculture, urbanization and other land surface modification, the failure to maintain consistent altitude of the sample detracts from its statistical continuity.

There are several claims here.

  1. The increase in altitude is consistent with a move inland and out of valleys
  2. The increase in altitude is consistent with more sampling in mountainous locations.
  3. Low level sites tend to be influenced by agriculture, urbanization and other land use modifications

A simple study of the metadata available in the GHCN database shows that the stations that were dropped do not have the characteristics that McKitrick supposes. As Nick Stokes documents, the process of dropping stations is more related to dropping coverage in certain countries than to a direct effort to drop high altitude stations. McKitrick also gets the topography specifics wrong. He supposes that the drop in thermometers shifts the data out of mountainous inland areas into the valleys and low level coastal areas, areas dominated by urbanization and land use changes. That supposition is not entirely accurate, as a cursory look at the metadata shows.

There are two significant periods when stations are dropped: post 1990 and again in 2005, as Stokes shows below.

FIGURE 1: Station drop and average altitude of stations.

The decrease in altitude is not caused by a move into valleys, lowland and coastal areas. As the following figures show, the percentage of coastal stations is stable, mountainous stations are still represented and the altitude loss more likely comes from the move out of mountainous valleys.

A simple summary of the total inventory shows this:

ALL STATIONS    Count    Total    Percent
Coastal          2180     7280      29.95
Lake              443     7280       6.09
Inland           4657     7280      63.97

TABLE 1: Count of Coastal Stations

The greatest drop in stations occurs in the 1990-1995 period and the 2005 period, as shown above. McKitrick supposes that the drop in altitude means a heavier weighting for coastal stations. The data do not support this:

Dropped Stations 1990-95    Count    Total    Percent
Coastal                       487     1609      30.27
Lake                           86     1609       5.34
Inland                       1036     1609      64.39

Dropped Stations 2005-06    Count    Total    Percent
Coastal                       104     1109       9.38
Lake                           77     1109       6.94
Inland                        928     1109      83.68

TABLE 2: Count of Coastal Stations dropped
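
The percentages in Tables 1 and 2 follow directly from the counts. A minimal sketch of that arithmetic, using only the counts printed above (the function and variable names are ours, chosen for illustration):

```python
# Counts copied from Tables 1 and 2 above.
inventory = {"Coastal": 2180, "Lake": 443, "Inland": 4657}
dropped_1990_95 = {"Coastal": 487, "Lake": 86, "Inland": 1036}
dropped_2005_06 = {"Coastal": 104, "Lake": 77, "Inland": 928}


def shares(counts):
    """Return each category's percentage of the group total, rounded to 2 places."""
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 2) for name, n in counts.items()}


for label, counts in [
    ("All stations", inventory),
    ("Dropped 1990-95", dropped_1990_95),
    ("Dropped 2005-06", dropped_2005_06),
]:
    print(label, shares(counts))
```

If the dropout had favoured the coast, the coastal share of dropped stations would sit well below the coastal share of the full inventory. In 1990-95 the two shares are nearly equal, and in 2005-06 the stations dropped were overwhelmingly inland.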

The great march of the thermometers was not a trip to the beach. Neither was the drop in altitude the result of losing a higher percentage of  “mountainous” stations.

FIGURE 2: Distribution of Altitude for the entire GHCN Inventory

Minimum   1st Qu   Median   Mean    3rd Qu   Max      NA
-224.0    38.0     192.0    419.9   533.0    4670     142

TABLE 3: descriptive statistics for Altitude of the entire dataset
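
For readers who want to reproduce this kind of summary, here is a minimal sketch. It assumes altitudes arrive as a list of metres with `None` marking a missing value; the sample values are made up for illustration and are not GHCN data. The "inclusive" quantile method is our choice, not something the tables above specify.

```python
import statistics


def altitude_summary(altitudes):
    """Six-number summary plus a count of missing values (None)."""
    known = [a for a in altitudes if a is not None]
    # "inclusive" interpolates between the observed min and max.
    q1, median, q3 = statistics.quantiles(known, n=4, method="inclusive")
    return {
        "min": min(known),
        "q1": q1,
        "median": median,
        "mean": statistics.fmean(known),
        "q3": q3,
        "max": max(known),
        "na": len(altitudes) - len(known),
    }


# Hypothetical altitudes in metres, one of them missing.
sample = [12.0, 40.0, 150.0, 600.0, 2500.0, None]
print(altitude_summary(sample))
```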

We can assess the claim about the march of thermometers down the mountains in two ways. First, by looking at the actual distribution of dropped stations.

FIGURE 3: Distribution of altitude for stations dropped in 1990-95

Minimum   1st Qu   Median   Mean    3rd Qu   Max      NA
-21.0     40.0     183.0    441     589.2    4613.0   29

TABLE 4:  Descriptive statistics for the Altitude of dropped stations

The character of stations dropped in the 2005 time frame is slightly different. That distribution is depicted below.

FIGURE 4: Distribution of altitude for stations dropped in 2005-06

Minimum   1st Qu   Median   Mean    3rd Qu   Max      NA
-59       143.0    291.0    509.7   681.0    2763.0   0

TABLE 5:  Descriptive statistics for the Altitude of dropped stations 2005-06

The mean altitude of those dropped is slightly higher than that of the average station. That hardly supports the contention of thermometers marching out of the mountains. We can put this issue to rest with the following observation from the metadata. GHCN metadata captures the topography surrounding the stations. There are four classifications, FL, HI, MT and MV: flat, hilly, mountain and mountain valley. The table below hints at what was unique about the dropout.

Type              Entire Dataset   Dropped 1990-95   Dropped 2005-06   Total of two major movements
Flat              2779             455 (16%)         504 (23%)         959 (43%)
Hilly             3006             688 (23%)         447 (15%)         1135 (38%)
Mountain          61               15 (25%)          3 (5%)            18 (30%)
Mountain Valley   1434             451 (31%)         155 (11%)         606 (42%)

TABLE 6: Station dropout by topography type

There wasn’t a shift into valleys, as McKitrick supposes; rather, mountain valley sites were dropped. Thermometers left the flatlands and the mountain valleys. That resulted in a slight decrease in the overall altitude.

That brings us to McKitrick’s third critical claim. McKitrick claims that the dropping of thermometers overweights places more likely to suffer from urbanization and differential land use: “Low level sites tend to be influenced by agriculture, urbanization and other land use modifications.” The primary concern that Dr. McKitrick voices is that the statistical integrity of the data may have been compromised. That claim needs to be turned into a testable hypothesis. What exactly has been compromised? We can think of two possible concerns. The first concern is that by dropping higher altitude mountain valley stations one is dropping stations that are colder. Since temperature decreases with altitude this would seem to be a reasonable concern. However, it is not. Some people make this claim, but McKitrick does not. He doesn’t because he is aware that the anomaly method prevents this kind of bias. When we create a global anomaly we prevent this kind of bias from entering the calculation by expressing each station’s measurements relative to the mean of that station. Thus, a station located at 4000m may be at -5C, but if that station is always at -5C its anomaly will be zero. Likewise, a station at sea level in Death Valley that is constantly 110F will also have an anomaly of zero. Anomaly captures the departure from the mean of that station.

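What this means is that as long as high altitude stations warm or cool at the same rate as low altitude stations, removing them or adding them will not bias the result. The second possible concern, then, is that stations at different altitudes warm or cool at different rates.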
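
A minimal sketch of the anomaly idea, assuming two made-up stations (not GHCN records) that sit at very different absolute temperatures but warm at the same rate. The function name, the 1961-90 base period and the 0.02 degrees per year warming are illustrative assumptions.

```python
from statistics import fmean


def anomalies(series, base_years):
    """series maps year -> temperature; returns year -> departure from the
    station's own mean over the base period."""
    baseline = fmean(series[y] for y in base_years if y in series)
    return {year: temp - baseline for year, temp in series.items()}


years = range(1961, 1991)
# Hypothetical stations: a cold mountain site and a warm valley site,
# both warming by 0.02 degrees per year.
mountain = {y: -5.0 + 0.02 * (y - 1961) for y in years}
valley = {y: 25.0 + 0.02 * (y - 1961) for y in years}

a_mountain = anomalies(mountain, years)
a_valley = anomalies(valley, years)

# Average anomaly with both stations, and with the mountain station dropped.
both = {y: (a_mountain[y] + a_valley[y]) / 2 for y in years}
valley_only = a_valley

print(both[1990], valley_only[1990])
```

The 30 degree gap in absolute temperature disappears once each station is measured against its own mean, so dropping the cold station leaves the average anomaly unchanged. Only a difference in the rate of warming could move the result.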

To answer the question of whether dropping or adding higher altitude stations impacts the trend we have several analytical approaches. First, we could add back in stations. But we can’t add back in GHCN stations that were discontinued. The alternative is to add stations from other databases. Those studies indicate that adding additional stations does not change the trends:

http://www.yaleclimatemediaforum.org/2010/08/an-alternative-land-temperature-record-may-help-allay-critics-data-concerns/

http://moyhu.blogspot.com/2010/07/using-templs-on-alternative-land.html

http://moyhu.blogspot.com/2010/07/arctic-trends-using-gsod-temperature.html

http://moyhu.blogspot.com/2010/07/revisiting-bolivia.html

http://moyhu.blogspot.com/2010/07/global-landocean-gsod-and-ghcn-data.html

The other approach is to randomly remove more stations from GHCN and measure the effect. If we fear that GHCN has biased the sample by dropping higher altitude stations, we can drop more stations and measure the effect. There are two ways to do this: a Monte Carlo approach and an approach that divides the existing data into subsets.

Nick Stokes has conducted the Monte Carlo experiments. In his approach stations are randomly removed  and global averages are recomputed. Stations were removed based on a randomization approach that preferentially removed high altitude stations. This test gives us an estimate of the Standard Error as well.

Period          Trend of All   Re-Sampled   s.d.
1900-2009       0.0731         0.0723       0.00179
1979-2009       0.2512         0.2462       0.00324
Mean Altitude   392m           331m

Table 7 Monte Carlo test of altitude sensitivity

This particular test consists of selecting all the stations whose series end after 1990. There are 4814 such stations. The sensitivity to altitude reduction was tested by randomly removing higher altitude stations. The results indicate little to no interaction between altitude and temperature trend in the very stations whose series end after 1990.
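
A minimal sketch of this kind of altitude-weighted resampling, assuming synthetic stations rather than GHCN data. It is not Stokes's code: the station list, the drop-probability rule (`scale_m`, `max_drop`) and the shortcut of averaging per-station trends instead of recomputing a gridded global average are all assumptions made for illustration.

```python
import random
from statistics import fmean, stdev


def thin_by_altitude(stations, rng, scale_m=1000.0, max_drop=0.8):
    """Drop each station with a probability that rises with its altitude."""
    kept = []
    for altitude, trend in stations:
        p_drop = max_drop * min(max(altitude, 0.0) / scale_m, 1.0)
        if rng.random() >= p_drop:
            kept.append((altitude, trend))
    return kept


def monte_carlo(stations, trials=200, seed=0):
    """Return (mean trend, s.d. of trend, mean altitude) over all trials."""
    rng = random.Random(seed)
    trends, altitudes = [], []
    for _ in range(trials):
        kept = thin_by_altitude(stations, rng)
        trends.append(fmean(t for _, t in kept))
        altitudes.append(fmean(a for a, _ in kept))
    return fmean(trends), stdev(trends), fmean(altitudes)


# Synthetic inventory: (altitude in metres, trend), with the trend drawn
# independently of altitude.
gen = random.Random(42)
stations = [(gen.expovariate(1 / 400.0), gen.gauss(0.25, 0.1)) for _ in range(500)]

all_trend = fmean(t for _, t in stations)
all_altitude = fmean(a for a, _ in stations)
mc_trend, mc_sd, mc_altitude = monte_carlo(stations)

print(f"all stations: trend {all_trend:.4f}, mean altitude {all_altitude:.0f} m")
print(f"re-sampled:   trend {mc_trend:.4f} (s.d. {mc_sd:.4f}), mean altitude {mc_altitude:.0f} m")
```

Because the synthetic trends are independent of altitude by construction, the re-sampled trend stays close to the full-sample trend while the mean altitude falls. The real test asks whether the GHCN data behave the same way.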

The other approach, dividing the sample, was taken in two different ways by Zeke Hausfather and Steven Mosher. Hausfather approached the problem using a paired approach. Grid cells are selected for processing if they have stations both above and below 300m. This eliminates cells that are represented by a single station. Series are then constructed for the stations that lie above 300m and below 300m.

Period      Elevation > 300m   Elevation < 300m
1900-2009   .04                .05
1960-2009   .23                .19
1978-2009   .34                .28

Table 8. Comparison of trend versus altitude for paired station testing
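
The cell-selection step of the paired approach can be sketched as follows. This is an illustration, not Hausfather's code: the 5 degree cell size, the tuple layout of a station and the decision to count a station at exactly 300m as "low" are assumptions, and the three stations are made up.

```python
from collections import defaultdict


def cell_of(lat, lon, size=5.0):
    """Index of the size-by-size degree grid cell containing (lat, lon)."""
    return (int(lat // size), int(lon // size))


def paired_cells(stations, threshold_m=300.0, size=5.0):
    """stations: iterable of (station_id, lat, lon, altitude_m).

    Returns {cell: (high_ids, low_ids)} for cells that contain at least one
    station above the threshold and at least one at or below it."""
    high, low = defaultdict(list), defaultdict(list)
    for station_id, lat, lon, altitude in stations:
        group = high if altitude > threshold_m else low
        group[cell_of(lat, lon, size)].append(station_id)
    return {cell: (high[cell], low[cell]) for cell in high.keys() & low.keys()}


# Hypothetical stations: A and B share a cell, C is alone in another.
stations = [
    ("A", 46.2, 7.1, 1500.0),
    ("B", 47.9, 8.8, 250.0),
    ("C", 10.0, 10.0, 50.0),
]
print(paired_cells(stations))
```

Only the cell holding A and B survives, so the high and low series are built from the same geographic regions and differ mainly in elevation.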

FIGURE 5: Comparison of temperature Anomaly for above mean and below mean stations

This test indicates that higher elevation stations tend to see higher rates of warming rather than lower rates of warming. Thus, dropping them does not bias the temperature record upward. The concern lies in the other direction. If anything, the evidence points to this: dropping higher altitude stations post 1990 has led to a small underestimation of the warming trend.

Finally, Mosher, extending the work of Broberg, tested the sensitivity to altitude by dividing the existing sample in the following way, by raw altitude and by topography.

  1. A series containing all stations.
  2. A series of lower altitude stations Altitude < 200m
  3. A series of higher altitude stations Altitude >300m
  4. All Stations in Mountain Valleys
  5. A series of stations at very high altitude. Altitude >400m

The results of that test are shown below

FIGURE 6 Global anomaly.  Smoothing performed for display purpose only with a 21 point binomial  filter
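
For reference, a 21 point binomial filter is a centered weighted moving average whose weights are the binomial coefficients C(20, k) divided by 2^20. A minimal sketch follows; how the ends of the series are handled is our assumption (here only positions with a full window are returned), since the figure does not say.

```python
from math import comb


def binomial_weights(points=21):
    """Binomial coefficients C(points - 1, k), normalised to sum to 1."""
    n = points - 1
    return [comb(n, k) / 2 ** n for k in range(points)]


def smooth(values, points=21):
    """Centered binomial smoothing; returns only full-window positions."""
    weights = binomial_weights(points)
    half = points // 2
    return [
        sum(w * values[i - half + j] for j, w in enumerate(weights))
        for i in range(half, len(values) - half)
    ]


weights = binomial_weights()
print(max(weights))                 # the centre point carries the largest weight
print(len(smooth([1.0] * 30)))      # 30 inputs leave 10 full windows
```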

The purple series is the highest altitude stations. The red series is the lower elevation stations. Green is the mountain valley stations. A cursory look at the “trend” indicates that the higher elevation stations warm slightly faster than the lower elevation stations, confirming Hausfather. Dropping higher elevation stations, if it has any effect whatsoever, works to lower the average. Stations at lower altitudes tend to warm less rapidly than stations at higher elevations. So, quite the opposite of what people assume, the dropping of higher altitude stations is more likely to underestimate the warming than to overestimate it.

Conclusion:

The distribution of altitude does change with time in GHCN v2.mean data. That change does not signal a march of thermometers to places with higher rates of warming. The decrease in altitude is not associated with a move toward or away from coasts. The decrease is not clearly associated with a move away from mountainous regions and into valleys, but rather with a movement out of mountain valley and flatland regions. Yet, mountain valleys do not warm or cool in any differential manner. Changing altitude does not bias the final trends in any appreciable way.

Whatever differential characteristics are associated with higher elevation, a change in temperature trend is not clearly or demonstrably one of them. For now, we have no evidence whatsoever that marching thermometers up and down hills makes any contribution to an overestimation of the warming trend.

Dr. McKitrick presented a series of concerns with GHCN. We have eliminated the concern over changes in the distribution of altitude. That merits a correction to his paper. The concerns he raised about latitude, airports and UHI will be addressed in forthcoming pieces. Given the preliminary work done on airports and latitude to date, we can confidently say that the entire debate will come down to two basic issues: UHI and adjustments. The issues over latitude changes and sampling at airports will fold into those discussions. So, here is where the debate stands. The concerns that people have had about methodology have been addressed. As McKitrick notes, the various independent methods get the same answers. The concern about altitude bias has been addressed. As we’ve argued before, the real issue with temperature series is the metadata, its related microsite and UHI issues and adjustments made prior to entry in the GHCN database.

Special thanks to Ron Broberg for editorial support.

References:

A Critical Review of Global Surface Temperature Data Products. Ross McKitrick, Ph.D. July 26, 2010


161 Comments
Jeff
August 19, 2010 10:53 am

I think it is clear that the adjustments done to the raw data are where the devil in the details rests. I have yet to see a reasonable UHI adjustment in the dozens of stations in GISS that I have looked into. In all too many cases I have seen older records adjusted down (?) and newer records adjusted down by lesser amounts. Very few locations have experienced de-urbanization … the UHI adjustments should be increasing from old to new, not the other way around.

BBD
August 19, 2010 10:58 am

Minor typo, para 4 sentence 2:
‘How good is GHCN as a database and how serious are it’s shortcomings?’
‘it’s’ should be ‘its’. Wouldn’t normally bother but the quality of the writing is good enough to make the error stand out.
I’m still reading, but with the Nepal business fresh in mind I am most interested in the subject of this essay (which I have been awaiting since Mr Mosher dropped hints in the comments over at Lucia’s).
My thanks to all three authors for their ongoing efforts, both for their own sake and for the example they set to others, here and elsewhere.
Dominic
[Thanx, fixed. ~dbs, mod.]

August 19, 2010 11:02 am

C James,
For better or worse, pre-1970s GHCN v2.mean is by and large the only “raw” data available to use. Post-1970s we have been playing around with using GSOD/ISH and other alternative datasets, though efforts so far indicate that over broad geographic regions they give results similar to GHCN.

Slabadang
August 19, 2010 11:03 am

You can easily recognize an honest approach when you read it!
I’m catching up on details and I really appreciate the “tone” and the willing invitation to get to the bottom of what’s “almost settled” and what’s not. It’s obvious who will take responsibility, or not, to gain and deserve trust! There are some aspects and issues in the comments above that I think are interesting to address. Is it “correlation” or “coincidence” that the temperature record year was the same year in which a big drop in high altitude temps occurred?

Admin
August 19, 2010 11:19 am

“Skeptics should not hope for more than a .15C adjustment”
Steve, that’s premature.

Bill Illis
August 19, 2010 11:24 am

The lapse rate is 6.5C per km.
The average temperature in a grid box, for example, would therefore increase by 0.65C for each decline in the average altitude of just 100 metres. Your numbers are showing changes of that magnitude.
Now to the extent that each individual station is measured/calculated based on its individual anomaly only, this shouldn’t matter. But if it isn’t done strictly by station, anomaly only, it will make a very large difference.
[On the other hand, there is also some evidence that temperatures are increasing faster at the surface than higher up in the troposphere and, therefore, the lapse rate profile is also changing by altitude which will also influence the trend, even in a strict station anomaly only calculation].

Rex from NZ
August 19, 2010 11:26 am

Can someone clarify two things for me: (1) How many times a day is the temperature recorded for the stations (in general), and what has been used to determine the frequency and choice of time, and (2) How is it known, or established, or agreed, as to what area of land is represented by each station. Surely this latter is critical, because the area the station represents needs to be known so that the proper weight can be applied when working out the mean. It is the mean temperature per sq kilometre that is surely the important thing, not just the mean per se.
Or am I way off the track here?

Tim
August 19, 2010 11:27 am

where did Table 6 come from?
Why do these simple to compute figures appear wrong? – and apparently wrong in a way we’ve come to expect!

BillD
August 19, 2010 11:47 am

BillyBob says:
August 19, 2010 at 10:21 am
” Given the GHCN data, the answer one gets about the pace of global warming is not in serious dispute. ”
Considering that all of the GHCN anomaly calculators use the mean:
1) The min could be going up
2) The max could be going up
3) Or it could be a combination of both
Having looked at the raw GHCN data, I can say the max is not going up. It is the min.
Therefore it is UHI.
BillyBob:
You are correct that the min is going up much faster than the max. One region where this is especially strong is in the Swiss Alps. One problem with your conclusion is that greenhouse gases are expected to reduce night-time cooling and to have a greater effect on the min than on the max temperature. In the Arctic and in the mountains, the greenhouse effect seems a more likely explanation than UHI for stronger increases in the min compared to the max temperature.

latitude
August 19, 2010 11:50 am

“The greatest drop in stations occurs in the 1990-1995 period”
It’s about 1990-1993, which is exactly where the hockey stick takes off.
I don’t think it’s coincidence at all and even though you guys did a “random” test,
I don’t think dropping the stations was random.
I’d still like to see someone look at the ‘trends’ from the stations that were dropped.
I would be willing to bet there was a reason for dropping those stations.
There is just too much coincidence that many stations were dropped,
and immediately after that, we had catastrophic unprecedented global warming.

david
August 19, 2010 12:24 pm

Mosh wrote:
“I ask you to step on scale in 1950. You weigh 175 lbs.
I measure you every year until 2010. By 2010 you weigh 225 lbs.”
No, by 2010 he weighs 225 lbs after his weight has been adjusted. We don’t know what he actually weighs in 2010 – that’s the problem.

GeoChemist
August 19, 2010 12:33 pm

“to disprove a case that hasn’t been made”….Now there’s the scientific method in action….nothing like a nice bias to help design your study……and besides, you are studying surface temperature measurements which is a crappy metric for “global warming”. And I used to think you were open-minded.

latitude
August 19, 2010 1:10 pm

Steve, you’re going to have to talk English, without inflections, to this biologist.
I didn’t question the data. I accepted the data.
I asked you a question about the collection of the data.
You can trust the data, but not trust the people that collect it, especially when they drop this many stations for no apparent reason.
I would like to see someone look at each station that was dropped, each individual station, and see what the trends were for that individual station.
For my money, it is too much of a coincidence that right after those stations were dropped, we had catastrophic unprecedented global warming.
Does not pass the sniff test….

pyromancer76
August 19, 2010 1:17 pm

I have tried to follow Anthony’s admonition: keep to altitude placement and change of GHCN stations. Sorry, can’t. Too much “hail fellow well met” attitude on the part of those who want climate pseudo-scientists to be scientists. They aren’t, and under the current conditions of their pseudo-field never will be.
S. Mosher writes, “GISS, NCDC, CRU, JeffId/RomanM, Tamino, ClearClimateCode, Zeke Hausfather, Chad Herman, Ron Broberg, Residual Analysis, and MoshTemp all generally agree. Given the GHCN data, the answer one gets about the pace of global warming is not in serious dispute.”
Give us back the RAW DATA and openly show all analysis and then we can talk. Until then, Garbage In, Garbage Out. The current “climate” data has no scientific use since it has not been gathered scientifically. And no claims that using anomalies wipes out significant problems are valid. In my opinion, politeness is turning into cover for lying.
(By the way, when has there been a serious dispute that the climate has warmed since the Little Ice Age?)
When I understand the issues on Climate Audit (only sometimes) and I read the comments by Stephen Mosher, I am impressed. Not so today by any of the current arguments. (Hope this posts properly; I hit a key and the formatting changed)

GeoChemist
August 19, 2010 1:18 pm

You misunderstood my comment as I was referring to your case about airport testing where you seem to think you already know the answer before even testing a hypothesis. Will you admit it if you are wrong? But my main complaint is that you are equating the temperature trend estimates with “the pace of global warming”. I am sorry but no matter how you slice and dice the surface trends it will not provide sufficient rationale to make conclusions about CO2 driven AGW. You equated the two in the opening paragraph using the term “global warming” instead of surface temperature anomaly. The temperature anomalies, even if not contaminated by UHI or biased in any way, are influenced regionally by a variety of first-order forcings (e.g. land use) and thus are not equivalent to the effect of CO2. So my bitch is the claim that surface temperature anomalies equal global warming.

899
August 19, 2010 1:21 pm

All of that aside, I have to enquire: Why drop stations at all?
You see? As I see things, with fewer stations there arrives that neat ability to extrapolate temperatures to a wider area, even when those extrapolations are grossly inaccurate.
What’s worse? Those dropped stations are seemingly completely ignored where —if one were to evaluate matters on a quality level— those dropped stations could well be used to verify the extrapolations.
You say you don’t want to discuss UHI, but isn’t that really the big elephant in the room which you and others are seemingly going out of your way to ignore?
In your attempt to narrowly focus on only one aspect, you instead create even more doubt in people’s minds regarding your motivations for doing such.

ZZZ
August 19, 2010 1:24 pm

That the min temperatures are increasing (while the max temperatures are not increasing) really undermines the alarmists’ claims that global warming is bad for us. I don’t think even wild animals are going to miss the lower night-time low temperatures, and it’s almost certainly good news for humanity. We can expect longer growing seasons, more places to live comfortably, even less fuel burned to stay warm overnight in cold climates. Of course that last suggested benefit is doubtful if it’s all a UHI effect — stop burning fuel overnight and the lower overnight min temperatures will tend to return.

latitude
August 19, 2010 1:25 pm

You do realize until you un-moderate my last post, and post it, it looks like you’re talking to yourself. 😉
Reply: Mosh doesn’t moderate, but he has editing privileges which allows him to see unapproved comments even if he doesn’t realize that. ~ ctm

jorgekafkazar
August 19, 2010 1:28 pm

Steven Mosher says: “…5.UHI: may not be possible. UHI: Depends ENTIRELY on the definition of Rural, that is metadata…”
You appear to be dismissing the UHI issue with a wave of your hand, here. But there must be some approximate way to quantify “non-rural” or to at least identify the location parameters that bias urban-measured temperatures.
But I’m not buying any of this. The entire notion of a global temperature as discussed here is nonsense. Any system that ignores atmospheric enthalpy and fails to account for the 1000 times larger oceanic heat sink is an exercise in futility.