The Big Valley: Altitude Bias in GHCN

Foreword: The focus of this essay is strictly the altitude placement/change of GHCN stations. While challenge and debate of the topic are encouraged, please don’t let the discussion drift into other side issues. As noted in the conclusion, there remain two significant issues that have not been fully addressed in GHCN. I believe a focus on those issues (particularly UHI) will best serve to advance the science and understanding of what GHCN in its current form is measuring and presenting, post processing. – Anthony

Tibet valley, China. Image from Asiagrace.com

By Steven Mosher, Zeke Hausfather, and Nick Stokes

Recently on WUWT, Dr. McKitrick raised several issues regarding the quality of the GHCN temperature database. However, McKitrick does note that the methods of computing a global anomaly average are sound. That is essentially what Zeke Hausfather and I showed in our last WUWT post: several independent researchers are able to calculate the global anomaly average with very little difference between them.

GISS, NCDC, CRU, JeffId/RomanM, Tamino, ClearClimateCode, Zeke Hausfather, Chad Herman, Ron Broberg, Residual Analysis, and MoshTemp all generally agree. Given the GHCN data, the answer one gets about the pace of global warming is not in serious dispute. Whether one extrapolates as GISS does or not, whether one uses a least squares approach or a spatial averaging approach, whether one selects a 2 degree bin or a 5 degree bin, whether one uses an anomaly period of 1961-90 or 1953-1982, the answer is the same for virtually all practical purposes. Debates about methodology are either a distraction from the global warming issues at hand or they are specialist questions that entertain a few of us. Those specialist discussions may refine the answer or express our confidence in the result more explicitly, but the methods all work and agree to a high degree.
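For readers who want to see the mechanics rather than take our word for it, here is a minimal Python sketch of the spatial-averaging family of methods (all data invented; this is not the code of any of the groups named above). Station anomalies are binned into latitude/longitude cells, and the cell means are combined with cosine-of-latitude area weights:

```python
import math
from collections import defaultdict

def global_anomaly(stations, bin_deg=5.0):
    """Area-weighted global mean of station anomalies.

    stations: list of (lat, lon, anomaly) tuples for one month/year.
    Stations are averaged within bin_deg x bin_deg cells, then cells
    are combined weighted by cos(latitude) to approximate cell area.
    """
    cells = defaultdict(list)
    for lat, lon, anom in stations:
        key = (math.floor(lat / bin_deg), math.floor(lon / bin_deg))
        cells[key].append(anom)

    wsum = total = 0.0
    for (ilat, _), anoms in cells.items():
        cell_lat = (ilat + 0.5) * bin_deg        # cell-centre latitude
        w = math.cos(math.radians(cell_lat))     # ~ relative cell area
        total += w * (sum(anoms) / len(anoms))
        wsum += w
    return total / wsum

# Invented example: three stations, one month
print(global_anomaly([(51.5, -0.1, 0.4), (52.2, 0.1, 0.5), (-33.9, 151.2, 0.2)]))
```

Swapping the bin size or the weighting scheme changes the details but, as the list of groups above shows, not the answer in any material way.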

As we noted before, the discussion should therefore turn and remain focused on the data issues. How good is GHCN as a database and how serious are its shortcomings? As with any dataset, those of us who analyze data for a living look for several things. We look for errors, we look for bias, we look at the sampling characteristics, and we look at adjustments. Dr. McKitrick’s recent paper covers several topics relative to the makeup of, and changes in, GHCN temperature data. In particular he covers changes over time in the sampling of GHCN stations. He repeats a familiar note: over time the stations representing the temperature data set have changed. There is, as most people know, a fall-off in stations reporting shortly after 1990 and then again in 2005. To be sure, there are other issues that he raises as well. Those issues, such as UHI, will not be addressed here. Instead, the focus will be on one particular issue: altitude. We confine our discussion to that narrow point in order to remove misunderstandings and refocus the issue where it rightly belongs.

McKitrick writes:

Figure 1-8 shows the mean altitude above sea level in the GHCN record. The steady increase is consistent with a move inland of the network coverage, and also increased sampling in mountainous locations. The sample collapse in 1990 is clearly visible as a drop not only in numbers but also in altitude, implying the remote high-altitude sites tended to be lost in favour of sites in valley and coastal locations. This happened a second time in 2005. Since low-altitude sites tend to be more influenced by agriculture, urbanization and other land surface modification, the failure to maintain consistent altitude of the sample detracts from its statistical continuity.

There are several claims here.

  1. The increase in altitude is consistent with a move inland and out of valleys.
  2. The increase in altitude is consistent with more sampling in mountainous locations.
  3. Low-level sites tend to be influenced by agriculture, urbanization, and other land use modifications.

A simple study of the metadata available in the GHCN database shows that the stations that were dropped do not have the characteristics that McKitrick supposes. As Nick Stokes documents, the process of dropping stations is more related to dropping coverage in certain countries than to a direct effort to drop high-altitude stations. McKitrick also gets the topography specifics wrong. He supposes that the drop in thermometers shifts the data out of mountainous inland areas into the valleys and low-level coastal areas, areas dominated by urbanization and land use changes. That supposition is not entirely accurate, as a cursory look at the metadata shows.

There are two significant periods when stations are dropped: post-1990 and again in 2005, as Stokes shows below.

FIGURE 1: Station drop and average altitude of stations.

The decrease in altitude is not caused by a move into valleys, lowland, and coastal areas. As the following figures show, the percentage of coastal stations is stable, mountainous stations are still represented, and the altitude loss more likely comes from the move out of mountain valleys.

A simple summary of the total inventory shows this:

ALL STATIONS   Count   Total   Percent
Coastal         2180    7280     29.95
Lake             443    7280      6.09
Inland          4657    7280     63.97

TABLE 1: Counts of coastal, lake, and inland stations, entire GHCN inventory

The greatest drops in station counts occur in the 1990-1995 period and around 2005, as shown above. McKitrick supposes that the drop in altitude means a heavier weighting for coastal stations. The data do not support this.

Dropped 1990-95   Count   Total   Percent
Coastal             487    1609     30.27
Lake                 86    1609      5.34
Inland             1036    1609     64.39

Dropped 2005-06   Count   Total   Percent
Coastal             104    1109      9.38
Lake                 77    1109      6.94
Inland              928    1109     83.68

TABLE 2: Counts of coastal, lake, and inland stations dropped in each period

The great march of the thermometers was not a trip to the beach. Neither was the drop in altitude the result of losing a higher percentage of “mountainous” stations.

FIGURE 2: Distribution of Altitude for the entire GHCN Inventory

Minimum   1st Qu.   Median     Mean   3rd Qu.     Max.   NA's
 -224.0      38.0    192.0    419.9     533.0   4670.0    142

TABLE 3: Descriptive statistics for the altitude (m) of the entire dataset

We can assess the claim about the march of thermometers down the mountains in two ways: first, by looking at the actual distribution of altitudes among the dropped stations; second, by looking at the topography metadata.

FIGURE 3: Distribution of altitude for stations dropped in 1990-95

Minimum   1st Qu.   Median     Mean   3rd Qu.     Max.   NA's
  -21.0      40.0    183.0    441.0     589.2   4613.0     29

TABLE 4: Descriptive statistics for the altitude of stations dropped in 1990-95

The character of the stations dropped in the 2005 time frame is slightly different. That distribution is depicted below.

FIGURE 4: Distribution of altitude for stations dropped in 2005-06

Minimum   1st Qu.   Median     Mean   3rd Qu.     Max.   NA's
  -59.0     143.0    291.0    509.7     681.0   2763.0      0

TABLE 5: Descriptive statistics for the altitude of stations dropped in 2005-06

The mean altitude of the dropped stations is slightly higher than that of the average station. That hardly supports the contention of thermometers marching out of the mountains. We can put this issue to rest with the following observation from the metadata. GHCN metadata captures the topography surrounding each station. There are four classifications, FL, HI, MT, and MV: flat, hilly, mountain, and mountain valley. The table below hints at what was unique about the dropout.

Type             Entire Dataset   Dropped 1990-95   Dropped 2005-06   Total of the two drops
Flat                       2779         455 (16%)         504 (23%)              959 (43%)
Hilly                      3006         688 (23%)         447 (15%)             1135 (38%)
Mountain                     61          15 (25%)            3 (5%)               18 (30%)
Mountain Valley            1434         451 (31%)         155 (11%)              606 (42%)

TABLE 6: Station drop-out by topography type

There wasn’t a shift into valleys, as McKitrick supposes; rather, mountain valley sites were dropped. Thermometers left the flatlands and the mountain valleys. That resulted in a slight decrease in the overall altitude.

That brings us to McKitrick’s third critical claim: that the dropping of thermometers overweights places more likely to suffer from urbanization and differential land use. “Low level sites tend to be influenced by agriculture, urbanization and other land use modifications.” The primary concern that Dr. McKitrick voices is that the statistical integrity of the data may have been compromised. That claim needs to be turned into a testable hypothesis. What exactly has been compromised? We can think of two possible concerns. The first is that by dropping higher-altitude mountain valley stations one is dropping stations that are colder. Since temperature decreases with altitude this would seem to be a reasonable concern. However, it is not. Some people make this claim, but McKitrick does not, because he is aware that the anomaly method prevents this kind of bias. When we create a global anomaly we prevent this kind of bias from entering the calculation by scaling the measurements of each station by the mean of that station. Thus, a station located at 4000m may be at -5C, but if that station is always at -5C its anomaly will be zero. Likewise, a station below sea level in Death Valley that is constantly 110F will also have an anomaly of zero. The anomaly captures the departure from the mean of that station.

What this means is that as long as high altitude stations warm or cool at the same rate as low altitude stations, removing them or adding them will not bias the result.
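A minimal sketch of the per-station anomaly arithmetic, with invented numbers, makes the point concrete: a permanently cold station and a permanently hot one both contribute anomalies of zero.

```python
def station_anomalies(temps, years, base_years):
    """Departures of a station's temperatures from its own base-period mean."""
    base = [t for y, t in zip(years, temps) if y in base_years]
    mean = sum(base) / len(base)
    return [t - mean for t in temps]

years = list(range(1961, 1991))
base = set(range(1961, 1991))
high_station = [-5.0] * 30      # 4000 m station, always -5 C
hot_station  = [43.0] * 30      # below-sea-level station, always ~110 F (43 C)

# Both stations are flat, so both anomaly series are all zeros:
print(station_anomalies(high_station, years, base))
print(station_anomalies(hot_station,  years, base))
```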

To answer the question of whether dropping or adding higher-altitude stations impacts the trend, we have several analytical approaches. First, we could add stations back in. But we can’t add back GHCN stations that were discontinued; the alternative is to add stations from other databases. Those studies indicate that adding additional stations does not change the trends:

http://www.yaleclimatemediaforum.org/2010/08/an-alternative-land-temperature-record-may-help-allay-critics-data-concerns/

http://moyhu.blogspot.com/2010/07/using-templs-on-alternative-land.html

http://moyhu.blogspot.com/2010/07/arctic-trends-using-gsod-temperature.html

http://moyhu.blogspot.com/2010/07/revisiting-bolivia.html

http://moyhu.blogspot.com/2010/07/global-landocean-gsod-and-ghcn-data.html

The other approach is to randomly remove more stations from GHCN and measure the effect. If we fear that GHCN has biased the sample by dropping higher-altitude stations, we can drop more stations and measure the effect. There are two ways to do this: a Monte Carlo approach, and an approach that divides the existing data into subsets.

Nick Stokes has conducted the Monte Carlo experiments. In his approach stations are randomly removed and global averages are recomputed, with the randomization preferentially removing high-altitude stations. This test gives us an estimate of the standard error as well.
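The details of Stokes’ randomization are in his posts; a rough Python sketch of the general idea (the helper names and the particular altitude weighting are our invention, for illustration only) looks like this:

```python
import random
import statistics

def resample_trend(stations, compute_trend, drop_frac=0.3, trials=100, seed=0):
    """Monte Carlo sensitivity test: preferentially drop high-altitude stations.

    stations: list of dicts with at least an 'alt' key (metres).
    compute_trend: stand-in for whatever global-averaging code is under test;
    it maps a station list to a single trend number.
    """
    rng = random.Random(seed)
    max_alt = max(s["alt"] for s in stations)
    trends = []
    for _ in range(trials):
        kept = [s for s in stations
                # drop probability rises with altitude (0.5x to 1.5x drop_frac)
                if rng.random() > drop_frac * (0.5 + s["alt"] / max_alt)]
        if kept:
            trends.append(compute_trend(kept))
    return statistics.mean(trends), statistics.stdev(trends)

# Toy usage: the "trend" here is just the mean altitude of the kept sample,
# to exercise the machinery; real use would pass a temperature-trend function.
demo = [{"alt": a} for a in (10, 50, 200, 500, 800, 1500, 3000)]
print(resample_trend(demo, lambda st: sum(s["alt"] for s in st) / len(st)))
```

The spread of the recomputed trends over many trials gives the standard deviation reported in Table 7.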

Period          Trend of All   Re-Sampled      s.d.
1900-2009             0.0731       0.0723   0.00179
1979-2009             0.2512       0.2462   0.00324
Mean Altitude           392m         331m

TABLE 7: Monte Carlo test of altitude sensitivity

This particular test consists of selecting all the stations whose series end after 1990; there are 4814 such stations. The sensitivity to altitude reduction was tested by randomly removing higher-altitude stations. The results indicate little to no interaction between altitude and temperature trend in the very stations whose series end after the 1990 period.

The other approach, dividing the sample, was taken in two different ways by Zeke Hausfather and Steven Mosher. Hausfather approached the problem using a paired approach: grid cells are selected for processing only if they have stations both above and below 300m, which eliminates cells that are represented by a single station. Series are then constructed for the stations that lie above 300m and below 300m.
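A minimal sketch of the pairing rule (invented field names, not Hausfather’s actual code): only grid cells with stations on both sides of the 300m threshold contribute, so the two series sample the same geography.

```python
from collections import defaultdict

def paired_series(stations, split_m=300.0, bin_deg=5.0):
    """Split stations into above/below split_m, keeping only grid cells
    that contain at least one station on each side of the threshold.

    stations: list of (lat, lon, alt) tuples.
    Returns (high, low): station lists from qualifying cells only.
    """
    cells = defaultdict(lambda: ([], []))
    for lat, lon, alt in stations:
        key = (int(lat // bin_deg), int(lon // bin_deg))
        cells[key][alt < split_m].append((lat, lon, alt))  # index 0: high, 1: low

    high, low = [], []
    for hi, lo in cells.values():
        if hi and lo:              # cell has stations on both sides
            high.extend(hi)
            low.extend(lo)
    return high, low

# Invented example: one Alpine cell with a pair, one unpaired lowland station
high, low = paired_series([(46.2, 7.5, 1500), (46.4, 7.9, 250), (10.0, 10.0, 50)])
print(len(high), len(low))   # -> 1 1 ; the lone 50 m station's cell is dropped
```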

Period      Elevation > 300m   Elevation < 300m
1900-2009               0.04               0.05
1960-2009               0.23               0.19
1978-2009               0.34               0.28

TABLE 8: Comparison of trend versus altitude for paired station testing

FIGURE 5: Comparison of temperature anomalies for above-300m and below-300m stations

This test indicates that higher-elevation stations tend to see higher rates of warming, not lower. Thus, dropping them does not bias the temperature record upward; the concern lies in the other direction. If anything, the evidence points to this: dropping higher-altitude stations post-1990 has led to a small underestimation of the warming trend.

Finally, Mosher, extending the work of Broberg, tested the sensitivity to altitude by dividing the existing sample in the following ways, by raw altitude and by topography (a minimal sketch of the subsetting follows the list):

  1. A series containing all stations.
  2. A series of lower-altitude stations (altitude < 200m).
  3. A series of higher-altitude stations (altitude > 300m).
  4. All stations in mountain valleys.
  5. A series of stations at very high altitude (altitude > 400m).
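The sketch promised above (field names invented; compute_series stands in for whatever global-averaging code is in use):

```python
def subset_series(inventory, compute_series):
    """Build the five comparison series from the list above.

    inventory: list of station dicts with 'alt' (metres) and 'topo'
    ('FL', 'HI', 'MT' or 'MV') taken from the GHCN metadata;
    compute_series is a stand-in for the global-anomaly code.
    """
    subsets = {
        "all":       inventory,
        "low":       [s for s in inventory if s["alt"] < 200],
        "high":      [s for s in inventory if s["alt"] > 300],
        "mt_valley": [s for s in inventory if s["topo"] == "MV"],
        "very_high": [s for s in inventory if s["alt"] > 400],
    }
    return {name: compute_series(stations) for name, stations in subsets.items()}
```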

The results of that test are shown below.

FIGURE 6: Global anomaly. Smoothing performed for display purposes only, with a 21-point binomial filter.

The purple series is the highest-altitude stations, the red series the lower-elevation stations, and green the mountain valley stations. A cursory look at the “trend” indicates that the higher-elevation stations warm slightly faster than the lower-elevation ones, confirming Hausfather. Dropping higher-elevation stations, if it has any effect whatsoever, works to lower the average. Stations at lower altitudes tend to warm less rapidly than stations at higher elevations. So, quite the opposite of what people assume, the dropping of higher-altitude stations is more likely to underestimate the warming than to overestimate it.
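For reference, a binomial filter is simply a convolution with normalized binomial coefficients (for 21 points, row 20 of Pascal’s triangle). A minimal pure-Python version, demonstrated with a 3-point example for brevity:

```python
from math import comb

def binomial_smooth(series, points=21):
    """Smooth a series with normalized binomial weights (display only)."""
    n = points - 1
    weights = [comb(n, k) for k in range(points)]   # row n of Pascal's triangle
    total = sum(weights)                            # = 2**n
    half = points // 2
    out = []
    for i in range(half, len(series) - half):
        window = series[i - half:i + half + 1]
        out.append(sum(w * x for w, x in zip(weights, window)) / total)
    return out   # shorter than the input: the ends are left unsmoothed

print(binomial_smooth([0, 0, 0, 1, 0, 0, 0], points=3))
# -> [0.0, 0.25, 0.5, 0.25, 0.0]
```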

Conclusion:

The distribution of altitude does change with time in GHCN v2.mean data. That change does not signal a march of thermometers to places with higher rates of warming. The decrease in altitude is not associated with a move toward or away from coasts. The decrease is not clearly associated with a move away from mountainous regions and into valleys, but rather with a movement out of mountain valley and flatland regions. Yet mountain valleys do not warm or cool in any differential manner. Changing altitude does not bias the final trends in any appreciable way.

Regardless of the differential characteristics associated with higher elevation, a change in temperature trend is not clearly or demonstrably one of them. For now, we have no evidence whatsoever that marching thermometers up and down hills makes any contribution to an overestimation of the warming trend.

Dr. McKitrick presented a series of concerns with GHCN. We have eliminated the concern over changes in the distribution of altitude; that merits a correction to his paper. The concerns he raised about latitude, airports, and UHI will be addressed in forthcoming pieces. Given the preliminary work done to date on airports and latitude, we can confidently say that the entire debate will come down to two basic issues, UHI and adjustments; the issues of latitude changes and sampling at airports will fold into those discussions. So, here is where the debate stands. The concerns that people have had about methodology have been addressed: as McKitrick notes, the various independent methods get the same answers. The concern about altitude bias has been addressed. As we’ve argued before, the real issue with temperature series is the metadata, its related microsite and UHI issues, and the adjustments made prior to entry in the GHCN database.

Special thanks to Ron Broberg for editorial support.

References:

McKitrick, Ross (2010). A Critical Review of Global Surface Temperature Data Products. July 26, 2010.

Comments:
apocryphic
August 19, 2010 9:21 pm

For a period of nearly 35 years high altitude stations appeared to indicate cooler temperatures. What causes the bias oscillation in Figure 5?

August 19, 2010 9:39 pm

Wow. Lot of work involved in that analysis. Thanks for sharing your assessment.
I have a rather fundamental question about GHCN data series. When stations were dropped as in 1990 – 1995 were the historical temperature values/anomalies for those locations purged/excluded from data sets used to compute global average temperatures or temperature anomalies? That is, is the GHCN data series downsized backward in time as well as going forward? If so, then it seems to me your conclusion that, “Changing altitude does not bias the final trends in any appreciable way” is not strongly supported by your analysis.
Figure 5 shows the average anomalies reported for stations with altitude of >300 meters running below those for stations <300 meters from the early 1950s till the mid-1980s. The divergence in those means is frequently 0.1 degree or more. From 1998 forward, the relationship between the two anomaly series reverses, and values for stations >300 meters exceed those for stations <300 meters. Tables 4 and 5 report that more than half of stations dropped from GHCN in the 1990s and in 2005-2006 were positioned at altitudes of less than 300 meters. That majority-or-better weighting factor for <300 meter altitude stations, together with the relative temperature anomaly patterns observed over time, suggests dropping stations from GHCN contributed to a steepening of the post-1950 temperature anomaly trend through 1998 and retarded the decline in the temperature anomaly trend post-1998.
FWIW

Darren Parker
August 19, 2010 9:54 pm

What I would really love to see is a study on diurnal variations from overnight lows to mid-day highs and how they have changed over the years.

Amino Acids in Meteorites
August 19, 2010 9:55 pm

latitude says:
August 19, 2010 at 8:47 am
When you’re talking about reading temps in 1/100ths and 1/10ths of a degree,
I see you’re talking about the same thing I am, 1/100ths and 1/10ths of a degree. I see it in comments now. I hope no one thinks I copied you.
We are probably seeing similar things. The focus on anomaly is a distraction from the real issue, which is temperature. Anomalies can look virtually the same and be called “virtually the same”. But that is misleading. The uninformed person will think nothing is wrong between the data sets when hearing, for example, that GISS and NOAA are no different from any other data set.
But when you talk about temperatures having only a 1/100th to 1/10th degrees difference, as in the GISS set, between being the hottest ever or not being the hottest ever, and that 1/10th to 1/100th of a degree is the only thing needed to make the media, and people like James Hansen, talk about dangerous global warming, then the average person will know global warming alarm is ridiculous.

Amino Acids in Meteorites
August 19, 2010 10:11 pm

E.M.Smith says:
August 19, 2010 at 10:52 am
So, IMHO, we have crappy data and get crappy results from it. Admiring the uniformity of the crappiness does not yield much comfort.
LOL!
I know you were trying to make a serious point. But it’s a good thing there was no milk in my mouth at the time. 😉

Dan Murphy
August 19, 2010 10:16 pm

Steven,
Thanks for the several posts. If I correctly understand your various points and references, the rating system was adapted from one in use by the French, and the error range stated for each CRN category was the maximum range of error, and not the mean error. And that a station may change rating category through the seasons due to environmental factors, say proximity to trees, which would tend to bias summer temperatures, but less so the winter ones. Reasonable enough, but note that I did not assert a possible error in the +2C range, but more in the range of 3-5 times +.15C, or +.45C to +.75C.
Based upon what you’ve shared with me, it seems clear that without knowing the mean error of the stations in a given CRN category, simply knowing what percent of stations fall in a particular category doesn’t produce a meaningful answer as to the temperature bias in the records. Dr. Leroy’s associate showed a +.1-+.15 error range over CRN2-CRN4 stations, but that was without the inclusion of CRN5 stations. You mentioned a study with a too small sample size which suggested that error range MIGHT be too conservative, and the only way to tell was to gather the data. Which is what was done by Anthony’s Surface Stations project.
One thing seems clear to me from ChristianP’s comments and from the sites I surveyed for the Surface Stations project: the status of stations needs to be constantly monitored, and adjustments need to be regularly updated to reflect current conditions at the site. For example, I was told that the shed at the Telluride site had been built there about 11 months earlier, and that the City had put it there without asking or telling the NWS manager for that site. Yet when he found out about it, nothing was done about the station until we did the site survey, and then they immediately pulled the station from that location entirely.
ChristianP also mentioned they were using a modern Stevenson Screen (in two sizes, I take it) and that reminds me that each site I surveyed had been “upgraded” to the MMTS from the Stevenson screens originally in place. In each case the placement of the MMTS was MUCH closer to buildings and other structures which would bias the temperature record upward, if not properly adjusted for. The MMTS has a data cable which must be run inside a building to a data recorder. The cable must be put in a trench. Someone must dig the trench. With a hand shovel! Naturally, most trenches were not dug very far from the structure the data recorder was put in. Not supposed to be that way, but try digging a 50 foot long trench in mountainous soil with a hand shovel sometime. (Much less out to the 100 foot standard!) I’ve looked at other surveys at the Surface Stations project, and this seems to be pretty common for other areas of the country as well. But, this shouldn’t necessarily make a station unfit to use if adjustments for each site are properly and regularly done, but that’s one of the real questions, isn’t it?
Steven, like yourself, I’m willing to be convinced with the data. In the absence of good and convincing data, my default position is that the null hypothesis is preferable: That any warming in the record can be explained by natural climate variations and observation errors. My sincere thanks for your ongoing contributions here at WUWT.
Dan

Rex from NZ
August 19, 2010 10:38 pm

I’ll repeat an earlier question, and a new one.
People use the term GRINS … what does it mean?
And what does UHI mean?
Thanks.

Amino Acids in Meteorites
August 19, 2010 10:55 pm

Rex from NZ says:
August 19, 2010 at 10:38 pm
And what does UHI mean?
Urban Heat Island. Look at the top of the page. There is a “Glossary” tab.

Alan Sutherland
August 19, 2010 10:55 pm

I am with GeoChemist and Tallbloke. The implicit assumption behind this post is to “prove” McKitrick is mostly wrong and that temperatures match the “pace” of “global warming”, with all the connotations that it’s man-made, going to be catastrophic, and all because of CO2. I don’t buy this package either. I live in NZ and here we have “adjusted” temperatures which match “adjusted” CO2 to prove CAGW. As a consequence, we have an emissions trading scheme. There are very few real scientists who are not taking an advocacy position.
Nick Stokes is a confirmed warmer and spouts on about the long term – “since records began” – trends, especially the Arctic. I can agree that some temperatures have gone up a bit over the last 100 years – so what? There are places where the temperature has gone down. If CO2 causes warming how can this be? Why must we use averages to hide those places declining. If physics can explain CO2’s effect on global temperatures, how come this physical effect has not happened since the start of the century? Arm waving that it doesn’t have to happen straight away is all I hear. If I drop a weight, I expect it to fall to the ground. If it doesn’t drop to the ground until one hour later, surely there would need to be a new law to explain it – not arm waving.
The Arctic is another example. I’m told the long-term trend is down, ice-free by any time between 2010 and 2020. The trend is not long term – just since satellites began measuring. Nick believes he can ignore earlier records because that trend would show growth in ice at the Arctic and Greenland.
Glaciers are melting, oh no. In New Zealand all of the large river valleys were formed by glaciers which no longer exist. The Haast River Valley is vast – the glacier gone and not because of CO2 or global warming. Please explain this in a way which shows that glaciers no longer melt unless caused by AGW.
The sea will engulf Tuvalu, oh no. But the Pacific ocean is not rising. The latest news in Australia is the Murray Darling river is drying up. Scientists are worried about the future effects of global warming on its recovery. Everyone here knows that excessive water from the Murray Darling was taken for irrigation projects, so why blame GW, let alone AGW or for that matter CAGW. No – scientists are saying that the recovery may not happen because of CO2. There again, the recovery could happen when excessive water take is stopped. Might, could, possibly, likely and so on.
I get sick of being manipulated. I am sick of hearing that when it’s hot it’s Climate Change, when it snows it’s Climate Change, and when it is merely cold it’s weather. I know a weather forecast more than a few days out is unreliable, but someone tells me that a climate scientist can predict long-term weather. How many times have they done this accurately? Zero. When they have correctly predicted two consecutive 30-year climates, I could be more attentive – but I will never see it in my lifetime.
And the punchline is the problem can be solved if I pay between $10,000 and $100,000 a year to somebody called Al Gore and partners. Zeech.
Alan

davidmhoffer
August 19, 2010 10:57 pm

Steven Mosher says:
“Wow. The temperature is FLAT. that is one of the points of doing an anomaly.
To PREVENT the problems of doing a simple average of temperature. NOW, IF
the records had no missing data, well you could use temps. I had section about this in the paper but we pulled it because it was too obvious”
I’m afraid you missed my point. Yes, everything you said about anomaly comparison is true. But it still removes perspective. Hello Winnipeg, just so you know, because of global warming expect highs this summer of 39.7 degrees in August instead of 39.3. Oh, and this winter, your lows will be -39.7 instead of -40.5. I’d survey denizens of Winnipeg to see their reaction but with temps in the 30’s they’ve deserted the city for lake country and in the winter door to door surveys are not appreciated.
Anomalies have plenty of value, but to be meaningful they eventually have to be translated into the actual temperatures that actual people will experience in actual locations, or the exercise is meaningless. When you plot the magnitude of the anomalies against the temperature range that humans have lived at for centuries, you get a perspective as to just how insignificant the change is. Particularly when the coldest parts of the planet warm the most and the warmest parts of the planet the least. Winters warm more than summers and nights more than days. The average across all of this may be 0.6 degrees per century or something in that range on a global scale, but it still comes back to the same thing. An arctic region that warms will have slightly warmed summers but milder winters, with most of the warming coming in the parts of the year where plant and animal life are both stressed to the limit by cold. OK Mr Polar Bear, you’ll have to survive -42 this winter instead of -49, but the really bad news is this summer you will have to put up with a sweltering +12 instead of +7.
Mr Polar Bear: “How do I get more of this “problem”?

Amino Acids in Meteorites
August 19, 2010 11:06 pm

Some people may not be understanding the difference in importance between anomaly and temperature. Think of it this way:
If you put back into the record all of the mountain, ‘mountain valley’, and other temperature stations that are no longer used in the record, and instead dropped all of the ones that are now in the record (the urban, airport, etc.), you would get the same anomaly (virtually, theoretically) but you would have cooler temperatures.

August 19, 2010 11:49 pm

Hello everyone. Thank you Steve, Nick and Zeke. I am on holiday, and on a slow dial-up, so I will do my best to follow this thread but my online time is limited until September. Here are my main comments.
“McKitrick does note that the methods of computing a global anomaly average are sound.”
– I’m pretty sure I didn’t phrase it like that. I have noted that the various gridding methods currently in use tend to yield similar results once the choice of input data is made. This has been demonstrated by the various groups using GHCN data, including Muir Russell’s team. As to saying whether the methods are “sound”, that goes further than I prefer to.
“the process of dropping stations is more related to dropping coverage in certain countries rather than a direct effort to drop high altitude stations”
– Altitude is likely not the deciding factor in dropping a station. It is more likely that a station is dropped due to cost of collecting the data or something like that, and the change in mean altitude is a knock-on effect.
As I see it, you have looked at 3 topics: did the global mean altitude change, was it due to a relatively large loss of inland and mountain sites relative to coastal sites, and did it affect the trends.
On the first, your Figure 1 shows variability in mean altitude but not the 1990 and 2005 discontinuities in Fig 1-8 of my paper, and hardly any change in the mean, by the looks of it. It would be helpful if you explained how you computed mean altitude. Are you arguing that there was, in fact, no change in mean altitude?
Table 1: What is the date for this distribution of stations? Is it something like, all stations over the entire 20th century? It would be helpful to the reader to know what exactly the sample is. Even more helpful would be the counts and % at some key intervals, like 1970, 1980, 1990, 2000 and 2010. You could make your point much clearer and more succinctly if you did that kind of tabulation.
On the second issue, you say “McKitrick supposes that the drop in altitude means a heavier weighting for coastal stations. The data do not support this” But then you proceed to show the sample of dropped stations, post-2005, is skewed towards inland sites. Only 9% of stations dropped are coastal as opposed to 30% in the larger sample, and 83.7% of dropped stations are inland rather than 64% in the sample. This represents a bias towards dropping inland areas and retaining coasts. If you disagree with this reading, you could make the whole point clearer simply by showing the coastal/inland tabulations at some key intervals.
In Table 4 and surrounding text you appear to be arguing that the distribution is not consistent with falling altitude. Looking at your numbers, compared to the entire sample (Table 3), the sample that disappeared post-1990 has a slightly lower median altitude (183 vs 192) but has a higher mean altitude (441 vs 420), a higher 3rd quartile (589 vs 533) and almost the same max (4613 vs 4670). This indicates to me that the distribution of the 1990 fatalities was skewed to higher altitudes compared to the whole population. Then the stations dropped after 2005 are even more skewed: the mean is 510 (vs 420 for the whole sample) and the 3rd quartile is 681 (vs 533 for the whole sample). You then conclude: “That hardly supports the contention of thermometers marching out of the mountains.” Well, first, that’s a straw man. Second, your own numbers show that the distribution of dropped thermometers was, in fact, skewed to higher altitudes.
Another straw man: In one comment above Mosher says, presumably referring to my paper, “The claim in the paper was that the drop in altitude “comprimised the integrity of the data”.” Where is that quote from? What I said was “the failure to maintain consistent altitude of the sample detracts from its statistical continuity.” That’s a pretty carefully-phrased assertion, especially since I said it in the context of a series of charts showing the magnitude of discontinuities in the nature of the sample around 1990 and 2005. So far you are not convincing me that the altitude distribution remained continuous across those intervals.
Going on to Table 6, the 4-way division of landforms is intrinsically less interesting because it does not directly map onto altitude. Why look at these terms as indicators of altitude, when altitude itself is available? Presumably a “Mountain Valley” is itself a high elevation site, at least compared to a coastal valley. Likewise Flatland can be high-elevation flatland or low-elevation flatland. Your table is hard to read, again because it doesn’t have a date to define the sample with, it doesn’t show a before/after comparison, and the % calculations are not defined. Since columns don’t add to 100% we can eventually figure out what you’re doing, but it isn’t what you need to do to make your argument. You want to argue, presumably, that the different land forms each lost a similar percentage of their stations. But your numbers show that the losses were not similar across categories, they were Flat 34.5%, Hilly 37.8%, Mountain 29.5%, MV 42.2%.
On the 3rd issue, at one point you say “What this means is that as long as high altitude stations warm or cool at the same rate as low altitude stations, removing them or adding them will not bias the result.”
– No, even if their trends are the same, there can still be a bias in the overall average, if the discontinuity in the sample causes the mean anomaly to change in a way that cannot be corrected.
Regarding the comparisons of trends, I think you have made a reasonable case that, where they can be compared and all else being held equal, high-elevation sites in the GHCN sample have similar or slightly higher trends than low-elevation sites. Whether this is a general principle, I do not know. I am aware of Christy’s work on California valley-vs-mountain trends, and it seems plausible to me that his findings may apply more generally.
But as regards deciding whether “it matters”, this is very difficult and I would caution against making strong statements, since it’s easy to fail to find an effect when you lack the data to measure it directly. In effect you have 2 sets of data: X and Y. X terminates at 1990, Y continues in the form of y. And we know that X and Y have different sample characteristics. If you only use the information contained in y, Y and X to estimate x (the continuation of X after 1990), depending on how you do it you will tend to estimate it based on the portion of X explained by Y, in other words the correlated portion. Ideally what you should have is some other data, W, which is correlated with X but not with Y, and which continues after 1990. Then you can do some proper statistical modeling. It’s a mistake to be overly nihilistic about the surface data, but it’s also a mistake not to properly quantify the possible biases.
As you have alluded to, the big issue needing attention is (in my terminology) the application of ex post rather than ex ante testing to assess the quality of adjustments, gridding, and all the rest. The testable hypothesis is: If the surface data have been properly adjusted, and If the GCM’s are reliable in telling us the dominant influences on surface temperature trends, Then after the adjustment process is complete we should observe certain patterns in the data and not observe others. I expect you all know my thinking on this, based on the research I have done up to now, and why the IPCC’s treatment of the topic in the AR4 was so horrendously inadequate. I have 4 more projects in the works on this: one nearly in print, one in prep for a conference in the fall, and two just at the data collection stage.

E.M.Smith
Editor
August 19, 2010 11:53 pm

BillD says:
BillyBob says:
August 19, 2010 at 10:21 am
” Given the GHCN data, the answer one gets about the pace of global warming is not in serious dispute. ”
Considering that all of the GHCN anomaly calculators use the mean:
[…]
Having looked at the raw GHCN data, I can say the max is not going up. It is the min.
Therefore it is UHI.
BillyBob:
You are correct that the min is going up much faster than the max. One region where this is especially strong is in the Swiss Alps. One problem with your conclusion is that greenhouse gases are expected to reduce night-time cooling and to have a greater effect on the min rather than the max temperature.

While it’s pretty clear it’s the MIN that gets hit, the reasons are many. Station siting, for one. Over asphalt, the warmth persists longer into the night than over grass. Over the decades airports have added far more night flights. When I was a kid, they were modestly rare. Now the business is 24 x 7 some places. That the temperature series warms in sync with the airport explosion of the Jet Age, and has 90%+ of thermometers at airports in many countries, is a blatant problem. (And the “QA” process for USHCN, which demands a station be acceptable to its ASOS neighbors via a comparison to an average of them, will also clip and replace more low-going temperatures than highs… an average will never go as low as a single station.)
But the bottom line is that it’s much more ‘instrumentation error’ than anything to do with CO2. It’s the IR off the airport tarmac at night onto the badly sited thermometer, not the IR hitting a CO2 molecule in the stratosphere.
Mosher: Do a strict “self to self” anomaly process on the data, comparing a thermometer ONLY with itself, and there is a clear ‘shaft’ and a clear ‘blade’, and it happens at a clear point in time when the “duplicate numbers” change. Easy to see, easy to show. I’ve done it a few different ways: a modified form of First Differences, and a simple long-baseline method. You have tended to treat all the duplicate numbers of one ordinal value as a group. You can’t. For some it’s a change from 2 to 3, for others with more ‘dups’ it may be a 4 to a 5. They have no specific meaning and are assigned sequentially. So do it by date and you get the blade.
One version:
http://chiefio.wordpress.com/2010/04/22/dmtdt-an-improved-version/
The one I like better, but is less ‘standard’:
http://chiefio.wordpress.com/2010/04/11/the-world-in-dtdt-graphs-of-temperature-anomalies/
The pattern of changes shows clearly there is impact in some months of the year and not in others, often going in different directions. It is driven by station and process changes, not by CO2. At its most basic, it is a ‘splice’ artifact from splicing together different types of stations over time via anomalies. Codes like GHCN will also do this ‘splice’ and get the splice artifacts, but do it by much more convoluted means via ‘homogenizing’.
You completely missed the point on Volatility. Read the link. It’s the STATIONS that have differential volatility. The weather cycles just put them at a warmer or colder end of their ranges on a periodic basis (substantially the definition of a major cycle.)
Not interested in toys. Interested in the real world. Any motorcycle rider can tell you that at night it gets a lot colder out where the plants grow and it is much more nicely warm over the tarmac and in town. Yeah, it’s testable, but be careful that you are not biasing the test… So you take a peach orchard, it transpires a load of water and cools off quickly. Tarmac not so much. (Ridden by many of each, even late at night and early mornings).
And yes, there will be some days that have more AHI than others. This is well known. But since airports are typically near cities, you also have the issue of from which way the wind blows. I’ve found an interesting example near Miami where you can watch the temps rise when the wind comes over Miami. Gave a several sigma event / excursion. You don’t need EVERY day to be extra warmed to bias your averages up over time. Just “enough”.
Per “step change”. Good luck with that.
Airports are far more dynamic than that, and on a time scale that covers the ‘warming’ found. SJC was a near nothing in 1940. In the 1960-70 era it had one terminal and you could walk from your car to boarding the plane in 10 minutes. I did it in 5 once, seat to seat. Now it’s 2 major terminals and far more runway. A huge amount more traffic, loads more hours of operation. The same thing happens around the world. Airports are not built once in 1950 and frozen in time. They grow and expand. A lot. Fuel use rises. We swap from pistons to jets with massive hot exhaust. We move from 707 to 737 to 747 to… and all the time the passenger load / year climbs. More than enough stations like that to move an average.
So pick long lived stations that have 100 years of time in service and are not airports and look at them. The “Industrial Revolution” does not warm them. (I think it was Tony B did that work).
Per the ‘cooling since the little ice age’. Don’t know where that comes from. We warmed out of the little ice age into about 1825, then did some cyclical warm / cold cycles with the 1930’s being quite warm, especially in the USA where most of the thermometers are located. But we’ve been cooling lately (despite hysterics from some folks) and are in a downhill slide since the PDO flipped.
One other point. It does little good to say a given station trend is all that matters when what is compared in codes like GIStemp is NOT a station trend. It is a Grid Box trend that is found by using one set of stations in the baseline and a different set in the present. The “trend” is found by comparing my 1967 VW with my present Mercedes SLC and finding that cars are faster now. Sure, there is a bunch of magic sauce applied to try to say that things have all been adjusted to make it fair. But it isn’t. THE basic issue in the present climate codes is the use of “box” anomalies computed by comparing one box of thermometers THEN to a different box of thermometers NOW. So any analysis / proof / validation done comparing trends of particular individual thermometers does nothing to show the validity of the method actually used. (It can show the nature of the data, and it shows no warming to speak of from 1825 to 1990, a ‘dip’ in the 1950-1980 range, and a ‘hockey blade’ in 1990.)
And that is where the volatility issue comes in. A ‘high volatility’ group of stations records a low value in 1950-1980, then a ‘low volatility’ group of stations can never reach that low value again. They lack the ‘range’ to reach it. Explained in detail in the link with example stations data.
It is the failure to use actual “station to itself” anomalies that is the broken part of the process used by CRU and GISS and NCDC.
So you do a lot of ‘station A vs B’ warming trends. But you don’t do “Station A during cold PDO and station B during warm PDO, with A and B having different volatility ranges during each”, then splicing the two series together to get a ‘trend’ via various box filling and homogenizing. You miss the trick, so declare the magician really did saw the girl in half.
Per the comparison of pre-1990 stations. The problem is that we don’t have the data for them POST 1990. We don’t have both sides of the volatility problem to see. As long as you are doing “box to box” you can not find out what “station to itself” would have done with the stations that are not there. So that “pre-1990” with / without test is incomplete.
One final note on bias. The meteorologists in Turkey did an analysis with all the available stations and compared it with the “GHCN kept” stations. They found cooling. GHCN finds warming. They have a published paper (I’ve put links up to it several times) and it basically shows that station selection bias turned a cooling trend into a warming trend in Turkey.
That’s one mighty big cockroach I see… wonder how many more are in the shadows…
Now one of the things I’ve noticed in wandering through GHCN that initially caused me some puzzlement was that when taken in aggregate things would often show no change, but when looked at in particular, there was an impact. This could be an accidental result of an attempt to avoid bias by a broken means (forcing specific changes to a standard no-change average) or it could be a direct result of assuring some metric is kept ‘nominal’ while the deck is being stacked.
I come from a forensics background, so I’m more used to this kind of thing than most. But, for example, if you have a company that orders equal numbers of buns and burgers, you expect the two counts to match. If someone is stealing burgers, buns build up to excess. So you cross check those two. A clever burger burglar would take both buns and burgers so the counts matched. Now audit the details and find that on even days burgers disappear and on odd days buns disappear… Hmmm…
It’s that kind of forensic mind set that leads to all the highly detailed (and sometimes tabular) comparisons I did. You ‘cut the deck’ every possible way, not just the way the dealer suggests… And that is also why I’m not very impressed by “studies” that show one particular cut of the deck finds everything matches up just fine. I would expect it to.
And that brings me to the Atlas Mountains.
I found the thermometers running INTO the mountains. Yet Morocco was ‘warming’. Odd. Until you realize that there is a cool ocean current off shore and the Atlas Mountains were much more desert-warmed.
OK, now compare the ‘anomalies’ of those two ‘station to itself’ and you may not find much. But what happens when the cold thermometer is in the “box” at the start, then a warmer one in the “box” at the end? (Remember, we don’t have an anomaly between those two time periods for the thermometer to itself, only box to box…) And what if the cold thermometer was in during a particularly cold AMO to warm cycle, then moved to the desert where the waters can’t influence it. Gee, we can lock in that AMO warming trend and hold onto it. Avoiding the next cooling turn to the water and the breezes off shore.
So it’s not enough just to look at “by altitude”, though it gives some strong clues. It’s much better to look at “by altitude, by region” or “by altitude, by country”. And over aggregating things into an “average bucket” (or box…) just assures you are not looking up the right sleeves… While using a test of ‘station trend’ to validate ‘box trend’ is just missing the whole basis of the trick. (or error… to leave the metaphor behind).

E.M.Smith
Editor
August 20, 2010 12:02 am

Rex from NZ says:
I’ll repeat an earlier question, and a new one.
People use the term GRINS … what does it mean?
And what does UHI mean?

UHI is Urban Heat Island. AHI is Airport Heat Island.
“just for Grins” means “just for fun” or “as an example, for fun illustration and not a formal proof”. Or, as my son's chem teacher derisively answered when I asked about doing practical chemistry for the kids: “Oh, you mean DEMONSTRATIONS”. (He didn’t see the point, being a dull and slow fellow.) So one might hear “Let's go do some cow tipping just for grins” or “assume I have a million dollars, just for grins, what would we do with it?”

D. King
August 20, 2010 12:24 am

Scan this video to 2:35 and play to 3:27

E.M.Smith
Editor
August 20, 2010 12:24 am

Dave says:
I have a rather fundamental question about GHCN data series. When stations were dropped as in 1990 – 1995 were the historical temperature values/anomalies for those locations purged/excluded from data sets used to compute global average temperatures or temperature anomalies?

The data are kept in. New data is not added. So one might have a station in Sacramento and one in San Francisco during the cold period, then drop any NEW data from Sacramento. GIStemp then makes up numbers for Sacramento based on the temperatures in San Francisco and the historic relationship between them.
Bolivia, for example, ends about 20 years ago. All the temperatures you see for Bolivia in the global maps are based on the OLD real thermometers compared to present made up ones.

That is, is the GHCN data series downsized backward in time as well as going forward? If so, then it seems to me your conclusion that, “Changing altitude does not bias the final trends in any appreciable way” is not strongly supported by your analysis.

While true, it’s worse than that. The missing data are often ‘made up’ via various means. Depending on the details it can be called ‘homogenizing’ or ‘The Reference Station Method’.
So take SF vs Sacramento and establish their relationship during a cool time period (when Sacramento is much cooler than SFO). Now during a warming time, ‘make up’ Sacramento by looking at SFO. SFO can be VERY HOT during hot excursions, so you will now make up a real scorcher for Sacramento. A value that will not be real. This is the volatility issue in the link above. SFO can get to 100-105F during hot excursions, about 5 F behind Sacto. But during cold excursions SFO may be 5 F warmer than Sacto. So establish your ‘cool’ offset as a minus 5 F, then apply it when SFO is really +5F and you get +10 F (when Sacto is really not that hot). As a fictional example.
Basically the cold range of Sacto is far colder than SFO while the warm range is only a tiny bit warmer. You can use that ‘delta V’ to your advantage via splicing over time and ‘fill in’ of missing data to create any anomaly trend you like. The artful addition and deletion of stations based on volatility and where you are in a long duration cycle like the PDO is all it takes. It exploits the small comparison period of The Reference Station Method to establish a fixed relationship when it really is more dynamic and it exploits the station volatility with long duration cycles (that is typically ignored in the codes) to let you make any anomaly trend you would like as a side effect of “box to box” anomalies.
Or more simply:
It’s all about the splice. What, when, and how.
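The fixed-offset failure mode described above can be shown numerically in a few lines (all numbers invented, echoing the SFO/Sacramento example; this is a toy, not any group's actual fill-in code):

```python
# Toy illustration: the offset between two stations is learned during a
# cool period and then applied during a hot excursion, when the real
# relationship between the stations has changed sign.
sfo_cool, sac_cool = 50.0, 45.0      # F: SFO runs 5 F warmer in cool weather
offset = sac_cool - sfo_cool         # -5 F, frozen in as "the" relationship

sfo_hot, sac_hot = 100.0, 105.0      # F: in hot weather Sacramento leads by 5 F
sac_filled = sfo_hot + offset        # 95 F estimated, vs 105 F actual
print(sac_filled, sac_hot)           # the fill-in misses by the 10 F regime shift
```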

tonyb
Editor
August 20, 2010 1:52 am

Steve Nick and Steve
Mosh
You have done some interesting work here, very well done. I know how difficult it is to write an article let alone put it up for public scrutiny.
I appreciate that in this article you are interested in the effects of altitude, although, as is always the way, the discussion has broadened out considerably. As there are many layman lurkers here, I wonder if it might be useful to put your article into a historic context for them?
I have long been interested in historic temperature records and in order to examine the frequent assertions that we have no instrumental records of the LIA established my site here.
http://climatereason.com/LittleIceAgeThermometers/
To set the scene I wrote about the reliability of global temperature records here.
“Article: History and reliability of global temperature records. Author: Tony Brown
This article (part 1 of a series of three) examines the period around 1850/80 when Global temperatures commence, and looks at the long history of reliable observations and records prior to the development of instrumental readings.”
http://wattsupwiththat.com/2009/11/14/little-ice-age-thermometers-%e2%80%93-history-and-reliability/
Within that article was a tabular version of Chiefio's wonderful ‘March of the Thermometers’, which has been referenced in this thread several times.
In a fine example of Anglo American cooperation Verity Jones and Lucy Skywalker kindly put the information into an excellent graphical form which clearly shows the changing numbers of thermometers over the years at various locations. (It is towards the top of my article here).
http://climatereason.com/UHI/
Delving into the records, both instrumental and written/observational, it became increasingly clear that far from global warming starting in 1880, we had seen rising temperatures since the 1690s (the LIA in its severest form effectively ended by 1698).
It became clear that James Hansen, in setting an arbitrary start date of 1880 for GISS, had missed out on a whole history of rising temperatures, which I wrote about here:
Article: Three long temperature records in USA. Author: Tony Brown
This article links three long temperature records along the Hudson river in the USA. They illustrate that a start date of 1880 (Giss) misses out on the preceding warm climatic cycles and that UHI is a big factor in the increasingly urbanised temperature data sets from both Giss and Hadley/Cru
http://noconsensus.wordpress.com/2009/11/25/triplets-on-the-hudson-river/#comment-13064
and in a strictly UK context here;
http://noconsensus.wordpress.com/2010/01/06/bah-humbug/
The gentle slow rise over the centuries has been graphically reproduced with some of the longest data sets here
http://i47.tinypic.com/2zgt4ly.jpg
http://i45.tinypic.com/125rs3m.jpg
The UK figures from 1660 can be more clearly seen here together with its linear regression.
http://homepage.ntlworld.com/jdrake/Questioning_Climate/_sgg/m2_1.htm
So we have a world that has been generally warming for the past 300-plus years. However, it does not appear to be a global thing (perhaps why the terminology has recently been shifted to ‘climate change’). There is plenty of evidence to show that there are a number of locations that have bucked the general warming trend over a statistically meaningful period and are static or cooling. I hope to do a post on this in due course with some colleagues.
The main point I wanted to get across is that it would be useful for researchers to see current temperatures as part of a very long established natural trend, rather than something that has come about recently and must therefore be ‘our’ fault.
I will leave the nuances of the discussion to others here especially as Ross and E M Smith have turned up who can discuss the key part of your study concerning altitude.
Tonyb

August 20, 2010 2:07 am

I’m currently working on a temp reconstruction with the criteria that it shows the average, max, and min.
The way I’m getting over the station dropout issue is that I’m doing the averaging at the very last step. I’m using the daily GHCNv2 unadjusted.
From that I’m taking max/min temps, with the only criteria being that they don’t exceed world records and that they’re balanced. I add all of these temps (they’re in tenths of a degree, so they are all integers) together for a given year (I just want to see yearly for the moment) and record the number of observations. These two figures, along with the min and max for all the years that the station records, are then stored in a file.
When I need to calculate a region or country, I open up the temp station files for that area and then add all the temp sums and observations for each year and only then compute an average for a year.
If stations drop out or are added, all that happens is the overall sum of temps and the observation count rise or fall in step with each other, and the computed average is valid. It’s a much less compute-intensive way of getting a Global Average Temp, with the advantage that none of the information is lost.
I’m still about a month away from releasing it (I still have to earn a living 🙁 ), but what I can say is that it does show that min temps have risen more than max temps, and that that is what seems to be driving global temps up. This is not a signature one would expect from CO2 induced warming.
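A minimal Python sketch of the accumulate-then-average bookkeeping described above (names invented; file handling and the world-record QC are omitted, so this is the gist rather than the commenter's actual code):

```python
from collections import defaultdict

def regional_average(station_files):
    """Average-at-the-last-step: per-station yearly sums and observation
    counts are accumulated, and the division happens only at the end.

    station_files: iterable of per-station dicts mapping
    year -> (sum_of_tenths, n_observations).
    Returns year -> regional mean in whole degrees.
    """
    sums = defaultdict(int)
    counts = defaultdict(int)
    for station in station_files:
        for year, (s, n) in station.items():
            sums[year] += s
            counts[year] += n
    # Dropped stations simply lower both the sum and the count for
    # later years, so the average stays internally consistent.
    return {y: sums[y] / counts[y] / 10.0 for y in sums}

# Invented example: two stations, one of which stops after 1990
print(regional_average([{1990: (3000, 20)}, {1990: (1000, 10), 1991: (1200, 10)}]))
```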

August 20, 2010 2:57 am

Ross,
I calculated the average altitude of 1990 fatalities – yes, there were 919 of them, ave altitude 550.6 m (high). This was the year of Turkey, Canada, China and Japan.
It seems to me that if there is a suggestion that selective reduction of high-altitude or high-latitude stations biases the trend, the most direct test is Monte Carlo simulation of such a reduction. “Try it and see”. I’ve tried various selection strategies without producing a major trend effect, but I’d be happy to try others if there’s something I’ve missed.

August 20, 2010 3:05 am

E.M.Smith says: August 20, 2010 at 12:24 am
“Bolivia, for example, ends about 20 years ago. All the temperatures you see for Bolivia in the global maps are based on the OLD real thermometers compared to present made up ones.”

We do have over thirty GSOD stations reporting in Bolivia during this period. The story they tell is not substantially different to the GHCN account deduced from neighboring stations.

Tony Rogers
August 20, 2010 3:29 am

I have a comment/observation about anomalies.
In the piece above, it states “When we create a global anomaly we prevent this kind of bias from entering the calculation by scaling the measurements of station by the mean of that station.”
This indeed would eliminate biases due to dropped stations. However, is this what is really done? Hansen 1999 section 4.2 states “As a final step, after all station records within 1200 km of a given grid point have been averaged, we subtract the 1951-1980 mean temperature for the grid point to obtain the estimated temperature anomaly time series of that grid point.”
There is a big difference in calculating an anomaly by subtracting the station average from the station data and subtracting the gridpoint average from gridpoint temperature. Following Hansen’s method, dropping colder stations in later years would bias the more recent temperature anomalies upwards.
Wouldn’t it?
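A toy two-station example (invented numbers) of the distinction the comment draws, taking the quoted Hansen 1999 wording at face value rather than describing the full GISS procedure:

```python
# Two co-located stations; the cold one stops reporting after the base period.
# Both stations are exactly flat, so the true anomaly is zero throughout.
warm, cold = 15.0, 5.0
base_grid_mean = (warm + cold) / 2     # 10.0: gridpoint mean over the base period

# Per-station anomalies, then averaged (the method described in the post):
late_station_anom = warm - warm        # only the warm station remains -> 0.0

# Gridpoint average minus the fixed gridpoint base mean (the quoted wording):
late_grid_anom = warm - base_grid_mean # 15 - 10 = +5.0, a spurious jump

print(late_station_anom, late_grid_anom)
```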

August 20, 2010 3:33 am

“C James says: August 19, 2010 at 9:11 am
The real question to me is why are all of you bright guys spending so much time on verifying (or not) that the use of bad data, regardless of methodology, produces similar (or not) results? Why isn’t there a concerted effort on everyone’s part to go back to the raw data and start over?”

The analysis described here uses GHCN v2.mean, which is raw data. That is, it comes straight from the CLIMAT forms submitted by the various Met organisations. That’s what people like Zeke, Steven and I use. GISS and CRU may subsequently adjust it (with good reasons).