
Temperature averages of continuously reporting stations from the GISS dataset
Guest post by Michael Palmer, University of Waterloo, Canada
Abstract
The GISS dataset includes more than 600 stations within the U.S. that have been
in operation continuously throughout the 20th century. This brief report looks at
the average temperatures reported by those stations. The unadjusted data of both
rural and non-rural stations show a virtually flat trend across the century.
The Goddard Institute for Space Studies provides a surface temperature data set that
covers the entire globe, but for long periods of time contains mostly U.S. stations. For
each station, monthly temperature averages are tabulated, in both raw and adjusted
versions.
One problem with the calculation of long term averages from such data is the occurrence of discontinuities; most station records contain one or more gaps of one or more months. Such gaps could be due to anything from the clerk in charge being a quarter drunkard to instrument failure and replacement or relocation. At least in some examples, such discontinuities have given rise to “adjustments” that introduced spurious trends into the time series where none existed before.
1 Method: Calculation of yearly average temperatures
In this report, I used a very simple procedure to calculate yearly averages from raw
GISS monthly averages that deals with gaps without making any assumptions or adjustments.
Suppose we have 4 stations, A, B, C and D. Each station covers 4 time points, without
gaps:
In this case, we can obviously calculate the average temperatures as:
A more roundabout, but equivalent scheme for the calculation of T1 would be:
With a complete time series, this scheme offers no advantage over the first one. However, it can be applied quite naturally in the case of missing data points. Suppose now we have an incomplete data series, such as:
…where a dash denotes a missing data point. In this case, we can estimate the average temperatures as follows:
The upshot of this is that missing monthly Δtemperature values are simply dropped and replaced by the average (Δtemperature) from the other stations.
One advantage that may not be immediately obvious is that this scheme also removes
systematic errors due to change of instrument or instrument siting that may have occurred concomitantly with a data gap.
Suppose, for example, that data point B1 went missing because the instrument in station B broke down and was replaced, and that the calibration of the new instrument was offset by 1 degree relative to the old one. Since B2 is never compared to B0, this offset will not affect the calculation of the average temperature. Of course, spurious jumps not associated with gaps in the time series will not be eliminated.
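The scheme described above can be sketched in a few lines of Python. This is an illustration written from the prose description only, not the code referred to in this report: the function name, the use of None to mark a gap, and the sample numbers are all assumptions. For each step in time it averages the temperature changes of those stations that reported at both ends of the step, and accumulates these average changes into an anomaly series that starts at zero.

```python
def mean_first_difference_series(stations):
    """Accumulate the station-averaged change between consecutive time points.

    `stations` maps a station name to a list of readings of equal length,
    with None marking a missing value. A station contributes to a step only
    if it has a reading at both ends of that step.
    """
    length = len(next(iter(stations.values())))
    series = [0.0]  # anomaly relative to the first time point
    for t in range(1, length):
        deltas = [
            readings[t] - readings[t - 1]
            for readings in stations.values()
            if readings[t] is not None and readings[t - 1] is not None
        ]
        if not deltas:
            raise ValueError(f"no station covers the step ending at index {t}")
        series.append(series[-1] + sum(deltas) / len(deltas))
    return series


# Hypothetical data: station B has a gap at index 1, and its replacement
# instrument reads 1 degree too high afterwards (23, 24 instead of 22, 23).
stations = {
    "A": [10.0, 11.0, 12.0, 13.0],
    "B": [20.0, None, 23.0, 24.0],
    "C": [5.0, 6.0, 7.0, 8.0],
}
print(mean_first_difference_series(stations))  # [0.0, 1.0, 2.0, 3.0]
```

In this made-up example the 1-degree offset in station B leaves the result unchanged, because the reading after the gap is never compared with the reading before it.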
In all following graphs, the temperature anomaly was calculated from unadjusted
GISS monthly averages according to the scheme just described. The code is written in
Python and is available upon request.
2 Temperature trends for all stations in GISS
The temperature trends for rural and non-rural US stations in GISS are shown in Figure
1.

This figure resembles other renderings of the same raw dataset. The most notable
feature in this graph is not in the temperature but in the station count. Both to the
left of 1900 and to the right of 2000 there is a steep drop in the number of available
stations. While this seems quite understandable before 1900, the even steeper drop
after 2000 seems peculiar.
If we simply lop off these two time periods, we obtain the trends shown in Figure
2.

The upward slope of the average temperature is reduced; this reduction is more
pronounced with non-rural stations, and the remaining difference between rural and
non-rural stations is negligible.
3 Continuously reporting stations
There are several examples of long-running temperature records that fail to show any
substantial long-term warming signal; examples are the Central England Temperature record and the one from Hohenpeissenberg, Bavaria. It therefore seemed of interest to look for long-running US stations in the GISS dataset. Here, I selected for stations that had continuously reported at least one monthly average value (but usually many more) for each year between 1900 and 2000. This criterion yielded 335 rural stations and 278 non-rural ones.
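The selection criterion can be expressed as a short Python check. This is a sketch under assumed conventions, not the code used for this report: a station record is taken to be a dictionary keyed by (year, month), with None for a missing monthly value, and the function name is hypothetical.

```python
def reports_every_year(monthly, first_year=1900, last_year=2000):
    """True if the station has at least one monthly value in every year."""
    years_with_data = {
        year for (year, _month), value in monthly.items() if value is not None
    }
    return all(
        year in years_with_data for year in range(first_year, last_year + 1)
    )


# Hypothetical records: one July value per year, then the same with 1950 removed.
complete = {(year, 7): 20.0 for year in range(1900, 2001)}
gappy = {key: value for key, value in complete.items() if key[0] != 1950}

print(reports_every_year(complete))  # True
print(reports_every_year(gappy))     # False
```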
The temperature trends of these stations are shown in Figure 3.

While the sequence and the amplitudes of upward and downward peaks are closely similar to those seen in Figure 2, the trends for both rural and non-rural stations are virtually zero. Therefore, the average temperature anomaly reported by long-running stations in the GISS dataset does not show any evidence of long-term warming.
Figure 3 also shows the average monthly data point coverage, which is above 90%
for all but the first few years. The less than 10% of all raw data points that are missing
are unlikely to have a major impact on the calculated temperature trend.
4 Discussion
The number of US stations in the GISS dataset is high and reasonably stable during the 20th century. In the 21st century, the number of stations has dropped precipitously. In particular, rural stations have almost entirely been weeded out, to the point that the GISS dataset no longer seems to offer a valid basis for comparison of the present to the past. If we confine the calculation of average temperatures to the 20th century, there remains an upward trend of approximately 0.35 degrees.

Interestingly, this trend is virtually the same with rural and non-rural stations.
The slight upward temperature trend observed in the average temperature of all
stations disappears entirely if the input data is restricted to long-running stations only, that is, those stations that have reported monthly averages for at least one month in every year from 1900 to 2000. This discrepancy remains to be explained.
While the long-running stations represent a minority of all stations, they would
seem most likely to have been looked after with consistent quality. The fact that their
average temperature trend runs lower than the overall average and shows no net warming in the 20th century should therefore not be dismissed out of hand.
Disclaimer
I am not a climate scientist and claim no expertise relevant to this subject other than
basic arithmetic. In case I have overlooked equivalent previous work, this is due to my ignorance of the field, is not deliberate and will be amended upon request.



Brian says:
“Changing your position from argument to argument is not acceptable.”
I am willing to change my opinion if new facts warrant it, but my position has remained unchanged for a very long time. It is this:
A doubling of CO2 will probably result in a ≈1°C rise in temperature, ± ≈0.5°C. The additional warmth will be entirely beneficial. It will result in millions of new arable acres in places like Siberia, Mongolia and Canada. Furthermore, any current and projected rise in CO2 will be entirely beneficial to the biosphere. Agricultural production will continue to increase as a direct result of more CO2. There is no credible downside to a rise in CO2, a beneficial and harmless trace gas.
I have posted several hundred comments here that say essentially the same thing. I have been very consistent in this. If you can find a comment I made that contradicts anything in this post, please point it out.
@Ivor Ward says:
October 24, 2011 at 8:23 am
“Would anyone care to explain to me why the graph linked by Mr Palmer for Hohenpiessenberg at http://climatereason.com/LittleIceAgeThermometers/Hohenpeissenberg_Germany.html shows a virtual flat line and then one that MFK Boulder links to at http://preview.tinyurl.com/Hohenpeissenberg shows our favourite hockey stick, preferably without using the word “baloney” in the text even if you do spell it correctly.”
It is easy to see which one is bogus from the original data:
http://members.multimania.nl/ErrenWijlens/co2/t_hohenpeissenberg_200306.txt
Theo Goodwin says:
October 24, 2011 at 2:26 pm
That was a brutal drubbing of the one trick pony. Took him to school with that one you did. Wished I’d have written it!
Smokey says:
October 24, 2011 at 2:31 pm
“A doubling of CO2 will probably result in a ≈1°C rise in temperature”
If you change that to a maximum of 1C in the most arid places on the planet where evaporation and convection play little role in surface cooling then I’ll agree to it. Rather basic physics there without confounding factors. CO2 has very little effect over the oceans which are 71% of the planet’s surface because the ocean only gives up 20% of its solar heating through radiation and CO2 only slows down radiative cooling. CO2 DOES NOT slow down evaporative, convective, or conductive cooling. Over land surfaces, especially dry surfaces, the dominant mode of cooling is radiative. CO2 will have its maximum theoretical effect there and there-only.
@Smokey
So you basically accept the findings of climate research except for the consequences. So do you agree that 90% of the self-proclaimed “skeptics” are crazy for denying the earth is warming and that humans are causing (at least) much of it? What other scientific consensus’ do you deny? Evolution? The germ theory of disease? Do you think the evidence that smoking causes lung cancer is just a political ploy?
Climate-denial-gate has robbed the “skepticism” movement of whatever credibility it had.
Brian says:
October 24, 2011 at 1:38 pm
Brian, are you very young and innocent?
How else can it be explained that you think wikipedia is a source to be cited? Or are you being deceitful on purpose?
We all know about the frantic editing by the warmistas over at wikipedia.
Hope you have access to plenty of joules next winter:
http://notrickszone.com/2011/10/24/german-meteorologists-horror-winter-to-hit-central-europe/
DR_UK says:
October 24, 2011 at 1:22 pm
“But isn’t this a similar method of taking first differences that was discussed and criticised before? See Hu McCulloch’s 2010 post at Climate Audit http://climateaudit.org/2010/08/19/the-first-difference-method/.”
—
Thanks for the link. The idea is indeed similar. There is one difference, however: The CA post states: “Missing observations may simply be interpolated for the purposes of computing first differences (thereby splitting the 2 or more year observed difference into 2 or more equal interpolated differences).” In contrast to this approach, I did not fill the gaps with interpolated numbers.
Dave Springer
Maybe the earlier suggestion of using airliner temperature records was a better one.
George Turner says:
October 24, 2011 at 1:29 pm
If you follow some of the adjustments they make to the temperature record, we should worry less about the present getting warmer than about the alarming rate at which the past keeps getting cooler. If present trends continue, millions of extra people in the 1940s are going to freeze to death.
Thank you George!! This was the comment of the day for me. ( I hope my mum and dad aren’t affected because then I wouldn’t be here)
Brian says:
October 24, 2011 at 1:38 pm
Smokey,
This is the problem with arguing with [SNIP: – Policy Violation -REP] , they all have different arguments, and are willing to change them at the drop of a hat. “The earth isn’t warming!” “Ok maybe it is but it’s not humans!” “Ok it’s humans but it’s not harmful!” “Ignore the fact that I was wrong on my first two premises!”
Most scientists reject catastrophic AGW? I like that you slip “catastrophic” in there.
In the last few days, Brian is only one of many who question “what do skeptics believe” and point out that “skeptics” are all over the board in their (our) beliefs.
In another thread I posted:
“From the best I’ve determined, this is what most of both “skeptics” and “lukewarmers” believe:
“There is no convincing scientific evidence that human release of carbon dioxide, methane, or other greenhouse gases is causing or will, in the foreseeable future, cause catastrophic heating of the Earth’s atmosphere and disruption of the Earth’s climate. Moreover, there is substantial scientific evidence that increases in atmospheric carbon dioxide produce many beneficial effects upon the natural plant and animal environments of the Earth.”
Many here know where that statement originates, but the source of the phrase isn’t as important as is the general sentiment it portrays. ”
We don’t slip “catastrophic” in anywhere – the position of the “warmists/alarmists”, the position that “requires” restriction of anthropogenic CO2 emissions, is that these same CO2 emissions are driving the climate toward catastrophic events.
We (skeptics/lukewarmers) should not let a “warmist/alarmist” define what we believe.
Brian says:
October 24, 2011 at 2:49 pm
“What other scientific consensus’ do you deny? Evolution? The germ theory of disease? Do you think the evidence that smoking causes lung cancer is just a political ploy?”
—
None of these, actually. You may also be surprised to learn that not all of us own a gun.
Anyone who likens the profundity of our understanding of global warming, and the quality of the evidence to those pertaining to any of the above theories only proves their ignorance.
Louis says:
October 24, 2011 at 10:55 am
“Local temperatures can change several degrees in less than an hour.”
I’ve seen 20F drop in ten minutes here in south central Texas when a cold front blows through on a warm day. It’s enough to take your breath away. Like you opened up your freezer door and stuck your head inside.
The locals here say “There ain’t nothing that stands between Austin and the Arctic circle but barbed wire fence.”
Brian,
You’re beginning to sound like a lunatic. The large majority of scientific skeptics know that the planet has warmed naturally since the Little Ice Age [LIA]. You deliberately misrepresent the skeptical position by making truly absurd comments like: “…do you agree that 90% of the self-proclaimed ‘skeptics’ are crazy for denying the earth is warming…”
First, skeptics are not crazy. We simply ask for testable, real world evidence – per the scientific method – showing that human-produced CO2 causes measurable global warming. It may, but as of now there is no evidence that it does. None. The entire AGW facade is based on computer models and conjecture.
Next, you keep raising the red herring argument of “consensus”. Every last person on earth could believe that the moon is made out of green cheese. But the consensus would be wrong. Further, the canard that most scientists and engineers believe that human activity caused most of the ≈0.7°C warming over the past hundred and fifty years is provably wrong, as I’ve shown in the OISM links I posted. In fact, the alarmist crowd is in the minority, and your misguided belief system cannot change that verifiable fact.
Finally, your wacko comment about ‘climate-denial-gate’ confirms you as a member of the climate alarmist lunatic fringe. This is the internet’s “Best Science” site, not a tinfoil hat blog like tamino’s, Romm’s, Cook’s Skeptical Pseudo-Science, etc. Please take your wild-eyed conspiracy theories to them, they eat that stuff up.
• • •
Dave Springer,
The 1°C number is a ballpark figure, because we don’t know for certain what the sensitivity number is. It may even change depending on CO2 concentration and/or temperature. That’s why I give a 0.5°C error band. [Or fudge figure, if you like.☺] I personally think that when the question is definitively answered, the number will be ≤1°C.
phlogiston says:
October 24, 2011 at 3:04 pm
“Maybe the earlier suggestion of using airliner temperature records was a better one.”
Satellites are doing a fine job. Global coverage, 24/7, NBS traceability. Interestingly, though, they too were “adjusted” around 1999 to go from showing a cooling trend to a warming trend.
It’s just a testament to the incredibly small effect (CO2) they’re trying to tease out of the data that pencil whipping within the error bars of the highest tech sensing gear can change a cooling trend to a warming trend. In reality I strongly suspect there is no detectable temperature trend from CO2 but there is indeed a trend and that trend is higher agricultural output because whatever else CO2 is it is definitely plant food and we’re fertilizing the atmosphere by burning fossil fuels.
Hansen’s own GISStemp data show that it is actually COOLER than his infamous “Scenario C” forecast, which was based on NO (i.e. ZERO, NONE, NOT ANY) man-made CO2 emissions after 2000!
http://www.real-science.com/doubt-temperatures-rising-fast-hansens-emissions
If Hansen paid any attention at all to HIS OWN DATA, as well as his own predictions, pronouncements, prognostications and various other bloviations, he would have concluded years ago that CO2-based warming simply DOES NOT HAPPEN!
uhhh… the article references to ‘abstract’ – where’s the actual published paper to be found? What journal? Uhhh… what peer-reviewed journal? Surely, surely…. this can’t be a pre-release!!!
As Requested:
Of Averages & Anomalies Part 1A
In recent years a number of claims have been made about ‘problems’ with the surface temperature record: that it is faulty, biased, or even ‘being manipulated’. Many of the criticisms seem to revolve around misunderstandings of how the calculations are done and thus exaggerated ideas of how vulnerable to error the analysis of the record is. In this series I intend to look at how the temperature records are built and why they are actually quite robust. In this first post (Part 1A) I am going to discuss the basic principles of how a reasonable surface temperature record should be assembled. Then in Part 1B I will look at how the major temperature products are built. Finally in Parts 2A and 2B I will look at a number of the claims of ‘faults’ against this to see if they hold water or are exaggerated based on misconceptions.
How NOT to calculate the Surface Temperature
So, we have records from a whole bunch of meteorological stations from all around the world. They have measurements of daily maximum and minimum temperatures for various parts of the last century and beyond. And we want to know how much the world has warmed or not.
Sounds simple enough. Each day we add up all these stations’ daily average temperatures, divide by the number of stations and, voilà, we have the average temperature for the world that day. Then do that for the next day and the next and…. Now we know the world’s average temperature, each day, for all that measurement period. Then compare the first and last days and we know how much warming has happened – how big the ‘Temperature Anomaly’ is – between the two days. We are calculating the ‘Anomaly of the Averages’. Sounds fairly simple doesn’t it? What could go wrong?
Absolutely everything.
So what is wrong with the method I described above?
1. Every station may not have data for the entire period covered by the record. They have come and gone over the years for all sorts of reasons. Or a station may not have a continuous record. It may not be measured on weekends because there wasn’t the budget for someone to read the station then. Or it couldn’t be reached in the dead of winter.
Imagine we have 5 measuring stations, A to E that have the following temperatures on a Friday:
A = 15, B = 10, C = 5, D = 20 & E = 25
The average of these is (15+10+5+20+25)/5 = 15
Then on Saturday, the temperature at each station is 2 °C colder because a weather system is passing over. But nobody reads station C because it is high in the mountains and there is no budget for someone to go up there at the weekend. So the average we calculate from the data we have available on Saturday is:
(13+8+18+23)/4 = 15.5.
But if station C had been read as well it would have been:
(13+8+3+18+23)/5 = 13
This is what we should be calculating! So our missing reading has distorted the result.
We can’t just average stations together! If we do, every time a station from a warmer climate drops off the record, our average drops. Every time a station from a colder climate drops off, our average rises. And the reverse for adding stations. If stations report erratically then our record bounces erratically. We can’t have a consistent temperature record if our station list fluctuates and we are just averaging them. We need another answer!
2. Our temperature measurements aren’t from locations spaced evenly around the world. Much of the world isn’t covered at all – the 70% that is oceans. And even on land our stations are not evenly spread. How many stations are there in the roughly 1000 km between Maine and Washington DC, compared to the number in the roughly 4000 km between Perth & Darwin?
We need to allow for the fact that each station may represent the temperature of very different size regions. Just doing a simple average of all of them will mean that readings from areas with a higher station density will bias the result. Again, we can’t just average stations together!
We need to use what is called an Area Weighted Average. Do something like: take each station’s value, multiply it by the area it is covering, add all these together, and then divide by the total area. Now the world isn’t colder just because the New England states are having a bad winter!
3. And how good an indicator of its region is each station anyway? A station might be in a wind or rain shadow. It might be on a warm plain or higher in adjacent mountains, or in a deep valley that cools quicker as the Sun sets. It might get a lot more cloud cover at night or be prone to fogs that cause night-time insulation. So don’t we need a lot of stations to sample all these micro-climates to get a good reliable average? How small does each station’s ‘region’ need to be before its readings are a good indicator of that region? If we are averaging stations together we need a lot of stations!
4. Many sources of bias and errors can exist in the records. Were the samples always taken at the same time of day? If Daylight Savings Time was introduced, was the sampling time adjusted for this? Were log sheets for a station (in the good old days before newfangled electronic recording gizmos) written by someone with bad handwriting – is that a 7 or a 9? Did the measurement technology or its calibration change? Has the station moved, or changed altitude? Are there local sources of bias around the station? And do these biases cause one-off changes or a time-varying bias?
We can’t take the reading from a station at face-value. We need to check for problems. And if we find them we need to decide whether we can correct for the problem or need to throw that reading or maybe all that station’s data away. But each reading is a precious resource – we don’t have a time-machine to go back and take another reading. We shouldn’t reject it unless there is no alternative.
So, we have a Big Problem. If we just average the temperatures of stations together, even with the Area Weighting answer to problem #2, this doesn’t solve problems #1, #3 or #4. It seems we need a very large detailed network, which has existed for all of the history of the network, with no variations in stations, measurement instruments etc, and without any measurement problems or biases.
And we just don’t have that. Our station record is what it is. We don’t have that time machine. So do we give up? No!
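Problems #1 and #2 above can be checked numerically with a short Python sketch. The Friday and Saturday readings are the ones from the example under problem #1; the function names, the use of None for an unread station, and the areas in the last line are hypothetical and only serve to illustrate the arithmetic.

```python
def simple_mean(values):
    """Plain average of the readings that are present (None = not read)."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)


def area_weighted_mean(values, areas):
    """Multiply each value by the area it represents, sum, divide by total area."""
    pairs = [(v, a) for v, a in zip(values, areas) if v is not None]
    total_area = sum(a for _, a in pairs)
    return sum(v * a for v, a in pairs) / total_area


friday = [15, 10, 5, 20, 25]             # stations A to E
saturday_all = [13, 8, 3, 18, 23]        # every station 2 degrees colder
saturday_gap = [13, 8, None, 18, 23]     # station C not read

print(simple_mean(friday))        # 15.0
print(simple_mean(saturday_all))  # 13.0
print(simple_mean(saturday_gap))  # 15.5, although every station got colder

# Hypothetical areas: the second station represents three times the area.
print(area_weighted_mean([10, 20], [1, 3]))  # 17.5
```

With equal areas the weighted mean reduces to the simple mean, so the weighting matters only where station density is uneven.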
How do stations’ climates change?
Let’s consider a few key questions. If we look at just one location over its entire measurement history, say down on the plains, what will the numbers look like? Seasons come and go; there are colder and warmer years. But what is the longer term average for this location? What is meant by ‘long term’? The World Meteorological Organisation (WMO) defines Climate as the average of Weather over a 30 year period. So if we look at a location averaged over something like a 30 year period and compare the same location averaged over a different 30 year period, the difference between the two is how much the average temperature for that location has changed. And what we find is that they don’t change by very much at all. Short term changes may be huge but the long term average is actually pretty stable.
And if we then look at a nearby location, say up in the mountains, we see the same thing: lots of variation but a fairly stable average with only a small long term change. But their averages are very different from each other. So although a station’s average change over time is quite small, an adjacent station can have a very different average even though its change is small as well. Something like this:
Comparing two adjacent stations
Next question: if each of our two stations’ averages changes only by a small amount, how similar are the changes in their averages? This is not an idle question. It can be investigated, and the answer is that they mostly differ by very little. Nearby locations will tend to have similar variations in their long term averages. If the plains warm long term by 0.5°C, it is likely that the nearby mountains will warm by say 0.4–0.6°C in the long term. Not by 1.5 or -1.5°C.
It is easy to see why this would tend to be the case. Adjacent stations will tend to have the same weather systems passing over them. So their individual weather patterns will tend to change in lockstep. And thus their long term averages will tend to be in lock-step as well. Santiago in Chile is down near sea level while the Andes right at its doorstep are huge mountains. But the same weather systems pass over both. The weather that Adelaide, Australia gets today, Melbourne will tend to get tomorrow.
Station Correlation Scatter Plots (HL87)
Final question. If nearby locations have similar variations in their climate, irrespective of each station’s local climate, what do we mean by ‘nearby’? This too isn’t an idle question; it can be investigated, and the answer is many hundreds of kilometres at low latitudes, up to 1000 kilometres or more at high latitudes. In Climatology this is the concept of ‘Teleconnection’ – that the climates of different locations are correlated to each other over long distances.
Figure 3, from Hansen & Lebedeff 1987 (apologies for the poor quality, this is an older paper) plots the correlation coefficients versus separation for the annual mean temperature changes between randomly selected pairs of stations with at least 50 common years in their records. Each dot represents one station pair. They are plotted according to latitude zones: 64.2-90N, 44.4-64.2N, 23.6-44.4N, 0-23.6N, 0-23.6S, 23.6-44.4S, 44.4-64.2S.
Notice how the correlation coefficients are highest for stations closer together and less so as they stretch farther apart. These relationships are most clearly defined at mid to high northern latitudes and mid southern latitudes – the regions of the Earth with higher proportions of land to ocean.
This makes intuitive sense since surface air temperatures of the oceanic regions are influenced also by water temperatures, ocean currents etc instead of just air masses passing over them, while land temperatures don’t have this other factor. So land temperatures would be expected to have better correlation since movement of weather systems over them is a stronger factor in their local weather.
This is direct observational evidence of Teleconnection. Not just climatological theory but observation.
A better answer
So what if we do the following? Rather than averaging all our stations together, instead we start out by looking at each station separately. We calculate its long term average over some suitable reference period. Then we recalculate every reading for that station as a difference from that reference period average. We are comparing every reading from that station against its own long term average. Instead of a series of temperatures for a station, we now have a series of ‘Temperature Anomalies’ for that station. And then we repeat this for each individual station, using the same reference period to produce the long term average for each separate station.
Then, and only then, do we start calculating the Area Weighted Average of these Anomalies. We are now calculating the ‘Area Average of the Anomalies’ rather than the ‘Anomaly of the Area Averages’ – now there’s a mouthful. Think about this. We are averaging the changes, not averaging the absolute temperatures.
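The per-station step can be sketched in Python as follows. This is a minimal illustration, assuming a station record is a dictionary from year to temperature; the function name, the reference period and the sample values are hypothetical and not taken from any of the temperature products.

```python
def station_anomalies(readings, ref_start, ref_end):
    """Re-express one station's readings as differences from its own
    average over the reference period ref_start..ref_end (inclusive)."""
    reference = [t for year, t in readings.items() if ref_start <= year <= ref_end]
    if not reference:
        raise ValueError("station has no readings in the reference period")
    baseline = sum(reference) / len(reference)
    return {year: t - baseline for year, t in readings.items()}


# Hypothetical station with a reference period of 1951-1952.
readings = {1951: 10.0, 1952: 12.0, 1990: 12.5}
print(station_anomalies(readings, 1951, 1952))
# {1951: -1.0, 1952: 1.0, 1990: 1.5}
```

Only after every station has been converted this way, each against its own baseline, are the anomalies averaged across stations.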
Does this give us a better result? In our imaginary ideal world where we have lots of stations, always reporting all the time, no missing readings, etc., then these two methods will give the same result.
The difference arises when we work in an imperfect world. Here is an example (for simplicity I am only doing simple averages here rather than area weighted averages):
Let’s look at stations A to E. Let’s say their individual long term reference average temperatures are:
A = 15, B = 10, C = 5, D = 20 & E = 25
Then for one day’s data their individual readings are:
A = 15.8, B = 10.4, C = 5.7, D = 20.4 & E = 25.3
Using the simple Anomaly of Averages method from earlier we have:
(15.8+10.4+5.7+20.4+25.3)/5 – (15+10+5+20+25)/5 = 0.52
While using our Average of Anomalies method we get:
((15.8-15) + (10.4-10) + (5.7-5) + (20.4-20) + (25.3-25))/5 = 0.52
Exactly the same!
However, if we remove station C as in our earlier example, things look very different. Anomaly of Averages gives us:
(15.8+10.4+20.4+25.3)/4 – (15+10+5+20+25)/5 = 2.975 !!
While Average of Anomalies gives us:
((15.8-15) + (10.4-10) + (20.4-20) + (25.3-25))/4 = 0.475
Obviously both values don’t match what the correct value would be if station C were included, but the second method is much closer to the correct value. Bearing in mind that Teleconnection means that adjacent stations will have similar changes in anomaly anyway, this ‘Average of Anomalies’ method is much less sensitive to variations in station availability.
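The arithmetic of this example can be reproduced with a few lines of Python. The numbers are the ones given above; the function names are hypothetical, and the simple (not area weighted) average is used, as in the example.

```python
reference = {"A": 15.0, "B": 10.0, "C": 5.0, "D": 20.0, "E": 25.0}
readings = {"A": 15.8, "B": 10.4, "C": 5.7, "D": 20.4, "E": 25.3}


def anomaly_of_averages(readings, reference):
    """Average the available readings, then subtract the average of the
    reference values of ALL stations."""
    return (sum(readings.values()) / len(readings)
            - sum(reference.values()) / len(reference))


def average_of_anomalies(readings, reference):
    """Subtract each station's own reference value first, then average."""
    return sum(readings[s] - reference[s] for s in readings) / len(readings)


without_c = {name: t for name, t in readings.items() if name != "C"}

print(round(anomaly_of_averages(readings, reference), 3))    # 0.52
print(round(average_of_anomalies(readings, reference), 3))   # 0.52
print(round(anomaly_of_averages(without_c, reference), 3))   # 2.975
print(round(average_of_anomalies(without_c, reference), 3))  # 0.475
```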
Now let’s consider how this approach could be used when looking at station histories over long periods of time. Consider 3 stations in ‘adjacent’ locations. A has readings from 1900 to 1960. B has readings from 1930 to 2000 and C has readings from 1970 to today. A overlaps with B, B overlaps with C. But C doesn’t overlap with A. If our reference period is say 1930 – 1960, we can use the readings from A & B. But C doesn’t have any readings from our reference period. So how can we splice together A, B, & C to give a continuous record for this location?
Doesn’t this mean we can’t use C since we can’t reference it to our 1930-1960 baseline? And if we use a more recent reference period we lose A. Do we have to ignore C’s readings entirely? Surely that means that as the years roll by and the old stations disappear, eventually we will have no continuity to our record at all? That’s not good enough.
However there is a way we can ‘splice’ them together.
A & B have a common period from 1930-1960. And B & C have a common period from 1970-2000. So if we take the average of B from 1930 to 1960 and compare it to the same average from A for the same period we know how much their averages differ. Similarly we can compare the average of B from 1930-1960 to the average for B from 1970-2000 to see how much B has changed over the intervening period. Then we can compare B vs C over the 1970-2000 period to relate them together. Knowing these three differences, we can build a chain of relationships that links C(1970-2000) to B(1970-2000) to B(1930-1960) to A(1930-1960).
Something like this:
‘Chaining’ station histories together
If we have this sort of overlap we can ‘stitch together’ a time series stretching beyond more than one station’s data. We have the means to carry forward our data series beyond the life (and death) of any one station, as long as there is enough time overlap between them. But we can only do this if we are using our Average of Anomalies method. The Anomaly of Averages method doesn’t allow us to do this.
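The chaining idea can be sketched in Python for the link from C back to the 1930-1960 baseline. This is a simplified illustration, not the procedure of any particular temperature product: it assumes that B and C experienced the same long-term change (the Teleconnection argument above), and the function names and the synthetic station data are hypothetical.

```python
def mean_over(series, start, end):
    """Average of a station's readings for the years start..end (inclusive)."""
    values = [t for year, t in series.items() if start <= year <= end]
    if not values:
        raise ValueError("no readings in the requested period")
    return sum(values) / len(values)


def chain_to_baseline(b, c, base=(1930, 1960), overlap=(1970, 2000)):
    """Express station C as anomalies relative to the `base` period,
    although C has no readings there, by linking it through station B."""
    # How much B changed between the base period and the overlap period.
    b_change = mean_over(b, *overlap) - mean_over(b, *base)
    # Estimate of what C's base-period average would have been.
    c_baseline = mean_over(c, *overlap) - b_change
    return {year: t - c_baseline for year, t in c.items()}


# Synthetic stations: both warm by 0.01 degrees per year, C is 7 degrees colder.
b = {year: 10.0 + 0.01 * (year - 1930) for year in range(1930, 2001)}
c = {year: 3.0 + 0.01 * (year - 1930) for year in range(1970, 2011)}

c_anomalies = chain_to_baseline(b, c)
b_anomaly_2000 = b[2000] - mean_over(b, 1930, 1960)
print(round(b_anomaly_2000, 3), round(c_anomalies[2000], 3))
```

In this synthetic case the two stations give the same anomaly for the year 2000, even though their absolute temperatures differ by 7 degrees and C never reported during the reference period.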
So where has this got us in looking at our problems? The Average of Anomalies approach directly addresses problem #1. Area Weighted Averaging addresses problem #2. Teleconnection and comparing a station to itself helps us hugely with problem #3 – if fog provides local insulation, it probably always has, so any changes are less related to the local conditions and more to underlying climate changes. Local station bias issues still need to be investigated but if they don’t change over time, then they don’t introduce ongoing problems. For example, if a station is too close to an artificial heat source, then this biases that station’s temperature. But if this heat source has been a constant bias over the life of the station, then it cancels out when we calculate the anomaly for the station. So this method also helps us with (although doesn’t completely solve) problem #4. In contrast, using the Anomaly of Averages method, local station biases and erratic station availability will compound each other, making things worse.
So this looks like a better method.
Which is why all the surface temperature analyses use it!
The Average of Anomalies approach is used precisely because it avoids many of the problems and pitfalls.
In Part 1B I will look at how the main temperature records actually compile their trends.
@Smokey
“Dave Springer, The 1°C number is a ballpark figure, because we don’t know for certain what the sensitivity number is.”
Sensitivity is another word for feedbacks. 1C is what you get from a CO2 doubling in the absence of feedbacks and is a hard number that you can take to the bank in a dry atmosphere over dry land. Now you know. Of course there’s no such thing in the real world as a totally dry cloud-free atmosphere over a dry land so this is maximum theoretical no-feedback effect. Some arid regions may approximate it fairly well. Those regions will also approximate a black body fairly well too. Once liquid water or water vapor enters the picture all bets are off. Given that the earth is 71% covered in water that pretty much means that 71% of the bets are off. For instance, there is very little atmospheric greenhouse effect. The earth is warmer than the moon not because of its atmosphere but because the surface is 71% covered by water that averages 12,000 feet deep. The atmosphere’s primary role is establishing a surface pressure in which water has a wide temperature range in which it can exist as a liquid. If the ocean weren’t there this planet would be as cold as the moon which has an average temperature of -23C.
Of Averages & Anomalies, Part 1B
In Part 1A we looked at how a reasonable temperature record needs to be compiled. If you haven’t already read part 1A, it might be worth reading it before 1B.
There are four major surface temperature analysis products produced at present: GISTemp from the Goddard Institute for Space Studies (GISS); HadCRUT, a collaboration between the Met Office Hadley Centre and the University of East Anglia Climatic Research Unit (CRU); the analysis from the US National Oceanic and Atmospheric Administration’s (NOAA) National Climatic Data Center (NCDC); and the analysis from the Japan Meteorological Agency (JMA). Another major analysis effort is currently underway, the Berkeley Earth Surface Temperature Project (BEST), but as yet their results are preliminary.
GISTemp
We will look first specifically at the product from GISS, at how they do their Average of Anomalies, and their Area Weighting scheme. This product dates back to work undertaken at GISS in the early 1980s, with the principal paper describing the method being Hansen & Lebedeff 1987 (HL87).
The following diagram illustrates the Average of Anomalies method used by HL87
Reference Station method for comparing stations series
This method is called the ‘Reference Station Method’. One station in the region to be analysed is chosen as station 1, the reference station. The next stations are 2, 3, 4, etc., to ‘N’. The average for each pair of stations (T1, T2), (T1, T3), etc. is calculated over the common reference period using the data series for each station T1(t), T2(t), etc., where “t” is the time of the temperature reading. So for each station their anomaly series is the individual readings – Tn(t) – minus the average value of Tn.
“δT” is the difference between their two averages. Simply calculating the two averages is sufficient to produce two series of anomalies, but GISTemp then shifts T2(t) down by δT, combines the values of T1(t) and T2(t) to produce a modified T1(t), and generates a new average for this (the diagram doesn’t show this, but the paper does describe it). Why are they doing this? Because this is where their Area Averaging scheme is included.
When combining T1(t) and T2(t) together, after adjusting for the difference in their averages, they still can’t just add them because that wouldn’t include any Area Weighting. Instead, each reading is multiplied by an Area Weighting factor based on the location of each station; these two values are then added together and divided by the combined area weighting for the two stations. So series T1(t) is now modified to be the area weighted average of series T1(t) and T2(t). Series T1(t) now needs to be averaged again since the values will have changed. Then they are ready to start incorporating data from station 3 etc. Finally, when all the stations have been combined together, the average is subtracted from the now heavily-modified T1(t) series, giving us a single series of Temperature Anomalies for the region being analysed.
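The merging steps described above can be sketched roughly as follows. This is a simplified illustration under my own assumptions, not the GISTemp code: each station gets a single fixed weight, series are plain `{year: temperature}` dictionaries, and the function names `combine` and `regional_anomaly` are hypothetical.

```python
from statistics import mean

def combine(ref, ref_weight, new, new_weight):
    """Merge station `new` into the running reference series.

    ref, new: {year: temperature}; ref_weight: {year: accumulated weight};
    new_weight: the single (assumed positive) weight of the new station.
    """
    common = sorted(set(ref) & set(new))
    if not common:
        return ref, ref_weight            # no overlap: cannot relate them
    # delta_T: difference of the two averages over the common period
    delta_t = mean(new[y] for y in common) - mean(ref[y] for y in common)
    merged, weights = dict(ref), dict(ref_weight)
    for y, t in new.items():
        shifted = t - delta_t             # bring the new station onto ref's level
        w_old = weights.get(y, 0.0)
        old = merged.get(y, 0.0)
        merged[y] = (w_old * old + new_weight * shifted) / (w_old + new_weight)
        weights[y] = w_old + new_weight
    return merged, weights

def regional_anomaly(stations):
    """stations: list of (series, weight), longest record first."""
    (first, w0), rest = stations[0], stations[1:]
    ref, weights = dict(first), {y: w0 for y in first}
    for series, w in rest:
        ref, weights = combine(ref, weights, series, w)
    base = mean(ref.values())             # average of the combined series
    return {y: t - base for y, t in sorted(ref.items())}

# Two made-up stations with very different absolute temperatures
s1 = {1950: 10.0, 1951: 10.2, 1952: 10.4, 1953: 10.6}
s2 = {1952: 20.4, 1953: 20.6, 1954: 20.8}
anoms = regional_anomaly([(s1, 0.9), (s2, 0.3)])
```

In this toy case the second station is 10 °C warmer in absolute terms, yet after the shift by δT it simply extends the first station's series by one year without introducing a jump.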
So how are the Area Weighting values calculated? And how does GISTemp then average out larger regions or the entire globe?
They divide up the Earth into 80 equal area boxes – this means each box has sides of around 2500km. Then within each box they divide these up into 100 equal area smaller sub-boxes.
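The box sizes quoted above can be checked with a few lines of arithmetic. The sketch assumes a spherical Earth with a mean radius of 6371 km and treats each box as if it were a square, which the real equal-area boxes are not; it is only meant to confirm the order of magnitude.

```python
import math

EARTH_RADIUS_KM = 6371.0                        # assumed mean radius
surface_km2 = 4 * math.pi * EARTH_RADIUS_KM ** 2

box_area_km2 = surface_km2 / 80                 # 80 equal-area boxes
box_side_km = math.sqrt(box_area_km2)           # side of an equivalent square

sub_box_area_km2 = box_area_km2 / 100           # 100 sub-boxes per box
sub_box_side_km = math.sqrt(sub_box_area_km2)

print(f"box side ~ {box_side_km:.0f} km, sub-box side ~ {sub_box_side_km:.0f} km")
```

Under these assumptions each sub-box is roughly a tenth of the box side across, so the 8000 sub-boxes are each a few hundred kilometres wide.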
GISTemp Grids
They then calculate an anomaly for each sub-box using the method above. Which stations get included in this calculation? Every station within 1200 km of the centre of the sub-box. And the weighting for each station used simply diminishes in proportion to its distance from the centre of the sub-box. So a station 10km from the centre will have a weighting of 1190/1200 = 0.99167, while a station 1190 km from the centre will have a weighting of 10/1200 = 0.00833. In this way, stations closer to the centre have a much larger influence while those farther away an ever smaller influence. And this method can be used even if there are no stations directly in the sub-box, inferring its result from surrounding stations.
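As a small illustration of the linear distance weighting just described, the sketch below reproduces the two example weights and then forms a weighted average for one sub-box. It is a simplification: it applies the weights directly to ready-made anomalies rather than folding them into the station-merging procedure, and the function names are my own.

```python
RADIUS_KM = 1200.0

def station_weight(distance_km, radius_km=RADIUS_KM):
    """Weight falls linearly from 1 at the centre to 0 at the radius."""
    return max(0.0, 1.0 - distance_km / radius_km)

def sub_box_anomaly(stations):
    """stations: list of (distance_km, anomaly). Returns None if nothing is in range."""
    pairs = [(station_weight(d), a) for d, a in stations]
    total = sum(w for w, _ in pairs)
    if total == 0:
        return None
    return sum(w * a for w, a in pairs) / total

near = station_weight(10)      # 1190/1200
far = station_weight(1190)     # 10/1200
example = sub_box_anomaly([(0, 1.0), (600, 0.0)])
```

A station at the centre with an anomaly of 1.0 and a station 600 km away with an anomaly of 0.0 give a sub-box value of about 0.67, so the closer station dominates without silencing the other.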
In the event that stations are extremely sparse and there were only 1 station within 1200 km, then that reading would be used for a sub-box. But as soon as you have even a handful of stations within range, their values will quickly start to balance out the result. And closer stations will tend to predominate. Then the sub-boxes are simply averaged together to produce an average for the larger box – we can do this without any further area averaging because we have already used area averaging within the sub-box and they are all of equal area. Then in turn the larger boxes can be averaged to produce results for latitude bands, hemispheres, or globally. Finally these results are then averaged over long time periods.
Remember our previous discussion of Teleconnection, and that long term climates are linked over significant distances. This is why this process can produce a meaningful result even when data is sparse. On the other hand, if we were trying to use this method to estimate daily temperatures in a sub-box, the results would be meaningless. The short term chaotic nature of daily weather would swamp any longer range relationships. But averaged out over longer time periods and larger areas, the noise starts to cancel out and underlying trends emerge. For this reason, the analysis used here will be inherently more accurate when looked at over larger times and distances. The monthly anomaly for one sub-box will be much less meaningful than the annual anomaly for the planet. And the 10-year average will be more meaningful again.
And why the range of 1200 km? This was determined in HL87 based on the correlation coefficients between stations shown in the earlier chart. The paper explains this choice:
“The 1200-km limit is the distance at which the average correlation coefficient of temperature variations falls to 0.5 at middle and high latitudes and 0.33 at low latitudes. Note that the global coverage defined in this way does not reach 50% until about 1900; the northern hemisphere obtains 50% coverage in about 1880 and the southern hemisphere in about 1940. Although the number of stations doubled in about 1950, this increased the area coverage by only about 10%, because the principal deficiency is ocean areas which remain uncovered even with the greater number of stations. For the same reason, the decrease in the number of stations in the early 1960s, (due to the shift from Smithsonian to Weather Bureau records), does not decrease the area coverage very much. If the 1200-km limit described above, which is somewhat arbitrary, is reduced to 800 km, the global area coverage by the stations in recent decades is reduced from about 80% to about 65%.”
Effect of station count on area coverage
It’s a trade-off between how much coverage we have of the land area and how good the correlation coefficient is. Note that the large increase in contributing station numbers in the 1950s and subsequent drop off in the mid-1970s does not have much of an impact on percentage station coverage – once you have enough stations, more doesn’t improve things much. And remember, this method only applies to calculating surface temperatures on land; ocean temperatures are calculated quite separately. When calculating the combined Land-Ocean temperature product, GISTemp uses land-based data in preference as long as there is a station within 100 km. Otherwise it uses ocean data. So in large land areas with sparse station coverage, it still calculates using the land-based method out to 1200 km. However, for an island in the middle of a large ocean, the land-based data from that island is only used out to 100 km. After that point, the ocean based data prevails. In this way data from small islands don’t influence the anomalies reported for large areas of ocean when ocean temperature data is available.
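The land-versus-ocean preference described above can be written as a simple decision rule. This is only my reading of the description in this post, with a hypothetical function name and inputs; the actual GISTemp implementation may differ in its details.

```python
def choose_anomaly(nearest_station_km, land_anomaly, ocean_anomaly):
    """Pick the value used for one grid cell in a combined land-ocean product.

    land_anomaly / ocean_anomaly are None when that source has no data.
    """
    # Land data is preferred when a station is within 100 km.
    if land_anomaly is not None and nearest_station_km <= 100:
        return land_anomaly
    # Otherwise ocean data is used where it exists.
    if ocean_anomaly is not None:
        return ocean_anomaly
    # With no ocean data (e.g. continental interiors), land data reaches out to 1200 km.
    if land_anomaly is not None and nearest_station_km <= 1200:
        return land_anomaly
    return None

island_cell = choose_anomaly(50, 0.3, 0.1)        # close to the island station
open_ocean_cell = choose_anomaly(500, 0.3, 0.1)   # ocean data takes over
interior_cell = choose_anomaly(500, 0.3, None)    # sparse land area, no ocean data
```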
One aspect of this method is the order in which stations are merged together. This is done by ordering the list of stations used in calculating a sub-box by those that have the longest history of data first, with the stations with shorter histories last. So they are merging progressively shorter data series into a longer series. In principle, the method used to select the order in which the stations are being processed could have a small effect on the result. Selecting stations closer to the centre of the sub-box first is an alternative approach. HL87 considered this and found that the two techniques produced differences that were two orders of magnitude smaller than the observed temperature trends. And their chosen method was found to produce the lowest estimate of errors. They also looked at the 1200 km weighting radius and considered alternatives. Although this produced variations in temperature trends for smaller areas, it made no noticeable difference to zonal, hemispheric, or global trends.
The Others
The other temperature products use somewhat simpler methods.
HadCRUT (or really CRU, since the Hadley Centre contribution is Sea Surface Temperature data) calculates based on grid boxes that are xº by xº, with the default value being 5º by 5º. At the equator, this means they are approximately 550 x 550 km, although much smaller in the polar regions. They then take a simple average of all the anomalies for every station within that grid box. This is a much simpler area averaging scheme. Because they aren’t interpolating data like GISS, they are limited by the availability of stations as to how small their grid size can go, otherwise too many of their grids may have no station at all. And in grid boxes where there is no data available, they do not calculate anything. So they aren’t extrapolating / interpolating data. But equally, any large-scale temperature anomaly calculation such as for a hemisphere or the globe is effectively assuming that any uncalculated grid boxes are all actually at the calculated average temperature. However, to then build results for larger areas, they need to area weight the averages for the differing sizes of the grid boxes depending on the latitude they are at.
NCDC and JMA also use a 5º by 5º grid size and simply average anomalies for all stations within that grid; area weighted averaging is then used to average grid boxes together. All three also combine land and sea anomaly data. Unlike HadCRU & NCDC, which use the same ocean data, JMA maintain their own separate database of SST data.
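To see why latitude-longitude grid boxes need area weighting, here is a minimal sketch. It assumes a spherical Earth, on which the area of a box of fixed longitude width is proportional to the difference of the sines of its bounding latitudes; the function names are mine and the numbers are made up.

```python
import math

def box_weight(lat_south, lat_north):
    """Relative area of a lat-lon box of fixed longitude width on a sphere."""
    return math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))

def area_weighted_mean(boxes):
    """boxes: list of (lat_south, lat_north, anomaly or None)."""
    num = den = 0.0
    for south, north, anomaly in boxes:
        if anomaly is None:
            continue                      # empty boxes are skipped, not filled in
        w = box_weight(south, north)
        num += w * anomaly
        den += w
    return num / den if den else None

ratio = box_weight(0, 5) / box_weight(85, 90)   # equatorial vs polar 5-degree box
mixed = area_weighted_mean([(0, 5, 0.0), (85, 90, 1.0)])
```

A 5º box at the equator covers more than twenty times the area of one next to the pole, so a simple unweighted mean of grid boxes would greatly over-represent the polar regions. Skipping an empty box, as in the loop above, is equivalent to assuming it sits at the average of the boxes that do have data.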
In this article and Part 1A, I have tried to give you an overview of how and why surface temperatures are calculated, and why calculating anomalies then averaging them is far more robust than what might seem the more intuitive method of averaging then calculating the anomaly.
In parts 2A & 2B we will look at the implications of this for the various questions and criticisms that have been raised about the temperature record.
Of Averages & Anomalies Part 2A
In Part 1A and Part 1B we looked at how surface temperature trends are calculated, the importance of using Temperature Anomalies as your starting point before doing any averaging and why this can make our temperature record more robust.
In this Part 2A and a later Part 2B we will look at a number of the claims made about ‘problems’ in the record and how misperceptions about how the record is calculated can lead us to think that it is more fragile than it actually is. This should also be read in conjunction with earlier posts here at SkS on the evidence here, here & here that these ‘problems’ don’t have much impact. In this post I will focus on why they don’t have much impact.
If you hear a statement such as ‘They have dropped stations from cold locations so the result now gives a false warming bias’ and your first reaction is, yes, that would have that effect, then please, if you haven’t done so already, go and read Part 1A and Part 1B then come back here and continue.
Part 2A focuses on issues of broader station location. Part 2B will focus on issues related to the immediate station locale.
Now to the issues. What are the possible problems?
Urban Heat Island Effect
This is the first possible issue we will consider. The Urban Heat Island Effect (UHI) is a real phenomenon. In built-up urban areas the concentration of heat-storing materials such as concrete, bitumen and bricks in buildings and roads, together with heat sources such as heaters, air-conditioners, lighting and cars, combine to produce a local ‘heat island’: a region where temperatures tend to be warmer than the surrounding rural land. This is well-known and you can even see its effects just looking at reports of daily temperatures. If we have weather stations inside such a heat island they will record higher temperatures than they would if they were in the surrounding countryside. If we don’t make some sort of compensation for this then this could be a real source of bias in our result, and since we never see ‘cool islands’, its bias would be towards warming.
This is why the major temperature records include some method for compensating for it, either by applying a compensating adjustment to the broad result they calculate, or by trying to identify stations that have such an issue and adjusting them. GISTemp for example seeks to identify such urban stations and then adjust them so that the urban station’s long-term trend is brought into line with adjacent rural stations. There is also the question of identifying which stations are ‘urban’. Previous methods relied on records of how stations were classified, but this can change over time as cities grow out into the country, for example. GISTemp recently started using satellite observations of lights at night to identify urban regions – more light means more urban.
What other factors might limit or exaggerate the impact a heat island might have?
Is the UHI effect at the station growing, changing over time? Has the UHI effect in a city got steadily warmer, or has the area that is affected by the heat island expanded while the magnitude of the effect hasn’t changed? This will depend on things like how the density of the city changes, what sort of activities happen where, etc. For a station that has always been inside a city, say an inner city university, it will only be affected if the magnitude of the heat island effect increases. If the UHI at a site stays constant, then that isn’t a bias to the trend.
On the other hand, a previously rural station that has been engulfed by an expanding city will most definitely feel some warming and will show a trend during the period of its engulfing, although again how much will depend on circumstances. And this will look like a trend. If it has been engulfed by low density suburbia and its piece of ‘country’ has been preserved as a large park around it, the impact will be lower than if a complete satellite city has sprung up around it and it is on the pavement next to a 6 lane expressway.
But remember three things. The existing products include a compensation to try to remove UHI. UHI only impacts our long term temperature results if the magnitude of the effect is growing. And each station’s data still has to be added to the results for all other stations using Area Weighted Averaging. Since the vast majority of the Earth’s land area isn’t urban, UHI can only have a limited impact on the final result anyway. And the oceans, which are 70% of the Earth’s surface, aren’t affected by UHI.
Airports
One particular example sometimes cited is the number of stations located at airports, with images being painted of ‘all those hot jet exhausts’ distorting the results. Firstly we are interested in daily average temperatures not instantaneous values. So the station would need to get hit by a lot of jets.
Think about a medium-sized airport. At its busiest it might have one aircraft movement (take off or landing) per minute. Each takeoff involves less than a minute at full power while the rest of the take off and landing, 10 minutes or so of taxiing, is at relatively low power. For the rest of the one to several hours that the aircraft is on the ground, its engines are off. So for each jet at the airport, its average power output over its entire stay there is a very tiny percentage of its full power. And many airports have night-time curfews when no aircraft are flying. So how much do the jets contribute to any bias?
Consider instead that the airport is like a mini-city – buildings and lots of concrete and bitumen tarmac. But also lots of grassed land in between. So the real impact of an airport on any station located there will be more like a mini-UHI effect. But how much does an airport grow? Usually they have a fixed area of land set aside for them. The number of runways and taxiways doesn’t change much. And the area of apron around the terminal buildings doesn’t change that much over time. So the magnitude of this UHI effect is unlikely to change greatly over time unless the airport is growing rapidly.
If an airport is located in a rural area then any changes to the climate in the airport are going to be moderated by effects from the surrounding countryside, since it is after all a mini-city, not a city. If an airport has always been inside an urban area, such as LaGuardia in New York, then it is going to be adjusted for by the UHI compensations described above. And a rural airport that has been enveloped by its city will eventually have a UHI compensation applied. So the airports that are most likely to have a significant impact need to be and remain rural, be so big that moderating effects from the surrounding countryside don’t have much effect, and be expanding so that their bias keeps growing and thus isn’t compensated out by the analysis method. Then they need to dominate the temperature record for large areas, with few other adjacent stations. And then there are no airports on the oceans. So any airport that is likely to have an impact needs to be near a large growing city to generate the large and increasing traffic volumes to cause the airport to be large and growing, in a region that is sparsely populated otherwise so there are few other stations. And most large growing cities tend to be near other such cities.
Islands
There is one special case sometimes cited in relation to GISTemp: islands. If the only station on an island in the ocean is at an airport or has ‘problems’, that island’s data will then supposedly be used for the temperature of the ocean up to 1200 km away in all directions, extending any problems over a large area. This claim is missing one key point: the temperature series used to determine global warming trends is the combined Land and Ocean series. And when land data isn’t available, such as around an island, ocean data is substituted instead.
This is some data from a patch of ocean in the South Pacific (OK, it’s from around Tahiti, I’m a sucker for exotic locations). I calculated this by using GISTemp to calculate temperature anomalies for grids around the world for 1900 to 2010, using consecutively land only data, ocean only data and combined land & ocean data. I then calculated from the three values obtained from each grid point the percentage contribution to the combined land/ocean data of each of the two sources. The following graph shows the percentage contribution of the land data at each grid point. And for reference below I have listed the temperature stations in the area with their Lat/Long. Obviously this isn’t coming just from land only data and in grids too far from land the % contribution of land data falls to zero. Each 2º by 2º grid is approximately 200 x 200 km, much less than the 1200km averaging radius used by GISTemp.
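One way to back out such a percentage, sketched here in Python, is to assume that the combined value at a grid point is a simple blend of the land-only and ocean-only values, combined = f × land + (1 − f) × ocean, and solve for f. That blending assumption and the function name are mine, introduced only to illustrate the idea; the numbers are made up and are not the Tahiti data.

```python
def land_fraction(land, ocean, combined):
    """Fraction of the combined anomaly attributable to land data.

    Assumes combined = f * land + (1 - f) * ocean. Any input may be None
    when that product has no value at the grid point.
    """
    if land is None:
        return 0.0                 # nothing but ocean data available
    if ocean is None:
        return 1.0                 # nothing but land data available
    if land == ocean:
        return None                # the blend cannot be determined
    f = (combined - ocean) / (land - ocean)
    return min(1.0, max(0.0, f))   # clamp rounding noise into [0, 1]

near_island = land_fraction(1.0, 0.0, 0.25)    # a quarter land, three quarters ocean
far_from_land = land_fraction(None, 0.2, 0.2)  # no land value at all
```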
% Land Contribution around Tahiti
There aren’t enough stations
A common criticism is that there aren’t enough temperature stations to produce a good quality temperature record. A related criticism is that the number of stations being used in the analysis has dropped off in recent decades and that this might be distorting the result. On the Internet comments such as ‘Do you know how many stations they have in California?’ – By implication not enough – are not uncommon. This seems to reflect a common misperception that you need large numbers of stations to adequately capture the temperature signal with all its local variability.
However, as I discussed in Part 1A, the combination of calculating based on Anomalies and the climatological concept of Teleconnection means that we need far fewer stations than most people realise to capture the long-term temperature signal. If this isn’t clear, perhaps re-read Part 1A.
So how few stations do we need to still get an adequate result? Nick Stokes ran an interesting analysis using just 61 stations with long reporting histories from around the world. His results, plotted against CRUTEM3, are obviously much noisier than the full global temperature record, yet they still produce a recognisably similar temperature curve, even with just 61 stations worldwide!
Just 61 Stations!
Just 61 Stations – Smoothed!
So even a handful of stations gets you quite close. What reducing station numbers does is diminish the smoothing effect that lots of stations give. But the underlying trend remains quite robust even with far fewer stations. What is perhaps more important is whether the reduction in station numbers reduces ‘station coverage’ – the percentage of the land surface with at least one station within ‘x’ kilometres of that location. But as we discussed in Part 1A, Teleconnection means that ‘x’ can be surprisingly large and still give meaningful results. And with Anomaly based calculations, the absolute temperature at the station isn’t relevant; it is the long term change in the station we are working with.
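The idea of ‘station coverage’ can be sketched in a few lines. The example assumes a spherical Earth of radius 6371 km, uses the standard haversine formula for great-circle distance, and treats the sample points as if each stood for an equal area of land; the coordinates are invented and the function names are my own.

```python
import math

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def coverage(points, stations, x_km):
    """Fraction of sample points with at least one station within x_km."""
    covered = sum(
        1 for p in points
        if any(great_circle_km(p[0], p[1], s[0], s[1]) <= x_km for s in stations)
    )
    return covered / len(points)

points = [(0.0, 0.0), (0.0, 20.0)]   # hypothetical land locations (lat, lon)
stations = [(0.0, 1.0)]              # a single hypothetical station

cov_1200 = coverage(points, stations, 1200)
cov_2500 = coverage(points, stations, 2500)
```

With a single station, widening ‘x’ from 1200 km to 2500 km takes coverage of these two points from half to all of them, which is the trade-off between coverage and correlation that the choice of radius has to balance.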
The Thermometers are Marching!
A related criticism is that the decline in used station count has disproportionately removed stations from colder climates and thus introduced a false warming bias to the record. This has been labelled “The March of the Thermometers”, with the secondary ‘conspiracy theory’ type claim that this is intentional, all part of the ‘fudging’ of the data. This can seem intuitively reasonable – surely if you remove cold data from your calculations the result will look warmer. And if that is the result then, hey, that could be deliberate.
But the apparent reasonableness of this idea rests on a mathematical misconception which we discussed in detail in Part 1A. If we average together the absolute temperatures from all the sites then most certainly removing colder stations would produce a warm bias. Which is one of the most important reasons why it isn’t done that way! Using that approach (what I called the Anomaly of Averages method) would produce a very unstable, unreliable temperature record indeed.
Instead what is done is to calculate the Anomaly for each station relative to its own history then average these anomalies (what I called the Average of Anomalies method).
Since we are interested in how much each station has changed compared to itself, removing a cold station will not cause a warming bias. Removing a cooling station would! The hottest station in the world could still be a cooling station if its long term average was dropping. 50 °C down to 49 °C is still cooling. Removing that station would add a warming bias. However, removing a station whose average has gone from -5 °C up to -4 °C would add a cooling bias since you have removed a warming station.
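The numbers in this example are easy to check. The sketch below uses the two stations mentioned above plus a third, unchanging station that I have added for illustration, and compares the two methods when a station drops out of the later period.

```python
from statistics import mean

# Baseline-period averages for three hypothetical stations (deg C)
baseline = {"hot_cooling": 50.0, "mild_steady": 15.0, "cold_warming": -5.0}
# Later-period averages: the hot station cooled by 1, the cold one warmed by 1
later = {"hot_cooling": 49.0, "mild_steady": 15.0, "cold_warming": -4.0}

def anomaly_of_averages(base, recent):
    """Average the absolute temperatures first, then take the difference."""
    return mean(recent.values()) - mean(base.values())

def average_of_anomalies(base, recent):
    """Take each surviving station's change against itself, then average."""
    return mean(recent[name] - base[name] for name in recent)

without_cold = {k: v for k, v in later.items() if k != "cold_warming"}
without_hot = {k: v for k, v in later.items() if k != "hot_cooling"}

spurious = anomaly_of_averages(baseline, without_cold)        # 12.0
drop_warming = average_of_anomalies(baseline, without_cold)   # -0.5
drop_cooling = average_of_anomalies(baseline, without_hot)    # 0.5
```

Averaging absolute temperatures turns the loss of the cold station into a spurious 12 °C jump. Averaging anomalies gives only a small shift, and its sign depends on whether the lost station was warming or cooling, not on whether it was hot or cold.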
We are averaging the changes in the stations, not their absolute values. And remember that Teleconnection means that stations relatively close to each other tend to have climates that follow each other. So removing one station won’t have much effect if ‘adjacent’ stations are showing similar long term changes. So for station removals to add a definite warming bias we would need to remove stations that have or are showing less warming, remove other adjacent stations that might be doing the same, but leave any stations that are showing more warming. If this station removal was happening randomly, there is no reason to think that any effect from this would be anything other than random, not a bias.
If this were part of some ‘wicked scheme’, then the schemers would need to carefully analyse all the world’s stations, look for the patterns of warming so they could cherry-pick the stations that would have the best impact for their scheme, and then ‘arrange’ for those stations to become ‘unavailable’ from the supplier countries, while leaving the stations that support their scheme in place. And why would anyone want to remove stations in the Canadian Arctic for example as part of their ‘scheme’? Some of the highest rates of warming in the world are happening up there. Why remove them to make the warming look higher? Maybe someone is scheming. I’ll let you think about how likely that is.
But what if the pattern of station removals is driven by other factors – physical accessibility of the stations, operating budgets to keep them running etc.? Wouldn’t the stations more likely to be dropped be the ones in remote, difficult to reach, and thus expensive locations? Like mountains, arctic regions, poorer countries? Which are substantially where the ‘biasing’ stations are alleged to have disappeared from. If you drop ‘difficult’ stations you are very likely to remove Arctic and Mountain stations.
Could it also be that the people responsible for the ongoing temperature record realise that you don’t need that many stations for a reliable result and thus aren’t concerned about the decline in station numbers – why keep using stations that aren’t needed if they are harder to work with?
For example, here are the stats on stations used by GISTemp. The number of stations rose during the 60s and dropped off during the 90s but percentage coverage of the land surface only dropped off slightly. Where coverage is concerned, it’s not quantity that counts but quality.
Coverage from GISS
Station coverage from GISTemp
GISTemp ‘extrapolates’ 1200 kilometers
One particular criticism made of the GISTemp method is that ‘they use temperatures from 1200 km away’, usually spoken with a tone of incredulity and some suggestion that this number was plucked out of thin air.
Station Correlation Scatter Plots (HL87)
As explained in Part 1A and Part 1B, the 1200 km area weighting scheme used by GISTemp is based on the known and observed phenomenon of Teleconnection: that climates are connected over surprisingly long distances. The 1200 km range used by GISTemp was determined empirically to give the best balance between correlation between stations and area of coverage.
Figure 3, from Hansen & Lebedeff 1987 (apologies for the poor quality, this is an older paper) plots the correlation coefficients versus separation for the annual mean temperature changes between randomly selected pairs of stations with at least 50 common years in their records. Each dot represents one station pair. They are plotted according to latitude zones: 64.2-90N, 44.4-64.2N, 23.6-44.4N, 0-23.6N, 0-23.6S, 23.6-44.4S, 44.4-64.2S.
When multiple stations are located within 1200 km of the centre of a grid point, the value calculated is the weighted average of their individual anomalies. With the linear weighting described in Part 1B, a station 10 km from the centre has about six times the weighting of a station 1000 km from the centre (roughly 0.99 against 0.17). And as discussed under the section on islands previously, for small islands, the ocean data predominates, not the land data.
One area of some contention is temperatures in the Arctic Ocean. Unlike the Antarctic, the Arctic does not have temperature stations out on the ice. So the nearest temperature stations are on the coast around the Arctic Ocean, Greenland and some islands. And ocean temperature data can’t be used instead since this is not available for the Arctic Ocean.
Other temperature products don’t calculate a result for the Arctic Ocean. The result is that when compiling the Global trend, the headline figure most people are interested in, this method effectively assumes that the Arctic Ocean is warming at the same rate as the global average. Yet we know the land around the Arctic is warming faster than the global average, so it seems unreasonable to suggest that the ocean isn’t. Satellite temperature measurements up to 82.5 N support this, as does the decline of Arctic sea ice here, here & here.
So it seems reasonable that the Arctic Ocean would be warming at a rate comparable to the land. Since the GISTemp method is based on empirical data regarding teleconnection, projecting this out seems to me the better option since we know the alternative method will produce an underestimate. Many parts of the Arctic Ocean are significantly less than 1200 km from land, with the main region where this isn’t the case being between Alaska & East Siberia and the North Pole.
Certainly the implied suggestion that GISTemp’s estimates of Arctic Ocean anomalies are false isn’t justified. It may not be perfect but it is better than any of the alternatives.
In this post we have looked at some of the reasons why the temperature trend may be more robust than it might seem with respect to factors affecting the broader region in which stations are located. The method used to calculate temperature trends does seem to provide good protection against these kinds of problems.
In Part 2B we will continue, looking at issues very local to a station and why these aren’t as serious as many might think…
Of Averages & Anomalies Part 2B
In Part 1A and Part 1B we looked at how surface temperature trends are calculated, the importance of using Temperature Anomalies as your starting point before doing any averaging and why this can make our temperature record more robust.
In Part 2A and in this Part 2B we will look at a number of the claims made about ‘problems’ in the record, and how misperceptions about how the record is calculated can lead us to think that it is more fragile than it actually is. This should also be read in conjunction with earlier posts here at SkS on the evidence here, here & here that these ‘problems’ don’t have much impact. In this post I will focus on why they don’t have much impact.
If you hear a statement such as ‘They have dropped stations from cold locations so the result now gives a false warming bias’ and your first reaction is, yes, that would have that effect, then please, if you haven’t done so already, go and read Part 1A and Part 1B then come back here and continue.
Part 2A focused on issues of broader station location. Part 2B focuses on issues related to the immediate station locale.
Now to the issues. What are the possible problems?
Problems with ‘bad’ stations
One issue that has received considerable attention is the question of the ‘quality’ of surface observation stations, particularly in the US. How well do the stations in the observation network meet quality standards with respect to location and avoidance of local biasing issues, and how much might this affect the accuracy of the temperature record?
The upshot of investigations into this is that, at least in the US, a substantial proportion of stations have poor location quality ratings. However, analysis of the impact of the site quality problems by a number of independent analysts suggests that these problems have had almost no impact on the accuracy of the long term temperature record. How could this be? Surely that is the whole point of these quality rankings – poor quality sites can give bad results. So why wouldn’t they?
The definition of the best quality sites, Category 1 is as follows:
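“Flat and horizontal ground surrounded by a clear surface with a slope below 1/3 (<19º). Grass/low vegetation ground cover <10 centimeters high. Sensors located at least 100 meters from artificial heating or reflecting surfaces, such as buildings, concrete surfaces, and parking lots. Far from large bodies of water, except if it is representative of the area, and then located at least 100 meters away. No shading when the sun elevation >3 degrees.”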
Down to Category 5:
“(error ≥ 5ºC) – Temperature sensor located next to/above an artificial heating source, such as a building, roof top, parking lot, or concrete surface”
Let’s consider a few of these factors. And remember, we are interested in factors that have an impact on long-term changes in the temperature readings at a site. If a factor results in a bias in the reading but this bias does not change over time, then it will not affect the analysis since we are interested in changes – static biases get cancelled out in the analysis and have no long-term impact. Firstly, let’s look at the standard enclosure used for a meteorological measurement station – the Stevenson Screen:
Stevenson Screen
The screen is designed to isolate the instruments inside from outside influences, particularly radiant effects from its surrounds and rain. It is usually made from a material such as wood or similar that is a fairly good insulator and isn’t going to change temperature too much because of radiant heating/cooling from its surroundings. The double-slatted design suppresses air movement from wind through the enclosure, minimising wind chill effects and restricting rain entry onto the instruments. The double-slatted design also means that any air rising from beneath the enclosure isn’t being preferentially drawn into or out of the box. And the design of the base allows air movement from below while shielding from radiation from below.
So what are the problems which can change the category of a temperature monitoring station to lower than 1?
Slope > 19º
A problem may arise if the station isn’t located on sufficiently flat ground. This can produce air movements that are caused by temperature, resulting in warmer air possibly moving towards the station. However, unless there have been really major earthworks around the site, this factor doesn’t change over time and is unlikely to have a long-term changing impact.
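As a rough illustration of why a static bias like this drops out, the following Python sketch compares a hypothetical station with and without a constant offset. The temperature values, the 1.5 °C offset and the 30-year baseline are all invented for the example; this is not the code used by any temperature record, just the anomaly idea from Part 1A in miniature.

```python
def anomalies(series, base_start, base_end):
    """Subtract the station's own baseline-period mean from every value."""
    baseline = series[base_start:base_end]
    base_mean = sum(baseline) / len(baseline)
    return [t - base_mean for t in series]


def linear_trend(values):
    """Least-squares slope of the values, in units per time step."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


# Hypothetical yearly means for one site: 50 years warming at 0.02 degrees/year.
true_temps = [10.0 + 0.02 * year for year in range(50)]

# The same site with an assumed constant siting bias (e.g. a slope nearby).
STATIC_BIAS = 1.5
biased_temps = [t + STATIC_BIAS for t in true_temps]

true_anom = anomalies(true_temps, 0, 30)
biased_anom = anomalies(biased_temps, 0, 30)

# Largest difference between the two anomaly series (floating-point noise only).
max_diff = max(abs(a - b) for a, b in zip(true_anom, biased_anom))
print("largest anomaly difference:", max_diff)
print("trend without bias:", linear_trend(true_temps))
print("trend with bias:   ", linear_trend(biased_temps))
```

Because the offset is present in the baseline period as well as in every later year, it is subtracted out when the anomaly is formed, and the trend is the same either way.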
Grass/low vegetation ground cover >10 centimeters high
This can impact on air movements around the station. Also, if the vegetation changes substantially – low grass to shrubs and trees – then this could change water evaporation rates around the station and alter air temperatures. Major increases in vegetation might have a cooling effect on the station due to evaporative effects, while declines in vegetation back to Category 1 standards might have a warming impact. However, unless there is a regular and progressive change in the vegetation pattern around the station, this would not produce an ongoing change of any bias. If maintenance of vegetation around the station over its lifetime has been poor or erratic, then the bias may fluctuate up and down. This would create shorter term fluctuations in the bias but this would tend to cancel out in the longer term.
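The distinction between a bias that fluctuates and one that changes progressively can be sketched in a few lines of Python. Everything here is hypothetical: the flat "true" record, the repeating ±0.3 degree pattern standing in for erratic vegetation maintenance, and the 0.01 degrees/year drift standing in for steady growth. The point is only to show which kind of bias ends up in a long-term trend.

```python
def linear_trend(values):
    """Least-squares slope of the values, in units per time step."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


YEARS = 40
true_temps = [12.0] * YEARS  # hypothetical site with no real trend

# Bias that goes up and down as vegetation is cut, regrows, is cut again...
CYCLE = [0.3, 0.0, -0.3, 0.0]
fluctuating = [t + CYCLE[year % 4] for year, t in enumerate(true_temps)]

# Bias that grows steadily, year after year (assumed 0.01 degrees/year).
progressive = [t + 0.01 * year for year, t in enumerate(true_temps)]

fluct_slope = linear_trend(fluctuating)
drift_slope = linear_trend(progressive)
print("trend with fluctuating bias:", fluct_slope)
print("trend with progressive bias:", drift_slope)
```

In this toy case the fluctuating bias leaves a slope close to zero, while the progressive bias passes straight through into the trend. That is why only regular, progressive changes around a station are a concern for multi-year analysis.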
Shading when the sun elevation >3 degrees
If the degree to which the station and its surrounds are shaded over the course of the day changes, this can alter local heating. Primarily this is going to impact as a result of shading causing differing heating/cooling of the ground under/around the enclosure, resulting in changes in the temperature and flow rate of rising air up through the enclosure. Unless the cause of the shading varies over long, multi-year time frames such as trees growing or buildings rising, the shading effect is not a long-term changing biasing factor. Depending on the cause of the shading, this may cause changes in the bias over the course of a day and over the seasons, but as a multi-year bias, this would remain constant.
Not far enough from large bodies of water
This too is a static bias. The body of water would have a cooling effect due to evaporation that would vary with daily weather conditions and the seasons but would not be a multi-year biasing factor.
Static artificial heating sources
Essentially these are surfaces such as brick, concrete, bitumen, etc. that can act as local heat stores, storing more heat than normal grass-covered earth would, and that can then release heat either radiantly or by heating the surrounding air. These can be vertical structures, horizontal surfaces away from the enclosure, or a horizontal surface beneath the enclosure. The enclosures are designed to minimise radiant heat penetration into the enclosure from its surrounds, so the major impact of such static heating sources is going to be from heating surrounding air which may then pass through the enclosure. This will be worst when such a surface is very close to the enclosure, particularly beneath it, generating rising warmer air into the box.
Also an important factor will be the extent to which any such surfaces tend to form a partial ‘room’ around the enclosure, restricting horizontal air movement. Any such surface will tend to heat the air near/above it, causing that air to rise. More air is then drawn in to replace this, potentially flowing over or through the enclosure. If the distances involved and the geometry of the site result in this new air being warmer than the general surroundings, this could provide a warming bias for the site. Conversely if this replacement air is being drawn from a location that isn’t warmer then there may be no bias at all, possibly even a cooling effect. Ambient winds may also blow warmed air towards or away from the enclosure, depending on wind direction. And the effect of any such bias will vary over the course of the day and the seasons.
However, since the main source of any such bias is the amount and layout of such surfaces and sunlight, these biases won’t change over multi-year time frames unless the area of the surfaces is changing. This could be due to construction, or changes in shading of these surfaces such as by trees growing or building construction nearby. And some of these shading changes could actually reduce the bias over time, resulting in a long-term cooling trend. Also to be considered is whether the site is included within a region that is or becomes urban, in which case the UHI adjustments mentioned previously may cancel out any bias completely. And we still have to allow for area weighting of data from such a site when averaged over the Earth’s land surface. And this doesn’t affect the oceans at all.
Dynamic artificial heating sources
These are similar to the static surfaces, but they are things that actively pump heated air into the environment. Things such as Air Conditioner condensers, Exhaust fans, Heater flues, Cooling towers, Vehicle exhausts, etc. As with the static sources, a key issue here is geometry. They are generating hot air which will tend to rise unless winds blow it towards the enclosure. Does any such device actively blow warm air towards the enclosure? Or does its operation tend to draw air in from elsewhere and over the enclosure? How distant is the device and what is the geometry?
Also how long does the device operate for; 24/7 or intermittently? A station may be next to a large car park, but unless there is continuous activity, even thousands of cars have no extra impact if they are all parked and empty. Does an Air-conditioner run 24/7 or just 9-5 weekdays? Is it a reverse cycle A/C unit also used for heating in winter or at night, in which case it will pump out colder air then that doesn’t rise? How much do these activities vary with the seasons? And ultimately do these activities grow in magnitude over multi-year time frames? Otherwise they again contribute to short-term intra-annual biasing but not multi-year effects. And they may be cancelled out anyway by UHI compensations.
Conclusions about ‘Bad’ Stations
The US network certainly isn’t as good as it should be. There are certainly factors operating there that influence short term daily and seasonal readings and these may have important implications for use in daily Meteorological forecasting, which relies on absolute temperatures. However, for long term multi-year Climatological uses, it is perhaps easy to overestimate the impact of these problems.
It is easy to understand how our subjective impressions, standing near a poor quality site, seeing an A/C roaring away or feeling the radiant heat from a concrete parking lot nearby, could lead us to think this is a big issue. But the combination of the screening properties of the enclosures, long-term averaging, anomaly-based averaging, and UHI compensation will certainly tend to remove many biases that do not have long-term trend changes. And area averaging over the Earth’s land surface combined with the fact that most of the Earth is water reduces any impact even further.
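The dilution from area averaging can be put into back-of-envelope numbers. In the Python sketch below, the land fraction of roughly 29% is approximate, and both the share of land area affected and the size of the spurious trend are purely hypothetical figures chosen for illustration. It also assumes the ocean record is untouched by station siting and that the land average is a simple area-weighted mean.

```python
# Approximate share of the Earth's surface that is land.
LAND_FRACTION = 0.29

# Hypothetical: 10% of the land area is represented by stations that carry
# a spurious 0.5 degree change in bias over the period of interest.
AFFECTED_LAND_SHARE = 0.10
SPURIOUS_CHANGE = 0.5

# Effect on the area-weighted land average.
land_effect = AFFECTED_LAND_SHARE * SPURIOUS_CHANGE

# Effect on the global (land + ocean) average, with oceans unaffected.
global_effect = LAND_FRACTION * land_effect

print(f"effect on land average:   {land_effect:.4f} degrees")
print(f"effect on global average: {global_effect:.4f} degrees")
```

With these made-up inputs, a half-degree problem at the affected sites shrinks to a few hundredths of a degree in the land average and to less than that in the global figure. Different assumptions give different numbers, but the direction of the dilution is the same.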
So it isn’t surprising that the long term temperature trend data doesn’t seem to be significantly affected by station quality issues. That is not to say that there may not be noticeable impacts on shorter term measures – local and seasonal trends and possibly daily temperature range (DTR) effects for example. But for the headline Global Temperature Anomaly, which is a main indicator of Climate Change, station quality issues appear to be a very minor issue, something that ‘all comes out in the wash’.
Station Homogenisation issues
Finally we come to ‘Station Homogenisation’ – the process of reviewing station data records looking for errors that are a result of how the measurement was taken, rather than what the temperature actually was.
A common misconception is that ‘the thermometer never lies’. That the raw data is the gold standard. As anyone who works in any field involving instrumentation knows, this isn’t true; there are always ‘issues’ that you have to monitor for. Any instrument, even a simple thermometer will have its own built in biases.
Sometimes there will be readings that are just plain wacky. And surrounding influences can have an impact. A thermometer out in the sunshine will have a different reading from one shaded by your hand for a few minutes. A caretaker who can work quickly taking the readings when the enclosure door is open will produce a different bias from one who works slowly, or reads the instruments in a different order. Bias and error are everywhere.
If readings at a station weren’t always taken at the same time of day, this can introduce biases. Changes in the instruments used can introduce a bias. Some readings can be just plain wrong. Imagine some scenarios:
The caretaker of a station may have had a ‘big night out’ and not read the thermometer very accurately. There is an error there but we probably can’t detect it.
The caretaker of a station may have a ‘big night out’ every Friday night. Now there might be a regular error in Saturday’s readings. With a pattern like this, we might be able to detect it with statistical analysis. We might be able to correct it but only if we are certain enough.
That caretaker might have had one ‘REALLY big night out’ and next morning broke the thermometer. He replaced it but did he record that fact in the station log? If he did, we know that a change of bias has been introduced between the two thermometers. Then we can compare readings from before and after and try to find the change. But only after we have years’ worth of data from both thermometers. And if he didn’t log it, then we only spot a problem if that station seems to have a strange change compared to nearby stations.
Over time the Stevenson Screen may have fallen into disrepair, resulting in a slowly changing bias as outside influences start to penetrate. Then the site is updated with a new screen. Biases are removed, although the new screen may have its own small bias. If we know about the change we can try to compensate for it, eventually, when we have enough data from before and after.
The caretaker at the station in Ushuaia right at the southern tip of Argentina records data through the early 1900’s. In Spanish, with poor handwriting – really hard to tell 7’s and 9’s apart. The log sheets are sent to Buenos Aires where the data from this and many other stations are collated and typed up onto summary sheets by a clerk with an old battered typewriter. Then they are filed away; 40 years later they are extracted, faded and old, photocopied on a poor quality early copier and mailed to the US for incorporation into climate databases. Where they must be copied into the database again by hand. How many errors have crept in during that process?
So, we can’t simply take the raw data at face value. It has noise in it. We need to analyse this data looking for problems and correcting them when we are confident enough of the correction. But also being careful that we don’t introduce errors through unjustified corrections. This requires care and judgment and it is sometimes a real detective story. And often corrections cannot be made until many, many years later because you need lots of data before you can spot changes in bias.
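To make the broken-thermometer scenario concrete, here is a deliberately simplified Python sketch of estimating a change of bias by comparing a station with its neighbours before and after a logged change. All numbers are invented, and the station is made to track its neighbours exactly apart from the bias, which real data never does. This is an illustration of the idea only, not the homogenisation procedure used by any particular record.

```python
from statistics import mean

# Index of the year in which the station log says the thermometer was replaced.
CHANGE_INDEX = 10

# Hypothetical yearly anomalies averaged over nearby stations.
neighbours = [0.10, -0.05, 0.20, 0.00, 0.15, -0.10, 0.05, 0.25, 0.10, 0.00,
              0.30, 0.15, 0.20, 0.35, 0.10, 0.25, 0.40, 0.20, 0.30, 0.45]

# Our station: reads 0.2 high with the old thermometer, 0.6 high with the new.
station = [n + (0.2 if i < CHANGE_INDEX else 0.6)
           for i, n in enumerate(neighbours)]

# Differencing against the neighbours removes the shared regional signal.
diff = [s - n for s, n in zip(station, neighbours)]

# Estimated change of bias: mean difference after minus mean difference before.
step = mean(diff[CHANGE_INDEX:]) - mean(diff[:CHANGE_INDEX])

# Remove the step from the later readings so the record is consistent again.
corrected = [s - step if i >= CHANGE_INDEX else s
             for i, s in enumerate(station)]

print("estimated change of bias:", round(step, 3))
```

Note that the correction only removes the change in bias; the original static offset is left alone, since it does not affect the trend. With real, noisy data the two means are uncertain, which is why many years on either side of the change are needed before a correction can be made with confidence.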
So this process of working through the data, trying to make it more accurate is ongoing.
But what of its impact on the temperature record? Again, if the biases at a station don’t change over time, they don’t affect our analysis. Individual errors matter but they will tend to be random, some higher, some lower, so when we average over large areas and long time periods, they tend to cancel out. Again, it is problems that cause changing biases that matter. And analysis of changes due to Homogenisation in the record indicates that there are as many cooling changes as warming ones, such as this from Brohan et al 2006:
Homogenisation Distribution
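The claim that random individual errors tend to cancel when averaged can be checked with a small simulation. The Python sketch below assumes the errors are independent, centred on zero and have a standard deviation of 0.5 degrees. These are assumptions made for the example, not properties measured from any dataset, and errors that share a common sign would not cancel this way.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

ERROR_SD = 0.5        # assumed spread of individual reading errors (degrees)
N_READINGS = 10_000   # number of readings being averaged

# Simulated errors: some high, some low, with no preferred direction.
errors = [random.gauss(0.0, ERROR_SD) for _ in range(N_READINGS)]

# Typical size of the error in a single reading.
typical_single_error = mean([abs(e) for e in errors])

# Size of the error left over once all the readings are averaged together.
error_of_average = abs(mean(errors))

print("typical single-reading error:", round(typical_single_error, 3))
print("error of the average:        ", round(error_of_average, 4))
```

The individual errors are a sizeable fraction of a degree, while the error remaining in the average is far smaller, shrinking roughly with the square root of the number of readings.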
Conclusion
So, Part 1A looked at how we should calculate the temperature record and why the method used is very important to the result. And that this doesn’t necessarily match our intuitive idea of how it should be done; in this our intuition is often wrong. In Part 1B we looked at how we DO calculate the temperature record, that is, using the method outlined in Part 1A, and that the area weighting scheme used by one record is based on empirical evidence. In Part 2A we looked at some of the areas where the temperature record has been criticised with respect to its broader locale. And in this post we have explored issues related to the immediate surrounds of the station.
I think we have seen that there are many reasons why we tend to overestimate the effect of these problems. This conclusion is consistent with the evidence here, here & here from various analyses that show that these possible problems haven’t had any significant effect on the result.
My conclusion is that we can have a strong confidence in the results produced for the global temperature trend. Any problems will show up more in short-term patterns such as seasonal, monthly and daily trends. But the headline global numbers look pretty robust.
You will have to make up your own mind but I hope I have been able to give you some food for thought when you are thinking about this.
Brian
1. Scientific truth is not determined by holding a vote. It is determined by collecting and evaluating evidence. The history of science is littered with examples of where the consensus was wrong.
2. Not one of the major scientific institutions you allude to has actually balloted its members on CAGW. The committees just presume to speak on behalf of their membership.
3. See the paper from which the 97% agreement figure was derived. The question on man’s influence on the climate was not well-formed. It did not restrict itself to global rather than local effects, nor was it specific to CO2 ( as opposed to land use changes, etc ) and it also relied on the respondent’s interpretation of what was meant by “significant”.
4. The questionnaire was circulated to those working in “earth sciences”. So, no solar physicists ? There are quite a lot of people who think that big yellow thing in the sky might have something to do with the climate. No cosmologists – what about the cosmic ray theory? No specialists in thermodynamics, many of whom question the scientific principle of the greenhouse effect ? No botanists, whose stomata studies directly contradict the ice core CO2 records ? etc, etc.
5. Finally, the 97% represented only 75 scientists, all of whom were described as climatologists. This probably means they worked for the institutions which produce the climate models and, …. wait for it ….. they believe in their own models !
Have you heard of the Little Ice Age? Click that link and you will see two things. The first is the obligatory “I believe in global warming like you, so don’t cut off my funding!” message (see Climategate and its emails showing what is done to those who fail to support AGW). At the very least, this shows that you cannot say that the scientist here has an anti AGW bias. Second, there is this, yet another site (Wikipedia) that starts with the obligatory “we support AGW!” message, trying to say that the little ice age (according to the IPCC) was not worldwide, and then is followed by actual scientific evidence from, respectively, Europe, North America, Central America, Africa, Antarctica, Australia, New Zealand and Pacific Islands, and South America, which shows that the LIA was, indeed, worldwide, and that the IPCC was wrong or flat out lying. And if you look here, you see a 300 year record of 5 European cities showing about 150 years of cooling, followed by 150 years of warming. We also have records of the cold people wrote about then, such as the frost fairs held on the top of the ice of the frozen river Thames in England, and how George Washington dragged cannon across the frozen Hudson River. Conclusion, it was colder then. Then about 150 years ago, it started to warm up, and warmed up rapidly.
Thus, by the time we get to 1900 or so, it has warmed up from its formerly cold temperature, and since then the temperature has remained about the same. Thus we can clearly say “the world is warming”, since it is clearly warmer than the little ice age, and doubt modern, unprecedented warming since 1950, since we can see that for the last 100 years or a bit more, the temperature has not changed nearly as much as it did when the climate changed from the little ice age to the modern warm period.
To sum up:
There was a Little Ice Age, so called because it was cold.
It was consistently colder than now for quite some time, perhaps 300 years.
The Little Ice Age ended, and it warmed up rapidly.
For the last 100 years or more, in the USA at least, it has remained about the same, warmer than the Little Ice Age.
If global warming were true, it would not have remained consistently the same; it would have warmed rapidly after 1950. It did not.
This is according to the most reliable temperature measurement stations we have, the ones that have remained constant for 100+ years.
Or in one sentence, it has definitely warmed up since the Little Ice Age, and has remained fairly warm for over 100 years, but has not warmed much more in that 100 years from what it was 100 years ago.
You just need a little perspective, compared to 200 years ago, the world has warmed, compared to 100 years ago, the world has not warmed.
Erratum : the question did restrict itself to global effects. Sorry.
Legatus says:
October 24, 2011 at 3:45 pm
Or in one sentence, it has definitely warmed up since the Little Ice Age, and has remained fairly warm for over 100 years, but has not warmed much more in that 100 years from what it was 100 years ago.
You just need a little perspective, compared to 200 years ago, the world has warmed, compared to 100 years ago, the world has not warmed.
It’s the last hundred years temperature data they’d been playing with as in New Zealand, the second mentions Salinger’s connection with CRU: http://climaterealists.com/index.php?id=6151
http://www.climateconversation.wordshine.co.nz/tag/nz-temperature-records/
It’s taken around forty years to put the record straight..