GISS Step 1: Does it influence the trend?

Guest post by John Goetz

The GISStemp Step 1 code combines “scribal records” (multiple temperature records collected at presumably the same station) into a single, continuous record. There are multiple detailed posts on Climate Audit (including this one) that describe the Step 1 process, known affectionately as The Bias Method.

On the surface this seems like a reasonable concept, and in reading HL87 the description of the algorithm makes complete sense. In simple terms, HL87 says that (a rough sketch of the arithmetic follows the list):

  1. The longest available record is compared with the next longest record, and the period of overlap between the two records is identified.
  2. The average temperature during the period of overlap is calculated for each station.
  3. The difference between the average temperature for the longer station and the shorter station is calculated, and that difference (a bias) is added to all temperatures of the shorter record, bringing it in line with the longer one.
  4. The two records can now be combined as one, and the process repeats for additional records.
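For readers who want to see the arithmetic, here is a minimal sketch of that combination step in Python. The data layout (a dictionary of year to annual mean) and the function name are my own inventions for illustration; the real GISStemp Step 1 code works on monthly data and differs in detail.

# Rough sketch of the HL87 "bias method" described above (not the GISS code itself).
# A record is a dict mapping year -> annual mean temperature in degrees C.
def combine_records(longer, shorter):
    overlap = sorted(set(longer) & set(shorter))
    if not overlap:
        return dict(longer)  # nothing to align on
    # Step 2: average each record over the period of overlap.
    mean_longer = sum(longer[y] for y in overlap) / len(overlap)
    mean_shorter = sum(shorter[y] for y in overlap) / len(overlap)
    # Step 3: the bias is the difference between the two overlap averages.
    bias = mean_longer - mean_shorter
    # Step 4: shift every value in the shorter record by the bias and merge,
    # keeping the longer record's values where both have data.
    combined = {y: t + bias for y, t in shorter.items()}
    combined.update(longer)
    return combined

If the overlapping values really are identical, the two overlap averages match, the bias is zero, and the merge changes nothing; that is why the nonzero biases described below were a puzzle.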

In looking at numerous stations with multiple records, more often than not the temperatures during the period of overlap are identical, so one would expect the bias to be zero. However, we often see a slight bias existing in the GISS results for such stations, and over the course of combining multiple records, that bias can be several tenths of a degree.

This was one of Steve McIntyre’s many puzzles, and we eventually figured out why we were getting bias when two records with identical overlap periods were combined: GISStemp estimates the averages during the overlap period.

GISStemp does not take the monthly data during the overlap period and simply average it. Instead, it calculates seasonal averages from monthly averages (for example, winter is Dec-Jan-Feb), and then it calculates annual averages from the four seasonal averages. If a single monthly average is missing, the seasonal average is estimated. This estimate is based on historical data found in the individual scribal record. If two records are missing the same data point (say, March 1989), but one record covers 1900 – 1990 and the other 1987 – 2009, they will each produce a different estimate for March, 1989.  All other data points might match during the period of overlap, but a bias will be introduced nonetheless.
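Here is a rough sketch of that bookkeeping. The estimation rule below is deliberately simplified (a missing month is filled from that record's own long-term mean for the month); the real GISStemp estimate is more elaborate, but the point survives the simplification: two records with different spans will fill the same hole differently.

# Sketch only: annual mean built from four seasonal means (DJF, MAM, JJA, SON),
# where DJF for a given year uses December of the PREVIOUS year.
# "monthly" is a dict mapping (year, month) -> mean temperature; gaps are simply absent.
SEASONS = [[(-1, 12), (0, 1), (0, 2)],   # DJF
           [(0, 3), (0, 4), (0, 5)],     # MAM
           [(0, 6), (0, 7), (0, 8)],     # JJA
           [(0, 9), (0, 10), (0, 11)]]   # SON

def month_climatology(monthly, month):
    # This record's own long-term mean for a calendar month (the simplified estimate).
    vals = [t for (y, m), t in monthly.items() if m == month]
    return sum(vals) / len(vals)

def annual_mean(monthly, year):
    seasonal = []
    for season in SEASONS:
        vals = []
        for dy, m in season:
            key = (year + dy, m)
            if key in monthly:
                vals.append(monthly[key])
            else:
                # A missing month is estimated from this record's own history, so two
                # records covering different years produce different estimates.
                vals.append(month_climatology(monthly, m))
        seasonal.append(sum(vals) / 3.0)
    return sum(seasonal) / 4.0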

The GISS algorithm forces at least one estimation to always occur. The records used begin with January data, but the winter season includes the previous December. That December data point is always missing from the first year of a scribal record, which means the first winter season and first annual temperature in each scribal record are estimated. Thus, if two records overlap from January 1987 through December 1990 (a common occurrence), and all overlapping temperatures are identical, a bias will be applied because the 1987 annual temperature for the newer record will be estimated.
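Continuing the sketch above with invented numbers: both toy records below agree exactly over their 1987 – 1990 overlap, but only the older one contains December 1986, so the newer record's first winter, and hence its 1987 annual mean, must be estimated.

# Toy data: every shared month is 10.0 C; only the older record has Dec 1986 (-2.0 C).
older = {(1986, 12): -2.0}
older.update({(y, m): 10.0 for y in range(1987, 1991) for m in range(1, 13)})
newer = {(y, m): 10.0 for y in range(1987, 1991) for m in range(1, 13)}

print(annual_mean(older, 1987))  # 9.0  - uses the real December 1986 value
print(annual_mean(newer, 1987))  # 10.0 - December 1986 estimated from newer's own Decembers
# Every monthly value the two records share is identical, yet their 1987 annual
# means differ, so the overlap averages differ and a nonzero bias gets applied.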

Obviously, the bias could go either way: it could warm or cool the older records. With a large enough sample size, one would expect the average bias to be near zero. So what does the average bias really look like? Using the GISStemp logs from June, 2009, the average bias on a yearly basis across 7006 scribal records was:

[Figure: BiasAdjustment chart showing the average yearly bias across the 7006 scribal records]

David Ermer

Since the temperatures in the past were colder than they are today, this all makes sense.(?)
Reply: No … the temperatures in the past for stations with multiple records have been cooled by an average additional 0.08C. – John

Filipe

The net effect is a tenth of degree in more than 100 years, that’s not much. The “bias” goes to about zero after ~1990. Are you sure the curve isn’t simply due to rounding effects? That’s the kind of curve I’d expect if we had gained an extra digit (going from half a tenth accuracy to half an hundredth).
Not that I don't find the other aspect you talk about strange, that need to “always estimate.” Weird; we're talking about really bad programming skills there, and that is the kind of thing that could easily be avoided.

Sam Vilain

Perhaps you can also explain the relevance to the data series, given that the average adjustment is < 0.1°C?
Reply: I forget what the increase in global temperatures is purported to be since 1880, but I believe it is somewhere in the neighborhood of 0.8 C. Roughly 0.08 C – or 10% – seems to be due to the process of combining scribal records, and nothing more. You decide the significance of this single process step (one of many). – John

Allen63

I guess I’m confused because I have not studied more than a couple examples of original scribal data sheets.
Why would there frequently be two or more overlapping scribal records from the same station? Were two or more people reading the temperatures at different times on the same days and writing them down on separate lists?

blcjr

John, you’ve explained this well, and it certainly seems like a flawed method, but how much of the trend since 1880 does it actually account for? Just using some rough numbers, since 1880, the trend line rise in temps has been about 0.75° C. Looking at your chart, it looks like the trend line rise accounted for by this bias is about 0.10° C. Am I right (about the 0.10° C)? That would make the bias account for about 13% of the total supposed warming.
Does that sound like an in the ballpark estimate of the significance of this?

John F. Hultquist

I can understand why a researcher (data technician) might start down this path focusing on the desire for long-running series of temperature data. At some point that person or group should have stopped and asked a few questions of the sort: “What are we doing to the data?” or “What are the alternatives?” or “How do different time periods of lengthy warming or cooling influence the outcomes?”
If they did these things, it seems they chose a method that gave them a warming bias. If they did not do these things, then shame on them!
Are there no “facts” to work with in climate research?

Boris Gimbarzevsky

It should be easy enough to test whether this method always produces a positive bias: create a set of artificial records in which the values are drawn at random from three separate distributions, one where the mean trend is decreasing, another in which there is no trend, and one where the mean trend is upward. This would be a simple enough program to set up and run a few thousand iterations for each scenario. If the method used to combine records is unbiased then the final dataset should not differ significantly from the original dataset, for which we have the advantage of knowing what the generating function is.
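In code, the test might look something like this sketch; the trend values, noise level, record spans, and the combining function under test (for instance the combine_records sketch in the post above) are placeholders, not a claim about the actual GISS code.

import random

def synthetic_record(start, end, trend_per_year, base=10.0, noise=0.5):
    # One annual series with a known linear trend plus Gaussian noise.
    return {y: base + trend_per_year * (y - start) + random.gauss(0.0, noise)
            for y in range(start, end + 1)}

def average_offset(combine, trend_per_year, iterations=1000):
    # Average difference between the combined series and the known "truth"
    # over the years covered only by the newer record.
    total, count = 0.0, 0
    for _ in range(iterations):
        truth = synthetic_record(1900, 2009, trend_per_year)
        older = {y: truth[y] for y in range(1900, 1991)}
        newer = {y: truth[y] for y in range(1987, 2010)}
        combined = combine(older, newer)
        for y in range(1991, 2010):
            total += combined[y] - truth[y]
            count += 1
    return total / count

# Usage: an unbiased combining method should give an offset near zero in all three cases.
# for trend in (-0.01, 0.0, +0.01):   # cooling, flat, warming (degrees C per year)
#     print(trend, average_offset(combine_records, trend))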

Anthony, this is a bit OT. But don’t you know these guys?
http://www.ametsoc.org/policy/2009geoengineeringclimate_amsstatement.html
I know these guys too. They had a secret island full of hot compliant babes in “Our Man Flint” where they CONTROLLED THE WEATHER!!!!

Nelson

Perhaps more importantly, since the adjustment was essentially flat until 1940, then went almost straight up, all of the bias occurred between the previous warm period of the mid-30s and the mid-90s.
The net result is that the recent warming thru 1998 looks more dramatic relative to the 30s by the 0.07°C introduced via the adjustment process.

Mac

10% increase just from the process? If a business was fudging its books by 10% they would be dragged into a courtroom and prosecuted.
Off topic: http://scienceandpublicpolicy.org/images/stories/papers/originals/climate_money.pdf
tracked this back from Drudge. Basically it's a report on all the money spent by the gov't on hyping and researching climate change. Anthony Watts and Steve McIntyre are mentioned several times for their volunteer work. I wonder how the graph on page 3 compares to the recent rise in temperature.

The net effect is a tenth of degree in more than 100 years, that’s not much.
====
Ah but Filipe, the entire temperature increase over the past 100 years was less than 4/10 of one degree.
Are you telling me that an artificial bias in an artificial database deliberately inserted into the record by GISS has created 1/3 of the entire “global warming” that the most extreme of the AGW extremists can actually find?

David

Ten percent is significant.

Harold Vance

I’ll take the +0.01 Year 2020 leaps for a buck each.
GISS: Cooling the past to bring you a warmer and fuzzier future.

j.pickens

The important thing to note is that the creators of this scribal data averaging/estimating technique are either unaware of this bias, or aware of it.
If they are unaware of it, are they making efforts to correct their system?
If they are aware of it, why haven't they already corrected their system?
Either way, it does not look good.
Why do all these biases seem to accrue to the older-colder, newer-hotter side of the ledger?

steven mosher

Interesting. It might be worth passing this on to the guys doing the Clear Climate Code project. Step one could be rewritten to change the method of station combining and we could get a little bit closer to something that is accurate. As for the size of the bias, every little bit of improved accuracy counts. Kudos for your continued hard work on this wretched piece of code.

Bob D

Nice work, John. As you say, 10% of the claimed warming may just be a result of poor programming. Another reason not to trust GISS temperatures, in my view.
Ignoring the red trend line, there seem to be three distinct steps, at -0.07, -0.03 and just below zero. It would be interesting to drill down and see what happened at the step changes around 1940-52 and again at 1994-ish.

Bruce

A few steps with bias. Drop all the rural records. Only use urban weather stations and airports, pretend UHI is accounted for and voila … fabricated warming.

Richard Sharpe

I don’t know about GissTemp whatever, but something is up with the weather in the San Francisco Bay Area.
We are having very cold nights and cool days in the middle of summer.

anna v

“Reply: I forget what the increase in global temperatures is purported to be since 1880, but I believe it is somewhere in the neighborhood of 0.8 C. Roughly 0.08 C – or 10% – seems to be due to the process of combining scribal records, and nothing more. You decide the significance of this single process step (one of many). – John”
Every little bit helps, like the money in the church collection.
I presume these numbers are further corrected for UHI and then used for corrections over 1000kms? Once on the slippery road it keeps slipping :).

Filipe

Just to clarify my point. Consider you have a large set of points randomly distributed according to a uniform distribution between 0 and 10. The “true” average of these points is 5. Consider now that all the points are truncated to integers. The average of the truncated points is 4.5. If the points are instead truncated to the first decimal place, then the average is 4.95 and so on.
In a system with truncation, and with accuracy increasing with time, even with a truly flat trend one would get a positive slope just from the truncation. But I'm not sure this applies here: are these measurements treated with true rounding or simple truncation?
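A quick numerical check of what I mean (the uniform distribution and sample size are arbitrary, and this says nothing about what GISS actually does; it only shows why truncation, unlike rounding, shifts the mean):

import random

samples = [random.uniform(0.0, 10.0) for _ in range(100_000)]
true_mean = sum(samples) / len(samples)                                # about 5.00
trunc_int = sum(int(x) for x in samples) / len(samples)                # about 4.50
trunc_tenth = sum(int(x * 10) / 10.0 for x in samples) / len(samples)  # about 4.95
rounded = sum(round(x, 1) for x in samples) / len(samples)             # about 5.00
print(true_mean, trunc_int, trunc_tenth, rounded)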

Richard

The GISS record is highly suspect. I said so on a warmist blog and have now been banned from there. They do not tolerate dissent.
Why is there such a big trend difference between the land-based temperatures of GISS and Hadley compared with the satellite data over the same time period? Does anyone know?
Snowman if you come here and read this – hi. We could chat here. I’ve been banned at the other place

Richard

Filipe – “The net effect is a tenth of degree in more than 100 years, that’s not much.” The total warming over this period is 6 tenths of a degree, so a bias of one tenth of a degree would be significant?

AnonyMoose

Oh, my. A review can’t even get through step 1 without finding an error. That slightly changes my opinion of the chances of other errors being present.
I think the obvious question to ask is: “How many times has this procedure been reviewed?”
These people have been using this procedure for years, they should have been examining it often. Why didn’t everyone examining the procedure find this problem when they started looking at the process?
I’ve read of scientists who hired an outsider, trained them to use a copy of their equipment, and had them study their original material to see if the outsider made the same discovery they did. Why aren’t these scientists having people examine their equipment regularly?

Scott Gibson

Filipe, you mention the possibility of rounding effects… I live in the Arizona desert, and every day the temperature is rounded up at the end of the day. The result is that certain temperatures are shown for the high and low until the evening when it begins to cool, then the high rises at least one degree later at night. I figure they justify that as the actual temperature has to be at least a fraction higher, and therefore it is rounded up, always. If you don't believe me, watch the daily NOAA temperatures for Tucson.
I wouldn’t be surprised to find this kind of bias in the GISS too.
When I was younger, people said the weather service reported lower than actual temperatures during the summer so as to not scare off tourists, though I can’t vouch for it, and it may have been an urban myth.

Fluffy Clouds (Tim L)

Amazing!!!!!!!!!!
no snips needed for shortness…. LOL
“it's better to be snipped than banned for life!”
I like this place.
A.W. keep moving on that publishing of the stations.
10% added to 20% parking lot error could be a bunch
nite nite

rbateman

j.pickens (20:17:21):
Why do all these biases seem to accrue to the older-colder, newer-hotter side of the ledger?
Remember the blinking global temp graphs posted in other articles, where the left half is lowered and right half raised? Near the end of the Left half, it lowers again. The right hand hockey sticks.
They started in peaks and valleys.
You start in the valley after the 30’s to lower it. When it can’t lower any more, you start with the output graph and pick the next valley and head left, etc.
Do the same but opposite for the right to raise it.
This is their 2nd go at it, and they have operated this time on previously altered data.
The difference graph which we see has austere step functions to it.
The whole idea of theirs is to slowly alter the global temp report in stages, hoping that the masses won't notice the sleight of hand.
Think of it as a calibration flat.
Add it to their latest global temp graph and you have the true graph before the double butchery.

David Ball

My pick for quote of the week !!!!! ~~~~~~~~~” Why aren’t these scientists having people examine their equipment regularly?” :^]

David Ball

Thank you AnonyMoose. I believe the point you were making was that the scientific method has been abandoned, if I am not mistaken.

E.M.Smith

AnonyMoose (21:15:48) : Oh, my. A review can’t even get through step 1 without finding an error. That slightly changes my opinion of the chances of other errors being present.
It’s even worse than that… The first step in GIStemp is actually STEP0, and that step does a couple of suspect things all by itself. (Cherry picks 1880 and deletes any data older than that. Takes an “offset” between USHCN and GHCN for up to 10 years from the present back to no earlier than 1980 then “adjusts” ALL past data by subtracting that “offset” – supposedly to remove the NOAA adjustments for things like TOBS, UHI, Equipment, etc., …)
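As I read it, that offset amounts to something like the sketch below; the window start, data layout, and function name are my own shorthand, not the GIStemp code itself, and may not capture every detail.

# Hedged sketch of the USHCN/GHCN "offset" adjustment as described above.
# ushcn and ghcn are dicts mapping year -> annual mean for the same station.
def apply_ushcn_offset(ushcn, ghcn, window_start=1980):
    recent = [y for y in ushcn if y in ghcn and y >= window_start]
    # Mean USHCN-minus-GHCN difference over the recent overlap window...
    offset = sum(ushcn[y] - ghcn[y] for y in recent) / len(recent)
    # ...is then subtracted from ALL USHCN values, the early years included.
    return {y: t - offset for y, t in ushcn.items()}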
The fact is that at EVERY step in GIStemp, there are odd decisions, introduction of demonstrable errors, questionable techniques used to fabricate data, etc. And yes, having read every line of the code a couple of times now, I’m comfortable with the term “fabricate” for the process of creating “data” where there are none… This posting gives one example from one of the steps, but in fact there are several steps where “odd things” are done to “fill in missing bits”. I can think of no better term for this than “fabrication”. The deliberate construction of a temperature series where none exists.
I think the obvious question to ask is: “How many times has this procedure been reviewed?”
These people have been using this procedure for years, they should have been examining it often. Why didn’t everyone examining the procedure find this problem when they started looking at the process?

Frankly, the code will send anyone screaming from the room. I’ve done a “data flow listing” of what files go in, and come out, from each bit of the STEP0 process. It is at:
http://chiefio.wordpress.com/2009/07/21/gistemp_data_go_round/
I intend to add STEP1, 2, 3, 4, 5 over time; then put a bit of “what happens with each change” description in it. (It’s a work in progress…).
But take a look at it. It’s just a list of program, input, output, next program. All for just STEP0. The spaghetti is horrid. So before any “code review” could even begin to ask the question “Is this process valid” it gets stymied with the question “Just what the heck IS the process and just where the heck DOES the data go?!? ”
I’m working to take some of the worst of the Data-Go-Round behaviour and simplify it, specifically so that I can more rationally say just what is going on where in the processing.
One example: At the end of STEP0, the output file has the name of v2.mean. This is then moved to STEP0/to_next_step/v2.mean_comb and the script then advises you to, by hand, move it to STEP1/to_next_step/v2.mean_comb.
So with no processing at all done to the file it goes through three names… And ends up in STEP1/to_next_step where a rational person might expect to find the output of STEP1, to be handed to STEP2, but instead finds the output of STEP0 being handed to STEP1.
Those kinds of Logical Landmines are scattered through the whole thing. I can only work on it about 4 hours at a shot before I have to take a “sanity break” to keep a tidy mind…
Part of what I’m doing now is fixing that kind of silly endless Data-Go-Round behaviour. Have one file name for a thing, in one place. Not three. And have the name reflect reality… like maybe finding input files for a STEP in input_files rather than in to_next_step…
FWIW, I’ve come to figure out that “input_files” directory actually means “external_site_data_files” while “to_next_step” actually means “inter_step_datasets”. The code is full of that kind of thing. You can only spend so long trying to answer “what is is” before you either give up, or need a sanity break…
So I can’t imagine anyone doing a decent code review on this without re-writing it first. At least the worst bits of it.
I’ve read of scientists who hired an outsider, trained them to use a copy of their equipment, and had them study their original material to see if the outsider made the same discovery they did. Why aren’t these scientists having people examine their equipment regularly?
IMHO, this code is more of a glorified “hand tool”. Something that someone cooked up to let them play with the data. There are lots of places where you can insert “plug numbers” to see what happens. Places where a parameter is passed in, rather than set in the code; and output files left laying about where you can compare them to other runs.
So you can try different radius choices for “anomaly boxes” to get “reference stations” data for adjusting their temperatures. Some steps set it to 1000 km, others to 1500 km, one to 1200 km (and in the code it chooses to execute another program on the data ONLY in the case where the parameter value is 1200; so one is left to wonder why… what makes 1200 km “special”, and if it is special, why is it passed in as a parameter that can be changed at run time by the operator?) Its structure says that the intent is to play with settings by hand and cherry pick.
As near as I can tell, the “review” only consists of looking at the papers written by GISS, not at the actual code.

Norm

What happened in 1994 and forward?

Terry

Is there a way to get the raw temps vs. the adjusted temps graphed/animated/whatever, and delivered to a news agency or reporter? The blink charts of what was vs. what “is” are fairly damning.
I think this would be a compelling “Here is what we saw, and here is what they’re saying” story. Then ask Hansen, Mann, Schmidt, et al why the temperature observations from 1900-2008 are/need to be adjusted monthly, and in retrospect. I have yet to hear a valid explanation of why it makes sense to adjust temp observations from 10/20/50/100 years ago every month when new data comes in.
Just a thought/question.

E.M.Smith

Per the 1200km custom bit, it’s in STEP3. This is a bit from the top of the script that controls how STEP3 runs. I’ve elided the housekeeping bits of the script.
The rad=1200 assignment at the top sets the radius to 1200 by default, but if the script is started with another value, the 1200 gets changed. So something like:
do_comb_step3 2000
would cause the rad value to be assigned that of the first parameter $1 (in this case, 2000).
do_comb_step3:
label='GHCN.CL.PA' ; rad=1200
if [[ $# -gt 0 ]] ; then rad=$1 ; fi
[…]
echo "Doing toSBBXgrid 1880 $rad > to.SBBXgrid.1880.$label.$rad.log"
toSBBXgrid 1880 $rad > to.SBBXgrid.1880.$label.$rad.log
[…]
if [[ $rad -eq 1200 ]] ; then ../src/zonav $label ; fi
So ONLY in the case where a radius of 1200 km was used, the script calls the zonav script for further processing. One wonders why, say, 1100 km ought not to be zone averaged…
Maybe it’s a way to cause failure if someone overrides the value? To prevent a hand test from making it into production? But if that’s the case, why was 1200 chosen, what makes it the “right” choice? Or maybe it’s something more. There is no way to know…
So when evaluating STEP3, I’ll need to spend a while thinking about what zonav does and what the effect would be of NOT running it with a radius of 1199 vs running it with 1200; and wonder “Why?”…
(“Why? Don’t ask why. Down that path lies insanity and ruin. -e.m.smith”)

Frank Lansner

John Goetz, thank you so much for these excellent and important findings.
It must have been a HUGE amount of work, but you did it 🙂
Question:
1) The 0.08 K warming trend after 1940 from no less than 7006 scribal records: does this mean that, on average, 0.08 K of the GISS-temperature global warming comes from this error? Or what does it mean?
0.08 K would be around 20% of the global warming since 1940.
2) This consistent tendency for the errors to favor global warming: is it possible to give a statistical estimate of how likely it is that we get this warming trend out of 7006 records?
Should not these overlapping records yield a ZERO trend?
Is the likelihood of such a warming trend occurring from 7000 records something like 1:7,000,000,000,000,000,000?
If so, how close is this to being proof that human influence on the data is favouring a global warming signal?

VG

The final decider (slow in developing, but it will come) is the actual temperatures that are occurring at this time, especially in the US and Europe. Most of the highly populated states in the USA have been experiencing below-normal “anomalies” and people are noticing this.. subsequently the surveys are shifting to show more and more skepticism. It doesn't matter how much GISS et al try to increase temps because they just ain't rising LOL (BTW AMSU temps have jumped dramatically during the past two weeks proving that there is NO AGW, unless warmistas would contend that it's all started suddenly) hahaha.

Allan M R MacRae
VG

Anthony: another developing story. I think you could safely state now that snow is falling in Buenos Aires (at least Provincia). maybe we should wait until tomorrow…
http://momento24.com/en/2009/07/22/buenos-aires-province-snow-falls-in-the-south/
This backs up the previous recent posting about snow in BA 2008. There is an intense pool of cold air around Paraguay, Uruguay and Argentina (see COLA). BBC weather is not showing any significant trough/front going through, unless it's a stationary one slightly to the north.

“Norm (22:30:23) : What happened in 1994 and forward?”
– A tipping point has been reached!

Stoic

Totally O/T, but there are 2 articles in yesterday's and today's Financial Times. Today's is headed ‘Atkins puts moral case for climate change’. !!!!!! (my exclamation marks). You can find the stories at ft.com and search for ws atkins. The FT is a hugely influential, usually objective, newspaper read by businessmen and moneymen all over the world. Can I suggest that WUWT visitors write to the Editor of the FT to comment on these stories? The editor's address is letters.editor@ft.com Please include your (physical) address and telephone number.
Regards
S

Nelson (19:46:17) :
Perhaps more importantly, since the adjustment was essentially flat until 1940, then went almost straight up, all of the bias occurred between the previous warm period of the mid-30s and the mid-90s.
The net result is that the recent warming thru 1998 looks more dramatic relative to the 30s by the 0.07°C introduced via the adjustment process.

This is a good point. There actually looks to be a slight cooling bias between ~1910 and ~1940, which might not be much, but it could be enough to ensure the 1910-1940 warming trend matches the 1975-2005 warming trend. Particularly as there is a warming bias after 1975.
I hope I'm reading this right. If I am, the ‘ocean effect’ must be increased and the CO2 signal must be reduced.
Before jumping to any conclusions it might be worth checking out the Hadley record.
Filipe (19:11:45) :
The net effect is a tenth of degree in more than 100 years, that’s not much.

Not sure I agree with this. The overall trend might not be affected too much but looking at the pattern of biases, the peaks and troughs in the temperature record could change quite a bit.
The “bias” goes to about zero after ~1990. Are you sure the curve isn’t simply due to rounding effects? That’s the kind of curve I’d expect if we had gained an extra digit (going from half a tenth accuracy to half an hundredth).
Why? The average of +0.1 and -0.1 is zero as is the average of -0.01 and +0.01.
I might have completely misinterpreted this post as I read it very quickly so I’m happy to be corrected on anything.

Here's a theory to examine in your surfacestations research: that the warming trend in the temperature record is a bias from the growth in airline flights in the latter 20th century and the congested air travel system. More jets sitting on the tarmac with engines running, waiting to take off, will put a lot more hot exhaust gases in the area of the airport weather station than planes simply starting up and taking off. If you correct the temperature data for changes in airport congestion and flight delay times, what would the result be?

tallbloke

Great work John Goetz. The Team had better get the wagons circled, before all the wheels have come off.

VG

Well, it's now snowing at Ezeiza airport, Buenos Aires; so much for AGW.
http://www.perfil.com/contenidos/2009/07/22/noticia_0033.html (Spanish)

Intuitively I feel that this method would spread UHI effects through the data. Just my initial reaction – not founded on anything hard. The reason?
The longer series is most likely to be the older series. Older sites are more prone to have been subject to urban growth around them. Such sites will show a positive temperature bias compared with less affected sites (i.e. the shorter series).
So by adding the “bias” from the longer series to the shorter one you would be adding a UHI component from one series to the next, propagating it through the data.
I would be interested in seeing a comparison of the two groups of data, the longer versus the shorter, and some analysis of comparable UHI exposure between the two.

Solomon Green

Actually the bias starts before step 1. When I was in LA, it was well-known that the valleys were always warmer than downtown. The choice of proxy stations will always introduce bias, no matter how small the distance between stations. In Richmond Park London, which is several times the size of Central Park New York, but where traffic is confined to a single perimeter road, for many years there was a basin which was noticeably warmer than adjoining areas only a few hundred yards away. This discrepancy seems to have disappeared recently, for no obvious reason.
The merging of any temperature records even from closely neighbouring stations must always be suspect. Averaging is not an option. The discrepancies should be consistent from reading to reading for there to be any confidence in the splicing.
If however splicing is essential then before any attempt to splice the two sets of data, over the whole overlap period the old-fashioned, elementary technique of analysis of differences should be applied to the raw data to determine whether at least the first and second differences are randomly distributed or not.
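In code, that elementary check might look something like the sketch below; the readings are invented, and one would of course apply it to the full overlap period rather than five values.

def first_differences(series):
    return [b - a for a, b in zip(series, series[1:])]

# Raw readings from the two stations over the common period (invented values).
overlap_a = [10.1, 10.4, 9.8, 10.0, 10.6]
overlap_b = [10.3, 10.5, 10.1, 10.2, 10.9]

discrepancy = [a - b for a, b in zip(overlap_a, overlap_b)]
first = first_differences(discrepancy)
second = first_differences(first)
# If the two records really track one another, the discrepancies should be
# near-constant and their first and second differences should scatter
# randomly about zero; any structure is a warning against splicing.
print(discrepancy)
print(first)
print(second)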

Ron de Haan

CNN: Bahía Blanca, Argentina, winter conditions, snow, temp -18 degrees Celsius?

Ron de Haan

So, AGW is indeed a computer made problem.
Let’s move from virtual to real world again and start solving real world problems.
Thanks for this clear and significant posting John.
The time has come to provide the US Senators with a quick course “How climate data is framed and what to do about it”.
As Cap & Trade and the current Climate Bill are under severe attack, politicians who reject this bill come up with disturbing alternatives.
We want them to decide to do NOTHING, NOTHING AT ALL!

Curiousgeorge

Would anyone like their bank account or 401K handled in this manner? I don’t think so.

jmrSubury

Does HadCRUT (2009/06 0.503) use any of the same adjustments as GISS? — John M Reynolds

TerryS

Re: Filipe (20:45:34) :
They don’t truncate, they round. This means your random set always averages out to 5, not 4.5 or 4.95 or 4.995 etc.