Circularity of homogenization methods

Guest post by David R.B. Stockwell PhD

I read with interest GHCN’s Dodgy Adjustments In Iceland by Paul Homewood, on the distortion by GHCN homogenization adjustments of the mean temperature record for Stykkisholmur, a small town in the west of Iceland.

The validity of the homogenization process is also challenged in a talk I am giving shortly in Sydney, at the annual conference of the Australian Environment Foundation on 30 October 2012, based on a manuscript uploaded to the viXra archive called “Is Temperature or the Temperature Record Rising?”

The proposition is that commonly used homogenization techniques are circular: a logical fallacy in which “the reasoner begins with what he or she is trying to end up with.” Results derived from a circular argument are essentially restatements of its assumptions. Because the assumption is not tested, the conclusion (in this case the global temperature record) is not supported.

I present a number of arguments to support this view. 

First, a little proof. If S is the target temperature series and R is the regional climatology, then most algorithms that detect abrupt shifts in the mean level of temperature readings, also known as inhomogeneities, come down to testing for changes in the difference between S and R, i.e. D = S - R. The homogenization of S, written H(S), is the adjustment of S by the magnitude of the change in the difference series D.

When this homogenization process is written out as an equation, it is clear that homogenization of S is simply the replacement of S with the regional climatology R.

H(S) = S - D = S - (S - R) = R

While homogenization algorithms do not subtract D from S exactly, they do apply the detected shifts in its baseline to S, and so coerce the trend in S toward the trend in the regional climatology.

The coercion to the regional trend is strongest in series that differ most from the regional trend, and happens irrespective of any contrary evidence. That is why “the reasoner ends up with what they began with”.
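As a minimal numerical sketch of that coercion (invented trends and a deliberately crude decadal shift detector, not any published homogenization algorithm), the cooling target below comes out of the adjustment with the warming trend of the reference:

```python
# Toy illustration only: adjust a target series S by the step changes detected
# in the difference series D = S - R and watch the adjusted trend match R.
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1900, 2001)

# Hypothetical data: the reference R warms, the target S cools, both are noisy.
R = 0.010 * (years - years[0]) + rng.normal(0, 0.3, years.size)
S = -0.005 * (years - years[0]) + rng.normal(0, 0.3, years.size)

D = S - R  # the difference series examined by shift-detection tests

# Crude "detection": treat every change in the decadal mean of D as a shift
# (real algorithms only remove shifts judged statistically significant).
decade = (years - years[0]) // 10
steps = np.array([D[decade == d].mean() for d in np.unique(decade)])
H = S - steps[decade]  # "homogenized" target

trend = lambda y: np.polyfit(years, y, 1)[0] * 100  # degrees per century
print(f"reference trend  {trend(R):+.2f} C/century")
print(f"raw target trend {trend(S):+.2f} C/century")
print(f"adjusted trend   {trend(H):+.2f} C/century")  # tracks the reference
```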

Second, I show bad adjustments like those at Stykkisholmur, this time from the Riverina region of Australia. This area has good, long temperature records and has been heavily irrigated, so it might be expected to show less warming than other areas. With the nonhomogenized AWAP analysis, a surface fit of the temperature trend over the last century shows cooling in the Riverina (circled on map 1 below). A surface fit with the recently developed, homogenized ACORN temperature network (map 2) shows warming in the same region!

Below are the raw minimum temperature records for four towns in the Riverina (in blue). The temperatures are largely constant or falling over the last century, as are those of their neighbors (in gray). The red lines track the adjustments in the homogenized dataset, some exceeding a degree, that have coerced the cooling trends at these towns into warming.

[Figure: raw minimum temperature records (blue) and neighboring stations (gray) for four Riverina towns, with the homogenization adjustments shown in red]

It is not doubted that raw data contain errors. But independent estimates of false alarm rates (FARs) using simulated data show that FARs for regional homogenization methods can exceed 50%, an unacceptably high rate that far exceeds the 5% or 1% error rates typically accepted in scientific work. Homogenization techniques are adding more errors than they remove.
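To make the idea of a false alarm rate estimated from simulated data concrete, here is a generic sketch (an illustration of the mechanism only, not the Menne and Williams 2009 experiment): series with no breaks at all are scanned for a break point with an unadjusted t-test, and because every candidate year is tested at a nominal 5% level, the fraction of series in which a “break” is declared comes out far above 5%:

```python
# Toy FAR estimate: how often does a naive scan declare a break in series that
# are homogeneous by construction?
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_trials, n_years = 1000, 100
false_alarms = 0

for _ in range(n_trials):
    d = rng.normal(0, 1, n_years)        # homogeneous difference series, no break
    for k in range(10, n_years - 10):    # scan every candidate break year
        _, p = stats.ttest_ind(d[:k], d[k:])
        if p < 0.05:                     # nominal 5% level, no multiplicity correction
            false_alarms += 1
            break

print(f"estimated FAR = {false_alarms / n_trials:.0%}")  # far above the nominal 5%
```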

The problem of latent circularity is a theme I developed for the hockey-stick in Reconstruction of past climate using series with red noise. The flaw common to the hockey-stick and homogenization is “data peeking,” which produces high rates of false positives and thus generates the desired result with implausibly high levels of significance.

Data peeking lets you delete data until significance is achieved, turn random-noise proxies into a hockey-stick shape, or, in the case of homogenization, adjust a deviant target series into the overall trend.

To avoid the pitfall of circularity, I would think the determination of adjustments would need to be completely independent of the larger trends, which would rule out most commonly used homogenization methods. The adjustments would also need to be far fewer and individually significant, since errors no larger than the noise cannot be detected reliably.

Comments

Victor Venema
October 16, 2012 3:55 pm

laterite (David Stockwell),
I have yet to see a one-hundred-year time series without any inhomogeneities.
To study the variability of the Australian mean temperature, I would definitely prefer to use the homogenized data of the BoM over using one single series in a region known to be not representative for all of Australia because irrigation was introduced in the period of analysis.
Have you already read Blair Trewin’s article on the new homogenized daily Australian dataset? Just published online. Worth reading.

cohenite
October 16, 2012 4:59 pm

Victor Venema: The false alarm rates I quote in the paper are from Matthew J. Menne and Claude N. Williams, Homogenization of Temperature Series via Pairwise Comparisons, Journal of Climate, 22(7):1700–1717, April 2009, not my analysis. It is they who found FARs around or above 50% for reference homogenization. My ‘extended abstract’ is an example to show that deviant records are adjusted to the trend of the reference, whatever it is. The use of Australia is not important.
The process is like this:
1. The target record is compared with a reference.
2. Because the probability of finding a jump in the difference between the target and the reference is greater than the probability of finding a jump in the target alone, more jumps are found (high FARs).
3. After finding a break, if the target is adjusted relative to the reference, then the trend of the target is coerced towards the reference. Even if it is a true break, the trend is biased towards the reference.
Because the trends of the targets have been determined by “peeking” at the overall network, one cannot then make a reliable statement about the overall trend of the network. That would be circularity.
Specific methods, like the pairwise approach described in Menne and Williams, may mitigate this problem. I wouldn’t know without much more thought.
Blair Trewin’s ‘adaptation’ of M&W has enough departures from M&W to invalidate the use of M&W as a source, IMHO. I have been going through the ‘wall of words’ in the ACORN study and there are a great many issues I take exception to.
For example, I believe the widespread use of a quadratic to fit Australian temperatures in the ACORN reports is unjustified, as robust empirical fluctuation analysis shows there is no significant change in trend over the last 100 years. It’s transparent alarmism, IMHO, to use an unjustified quadratic, as a quadratic is suggestive of accelerating warming.
I can’t possibly get rebuttals published for every infraction, so I try to put a stake through the heart of the problem instead.
You seem like a reasonable person. Perhaps you could tell me offline if an approach I am thinking of that is simple and avoids circularity has been tried before?
Posted for:

David Stockwell

October 16, 2012 5:07 pm

laterite says:
October 16, 2012 at 2:11 pm
[ . . . ] Some methods such as pairs analysis seem to significantly diminish the false alarm rate, but my interest would be in “why?” and whether that fits into a standard theoretical framework as recognized by statisticians.

– – – – – – –
laterite / David Stockwell,
Your profession is interesting. I think there is a good future in statistical auditing and statistical consulting on climate science research projects.
I am jealous. : )
John

markx
October 16, 2012 6:44 pm

Victor Venema says: October 15, 2012 at 8:24 am
“…Homogenisation is used to be able study large scale climate variability in a more accurate way. Removing the too low trend for an irrigated region, is what homogenisation is supposed to do. Just as it should remove the too high trend in case of urbanisation….”
It seems it should be very important that the original records are maintained and are readily accessible. The fact that regional records are first homogenized, then averaged is worrying, when they could perhaps simply be averaged.
The potential for ‘interpretation bias’ should be considered.

markx
October 16, 2012 7:24 pm

Victor Venema says: October 16, 2012 at 2:19 am

“…..Another option would be to remove urban areas from the dataset. …..
……The disadvantage of this approach is that you have to assume that your information on the urban/rural character of the stations is perfect and that you remove more data as needed as only a period of the data will typically be affected by urbanization and the rest of the data is useful and helpful in reducing the errors.
All in all, homogenizing the data is more reliable and accurate as removing part of the data…..”

Victor, I appreciate the good discussion.
But I feel (IMHO) the above conclusion fails on logic. If we don’t know the degree of effect from urbanization, or the degree to which rural stations are really tracking the temperature, then any adjustment must simply be an estimate, or best guess.
I’m not sure you can say one is better than the other.
My case would be that the original records should be meticulously retained and readily available. The very word “homogenization” is worrying.

Tilo Reber
October 16, 2012 9:25 pm

Homogenization simply takes UHI and mixes it into the record. And with the majority of stations being subjected to UHI, the result is to drive the temperature record up. But it should never be mixed into the record. It should be removed. A record with UHI mixed in will yield a higher trend than one with UHI removed. I’ve been making this point for years.

Gunga Din
October 16, 2012 9:48 pm

Hi Victor Venema,
I’m one of the non-“scientists” whose comments are tolerated here. I ask questions, make wise cracks, and make statements that I hope might sometimes be insightful. Sometimes my questions are stupid questions that can annoy people who are more informed than I am. (Sorry, Ric Werme, about the ‘stream of consciousness’ questions about Curiosity and water on Mars.) But such comments as I usually make are tolerated. People attempt to answer my questions. (Ric directed me to a NASA website.)
Comment on whatever site you like. Here you’ll be “put to the test” and have to put up with wise cracks from people like me if you’re a Gorephile or a Hansenite or a Mannequin. If your comments here are deleted, it won’t be because you disagree with the blog’s “consensus” but because you’ve been consistently and persistently disagreeable.
PS Where I work, on one particular day, at one particular moment, I had access to and checked 3 different temperature sensors. One read 87°F. One read 89°F. One read 106°F. None of them is more than 10 miles from the others. All were within 4 or 5 miles of me. One was just a few hundred yards away. Homogenized, what was the temperature where I was that day?

BobG
October 16, 2012 10:38 pm

Victor writes, “To study the variability of the Australian mean temperature, I would definitely prefer to use the homogenized data of the BoM over using one single series in a region known to be not representative for all of Australia because irrigation was introduced in the period of analysis.”
I would not prefer to use either method prior to determining which method is more reliable.
The only way to validate whether the homogenized data of the BoM is better is to do a very careful analysis of the raw data for a large number of sites and account specifically for the factors impacting the temperature at each one by hand – a manual homogenization procedure. Preferably with more than one person analyzing each set of data independently. Then test the homogenization algorithm against the result to see what the differences are between the manually homogenized data and the data from the homogenization algorithm. Analyze any differences between the results of the two methods and determine what causes those differences, if any.
After a large study like the above is done, then I think it would be possible to determine which data is preferable.

Evan Thomas
October 16, 2012 10:59 pm

Tallbloke has a case study on the temperature records of Alice Springs, a small rural town pretty much in the centre of Australia. The records had been adjusted by the BoM. Records of the nearest towns, which are even smaller and many hundreds of km away, were cited. Worth a look if this matter is of interest to you. Cheers from now sunny Sydney.

richardscourtney
October 17, 2012 1:11 am

BobG:
At October 16, 2012 at 10:38 pm you write

Victor writes,

To study the variability of the Australian mean temperature, I would definitely prefer to use the homogenized data of the BoM over using one single series in a region known to be not representative for all of Australia because irrigation was introduced in the period of analysis.

I would not prefer to use either method prior to determining which method is more reliable.

Yes! Absolutely!
I point out that Victor Venema has not answered my post addressed to him at October 16, 2012 at 5:26 am. It included this

No data is ever “perfect”. So, in a real science “you” determine that data emulates reality with adequate reliability, accuracy and precision to provide sufficient confidence that the data is adequate for conclusions to be drawn from it.
It is pseudoscience in its purest form to claim that imperfections in the data should be ignored if the data can provide a desired “answer”.
Therefore, it is necessary for the researcher to provide evidence that the data he/she uses has the necessary reliability, accuracy and precision for his/her conclusions to be valid. In the case of data homogenisation that has not been done. Indeed, the different research teams who provide the various global (and hemispheric) temperature data sets use different homogenisation methods and do not publish evaluations of the different effects of their different methods.
In the above article David Stockwell provides several pieces of evidence which demonstrate that GHCN homogenisation completely alters the empirical data in some cases such that the sign of temperature change is reversed; e.g. compare his figures numbered 1 and 2. That altered data is then used as input to determine a value of global temperature.
It is up to those who conduct the homogenisation to demonstrate that such alterations improve the reliability, accuracy and precision of the data. Claims that such alterations should be taken on trust are a rejection of science.

Richard

Victor Venema
October 17, 2012 4:29 am

Dear David Stockwell, did you read Menne and Claude N. Williams, “Homogenization of Temperature Series via Pairwise Comparisons”? Or did you get this chunk of information from a “sceptic” blog trying to mislead the public by selective quoting?
If you read the paper, you will see that these false alarm rates (FAR) are for the application of the homogenization method SNHT to a very difficult case. (You could even have cited a FAR of 100% for the case without any breaks, but I guess in that case people would have started thinking.) This was a case in which the regional climate signal used as reference was computed from only 5 stations with strong inhomogeneities (up to 2 times the noise). In this case the simple SNHT method interprets the inhomogeneities in the composite reference (a reference based on multiple stations) as breaks in the station.
SNHT was developed for manual homogenization, to guide a climatologist working carefully in the way described by BobG above. Such a climatologist would typically use more stations to compute the reference, at least 10. He would make sure to select stations that do not contain large inhomogeneities and would first homogenize the stations with the largest inhomogeneities.
The goal of Menne and Williams was to develop an automatic homogenization algorithm, because the climate network in the USA is too large to perform the homogenization by hand. Furthermore, as they work at NOAA, the climate sceptics would not accept the careful manual work suggested by BobG; they would claim that the malicious climatologist inserted the climate trend by homogenization and should use automatic methods, which can be tested independently. You cannot have it both ways. What you can do is compare the results of manual homogenization with the results of automatic methods, and then you will see that the results are very similar. For such an automatic algorithm, Menne and Williams did not see SNHT with a composite reference as a good solution and thus preferred the pairwise method, which indeed produced a very small FAR.
By the way, the FAR is not a good indication of the quality of a homogenization method. If the break is detected in the year before or after the real break, this detection would be counted as a false alarm, whereas for the homogenization of the data and their trends this is no problem. Especially for small breaks, the uncertainty in the date of the break is larger. Thus a very good method, which is able to detect many small breaks, may well have a high FAR. It is better to compute how accurately the true trend and the decadal variability are reproduced in the homogenized data. The FAR is interesting for understanding how the homogenization algorithms work, but it is not a good indicator of the quality of the homogenized data.
Just because Trewin improved/changed the pairwise algorithm does not mean that either version is wrong. Both algorithms improve the quality of the raw data. Maybe the new version of Trewin is more accurate; maybe it also just fits the statistical characteristics of the Australian climate network better.
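A toy numerical sketch of the FAR-versus-trend point above (invented numbers, not any published benchmark): a break whose date is estimated one year off the truth counts against a strict hit/false-alarm bookkeeping, yet the trend of the adjusted series is essentially unaffected:

```python
# Toy sketch: a break detected one year late counts against strict hit/false-alarm
# bookkeeping, yet the adjusted trend is almost exactly right.
import numpy as np

rng = np.random.default_rng(2)
years = np.arange(1901, 2001)
noise = rng.normal(0, 0.3, years.size)

true_break, detected_break, jump = 1950, 1951, 0.8   # detection one year late
raw = noise + jump * (years >= true_break)           # no real climatic trend at all
adjusted = raw - jump * (years >= detected_break)    # right size, wrong year

trend = lambda y: np.polyfit(years, y, 1)[0] * 100   # degrees per century
print(f"trend of raw series    {trend(raw):+.2f} C/century")       # spurious warming
print(f"trend after adjustment {trend(adjusted):+.2f} C/century")  # close to zero
```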

Victor Venema
October 17, 2012 5:15 am

markx says: “Victor Venema: “…All in all, homogenizing the data is more reliable and accurate as removing part of the data…..”
But I feel (IMHO) the above conclusion fails on logic. If we don’t know the degree of effect from urbanization, or the degree to which rural stations are really tracking the temperature, then any adjustment must simply be an estimate, or best guess. I’m not sure you can say one is better than the other.”

That is the advantage of homogenization: we do not have to know in advance how strong the effect of urbanization in the city was. We see the magnitude of this in the difference time series between the station in the city and the surrounding rural stations.
Guessing which stations are affected by urbanization, not only now (where surfacestations could help in the USA), but over their entire lifetime, is difficult and error-prone.
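A minimal sketch of that idea (invented numbers, not an operational method): because the urban station and its rural neighbours share the same regional climate, the difference between the urban record and the rural mean isolates the urban warming without its size having to be known in advance:

```python
# Toy sketch with invented numbers: the shared regional climate cancels in the
# difference between the urban record and the mean of its rural neighbours,
# leaving an estimate of the urban warming.
import numpy as np

rng = np.random.default_rng(4)
years = np.arange(1951, 2001)
regional = 0.01 * (years - years[0]) + rng.normal(0, 0.3, years.size)  # shared climate

urban_effect = 0.02 * (years - years[0])   # assumed gradual urban warming, 1 C over 50 yr
urban = regional + urban_effect + rng.normal(0, 0.1, years.size)
rural = np.array([regional + rng.normal(0, 0.1, years.size) for _ in range(5)])

diff = urban - rural.mean(axis=0)          # the regional signal cancels out

trend = lambda y: np.polyfit(years, y, 1)[0] * 100   # degrees per century
print(f"urban minus rural trend {trend(diff):+.1f} C/century")          # recovers ~+2.0
print(f"true urban effect trend {trend(urban_effect):+.1f} C/century")
```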

Victor Venema
October 17, 2012 5:25 am

Gunga Din says: “I’m one of the non-“scientists” whose comments ..”
Most people are non-scientists. What matters is the quality of the arguments.
Gunga Din says: “PS Where I work, on one particular day, at one particular moment, I had access to and checked 3 different temperature sensors. One read 87°F. One read 89°F. One read 106°F. None of them is more than 10 miles from the others. All were within 4 or 5 miles of me. One was just a few hundred yards away. Homogenized, what was the temperature where I was that day?”
After homogenization the temperatures at these stations would still be different. Homogenization makes the data temporally consistent; it does not average (or even smooth, as Anthony falsely claims) the observations of multiple stations. Having so many stations close together is great. That means that they will be highly correlated (if they are of good quality; is the one reading 106°F on a wall in the sun?) and that the difference time series between the stations will contain only a little weather noise (and some measurement noise). Thus it should be possible to see very small inhomogeneities and correct them very accurately.
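A minimal sketch of why the difference series helps (invented numbers): two nearby stations share the regional weather, so a small shift that is invisible against the noise of a single record is several times more prominent in the difference series:

```python
# Toy sketch with invented numbers: a 0.3 C shift that is buried in the weather
# noise of a single record stands out in the difference series of two neighbours.
import numpy as np

rng = np.random.default_rng(3)
n = 365 * 5                                   # five years of daily anomalies
regional = rng.normal(0, 3.0, n)              # weather variability shared by both stations
station_a = regional + rng.normal(0, 0.3, n)  # small local noise only
station_b = regional + rng.normal(0, 0.3, n)
station_b[n // 2:] += 0.3                     # small inhomogeneity in station B

diff = station_b - station_a
print(f"std of station B record  {station_b.std():.2f} C")
print(f"std of difference series {diff.std():.2f} C")
print(f"shift-to-noise ratio, raw record: {0.3 / station_b.std():.2f}")
print(f"shift-to-noise ratio, difference: {0.3 / diff.std():.2f}")
```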

Victor Venema
October 17, 2012 5:34 am

BobG says: “After a large study like the above is done, then I think it would be possible to determine which data is preferable.”
I am sure that using homogenized data is better than using a single station to study the climate of a continent. Especially if the temperature at this single station is reduced by the introduction of irrigation during the study period.
Validation studies of homogenization methods are regularly performed. A recent blind benchmarking study of mine was very similar to the way you would like the validation of homogenization methods to be done. Any comments on this paper are very welcome. We plan to perform similar validation studies in the future. Thus if you have good suggestions, we could implement them in the next study.

Victor Venema
October 17, 2012 5:42 am

Richardscourtney: “I point out that Victor Venema has not answered my post addressed”
Dear Richard, I have a day job and am not a dog that jumps through hoops. If you have any specific comments or questions, and not just misquotations of my comments and general accusations, which are simply untrue from my perspective, you have a better chance of getting an answer.
Richardscourtney: “It is up to those who conduct the homogenisation to demonstrate that such alterations improve the reliability, accuracy and precision of the data. Claims that such alterations should be taken on trust are a rejection of science.”
Could you maybe indicate specifically in which way you see my validation study as insufficient? Maybe I could then point you to further studies that also included those aspects or consider them in future studies. Such a specific comment would be more helpful.


richardscourtney
October 17, 2012 7:14 am

Moderators,
I have provided a post with severe formatting errors. Please discard it and replace it with this corrected version. Sorry.
Richard
_____________
Victor Venema:
Thank you for your post addressed to me at October 17, 2012 at 5:42 am, which answers a point I first put to you in my post at October 16, 2012 at 5:26 am. And I apologise if my pointing out that you had overlooked my post but had answered four subsequent posts “interrupted [your] day job”.
My post said

In the above article David Stockwell provides several pieces of evidence which demonstrate that GHCN homogenisation completely alters the empirical data in some cases such that the sign of temperature change is reversed; e.g. compare his figures numbered 1 and 2. That altered data is then used as input to determine a value of global temperature.
It is up to those who conduct the homogenisation to demonstrate that such alterations improve the reliability, accuracy and precision of the data. Claims that such alterations should be taken on trust are a rejection of science.”

You have replied saying

Could you maybe indicate specifically in which way you see my validation study as insufficient? Maybe I could then point you to further studies that also included those aspects or consider them in future studies. Such a specific comment would be more helpful.

Your “validation study” says

To reliably study the real development of the climate, non-climatic changes have to be removed.

OK. But if that is valid then such “removal” must increase the reliability, accuracy and/or precision of the data for the stated purpose.
In the example I cited from Stockwell’s article, the “removal” has had extreme effects (e.g. changing measured cooling into warming) over a large area. It is not obvious how or why “non-climatic changes” would have – or could have – produced such a large difference as exists between the measured data and the homogenised data. The nearest to an explanation in your “validation study” is provided by your comment on your intercomparison study, which says

Some people remaining skeptical of climate change claim that adjustments applied to the data by climatologists, to correct for the issues described above, lead to overestimates of global warming. The results clearly show that homogenisation improves the quality of temperature records and makes the estimate of climatic trends more accurate.

But I am not interested in PNS nonsense about “quality”: I am interested in the scientific evaluations of data which are reliability, accuracy and precision. And I fail to understand how it is possible to know that an “estimate” is “more accurate” when there is no available calibration for the estimate.
In other words, my question is
(a) What “non-climatic changes” would require such large alteration to the data of the example?
and
(b) How does the “removal” of those “non-climatic changes” affect the reliability and the accuracy and the precision of the data?

I have failed to find anything in your “validation study” which hints at an answer to these basic questions which apply to all homogenised data and not only to the example.
I await your answer and thank you for it in anticipation.
Richard

richardscourtney
October 17, 2012 7:43 am

Victor Venema:
As an addendum to my post addressed to you at October 17, 2012 at 7:14 am, in fairness I think I should be clear about “where I am coming from”. This is explained in the item at
http://www.publications.parliament.uk/pa/cm200910/cmselect/cmsctech/memo/climatedata/uc0102.htm
and especially its Appendix B.
Richard

ferd berple
October 17, 2012 7:56 am

richardscourtney says:
October 17, 2012 at 7:14 am
In the above article David Stockwell provides several pieces of evidence which demonstrate that GHCN homogenisation completely alters the empirical data
=======
Agreed.
It is well established by study after study that humans are incapable of acting without bias. Our sub-conscious drives us to make mistakes in the direction of our beliefs, and such mistakes are incredibly difficult for us to recognize.
Thus, when a methodology is reviewed by one’s peers, if the peers have beliefs similar to one’s own, they will tend to miss one’s errors. If the peers have opposing views, they will tend to catch the errors.
Thus, the Precautionary Principle argues that if one wants to be sure that one’s work is correct, it should always be peer reviewed by someone with opposing beliefs. If they cannot spot an error, then it is likely there is no error.
However, if someone with similar beliefs peer reviews your work, it really says nothing about the quality of your work, because the reviewer is likely to miss the same mistakes as the author.
Unfortunately, Climate Science has a long history of seeking like minded reviewers, which has introduced substantial methodology error into the field, undermining the credibility of the results.

ferd berple
October 17, 2012 8:17 am

The example above, of Australia before and after temperature homogenization, clearly shows that the methodology is distorting the results, not improving them. The long-term cooling trend in the interior of Australia suddenly becomes a warming trend. A mild warming in the north-east suddenly becomes an intense hot spot. The problem is that the adjustments are feeding back into the adjustments, increasing the error rather than reducing it.
On this basis Australians are facing a massive CO2 tax, which will force them to export their coal to China at reduced prices, rather than use it at home to produce low cost electricity. The Chinese will say thank you very much, burn the Ozzie coal to produce CO2, and make a pile of money in the process. All paid for by the Australian tax payer.
Shows what a few dollars invested in the right places can accomplish over time. The Chinese are turning Australia into their vassal state without ever firing a shot.

Victor Venema
October 17, 2012 8:32 am

richardscourtney says: “In the above article David Stockwell provides several pieces of evidence which demonstrate that GHCN homogenisation completely alters the empirical data”
Ferd Berple, thank you for repeating that statement; I missed that one. That is a clear statement and thus lends itself to an answer; it is also a statement which is obviously wrong. Stockwell studied SNHT using a composite reference (computed the wrong way) and the GHCN uses a pairwise homogenization method.

Victor Venema
October 17, 2012 8:55 am

richardscourtney: “In the example I cited from Stockwell’s article, the “removal” has had extreme effects (e.g. changing measured cooling into warming) over a large area.”
Stockwell used the wrong reference and thus corrupted the data. Thus you cannot make any inference based on his study about people using homogenization methods the way they are supposed to be used.
That also answers your question: “(a) What “non-climatic changes” would require such large alteration to the data of the example?”
There are many more examples of non-climatic changes mentioned on my blog. Another example is mentioned in a paper I am just reading by Winkler on the quality of thermometers used before 1900. The glass had a different chemical composition at the time and thus a tendency to shrink in the first few years, which led to temperatures that were too high, by about half a degree. This problem was discovered in 1842, long before post-normal science.
richardscourtney: “But I am not interested in PNS nonsense about “quality”: I am interested in the scientific evaluations of data which are reliability, accuracy and precision.”
If you are really interested, then read the article on the validation study. You will find the root mean square error (which the normal newspaper reader calls quality; you may call it accuracy) of the trends in the raw data and in the homogenized data. You will see that after homogenization the errors are much smaller than in the raw data for temperature, especially for annual mean temperature. You will also see that the size of the remaining error is small compared to the trend we had in the 20th century, and that the uncertainty in the trends of a real dataset will be smaller because metadata (station histories) are used to make the results more accurate and because the global mean temperature averages over many more stations.
That also answers your question: “(b) How does the “removal” of those “non-climatic changes” affect the reliability and the accuracy and the precision of the data?”

richardscourtney
October 17, 2012 9:08 am

Victor Venema:
Your comment at October 17, 2012 at 8:32 am says

richardscourtney says: “In the above article David Stockwell provides several pieces of evidence which demonstrate that GHCN homogenisation completely alters the empirical data”
Ferd Berple, thank you for repeating that statement; I missed that one. That is a clear statement and thus lends itself to an answer; it is also a statement which is obviously wrong. Stockwell studied SNHT using a composite reference (computed the wrong way) and the GHCN uses a pairwise homogenization method.

OK. I accept your statement saying
“Stockwell studied SNHT using a composite reference (computed the wrong way) and the GHCN uses a pairwise homogenization method,”
I mention but shall ignore that it has taken until now for you to have noticed what you now say is a fundamental flaw in Stockwell’s article, and that you failed to notice my statement until after I had posted it three times and Ferd Berple commented on it.
Much more important is what does the GHCN “pairwise homogenization method” do to the data and what are the answers to my questions with respect to the effect(s) of that method?
Richard

Victor Venema
October 17, 2012 9:14 am

Richard: “Much more important is what does the GHCN “pairwise homogenization method” do to the data and what are the answers to my questions with respect to the effect(s) of that method?”
Science can only answer specific questions. If you ask it so generally, all I can answer childishly is that the pairwise homogenization method homogenizes the data. And that the effect is that on average the trend in the homogenized data is closer to the true trend as the trend in the raw data. The same goes for natural climatic decadal variability.

Reply to  Victor Venema
October 17, 2012 9:22 am

“…on average the trend in the homogenized data is closer to the true trend as the trend in the raw data. ”
That’s wronger than wrong based on what we learned:
http://wattsupwiththat.files.wordpress.com/2012/07/watts_et_al_2012-figure20-conus-compliant-nonc-noaa.png
And no, I’m not interested in your protests about this specifically…because this shows up in many surface datasets after homogenization is applied.
The only thing homogenization (in its current use in surface data) is good at is smearing around data error so that good data is polluted by bad data. The failure to remove bad data is why homogenization does this. If there were quality control done on the data to choose the best station data, as we have done, then it wouldn’t be much of an issue. Homogenization itself is valid statistically, but as this image shows, in its current application by climate science it makes muddy water out of clean water when you don’t pay attention to data quality control.

October 17, 2012 9:31 am

[snip – reword that and I’ll allow it – Anthony]