Unique paper looks for natural factors in station data, showing significant probabilities of a natural signal

Since we have been paying a lot of attention to the surface record, given the recent revelations surrounding the adjustments to data and NOAA’s release of the State of the Climate Report claiming that the USA had the hottest year ever (but the globe did not), this paper seems like a good one to review.

How Natural is the Recent Centennial Warming? An Analysis of 2249 Surface Temperature Records

Horst-Joachim Lüdecke, Rainer Link, and Friedrich-Karl Ewert

EIKE, European Institute for Climate and Energy, P.O. Box 11011, 07722 Jena, Germany

Abstract. We evaluate to what extent the temperature rise in the past 100 years was a trend or a natural fluctuation and analyze 2249 worldwide monthly temperature records from GISS (NASA) with the 100-year period covering 1906-2005 and the two 50-year periods from 1906 to 1955 and 1956 to 2005. No global records are applied. The data document a strong urban heat island effect (UHI) and a warming with increasing station elevation.

For the period 1906-2005, we evaluate a global warming of 0.58 °C as the mean for all records. This decreases to 0.41 °C if restricted to stations with a population of less than 1000 and below 800 meters above sea level. About a quarter of all the records for the 100-year period show a fall in temperatures. Our hypothesis for the analysis is – as generally in the papers concerned with long-term persistence of temperature records – that the observed temperature records are a combination of long-term correlated records with an additional trend, which is caused for instance by anthropogenic CO2, the UHI or other forcings. We apply the detrended fluctuation analysis (DFA) and evaluate Hurst exponents between 0.6 and 0.65 for the majority of stations, which is in excellent agreement with the literature, and use a method only recently published, which is based on DFA, synthetic records and Monte Carlo simulation. As a result, the probabilities that the observed temperature series are natural have values roughly between 40% and 90%, depending on the station characteristics and the periods considered. ‘Natural’ means that we do not have within a defined confidence interval a definitely positive anthropogenic contribution and, therefore, only a marginal anthropogenic contribution can not be excluded.

Electronic version of an article published in International Journal of Modern Physics C, Vol. 22, No. 10, doi:10.1142/S0129183111016798
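For readers unfamiliar with the method named in the abstract, here is a minimal, generic sketch of detrended fluctuation analysis. It is an illustration of the technique, not the authors’ code; the choice of quadratic detrending (DFA2), the window sizes, and the function name are my own assumptions. For stationary long-term correlated records the fluctuation exponent it returns is commonly identified with the Hurst exponent quoted in the paper (around 0.6-0.65 for most stations).

```python
import numpy as np

def dfa_exponent(x, scales=None, order=2):
    """Minimal detrended fluctuation analysis (DFA).

    Returns the fluctuation exponent alpha, the log-log slope of the
    fluctuation function F(s) versus window size s. For stationary
    long-term correlated records alpha is commonly identified with
    the Hurst exponent.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    if scales is None:
        # window sizes from 10 points up to a quarter of the record
        scales = np.unique(np.logspace(1, np.log10(n // 4), 20).astype(int))

    # Step 1: the profile is the cumulative sum of anomalies
    profile = np.cumsum(x - x.mean())

    flucts = []
    for s in scales:
        n_seg = n // s
        segments = profile[: n_seg * s].reshape(n_seg, s)
        t = np.arange(s)
        ms_resid = []
        for seg in segments:
            # Step 2: remove a polynomial trend in each window (order=2 -> DFA2)
            coeffs = np.polyfit(t, seg, order)
            resid = seg - np.polyval(coeffs, t)
            ms_resid.append(np.mean(resid ** 2))
        # Step 3: fluctuation function for this window size
        flucts.append(np.sqrt(np.mean(ms_resid)))

    # Step 4: alpha is the slope of log F(s) against log s
    alpha = np.polyfit(np.log(scales), np.log(flucts), 1)[0]
    return alpha

# Hypothetical usage with a 100-year monthly anomaly series (1200 values):
# alpha = dfa_exponent(monthly_anomalies)
```

An exponent near 0.5 indicates uncorrelated noise; values above 0.5, such as the 0.6-0.65 reported here, indicate long-term persistence, which makes large natural excursions more likely.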

In this paper, we have used 2249 unadjusted monthly temperature records of 100 and 50 years in length and evaluated the temperature changes for the periods 1906-2005, 1906-1955, and 1956-2005. Our analysis was based exclusively on local records and applied DFA, Monte Carlo methods and synthetic records. The main results and conclusions are the following.

a) The mean of all stations shows 0.58 °C global warming from 1906 to 2005. If we consider only those stations with a population of under 1000 and below 800 meters above sea level, this figure drops to 0.41 °C and would probably decrease even further if we were to take into account the warm biases caused by the worldwide reduction in rural stations during the 1990s, by changes to the screens and their environments, and by the appearance of automatic observing systems.

b) From 1906 to 2005, about a quarter of all records show falling temperatures.

This in itself is an indication that the observed temperature series are predominantly natural fluctuations. ‘Natural’ means that we do not have within a defined confidence interval a definitely positive anthropogenic contribution and, therefore, only a marginal anthropogenic contribution can not be excluded. We evaluated – with a confidence interval of 95% – the probability that the observed global warming from 1906 to 2005 was a natural fluctuation as lying between 40% and 70%, depending on the station’s characteristics. For the period 1906 to 1955 the probabilities lie between 80% and 90%, and for 1956 to 2005 between 60% and 70%.

c) By separating stations into specific station groups, such as those with a defined minimum population, a strong UHI and elevation warming can be identified.

d) The vast majority of temperature stations are found on land and in the northern hemisphere, and records at such locations have a Hurst exponent of α ≈ 0.63. However, two thirds of the Earth is covered with water, and the relatively few stations on islands or near oceans have a higher Hurst exponent of α ≈ 0.7. Therefore, the real exponent for the entire Earth could be somewhat higher than α ≈ 0.63. Records with higher exponents embody even higher probabilities for natural fluctuations.

Full PDF: http://www.eike-klima-energie.eu/uploads/media/How_natural.pdf
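Point (b) of the conclusions above rests on a Monte Carlo question: how often does a purely natural, long-term correlated record of the same length and Hurst exponent show a trend at least as large as the observed one? The sketch below is my own generic illustration of that idea, not the authors’ published procedure; the Fourier-filtering generator, the measurement of the trend in units of the record’s standard deviation, and the function names are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def synthetic_ltc(n, alpha, rng):
    """Long-term correlated Gaussian noise via Fourier filtering.

    The power spectrum is shaped as S(f) ~ f**(-beta) with beta = 2*alpha - 1,
    which for 0.5 < alpha < 1 gives a stationary record whose DFA exponent
    is approximately alpha.
    """
    beta = 2.0 * alpha - 1.0
    freqs = np.fft.rfftfreq(n)
    amp = np.zeros_like(freqs)
    amp[1:] = freqs[1:] ** (-beta / 2.0)         # shape the spectrum, skip f = 0
    phases = rng.uniform(0.0, 2.0 * np.pi, len(freqs))
    x = np.fft.irfft(amp * np.exp(1j * phases), n)
    return (x - x.mean()) / x.std()              # zero mean, unit standard deviation

def natural_probability(observed_rel_trend, n, alpha, n_sim=5000, rng=rng):
    """Fraction of purely natural synthetic records whose overall linear
    change is at least as large as the observed one (both expressed in
    units of the record's standard deviation)."""
    t = np.arange(n)
    count = 0
    for _ in range(n_sim):
        x = synthetic_ltc(n, alpha, rng)
        slope = np.polyfit(t, x, 1)[0]
        if abs(slope * (n - 1)) >= abs(observed_rel_trend):
            count += 1
    return count / n_sim

# Hypothetical usage: a 100-year monthly record (n = 1200), Hurst exponent 0.63,
# and an observed warming of, say, 1.5 standard deviations over the century:
# p = natural_probability(1.5, 1200, 0.63)
```

The larger the exponent α, the more probable large natural trends become, which is why the paper’s island and coastal stations (α around 0.7) carry even higher “natural” probabilities.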

Leif Svalgaard
January 11, 2013 1:57 pm

I find the last sentence of the abstract too torturous for my taste: “‘Natural’ means that we do not have within a defined confidence interval a definitely positive anthropogenic contribution and, therefore, only a marginal anthropogenic contribution can not be excluded.” What do they mean?

Ben D.
January 11, 2013 2:57 pm

Leif Svalgaard says at 1:57 pm
Imo, anthropogenic contribution if any is marginal…

January 11, 2013 2:58 pm

Leif Svalgaard says:
January 11, 2013 at 1:57 pm
I find the last sentence of the abstract too torturous for my taste: “‘Natural’ means that we do not have within a defined confidence interval a definitely positive anthropogenic contribution and, therefore, only a marginal anthropogenic contribution can not be excluded.” What do they mean?
===============================================================================
Mr. Layman here. It does sound contradictory. Perhaps, “only a marginal anthropogenic contribution can be included.” ?

john robertson
January 11, 2013 2:59 pm

No reference to Watts et al in the 84 references.
Nature means we cannot detect the hand of mann?
Not sure if this is an honest evaluation or an introduction of a weasel clause.
Personally I prefer E.M. Smith’s explanation, as he’s covered most of this in a language I can understand.

Auto
January 11, 2013 3:00 pm

They can’t rule out some – small to minuscule to infinitesimal – AGW effect.

Jurgen
January 11, 2013 3:00 pm

Well I guess the meaning may be obscure but the implication seems to be we anthropomorphs are not natural entities.

GlynnMhor
January 11, 2013 3:04 pm

Leif has it right. Bafflegab with multiple negatives leaves the readers mystified.

January 11, 2013 3:05 pm

A technical paper with practical conclusions.
I think they mean to say that a marginal anthropogenic contribution can not be excluded.
To me this means that an important anthropogenic contribution can be excluded.

LazyTeenage
January 11, 2013 3:43 pm

For the period 1906-2005, we evaluate a global warming of 0.58 °C as the mean for all records. This decreases to 0.41 °C if restricted to stations with a population of less than 1000 and below 800 meters above sea level.
————
Geeeee wizzz. They have discovered the urban heat island effect all by themselves.
And then simply waved their hands around busily and declared that all of the 0.4 degrees rise they found must be natural. With no evidence produced one way or the other. And ignoring the logic of the obvious: the trend “by itself” is not evidence of causation.
And just making stuff up by declaring “probably” with lots of speculation and no evidence.
And hope no one notices they ignore both the ocean temperature and satellite trends.
And no novelty to justify this as a scientific publication. This kind of analysis has been done to death already.

Lance Wallace
January 11, 2013 3:45 pm

Leif – They give a rather simpler definition on p. 7:
‘Natural’ denotes that there is no definite anthropogenic trend in the record.
Then on p. 9 there is a long section explaining how they establish a confidence level roughly corresponding to a 95% range within which an external (anthropogenic) trend can be excluded. They appear to be of the same opinion as Leif and several commenters here, since they express dissatisfaction with their own definition, calling it “correct but somewhat clumsy”.

Ian
January 11, 2013 3:48 pm

I think what the authors are saying is that ‘Natural’ means we do not have, within a defined confidence interval, a definitely positive anthropogenic contribution, but that said, a marginal anthropogenic contribution cannot be excluded. Or, in other words, temperature fluctuations in the period studied are probably mainly due to natural causes, but with a small amount probably due to human actions.

Gail Combs
January 11, 2013 3:51 pm

“‘Natural’ means that we do not have within a defined confidence interval a definitely positive anthropogenic contribution and, therefore, only a marginal anthropogenic contribution can not be excluded.”
….
I think that is the ‘Get Out of Peer Review Free Card’ They know darn well they can not come right out and say the study shows no sign of “a definitely positive anthropogenic contribution “ and get the paper published so they put in that bit of bafflegab to get it past the journal editor and reviewers.

Lance Wallace
January 11, 2013 3:54 pm

Figure 6 provides an estimate of the strength of UHI compared to stations with populations less than 1000: about 0.3 °C higher for Pop=10,000, 0.5 °C (100,000), 0.8 °C (1,000,000), and >1 °C for >4,000,000.
Considering the 100-year increase was only 0.58 °C, this appears to indicate that UHI was a substantial fraction of the increase (as our host suggests in his latest draft paper).
The authors used the GISS unadjusted monthly data, so did not allow the grid-based homogenization to mess up their estimates.

January 11, 2013 4:14 pm

I think the last paragraph of the abstract is a classic. Given the degree to which the doom-sayers are logic-challenged, it might head them off at the pass.

Birdieshooter
January 11, 2013 4:24 pm

@Leif they mean English is not their first language

Editor
January 11, 2013 5:02 pm

Leif – In all probability it’s either translated from German or it’s not written in the writer’s first language. Maybe a bit of leeway should be granted on the exact wording.
General comment –
The study uses population as a station selection criterion. It would have been better to use the physical surroundings of the stations as in Watts 2012. Obviously that can’t be done (too many stations) if the information is not readily available.
The study’s stated probabilities in (b) look odd to me. I would expect the probability for any particular proposition (in this case that the observed global warming is a natural fluctuation) would be nearer to 50% (the “don’t know” value) for shorter study periods (because, if the study period is very short then you just can’t tell). Their probabilities go the other way. I *think* this may come from their use of linear trends, since counterintuitively the trends in the two halves of an interval can both be greater (or both less) than the trend of the whole (because the two half trends don’t have to meet in the middle). Or, it may come from the way the proposition is interpreted (“‘Natural’ denotes that there is no definite anthropogenic trend in the record.”). Either way, I *think* that means that their probabilities are incorrect. Unless some kind individual can explain it to me nice and simply, it looks like I’ll have to spend a lot more time on the paper in order to understand its findings.
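The half-interval point is easy to check with a toy series of my own (nothing to do with the paper’s data): both halves can trend upward while the whole record trends downward, because the two fitted half-lines do not have to meet at the midpoint.

```python
import numpy as np

# Both halves rise with slope +1 per step, but the series drops at the
# midpoint, so the fitted trend of the whole record is negative.
first_half  = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
second_half = np.array([-3.0, -2.0, -1.0, 0.0, 1.0])
whole = np.concatenate([first_half, second_half])

t5, t10 = np.arange(5), np.arange(10)
print(np.polyfit(t5, first_half, 1)[0])    # +1.0
print(np.polyfit(t5, second_half, 1)[0])   # +1.0
print(np.polyfit(t10, whole, 1)[0])        # about -0.21
```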

Victor Venema
January 11, 2013 5:06 pm

That temperature trends for single stations are not statistically significant is no surprise. If you want to see changes in the global climate, you do so by computing and analyzing the global mean temperature. A local time series from a single station, or even an average over a single nation, will show a much larger natural variability than the global mean temperature.
To make matters worse, the authors have used inhomogeneous data in which jumps (relocations, screen changes, instrument changes, etc.) and gradual trends (for example due to urbanization) are still present. This is a large part of the long-term variability. The statement that the variability is natural is thus false; the variability is to a large part due to inhomogeneities.
Due to the inhomogeneities, the Hurst coefficient is also overestimated, and consequently the authors overestimate the uncertainty in the trends.
It is probably not a coincidence that the paper is published in a computational physics journal. Climate “skeptics” like such journals, as they are less likely to have knowledgeable reviewers. Which physicist knows that there are inhomogeneities in the climate record? I surely did not after studying physics. That is something I learned when I started working on climatological problems. Also, a physicist is less likely to realize that it makes a difference whether you study a local station time series or a global average. Only a small fraction study complex systems, and this topic is typically not part of the physics curricula.

REPLY:
Well, you are entitled to your speculation; that doesn’t mean it is right. – Anthony

bw
January 11, 2013 5:54 pm

I’ve been saving temperature data for about 30 stations listed on the GISS site for some time. Just in the last few months there have been unusually large changes to the past data. About a third of the stations have had the last several years of data vanish. My saved text files show that most of those had shown significant cooling in those last few years. Three stations, Beaver City, Honolulu and Norfolk Island, have had early 20th century data replaced to show massively cooler temps, resulting in what now appears to be an obvious warming trend. Almost every station has had substantial changes in at least a few years of data. One station, Hilo, has “vanished”.
A few stations seem to be resisting alteration: Vostok, Halley and Davis in Antarctica. The Amundsen-Scott data are usually stable but now show small changes.
There are thousands of stations accessible from the GISS page, but if those stations are being changed in the same proportions as my small sampling, then there are massive problems with the GISS data. I can’t imagine any valid scientific justification for what has recently been happening. I hope others who are watching past surface station data maintain their old data files.

Steve Garcia
January 11, 2013 8:44 pm

Sorry, boys and girls, for how long this comment is, but:
THIS is the paper I was looking for when I first became interested in global warming – as a neutral observer with no bone to pick. That was over a decade ago. I had never found it. Till now.
Thanks to Lüdecke, Link, and Ewert! FINALLY someone has done this study. Bravissimo!
And now that they have, what do they find?
1. Anthropogenic forcings are a minor player.
2. UHI is a considerable factor [duh] (…the real anthropogenic factor in global warming…)
3. The 1989 Dying of the Thermometers needs to be looked at and included
4. Global averages are a BAD way to do it, because, as the authors say:

Secondly, and of main interest here, establishing global records attenuate the extremes of uncorrelated local records. As a consequence, the standard deviation, which is a decisive attribute in our analysis, becomes unrealistically small in global records. [emphasis added]

Those two points – the attenuation and the standard deviation – about the Mannian global averaging I have ranted about more than once here, though not using the standard deviation specifically (though in my mind it is a natural fallout of the attenuation). In my mind, losing sight of the extremes is a sure sign that the methodology is WRONG.
Temporal Rectification of proxies to each other
I also think that the global proxies have extremes within each proxy that are lost when combining them and not trying to temporally rectify their extremes. I saw this on one spaghetti graph posted here about four years ago (the author I can’t recall), where extremes of the different proxies were time-shifted from each other. I made the point here (and I think also at CA) that the uncertainty factors apply to TIME as well as magnitude. Why? Because the time element in many proxies (if not all) is uncertain, and this makes extremes in one misalign with extremes of others. When homogenizing for combining, these extremes cannot but be minimized, when in reality all the extremes do show up in the individual proxy records, and if those extremes are flattened/attenuated by the combining method, then the method CANNOT be correct. Averaging in itself will attenuate. But without consciously attempting to keep those extremes in the record, through to the final graph, the extremes will be flattened and perhaps lost altogether (which is what Mann’s Hockey Stick HANDLE is all about). The extremes are REAL in the data. If any method overly attenuates them, then that should be a red flag that a better method (probably more complex) needs to be applied. If that method doesn’t exist, then such a method needs to be carefully developed and vetted, i.e., the homogenization cannot be left standing as the proper way to analyze and/or present the data. It is imperative to develop that better method; until that is done, everyone is farting in the wind.
I am just ecstatic that this study has finally been done. To me, THIS is the definitive paper, the one that can BEGIN the beginning of an assessment of our true climate history. It MAY even make it possible to have good climate models, once people get rid of the wrong assumptions and accept the strength of the natural variations. Just adding in correct individual UHI values for stations will be a great step forward (instead of adding a global UHI value out of sheer laziness).
Steve Garcia

Adrian Kerton
January 12, 2013 1:55 am

The Graph of Temperature vs. Number of Stations
http://www.uoguelph.ca/~rmckitri/research/nvst.html
Says it all?

Jessie
January 12, 2013 3:31 am

Feet2theFire – a link to the paper would be useful, thanks.

polistra
January 12, 2013 7:51 am

Bravo. Simple thesis: Recent changes are unusual, but NOT global. Different areas are changing in DIFFERENT DIRECTIONS. Therefore you can’t possibly assume a global cause. You have to look for local or regional causes.

kim
January 12, 2013 8:21 am

Remove the fogging double negative and the last sentence means that a definitely positive anthropogenic contribution can be excluded. Now I’ll go read comments and see if anyone agrees.
==========

kim
January 12, 2013 8:26 am

Well, several and many agree. There is a metaphor with climate, foggy at times. It’s always been all about the albedo.
===========

kim
January 12, 2013 8:36 am

AK, @ 1:55 AM. Ungodly. Eyeballing, that’s nearly a two degree warp.
==================

kim
January 12, 2013 8:37 am

Victor re: Ross would be interesting.
===========

mpainter
January 12, 2013 8:48 am

Victor Venema says:
January 11, 2013 at 5:06 pm
“inhomogeneous data”; “variability….due to inhomogeneities”
==========================
You seem to favor “homogenization”. There are objections to this. Whatever arguments might be advanced in favor of homogenization, it is still data manipulation and data manipulation leads to adulteration of data and spurious results. Such is not science, but invention.

Victor Venema
January 12, 2013 2:05 pm

Dear M Painter, with your argumentation any computation performed using data would not be science. Science would only be looking at text files with raw data.
It would be nice if you had a somewhat more convincing argument why homogenization leads to spurious results.
Last year, I wrote a paper validating homogenization algorithms. Because we knew, you guys think climatologists are evil, we performed the validation blind. The climatologists did not know where the inhomogeneities were when they applied their homogenization methods. The results were clear: homogenization improves the quality of temperature data, after homogenization the trends are more accurate as before, after homogenization the decadal variability is more accurate as before.
We know from the breaks known in the station history that inhomogeneities in temperature data have a typical size of around 0.8°C and we know that they occur about every 15 to 20 years. The authors of the EIKE paper kept these inhomogeneities in their data. M Painter, do you see it as good science to claim that the variability in such a station time series is *natural*, when so much variability is artificial, is due to inhomogeneities, is due to changes in the measurement and not in the local climate?
If you want to do good science on single station data, you need to take the inhomogeneities into account.
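As a rough illustration of why such breaks matter for a single station’s century trend, here is a toy sketch of my own (not from Venema’s validation paper). It injects step changes of roughly the size and frequency quoted above (about 0.8 °C, one every 15-20 years) into a trend-free synthetic annual record and compares the fitted trends; the random signs and fixed spacing of the breaks are simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def add_breaks(series, break_size=0.8, interval=17, rng=rng):
    """Add step changes (inhomogeneities) to an annual series.

    Assumptions taken from the comment above: typical break size around
    0.8 degC and one break roughly every 15-20 years. Random signs and a
    fixed spacing are simplifications; real networks can be biased.
    """
    out = series.copy()
    for pos in range(interval, len(series), interval):
        jump = rng.normal(0.0, break_size)   # random sign, roughly 0.8 degC scale
        out[pos:] += jump                    # the step persists after the break
    return out

n_years = 100
true_climate = rng.normal(0.0, 0.3, n_years)  # toy trend-free annual anomalies
measured = add_breaks(true_climate)

t = np.arange(n_years)
print("true trend        (degC/century):", np.polyfit(t, true_climate, 1)[0] * 100)
print("trend with breaks (degC/century):", np.polyfit(t, measured, 1)[0] * 100)
```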

richardscourtney
January 12, 2013 2:43 pm

Victor Venema:
In your post at January 12, 2013 at 2:05 pm you wrote

Last year, I wrote a paper validating homogenization algorithms. Because we knew, you guys think climatologists are evil, we performed the validation blind. The climatologists did not know where the inhomogeneities were when they applied their homogenization methods. The results were clear: homogenization improves the quality of temperature data, after homogenization the trends are more accurate as before, after homogenization the decadal variability is more accurate as before.

You seem to lack an adequately long memory.
Some months ago we discussed the paper you now again cite. It is rubbish.
Importantly, in that previous discussion I interpreted your words to mean you were involved in compilation of hemispheric and global temperature data sets, but you corrected me saying you were not and you are not.
Now you say, “Because we knew, you guys think climatologists are evil, we performed the validation blind”. That sentence is very revealing. Firstly, a scientist conducts a double blind study because that avoids bias and for no other reason. Secondly, we “guys” do NOT “think climatologists are evil” but we know of a few who are corrupt because the climategate emails reveal they are in their own words.
Your assertion that “homogenization improves the quality of temperature data” is bollocks!
The data is what it is and its “quality” is its reliability, accuracy and precision. Those attributes are defined by its acquisition and they cannot be improved by post processing.
It may be that according to your trial “after homogenization the trends are more accurate as before, after homogenization the decadal variability is more accurate as before” but that is NOT an improvement to the quality of the data. It is an alteration to the effect of processing the data: a trend is a statistical construct from the data.
Indeed, the trends of the unhomogenised data CANNOT be altered by homogenisation because those trends (i.e. of individual measurement sites) are what they are. The trends of the homogenised data are composite data from several sites and they cannot be compared to the trends of unhomogenised data because there is no unique and valid method for the comparison.
Richard

Victor Venema
January 12, 2013 3:28 pm

Dear Mr Courtney, while I do remember that you are the one who always links to an incomprehensible text at the British Parliament, I must have forgotten or missed the arguments why my validation study is “rubbish”. Is it too much to ask to bring me up to speed with some solid arguments?
You perform double blind studies with animals or humans to avoid bias. In the case of algorithms you only need to do so if you do not trust the operator. I do trust scientists and would thus personally not see a pressing need for blind studies. And we also found that the algorithms which were known to be good from the validation studies by the scientists themselves were actually good in the blind study.
I will ignore the sophistry. Time for bed.

richardscourtney
January 12, 2013 3:53 pm

Victor Venema:
Your post at January 12, 2013 at 3:28 pm concludes saying

I will ignore the sophistry. Time for bed.

I used no “sophistry” in my factual post at January 12, 2013 at 2:43 pm, but your reply does, and it answers none of my points.
Onlookers can decide for themselves why you have avoided each of my points.
I hope your Mummy tucks you up comfortably and reads you a nice bedtime story.
Richard

TimTheToolMan
January 12, 2013 5:49 pm

Victor writes “It would be nice if you had a somewhat more convincing argument why homogenization leads to spurious results.”
Any time you alter the data you’re running the risk of making it worse and not better. An example might be trees that grow near-ish a temperature station and cast more and more of a shadow on the surrounding area which in turn shows as a slight cooling trend over the years. Then the trees are removed with the associated “large” instant change in measured temps and there is a discontinuity in the record at that point.
Most homogenisation methods would see that long term trend as the “data” and the tree cutting as a station change. BEST for example might count that as a second series. But what is the data really saying?

Bill Illis
January 13, 2013 4:13 am

Victor Venema,
Can I ask what is the distribution of breakpoints between those that are down breakpoints versus those that are up breakpoints?
Are the homogenization algorithms robust in terms of the underlying trend of the data? For example, if the underlying trend of the dataset is up at 0.1 °C per decade, do the algorithms detect more “down” breakpoints (going against the trend) than “up” breakpoints (going with the trend)? I think, mathematically, the homogenization algorithms have to necessarily find more downs than ups because the underlying trend of the land temperature is up. The algorithms need to be rebalanced to remove this tendency.
Berkeley BEST took the 5,700 stations they used and split them into 47,000 different individual records. Or, in other words, they found 8 different breakpoints in an average station (just mathematically, not by examining written station histories). That sounds like way too many to me, which means that some of those breakpoints are, in fact, real changes in temperature and are not station moves. They are just natural temperature variation.
Which therefore implies that, if some breakpoints are real changes and the algorithm necessarily finds more down breaks because of the underlying trend, then the homogenization algorithm merely accentuates the underlying trend.
Could we have the distribution of the breakpoints please, and the temporal distribution as well (so we can see if the distribution changes over the 60-year up and down cycle of temperatures)?
I hope people understand what I wrote above because this is a very significant issue for Land temperature records. Everyone is using these homogenization algorithms now.

January 13, 2013 5:38 am

http://www.woodfortrees.org/plot/hadcrut4gl/from:1906/to:2013/plot/hadcrut4gl/from:1906/to:2013/trend/plot/hadcrut3vgl/from:1906/to:2013/plot/hadcrut3vgl/from:1906/to:2013/trend/plot/gistemp/from:1906/to:2013/plot/gistemp/from:1906/to:2013/trend/plot/hadsst2gl/from:1906/to:2013/plot/hadsst2gl/from:1906/to:2013/trend
The above graphs show that the paper is correct in showing an increase of about 0.6 °C over the period. I am, however, questioning the database before 1920-1930, when people did not even know that thermometers have to be calibrated every so often.
I have explained here
http://blogs.24.com/henryp/2012/10/02/best-sine-wave-fit-for-the-drop-in-global-maximum-temperatures/#comment-215
why I think almost all warming is natural warming

richardscourtney
January 13, 2013 6:56 am

Bill Illis:
At January 13, 2013 at 4:13 am you say

I hope people understand what I wrote above because this is a very significant issue for Land temperature records. Everyone is using these homogenization algorithms now.

Yes, they are but each team uses its own and unique homogenization algorithms. Which – if any – homogenisation method is right?
The matter is not trivial. In his post at January 12, 2013 at 2:05 pm, Victor Venema asserts

The results were clear: homogenization improves the quality of temperature data, after homogenization the trends are more accurate as before, after homogenization the decadal variability is more accurate as before.

His assertions are clearly bunkum when there is no possibility of an independent calibration to assess the accuracy of trends and/or the accuracy of their decadal variability. And any claim that analyses of constructed data can overcome the lack of calibration is pure sophistry: the analyses only assess the assumptions used to generate the constructed data. (Venema’s childish response at January 12, 2013 at 3:28 pm provides a strong indication that he knows this).
The homogenisations certainly do change the trends mentioned by Venema; e.g. see this
http://jonova.s3.amazonaws.com/graphs/giss/hansen-giss-1940-1980.gif
Perhaps Victor Venema can explain how he knows these changes increase the accuracy of the trends?
Richard

kim
January 13, 2013 11:28 am

Yeah, I know it’s shabbas, or at least a day of rest, but where is Victor Vanema Winkle? Is he improving data in his dreams?
==========

DavidG
January 13, 2013 12:43 pm

feet to fire – a link would be nice; no one is going to accept this claim at face value, no matter how good it sounds!

DavidG
January 13, 2013 12:49 pm

I found one on William Briggs blog.
http://wmbriggs.com/blog/?p=4630

phlogiston
January 13, 2013 2:57 pm

How much of the world has to be covered by cities before UHI becomes AGW? (i.e. AGW but NOT CAGW, no role for CO2, just vegetation loss, albedo and aggregated microclimate).

phlogiston
January 13, 2013 3:04 pm

Bill Illis says:
January 13, 2013 at 4:13 am
Victor Venema,
Can I ask what is the distribution of breakpoints between those that are down breakpoints versus those that are up breakpoints?

I hope people understand what I wrote above because this is a very significant issue for Land temperature records. Everyone is using these homogenization algorithms now.

As I recall, this important issue of “breakpoints” has also been addressed by E.M. Smith.

TimTheToolMan
January 13, 2013 4:34 pm

Bill writes “I think, mathematically, the homogenization algorithms have to necessarily find more downs than ups because the underlying trend of the land temperature is up.”
I’m no great expert, but from what I’ve learned at places like Lucia’s Blackboard, where the discussion is highly technical, it seems to be common practice to de-trend data before looking for discontinuities and anomalies.

Bill Illis
January 14, 2013 4:46 am

TimTheToolMan says:
January 13, 2013 at 4:34 pm
Bill writes “I think, mathematically, the homogenization algorithms have to necessarily find more downs than ups because the underlying trend of the land temperature is up.”
I’m no great expert, but from what I’ve learned at places like Lucia’s Blackboard, where the discussion is highly technical, it seems to be common practice to de-trend data before looking for discontinuities and anomalies.
———————————
How do you de-trend 60 year up and down cycles?

Victor Venema
January 14, 2013 3:57 pm

TimTheToolMan says: “Any time you alter the data you’re running the risk of making it worse and not better.”
Yes, you do, but you also have the chance of making the data better. Homogenization will not improve the trend estimate for every single station, but on average it makes the estimated trend more reliable.
TimTheToolMan says: “An example might be trees that grow near-ish a temperature station and cast more and more of a shadow on the surrounding area which in turn shows as a slight cooling trend over the years. Then the trees are removed with the associated “large” instant change in measured temps and there is a discontinuity in the record at that point.”
That is indeed an important point to consider. If you would perform absolute homogenization, using only the data from one station, this could easily happen and your trend estimate would have a large uncertainty after homogenization. You can only do this to remove very large breaks and will only do this if there is no possibility to use relative homogenization.
This is where the “detrending” comes in, which you mention in a later comment. Typically people use relative homogenization nowadays. In this case, you use the difference between one station and its neighbors. The name comes from the concept of looking at one station relative to its neighbors. In this way, you remove the (nonlinear) regional climate trend. An added benefit is that you also remove a lot of the year-to-year variations and can thus see smaller inhomogeneities. The difference time series contains the inhomogeneities and some noise because the weather is different at both locations. In this difference time series you can see both trends in single stations (due to growing vegetation or due to urbanization) and the breaks. Then you estimate the size of the local trend and the size of the break on this difference time series. This concept is explained in more detail and with some example graphs on my blog; it also includes an example with a trend in one of the stations.
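To make the difference-series idea concrete, here is a toy sketch of my own (not Venema’s software or any published algorithm). A shared, nonlinear regional signal cancels when the candidate station is differenced against the mean of its neighbours, and even a crude split-point search then locates an inserted 0.8 °C break; real relative homogenization methods use proper statistical tests and handle multiple breaks, but the cancellation of the shared signal is the same principle.

```python
import numpy as np

rng = np.random.default_rng(2)

n = 100
regional = np.cumsum(rng.normal(0.0, 0.1, n))   # shared (nonlinear) regional signal
candidate = regional + rng.normal(0.0, 0.3, n)  # station of interest, with local weather noise
candidate[60:] += 0.8                           # undocumented 0.8 degC break at year 60
neighbours = np.mean(
    [regional + rng.normal(0.0, 0.3, n) for _ in range(5)], axis=0
)

diff = candidate - neighbours                   # the regional signal cancels here

def crude_break_position(d):
    """Split point that maximizes the jump in the mean of the difference
    series -- a toy stand-in for a real break-detection test."""
    best_pos, best_jump = None, 0.0
    for k in range(5, len(d) - 5):              # avoid very short segments
        jump = abs(d[k:].mean() - d[:k].mean())
        if jump > best_jump:
            best_pos, best_jump = k, jump
    return best_pos, best_jump

pos, size = crude_break_position(diff)
print("detected break near year", pos, "estimated size", round(size, 2))
```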

richardscourtney
January 14, 2013 4:25 pm

Victor Venema:
At January 14, 2013 at 3:57 pm you attempt to answer a point from TimTheToolMan.
Please be so kind as to answer the question I posed to you at January 13, 2013 at 6:56 am.
To save you needing to scroll up and find it, I copy it here

The homogenisations certainly do change the trends mentioned by Venema; e.g. see this
http://jonova.s3.amazonaws.com/graphs/giss/hansen-giss-1940-1980.gif
Perhaps Victor Venema can explain how he knows these changes increase the accuracy of the trends?

Richard

Victor Venema
January 14, 2013 4:50 pm

Bill Illis asks: “Can I ask what is the distribution of breakpoints between those that are down breakpoints versus those that are up breakpoints?”
Whether trends in the data are a problem depends on where the trend comes from.
If the trend is due to regional or global climate change, the trend is no problem whatsoever. As I just explained to the Tool Man, the first step in relative homogenization of temperature data is to compute the difference time series of the station you are interested in and its neighbours. If the measurement network is sufficiently dense, both stations will experience the same trend and you will not see the trend in the difference time series any more. We detect and determine the size of the breaks on this difference time series. Consequently, it makes no difference whether the breaks go up or down or whether they go in the direction of the trend or against its direction.
In our validation study, we generated data, which had strong trends, both up and down. I included negative regional climate trends as I was curious whether algorithms, which are always used to correct upward trends, might have a programming error that makes them less good for downward trends, which was not the case. You can see a scatterplot with results for three algorithms in the top row of Figure 5. ACMANT (left) was a just developed method, when we validated it in this study and still had some problems with the trends, the new version is much better. The Craddock method (middle) is one of the best methods, the trends in the original data (the data before I put in the inhomogeneities) and in the homogenized data are very similar and lie almost perfectly on the diagonal. AnClim (right) did not do so good, but still improves the station trends. In the bottom row, you can see that for precipitation it is much more difficult to improve the trends. Mainly because precipitation is so highly variable and it is thus difficult to accurately estimate the size of the breaks.
However, if the trend is due to the inhomogeneities it is a different matter. Homogenization will improve the trends on average, but it is not almighty. Not all inhomogeneities can be found, only the largest ones. Thus if there is a bias in the trend, a part of this will remain after homogenization. How accurately the trends can be reproduced depends on the noise in the difference time series and thus on the strength of the cross-correlations between the stations. The cross-correlation decreases with distance between stations. Thus especially in data-poor regions and periods some bias may remain. One of the main deficiencies of our validation study was that we did not insert inhomogeneities that produced a bias in the trend. Another recent study by NOAA did include inhomogeneities that produced biases. Unfortunately, this NOAA dataset was unrealistically difficult. Thus you can conclude from it that homogenization is able to remove part of the trend bias, but not quantify how well homogenization would do in a real world setting. The benchmarking group of the ISTI is working on doing this for a global dataset.
An artificial positive trend due to inhomogeneities could be due to urbanization (increase in the size of the urban heat island effect). In the stations where this is an issue, this is a clear signal. However, most stations are not affected. Therefore this is expected to be a small effect. I unfortunately did not study this myself yet and can thus not explain much more.
An artificial negative trend could be due to better protection against solar radiation in modern climate stations, due to relocations from cities to airports and in the US due to a systematic change in the time of observation.
Most of the evidence suggests that the artificial negative trend is stronger. Consequently, the GHCN trend is 0.2°C stronger after homogenization. I personally expect that this correction is too small, that the GHCN trend estimate is too conservative. Sorry about that, I know that message is not popular around here.

Victor Venema
January 14, 2013 5:06 pm

Dear Mr. Courtney. Thank you for your polite question.
Yes, homogenization changes the trend and it should change the trend in case the inhomogeneities cause a bias.
Pointing to GISS, when discussing homogenization is a bit weird. I never looked into this dataset, but as far as I know, GISS uses raw data from GHCN without performing homogenization and only makes a simple correction for urbanization.
It would be nice if climate “sceptics” would stop linking to pictures and started linking to articles that explain how these pictures were computed. Without explanation, these pictures suggest a lot but say nothing. Another nice thing about linking to an article is that typically some nice person has already explained in the comments what is wrong with the picture. That saves a lot of time.
That homogenization improves the trend makes sense from first principles. That the software implementations actually improve the trends can be seen in numerical validation studies, for example the two mentioned in my previous answer to Bill Illis.

January 14, 2013 11:51 pm

Victor writes at his blog: “sometimes even the local newspapers are studied, but you cannot read all newspapers printed in the last century.”
You can certainly read the papers that pertain to periods where discontinuities occur and adjustments are going to happen. As well as possibly talking to the people who ran the station at the time. There is no substitute for actual data.

richardscourtney
January 15, 2013 5:51 am

Victor Venema:
You again use sophistry in your reply to me at January 14, 2013 at 5:06 pm when you say

It would be nice if climate “sceptics” would stop linking to pictures and started linking to articles that explain how these pictures were computed.

Say what!? I was asking YOU to explain it because YOU claim you know about homogenisation of global temperature data sets.
And your reply says

Pointing to GISS, when discussing homogenization is a bit weird. I never looked into this dataset, but as far as I know, GISS uses raw data from GHCN without performing homogenization and only makes a simple correction for urbanization.

If true, then that says the changes in the GISS data set are produced solely by “a simple correction for urbanization”. Taking your assertion as being correct then that “simple correction” changes from month to month such that it provides this
http://jonova.s3.amazonaws.com/graphs/giss/hansen-giss-1940-1980.gif
Also, Bill Illis wrote at January 13, 2013 at 4:13 am

I hope people understand what I wrote above because this is a very significant issue for Land temperature records. Everyone is using these homogenization algorithms now.

Subsequently, that has been cited and taken as a ‘given’ in this thread. But you do not mention it in your long reply to Bill Illis at January 14, 2013 at 4:50 pm. However, in that reply you admit that “a simple correction for urbanization” IS “homogenisation” when you write

An artificial positive trend due to inhomogeneities could be due to urbanization (increase in the size of the urban heat island effect).

In a previous post you had the gall to accuse me of “sophistry” when your posts show you to be the most extreme sophist it has ever been my misfortune to encounter.
So, I again ask you to answer the question which I have repeatedly put to you and which you have used sophistry to evade

The homogenisations certainly do change the trends mentioned by Venema; e.g. see this
http://jonova.s3.amazonaws.com/graphs/giss/hansen-giss-1940-1980.gif
Perhaps Victor Venema can explain how he knows these changes increase the accuracy of the trends?

Please note that I first presented that question in my post at January 13, 2013 at 6:56 am where I wrote

Yes, they are but each team uses its own and unique homogenization algorithms. Which – if any – homogenisation method is right?
The matter is not trivial. In his post at January 12, 2013 at 2:05 pm, Victor Venema asserts

The results were clear: homogenization improves the quality of temperature data, after homogenization the trends are more accurate as before, after homogenization the decadal variability is more accurate as before.

His assertions are clearly bunkum when there is no possibility of an independent calibration to assess the accuracy of trends and/or the accuracy of their decadal variability. And any claim that analyses of constructed data can overcome the lack of calibration is pure sophistry: the analyses only assess the assumptions used to generate the constructed data.

Your posts since then have not addressed any of those fundamental points; viz
1. Different teams use different homogenisation methods: which is right?
2. There is no possibility of independent calibration for the effects of homogenisation.
3. The ‘trials’ of homogenisation assess the ability to compensate for the assumptions in the constructed data, so may be misleading concerning the ability of homogenisation to address problems with real data.
Richard

Victor Venema
January 15, 2013 6:53 am

Dear Mr. Medhurst. It would have been nice if you had made your comment (also) below the post you are complaining about. There, more people would have read the sentence. It would also have been nice if you had linked to the post so that people would know the context. There it would make more of an impression that you are interested in understanding the issue better. Commenting only here gives me the impression that you mainly want to discredit.
Tim Medhurst says: “You can certainly read the papers that pertain to periods where discontinuities occur and adjustments are going to happen.”
The post you are referring to states that this is sometimes performed. Typically by a national climatologist who is interested in the local climate.
For a global dataset this is clearly not possible and not necessary. A global dataset contains thousands of stations and the metadata (station history) is in hundreds of languages. I am not sure whether the population is willing to pay the additional taxes to fund this project. Especially as it will not improve the reliability of the global temperature trend noticeably.
The main use of such metadata would be to make the date of a break found by statistical homogenization more precise. Another good use would be the validation of the performance of statistical homogenization methods. For the latter use, it is fortunately sufficient to have some regions with good metadata, which can then be used to validate the homogenization algorithms (next to validation studies using synthetic data, which have the advantage that you are sure you know all breaks, which you never are in reality).
Only implementing breaks known in the metadata, or preferentially implementing breaks known in the metadata, is dangerous because some types of breaks have better metadata than others. For example, the increase of the urban heat island is difficult to document, while the move of the station out of the city will be documented. TimTheToolMan’s example of growing vegetation is similar. That a bush is growing is not often noted down; that it is cut may end up in the metadata. Thus only correcting inhomogeneities known in the metadata could lead to more biases in the trends.
Tim Medhurst says: “As well as possibly talking to the people who ran the station at the time.”
Also this is done by national climatologists or by inspectors of the national weather services, where do you think the metadata comes from? This is again naturally not possible for large datasets. No one wants to pay the additional taxes for little gain in quality. And most of the observers are long dead. If the Surface Stations project is continued for multiple decades, it will be a useful resource for future climatologists. Just having a time slice is nice, but of limited use.
Your comment is a typical example of the tactic of using impossible expectations (and moving goalposts). If you had a study that shows that such work is paramount and will not just change the fourth decimal, that might convince people to make such an enormous investment. As always in life, resources are limited and have to be spent well.

January 15, 2013 4:37 pm

The mathematical models used in statistical analysis depend on the assumption that all members of the population studied were obtained under identical circumstances at the same time, using the same equipment, and preferably by the same observer. The “Global Surface Temperature Anomaly Record” violates every conceivable condition, plus the fact that none of the samples is representative of global temperature, the degree to which they differ being itself variable. The widespread use of this record as representing a “temperature trend” displays at once a profound ignorance of mathematics and a determination to make use of the most unreliable information if it serves the purpose of “saving the planet”. This paper is merely an example of how it is possible to make something from re-arranging the deckchairs on the “Titanic”.

Brian H
January 17, 2013 9:22 pm

The only anthropogenic influence which cannot be ruled out is a marginal one. All other “anthropogenic” attributions are ruled out.