Would You Like Your Temperature Data Homogenized, or Pasteurized?

A Smoldering Gun From Nashville, TN

Guest post by Basil Copeland

The hits just keep on coming. About the same time that Willis Eschenbach revealed “The Smoking Gun at Darwin Zero,” the UK’s Met Office released a “subset” of the HadCRUT3 data set used to monitor global temperatures. I grabbed a copy of “the subset” and then began looking for a location near me (I live in central Arkansas) that had a long and generally complete station record that I could compare to a “homogenized” set of data for the same station from the GISTemp data set. I quickly, and more or less randomly, decided to take a closer look at the data for Nashville, TN. In the HadCRUT3 subset, this is “72730” in the folder “72.” A direct link to the homogenized GISTemp data used is here. After transforming the row data to column data (see the end of the post for a “bleg” about this), the first thing I did was plot the differences between the two series:

[Figure: differences between the GISTemp and HadCRUT3 series for Nashville, TN]

The GISTemp homogeneity adjustment looks a little hockey-stickish, and induces an upward trend by reducing older historical temperatures more than recent historical temperatures. This has the effect of turning what is a negative trend in the HadCRUT3 data into a positive trend in the GISTemp version:

[Figure: linear trends in the HadCRUT3 and GISTemp series for Nashville, TN]

So what would appear to be a general cooling trend over the past ~130 years at this location when using the unadjusted HadCRUT3 data becomes a warming trend when the homogeneity adjustment is applied.
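For readers who want to repeat the comparison, here is a minimal sketch of the three steps described above: turning year-by-month rows into a single column, differencing two versions of a station record, and fitting a least-squares trend. The function names, the `MISSING` sentinel, and the two-year tables are all made up for illustration; they are not the actual file layout or the Nashville data.

```python
MISSING = -99.0  # assumed sentinel for a missing month; check the real file


def rows_to_column(rows):
    """Flatten {year: [12 monthly values]} into [(year, month, value), ...]."""
    column = []
    for year in sorted(rows):
        for month, value in enumerate(rows[year], start=1):
            column.append((year, month, None if value == MISSING else value))
    return column


def series_difference(adjusted, unadjusted):
    """Adjusted minus unadjusted for every month present in both series."""
    lookup = {(y, m): v for y, m, v in unadjusted if v is not None}
    return [
        (y, m, v - lookup[(y, m)])
        for y, m, v in adjusted
        if v is not None and (y, m) in lookup
    ]


def ols_slope(column):
    """Least-squares slope per month, skipping missing values."""
    points = [(i, v) for i, (_, _, v) in enumerate(column) if v is not None]
    t_bar = sum(t for t, _ in points) / len(points)
    v_bar = sum(v for _, v in points) / len(points)
    num = sum((t - t_bar) * (v - v_bar) for t, v in points)
    den = sum((t - t_bar) ** 2 for t, _ in points)
    return num / den


# Made-up two-year tables, not real station data.
unadjusted_rows = {1881: [3.0] * 12, 1882: [3.0] * 11 + [MISSING]}
adjusted_rows = {1881: [2.5] * 12, 1882: [3.0] * 12}

unadjusted = rows_to_column(unadjusted_rows)
adjusted = rows_to_column(adjusted_rows)
diffs = series_difference(adjusted, unadjusted)

print("months compared:", len(diffs))
print("slope, unadjusted:", ols_slope(unadjusted))
print("slope, adjusted:", ols_slope(adjusted))
```

In this toy case, lowering only the older values is enough to give a flat series a positive fitted slope, which is the kind of effect discussed above.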

“There is nothing to see here, move along.” I do not buy that. Whether or not the homogeneity adjustment is warranted, it has an effect that calls into question just how much the earth has in fact warmed over the past 120-150 years (the period covered, roughly, by GISTemp and HadCRUT3). There has to be a better, more “robust” way of measuring temperature trends, one that is not so sensitive that it turns negative trends into positive trends (which we’ve seen it do twice now, first with Darwin Zero, and now here with Nashville). I believe there is.

Temperature Data: Pasteurized versus Homogenized

In a recent series of posts, here, here, and with Anthony here, I’ve been promoting a method of analyzing temperature data that reveals the full range of natural climate variability. Metaphorically, this strikes me as trying to make a case for “pasteurizing” the data, rather than “homogenizing” it. In homogenization, the object is to “mix things up” so that it is “the same throughout.” When milk is homogenized, this prevents the cream from rising to the top, thus preventing us from seeing the “natural variability” that is in milk. But with temperature data, I want very much to see the natural variability in the data. And I cannot see that with linear trends fitted through homogenized data. It may be a hokey analogy, but I want my data pasteurized – as clean as it can be – but not homogenized so that I cannot see the true and full range of natural climate variability.

I believe that the only way to truly do this is by analyzing, or studying, how differences in the temperature data vary over time. And they do not simply vary in a constant direction. As everybody knows, temperatures sometimes trend upwards, and at other times downward. The method of studying differences in the temperature data allows us to see this far more clearly than simply fitting trend lines to undifferenced data. In fact, it can prevent us from reaching the wrong conclusion, as in fitting a positive trend when the real trend has been negative. To demonstrate this, here is a plot of monthly seasonal differences for the GISTemp version of the Nashville, TN data set:

[Figure: monthly seasonal differences (sd) and smoothed trend, GISTemp, Nashville, TN]

Pay close attention as I describe what we’re seeing here. First, “sd” means “seasonal differences” (not “standard deviation”). That is, it is the year to year variation in each monthly observation, for example October 2009 compared to October 2008. Next, the “trend” is the result of smoothing with Hodrick-Prescott smoothing (lambda = 14,400). The type of smoothing here is not as critical as is the decision to smooth the seasonal differences. If a reader prefers a different smoothing algorithm, have at it. Just make sure you apply it to the seasonal differences, and that it does not change the overall mean of the series. That is, the mean of the seasonal differences, for GISTemp’s Nashville, TN data set, is -0.012647, whether smoothed or not. The smoothing simply helps us to see, a little more clearly, the regularity of warming and cooling trends over time. Now note clearly the sign of the mean seasonal difference: it is negative. Even in the GISTemp series, Nashville, TN has spent more time cooling (imagine here periods where the blue line in the chart above is below zero) than it has warming over the last ~130 years.
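The two steps just described can be sketched as follows. This is a minimal illustration only: it assumes the Hodrick-Prescott filter in its standard penalized least-squares form (trend = (I + lambda D'D)^-1 y, with D the second-difference operator), solved with a plain dense elimination that is fine for short series. The function names and the synthetic monthly series are made up; this is not the software or the data used for the charts.

```python
import math


def seasonal_differences(values, lag=12):
    """Each month minus the same month one year earlier."""
    return [values[i] - values[i - lag] for i in range(lag, len(values))]


def hp_trend(y, lam=14400.0):
    """Hodrick-Prescott trend: solve (I + lam * D'D) trend = y."""
    n = len(y)
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    coeffs = (1.0, -2.0, 1.0)  # second-difference stencil
    for r in range(n - 2):
        for i, ci in enumerate(coeffs):
            for j, cj in enumerate(coeffs):
                a[r + i][r + j] += lam * ci * cj
    b = list(y)
    # Forward elimination (the matrix is symmetric positive definite).
    for col in range(n):
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            if factor != 0.0:
                for k in range(col, n):
                    a[row][k] -= factor * a[col][k]
                b[row] -= factor * b[col]
    # Back substitution.
    trend = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(a[row][k] * trend[k] for k in range(row + 1, n))
        trend[row] = (b[row] - s) / a[row][row]
    return trend


def mean(values):
    return sum(values) / len(values)


# Synthetic six-year monthly series: annual cycle plus a slow oscillation.
temps = [
    15.0 + 10.0 * math.sin(2 * math.pi * t / 12) + 0.5 * math.sin(2 * math.pi * t / 60)
    for t in range(72)
]
sd = seasonal_differences(temps)
trend = hp_trend(sd)

print("mean of seasonal differences:", mean(sd))
print("mean of smoothed differences:", mean(trend))
```

The two printed means agree to within rounding, which is the property asked for above: the smoother reshapes the series without changing its overall mean.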

How can that be? Well, the method of analyzing differences is less sensitive (i.e., more “robust”) than fitting trend lines through the undifferenced data. “Step” type adjustments, as we see with homogeneity adjustments, only affect the differenced values that span the step (a single data point for first differences, twelve consecutive months for seasonal differences), but affect every data point (before or after it is applied) in the undifferenced series. We can see the effect of the GISTemp homogeneity adjustments here by comparing the previous figure with the following:

[Figure: monthly seasonal differences (sd) and smoothed trend, HadCRUT3, Nashville, TN]

Here, in the HadCRUT3 series, the mean seasonal difference is more negative, -0.014863 versus -0.012647. The GISTemp adjustment increases the average seasonal difference by 0.002216, making it less negative, but not enough so that the result becomes positive. In both cases we still come to the conclusion that “on the average” monthly seasonal differences in temperatures in Nashville have been negative over the last ~130 years.
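A small made-up example of how a step adjustment propagates: it changes every level on one side of the step, but only those differences whose two endpoints sit on opposite sides of it (one value for first differences, twelve for a 12-month lag). The series, the step size of 0.5, and the function name are illustrative assumptions, not the actual GISTemp adjustment.

```python
def lagged_differences(values, lag):
    """values[t] - values[t - lag] for every t where both exist."""
    return [values[i] - values[i - lag] for i in range(lag, len(values))]


def count_changed(a, b, tol=1e-12):
    """Number of positions where two equal-length series differ."""
    return sum(1 for x, y in zip(a, b) if abs(x - y) > tol)


# Made-up ten-year monthly series with a fixed seasonal pattern.
base = [15.0 + 0.01 * (t % 12) for t in range(120)]

# Step adjustment: lower everything before month 60 by 0.5.
step_at = 60
adjusted = [v - 0.5 if t < step_at else v for t, v in enumerate(base)]

changed_levels = count_changed(adjusted, base)
changed_first = count_changed(
    lagged_differences(adjusted, 1), lagged_differences(base, 1)
)
changed_seasonal = count_changed(
    lagged_differences(adjusted, 12), lagged_differences(base, 12)
)

print("levels changed:", changed_levels)
print("first differences changed:", changed_first)
print("seasonal differences changed:", changed_seasonal)
```

Because so few differenced values move, the mean of the differenced series shifts only slightly, whereas a trend line through the levels sees the whole step.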

An Important Caveat

So have we actually shown that, at least for Nashville, TN, there has been no net warming over the past ~130 years? No, not necessarily. The average monthly seasonal difference has indeed been negative over the past 130 years. But it may have been becoming “less negative.” Since I have more confidence, at this point, in the integrity of the HadCRUT3 data, than the GISTemp data, I’ll discuss this solely in the context of the HadCRUT3 data. In both the “original data” and in the blue “trend” shown in the above figure, there is a slight upward trend over the past ~130 years:

[Figure: linear fit through the smoothed seasonal differences, HadCRUT3, Nashville, TN]

Here, I’m only showing the fit relative to the smoothed (trend) data. (It is, however, exactly the same as the fit to the original, or unsmoothed, data.) Whereas the average seasonal difference for the HadCRUT3 data here was -0.014863, from the fit through the data it was only -0.007714 at the end of the series (October 2009). Still cooling, but less so, and in that sense one could argue that there has been some “warming.” And overall (i.e., if a similar kind of analysis is applied to all of the stations in the HadCRUT3 data set, or “subset”) I will not be surprised if there is some evidence for warming. But that has never really been the issue. The issue has always been (a) how much warming, and (b) where has it come from?
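For readers wondering why a straight-line fit to the smoothed data can be exactly the same as the fit to the unsmoothed data, here is a short sketch of the algebra. It assumes the Hodrick-Prescott trend in its standard penalized least-squares form and an ordinary least-squares fit on a constant and a linear time term; the symbols are introduced here for illustration only.

```latex
% y: vector of seasonal differences
% X = [1, t]: constant and linear-time regressors
% D: second-difference operator, so D X = 0
\begin{aligned}
\tau &= (I + \lambda D^{\top} D)^{-1} y
  && \text{(Hodrick-Prescott trend)} \\
(I + \lambda D^{\top} D) X &= X
  \;\Rightarrow\; X^{\top} (I + \lambda D^{\top} D)^{-1} = X^{\top}
  && \text{(since } D X = 0 \text{ and the matrix is symmetric)} \\
\hat{\beta}_{\tau} &= (X^{\top} X)^{-1} X^{\top} \tau
  = (X^{\top} X)^{-1} X^{\top} y
  = \hat{\beta}_{y}
\end{aligned}
```

Taking only the constant column of X gives the special case that the smoothed and unsmoothed series have the same mean.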

I suggest that the above chart showing the fit through the smooth helps define the challenges we face in these issues. First, the light gray line depicts the range of natural climate variability on decadal time scales. This much – and it is very much of the data – is completely natural, and cannot be attributed to any kind of anthropogenic influence, whether UHI, land use/land cover changes, or, heaven forbid, greenhouse gases. If there is any anthropogenic impact here, it is in the blue line, what is in effect a trend in the trend. But even that is far from certain, for before we can conclude that, we have to rule out natural climate variability on centennial time scales. And we simply cannot do that with the instrumental temperature record, because it isn’t long enough. I hate to admit that, because it means either that we accept the depth of our ignorance here, or we look for answers in proxy data. And we’ve seen the mess that has been made of things in trying to rely on proxy data. I think we have to accept the depth of our ignorance, for now, and admit that we do not really have a clue about what might have caused the kind of upward drift we see in the blue trend line in the preceding figure. Of course, that means putting a hold on any radical socioeconomic transformations based on the notion that we know what in truth we do not know.

203 Comments
Basil
Editor
December 15, 2009 4:47 am

JJ,
So your “several” demonstrable errors reduce to two, one of which I’ve addressed, and the other of which is wrong.
“1) Error: HadCrut3 is ‘unadjusted’ data. Truth: HadCrut3 is homogenized data.”
You say it is homogenized. I say it is whatever CRU says it is, and copied the language above. Even if it is in some way adjusted (homogenized), it isn’t for the “homogeneity” adjustment of GISS, which is a purported UHI adjustment.
“2) Error: GISS homogenized data are derived from HadCrut3, or from a file that is the same as HadCrut3. Truth:GISS homogenized data are derived from GISS combined station data, and that is nothing like Hadcrut3.”
Did I ever actually say that “GISS homogenized data are derived from HadCRUT3, or from a file that is the same as HadCrut3”? I do think they both, in the Nashville case, go back to the same source, but I don’t think I ever made the claim as you state it. In any case, “for the record,” my purpose was to compare HadCRUT3, regardless of provenance, to GISS, with the latter’s homogeneity adjustment.
If I’m in full denial mode, you are for some reason determined that I should fall on my sword and issue a public retraction. I don’t think I can please you, and be true to myself. All I can do is try to explain myself, as best I can.
I am, however, concerned a bit with this:
“Also, please document and carefully check the trends that you quote for these data. I have fit least squares trend lines to the unadjusted and homogenized GISS data, and they both differ in sign from what you report here.”
That’s why I have asked you what you used for the missing value, though it would be surprising if any reasonable substitute for the missing value would account for this. I am more than happy to continue a dialog about this, but have no intention of engaging you further in the other matters. On this, for starters, maybe we could each just share some summary statistics for our GIS variables. For mine:
Summary Statistics, using the observations 1881:01 – 2009:10
for the variable ‘GIS’ (1546 valid observations)
Mean 15.150
Median 15.400
Minimum -4.5000
Maximum 30.500
Standard deviation 8.3482
C.V. 0.55104
Skewness -0.13403
Ex. kurtosis -1.2930
Summary Statistics, using the observations 1881:01 – 2009:10
for the variable ‘sd_GIS’ (1534 valid observations)
Mean -0.012647
Median 0.00000
Minimum -10.900
Maximum 9.5000
Standard deviation 2.6586
C.V. 210.22
Skewness 0.070627
Ex. kurtosis 0.79037
I don’t know if you have the ability to quickly repeat exactly the same data as above, but mean, median, and standard deviation shouldn’t be too hard. Let’s see how close we are in these figures, to see if we’re using anything close to the same data.
Basil

JJ
December 15, 2009 9:18 am

Basil,
“So your “several” demonstrable errors reduce to two, one of which I’ve addressed, and the other of which is wrong.”
No. Two fundamental errors remain, and they propagate throughout your article, creating more errors.
1) Error: HadCrut3 is ‘unadjusted’ data. Truth: HadCrut3 is homogenized data.
“You say it is homogenized. I say it is whatever CRU says it is, and copied the language above.”
CRU says it is homogenized! That is what ‘adjusted to account for non climatic influences’ means. And, as I mentioned to you earlier, those CRU homogenization adjustments are NOT small. Phlim Phlam Phil Jones added a LOT of ‘value’ to some of those data.
“Even if it is in some way adjusted (homogenized), …”
Then your assumption and multiple statements that it is ‘unadjusted data’, and the conclusions you draw while operating from that assumption, are wrong.
Man up and admit that.
“… it isn’t for the “homogeneity” adjustment of GISS, which is a purported UHI adjustment.”
That has no bearing whatsoever, as your article is not in any way a specific address of UHI. FFS you only mention UHI once, in a list of possible anthropogenic effects, at the end of the article. Stop grasping at straws.
Man up and admit that you were wrong.
2) Error: GISS homogenized data are derived from HadCrut3, or from a file that is the same as HadCrut3. Truth: GISS homogenized data are derived from GISS combined station data, and that is nothing like Hadcrut3.
“Did I ever actually say that “GISS homogenized data are derived from HadCRUT3, or from a file that is the same as HadCrut3”?”
YES! More than once. Here:
“I didn’t say HadCRUT3 was “raw.” I referred to it as ‘unadjusted.’ Now what I meant by that is that it should be the same as GISS before GISS applies its ‘homogeneity’ adjustment.” and again, here:
“If you go to the GISTemp web site, you get the option of downloading its “pseudo-raw” version of the data, i.e. the data before it applies its “homogeneity” adjustment. I could have used that, instead of HadCRUT3, and I believe that the results would have been similar, if not the same.”
You assumed that HADcrut3 was the same as GISS prior to adjustment. You assumed that substituting HADcrut3 for unadjusted GISS temp data would give the same results. The balance of your article makes sense under that assumption, and is nonsensical otherwise. Stop contradicting your own words, man up, and admit that you were wrong.
“In any case, “for the record,” my purpose was to compare HadCRUT3, regardless of provenance, to GISS, with the latter’s homogeneity adjustment.”
Clearly, that was not the purpose in what you wrote. Your purpose was to assess the effect of the GISS temp homogeneity adjustment, demonstrate by comparison to ‘unadjusted’ data that the GISS temp homogeneity adjustment is inadequate, and offer your ‘sd method’ as a superior substitute. That is what you were trying to do. For the record, that would have been a sensible and interesting thing to have done. And the way you used HADcrut3 data, under your assumption that HADcrut3 data were unadjusted, is perfectly consistent with that approach.
The only problem is, your assumption that HADcrut3 data were the same as GISS temp data before the homogenization adjustment was wrong.
That is a simple error. It is easily corrected. You have access to the unadjusted GISS temp data. All you have to do to correct your error is actually use those data, instead of using HADcrut3 data that you thought were the same but aren’t. Why will you not simply do that?
Evidently, this is why:
[snip]
Correcting your error would require you to admit that you made an error. You would rather make a fool of yourself denying a simple, easily fixed error, than admit to having made a simple, easily fixed error.
You would rather Mann up, than man up.
Anthony should strongly reconsider further collaboration with someone who exhibits Hockey Team behaviours. This site is supposed to be dedicated to exposing and correcting instances of bad science, not committing and rationalizing them.

Basil
Editor
December 15, 2009 9:48 am

If anybody else wants to add to this, I’ll listen. But with each exchange, your tone becomes more and more strident. Plus, the ad hominem remarks — I ignored the “Mann” reference the first time, but now will not — betray something darker in your personality you need to get control of. But it looks like everybody else is moving on. I suggest you do the same. I’m done wasting my time with you.
