Would You Like Your Temperature Data Homogenized, or Pasteurized?

A Smoldering Gun From Nashville, TN

Guest post by Basil Copeland

The hits just keep on coming. About the same time that Willis Eschenbach revealed “The Smoking Gun at Darwin Zero,” the UK’s Met Office released a “subset” of the HadCRUT3 data set used to monitor global temperatures. I grabbed a copy of “the subset” and then began looking for a location near me (I live in central Arkansas) that had a long and generally complete station record that I could compare to a “homogenized” set of data for the same station from the GISTemp data set. I quickly, and more or less randomly, decided to take a closer look at the data for Nashville, TN. In the HadCRUT3 subset, this is “72730” in the folder “72.” A direct link to the homogenized GISTemp data used is here. After transforming the row data to column data (see the end of the post for a “bleg” about this), the first thing I did was plot the differences between the two series:

[Figure: differences between the GISTemp and HadCRUT3 series for Nashville, TN]

The GISTemp homogeneity adjustment looks a little hockey-stickish, and induces an upward trend by reducing older historical temperatures more than recent historical temperatures. This has the effect of turning what is a negative trend in the HadCRUT3 data into a positive trend in the GISTemp version:

[Figure: linear trends in the HadCRUT3 and GISTemp series for Nashville, TN]


So what would appear to be a general cooling trend over the past ~130 years at this location when using the unadjusted HadCRUT3 data, becomes a warming trend when the homogeneity adjustment is supplied.

“There is nothing to see here, move along.” I do not buy that. Whether or not the homogeneity adjustment is warranted, it has an effect that calls into question just how much the earth has in fact warmed over the past 120-150 years (the period covered, roughly, by GISTemp and HadCRUT3). There has to be a better, more “robust” way of measuring temperature trends, that is not so sensitive that it turns negative trends into positive trends (which we’ve seen it do twice now, first with Darwin Zero, and now here with Nashville). I believe there is.

Temperature Data: Pasteurized versus Homogenized

In a recent series of posts, here, here, and with Anthony here, I’ve been promoting a method of analyzing temperature data that reveals the full range of natural climate variability. Metaphorically, this strikes me as trying to make a case for “pasteurizing” the data, rather than “homogenizing” it. In homogenization, the object is to “mix things up” so that it is “the same throughout.” When milk is homogenized, this prevents the cream from rising to the top, thus preventing us from seeing the “natural variability” that is in milk. But with temperature data, I want very much to see the natural variability in the data. And I cannot see that with linear trends fitted through homogenized data. It may be a hokey analogy, but I want my data pasteurized – as clean as it can be – but not homogenized so that I cannot see the true and full range of natural climate variability.

I believe that the only way to truly do this is by analyzing, or studying, how differences in the temperature data vary over time. And they do not simply vary in a constant direction. As everybody knows, temperatures sometimes trend upwards, and at other times downward. Studying how differences in the temperature data vary allows us to see this far more clearly than simply fitting trend lines to undifferenced data. In fact, it can prevent us from reaching the wrong conclusion, as in fitting a positive trend when the real trend has been negative. To demonstrate this, here is a plot of monthly seasonal differences for the GISTemp version of the Nashville, TN data set:

[Figure: monthly seasonal differences (“sd”) and smoothed trend, GISTemp, Nashville, TN]

Pay close attention as I describe what we’re seeing here. First, “sd” means “seasonal differences” (not “standard deviation”). That is, it is the year to year variation in each monthly observation, for example October 2009 compared to October 2008. Next, the “trend” is the result of smoothing with Hodrick-Prescott smoothing (lambda = 14,400). The type of smoothing here is not as critical as is the decision to smooth the seasonal differences. If a reader prefers a different smoothing algorithm, have at it. Just make sure you apply it to the seasonal differences, and that it does not change the overall mean of the series. For GISTemp’s Nashville, TN data set, the mean of the seasonal differences is -0.012647, whether smoothed or not. The smoothing simply helps us to see, a little more clearly, the regularity of warming and cooling trends over time. Now note clearly the sign of the mean seasonal difference: it is negative. Even in the GISTemp series, Nashville, TN has spent more time cooling (imagine here periods where the blue line in the chart above is below zero) than it has warming over the last ~130 years.
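
For readers who want to try this at home, here is a minimal sketch of the seasonal-difference calculation in Python. It is not the code used for the charts in this post: the function names are made up for illustration and the series is synthetic, not the Nashville data. It also checks one general property of lag-12 differences, namely that their sum telescopes, so the mean depends only on the first and last twelve months of the record.

```python
def seasonal_differences(monthly, lag=12):
    """Return y[t] - y[t - lag] for every t that has a value `lag` months earlier."""
    return [monthly[t] - monthly[t - lag] for t in range(lag, len(monthly))]


def mean(values):
    return sum(values) / len(values)


# Synthetic monthly series (an assumption, not station data): a fixed seasonal
# cycle plus a slow cooling of 0.012 degrees per year, over 10 years.
cycle = [2.0, 4.0, 9.0, 14.0, 19.0, 23.0, 25.0, 24.0, 21.0, 15.0, 9.0, 4.0]
series = [cycle[m % 12] - 0.012 * (m // 12) for m in range(120)]

sd = seasonal_differences(series)
sd_mean = mean(sd)

# The lag-12 differences telescope: their sum equals the sum of the last
# twelve months minus the sum of the first twelve months.
telescoped = (sum(series[-12:]) - sum(series[:12])) / len(sd)

print(len(sd))       # 108 differences from 120 months
print(sd_mean)       # close to -0.012
print(telescoped)    # the same value, up to rounding
```

With real data, `series` would be the station’s monthly values in time order, with any gaps handled before differencing.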

How can that be? Well, the method of analyzing differences is less sensitive – i.e. more “robust” – than fitting trend lines through the undifferenced data. “Step” type adjustments, as we see with homogeneity adjustments, only affect a single data point per calendar month in the seasonally differenced series, but affect every data point (before or after the point where the step is applied) in the undifferenced series. We can see the effect of the GISTemp homogeneity adjustments here by comparing the previous figure with the following:
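
The point about step adjustments can be illustrated with a small sketch. Everything here is hypothetical: a 20-year synthetic monthly series and an invented adjustment of -0.5 applied to every month before the midpoint. With monthly data and lag-12 differences, the step touches one difference for each calendar month, twelve in all, against half of the undifferenced record.

```python
def seasonal_differences(monthly, lag=12):
    """Return y[t] - y[t - lag] for every t that has a value `lag` months earlier."""
    return [monthly[t] - monthly[t - lag] for t in range(lag, len(monthly))]


# Synthetic 20-year monthly series: a repeating seasonal cycle, no trend.
cycle = [2.0, 4.0, 9.0, 14.0, 19.0, 23.0, 25.0, 24.0, 21.0, 15.0, 9.0, 4.0]
raw = [cycle[m % 12] for m in range(240)]

# Hypothetical step adjustment: lower every value before month 120 by 0.5.
step_at = 120
step = -0.5
adjusted = [v + step if t < step_at else v for t, v in enumerate(raw)]

# Count how many points the adjustment changes in each representation.
changed_levels = sum(1 for a, b in zip(raw, adjusted) if a != b)
sd_raw = seasonal_differences(raw)
sd_adj = seasonal_differences(adjusted)
changed_diffs = sum(1 for a, b in zip(sd_raw, sd_adj) if a != b)

print(changed_levels, changed_diffs)  # 120 12
```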

[Figure: monthly seasonal differences (“sd”) and smoothed trend, HadCRUT3, Nashville, TN]

Here, in the HadCRUT3 series, the mean seasonal difference is more negative, -0.014863 versus -0.012647. The GISTemp adjustments increase the average seasonal difference by 0.002216, making it less negative, but not enough so that the result becomes positive. In both cases we still come to the conclusion that “on the average” monthly seasonal differences in temperatures in Nashville have been negative over the last ~130 years.

An Important Caveat

So have we actually shown that, at least for Nashville, TN, there has been no net warming over the past ~130 years? No, not necessarily. The average monthly seasonal difference has indeed been negative over the past 130 years. But it may have been becoming “less negative.” Since I have more confidence, at this point, in the integrity of the HadCRUT3 data, than the GISTemp data, I’ll discuss this solely in the context of the HadCRUT3 data. In both the “original data” and in the blue “trend” shown in the above figure, there is a slight upward trend over the past ~130 years:

[Figure: linear fit through the smoothed seasonal differences, HadCRUT3, Nashville, TN]

Here, I’m only showing the fit relative to the smoothed (trend) data. (It is, however, exactly the same as the fit to the original, or unsmoothed, data.) Whereas the average seasonal difference for the HadCRUT3 data here was -0.014863, from the fit through the data it was only -0.007714 at the end of the series (October 2009). Still cooling, but less so, and in that sense one could argue that there has been some “warming.” And overall – i.e. if a similar kind of analysis is applied to all of the stations in the HadCRUT3 data set (or “subset”) – I will not be surprised if there is some evidence for warming. But that has never really been the issue. The issue has always been (a) how much warming, and (b) where has it come from?
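
The claim that the straight-line fit is the same for smoothed and unsmoothed data can be checked numerically. Below is a minimal sketch, assuming the standard Hodrick-Prescott formulation, in which the trend solves (I + lambda * D'D) trend = y with D the second-difference operator. The names `hp_trend` and `ols_line` are illustrative, the solver is a plain pure-Python one rather than whatever package produced the chart, and the input is synthetic random data. The two fits coincide because the HP penalty is built from second differences, which are zero for any straight line. For the same reason the smoothing leaves the mean unchanged.

```python
import random


def hp_trend(y, lam=14400.0):
    """Hodrick-Prescott trend: solve (I + lam * D'D) trend = y."""
    n = len(y)
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    coeffs = (1.0, -2.0, 1.0)  # one row of the second-difference operator D
    for r in range(n - 2):
        for p in range(3):
            for q in range(3):
                a[r + p][r + q] += lam * coeffs[p] * coeffs[q]
    b = list(y)
    # Gaussian elimination without pivoting (the matrix is symmetric
    # positive definite, so no row swaps are needed).
    for k in range(n):
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            if f != 0.0:
                for j in range(k, n):
                    a[i][j] -= f * a[k][j]
                b[i] -= f * b[k]
    # Back substitution.
    trend = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(a[i][j] * trend[j] for j in range(i + 1, n))
        trend[i] = s / a[i][i]
    return trend


def ols_line(y):
    """Least-squares (intercept, slope) of y against t = 0, 1, ..., n-1."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


# Synthetic "seasonal differences" (an assumption, not station data):
# a slight upward drift buried in noise.
rng = random.Random(0)
n = 96
diffs = [-0.02 + 0.0002 * t + rng.gauss(0.0, 1.0) for t in range(n)]

trend = hp_trend(diffs)
raw_fit = ols_line(diffs)
smooth_fit = ols_line(trend)

mean_diff = sum(diffs) / n
end_value = raw_fit[0] + raw_fit[1] * (n - 1)  # fitted value at the last month

print(raw_fit, smooth_fit)   # the two pairs agree up to rounding
print(mean_diff, end_value)  # overall mean versus end-of-series fitted value
```

The last line mirrors the comparison made above between the average seasonal difference and the fitted value at the end of the series.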

I suggest that the above chart showing the fit through the smooth helps define the challenges we face in these issues. First, the light gray line depicts the range of natural climate variability on decadal time scales. This much – and it is very much of the data – is completely natural, and cannot be attributed to any kind of anthropogenic influence, whether UHI, land use/land cover changes, or, heaven forbid, greenhouse gases. If there is any anthropogenic impact here, it is in the blue line, what is in effect a trend in the trend. But even that is far from certain, for before we can conclude that, we have to rule out natural climate variability on centennial time scales. And we simply cannot do that with the instrumental temperature record, because it isn’t long enough. I hate to admit that, because it means either that we accept the depth of our ignorance here, or we look for answers in proxy data. And we’ve seen the mess that has been made of things in trying to rely on proxy data. I think we have to accept the depth of our ignorance, for now, and admit that we do not really have a clue about what might have caused the kind of upward drift we see in the blue trend line in the preceding figure. Of course, that means putting a hold on any radical socioeconomic transformations based on the notion that we know what in truth we do not know.


204 thoughts on “Would You Like Your Temperature Data Homogenized, or Pasteurized?”

  1. Why doesn’t somebody get the raw unadjusted data for the world and plot just the rural data to see if there is warming or not? Urban area data would naturally seem to me to rise over time as those areas grow larger (and hence get more cars, electrical appliances, roads, buildings, people, etc).

    In my view, if the raw unadjusted rural data shows no warming, then CO2 isn’t working.

    [REPLY - Even the rural stations are horribly sited. When I last totted it up, the CRN site rating was even worse for rural stations than urban (rural/urban as defined by USHCN1). Even so, the average urban station warmed 0.5C/century more than the average rural station. 9% of USHCN1 stations are classified as urban, 17% as suburban, and the rest, rural. ~ Evan]

  2. There seems to be a discrepancy in the BOM records between the raw data and the anomaly graphs in their Australian high-quality climate site data. A blogger on Andrew Bolt’s site noticed that when the mean temp for Cape Otway Lighthouse station was calculated from the raw data it was not reflected in the anomaly map.

    I checked Yamba Pilot Station and found a similar discrepancy straight away.
    1915 had a max av temp of 23.6C and a min av temp of 15.9C.

    2008 had a max av temp of 23.6C and a min av temp of 15.5C.

    Clearly, 1915 has a slightly higher mean av temp than 2008.
    Yet the anomaly graph shows 2008 higher than 1915 by 0.2C. Eh! It should be the other way around.
    These discrepancies (which also show up in Cape Otway) give a false impression that the recent warming is greater than it really is. There must be many examples of this (NZ, Darwin, Arctic stations, etc).

    Anomaly data at:
    http://reg.bom.gov.au/cgi-bin/climate/hqsites/site_data.cgi?variable=maxT&area=aus&station=058012&period=annual&dtype=anom&ave_yr=10
    Raw data (max temps) at:
    http://www.bom.gov.au/jsp/ncc/cdio/weatherData/av?p_nccObsCode=36&p_display_type=dataFile&p_startYear=&p_stn_num=058012
    and min temps at:
    http://www.bom.gov.au/jsp/ncc/cdio/weatherData/av?p_nccObsCode=38&p_display_type=dataFile&p_startYear=&p_stn_num=058012
    Seems like the service just went down due to maintenance problems. Should be back on line – hopefully all the same.
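
    The arithmetic behind the Yamba comparison can be written out as a short sketch. It assumes the “mean av temp” is the midpoint of the average max and average min, which is not stated above; the function name is made up.

```python
def mean_temp(max_avg, min_avg):
    """Midpoint of the average maximum and average minimum (an assumption)."""
    return (max_avg + min_avg) / 2


# Figures quoted above for Yamba Pilot Station.
mean_1915 = mean_temp(23.6, 15.9)
mean_2008 = mean_temp(23.6, 15.5)

# 1915 comes out about 0.2C higher than 2008 on this definition.
print(round(mean_1915, 2), round(mean_2008, 2), round(mean_1915 - mean_2008, 2))
```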

  3. I’ll issue a gripe here. Annualizing the temperatures. Why? There are four seasons every year. If you want to make the data less choppy, fine, but at least keep it in its natural rhythm. Do the Grand Poobahs have a reason for imposing the calendar’s will on the data?

  4. Headline: “Nashville avoids global warming by using unadulterated data.” I wonder how many other locations could avoid global warming by using unadulterated data. Why, indeed, maybe we could start a national trend. Maybe we could introduce a bill in Congress to stop global warming by using unadulterated data. No, that would not work: where is the tax in that?
    Great Work and thank you for your outstanding efforts.

  5. Basil, I felt a very slight breeze above me as I read this. Definitely above my level of understanding. Hopefully we have some other readers who understand your data torture better than I do. I do however relate to your closing remarks about the level of our collective ignorance regarding attribution.

    Thanks . . . I think.

  6. Looking at the differences between the GISTemp and HadCRUT3 data (first graph), you can clearly see similar ‘stepwise’ decreases and increases as we have seen at Darwin. There is also the clear ‘reversal’ of the changes made in the early 60’s. That is significant, I think, since isn’t that when the tree-ring proxies were cut off, and thus these are the data ‘spliced’ on to the end of that data?

    It would appear that ‘hiding the decline’ is not enough, they had to provide some artificial ‘forcing’ upward of the remaining data. I have to say that I now reject any claims of any Global Warming until all the data have been independently verified. As discussed well here, another method of seeking trends is probably in order too.

    The trouble is that even sceptics admit ‘there has been warming’ but we are not certain if it is because of CO2. Now we are not even sure if there HAS been any warming at all! This could be the biggest scientific fiasco of modern times.

  7. You guys are on the right track, but what you have not done is go into the scientific literature and look at exactly how the adjustments are made… you will, I guarantee, find this whole temperature monitoring and reporting bizarre.

  8. This is a creative idea…but…I fear you’re going to get a lot of misleading answers if there are large jumps in the dataset that aren’t the result of changes in the recording station and that aren’t the result of climate variability. If you gradually cool for 75 years and then rapidly warm for 30, you’re going to do more seasonal cooling than seasonal warming…that’s the argument the AGW crowd will use at any rate.

    I think the better approach is to make real adjustments to the data instead of statistically invented ones. Don’t look at station data as though it’s numbers on a map…look at it as though it’s a thermometer in a park. Take the raw data…adjust it for the urban heat island effect (which can be fairly accurately parameterized with a spatial correlation between population density and spatial climate temperature anomaly), adjust to account for real moves in the station instrument, adjust to account for anything you can actually document. Don’t smear the data like GISTEMP does…don’t apply arbitrary adjustment to the homogenized data the way HadCRUT3 does. Adjust how you think is appropriate and thoroughly document why you’re adjusting.
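
    The cool-for-75-then-warm-for-30 scenario can be made concrete with a small sketch. The rates are invented for illustration and the series is annual rather than monthly. It shows that counting cooling years and averaging the year-to-year differences are two separate measures: here most years cool, yet the mean difference is positive.

```python
# Hypothetical scenario: 75 years of slow cooling, then 30 years of faster
# warming. Both rates (degrees per year) are invented for illustration.
cool_years, warm_years = 75, 30
cool_rate, warm_rate = -0.01, 0.05

# Year-to-year differences for the whole period.
diffs = [cool_rate] * cool_years + [warm_rate] * warm_years

n_cooling = sum(1 for d in diffs if d < 0)
n_warming = sum(1 for d in diffs if d > 0)
net_change = sum(diffs)               # total change over the period
mean_diff = net_change / len(diffs)   # average year-to-year difference

print(n_cooling, n_warming)   # 75 30
print(net_change, mean_diff)  # both positive despite more cooling years
```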

  9. There is something wrong with our local reported temps. Is there a reasonable explanation of why the WeatherChannel and Weather Underground are reporting temps well above actual temps in my town? Our GISS station has been shut down, though it was well located and well manned. Today I have noticed that the temps reported at our airport are well above the real temps. Not just a little but over ten degrees difference. We supposedly reached over 42F but we never saw above 26F. I wonder because if our GISS is shut down, where are our temps being reported and who is recording them? Do I sound paranoid? Yes, because the politicians are relying on a warming earth.

    [REPLY - No surprise here. Airports typically show a much faster warming over the last 30 years than non-APs, for a variety of reasons: Increased air traffic, urban encroachment, severe HO-83 equipment issues, etc. Even the best-sited systems in rural APs are seriously affected. ~ Evan]

  10. There’s a few things that emerge for me out of these recent postings about temperature datasets. These are all great posts by the way.

    The 1880’s are coming out every bit as warm as the 1940’s and the current period, suggesting a 60 year cycle. This 60 year cycle doesn’t really correlate with sunspot activity, or the lunar cycle. ENSO and the PDO seem to be an effect, not a cause.

    The fulcrum point for the warming seems to be happening around the 1980’s, in that years prior to that are adjusted downwards and years after slightly upwards, and I wonder is this because of satellite data coming onstream and keeping them honest subsequently.

    We’re seeing a huge amount of interest in this, with people going off and looking into this data for themselves, for the first time in a lot of cases. But I worry that this is going to dissipate our efforts.

    I have just downloaded the Daily data from GHCN (Please tell me that this data is truly raw, otherwise I will have downloaded 1.7GB of nothing) with a view to better learning R, and seeing what it shows for my country, Ireland.

    Different datasets and methodologies are being used, meaning the warm-mongers can say “Ah yes, but we adjust for this in such and such a way using internationally recognised practices, and anyway, your methodologies haven’t been peer-reviewed”. And we’re going to keep bumping up against this every time a new dataset is issued.

    I’ve seen lots of complaints about the various ways the raw data is adjusted, but I think if we are going to be taken seriously we need to come up with our own one way of adjusting the data, and then get the wider community to help out with the actual implementation.

    Willis Eschenbach has recently acquired the surfacetemps.org domain name. I’ve offered to help, as I’m sure many others have too, including the heavyweights. I would suggest that we put our shoulders behind this effort and put up a genuine alternative to the current crop of temperature analyses.

  11. A relevant hit from the late great Benny Hill

    She said she’d like to bathe in milk
    He said alright sweetheart
    And when he finished work one night
    He loaded up the cart
    He said you wanted pasteurised
    Coz pasteurised is best
    She says Ernie I’ll be happy
    If it comes up to me chest
    And that tickled old Ernie (Ernie)
    And he drove the fastest milkcart in the west

  12. “I think we have to accept the depth of our ignorance, for now, and admit that we do not really have a clue… ”

    That is a proper attitude toward what is without a doubt one of the most complex natural systems one could choose to study. Had that attitude been maintained by all of the scientists interested in GCC, or at least those instrumental in setting the terms of the debate, it would have maintained the reputation of climatology as a science properly humble in the face of awesome forces and bewilderingly complex interactions.

    Now, because of plainly revealed human error–to apply a generous characterization to plainly revealed behavior–this most vital of natural systems seems likely to revert to mean: a scientific backwater, ‘though as ever an immensely interesting one, whilst a reliable and universally accepted record is accumulated, commensurate with the immense periods and geography of global climate and with the importance of the immense human consequences of modifying its dynamics.

  13. Huffington Post: Anatomy Of The Tea Party Movement

    http://www.huffingtonpost

    The modern day Tea party movement was started by the Ron Paul movement.

    Rachel Maddow: Tea Parties and Ron Paul

    http://www.youtube.com/wa

    Can people post comments over there and set them straight? I’m banned over there.
    *****************************************************************
    Want to hear a true story?
    On December 12th, 2009 michaelwise says:
    Want to hear a true story that happened to me over the past two days?

    Yesterday I posted this on my WUWT blog.

    http://wattsupwiththat.co

    “Michael (16:44:43) :

    I scan the comments to the articles on a daily basis at The Huffington Post et al, to measure what I call the “Mass Brainwash Index”, that publication being one of the best places to get accurate results from the populace. Six months ago my index was at a reading of 9.5, 10 being the most brainwashed and 0 being the least. Today my Index has fallen to a reading of 7.5. Something dramatic is happening to the psyche of the American population.”

    Today I post this concerning Huffington post;

    “Michael (01:15:38) :

    Top story on Huffington Post’s Green Tab has this as the first comment about The Copenhagen Summit. Is somebody handing out brains over there?

    “Mogamboguru I’m a Fan of Mogamboguru I’m a fan of this user 328 fans permalink

    ” An Incredibly Expensive F o l l y ”

    “Why Failure in Copenhagen Would Be a Success”

    CO2 Emissions Cuts Will Cost More than Climate Change Itself

    Based on conventional estimates, this ambitious program would avert much of the damage of global warming, expected to be worth somewhere around €2 trillion a year by 2100. However, Tol concludes that a tax at this level could reduce world GDP by a staggering 12.9% in 2100 — the equivalent of €27 trillion a year.
    .

    It is, in fact, an optimistic cost estimate. It assumes that politicians everywhere in the world would, at all times, make the most effective, efficient choices possible to reduce carbon emissions, wasting no money whatsoever. Dump that far-fetched assumption, and the cost could easily be 10 or 100 times higher.

    To put this in the starkest of terms: Drastic carbon cuts would hurt much more than climate change itself. Cutting carbon is extremely expensive, especially in the short-term, because the alternatives to fossil fuels are few and costly. Without feasible alternatives to carbon use, we will just hurt growth.

    Secondly, we can also see that the approach is politically flawed, because of the simple fact that different countries have very different goals and all nations will find it hard to cut emissions at great cost domestically, to help the rest of the world a little in a hundred years.”

    Me;
    Yes, Virginia, there is a Santa Claus.
    *****************************************************************

    Later today I put up this topic on Daily Paul

    http://www.dailypaul.com/

    For those at the WUWT blog, the quotes are in the one with the Kid and the one with John Coleman.

    True Story.

    I don’t know. Maybe I can create a bridge between the WUWT blog and the Daily Paul Blog, a Political blog and a Science blog, to bookmark these threads, and keep the conversation going between these two blogs. The conversation that would come out would be most fascinating.

    I think I just invented “Cross Blog Debate”. This is my idea. Please give me credit for it. Thanks.

    ******************************************************************
    Now if we could just get comments to appear on both blogs talking about the same subject at the same time, linking the two in the blogosphere?

    Wouldn’t that be a Hoot.
    *********************************************************************
    “Want to Join the Debate” is another invention of mine, a web site managing all the debates all over the world. If you don’t mind me laying claim to it right here right now.

    It is a website that brings all the Blogosphere together.
    It connects blogs talking about similar subjects together. Tiers of competence are created, voted on by their peers. May the best blog debate WIN!
    **********************************************************************
    We can track the comment to the millisecond to see who got there first.

  14. Anyone that can make sense of this let me know:

    I was going through the “raw” data from GISS for State College, Pa. when I came across a figure where I can’t see how they came up with it. First, background for those that don’t know: when you get the data from GISS there are two ways to get it, one by a Postscript file, the other by looking at their txt file and pasting it over.

    From there you see the data strung out with headers for each month followed by the four seasonal means starting with Dec of the previous year and going forward. So you take the three months and add them together then divide by 3 and you got the seasonal mean. From the four means they add them together and divide by 4 and get the annual mean. Now I decided to check to see how they computed the Seasonal mean when they had no data for a month and this is where things went into the twilight zone.

    So there I am looking at 1973 and 1974, both of which had no data for the month of November. 1973 had 17.9C for Sept and 12.9C for Oct and a S-O-N of 12.3C. So then I go and look at 1974 and for Sept they had 15.9C and Oct 9.5C and a S-O-N of 9.6C. Now can anyone explain how you come up with a seasonal average lower than all inputs and another only 0.1C above the lowest input?
    To my poor tired brain it should be an average of 15.4C for ’73 and 12.7C for ’74.

    Then in 1975 they don’t have data for Oct and I used the Sept and Nov data provided and what I calculated matches the S-O-N listed of 11.9C . However when I go to 1984 they don’t have data for Sept and when I calculate the S-O-N it doesn’t match the data file. My Calc: 8.5C what they got is 11.6C.

    To me it looks like the USHCN v2 has a bug in it: when the data is missing from the center of the Seasonal mean calculation it works fine, but when it is missing on either side (i.e. the S or the N in S-O-N) it gives a funky reading.
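
    The averages I expected come from simply averaging whichever months have data. Here is that arithmetic as a short sketch; it is only my assumption about how a seasonal mean with a missing month might be computed, not the GISS procedure, and the function name is made up.

```python
def seasonal_mean(months):
    """Average the months that have data; None marks a missing month."""
    present = [m for m in months if m is not None]
    if not present:
        return None
    return sum(present) / len(present)


# Sept, Oct, Nov values quoted above (November missing in both years).
son_1973 = seasonal_mean([17.9, 12.9, None])
son_1974 = seasonal_mean([15.9, 9.5, None])

print(round(son_1973, 1), round(son_1974, 1))  # compare with 12.3 and 9.6 in the file
```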

  15. Have you tried attributing Nashville’s earlier, higher temperatures to the friction caused by Reconstruction? A link between global warming and institutionalized racism and sectional North/South bigotry would be a win/win in guiltophile circles.

    On a serious note, Nashville has grown enormously and I’d expect more UHI on at least some level. Has this already been factored in?

    I tried pulling up data for Kentucky but it seems to be a bit scrambled. Mt. Vernon kept pulling up old records for Munfordville, as if some links had been messed up.

    This whole thing has me quite worried because if the IPCC claims are true, in a hundred years Kentucky will be as warm as Tennessee, where human life is impossible due to the incredibly high temperatures.

  16. “So what would appear to be a general cooling trend over the past ~130 years at this location when using the unadjusted HadCRUT3 data, becomes a warming trend when the homogeneity adjustment is supplied.”

    This is not correct.

    The HadCRUT3 data you are using are not raw data. They are homogenized data.

    The GISS homogenization was not applied to HadCRUT3 to arrive at the GISS homogenized series.

    If you want to ascertain the effect of the GISS homogenization, you need to compare to raw data, or back it out from the GISS code.

  17. What really bothers me is that Tiger Woods gets crucified for liking sex and women, while the UN and governments of the world get off scot free perpetuating this massive AGW swindle that will cost taxpayers trillions of dollars, even when blatant examples of their fraud are being uncovered daily. So much for journalism. Where is the investigative reporting into this serious matter that will affect millions of lives? Where is the peer-review in the scientific process of climate change? No, instead Tiger Woods gets front page coverage in the News, Sports AND Entertainment sections of all the mainstream media channels. Something is very wrong; a revolution may be required to get rid of these climate change hucksters.

  18. I wish that I’d excised a couple of instances of “immensely” and one “plainly revealed.”

    I want to add one thought about “universally accepted,” since I know that is open to challenge, and a question. Imagine the state of simple astronomical science of the (Copernican) solar system if there had been, say, two different databases showing the distances and times of revolution of the planets. The astronomical “community” would have called a halt to any further prognostication until the predictive ability of one theory or another, or improvement in measurement techniques, purged the faulty data set(s) from the debate, or rectified them. Why did that not happen in this community?

  19. This post, Willis Eschenbach’s “Smoking Gun”, and Anthony’s on raw/partially cooked/stewed data got me thinking.

    Just as with the old “SETI at Home” project, couldn’t we all engage in some “distributed computing” using raw data, if any still exists, and then having it compiled in a central location?

    I’m thinking along the lines of Anthony’s surfacestations.org project.

    How about if each of us who was a surfacestations surveyor would take the “raw” data from the stations we surveyed, crank it through formulas like Willis and Basil did in their posts, and then upload the results to supplement our surfacestations.org surveys?

    Now, I’m not educated in the right field to figure out how to devise the proper analytical tools (I’m a retired lawyer, not a scientist or statistician), but I bet someone could come up with an Excel script (or something) that each one of us could download that would create the right kind of spreadsheet for us to plug the raw data into. We then could let our computers do the work, draw the graphs, and then upload it for all to see.

    It seems that there’s a lot of interest and energy out here in WUWT reader land. Analyzing all the HCN stations would be too big a project for folks like Willis Eschenbach, Basil Copeland, Anthony, et al., to take on all by themselves. But if analysis was handed off to all the readers of WUWT, surfacestations.org surveyors, and the like, there might be quite the artillery battery of smoking guns.

    Am I making sense here? Is this in any way feasible?

    Regards,

    Mike

  20. Basil and others

    When examining this sort of data please bear in mind that in many parts of the world we can go further back in time and see ‘modern’ temperatures in much better context, where this is the latest in a series of peaks and troughs.

    http://climatereason.com/LittleIceAgeThermometers/

    Clearly there is a considerable UHI effect in many stations which is not properly accounted for as the locations have become increasingly urbanised over the years. Some of the earliest temp recording stations have seen the area around them grow 100 fold.

    Tonyb

  21. OT, but people, especially young ones, seem to need an agenda. There was socialism (which failed, even the ol ruskis admit that now…LOL), then there was the hippie movement… then “globalization”, which was “terrible” but now apparently is a “good thing” LOL. No wonder climate change has become so popular. Fortunately everybody is gonna become real bored because it ain’t gonna change over their lifetimes LOL. They will soon find some other agenda, don’t you worry about that…

  22. Is this a joke? You can show a warming trend from data that goes from 15.9C to ~15.425C? I am sorry, but something is wrong with the calculations. If you’re saying maybe there is a warming trend at the very end, that is one thing, but to say there has been net warming, that is pretty much unbelievable. Either the starting temperature is higher or lower than the ending temperature, there is no other choice. You did not even show what the actual temperature curve looked like, just a trend line.

  23. Thanks for this.

    When I first read this story (as well as the Darwin Zero entry), I thought it was a tragedy for science. Surely it is a sad day when the amateurs (blog readers, “non-climate scientists”, “non-academic professionals”) are unpicking the errors of the professionals – at a terrifying rate. Perhaps, in a sense, it is. The professionals long ago gave up the pretence of caring about the truth. Tax-payer funds are now going exclusively to large institutions who fudge data to keep the tax-payer funds rolling in.

    But in the end, I think this situation is better for science and for humanity. Science doesn’t care who finds the truth. It could be a worker in a patent office in Bern. It could be a blogger in the tropical town of Darwin. Websites like this one, and a dozen others, have allowed us all to get involved with the science – to go out and find our nearest temperature station, and check the records (as just one example). Websites like this have forced the great institutes to throw open their doors and make their data (or some of it) public. It is us ‘amateurs’ who are now checking their work, finding the mistakes, and correcting them.

    If this situation continues beyond the (fortress) walls of climate science and into other areas, one wonders whether we’re on the cusp of another scientific revolution. How many seemingly-intractable problems might be solved by an army of blog-readers who have an unbiased passion for the subject?

  24. Jerome, yes there has been warming, but then all too often graphs and data are started from the end of the Little Ice Age (LIA), so of course there’s been warming! The golden question to the statement, “The world has warmed” is “From when?”. You have to pick a time period. Warmists always pick from the end of the LIA, and hardened sceptics pick from 1998. Neither are good at explaining what, if anything, is going on. We should at least go back 1,000 years. When you go back 11,000 years you get a true picture. http://www.uigi.com/Temperature_swings_11000_yrs.jpg

  25. Headline “Nashville avoids global warming by using unadulterated data”

    Stately Gore Manor is in Nashville, so I’m sure Uncle Al will take credit for it as soon as one of his briefers tells him…

  26. What’s the up-and-coming proxy for climate change, acid rain etc? Eroded gravestones. http://www.gizmag.com/gravestones-provide-climate-history-clues/13569/

    Seems rather pointless without knowing the composition(s) of the chemical compound(s) and how the different compositions of limestone are affected by them.

    Not all limestone is exactly alike and the environment the stone is in affects how it erodes. Looks like yet another unscientific experiment created to prop up preconceived notions. At best it’ll be useful to document how much each stone has eroded between point A (the death date on the stone) and point B, the time of measurements. Another factor is that in many cases it may be several years after a burial that a stone is emplaced, with a temporary placeholder marker from burial until then. In some cases gravestones have been replaced years or decades after burial due to damage, really heavy erosion (due to the original being cut from an inferior grade of marble) or simply because some relative/descendant just wanted a different style of headstone.

    Without at least a fully documented history of every stone measured, the results will be mostly useless, especially for the stated purpose of the study. The best people to have working on this would be geology students with an interest in metamorphic types of rock with an appetite for digging through old paper records and files.

    Laura Ingraham hosted “The O’Reilly Factor” on Dec. 11, 2009, with a segment featuring Tyson Slocum. Unfortunately she arrived unarmed to the duel and allowed Slocum to spout the “peer reviewed” line unchallenged many times.

    Somebody needs to put together a “war chest” from the CRU documents for the media people willing to challenge the AGW fraud so they can quickly and easily counter anything people like Tyson Slocum say.

    http://paxalles.blogs.com/paxalles/2009/12/ingraham-factors-climategate-tiger-woods-obama-peace-prize-narco-state-and-more.html

  27. I would prefer a global series made from several hundred high-quality stations with long records that have stayed at the same rural site. Take the Arctic, with several dozen stations: no net warming between the 1940s and the 2000s:

    My favorite station is the Irish Armagh Observatory: the 2000s are barely above the 1940s:

    http://www.junkscience.com/MSU_Temps/CETvsArmagh_long.html

    CET is a very long record, but guess who adjusted its data for UHI – Cheers, Phil.

  28. I’m curious whether the GISTemp data has both raw and homogenized data?
    Because if that is the case, I have seen several stepladder adjustments from raw to adjusted in PA alone …
    adjustments that can’t be justified by station moves or UHI …

  29. I think this is interesting. I’ve made the point before that another way of looking at the data is to use a CUSUM chart. This is the cumulative sum of differences from the mean. It is a standard quality control plot in analytical and engineering laboratories and is very sensitive to small differences in trends and steps in data. I think Bill’s approach is probably very similar.
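
    A minimal sketch of such a CUSUM chart in Python, assuming nothing beyond the definition given above (a running sum of differences from the series mean). The temperatures are made up for illustration, and the function name `cusum` is just a label chosen here.

```python
from itertools import accumulate
from statistics import mean


def cusum(values):
    """Cumulative sum of deviations from the series mean."""
    m = mean(values)
    return list(accumulate(v - m for v in values))


# Hypothetical annual means: a 0.5-degree step down halfway through.
temps = [10.0] * 6 + [9.5] * 6
chart = cusum(temps)

# The chart climbs while values sit above the mean and falls once they
# drop below it, so the step shows up as the turning point.
peak = max(range(len(chart)), key=lambda i: chart[i])
print(peak)  # index of the last value before the step
```

    A flat series gives a chart that hovers near zero, a step gives a sharp corner, and a steady trend gives a smooth arch, which is why the plot is sensitive to small shifts.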

  30. Nigel S – I worry that many readers are American and will not know Benny Hill and will also not get the best joke in the song simply because US humour is so different than the English… so I’ll explain.

    Ernie is a milkman (who delivers milk to your door as is the English tradition)

    “He said do you want pasteurised
    Coz pasteurised is best
    She says Ernie I’ll be happy
    If it comes up to me chest” (the reference being to “do you want past your eyes, coz past your eyes is best”… Benny Hill innuendo, lol).

    well it makes me laugh anyway. sadly there is no video on youtube… here is a tribute someone has cobbled together:

    you’re all still nuts btw ;)

  31. An interesting approach to dealing with step changes. But I think your final section, labelled ‘An Important Caveat’, is misleading. To see if there has been warming recently, you should choose a period (say the last thirty years) and simply look at the average of the seasonal differences over that period. Your approach of fitting a line to the graph of seasonal differences is in fact measuring any acceleration in warming, a very different thing. There again, I may have misunderstood.
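
    To make the distinction concrete, here is a rough Python sketch. It assumes “seasonal difference” means each month minus the same month a year earlier; both series are invented, and the helper names are chosen here for illustration only.

```python
from statistics import mean


def seasonal_diff(series, period=12):
    """Each value minus the value one period (12 months) earlier."""
    return [series[i] - series[i - period] for i in range(period, len(series))]


def ols_slope(ys):
    """Least-squares slope of ys against 0, 1, 2, ..."""
    n = len(ys)
    xbar = (n - 1) / 2
    ybar = mean(ys)
    num = sum((i - xbar) * (y - ybar) for i, y in enumerate(ys))
    den = sum((i - xbar) ** 2 for i in range(n))
    return num / den


# Hypothetical fixed seasonal cycle, repeated every 12 months.
cycle = [0, 1, 3, 6, 9, 11, 12, 11, 9, 6, 3, 1]

# Steady warming of 0.01 per month: the differences are constant.
steady = [cycle[t % 12] + 0.01 * t for t in range(120)]
d_steady = seasonal_diff(steady)
print(mean(d_steady), ols_slope(d_steady))  # about 0.12 per year, slope about 0

# Accelerating warming: the differences themselves trend upward.
accelerating = [cycle[t % 12] + 0.0001 * t * t for t in range(120)]
d_accel = seasonal_diff(accelerating)
print(mean(d_accel), ols_slope(d_accel))  # slope about 0.0024
```

    In the steady case the mean of the differences recovers the warming rate while the fitted line through them is flat; only the accelerating series produces a sloped line.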

  32. Just as with the WMDs of Iraq, the politicians wanted evidence to justify proposed action. AGW started with Margaret Thatcher in the early 80’s and will end with Mann-made global warming and Prof. Jones using UHI effects, homogenization, adjustments, moving thermometers and sleight-of-hand. Let’s hope the various inquiries going on will get to the bottom of the real world average temperature trends over the past 30 odd years. – The TRUTH always wins out in the end.

    “There are three kinds of lies: lies, damned lies, and statistics.” – Mark Twain

  33. MattB (02:11:30) :
    Nigel S – I worry that many readers are American and will not know Benny Hill and will also not get the best joke in the song simply because US humour is so different than the English…

    Just because we’re rough, untutored frontiersmen doesn’t mean we don’t know the man who single-handedly turned “Yakkety-Sax” from a minor hit on pop radio into an iconic chase scene theme.

    And “US humour” isn’t so different — just omit the superfluous “u”…

  34. My thanks to the various dedicated researchers/posters and to the organisers of this site for much personal enlightenment.
    Having assumed until the ‘Climategate’ emails were made public that the measurement and collection of environmental data was subject to scientifically established and internationally-agreed standards, I have only recently become interested (fascinated) by the absolutely chaotic and messy, probably unacceptable methods that pertain to ground-level temperature data collection, seemingly on a world-wide basis, which is open to considerable fantastical interpretations, in my opinion.
    A few days ago, in search of enlightenment, I posed what I thought were 4 quite reasonable and rational questions about this on a Guardian (UK) newspaper-run blog written by George Monbiot and was immediately accused of trolling by a virulent AGW poster! To say that I was dismayed is an understatement, but I have rapidly come to realise that questioning anything about AGW provokes a huge reaction from the AGW faithful. In a recent UK employment court case the Judge ruled that AGW was held by the individual (who brought the action against his employers for firing him for promoting his AGW beliefs to the detriment of the company’s performance) as a religious set of beliefs.
    While I have little actual scientific background and no science quals whatsoever, like a significant section of the population, I am literate, numerate and have a reasonably accurate nose for bullshit. I spent a number of years serving as elected chair of a community committee in New Zealand, set up as part of the permit process under the first NZ Resource Management Act to have critical oversight of the establishment and operation of a soon-to-be-built ‘sanitary landfill’ and to liaise, on behalf of the community, with the landfill operators, the engineering and technical peer review group, various inspectors from local and national bodies, etc. During this decade I garnered considerable environmental information, plus a good insight into the methodology of on-site environmental measurement and data collection. I was also appalled by the absolute BS promoted by some of the more hysterical members of the local green movement.
    Sorry to be a tad long-winded, but it is refreshing to read information put together honestly and using ‘proper’ scientific methodology.
    Thanks again,
    Alexander

  35. RE: “Laura Ingraham hosted “The O’Reilly Factor Dec. 11, 2009 with a segment with Tyson Slocum. Unfortunately she arrived unarmed to the duel and allowed Slocum to spout the “peer reviewed” line unchallenged many times.”

    So true, too many conservative hosts are still uneducated on Climategate. Laura let Slocum argue that Climategate did not matter because AGW scientists are peer reviewed, even though Climategate shows clearly the hijacking of the peer review process; that the emails were stolen, even though they were most likely leaked; and that Climategate had nothing to do with the underlying science, even though there are now many examples of fraud. NO challenge to this on FOX … groan. There needs to be a simple educational seminar on what Climategate means, and these hosts need to attend it. I found myself yelling at the TV … at least my kids got an education.

  36. I very much disagree that “less cooling” means warming.
    If the trend is still negative there’s no way to call it warming.

    It’s the same as “less than expected job-losses” not meaning a recovery – it’s still getting worse.

    cheers, acob

  37. Thank you, Basil, for an interesting post. I agree totally with the main point that you are making.
    I have started to maintain a small, well-sited weather station, with a thermometer, at the school where I work, and it is clear that the temperature recorded at the school is often different from temperature data recorded locally, let alone any recorded a larger distance away. Sometimes it’s higher by a degree or so, sometimes it’s lower. I travel 25 miles to work on a fairly flat route and during the journey the temperature recorded by the car changes significantly. The other day it decreased by 3 degrees.
    The only thing I can say with confidence is when the temperature has changed in time at the place where the weather station is situated. That’s it!
    However, since “prestigious institutions” such as HADCRU and GISS have attempted to provide a full global temperature history they have, in their infinite wisdom, created various ingenious (?) methodologies to combine a wide variety of individual temperature records.
    One thing is for certain, every single construct that they apply to the raw data introduces the possibilities of bias and error.
    If they must chase their unrealistic goal of providing global temperature trends for over 100 years then they should at least be honest about the limitations and keep the data used as pure as possible!

  38. I just checked the data for London Gatwick Airport.

    ‘Good’ data only available from 1961 to midway through 1998 !!

    So how are they getting their global temperature data?
    Which stations are they using?
    Are they cherry picking again?

    Gatwick started life under the Air Ministry in 1934, does CRU expect us to believe that they were not competent enough to make temperature measurements? Why stop at 1998, which just happened to be one of our warmest years.

  39. Warning – never assume that you have a set of RAW data to compare to. The cases I have studied seldom if ever start with raw data. For Australia, the GISS data that undergoes GISS adjustment appears to start with adjusted Bureau of Meteorology data. The Hadley files that I have looked at are also copies of adjusted BOM data, though I have not looked at many, yet. I doubt that many of us have ever seen raw data, anywhere, unless it is transcribed from observer sheets and known to be untouched.

    If you think you have some raw data, genuinely, I’d dearly love to know its source and whether it is still able to be downloaded.

  40. As each new bit of data pops up and shows that it’s been tampered with, I’m increasingly concerned at the scale of what’s going on.

    I thought it was primarily about clique behaviour and group-think acceptance of an argument which had gained popular traction. I really didn’t think that ANYONE would fiddle with the actual temperature records.

    Making up models to suit their PowerPoint slides and selective use of start points is one thing – but to actually rewrite history is simply stunning. And so stupid – how did they expect to get away with this?

    They do say that if you’re going to tell a lie, tell a big one – but this?!?

    It’s like Dr Shipman and his hundreds of victims or 9/11 – no one would have believed it was real.

  41. Wonderful research and analysis….

    So the averages have been “cooked”… nothing can be taken at face value… we have to go back to basics… look for the daily maximum and minimum temperatures to see the real trends… the averages hide way too much information on a local, regional and global scale…

    Just goes to prove “Who controls the past controls the future”….
    And we now know “Who controls the present controls the past”…

  42. “But even that is far from certain, for before we can conclude that, we have to rule out natural climate variability on centennial time scales. And we simply cannot do that with the instrumental temperature record, because it isn’t long enough.”

    I’d like to see this derivative analysis (“trend of a trend”) for the dozen or two continuous records that do go back to the 1700s or even 1600s (Central England), a century or more longer than the 1881 GISS cut-off date. What equation do I use in Excel to calculate it?

    Fahrenheit’s scale of 1724 made thermometers fairly accurate. He set salted ice water at 0° and healthy body temperature (of a horse) at 100°. When the variation in the boiling point of water with altitude was understood, the scale was altered so that water boiled at 212° (half the degrees in a full circle above ice water at 32°). Thus by 1800, or even 1750, thermometers were fairly good. Even if absolute values were off for a given thermometer, even the original body temperature scale should be quite good at measuring changes.

    From Wikipedia: “An individual’s body temperature typically changes by about 0.5 °C (0.9 °F) between its highest and lowest points each day.”

    My temperature with a drugstore digital thermometer under my tongue came out as 98.1° F instead of the standard value of 98.6° F. Body temperature isn’t good enough for climatology. But if I’m off by 1° F at the high point of my 0-100° F calibration, then the length of my degree is only off by 1%, so I can still measure temperature changes quite accurately.
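
    The arithmetic behind that 1% figure can be sketched in Python. This is only an illustration under one assumption: the lower fixed point is exact and the upper one is misplaced by a known amount. The function and parameter names are invented here.

```python
def reading_change(true_change, upper_point_error, span=100.0):
    """Change shown by a thermometer whose upper fixed point is misplaced.

    The instrument spreads `span` marked degrees over
    (span + upper_point_error) true degrees.
    """
    marked_per_true = span / (span + upper_point_error)
    return true_change * marked_per_true


# Upper calibration point set 1 degree too low on a 0-100 scale.
shown = reading_change(2.0, upper_point_error=-1.0)
relative_error = shown / 2.0 - 1
print(shown, relative_error)  # about 2.02 and about 0.01
```

    The absolute reading near the top of the scale is off by a full degree, but a 2-degree change is misread by only about 0.02 degrees.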

    For those interested, these two plotting sites allow listings in terms of how old records are:

    http://rimfrost.no

    http://www.appinsys.com/GlobalWarming/climate.aspx

    And this site has collected many plots as images: http://climatereason.com

    Most old records are quite linear and show no clear AGW signal even with urban heating not subtracted out. A few (especially De Bilt and Uppsala) do show a recent upswing. And a few (e.g. Paris) show linear trends that alter at some point to a different slope.

    I’ll post my own long-record plots, made merely as appealing illustrations rather than analytical figures but which are in fact accurate aside from a couple of sadly truncated records having had 2-5 years of the very latest data tacked on from GISS after adjusting it to match the last available year in the super-long data set (New York is one and Minneapolis one I’d still like to treat thus):

  43. “Would You Like Your Temperature Data Homogenized, or Pasteurized?”

    Well I think they thought they could homogenize it and sneak it Past-our-eyes….. ;-)

  44. MattB (02:11:30)

    Glad you liked it but things are pretty secure humo(u)rwise over there. Have you ever watched the Simpsons? Possibly the most brilliant thing on TV ever.

  45. I am having trouble understanding why observed temperatures over the last 100 years show no indications of warming…not even the hint of a hockey stick. The Australian Bureau of Meteorology allows you to look at the mean maximum and mean minimum temperatures for any year, and to compare the plots of any 2 years.

    If you look at the records, the temperatures in say 1890 show little or no difference with 2009…in some cases 1890 would be slightly higher. This is true for both city and country.

    Glen Innes (country town) and Sydney Observatory (major city) are two good examples.

    And yet, the temperature chart for the state of New South Wales (both Glen Innes and Sydney are in New South Wales) shows a pronounced “hockey stick” warming since the 1970s.

    How can this be?

  46. There is no way to make heads or tails of the temperature record. And with each passing day, isn’t that the point? They have bent, spindled and mutilated the surface temperature data so they can present any fairy tale they choose. It’s really disgusting that this would then show up as science.

    It’s going to take years to try and sort it all out, if it’s even possible to do so.

    We need first a world version of SurfaceStations.org to even get started.

    And then some agreed upon rules.

  47. I have just downloaded the CRU data and was looking through the UK data. Many of the stations only have data over a limited period. For example the CRU data for Oxford UK contains data from 1900 to 1980, yet the Met Office site for Oxford covers 1853 to 2009. Why is the CRU data so truncated I wonder?

  48. The HadCRUT3 data set is not the raw data. Here is how the Met Office describe the data.

    http://www.metoffice.gov.uk/climatechange/science/monitoring/subsets.html

    “The data that we are providing is the database used to produce the global temperature series. Some of these data are the original underlying observations and some are observations adjusted to account for non climatic influences, for example changes in observations methods”

    What changes have been made?

  49. JustPassing (02:38:55) :

    “I think I’ll send the CRU at East Anglia a nice box of fudge for Xmas.”

    Bingo! If your lead was followed… a couple of thousand boxes of fudge arriving at CRU would certainly make a point.

  50. Instead of just picking one station here and there, why not look at the effect of adjustments overall? This is what Giorgio Gilestro did for the GHCN set that Willis analysed for Darwin alone. He shows the distribution of the effects of the adjustment on trend. It looks quite symmetric. Adjustments are just as likely to cool as to heat.

  51. DeNihilist (00:01:26) :

    Yes, it brings back memories.
    And today’s weather is not at all unlike 1970’s.
    Full circle.

  52. “Would You Like Your Temperature Data Homogenized, or Pasteurized?” ??????

    Just before I clicked here I was reading the emails…I searched and did not find that the following had ever been posted here:

    “One way would be to note that the temperature amplitude (1000 – 1950)
    for each is ~1.5°C. Thus you could conclude that hemispheric/global
    climate varied by over a degree Celcius (although with regional
    differences)

    Another way would be to average the records. The resulting temperature
    amplitude would be smaller because extremes would cancel since
    variability is large and each region’s extremes occur at different
    times.

    Thus, if people simply looked at several records they would get the
    impression that temperature variations were large, ~1.5°C. Imagine
    their surprise when they see that the ensemble averages you publish
    have much smaller amplitude. ”

    http://climate-gate.org/email.php?eid=219&keyword=their%20surprise%20when%20they%20see%20that%20the%20ensemble%20averages%20you%20publish

    ……..
    hmmm… Imagine their surprise!!!!

  53. I have a question:

    Why study the temperature at all? Temperature is a weather phenomenon – to begin with. It flickers and swings too much and thus requires mathematical ‘handling’ to make sense of. The warmists will always set up filters that warm up things because there is no baseline climate measurement paradigm to begin with. Using temperature seems to have its roots in the use of the phrase ‘global warming’.

    Why not study only proxies? Arriving at a new compromise proxy that all agree on shouldn’t be contentious and it will have the added advantage of both camps not knowing what way things will trend (for both past and future) when the analysis is performed. Drastic political action should be taken only if the proxy satisfies predetermined conditions (which must still be subject to revision with improving scientific understanding).

    Mann chopped off Briffa’s post-1960 part because it trended downwards w.r.t. the temperatures. In reality, it is a basic fallacy to do so, given the fact that tree ring reconstructions are the more long-term secular trend between the two. Questioning the longer tree record (with all its inherent problems) against a much shorter thermometer record (which has its own problems no end) doesn’t sound right.

    Throw the temperature blade away and see if the secular proxy still bends only once to form a hockey stick. If it doesn’t – no unprecedented warming.

  54. ScottA (02:14:06) :

    Did I miss the bleg for transforming row to column data?

    Yes, you did! I had to go in to work last night, and in my haste to send this to Anthony, I forgot to add that. So let’s start with that, and I’ll have comments on some of the other posts later.

    Have any ideas on how to do this? Years ago, I programmed in REXX, a great language for string processing, and it would be a snap to write up a script to do this in REXX. I suppose it wouldn’t be too hard to do in any language, but I haven’t programmed in years. I guess I was wondering if anyone had a suggestion for the easiest way to code something up to do this. All of the HadCRUT3 data is in this form, as is the GISTemp data. But I need it in straight column format for reading by my stat routines. What I did for this post was manually cut and paste it into Excel using the “transpose” option.

    I suspect a true linux/unix geek could code up a bash script to do it as well, and I have access to linux machines to run code on if somebody were to give me a bash script that would do it.

    Ignoring differences between HadCRUT3 and GISTemp in the header lines before the data of interest start — those lines just need to be deleted — when we get to the data, it is of the format

    YYYY xxxx xxxx …. xxxx

    where YYYY is the year, and the xxxx are monthly figures (except for GISTemp, which has a bunch of averages after the December figure). In either case, the object is to parse the line into “words” and write the 2nd through 13th word out line by line to a new file. (Well, I’d need to probably write out the first YYYY, maybe as a comment, at the start of the new file, since the data do not all start at the same year.) I realize this is a very trivial exercise…to anyone who codes daily.

    So the bleg was just to ask for some ideas about how to do this. Lots of bright and capable people read this blog. Paul Clark (WoodForTrees) has to do this to read the GISS data for his web site, so he probably has some C++ code that does it, but it would have to be modified some for HadCRUT and the GISS station data. And I do not have a C++ IDE/compiler installed anyway.

    Anyway, that’s the issue. With the release of the HadCRUT3 “subset,” there is a lot of data that would be interesting to look at in detail. But with the tools that I use, I need to come up with a more efficient way to handle this row to column transformation.
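
    One possible way to do the row-to-column step, sketched in Python rather than REXX or bash. It assumes only what is described above: data rows start with a four-digit year followed by at least twelve whitespace-separated monthly figures, and anything else is a header line to skip. The function names and the sample rows are made up for illustration, and missing-value markers are passed through untouched.

```python
def rows_to_column(lines):
    """Return (first_year, values): twelve monthly values per data row, in order."""
    first_year = None
    values = []
    for line in lines:
        words = line.split()
        is_data = (
            len(words) >= 13 and len(words[0]) == 4 and words[0].isdigit()
        )
        if not is_data:
            continue  # header or other non-data line
        if first_year is None:
            first_year = int(words[0])
        values.extend(words[1:13])  # 2nd through 13th word; extras are dropped
    return first_year, values


def column_text(first_year, values):
    """One value per line, with the starting year as a leading comment."""
    return "\n".join([f"# series starts {first_year}"] + values) + "\n"


# Invented sample: two header lines, one plain row, one row with
# extra trailing averages after December.
sample = [
    "Name= EXAMPLE STATION",
    "YEAR JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC",
    "1871 " + " ".join(f"{m}.0" for m in range(1, 13)),
    "1872 " + " ".join(f"{m}.5" for m in range(1, 13)) + " 6.5 7.0",
]
first_year, values = rows_to_column(sample)
print(column_text(first_year, values))
```

    For a real file, the list `sample` would be replaced by the lines of the downloaded station file (for example `open(path)`), and the returned text written to a new file for the stat routines to read.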

  55. ralph (02:15:28) :

    Do I spy an inverse hockey-stick developing in one of the graphs above?

    http://wattsupwiththat.files.wordpress.com/2009/12/nashville-figure5.png

    That’s an “artifact” of HP smoothing. All smoothing methods have issues with handling of data at the ends of the series. With some smooths, the results near the end are padded with extra data added, but one has to make assumptions about the extra data. HP doesn’t do this. It is a least squares approach, and it produces an unbiased smooth, in the sense that a linear regression through the smooth has exactly the same fit as a linear regression through the unsmoothed data.

    Looking at the figure, we’re likely at the bottom of a cycle in the data. As the trend turns back up, the integration of new data into the smooth will make the turning point higher, removing the “blade” you see which reminds you of a hockey stick. That said, it still may be that the depth of this latest cooling episode will turn out to be more extreme than previous episodes. But the main point is that with HP smoothing to this type of data, you cannot read much of anything into the half of a cycle that begins and ends the series.
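
    The claim about the regression fit can be checked with a small Python sketch. It assumes “HP” here means the Hodrick-Prescott filter in its usual penalized least-squares form (squared residuals plus a smoothing weight times squared second differences of the trend). The smoothing weight, the test series and the function names are arbitrary choices made for this illustration, not the settings used for the figures.

```python
import math


def hp_trend(y, lam=1600.0):
    """Hodrick-Prescott trend: solve (I + lam * D'D) trend = y,
    where D takes second differences."""
    n = len(y)
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for r in range(n - 2):
        idx, coef = (r, r + 1, r + 2), (1.0, -2.0, 1.0)
        for i, ci in zip(idx, coef):
            for j, cj in zip(idx, coef):
                a[i][j] += lam * ci * cj
    # Plain Gaussian elimination; the matrix is symmetric positive
    # definite, so no pivoting is needed.
    b = [float(v) for v in y]
    for k in range(n):
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            if f:
                for j in range(k, n):
                    a[i][j] -= f * a[k][j]
                b[i] -= f * b[k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / a[i][i]
    return x


def ols_line(ys):
    """Least-squares (intercept, slope) of ys against 0, 1, 2, ..."""
    n = len(ys)
    xbar, ybar = (n - 1) / 2, sum(ys) / n
    slope = sum((i - xbar) * (v - ybar) for i, v in enumerate(ys)) / sum(
        (i - xbar) ** 2 for i in range(n)
    )
    return ybar - slope * xbar, slope


# Made-up series: a slow drift plus a cycle.
data = [10 + 0.02 * t + math.sin(t / 3.0) for t in range(40)]
smooth = hp_trend(data)
print(ols_line(data))
print(ols_line(smooth))  # same intercept and slope, up to rounding
```

    The two fitted lines agree because the penalty involves only second differences, which are zero for any straight line, so the residuals from the smooth carry no linear component.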

  56. David (22:20:43) :

    I’ll issue a gripe here. Annualizing the temperatures. Why? There are four seasons every year. If you want to make the data less choppy, fine, but at least keep it in its natural rhythm. Do the Grand Poobahs have a reason for imposing the calendar’s will on the data?

    While the rate of growth is annual, the seasonality is all still there, month by month. The monthly “natural rhythm” is present in the gray data labeled “(original data)” underlying the blue “(trend)”. But the blue trend is a “natural rhythm” as well, just on a time scale of a decade or so.

  57. Anthony, there is something magical about using data from 1979 or so onward.

    Here, try this simple test. Ask the NSIDC why they will ONLY use data from 1979 to 2000 as the reference average for Arctic sea ice. How can it possibly be that this is the best and only reference for all things Arctic? They don’t have a rational answer. The obvious conclusion is this is done for dramatic effect.

    The NSIDC then has the gall to proclaim that what has been seen in the 30 year record of Arctic sea ice is falling outside the natural variability – REALLY! If that truly were the case, then no more studying of the Arctic need be done. Actually, anything falling outside 2 standard deviations of the 20 year average is considered outside the natural variability. As Church Lady would say, “how convenient”.

  58. JJ (23:17:00) :

    I didn’t say HadCRUT3 was “raw.” I referred to it as “unadjusted.” Now what I meant by that is that it should be the same as GISS before GISS applies its “homogeneity” adjustment.

    No monthly data can ever be described as “raw,” because the “raw” observations are daily. And any data set that has NO missing observations is unlikely to be truly “raw.” For all the talk about going back to the “raw” data, I don’t think that is where the problem begins. From my work with US data (I do some consulting work where I have occasion to look at the truly “raw” data occasionally), NOAA does some “quality” control right off the bat in reading from the hand written daily records. I doubt that any systematic “warming bias” is introduced at that point.

    The next problem faced with using “raw” data is the handling of missing data. I think this may need to be looked at a little more closely. To read how this is done, it sounds like a reasonable approach, but only superficially. It is done by correlating differences (that’s important, and a good thing) from “nearby” stations. This is probably okay, for the occasional missing observation, especially if this is being done to fill in the daily data (which I’m not sure about, but should read up on at some point). But if a lot of data is missing, there should be a point at which the station is just excluded, rather than fill it in with the method currently used. (Maybe this is done. Do you know?)

    For its purpose, I think what I’ve done is defensible. If you go to the GISTemp web site, you get the option of downloading its “pseudo-raw” version of the data, i.e. the data before it applies its “homogeneity” adjustment. I could have used that, instead of HadCRUT3, and I believe that the results would have been similar, if not the same. Both CRU and GISS are, in a case like this, getting their data from the same place. I doubt that they started with the daily data. They are taking the monthly data as given to them by the various met agencies. So it is relevant, and interesting, to explore how they can take the same data, and come up with such different results. Which is all I did.

  59. An enlightening post, Basil Copeland. Seasonal differences seems to be more appropriate for knowledge about temperature/climate change. ClimateGate or CRUCrud is the worst and most costly scientific scandal of our generation. Your conclusion is ONE OF THREE ideas I would like to see emphasized every time there is a discussion of Climate Gate or Global Warming.

    ONE “I think we have to accept the depth of our ignorance, for now, and admit that we do not really have a clue about what might have caused the kind of upward drift we see in the blue trend line in the preceding figure. Of course, that means putting a hold on any radical socioeconomic transformations based on the notion that we know what in truth we do not know.” (Copeland)

    TWO (A) A real-scientist/skeptical-public site where validated (by whom?) RAW DATA is kept. This seems to be a necessity if we are going to pull together reasonable conclusions for a general public. See paulhan (22:53:05) “I have just downloaded the Daily data from GHCN (Please tell me that this data is truly raw, otherwise I will have downloaded 1.7GB of nothing).”

    TWO (B) If the RAW DATA needs ADJUSTING, a validated (by whom?) method for doing so. Also paulhan (22:53:05) “I’ve seen lots of complaints about the various ways the raw data is adjusted, but I think if we are going to be taken seriously we need to come up with our own ONE way of adjusting the data….”

    THREE. A CENTRAL LOCATION for “distributed computing” using raw data. Mike Fox (23:35:01) suggests surfacestations.org surveyors might take on the task using “formulas like Willis and Basil did” and asking standardized questions that speak to Basil’s conclusion — why we must do nothing now but wait for scientific research based on transparency, accountability and verification. Paulhan suggests Eschenbach’s surfacetemps.org. E.M. Smith also has collected a tremendous amount of data.

    There already are so many smoking or smoldering guns to go along with the pre-1999 (before the conspiracy began in earnest) science plus the skeptic-scientist “banned” research of the last 10 years, that concerted action to urge reasonable non-action seems, well, urgent. (And I still want jail time for the fraudsters, fines equal to the grants they squandered at taxpayers’ expense, and banishment from academic life.)

  60. MikeC (22:47:20) :

    You guys are on the right track, but what you have not done is gone into the scientific literature and looked at exactly how the adjustments are made… you will, I guarantee, find this whole temperature monitoring and reporting bizarre

    John Goetz and EM Smith have gone a step further by analyzing GISS source code. The buck stops there.

    http://chiefio.wordpress.com/2009/07/21/gistemp_data_go_round/

    http://wattsupwiththat.com/2009/07/22/giss-step-1-does-it-influence-the-trend/

    http://wattsupwiththat.com/2008/04/08/rewriting-history-time-and-time-again/

    http://climateaudit.org/2008/02/09/how-much-estimation-is-too-much-estimation/

    One thing I’d like to see on future raw/cooked and blinking temperature
    graphs is the difference. There ought to be several cases where the
    adjustments are the same at several stations as UHI and street light
    adjustment heuristics are twiddled.

  61. Basil (05:26:33) :

    ScottA (02:14:06) :

    > Did I miss the bleg for transforming row to column data?

    Ignoring differences between HadCRUT3 and GISTemp in the header lines before the data of interest start — those lines just need to be deleted — when we get to the data, it is of the format

    YYYY xxxx xxxx …. xxxx

    where YYYY is the year, and the xxxx are monthly figures (except for GISTemp, which has a bunch of averages after the December figure). In either case, the object is to parse the line into “words” and write the 2nd through 13th word out line by line to a new file. (Well, I’d need to probably write out the first YYYY, maybe as a comment, at the start of the new file, since the data do not all start at the same year.) I realize this is a very trivial exercise…to anyone who codes daily.

    What’s a bleg? A request for something in a blog?

    Awk would be an easy choice; my kneejerk reaction for almost anything outside of kernel code is Python (follow the link!), and that might be on your Linux system already, since some distros use it for a lot of system configuration stuff. Perl would be good too, but one reason I learned Python was so that I didn’t need to learn Perl.

    I don’t really have time, but if there are no takers….
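
    For anyone who wants to take up the bleg, here is a minimal Python sketch of the row-to-column step described above. It assumes whitespace-separated lines of the form YYYY followed by at least twelve monthly values, with the header lines already deleted. The function name `rows_to_column` and the sample numbers are made up for illustration, and real HadCRUT3 or GISTemp files may need extra handling for missing-value codes.

```python
def rows_to_column(lines):
    """Turn 'YYYY m1 m2 ... m12 [extras]' rows into one value per line.

    Returns (first_year, values). Anything after the 12th monthly value
    (such as the averages GISTemp appends) is ignored.
    """
    first_year = None
    values = []
    for line in lines:
        words = line.split()
        if len(words) < 13 or not words[0].isdigit():
            continue  # skip blank lines and leftover header lines
        if first_year is None:
            first_year = int(words[0])
        values.extend(words[1:13])  # 2nd through 13th word
    return first_year, values


# Illustrative input only; these numbers are not real station data.
sample = [
    "1881 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 6.5",
    "1882 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5 9.5 10.5 11.5 12.5 7.0",
]
year, column = rows_to_column(sample)
print(f"# series starts in {year}")
print("\n".join(column))
```

    The first year is written as a comment line so that series starting in different years can still be lined up afterwards.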

  62. Mike Fox (23:35:01) :

    ….

    How about if each of us who was a surfacestations surveyor would take the “raw” data from the stations we surveyed, crank it through formulas like Willis and Basil did in their posts, and then upload the results to supplement our surfacestations.org surveys?

    See what I just wrote to JJ about “raw” data. I certainly am not interested in going back to the truly “raw” data, which is daily.

    In deciding where to begin, we have to decide whether we begin with the truly raw data, or the data that was received by CRU (or GISS) from the met agencies. Much of the ruckus over FOI’s has been simply to get the latter. And I think we now have some of that with the “subset” of HadCRUT3 that has been released. I may be mistaken — somebody correct me if I am — but I think this is supposed to be the data “as received” from the met agencies. They describe the data they are releasing thusly:

    “The data that we are providing is the database used to produce the global temperature series. Some of these data are the original underlying observations and some are observations adjusted to account for non climatic influences, for example changes in observations methods.”

    It is a little unclear from this statement as to who is responsible for the adjustments, but I think they mean the original met agencies, not CRU. I.e., if the data they got from a national met agency was adjusted for time of observation bias, that’s what they used. If it wasn’t adjusted for time of observation bias, that is still what they used. I do not think they are saying that they (here Hadley or CRU) adjusted some of the data for non climatic influences, but didn’t adjust other data. They just went with what they were given.

    Assuming this is so, for monthly data, this is probably about as “raw” as it gets, without going back to the original daily data. And it is a start. GISS is taking this same data, and is doing something more to it. So a good starting point is to understand the differences between GISS and HadCRUT.

  63. This is probably a little off topic, unless you consider motivations for all this crap that’s going on- CRUtape Letters, AlBore, Copenhagen, etc.

    Look at this: When the UN’s involved: Follow the Money.
    From World Net Daily:

    NEW YORK – A story emerging out of Britain suggests “follow the money” may explain the enthusiasm of the United Nations to pursue caps on carbon emissions, despite doubts surfacing in the scientific community about the validity of the underlying global warming hypothesis. A Mumbai-based Indian multinational conglomerate with business ties to Rajendra K. Pachauri, the chairman since 2002 of the U.N. Intergovernmental Panel on Climate Change, or IPCC, stands to make several hundred million dollars in European Union carbon credits simply by closing a steel production facility in Britain with the loss of 1,700 jobs. The Tata Group headquartered in Mumbai anticipates receiving windfall profits of up to nearly $2 billion from closing the Corus Redcar steelmaking plant in Britain, with about half of the savings expected to result from cashing in on carbon credits granted the steelmaker by the European Union under the EU’s emissions trading scheme, or ETS.

  64. TonyB (00:01:31) :

    I’ve asked about it a couple of times, but have never received a satisfactory answer about why we should be trying to UHI-adjust the data. UHI is a type of AGW. I understand the desire to quantify it, but not as an “adjustment” to the data. Leave out trying to “adjust” the individual stations for UHI. Then, once we have good data on temperature for selected stations, then try to quantify the effect of UHI.

    Which brings up an interesting question. At what stage is the UHI adjustment done? I’m guessing (!) that it is done when taking the data received from the met agencies, and trying to grid it into a world model. But I am very suspicious about the adjustment, and doubt that it is nearly what it ought to be. I’d rather that adjustment be left out of the “global temperature” data set, and then let climate scientists argue out in the literature about how much UHI is in the “global temperature.” The way it is now, alarmists can dismiss UHI because it supposedly has already been accounted for.

    I really do think UHI needs to be studied like Peter did in the video Anthony posted a couple of days ago. Do not make it part of the “global temperature” data. The “global temperature” is “what it is” regardless of the source. Adjustments may be justified for time of observation bias, or other “non-climatic” influences, but UHI is definitely not a “non-climatic” influence. It is very much an influence on the climate.

  65. I downloaded the CRU data recently released by the Met Office and was surprised to find only partial data sets for the UK stations, some ending in 1990 or earlier. I then went to the met office website and downloaded two stations that had long data sets.

    http://www.metoffice.gov.uk/climate/uk/stationdata/

    I have plotted the graphs of temperature over time for Durham and Oxford on my website. These two towns are in central England

    http://www.akk.me.uk/Climate_Change.htm

    They show a steady decline. I would appreciate any comments. This was my method, and admittedly it covers only two sites.

    I have taken the difference between tmax and tmin for each month and averaged them over the year. Where there was estimated data I have used the estimated data. Where there was missing data I interpolated the missing data using the previous and following month.

  66. astonerii (00:29:32) :

    Is this a joke? You can show warming trend from data that goes from 15.9C to ~15.425C? I am sorry, but something is wrong with the calculations. If you’re saying maybe there is a warming trend at the very end, that is one thing, but to say there has been net warming, that is pretty much unbelievable. Either the starting temperature is higher or lower than the ending temperature, there is no other choice. You did not even show what the actual temperature curve looked like, just a trend line.

    Here’s your “actual temperature curve(s)”:

    With that much variation, it isn’t hard to imagine subtle differences having major impacts on the trend lines.

  67. “to accept the depth of our ignorance” – To do anything else would be dishonest, and opens the door to bias, wild speculation, and puts leaders and politicians in the drivers’ seat, to steer it in any direction they wish, which in this case, was right off the damn cliff. Not only does one have to be aware of the level of ignorance, but we must also be painfully aware of the sheer complexity involved in attempting to instrument the climate of our planet with the intention of predicting its ‘long-term’ future state.

    This requires, if it is even feasible (or possible), a great deal more observational data (A LOT more, read: ‘launch more satellites!!’) over a greater period of time, with a great deal more supercomputers to process the data and to start basically guessing with the models until the predictions actually become somewhat accurate.

    Anyways, and not to pile on, but I’ve just started looking into the infrared radiative transfer models… Too early to say, but my impression is that the potential for more of the same is most definitely there…and again, it’s a case of not accepting the depth of our ignorance, in addition to it being on more shaky science ground than mercury thermometers on an asphalt parking-lot covered in latex paint.

    And the satellite calibration and data and its future… well, this one is harder to investigate, but apparently the plan is to have a U.N. agency write the software and probably have control over the database and who gets access etc….

    Maybe we should consider building an open-source, massively distributed computing framework on the internet, as an additional means of oversight (plus it could be a great deal of fun)… might even turn out to be an asset for the big government agencies.

  68. Off-topic specifically but on-topic generally, I would like to recommend this post by Alan Sullivan of Fresh Bilge. It is a re-post of something he wrote in 2008, but it is even more relevant today. He is an excellent writer and a good science generalist. He neatly summarizes most of the true scientific “consensus” regarding the inter-relationships of forces and energies affecting the climate.

    See http://www.seablogger.com/?p=18358

  69. Basil,

    “I didn’t say HadCRUT3 was “raw.” I referred to it as “unadjusted.””

    The HadCrut3 data you used was not raw data. It is adjusted data. Specifically, it is homogenized data. The Met office tells you that on the page that you linked to above. This:

    Some of these data are the original underlying observations and some are observations adjusted to account for non climatic influences, for example changes in observations methods.

    means homogenized. Homogenized means ‘adjusted to account for non climatic influences’.

    “Now what I meant by that is that it should be the same as GISS before GISS applies its “homogeneity” adjustment.”

    That is not true. The HadCrut3 is not raw data. It is homogenized data. The Met tells you that on the page that you linked to above. It is not the same as GISS before GISS applies its homogeneity adjustment.

    You are not comparing GISS adjusted data with unadjusted data, as you claim. You are comparing two different adjusted datasets. What you say about that comparison is false. Please correct your error.

    “For its purpose, I think what I’ve done is defensible.”

    It isn’t. You don’t understand what you have done, and are describing both the datasets and the comparison you made with them incorrectly.

    “If you go to the GISTemp web site, you get the option of downloading its “pseudo-raw” version of the data, i.e. the data before it applies its “homogeneity” adjustment. I could have used that, instead of HadCRUT3, and I believe that the results would have been similar, if not the same.”

    [snip] You have the option to actually get the unhomogenized data from GISS, but instead you got homogenized HadCRUT3 data, and pretend that it is the same? Why would you do that? Honestly, why? That makes absolutely no sense. If you could have used unhomogenized GISS data where your intended analysis called for unhomogenized GISS data, why on earth didn’t you?

    “So it is relevant, and interesting, to explore how they can take the same data, and come up with such different results. Which is all I did.”

    No it isn’t. What you did was compare two homogenized datasets, and claim that one was homogenized and the other wasn’t. And you did this, even though you admit to having access to what you knew to be the unhomogenized data you claim you were using.

    Why on earth would you do this?

  70. Trevor Cooper (02:13:35) :

    An interesting approach to dealing with step changes. But I think your final section, labelled ‘An Important Caveat’, is misleading. To see if there has been warming recently, you should choose a period (say the last thirty years) and simply look at the average of the seasonal differences over that period. Your approach of fitting a line to the graph of seasonal differences is in fact measuring any acceleration in warming, a very different thing. There again, I may have misunderstood.

    Your suggestion has merit. The problem you will encounter is that the standard deviation is so high that it is all but impossible to conclude anything from doing that. For example, for the GISTemp data set, I compared the mean of the last 30 years to the mean of all the years preceding:

    Null hypothesis: Difference of means = 0

    Sample 1:
    n = 360, mean = 0.027222, s.d. = 2.5256
    standard error of mean = 0.133111
    95% confidence interval for mean: -0.234553 to 0.288997

    Sample 2:
    n = 1186, mean = -0.024872, s.d. = 2.699
    standard error of mean = 0.0783719
    95% confidence interval for mean: -0.178635 to 0.128891

    Test statistic: t(1544) = (0.027222 – -0.024872)/0.160045 = 0.325496
    Two-tailed p-value = 0.7448
    (one-tailed = 0.3724)

    For a graphic of the test, look here:

    The difference between the means is “not significantly different than zero.”
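
    For readers who want to check the arithmetic, the figures above can be reproduced from the summary statistics alone. The sketch below assumes a pooled-variance two-sample t-test, which is an inference from the fact that it reproduces the quoted standard error of 0.160045. The p-value uses a normal approximation, reasonable only because there are 1544 degrees of freedom. The function name `pooled_t_test` is made up for illustration.

```python
import math


def pooled_t_test(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic using a pooled variance estimate."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se_diff
    # Assumption: with df this large the t distribution is close to normal,
    # so the two-tailed p-value is approximated from the normal tail.
    p_two_tailed = math.erfc(abs(t) / math.sqrt(2))
    return t, se_diff, df, p_two_tailed


# Summary statistics as quoted in the test output above.
t, se_diff, df, p = pooled_t_test(360, 0.027222, 2.5256, 1186, -0.024872, 2.699)
print(f"t({df}) = {t:.4f}, se = {se_diff:.6f}, two-tailed p ~ {p:.4f}")
```

    With standard deviations near 2.5 and a difference in means near 0.05, the test has very little power to distinguish the two periods, which is the point being made.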

    Here’s the “problem:” natural climate is…variable. It is very difficult — in my view it is impossible, given the present state of knowledge — to say that the last 30 years are outside the range of natural climate variability.

    We need to coin a new aphorism: “hide the variability.” Because besides “hiding the decline,” a lot of the alarmist claims are dependent on “hiding the variability.”

  71. I am not a mathematician and I never studied statistics so please someone correct me if I’m wrong.

    It seems as though any oscillating curve (even a sine wave) can be fitted with a trend line that points either up or down, depending on where you choose to start and end the curve that you are fitting.
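
    That intuition is easy to check numerically. The following is a small Python sketch, not anyone's actual analysis: it fits an ordinary least-squares line to a pure sine wave over three different windows. The helper names `ols_slope` and `window_slope` are made up for illustration.

```python
import math


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def window_slope(start, end, steps=200):
    """Fit a straight line to sin(x) sampled evenly on [start, end]."""
    xs = [start + (end - start) * i / (steps - 1) for i in range(steps)]
    ys = [math.sin(x) for x in xs]
    return ols_slope(xs, ys)


rising = window_slope(-math.pi / 2, math.pi / 2)      # trough to peak
falling = window_slope(math.pi / 2, 3 * math.pi / 2)  # peak to trough
flat = window_slope(-math.pi / 2, 7 * math.pi / 2)    # two whole cycles, trough to trough

print(f"rising window:  {rising:+.3f}")
print(f"falling window: {falling:+.3f}")
print(f"whole cycles:   {flat:+.3f}")
```

    The same trendless curve gives a positive, a negative, or an essentially zero slope depending only on where the window starts and ends.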

  72. Basil, more of a question. You are using the entire year’s data. What happens when you look just at the daily temps for the two ends of the climate, Jan and July? That is, what is the trend for all the Jan months for the series and same with the July months?

    I have long speculated that what we are seeing is not a true increase in temps, but a narrowing of variation below the maximum temps, which will tend to increase the average daily temps, but no real physical increase is occurring. That is, what we are seeing are shorter warmer winters, with no change in summer temps, which gives an increase in average temps over time.

    Also, I’m speculating that spring can come earlier and fall come later, which one can see with the temp changes in those transition months.

    This could be the killer of AGW if this is the case because there is no real increase in temps, just less variation below max temps for each month. The alarmism of a catastrophic future disappears then, doesn’t it?

    I’m going to test this speculation myself as I’m currently downloading, from Environment Canada’s website, all the daily temps for all 1600 Canadian stations from 1900 to the present. It will take about 10 days to complete that download. Then I’ll import it into Access or SQL Server to do the analysis.

  73. acob (02:51:10) :

    I very much disagree that “less cooling” means warming.
    If the trend is still negative there’s no way to call it warming.

    It’s the same as “less than expected job-losses” not meaning a recovery – it’s still getting worse.

    cheers, acob

    I understand your point. It is a question of semantics. I’m simply trying to acknowledge that something may be taking place that is raising the earth’s temperature, compared to what it would be if that thing were not taking place. Now I’ve purposely left ambiguous what that “thing” might be, for the reasons discussed in the post.

  74. If it has not already been put up here and I missed it, this is interesting:
    CRU: Artificial Corrections, Fudge Factor

    A programming file called ‘briffa_sept98_e.pro’ from the somewhat overlooked documents section of the CRU ‘FOI2009’ collection: [here there is a very interesting graph]
    ;
    ; PLOTS ‘ALL’ REGION MXD timeseries from age banded and from hugershoff
    ; standardised datasets.
    ; Reads Harry’s regional timeseries and outputs the 1600-1992 portion
    ; with missing values set appropriately. Uses mxd, and just the
    ; “all band” timeseries
    ;****** APPLIES A VERY ARTIFICIAL CORRECTION FOR DECLINE*********

  75. Nick Stokes (05:05:39) :

    Instead of just picking one station here and there, why not look at the effect of adjustments overall. This is what Giorgio Gilestro did for the GHCN set that Willis analysed for Darwin alone. He shows the distribution of the effects of the adjustment on trend. It looks quite symmetric. Adjustments are just as likely to cool as to heat.

    Then how do you explain this?

    If I’m not mistaken — and I might well be! — the CRU dataset takes any adjustments like this, that have already been made by the met agencies, as they are. Gilestro only looks at what CRU has done to the data that was given to it. I’m a bit of a “middle of the roader” here, but I’m not particularly leery of HadCRUT. I’m less sanguine about GISS.

    In any event, I wasn’t comparing CRU “raw” to CRU “adjusted.” I was comparing CRU “whatever it is, as recently released by the UK met office” to GISS “homogenized.” I suspect that the kind of biases I show here will NOT all average out to zero, but reflect some intrinsic issues with respect to the GISS homogenization process.

  76. Here is some rough excel code. It works.
    NOTE that the excel sheet must be saved as a .XLSM macro enabled worksheet and macros will have to be enabled.

    excel7
    click developer tab
    type a new macro name
    e. g. YearPerRowToClmn

    Assign to keyboard if you like

    click [create]
    then between:

    Sub YearPerRowToClmn()

    End Sub

    paste this (but not the “———” lines):

    '——————————
    ActiveCell.Offset(1, 0).Range("A1:A11").Select
    Application.CutCopyMode = False
    Selection.EntireRow.Insert , CopyOrigin:=xlFormatFromLeftOrAbove
    ActiveCell.Offset(-1, 2).Range("A1:K1").Select
    Selection.Copy
    ActiveCell.Offset(1, -1).Range("A1").Select
    Selection.PasteSpecial Paste:=xlPasteAll, Operation:=xlNone, SkipBlanks:= _
    False, Transpose:=True
    ActiveCell.Offset(11, -1).Range("A1").Select

    ‘ must “hide the incline”
    ‘ beware the ides of march

    repeattranspose:
    If Len(ActiveCell.Text) < 2 Then GoTo stopp
    ActiveCell.Offset(1, 0).Range("A1:x11").Select

    Application.CutCopyMode = False
    Selection.EntireRow.Insert , CopyOrigin:=xlFormatFromLeftOrAbove
    ActiveCell.Offset(-1, 2).Range("A1:K1").Select
    Selection.Copy
    ActiveCell.Offset(1, -1).Range("A1").Select
    Selection.PasteSpecial Paste:=xlPasteAll, Operation:=xlNone, SkipBlanks:= _
    False, Transpose:=True
    ActiveCell.Offset(11, -1).Range("A1").Select
    GoTo repeattranspose

    stopp:
    '———————–

    to use:
    get data in text form with 1 year of 12 months + other stuff
    Select data [ctrl]+a selects the lot
    Copy data [ctrl]+c
    open blank sheet in the workbook containing the macro
    paste the data at cell a1
    You now have a single column of data one year per row. If your excel is set up differently it may convert the data to columns automatically, if not
    select the first to last year:
    click first – scroll to last and click the last year whilst holding shift
    select [data] tab
    click text to columns
    click delimited if you KNOW that there is always a certain character (space, comma etc) between monthly data
    or click fixed width
    click next (select delimiter character if necessary)
    check the columns are correctly selected – move, delete or add. If station number is in data this usually has the date attached without space. If so add a separator to separate the date from the station.
    If station number is in first column click next and select station number to be text click finish
    else click finish
    You should now have the data separated into columns.

    Click the cell containing the first date (or the first cell to the left of january’s temperature)
    Save the workbook as the next step is not undo-able.
    Run the macro above (use keyboard shortcut or go to the developer tab and double click the macro name).
    The macro should stop on the last line of data (looks for empty cell). However, if it does not, press [ctrl]+[break] a number of times, then select end or debug from the options according to taste.
    No guarantee is given with this software!!!!!

  77. I sail third officer deep sea now and the position, temperature, humidity, and wind as well as sea state are recorded at the end of every 4 hour watch as a matter of law (I speak of the US flagged). All of our logs are required to be retained by law. On one ship I had to launch a bathythermograph every watch for a NOAA guy that rode with us. These results were logged on a computer.

    Does anyone know if there has been a large scale attempt to collect all ships’ logged weather data? They must cover a substantial part of the sea. Wx reporting to NOAA is done on some ships, but this is optional and subject to the Mate’s patience. Almost 100% of US flagged ships’ instruments (Barometer, dry/wet bulb thermometer) are calibrated by NOAA every so often. I would guess there is more care taken in the accuracy of these readings because we heavily rely on the resulting Wx maps (Provided by NOAA) to forecast and route.

  78. Basil,

    Turning to the balance of your analysis, you can save yourself considerable effort next time when calculating your ‘mean seasonal difference’ statistic. It is not necessary to create a difference series and tally all of its values to do that. Mathematically, your ‘mean seasonal difference’ statistic simplifies to the following equation:

    (Te – Ts)/Y

    Where:

    Ts= Temperature at the start of the analysis period.

    Te= Temperature at the end of the analysis period.

    Y = number of years in the analysis period.

    It should be much faster for you to calc that statistic next time. This does raise the following questions, however.

    You have 130 years of data at the Nashville site, which amounts to 1,560 monthly average temperatures. First, you throw out 92% of that data by choosing only to look at October. Then you throw out 98% of that October data, by using a ‘mean seasonal difference’ statistic that is determined by the values of only two of the October monthly averages – the endpoints of the period of analysis.

    Why is using only 0.13 % of the data to calculate a temperature trend superior to calculating a temperature trend from a larger portion of the data? Like, say, all of it?

    Given that your ‘mean seasonal difference’ statistic only uses two datapoints (the endpoints of your seasonal series) it should be apparent that the choice of those two points is … pretty important. Just moving one of the endpoints of your analysis period forward or backward by one year could dramatically change the ‘mean seasonal difference’ trend that you calculate.

    In fact, on a dataset with essentially zero trend (such as the homogenized GISS dataset for Nashville that shows much less than 0.1C warming over a century) you could completely flop the trend from warming to cooling and back with only tiny moves of the endpoints. And that warming or cooling trend could pretty easily be an order of magnitude larger than the minuscule warming trend calculated from all of the data.

    Is that what you mean when you say this method is ‘robust’?
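
    The simplification claimed above is just a telescoping sum: every intermediate year appears once with a plus sign and once with a minus sign, so only the endpoints survive. A short Python sketch with made-up October values (not the Nashville data) shows both the identity and the endpoint sensitivity. Note that with N yearly values there are N - 1 differences, so the divisor here is N - 1, which is one reading of the “Y” in the formula.

```python
def mean_yearly_difference(series):
    """Mean of successive year-over-year differences for one calendar month."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    return sum(diffs) / len(diffs)


# Hypothetical October means in deg C, one per year (illustrative only).
octobers = [15.9, 14.2, 16.8, 15.1, 13.7, 16.0, 15.4]

direct = mean_yearly_difference(octobers)
shortcut = (octobers[-1] - octobers[0]) / (len(octobers) - 1)

# Dropping the final year moves the end point and flips the sign here.
shifted = mean_yearly_difference(octobers[:-1])

print(f"mean of differences: {direct:+.4f}")
print(f"(Te - Ts) / (N - 1): {shortcut:+.4f}")
print(f"end point moved back one year: {shifted:+.4f}")
```

    Whether this matches the statistic actually used in the post depends on how the seasonal differences were formed there; the sketch only illustrates the algebra described in this comment.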

  79. Oldgifford you say:

    “I have taken the difference between tmax and tmin for each month and averaged them over the year. ”

    It is not surprising that there is a declining trend in the difference between maximum and minimum temperatures since as far as I know generally in the UK there has been a trend of higher minimum temperatures and this has been higher than the rise in maximum temperatures.

  80. Basil, I think we need to be more certain about the raw data. You suggest that you do not think using raw data is reasonable and that it is not necessary, if I read you correctly.

    Basil 6:28:57 “For all the talk about going back to the “raw” data, I don’t think that is where the problem begins. From my work with US data (I do some consulting work where I have occasion to look at the truly “raw” data occasionally), NOAA does some “quality” control right off the bat in reading from the hand written daily records. I doubt that any systematic “warming bias” is introduced at that point.”

    I just can’t agree. NOAA’s “‘quality’ control right off the bat” must be checked by truthful citizens/scientists — perhaps at a sample of stations to verify that the raw data is not already cooked. Why would you “doubt” that any systematic “warming bias” is/has been introduced when this is the most significant scientific scandal of our time plus one that already has cost us billions, if not trillions — perhaps even quadrillions — of dollars during the last 10 years? Do you think these people are going to let this go easily?

  81. “I have taken the difference between tmax and tmin for each month and averaged them over the year.” – oldgifford

    Surely you need to take the average of tmax and tmin for each month (ie. sum not difference) and then average? Otherwise you get a graph of temperature range not temperature amplitude.

    Of course if tmax-tmin does decline over time, what that might show is the development of an urban heat island effect, as warm nights are one normal result of urbanisation.
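
    To make the distinction concrete, here is a tiny Python sketch with invented numbers (not the Durham or Oxford data) in which minimum temperatures rise faster than maximum temperatures. The range tmax - tmin falls while the mean (tmax + tmin) / 2 rises, so a declining plot of the difference says nothing by itself about cooling.

```python
# Hypothetical monthly means in deg C for two periods (illustrative only).
early = {"tmax": 13.0, "tmin": 5.0}
late = {"tmax": 13.5, "tmin": 6.0}


def diurnal_range(rec):
    """tmax - tmin: the spread between day and night temperatures."""
    return rec["tmax"] - rec["tmin"]


def mean_temperature(rec):
    """(tmax + tmin) / 2: the usual estimate of the mean temperature."""
    return (rec["tmax"] + rec["tmin"]) / 2


range_change = diurnal_range(late) - diurnal_range(early)
mean_change = mean_temperature(late) - mean_temperature(early)

print(f"change in range: {range_change:+.2f}")
print(f"change in mean:  {mean_change:+.2f}")
```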

  82. None of this matters. The power-hungry in Denmark are going to make Global Governance happen through climate change means, unless we stop them with something close to a revolution.

    They’re literally daring us at this point.

  83. “From another thread this seems an extremely important analysis:

    http://www.gilestro.tk/2009/lots-of-smoke-hardly-any-gun-do-climatologists-falsify-data/

    The adjustments over the whole GHCN series add up to
    NO BIAS” – bill

    An interesting analysis, worrying for our “smoking guns” if true. But is he really analysing “raw” data, as I thought they were only releasing the adjusted stuff? If the data was this available, why have they been working so hard to avoid releasing it? I’m also unclear how you analyse an adjustment in terms of its effect on trend.

  84. Would You Like Your Temperature Data Homogenized, or Pasteurized?

    I’d like it explained before Congress.

  85. Basil 6 44 43

    Couldn’t agree more. I have now read some thirty studies on UHi and looked at the Real Climate and IPCC take on this.

    As far as I can see they really downplay the effect of UHI as they claim that averaged over the whole globe it is negligible (which is true). However that misses the point that UHI is very important in urban areas which cover an increasingly large percentage of the temperature database.

    I think we should be supplied raw data then UHi applied individually according to the circumstances of the urbanisation. Clearly a station set in a large park is going to be affected completely differently to a station at a busy airport or in a city centre that 100 years ago was a field at edge of town.

    I believe the Met office adds in UHI from 1975, but despite my asking they have never told me what factor they use-perhaps it is buried in their web site somewhere.

    Some of the claims made in the studies of UHI do in my opinion sometimes overstate the case. A station set in a large city will see warming up to a point then the additional urbanisation is likely to mean the warmth is felt over a wider area rather than become more concentrated.

    However certain conditions -like clear still nights- will undoubtedly create a greater uhi effect than in other circumstances like windy weather.

    All in all though the uhi effect is noticeable and not really factored in to anything like the degree it should be (pun intended!)

    Tonyb

  86. on above post of mine
    The code does not translate to wordpress very well

    Any red marked line:
    single quote, replace with single quote from keyboard – its a comment so you can delete!
    question mark replace with double quote from keyboard
    double quote replace with double quote from keyboard

    If you ever end up with a correctly formatted column of monthly data you will need to remove all error indicators.
    Mark the data column
    on [data] tab select filter
    on first line of column click the down arrow
    deselect all (click the ticked select all box)
    look through offered numbers for error indicators “-” “-9999.9″ etc.
    click the boxes associated with the error indicators
    press[ok]
    only the data in error is now shown
    mark it all and press [delete] (be careful of the first box, as this is partially obscured by the arrow)
    Turn off filter by clicking filter in the ribbon again
    Data is now clean
    If data shows temp*10
    then in the cell adjacent (right) to the first temp type
    = [click temperature to left to enter cell]/10
    mark all column including this cell to last row containing temperature
    from [home] tab click fill then select fill down
    This column now contains correctly scaled temperature but the cell contents are actually formulae.
    with column marked copy column [ctrl]+c
    right click first cell of this column and select paste special
    select values only then ok it
    The column now contains actual values and can therefore be copied to another sheet with dates in decimal format i.e. year+(month-1)/12. Note that excel does not like true dates before jan 1st 1900
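
    For those not using Excel, the same cleaning steps can be sketched in a few lines of Python. This assumes a single column of monthly values starting in January, stored as tenths of a degree, with the error indicators mentioned above (“-” and “-9999.9”); other files may use different sentinels or scaling. The function name `clean_series` is made up for illustration.

```python
MISSING = {"-", "-9999.9"}  # error indicators named above; check your own file


def clean_series(first_year, raw_values, scale=10.0):
    """Return (decimal_date, temperature) pairs, dropping missing values.

    raw_values holds one string per month, starting in January of first_year.
    scale=10.0 assumes the file stores temperature * 10.
    """
    cleaned = []
    for i, raw in enumerate(raw_values):
        if raw.strip() in MISSING:
            continue  # drop error indicators instead of plotting them
        year = first_year + i // 12
        month = i % 12 + 1
        decimal_date = year + (month - 1) / 12
        cleaned.append((decimal_date, float(raw) / scale))
    return cleaned


# Illustrative values only: January, a missing February, and March of 1881.
result = clean_series(1881, ["152", "-9999.9", "98"])
print(result)
```

    Decimal dates sidestep the spreadsheet problem with true dates before 1900, since they are just numbers.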

  87. As an old instrumentation guy, I am always uncomfortable with folks attempting to describe how they use real world temperature measurements to determine trends of only a degree or two over a century. The surface temperature record begins with instruments delivered by horse drawn wagon with calibration accuracy of only a degree or two. Modern air temperature measurement equipment typically has a calibration accuracy of plus or minus 0.5 degrees. Essentially no calibration checks are done over time. Worse is the siting issue.

    For sea surface temperature reading, picture some poor fellow on the deck of a rolling wooden sailing vessel attempting to read, by the light of a swaying oil lamp, a thermometer he just pulled out of a bucket of sea water. That’s how it was done a century or so ago. Land temperature is even stranger.

    Water is pretty much water but land changes day by day. Trees grow, buildings come and go, measuring station locations have to be changed. What may have started as instrument accuracy of 0.5 degrees is impacted by changes in its local environment. The WUWT temperature site survey shows just how poorly the resultant readings may reflect actual local temperatures and any trends noticed in their data. Having instruments with ten times the internal accuracy would not improve the accuracy of our air temperature readings.

    The best any instrument guy would claim would be a trend that shows up as three times the calibration accuracy, and many would demand ten times. And this accuracy must include the error associated with poor siting and instrument reading interpolation. Suggesting that simply smoothing the data will increase its accuracy is a logical fallacy. This would be based on the assumption that all errors in measurement are simply random noise. That we are attempting to adjust for instrument changes and UHI should make it obvious that there is much more going on than random noise.

    In my judgment, we have enough evidence in the form of freeze/thaw date changes along with changes in plant and animal life ranges to know some amount of warming has occurred since we have bothered to take records. I do not believe, however, that our long (and short) term temperature measurement and estimation methods provide sufficient accuracy to quantify it.

    Now, some of you may be thinking about the sophisticated signal analysis done on radio signals. We are able to dig out signals buried deep in noise. You might wonder why, if we can do that, the same technique would not work for temperature measurements. The difference, of course, is that for those radio signals, we know the exact nature of the noise and the nature of the highly repetitive signals we are working on. We use algorithms that are designed to find that specific repetitive signal in the presence of noise. Think for the moment what that means for temperature trends.

    With temperature trends, we are looking for a non-repetitive signal in “noise” that we cannot reasonably characterize. The result, of course, is that if we build a signal processing algorithm that looks for a specific trend, it will probably find it. That appears to have been the case with the recent tree ring episode. Eliminate the data sets that do not match an expected pattern from the past, claiming the remaining data sets have proven to be accurate by that standard, and then claim that some non-temperature factor destroyed their accuracy after the matching date range so the later values are inaccurate. That is enough to make any engineer or technician shudder.

    Anyway that’s my opinion looking at this discussion from the perspective of an instrumentation tech, which I was many years ago.

    Just imagine what that poor bloke standing there in an oil skin rain coat on the deck of his 19th century sailing ship would think if you told him that some time in the future his temperature reading would be interpreted to the nearest one thousandth of a degree!

  88. Nick Stokes (05:05:39) :

    It is very hard to tell how skewed that distribution is in that picture, but if you look at the x-axis, you can see how adjustments extend a good deal further to the positive side.

  89. Isn’t the HadCRUT3 data already “adjusted”? If so, it would certainly be interesting to do a similar comparison to data that hasn’t even been touched.

  90. According to NOAA, the largest adjustment in the USHCN data (the US part of the Global Historical Climate Network, which is the basis for GISTemp) is time of day adjustments. These adjustments also have the strongest hockey stick shape, especially since the 1950’s.

    The black line here is the time of day adjustments. Yellow is the slightly less hockey stick shaped homogenization adjustments. Source with explanations here.

    Presumably NOAA has the adjustment data for each station. Thus for the Nashville station, for instance, it should be possible to find out WHEN the time of day adjustment was made, to see how much sense this makes of the adjustment record. Perhaps it accounts for the big mid-60’s adjustment, which might on those grounds be perfectly legitimate.

    Time of day adjustments may themselves turn out to be a source of manipulation, but at least they are an adjustment factor that COULD properly move in a systematically warming direction, unlike altitude adjustments (which should tend to cancel each other out over large numbers) or UHI adjustments (which should be downwards).

  91. Out of curiosity I downloaded the station data from the Met Office Website for two England stations with long spans of data.

    http://www.metoffice.gov.uk/climate/uk/stationdata/

    Here are the averages for Durham and Oxford. The slope of the temperature increase varies depending on where you take the snapshot. What we seem to have is a temperature rise over the period of about 2 deg C, but apparently starting to turn down. Time will tell.

  92. I live in a small UK village, population about 250. Two days ago I drove my Mini Cooper from our village in the early evening, and the warning alarm came on showing 3 degrees C. Fifteen minutes later I entered the small town of Mansfield, population 100,000, and the temperature reading was now 5 degrees C. It seems the UHI effect is quite significant.

  93. Good piece of work Basil, thanks very much.

    The ‘raw data’ is the most revealing, and shows what I think most people already knew – temperature is always going up and down over time. It is interesting to see that during an interglacial, the Earth’s homeostat is successful in keeping the system in balance to a fraction of a degree Celsius. I wish my central heating thermostat was half as good as this.

    I suspect that if we could go back and examine the real raw data set used for global temperature trends a similar result would be found. Unless, of course, Nashville doesn’t respond to the higher levels of CO2 like the rest of the world.

    The hypothesis of CAGW is brain dead. It’s time to turn off the life support machine before any further harm is done.

  94. Interesting link there, Plato Says (09:15:47).

    It reminded me of IowaHawk’s fascinating nature narrative on the secret life of climate researchers, posted here a while back: click

  95. Thom (09:05:19) :

    Here is an interesting piece of logic:

    “Historically, global warming cycles have lasted 5,000 years. The 800 year CO2 lag only shows that CO2 did not cause the first 16% of warming. The other 4,200 years were likely to have been caused by a CO2 greenhouse effect.”

    It is “obvious” that after the original warming (unexplained) that CO2 causes 4,200 years of warming? How? Also, if you cannot account for a natural process, how do you discount it?

    And the piece about reliable temperature records? Well, ‘detailed filters’ seems to be what this post was about, no? I think there is enough evidence that the record is shaky to cast doubt on the claims of ‘unequivocal’ evidence of ‘unprecedented’ and ‘catastrophic’ warming. Doomsaying is not helpful to the science underlying the case for man made global warming, and neither is the constant stream of media sensationalism about the aforementioned doomsmanship. Yes, I made that word up.

  96. Thom (09:05:19) :

    Also precious is this assertion:

    “And we haven’t even gotten to the stage where the oceans warm up.”

    10,000ish years is not long enough to warm the oceans? Instead it is going to happen in the next 100? Huh? Are we in the magical 1% of the interglacial where previous natural processes do not apply?

    You can fool 1% of the people 100% of the time, 100% of the people 1% of the time, but not 100% of the people 100% of the time.

  97. Let me get this straight: HadCRUT3 is based on real temperature measurements (with likely UHI effects), but has “corrections” and “homogenizations” in it already — probably including extension of measured data into unmeasured areas of the earth, correct?

    Does GISTemp use HadCRUT3 and make more corrections to it? Or do GISTemp and HadCRUT share a root data set?

    I would sure like a map to all of this. Maybe a little flow chart showing the Global Warming creation process.

  98. Now we are not even sure if there HAS been any warming at all! This could be the biggest scientific fiasco of modern times.

    What the elite Climate Scientists are measuring seems to me to be nothing more than how they are measuring it. So it doesn’t look good for their case that what they are doing has much to do with the real world.

  99. If one wishes to test the hypothesis that UHI has affected the trends, then follow the lead of epidemiology and make a dose-response “curve”. For instance, sort the raw data into strata based on population growth (probably local energy usage is important also) and calculate average $\Delta t$ for each stratum. If one wishes to detect whether siting issues are relevant, then make a dose-response curve against station ranking 1-5, and so forth.
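    A minimal sketch of the stratification step just described, assuming each station has already been reduced to a (population growth, temperature change) pair. The function name, the stratum boundaries, and the sample numbers are all hypothetical and chosen only to show the mechanics.

```python
from statistics import mean


def dose_response(stations, edges):
    """Mean temperature change per population-growth stratum.

    stations: iterable of (population_growth, delta_t) pairs.
    edges: ascending upper bounds of the strata; values above the last
           edge fall into a final open-ended stratum.
    Returns a list of (stratum_index, n_stations, mean_delta_t).
    """
    strata = [[] for _ in range(len(edges) + 1)]
    for growth, delta_t in stations:
        idx = sum(growth > e for e in edges)  # number of edges exceeded
        strata[idx].append(delta_t)
    return [(i, len(s), mean(s)) for i, s in enumerate(strata) if s]


# Made-up numbers purely for illustration; not real station data.
sample = [(0.1, 0.2), (0.3, 0.4), (1.5, 0.6), (2.5, 1.0), (3.0, 1.2)]
result = dose_response(sample, edges=[1.0, 2.0])
for row in result:
    print(row)
```

    If the per-stratum means rise steadily with the "dose", that is the pattern a dose-response argument looks for; a flat or erratic sequence would not support it.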

    Just using rural stations may be probative in this UHI regard, but Evan has said that rural stations have worse siting problems, and probably have worse maintenance problems, than do urban stations.

    Now if one wishes to quantify the UHI effect the dose-response curve is not much use in my opinion, and the UHI project that Anthony has on his “Projects” page is pertinent. It allows one to calculate an apparent partial of T with respect to time (i.e. an apparent global warming value) from a “v dot partial of T with respect to x.”

  100. Alec Rawls (09:14:50) :

    According to NOAA, the largest adjustment in the USHCN data (the US part of the Global Historical Climate Network, which is the basis for GISTemp) is time of day adjustments. These adjustments also have the strongest hockey stick shape, especially since the 1950’s.

    Thanks for the references. I was looking at the Illinois data on the blink comparator that Mike Macmillan posted yesterday, and find the adjustments to be bizarre just the same. Griggsville, Illinois for example has data that were shifted whole-hog by a decade into the past. Were the older data recorded in the wrong decade during the analog to digital conversion? It’s possible–I’ve seen horrible data “busts” in digital elevation data for example.

    Here is what the comparisons do for some other stations.

    Morrison: The reverse correction of almost all other records. 1930s shifted positive by about 1 C.

    Olney: 1930s adjusted cooler by 2C. Now very easy to make the current decade the warmest. Some of the oldest data is adjusted downward by 3C and a little of the most recent is adjusted upward by 0.5. The pivot point is about 2000.

    Sparta: Pivot is about 1980. Oldest data vanishes.

    Decatur: Pivot at 1970.

    Aledo: 1985 and 1998 left alone, everything else is adjusted, and while the general adjustment prior to 1998 is downward and that after 1998 is upward, the adjustments are seesaw all throughout.

    Almost every data value is adjusted in some way or another. Is this what we should expect of raw data? These don’t look like any sort of smooth adjustments…

  101. boballab (23:07:48) :

    Anyone that can make sense of this let me know:

    Missing data is difficult to deal with when it occurs in a time-series with a strong trend or cycle. So, for instance, in the S-O-N season the N value will always be colder (Northern latitudes here) than the average of S, and O, so you cannot figure a seasonal average from S and O alone, but have to have some algorithm for replacing the missing N. Apparently, from what you describe, a missing O is no big deal as they just average S and N, and this makes some sense, as O is generally intermediate to S and N in temperature. I have no idea how they handle missing terms at the front or back end of these seasons. I have lots of my inquiries to NOAA and related agencies go unanswered, but you might ask them what it is they do.
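    The point about which missing month matters can be illustrated with a small sketch. The temperatures are hypothetical, and the averaging rule shown (just average whatever months are present) is an assumption used for illustration, not a description of what NOAA or GISS actually do.

```python
def naive_seasonal_mean(months):
    """Average whichever monthly values are present (None marks a gap)."""
    present = [m for m in months if m is not None]
    return sum(present) / len(present)


# Hypothetical Sep, Oct, Nov monthly means in deg C, cooling steadily.
sep, octo, nov = 20.0, 14.0, 8.0

full = naive_seasonal_mean([sep, octo, nov])
missing_oct = naive_seasonal_mean([sep, None, nov])
missing_nov = naive_seasonal_mean([sep, octo, None])

print(full, missing_oct, missing_nov)
```

    With these numbers, dropping the middle month leaves the seasonal mean unchanged at 14.0, while dropping November raises it to 17.0, which is why a missing end month needs some replacement rule.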

    By the way, missing data are called “censored” values in the world of statistics, and isn’t that an ironic term considering the last month or so?

  102. ScottR (10:31:14) :

    I would sure like a map to all of this. Maybe a little flow chart showing the Global Warming creation process.

    This game has become so complicated that a “program” would be helpful. Unfortunately the flow-chart might end up looking like those diagrams that Mort Sahl made of the Kennedy Administration, and only good for comic effect.

  103. Mark (22:14:25) :

    Having spent a few weekends driving around upper Minnesota and the two Dakotas for surfacestations.org, I have to agree with Evan. Maybe in the Stevenson Screen days it would have been different, but the power/trenching requirements on the MMTS units seem to have done a darn thorough job of significantly biasing the rural stations too.

  104. Phil A,
    You can expect that anyone who calls people “deniers” probably isn’t a disinterested neutral party. I think the gentleman in question overreaches with his summation that this analysis effectively debunks any concerns about adjustments in the surface temperature record.

    This may show that, in total, the CRU is relatively even-handed with their adjustments once they get the data. It does not verify anything end to end, and (maybe someone can confirm or deny) is the CRU raw data really raw?

  105. JamesJ003 (11:31:38) :

    Kevin Kilty 10:35:00

    It has….McKitrick and Michaels 2007

    http://www.uoguelph.ca/~rmckitri/research/jgr07/jgr07.html

    Thanks. That is quite an interesting paper, and it does provide statistical evidence of a sizable effect from UHI. However, it does not eliminate a need for a study along the lines of the UHI project that Anthony suggests, which would be more like a direct measurement.

    Also you will note in several sections they state

    “If done correctly, temperature trends in climate data should be uncorrelated
    with socioeconomic variables that determine these extraneous factors.”

    If we are talking about identifying high-quality data, then insistence on no-correlation is almost assuredly not correct. I think that temperature measurement and adjustment, if done right, should show some correlation of this sort, as our use of energy, independently of the effect of CO2, demands that energy usage is dissipated to heat and ought to be apparent in reliable temperature readings.

  106. Let me see if I get this correct. Based on what I think you did with the data. If I were to sum up all the data points for Jan over the 100 or so years then I would see how much the temperature rose in the month of January over the 100 or so years. That might be as instructive as just knowing the deltas from year to year.

  107. I read all the responses last night, but there’s just too many to get thru today.

    The point that I’d like to make is that everyone is looking at the temperature data & finding complete fabrication. Imagine what they could have done with the grid matrix? For them it’s the mother lode!

    Think about it, for them to get the hockey stick they want, they need the temperature of the grid times the grid ‘volume’/’area.’ Wouldn’t it be easier to fudge that stuff? There is never a discussion of the grid.

  108. TonyB (08:36:05) :

    Basil 6 44 43

    A bigger problem is the stations that have been dropped from the data

  109. “Phil A,
    You can expect that anyone who calls people “deniers” probably isn’t a disinterested neutral party. I think the gentleman in question overreaches with his summation that this analysis effectively debunks any concerns about adjustments in the surface temperature record. ” – NickB

    Agree entirely about the “deniers” point – as soon as I saw that I knew we were probably not dealing with an open mind.

    Having thought about it, let’s put it this way. Say we looked at a few criminal trials and found suspicion of jury tampering. Would a statistical analysis showing overall conviction rates had not been affected mean that the identified issues simply didn’t matter? Because that’s what he’s saying.

  110. Scott R,

    “Let me get this straight: HadCRUT3 is based on real temperature measurements (with likely UHI effects), but has “corrections” and “homogenizations” in it already ”

    Yes. The HadCRUT3 data that have been released are homogenized. We don’t know by what method(s), because Phlim Phlam Phil won’t tell anyone. But they are homogenized. They likely do not include any significant correction for UHI, because Phil doesn’t believe in it.

    “— probably including extension of measured data into unmeasured areas of the earth, correct?”

    No. Extending measured data into unmeasured areas occurs during the gridding process. The homogenization process may extend measured data into measured areas, i.e. adjust some station’s data based on other stations’ data.

    “Does GISTemp use HadCRUT3 and make more corrections to it?”

    No. Despite what you may have read above, that is not how GISTemp is calculated.

    “Or do GISTemp and HadCRUT share a root data set?”

    Good question. It now appears that CRU draws heavily from GHCN, as does GISS. We don’t know what other elfin magic CRU adds, though. And even if both draw from GHCN, they undoubtedly use a different mix of stations and a different mix of the data from those stations. Until the Motley CRU release their station list and detailed methods we won’t know, but it is unlikely that the two start from exactly the same ‘root data set’.

    “I would sure like a map to all of this. Maybe a little flow chart showing the Global Warming creation process.”

    I agree. A cogent explanation of the two primary ‘global warming’ temperature estimates (GISS and CRU) and how they are derived from GHCN and/or other sources would be useful. It would document the interrelatedness of these ‘independent’ measurements, as well as catalog the holes in the public side of the process that the FOIA requests have been trying to fill.

    One of these guys with a website ought to do this. Perhaps someone already has …

  111. Seems to me that if it’s “global” warming then I should be able to find a rural site with a long history that is well sited, pull its data, and see the trend. If it’s truly global then the trend should always show up. Take one from each hemisphere if you really feel a need for multiple samples.

  112. Basil,

    You note that you used “homogenized data.” Since Nashville is an Urban site, GISS will adjust the record using surrounding rural sites. CRU make no such adjustment for Urban effects; instead CRU increase the error bands around the temperature signal to include the presumed effect.

    You should look at the station “raw” data from GISS, then also at the data after combining stations if you want to get a complete picture.

    Just a hint

  113. ScottR (10:31:14) :
    “Let me get this straight: HadCRUT3 is based on real temperature measurements (with likely UHI effects), but has “corrections” and “homogenizations” in it already — probably including extension of measured data into unmeasured areas of the earth, correct?

    Does GISTemp use HadCRUT3 and make more corrections to it? Or do GISTemp and HadCRUT share a root data set?

    I would sure like a map to all of this. Maybe a little flow chart showing the Global Warming creation process.”

    No. I know it’s confusing but here is the situation.

    GISS: use GHCN “raw” data. They make adjustments to Urban stations based on nearby Rural stations. These “adjustments” are all over the map, sometimes warming urban stations, sometimes cooling them. St. Mac has covered this. After adjustment Hansen claims his data shows the same trends as a RURAL ONLY data set. Problem? Rural isn’t Rural; see the surfacestations project.

    CRU: CRU (it would appear) also use GHCN (plus others). They make NO ADJUSTMENT. They argue (citing Peterson 2003 and Parker’s paper) that the impact of UHI is SLIGHT. Jones (see the Climategate mails in the Jan 07 time period) clarifies this for Susan Solomon prior to her Paris briefing. According to Jones (1990) the urban effect is on the order of .05C per century. CRU handle this NOT BY ADJUSTING, but by INCREASING the error bands around the dataline.

    Clear? I thought not.

  114. The site is doing a fantastic job. Forgive me if I push for guidance on several related issues. It would help me to take the battle forwards.

    The debate is now addressing raw data and historical data (the same?) and homogenised and now manipulated data. I’m concerned that so-called raw data has been changed and I am very concerned about that. I’m not clear whether it is still available or not.

    I’m also vaguely aware that the default of increased warming could be a process consequence rather than a conspiracy. I think a lot of us have strong views on that and eagerly await a conclusion. Either way, if it changes cooling or flat into warming, then at what point do we give data to the press?

    It would be very useful for those of us not in the business to understand the relationships between GISS, NOAA, CRU and Hadley (UK Met Office) and any other major players. In particular, who has the raw data, which agencies manipulate it, who publishes what, etc. Is their homogenisation legitimate or misguided or corrupt?

    For example, I am particularly annoyed by UK Met Office propaganda, but I don’t know how many data sources they use and whether the data is raw or homogenised when it reaches them. I don’t know whether they are the guilty party or the messenger. I want to hammer them but I don’t have the necessary knowledge. I’m sure that others are in this situation.

    An informed overview of this lot would be great.

  115. Anthony, we here in Wallowa County are as rural as you can get. And we don’t appreciate you degrading our BBQ’s and silver upturned boats next to the temperature house. Plus, it is DAMNED cold in the winter and a lil’ heat by the Stevenson screen helps us navigate the daily thermometer check. After all, 15 minutes in this windchill and we freeze our ears and fingers. However, I will admit that the mug of hot Tom and Jerry helps a bit. I will remind the folks that monitor the two stations we have here to take the mug’o T&J with them instead of firing up the BBQ or using the blowtorch so’s they can see the frosted glass.

    Now, for the really important stuff. My Jeep windshield frosted on the INSIDE (not fogged…it fricken froze!) this morning. Anyone know of an ice scraper shaped for the inside of the window?

  116. My own feeling, after looking at some of the actual raw data station records and reading the Climategate documents, is that the real problem CRU/GISS have is that the raw thermometer data they gathered wasn’t of sufficiently good quality for them to be able to produce meaningful anomaly graphs, and they had to ‘fudge’ it as best they could. There were also too many problems with dissonance in the various temperature proxy series for this to be much use either.

    Satellite data is much better in terms of continuity and spatial coverage, but even after it had been ‘calibrated’ to the obviously flawed thermometer data, it didn’t show the trend they believed was happening. So perhaps we shouldn’t be surprised at seeing station deletions and adjustments made to the pre-satellite data, needed to provide the +ve global temperatures needed to get continued support for the CAGW myth as our cooling Earth failed to co-operate.

    Once you start going down the slippery slope, it’s impossible to stop.

  117. I would prefer my data raw please.

    If we had ten thousand stations recording the minimum and maximum temperatures every day for 150 years, this amounts to 1,095,000,000 temperature readings, or about one gigabyte at one byte per reading. Assuming a lot of extra information is added on the site location and other details, this could blow up to perhaps 100 gigabytes. This would hardly strain a modern PC. Just how expensive is it to make such data available on the web? Consider the data streams that astronomers need to deal with. I spent a whole minute searching and found this:

    Catalina Sky Survey…. Thanks to the $890,000 NSF grant awarded this month, the CRTS team soon will construct a web site that will make roughly 10 terabytes of data taken by the CSS over the past 5 years — as well as all new CSS data that continues to stream in — available over the Internet to astronomers worldwide, professional and amateur.

    How many billions have gone to fund climate research?
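    The storage estimate above is easy to check. This is a back-of-envelope sketch only; the one-byte-per-reading figure is an assumption implied by equating roughly a billion readings with one gigabyte, and leap days are ignored.

```python
stations = 10_000
readings_per_day = 2      # daily minimum and maximum
days_per_year = 365       # leap days ignored, as in the estimate above
years = 150

readings = stations * readings_per_day * days_per_year * years
bytes_per_reading = 1     # assumption: one byte per reading
size_gb = readings * bytes_per_reading / 1e9

print(f"{readings:,} readings, about {size_gb:.3f} GB")
```

    Even at several bytes per reading the raw values would stay in the low gigabytes.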

  118. Thom (09:05:19) :

    What do you make of this?

    http://www.informationisbeautiful.net/2009/the-climate-deniers-vs-the-consensus/

    All you have to do is look at the “climate deniers vs the consensus” title to see that the author likely has no idea what s/he is talking about, because both sides would likely be wrong, or at least would most likely have positions that have nothing to do with science. But, then, what are “climate deniers” anyway? I haven’t even heard of that one!

    Going to the site, the author admits having no knowledge at all about climate science, praises Gavin Schmidt and realclimate – where openness is absent, and doesn’t understand the basic argument of many sceptics – that the ipcc and its elite Climate Scientists are not doing real Science, where you have to first prove or back up your hypotheses by following the Scientific Method, which involves ‘showing your work’ by making it – code, data, ‘materials and methods’ – very accessible to anyone interested and in a timely way.

    Otherwise, there are no scientific conclusions to either defend, promote, or criticize in the first place. It gets even worse from there for the “consensus”, pro-AGW position, which has been demonstrated well by Steve McIntyre, Richard Lindzen and many others, completely apart from what the leaked emails reveal.

    I’m afraid that the author of the linked blog has no idea of what science is. S/he’s trying, but completely off the mark.

  119. steven mosher (13:34:23) :

    Basil,

    You note that you used “homogenized data.” Since Nashville is an Urban site, GISS will adjust the record using surrounding rural sites. CRU make no such adjustment for Urban effects; instead CRU increase the error bands around the temperature signal to include the presumed effect.

    You should look at the station “raw” data from GISS, then also at the data after combining stations if you want to get a complete picture.

    Just a hint

    This, and your comment that followed, were insightful. Thanks for them.

    Let me see if I understand. The GISS adjustment is to make Nashville “rural.” And it does this by making it appear that Nashville has warmed more than it actually has? Shouldn’t the adjustment be doing just the opposite? If the “unadjusted” Nashville trend was already sloping downward, any adjustment to remove UHI should have increased the downward slope, not turned it into a positive slope! It is almost as if the adjustment, rather than removing UHI, has artificially enhanced it.

    Now maybe this is a “one-off” example, but it certainly suggests that there are situations where the UHI adjustment produces bizarre results. Of course, I’ve said several times now that I don’t think we should be trying to remove UHI from the “global temperature” metric. If I have a fever, I want to know it, I do not want a thermometer that has been hacked so that it does not measure fevers. This is just a case of being too clever by half.

  120. Basil (05:26:33) :
    YYYY xxxx xxxx …. xxxx

    where YYYY is the year, and the xxxx are monthly figures (except for GISTemp, which has a bunch of averages after the December figure). In either case, the object is to parse the line into “words” and write the 2nd through 13th word out line by line to a new file. (Well, I’d need to probably write out the first YYYY, maybe as a comment, at the start of the new file, since the data do not all start at the same year.) I realize this is a very trivial exercise…to anyone who codes daily.

    So the bleg was just to ask for some ideas about how to do this. Lots of bright and capable people read this blog. Paul Clark (WoodForTrees) has to do this to read the GISS data for his web site, so he probably has some C++ code that does it.

    Some ideas for doing such things in C++ are here

    http://arnholm.org/tmp/basil.htm
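    The row-to-column reshaping described in the quoted bleg can also be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the code behind the link above: the function name and the two-row sample are made up, and rows with fewer than 12 monthly values are simply skipped.

```python
def rows_to_column(lines):
    """Flatten 'YYYY m1 ... m12 [extras]' rows into one monthly series.

    Returns (first_year, values); values holds the 2nd through 13th
    word of every data row, in order, as strings.
    """
    first_year = None
    values = []
    for line in lines:
        words = line.split()
        if len(words) < 13 or not words[0].isdigit():
            continue  # skip headers, blank lines, and short rows
        if first_year is None:
            first_year = int(words[0])
        values.extend(words[1:13])  # ignore any averages after December
    return first_year, values


# Made-up two-year sample in the layout described above.
sample = [
    "YEAR JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC",
    "1881 18 45 102 150 200 240 260 255 220 160 90 40 148",
    "1882 25 50 95 155 205 245 265 250 215 150 85 35",
]
year, column = rows_to_column(sample)
print(f"# series starts {year}")
print("\n".join(column))
```

    Writing the starting year as a leading comment line follows the suggestion in the bleg, since the two data sets do not begin in the same year.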

  121. “Would You Like Your Temperature Data Homogenized, or Pasteurized?”

    Actually I prefer my data raw and unadulterated. It’s hard to find it that way though.

  122. JJ (07:47:45) :

    Basil,

    Turning to the balance of your analysis, you can save yourself considerable effort next time when calculating your ‘mean seasonal difference’ statistic. It is not necessary to create a difference series and tally all of its values to do that. Mathematically, your ‘mean seasonal difference’ statistic simplifies to the following equation:

    (Te – Ts)/Y

    Where:

    Ts= Temperature at the start of the analysis period.

    Te= Temperature at the end of the analysis period.

    Y = number of years in the analysis period.

    It should be much faster for you to calc that statistic next time. This does raise the following questions, however.

    There are other reasons for doing what I’m doing. I’ll come back to that. But for now, I’m trying to follow you, and not having much luck.

    At the beginning of the period, the GIS temperature was 1.8. At the end of the period, it was 13.8. Now putting aside the fact that I’m actually shy a couple of months of having 128 years, according to you, the average should be (13.8 – 1.8)/128 = 0.09375. But the actual number is negative, not positive, so your shortcut is not even in the right ballpark. I think I know why you are so far off, but I’ll let you work it out. For now, besides starting at 1.8 in January 1881, and ending at 13.8 in October 2009, the average monthly seasonal difference over 1546 months was -0.012647. Have at trying to come up with the latter just from the starting point, the end point, and the number of points.
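    For readers trying to follow this exchange, here is a minimal sketch of which values a mean of 12-month differences depends on, assuming a complete monthly series with no gaps. The data are seeded random numbers, not the Nashville record, and both function names are hypothetical. Summing x[t] - x[t-12] over the whole series cancels every interior value, leaving the sum of the last 12 values minus the sum of the first 12, divided by the number of differences.

```python
import random


def mean_seasonal_difference(x, lag=12):
    """Mean of x[t] - x[t - lag] over a complete series."""
    diffs = [x[t] - x[t - lag] for t in range(lag, len(x))]
    return sum(diffs) / len(diffs)


def endpoint_form(x, lag=12):
    """Same quantity computed from the first and last `lag` values only."""
    return (sum(x[-lag:]) - sum(x[:lag])) / (len(x) - lag)


# Synthetic monthly series (seeded random numbers, not Nashville data).
random.seed(0)
x = [random.uniform(-5.0, 30.0) for _ in range(1546)]

a = mean_seasonal_difference(x)
b = endpoint_form(x)
print(abs(a - b) < 1e-9)
```

    So for a gap-free series the statistic is fixed by 24 values (the first and last twelve months) rather than by a single start and end temperature, and a single pair such as (13.8 – 1.8) will not reproduce it.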

    You have 130 years of data at the Nashville site, which amounts to 1,560 monthly average temperatures. First, you throw out 92% of that data by choosing only to look at October. Then you throw out 98% of that October data, by using a ‘mean seasonal difference’ statistic that is determined by the values of only two of the October monthly averages – the endpoints of the period of analysis.

    Why is using only 0.13 % of the data to calculate a temperature trend superior to calculating a temperature trend from a larger portion of the data? Like, say, all of it?

    I’m sure you are trying to say something profound, but I’m not getting it. I haven’t thrown out any data. I’ve used it all because I need it all to come up with the HP smooth (the blue wavy line in the charts). And contra your absurd — sorry, the more I think about this, the more in a huff I’m getting over it — notion that I’m ignoring any data, the blue lines are very good representations of how the trend varies across time.

    I think you’ve misunderstood something big time, but I don’t know what it is. Did you get confused by the straight lines in the second figure? Those are linear trends, using good old ordinary least squares, through all the data — actual temperatures, not seasonal differences.

    I know I’m doing some unusual things with the way I’m analyzing the data and presenting the results, and so I bear the burden of trying to make sure I’ve explained it thoroughly. But readers have a burden, too, of making sure they understand what I’m doing before they go off and say I’m half-cocked.

    Looking back at what you’ve said, I’m trying hard to understand you, because you might just be right. But I cannot understand you. Maybe we could take a step back and try to understand each other better if you will clarify what you are trying to say with these statements:

    First, you throw out 92% of that data by choosing only to look at October.

    What in the world are you saying? If you want to talk about particular months, there are — just to pick one — 128 seasonal differences for April in the -0.012647 average. And 128 seasonal differences for June. And so on.

    Then you throw out 98% of that October data, by using a ‘mean seasonal difference’ statistic that is determined by the values of only two of the October monthly averages – the endpoints of the period of analysis.

    Again, what are you saying? What “two of the October monthly averages?” In truth, there are 128 Octobers in the -0.012647 average. But only 127 Novembers, or Decembers. E.g., if you multiply 128×10, and 127×2, and add them, you’ll get 1534 observations, as in the following printout of stats for this variable:

    Summary Statistics, using the observations 1881:01 - 2009:10
    for the variable 'sd_GIS' (1534 valid observations)

    Mean -0.012647
    Median 0.00000
    Minimum -10.900
    Maximum 9.5000
    Standard deviation 2.6586
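
    As a cross-check on that count, here is a minimal Python sketch (not part of the original analysis) that tallies how many seasonal differences each calendar month contributes. It assumes a complete record from 1881:01 to 2009:10 with no missing months, and the function name is hypothetical.

```python
def count_seasonal_differences(first_year, last_year, last_month):
    """Count year-over-year differences per calendar month.

    Assumes a gap-free monthly record starting in January of first_year
    and ending in last_month of last_year (an illustrative assumption).
    """
    counts = {}
    for month in range(1, 13):
        # Months after last_month have their final observation a year earlier.
        final_year = last_year if month <= last_month else last_year - 1
        # The first difference for a month needs two consecutive years.
        counts[month] = final_year - first_year
    return counts


counts = count_seasonal_differences(1881, 2009, 10)
print(counts[10], counts[11], sum(counts.values()))
```

    January through October each contribute 128 differences, November and December 127 each, which adds up to the 1534 in the printout.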

    Well, enough of this. To the bit of snark “Like, say, all of it?” all I can say is, “I did use all of it.”

    Basil

  123. Re: Basil (05:26:33), ScottA (02:14:06), & Others

    Converting 12 month rows for each year to column-stacks is a snap in Excel using the “offset” function.
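
    For anyone not working in Excel, the same reshaping can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical and the temperatures are made-up placeholder values, not Nashville data.

```python
def stack_rows(rows):
    """Turn (year, [12 monthly values]) rows into (year, month, value) records."""
    stacked = []
    for year, months in rows:
        for month, value in enumerate(months, start=1):
            stacked.append((year, month, value))
    return stacked


# Placeholder values for illustration only.
rows = [
    (1881, [3.1, 4.0, 9.2, 15.0, 20.1, 24.3, 26.0, 25.4, 21.9, 15.2, 8.8, 4.5]),
    (1882, [2.7, 5.1, 9.9, 14.6, 19.8, 24.0, 25.7, 25.1, 22.3, 15.9, 9.1, 3.9]),
]

column = stack_rows(rows)
for record in column[:3]:
    print(record)
```

    Each year's row becomes twelve consecutive records, so the result reads as one long monthly column.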

  124. Pamela,

    Up here in the Great White North ice on the inside of a windshield is a frequent winter hazard. The only recourse that I am familiar with is to use an old credit card to scrape the glass, as it will bend to conform to the shape of the window. Points cards and other useless customer loyalty cards work as well. I use an old Tim Horton’s card. But since I am dealing with a minivan, I often find it hard to reach the furthest recesses, and often fantasize about someone inventing an inside scraper. It’s been frigid here, too; in fact, record-breaking out west.

  125. Re: bill (08:44:20) & bill (07:40:13)

    Good heavens bill. Just enter the “offset” function in one cell, copy/paste, & be done.

  126. Having become more than a little disenchanted with the low level of statistical inquiry (and data handling) undertaken by GISS and CRU, I can see no choice other than for individuals to enter the fray, provide disinterested, non-partisan analysis, and make this analysis available for review by their peers.

    To this end, I am setting up a Bayesian inference engine (using MCMC) which will decompose monthly or daily temperature records for a single (or multiple locations) into the following:

    1) A within year annual cycle – using a Bayesian form of Fourier decomposition taking, say, the first 12 terms – (i.e. 1 year cycle, 1/2 year cycle, 1/3 year cycle, and so on). [Purpose is to systematically account for the strong seasonal signal].

    2) A decadal cyclical component – (i.e. time span of data series, time span of data series/2, and so on). [Purpose is to identify influences such as PDO and other long cyclical behaviour - and to place error bounds on that signal].

    3) A random walk component – possibly fitted as an AR(n) process.

    4) A linear trend (with offset) and error bounds. This is the long-run ‘climatic’ trend component and is, ultimately, the primary statistical object of interest.

    5) Measurement noise. All measurement processes are noisy. Some of the noise may be attributed to human-induced variations; other parts may be attributed to short-term weather variations.

    Does anyone have any thoughts as to the utility of this endeavour?
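
    To make the five components concrete, here is a rough Python sketch of the forward model only, i.e. how a synthetic monthly series could be built up from such pieces. It is not the MCMC engine itself, and every amplitude, the AR(1) coefficient, the use of only two seasonal harmonics, and the function name are assumed values chosen purely for illustration.

```python
import math
import random


def simulate_monthly_series(n_years=50, trend_per_year=0.01, seed=0):
    """Build a synthetic monthly temperature series from assumed components."""
    rng = random.Random(seed)
    series = []
    ar_state = 0.0
    for m in range(n_years * 12):
        t = m / 12.0  # time in years
        # 1) Annual cycle: first two Fourier harmonics (assumed amplitudes).
        seasonal = 10.0 * math.cos(2 * math.pi * t) + 2.0 * math.cos(4 * math.pi * t)
        # 2) Slow cyclical component with period equal to the record length.
        decadal = 0.3 * math.sin(2 * math.pi * t / n_years)
        # 3) Autocorrelated component, here a simple AR(1) process.
        ar_state = 0.5 * ar_state + rng.gauss(0.0, 0.3)
        # 4) Linear trend with offset.
        trend = 12.0 + trend_per_year * t
        # 5) Measurement noise.
        noise = rng.gauss(0.0, 0.5)
        series.append(seasonal + decadal + ar_state + trend + noise)
    return series


series = simulate_monthly_series()
print(len(series), round(min(series), 1), round(max(series), 1))
```

    An inference engine would run this in reverse: given an observed series, estimate the parameters of each component together with their error bounds.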

  127. I was interested in finding some historical weather data for New York City and since I’ve been living here for over fifty years, I’m familiar with details of the locations around where the data is collected.

    I was looking for the station data for Central Park in Manhattan. The station is in the middle of the park and the park hasn’t changed radically since it was first built in the 19th century. The park also provides some insulation from the streets surrounding the park. The surrounding streets were first developed in the early 20th century and also haven’t changed radically over time either.

    The reason is that this is where the temperature is taken that’s in every newspaper and TV weather report and has been for decades. Since this is the benchmark for weather reporting in New York City, there should be no reason at all for “adjustments”, “corrections”, “homogenization”, etc. Imagine telling people that the temperature they were told on the local TV and newspapers was all wrong for the past hundred years.

    What I was looking for was a single station, in the same location, in a place that hasn’t changed very much. The current location, Belvedere Castle, has been in use since 1961. Before that, it was taken at a building called The Arsenal almost at the southeast corner of the park. My idea was to take a look and see if there was any trend in that one set of data and compare it to GISS which shows a definite uptrend. My problem is that NCDC wants $70 for the data from 1961 and $200 for the complete record back to 1869. Also I’m not sure if the historical data is “as reported” or has been “corrected”. I have to think about that.

    So I thought I found a way around this by getting from the Met Office. And sure enough they do have a set but it’s marked “New York/La Guardia” (725030.txt) and goes back to 1822. Not as good though because it’s located in Queens which was pretty rural up until the 1920’s and 1930’s. Also, it’s right on the water so that will have an effect on the readings. And temperatures in Queens are cooler generally than in Manhattan (smaller buildings, less concrete, more trees).

    But wait! LaGuardia Airport (correct spelling) station didn’t start until 1935. What’s up with that? Hell, the airplane wasn’t invented until 1903. This raises some interesting questions.

    Where does the rest of the data going back to 1822 come from?

    Were a number of area stations averaged together? Which ones?

    Why wouldn’t you just use a perfectly good, continuous single-station set going back to 1869? Perhaps because it provides some opportunities for “adjusting”? Hmmm?

  128. Nick Stokes (5:5:39), and Basil (7:31:56), this needs more analysis.

    http://www.gilestro.tk/2009/lots-of-smoke-hardly-any-gun-do-climatologists-falsify-data/

    He is a Neurobiologist and very confident that the adjustment bias is zero. I was about to put in my two bits worth when I scrolled down to a comment that echoed my thoughts, by SG. SG said that if the early adjustment is down and later is up then a warming slope is produced from a neutral

    The point is, are the adjustments random on a time basis? My reading of comments here (Spain, Darwin) is that they are not. Early years are adjusted down and later years are adjusted up.
    If you look at this on a swing (around zero) basis, plus one and the minus one cancel out but the time series result is a positive warming trend.
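
    A toy example may make the point clearer. The Python sketch below uses entirely made-up numbers: a flat 100-step series, with adjustments of -1 in the first half and +1 in the second half. The adjustments average exactly zero, yet an ordinary least squares fit through the adjusted series slopes upward. The helper name is hypothetical.

```python
def ols_slope(values):
    """Ordinary least squares slope of values against their index 0..n-1."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    return sxy / sxx


# Made-up data: a perfectly flat record with time-ordered adjustments.
raw = [15.0] * 100
adjustment = [-1.0] * 50 + [1.0] * 50
adjusted = [r + a for r, a in zip(raw, adjustment)]

mean_adjustment = sum(adjustment) / len(adjustment)
print(mean_adjustment, ols_slope(raw), ols_slope(adjusted))
```

    So a histogram of adjustments centred on zero says nothing by itself about trend; what matters is whether the sign of the adjustment is correlated with time.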

  129. Richard Wakefield:

    “I have long speculated that what we are seeing is not a true increase in temps, but a narrowing of variation below the maximum temps, which will tend to increase the average daily temps, but no real physical increase is occurring. That is, what we are seeing are shorter warmer winters, with no change in summer temps, which gives an increase in average temps over time.

    Also, I’m speculating that spring can come earlier and fall come later, which one can see with the temp changes in those transition months.

    This could be the killer of AGW if this is the case because there is no real increase in temps, just less variation below max temps for each month. The alarmism of a catastrophic future disappears then doesn’t it.”

    Amazing that mainstream science hasn’t looked into this possibility as a way of testing its hypothesis. (Actually, it’s not amazing.)

  130. Basil,

    “Let me see if I understand. The GISS adjustment is to make Nashville “rural.” And it does this by making it appear that Nashville has warmed more than it actually has?”

    Has the GISS homogenization adjustment made Nashville appear warmer? The only way you would know that would be to compare to the non-homogenized data, and you have not done that. See my post above on this topic.

    “Shouldn’t the adjustment be doing just the opposite?”

    Perhaps it does. You won’t know until you compare to the non-homogenized data.

    “If the “unadjusted” Nashville trend was already sloping downward,”

    Was it? Have you looked at the non-homogenized data yet?

  131. Mike Fox (23:35:01) : See what I just wrote to JJ about “raw” data. I certainly am not interested in going back to the truly “raw” data, which is daily.

    As near as I’ve been able to work it out, the GHCN and USHCN (v2 or 1) are all “cooked” in various ways. That just leaves the online dailies….

    If there is some ‘in between’ compilation that stays “uncooked” I’ve not yet found it. (If you know of one, please post a note at my blog. Given the pace of things now and here I can’t keep up with all the threads with followups…)

    In deciding where to begin, we have to decide whether we begin with the truly raw data, or the data that was received by CRU (or GISS) from the met agencies.

    GISS does not get raw or semi-raw data. They get GHCN from NOAA / NCDC and USHCN or USHCN.v2 from NOAA / NCDC. The GHCN data are horridly “cooked” via thermometer deletion. ( I don’t have different ‘eras’ or ‘release levels’ to look at to see if there is a cooking / bias for individual stations over time… but it’s a big “Dig Here”…) USHCN is significantly broken given that the comparison of 2 different “versions” of what is supposed to be the same “raw data” can have 1/2 C variations in any one yearly average and has about 17 years completely missing from one… (IIRC it has 1883 in one and starts at 1900 in the other for Orland – yet the “adjusted data” DOES start at the earlier time. How you can have the “adjusted” start before you have the “raw” start is an “interesting question” …)

    IMHO, it is this pernicious data cooking by NOAA / NCDC that is the most egregious as it impacts EVERYONE else. Even the Japanese were snookered in that they use GHCN too.

    Much of the ruckus over FOI’s has been simply to get the latter. And I think we now have some of that with the “subset” of HadCRUT3 that has been released. I may be mistaken — somebody correct me if I am — but I think this is supposed to be the data “as received” from the met agencies. They describe the data they are releasing thusly:

    I don’t think so. Maybe they released something else too? But what I saw on the top page sounds “kind of raw” but when you look into it, the web site says the 1500 records are of the CRUcooked product.

    From:

    http://chiefio.wordpress.com/2009/12/10/met-office-uea-cru-data-release-polite-deception/

    QUOTE:
    “The data subset consists of a network of individual land stations that has been designated by the World Meteorological Organization for use in climate monitoring. The data show monthly average temperature values for over 1,500 land stations.”

    “The data” “individual land stations” “monthly average temperature values”. It all sounds like they are releasing the temperature data…
    END QUOTE

    and then further down…

    QUOTE:
    There is a link near the top of that page that mentions this is a subset of the HadCRUT3 data set… “But I thought HadCRUT3 was a product, not the “raw” data?”
    [...]
    From:

    http://www.metoffice.gov.uk/climatechange/science/monitoring/hadcrut3.html

    [...]
    “HadCRUT3 is a globally gridded product of near-surface temperatures, consisting of annual differences from 1961-90 normals. It covers the period 1850 to present and is updated monthly.

    The data set is based on regular measurements of air temperature at a global network of long-term land stations and on sea-surface temperatures measured from ships and buoys. Global near-surface temperatures may also be reported as the differences from the average values at the beginning of the 20th century.”

    So this is the product and not the data. It has the HadCRUt 1850 cutoff in it. It is based on measurements and is not itself a measurement of anything. This is not the temperature data, this is the homogenized pasteurized processed data food product.
    END QUOTE.

    I would love to be told that there was another 1500 data set released and it was the “raw” data, but as near as I can tell, this looks like CRU Crud to me.

  132. nominal (09:58:14)

    Thank You for the links. Would this information be useful for a complete global surface temperature reconstruction (being at the sea surface, not in the water) since land stations cover such a small percentage of the globe? Or has this been attempted? It’s funny how everybody ends up just using surface stations. Also there is an enormous amount of data on paper lying around never reported to NOAA, although this would require more effort than the CRU seems to have.

  133. Keith Minto (18:57:36) :

    An excellent point. I also find it a little odd that the adjustments would turn out around 0. I would expect them to be a little higher or a little lower if the adjustments were necessary because of some equipment. If the net value of the adjustment should be around 0, then why bother adjusting any of it? It seems more likely that you would want to look at each station on a station by station basis and determine what adjustments were necessary by looking at a station. If that figure came out to be as close to 0 as that distribution indicates, I would be pretty shocked.

    This would mean that the flaws in all of the stations developed in such a way that would add up to 0!!! Some stations read too hot, some too cold, which would be pretty incredible, no? It would also mean it is safe to stop adjusting the data until a new station comes online. Does GHCN/CRU look the same with the adjustments removed? That would prove the assertion made by Gilestro is correct or incorrect.

  134. ScottA (02:14:06) :

    Did I miss the bleg for transforming row to column data?

    I have not caught up yet if you have the rotation answer, but if you are in an Excel spreadsheet, simply highlight the data by dragging, then copy, select a starting cell in a blank area, then “Paste special – transpose” (box at lower right of window).

  135. Basil,

    “There are other reasons for doing what I’m doing. I’ll come back to that. But for now, I’m trying to follow you, and not having much luck.”

    I believe I have figured out our disconnect.

    I had understood that you were only using the October data for your ‘mean seasonal difference’, using it as a representative annual figure (much like we often use the annual water year minimum for certain trend analyses, which coincidently also uses October values). Under that assumption, my previous post was correct.

    I gather from your subsequent comments that you are actually using the data from all months of the year. If that is the case, I amend my previous post as follows (changes in italics):

    ***

    Turning to the balance of your analysis, you can save yourself considerable effort next time when calculating your ‘mean seasonal difference’ statistic. It is not necessary to create a difference series and tally all of its values to do that. Mathematically, your ‘mean seasonal difference’ statistic simplifies to the following equation:

    (Te – Ts)/(Y-1)

    Where:

    Ts= Temperature at the start of the analysis period.

    Te= Temperature at the end of the analysis period.

    Y = number of years in the analysis period.

    Simply apply this equation for each month you wish to include in your ‘mean seasonal difference’, and average the results.

    ***
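
    For readers who want to see why the shortcut holds, the sum of year-over-year differences for a single calendar month telescopes: every intermediate year is added once and subtracted once, leaving only the last value minus the first. The Python sketch below checks this on made-up October values (random numbers, not Nashville data). It assumes a gap-free series; with missing years the identity no longer holds exactly. Function names are hypothetical.

```python
import random


def mean_seasonal_difference(temps):
    """Average of the year-over-year differences for one calendar month."""
    diffs = [later - earlier for earlier, later in zip(temps, temps[1:])]
    return sum(diffs) / len(diffs)


def endpoint_shortcut(temps):
    """(Te - Ts) / (Y - 1), using only the first and last values."""
    return (temps[-1] - temps[0]) / (len(temps) - 1)


# Made-up series: 129 Octobers (e.g. 1881-2009) of pure noise around 15 C.
rng = random.Random(42)
octobers = [15.0 + rng.gauss(0.0, 2.0) for _ in range(129)]

full = mean_seasonal_difference(octobers)
short = endpoint_shortcut(octobers)
print(abs(full - short) < 1e-9)
```

    The two agree up to floating-point rounding. Note that this identity concerns only the mean; the spread of the individual differences still depends on every value in the series.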

    The balance of my previous comments on your method apply, with the minor correction that your method only uses 24 out of the 1,548 temperature measurements available to you to define your ‘mean seasonal difference/Trend’. You are actually using 1.5% of the data to derive your trend, not 0.13% as I had claimed earlier.

    The previous questions remain:

    Why do you consider a trend that only reflects 1.5% of the data to be superior to a trend calculated from a larger percentage of the data, such as the standard trend line that uses all of the data?

    Given that your ‘mean seasonal difference’ statistic only uses 24 datapoints (the monthly endpoints of your seasonal series) it should be apparent that the choice of those 24 points is … pretty important. Just moving one of the endpoints of your analysis period forward or backward by one year could dramatically change the ‘mean seasonal difference’ trend that you calculate.

    In fact, on a dataset with essentially zero trend (such as the homogenized GISS dataset for Nashville that shows much less than 0.1C warming over a century) you could completely flop the trend from warming to cooling and back with only tiny moves of the endpoints.

    To quantify, I have checked these assertions against the Nashville data.

    Applying your method to these data, I arrive at a ‘mean seasonal difference/trend’ (MSD) for 1881-2009 of -0.0126. This matches what you report here.

    Applying my simplified method to the same data, I arrive at an MSD for 1881-2009 of -0.0126. The simplified method works.

    Moving the start of the analysis period up only five years, the MSD flops sign from cooling to warming. MSD for 1885-2009 = 0.007.

    Move the start of the analysis period up one additional year and quit two years sooner, and you get an MSD for 1886-2007 of 0.025. By shifting one endpoint of the analysis by six years and the other 2 years, your method turns a cooling trend into a warming trend double the size.

    Is this robust?

    Incidentally, I calc’d a standard linear regression trend for the GISS homogenized data 1881-2009. I did it a couple of different ways, and end up with a cooling trend each time. What method did you use to calculate trend that shows a warming in the GISS data over that period?

  136. Re: Keith G (17:49:20)

    I have some background in that area. Great memories: Metropolis-Hastings algorithm.

    I would very strongly caution you that almost any assumptions of randomness will be suspect, aside from your #5. Many will disagree with me and I will, of course, disagree with them (respectfully if collegiality finds a way to be a 2-way street).

    The randomness issue isn’t nearly the problem it seems to be if conclusions are qualified &/or presented with necessary context. My first question is always: What assumptions are the conclusions based upon? Too many stats profs fail to impress upon their students the importance of critical thinking about assumptions, but it is (perhaps) easy to understand why, given how much algebra has to be plowed through.

    I will be interested to see what insights you can share. Every perspective sheds new light and I’ve not heard much talk of Bayesian stuff around here.

    Cheers.

  137. Re: Paul Vaughan (00:29:52) :

    Thanks for the note of caution wrt randomness. Correct me if I am wrong, but this caution refers to assumption #3?

    Input wrt model assumptions always appreciated.

    In any event, it will take me a few days to set up and debug the statistical model: I have to work through the algebra, write the Mathematica code, and debug – all in my spare time. But with Christmas looming, results may not be forthcoming quickly.

    I will start with a couple of neighbouring sites – a la Peter (and Dad) – to begin with. If there is any merit in continuing, I will expand to a larger data set.

    Insights, if any, will be shared – as will data, assumptions and code.

  138. Jeff (01:31:25) :
    I’m curious if the GISTemp data has both raw and homogenized data?
    Because if that is the case I have seen several stepladder adjustments from raw to adjusted in Pa alone …
    adjustments that can’t be justified by station moves or UHI …

    A rather complex question to answer that OUGHT to be simple. But, IMHO, the complexity shows where the “issue” starts…

    GIStemp takes in what is called “raw” data. Everyone studiously uses that label. It uses “raw” GHCN and USHCN. But when you go to NOAA / NCDC you find that the “raw” GHCN and USHCN are not raw. There is some ill defined “QA” and “homogenization” applied. AND the GHCN data are heavily biased by deleting cold thermometers from the recent past, but leaving them in the “baseline periods” of the major “value added” (GAK!) temperature series (GIStemp, HadCRUT). So “raw” isn’t “raw” and “unadjusted” is “adjusted”… There is your first clue…

    When you find you must keep two sets of books on what “is is”, well, something is smelling. And when someone says “You just don’t have the credentials” something is smelling. And when someone says “trust us”, don’t.

    So, back to your question:

    The really raw USA data go to NOAA / NCDC. It goes through their sausage grinder and comes out as part of the GHCN data set (in degrees C) or the whole USHCN data set in degrees F. Both of those are available in two forms. Adjusted and “Unadjusted”. Yet the “unadjusted” are in fact adjusted. And a short comparison of the USHCN old version (that ends in 2007) and the USHCN Version 2 shows that even those two versions of “unadjusted data” can vary by at least 1/2 F for any given annual mean. And both are different from the GHCN copy (after converting to the same C or F for comparison).

    THEN GIStemp takes this “unadjusted adjusted data” and adjusts it one heck of a lot more with a double dip of homogenizing on top of the homogenizing done to the “unadjusted” cooked data that is taken in (that GISS calls “raw” but isn’t.)

    Baffling? I think that was the whole point…

    I like to “Keep a tidy mind”. And when something is as untidy to try and think about (and keep straight) as this is, my “Bull In Our Times!” raspberry alarm clock goes off…

    So, back to PA, to sort out who did the buggering and stepping you saw, you must take the GIStemp data and compare it with the USHCN or USHCN.v2 data (depending on if you are looking at a recent chart or one from about a month ago). Then, if you find it is USHCN (either version) you would need to go “upstream” to the dailies…

    Good Luck..

    BTW, you said ” several stepladder adjustments from raw to adjusted in Pa alone”. I would assert you are most likely NOT looking at really raw data. You are most likely looking at the USHCN unadjusted data … and those are not raw even though GIStemp calls them raw… If it’s USHCN vs GIStemp end product, then you are looking at the results of GIStemp STEP1 – homogenization and GIStemp STEP2 – UHI adjustment. Of them, I’d suspect the STEP2 program PApars.f as the most likely suspect.

    (Notice what I mean about the pain of trying to keep this mess tidy in a tidy mind… you need to use phrases like “unadjusted adjusted” and “raw adjusted” and … It is just screaming deception from the tortured language needed to track the pea under the shells…)

  139. H.R. (04:57:13) :
    JustPassing (02:38:55) :

    “I think I’ll send the CRU at East Anglia a nice box of fudge for Xmas.”

    Bingo! If your lead was followed… a couple of thousand boxes of fudge arriving at CRU would certainly make a point.

    IFF you do that, please assure the fudge is overcooked and arrives stale… just like the temperature series… and having a few ‘rubber erasers’ as ‘synthetic in-fill’ would be a nice touch too…

    Come to think of it “salting a mine” is not that much different from what they were doing… so I’d make sure to put a lot of salt in, but not so much sugar…

    Just a thought… (wouldn’t want them to actually enjoy the fudge, would we ;-)

  140. Hi Guys,

    First how I converted the CRU downloads to XLS
    Moved all the files into one directory as .txt files
    You can download this as a zip file

    http://www.akk.me.uk/CRU_data.zip 3.3Mb

    Rename all the files to .xls
    You then get the data in one column
    Use Data – Text to Columns, delimited – space, to split into columns.

    Would appreciate your comments on this anomaly.

    I took the Met Office Station data for Oxford.
    They give tmax and tmin.
    For each year added the 12 month values and divided by 12 to get the average for the year.

    Calculated (tmax-tmin)/2+tmin to get the average temperature for each year.
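
    The two averaging steps above can be sketched in Python as follows. The numbers are made-up placeholders rather than Oxford readings, and the function names are hypothetical. Note that (tmax-tmin)/2+tmin is algebraically the same as (tmax+tmin)/2.

```python
def annual_mean(monthly_values):
    """Add the monthly values and divide by how many there are (12 for a full year)."""
    return sum(monthly_values) / len(monthly_values)


def mean_temperature(tmax, tmin):
    """Midpoint of tmax and tmin, written as in the comment above."""
    return (tmax - tmin) / 2 + tmin


# Placeholder monthly values for one year, for illustration only.
tmax_monthly = [7.0, 8.0, 10.0, 13.0, 17.0, 20.0, 22.0, 22.0, 19.0, 15.0, 10.0, 7.0]
tmin_monthly = [1.0, 1.0, 3.0, 4.0, 7.0, 10.0, 12.0, 12.0, 10.0, 7.0, 4.0, 2.0]

year_tmax = annual_mean(tmax_monthly)
year_tmin = annual_mean(tmin_monthly)
year_mean = mean_temperature(year_tmax, year_tmin)
print(year_tmax, year_tmin, year_mean)
```

    Comparing this figure against the annual mean of the CRU monthly values, year by year, gives the differences described below.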

    Took the CRU data for Oxford which only covers 1900-1980.
    [ I wonder why when the station data is readily available on the Met Office site ]
    Again, added each month together and divided by 12 to get the year’s average
    Compared the CRU average with the Met Office average.
    Minor differences until the last 3 years when the difference jumps from maximums of around 0.05 to 0.5

    You can download the spread sheet at

    http://www.akk.me.uk/tempdata.zip 71Kb

    Let me know if you can spot any errors.
    Thanks

  141. JJ,

    Well, I still do not see how a careful reading, in the first place, would have led to your mistake: the reference to October is preceded with “for example,” and it is sitting three lines below a chart that shows that I used all months. In any case,

    Simply apply this equation for each month you wish to include in your ‘mean seasonal difference’, and average the results.

    Yes, this would work.

    But I am not just interested in the average. I want to see the patterns in how the average changes over time. These patterns are indicative of “natural climate variation” and need to be quantified and understood before we can begin attributing the sources of climate “change.” As I said in commenting on the last figure, which is the same as the blue line in the preceding figure:

    I suggest that the above chart showing the fit through the smooth helps define the challenges we face in these issues. First, the light gray line depicts the range of natural climate variability on decadal time scales. This much – and it is very much of the data – is completely natural, and cannot be attributed to any kind of anthropogenic influence, whether UHI, land use/land cover changes, or, heaven forbid, greenhouse gases.

    This very important point is lost in fitting simple linear trends through undifferenced temperature data. When doing the latter, there is a tendency to attribute any rising trend to AGW. But that is a spurious conclusion, because the trend depends on where the start and the end are in terms of cycles in natural variation. In other words, while there might well be some AGW in a trend of rising temperatures, if you peg the trend calculation to start during a cold period, and to end during a warm period, then the trend will capture a spurious increase due to natural climate variation. In fact, this is exactly what the IPCC did, in Chapter 3 of AR4, by splitting the 20th century at 1950 to argue that warming in the second half of the century was so much greater than in the first half that it must be due to AGW. As for “robust,” and

    Moving the start of the analysis period up only five years, the MSD flops sign from cooling to warming. MSD for 1885-2009 = 0.007.

    I think you need to recheck your calculations:

    ———————————————————
    Summary Statistics, using the observations 1885:01 – 2009:10
    for the variable ‘sd_GIS’ (1498 valid observations)

    Mean -0.0025367
    Median 0.00000

    Summary Statistics, using the observations 1885:01 – 2009:10
    for the variable ‘sd_CRU’ (1498 valid observations)

    Mean -0.0064085
    Median 0.00000
    ——————————————————

    In any case, the variability of the monthly seasonal difference is so high that moving around the beginning and ending points is not going to make a “statistically significant” difference, no matter how hard you try. And that is an important point — that the volatility is that great.

    Incidentally, in your “shortcut” approach, you are the one not using all the data, and as a result of that, you cannot provide a true estimate of volatility (standard deviation) for the data. I can.
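
    To put a rough number on that volatility, the Python sketch below takes the mean, standard deviation, and observation count from the 1881-2009 summary printout earlier in the thread and computes a naive standard error of the mean. It assumes independent observations, which is a simplification: the monthly differences are autocorrelated, and the HAC standard errors in the regression printout further down are the more careful treatment.

```python
import math

# Values copied from the 1881:01 - 2009:10 summary statistics for sd_GIS.
mean = -0.012647
std_dev = 2.6586
n_obs = 1534

# Naive standard error of the mean, assuming independent observations.
std_error = std_dev / math.sqrt(n_obs)
t_ratio = mean / std_error

print(f"standard error = {std_error:.4f}")
print(f"t-ratio = {t_ratio:.3f}")
```

    The standard error comes out near 0.068, several times the size of the mean itself, so small shifts in the mean from moving the endpoints sit well inside the noise.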

    Incidentally, I calc’d a standard linear regression trend for the GISS homogenized data 1881-2009. I did it a couple of different ways, and end up with a cooling trend each time. What method did you use to calculate trend that shows a warming in the GISS data over that period?

    Here’s what I have:

    ———————————-
    OLS estimates using the 1534 observations 1882:01-2009:10
    Dependent variable: sd_GIS
    HAC standard errors, bandwidth 8 (Bartlett kernel)

    coefficient std. error t-ratio p-value
    ———————————————————
    const -0.0378095 0.162683 -0.2324 0.8162
    time 3.22807E-05 0.000172980 0.1866 0.8520

    Mean of dependent variable = -0.0126467
    Standard deviation of dep. var. = 2.65862
    Sum of squared residuals = 10835.3
    Standard error of the regression = 2.65945
    Unadjusted R-squared = 0.00003
    Adjusted R-squared = -0.00062
    Degrees of freedom = 1532
    Durbin-Watson statistic = 1.7165
    First-order autocorrelation coeff. = 0.140591
    Log-likelihood = -3676.08
    Akaike information criterion (AIC) = 7356.17
    Schwarz Bayesian criterion (BIC) = 7366.84
    Hannan-Quinn criterion (HQC) = 7360.14
    ——————————————

    What did you do about the missing value in the GIS data set?

  142. Seth (22:06:12)

    yeah, I’d say it’s not only useful, but a requirement that this data be used. And you’re right, it is only the sea -surface- temperature (SST). I think the CRU used the HADSST2 dataset for the IPCC AR4, which is composed of the:

    “…International Comprehensive Ocean-Atmosphere Data Set, ICOADS, from 1850 to 1997 and from the NCEP-GTS from 1998 to present.”

    “HadSST2 is produced by taking in-situ measurements of SST from ships and buoys…”

    http://badc.nerc.ac.uk/data/hadsst2/

    list of available marine datasets here: http://www.marineclimatology.net/wiki/index.php?title=Datasets

    IMO, the more data used, the more accurate the models… of course, there are the homogenization, “quality control” and gridding manipulation issues that affect accuracy…

  143. The US Navy believes that the Arctic Ocean will be ice free in the summer within 20 years and that our subs will no longer be able to hide.

  144. EM. When I started looking at this years ago the daily data was a real eye opener. Primarily because in California there were other sources of daily data that were pristine: daily data from the Agriculture department. Stations in the middle of crop lands. It

  145. Thom (07:19:55) :

    thanks all for your great comments and insights… have you seen this? http://www.dailymail.co.uk/news/article-1235395/SPECIAL-INVESTIGATION-Climate-change-emails-row-deepens–Russians-admit-DID-send-them.html

    That was never in doubt. Lots of us, myself included, grabbed a copy off that Russian server before it disappeared from it. Without reading the article, I think somebody may be trying to make it sound like the Russians have admitted hacking into the CRU server, which is unlikely.

  146. E.M.Smith (01:50:48) :

    “Jeff (01:31:25) :
    I’m curious if the GISTemp data has both raw and homogenized data?
    Because if that is the case I have seen several stepladder adjustments from raw to adjusted in Pa alone …
    adjustments that can’t be justified by station moves or UHI …”

    A rather complex question to answer that OUGHT to be simple. But, IMHO, the complexity shows where the “issue” starts…

    GIStemp takes in what is called “raw” data. Everyone studiously uses that label. It uses “raw” GHCN and USHCN. But when you go to NOAA / NCDC you find that the “raw” GHCN and USHCN are not raw. There is some ill defined “QA” and “homogenization” applied. …

    Thank you, sincerely, for that informative and entertaining post. However, you say “ill defined ‘QA’”. Yesterday someone on the thread about pasteurized data pointed to an NOAA web-page that explains the process steps, and includes references. Whether or not the QA is ill-defined must probably now await a look at the research papers behind the adjustments, and some judgement about them. They are pretty old papers, but NOAA appears to have just gotten around to applying them, or am I wrong here?

    In my opinion the over-all result of this QA, shown in a graphic on the web page, looks a little suspicious. For example, the net effect of all this QA is to produce an adjustment that is a hockey stick of about 0.4C magnitude, warms nearby decades, and cools the 1930s. Therefore truly raw observations with no trend, will end up after correction with an accelerated warming up to the present time. I have several projects lined-up for my X-mas break, but I may add this to the list. Perhaps a few other people can do similarly and we can compare notes.

  147. I should clarify a couple of things in my last post. I should have explained that the overall effect of these corrections is statistical: applied to the full set of data, they result in an average warming. The corrections applied to individual stations do all sorts of things. As I said in a post yesterday, the corrections applied to the station at Griggsville, Illinois did absolutely nothing to the actual values from about 1925 to 1945, but shifted them ten years into the past. Most stations end up with a warming trend if there wasn’t one before, and an enhanced warming trend if there was one before. Some of the adjustments are as large as 3C! Then there are a few stations which have adjustments that pop up and down all over.

    The most significant result is that the average of adjustments across all data is a warming trend that is an almost perfect confounder with a true greenhouse warming effect.

  148. Roger Knights (19:31:16) :

    Richard Wakefield:

    “I have long speculated that what we are seeing is not a true increase in temps, but a narrowing of variation below the maximum temps, which will tend to increase the average daily temps, but no real physical increase is occurring. That is, what we are seeing are shorter warmer winters, with no change in summer temps, which gives an increase in average temps over time.

    Also, I’m speculating that spring can come earlier and fall come later, which one can see with the temp changes in those transition months.

    This could be the killer of AGW if this is the case because there is no real increase in temps, just less variation below max temps for each month. The alarmism of a catastrophic future disappears then doesn’t it.”

    Amazing that mainstream science hasn’t looked into this possibility as a way of testing its hypothesis. (Actually, it’s not amazing.)

    Actually mainstream science was saying exactly this in the late 1990s. There was widespread agreement that the range of daily maxima and minima had narrowed, with minima increasing. Tom Karl, I think it was, pointed this out in a paper I read in Science Magazine at the time, and also pointed out that almost all of the warming had occurred in western North America and Siberia.

  149. Basil,

    “Well, I still do not see how a careful reading, in the first place, would have led to your mistake:”

    Largely because of your description of the process from your former article:

    … a particularly useful type of differencing is seasonal differencing, i.e., comparing one month’s observation to the observation from 12 months preceding. Since 12 months have intervened in computing this difference, it is equivalent to an annual rate of change, or a one month “spot” estimate of the annual “trend” in the undifferenced, or original, series.

    I’m used to using single month trends in other work, and that description suggested a similar process to me. At any rate, we are now on the same page, and the original criticisms still apply. (Speaking of original criticisms that still apply – you are still incorrectly identifying HadCrut3 data as unadjusted. It is not. HadCrut3 is homogenized data.)

    Simply apply this equation for each month you wish to include in your ‘mean seasonal difference’, and average the results.

    “Yes, this would work.”

    Now, digest the import of that. Given that it does work, it means that the trend you calculate is nothing more than the slope between the endpoints. It is a two-point trend line, and it is subject to all of the issues of a two-point trend line.

    “But I am not just interested in the average.”

    Please. Do not hand wave. That average/trend is the focus of your post. You discuss its sign (underlined and bolded). You discuss its magnitude (for two datasets). You compare that average/trend from HadCrut3 to that average/trend from GISS (incorrectly asserting that this has something to do with the GISS adjustments vs unhomogenized data). That average/trend is the only metric that you quantify.

    You even make this statement:

    Here, I’m only showing the fit relative to the smoothed (trend) data. (It is, however, exactly the same as the fit to the original, or unsmoothed, data.)

    Of course it’s the same! The trend you calculate is in no way dependent on the data between the endpoints, so smoothing the data between the endpoints has no effect. This isn’t because (as you caution the reader) you were careful to pick a smoothing method that doesn’t affect the average. No smoothing method would affect the trend, so long as it honors the endpoints. You could replace the data between the endpoints (1881 & 2009) with 126 years of Major League Baseball scores, and the trend you calculate would not change!
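
    The endpoint claim can be checked numerically. What follows is a minimal Python sketch, not anyone's actual analysis: the series is random placeholder data (not the Nashville record), and the function names are made up for illustration. It computes the mean of the 12-month differences directly, then again from the first and last twelve values alone, and then once more after overwriting everything in between with unrelated numbers.

```python
import random

def mean_seasonal_difference(series, lag=12):
    """Average of x[t] - x[t-lag] over every t where both terms exist."""
    diffs = [series[t] - series[t - lag] for t in range(lag, len(series))]
    return sum(diffs) / len(diffs)

def endpoint_formula(series, lag=12):
    """Same quantity from the first and last `lag` values only.

    The interior terms of the sum of differences cancel (telescoping sum),
    leaving (sum of last 12 months - sum of first 12 months) / (N - 12).
    """
    return (sum(series[-lag:]) - sum(series[:lag])) / (len(series) - lag)

random.seed(0)
n_years = 129  # hypothetical record length, roughly 1881-2009
original = [15.0 + random.gauss(0, 5) for _ in range(12 * n_years)]

# Overwrite everything between the first and last year with unrelated numbers.
scrambled = list(original)
for t in range(12, len(scrambled) - 12):
    scrambled[t] = random.uniform(0, 20)

msd_original = mean_seasonal_difference(original)
msd_scrambled = mean_seasonal_difference(scrambled)
msd_endpoints = endpoint_formula(original)
print(msd_original, msd_scrambled, msd_endpoints)
```

    All three printed values should agree to within floating-point rounding, which is the sense in which the mean seasonal difference only sees the first and last years.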

    I’ll admit that there is a certain ‘robustness’ to a trend that is blind to 98.5% of the data it is supposed to be describing. That does, however, render it decidedly non-robust to the values of that 1.5% of the data that the trend is wholly dependent on – the endpoints. You don’t seem to grasp that this endpoint sensitivity is a feature of your average/trend calculation, given that you say this:

    This very important point is lost in fitting simple linear trends through undifferenced temperature data. When doing the latter, there is a tendency to attribute any rising trend to AGW. But that is a spurious conclusion, because the trend depends on where the start and the end are in terms of cycles in natural variation. In other words, while there might well be some AGW in a trend of rising temperatures, if you peg the trend calculation to start during a cold period, and to end during a warm period, then the trend will capture a spurious increase due to natural climate variation.

    That is your complaint about other trend calcs, but that is precisely what your ‘mean seasonal difference/trend’ does! And it is more sensitive to endpoint effects than other trend calcs because the endpoints are all it uses!

    “I want to see the patterns in how the average changes over time. These patterns are indicative of “natural climate variation” and need to be quantified…”

    The only thing you quantify is the two-point trend.

    “This much – and it is very much of the data – is completely natural, and cannot be attributed to any kind of anthropogenic influence, whether UHI, land use/land cover changes, or, heaven forbid, greenhouse gases.”

    That is an unsupported assertion. I believe it to be incorrect. A difference series filters stepwise adjustments pretty well. It does not capture continuous trends worth a damn.

    “I think you need to recheck your calculations:”

    I did. My calculations are correct.

    As are yours, except that your calculation represents the trend 1884-2009. Mine represents 1885-2009. The 1884-2009 trend is -0.0025. The 1885-2009 trend is 0.007. This quite neatly illustrates the problem with your two-point trend. Change the period by one year, either intentionally or (as here) by accident, and the trend can flop direction. Result is a trend going more than twice as fast the other way. From a one year shift in endpoint.

    Run the numbers for yourself, pushing the start date back and forth by five or six years, and the end date back and forth two or three. Watch the trend flop around.
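
    A sketch of that exercise in Python, under loud assumptions: the annual means below are a toy 11-year sine cycle with no underlying trend, not Nashville data, and the two-point slope on annual means stands in for the mean seasonal difference (on the argument above that it is just the slope between the endpoints). The helper names are hypothetical.

```python
import math

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def two_point_slope(xs, ys):
    """Slope of the straight line through the first and last points only."""
    return (ys[-1] - ys[0]) / (xs[-1] - xs[0])

# Toy annual means: an 11-year cycle around 15 C with NO underlying trend.
years = list(range(1881, 2010))
temps = [15.0 + 0.5 * math.sin(2 * math.pi * (y - 1881) / 11) for y in years]

two_point = {}
ols = {}
for start in range(1881, 1892):  # slide the start date across 11 years
    i = start - 1881
    two_point[start] = two_point_slope(years[i:], temps[i:])
    ols[start] = ols_slope(years[i:], temps[i:])

for start in sorted(two_point):
    print(start, f"{two_point[start]:+.5f}", f"{ols[start]:+.5f}")

spread_two_point = max(two_point.values()) - min(two_point.values())
spread_ols = max(ols.values()) - min(ols.values())
print("spread:", spread_two_point, spread_ols)
```

    On this toy series the two-point slope takes both signs as the start year moves. Real data with noise and gaps will behave differently in detail, so this only illustrates the mechanism.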

    “What did you do about the missing value in the GIS data set?”

    As I carry my analyses to the end of 2009, there are three missing values in my dataset (Aug 2002, Nov-Dec 2009). I interpolate the Aug 2002 value from the previous and following years, and copy forward Nov-Dec 2008 to fill in the missing 2009 values. This greatly increases the ease of computation, and has no effect on the trend results thru the third decimal place.
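
    The gap-filling rule described here is simple enough to sketch. This Python fragment assumes a dictionary keyed by (year, month) with None for missing months; the function name and all temperature values are placeholders for illustration, not the actual record.

```python
def fill_gaps(monthly):
    """Return a copy of {(year, month): temp} with None entries filled.

    Rule 1: average the same month from the previous and following years.
    Rule 2: if the following year is unavailable, copy the previous year forward.
    Entries that cannot be filled by either rule are left as None.
    """
    filled = dict(monthly)
    for (year, month), value in monthly.items():
        if value is not None:
            continue
        prev_val = monthly.get((year - 1, month))
        next_val = monthly.get((year + 1, month))
        if prev_val is not None and next_val is not None:
            filled[(year, month)] = (prev_val + next_val) / 2
        elif prev_val is not None:
            filled[(year, month)] = prev_val
    return filled

# Placeholder values, NOT the actual Nashville record.
sample = {
    (2001, 8): 25.0, (2002, 8): None, (2003, 8): 26.0,
    (2008, 11): 10.0, (2008, 12): 5.0,
    (2009, 11): None, (2009, 12): None,
}
filled = fill_gaps(sample)
print(filled[(2002, 8)], filled[(2009, 11)], filled[(2009, 12)])
```

    Because the copied-forward months sit at the very end of the record, they feed directly into any endpoint-based trend, so it is worth rerunning with and without them.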

  150. Kevin Kilty (11:56:40) : Your point about narrowing of maxima/minima concerns the problems of inland land temperature measurement with Stevenson screens reading an air layer that may be convenient for humans but inappropriate for consistent long term records.
    In another thread it was stated that the problem with climate science is that it deals with averages and not absolutes. If the interest is in warming (that certainly seems to be the bias), why not select maxima only? This would stop the dispute about urban/rural sites and UHI. But that might spoil the game.

    On another issue, why not look at temperature records (maxima) on small islands like Malta and Rapanui? My reading of these is that they are remarkably stable and measure, for the most part, sea breezes.

  151. JJ,

    Run the numbers for yourself, pushing the start date back and forth by five or six years, and the end date back and forth two or three. Watch the trend flop around.

    As I said earlier, all these changes are well within the range of error because of the large standard deviation.

    We could argue all day and night about the “right” way to compute a trend, because there is no clear cut answer. “It depends.” The method I’m using has its usefulness in showing how temperature varies because of natural climate variation. The variation you see in the final figure can be quantified in other ways, as well, such as with wavelet transforms, and spectral analysis. At this point, our p***ing match has taken us away from the main point of the post, and has done nothing to advance the discussion. However you slice and dice it, the GISS homogeneity adjustment increases the trend compared to HadCRUT. It would be nice to understand the reason for this.

    If you want to add substantively to the discussion, maybe you could offer your take on my response to a comment by Steven Mosher. I said:

    Let me see if I understand. The GISS adjustment is to make Nashville “rural.” And it does this by making it appear that Nashville has warmed more than it actually has? Shouldn’t the adjustment be doing just the opposite? If the “unadjusted” Nashville trend was already sloping downward, any adjustment to remove UHI should have increased the downward slope, not turn it into a positive slope! It is almost as if the adjustment, rather than removing UHI, has artificially enhanced it.

    Any thoughts?

  152. Jeff, you write “The really raw USA data go to NOAA / NCDC. It goes through their sausage grinder and comes out as part of the GHCN data set (in degrees C) or the whole USHCN data set in degrees F.”

    Do you really know that for GHCN?

    Here is a description of what could have happened to the data from Darwin Australia (if it was one of the stations supplied to GHCN), but it might have been changed up or down since this reference was written:

    Key
    ~~~
    Station
    Element (1021=min, 1001=max)
    Year
    Type (1=single years, 0=all previous years)
    Adjustment
    Cumulative adjustment
    Reason : o= objective test
    f= median
    r= range
    d= detect
    documented changes : m= move
    s= stevenson screen supplied
    b= building
    v= vegetation (trees, grass growing, etc)
    c= change in site/temporary site
    n= new screen
    p= poor site/site cleared
    u= old/poor screen or screen fixed
    a= composite move
    e= entry/observer/instrument problems
    i= inspection
    t= time change
    *= documentation unclear

    14015 1021 1991 0 -0.3 -0.3 dm
    14015 1021 1987 0 -0.3 -0.6 dm*
    14015 1021 1964 0 -0.6 -1.2 orm*
    14015 1021 1942 0 -1.0 -2.2 oda
    14015 1021 1894 0 +0.3 -1.9 fds
    14015 1001 1982 0 -0.5 -0.5 or
    14015 1001 1967 0 +0.5 +0.0 or
    14015 1001 1942 0 -0.6 -0.6 da
    14015 1001 1941 1 +0.9 +0.3 rp
    14015 1001 1940 1 +0.9 +0.3 rp
    14015 1001 1939 1 +0.9 +0.3 rp
    14015 1001 1938 1 +0.9 +0.3 rp
    14015 1001 1937 1 +0.9 +0.3 rp
    14015 1001 1907 0 -0.3 -0.9 rd
    14015 1001 1894 0 -1.0 -1.9 rds

    Source: Aust Bureau of Meteorology
    ftp://ftp2.bom.gov.au/anon/home/bmrc/perm/climate/temperature
    File alladj.utx.Z

    This has several hundred Australian stations. So when you talk of really raw data going to NOAA, I am completely unsure what the BOM supplies because I have never seen a reference to what is supplied. It could range from unmanipulated data to rather thoroughly adjusted data. I simply do not know what is sent, but there is a possibility that it is already adjusted.
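
    The key and the rows quoted above can be checked mechanically. A minimal Python sketch, assuming (as one reading of the listing, not documented BOM behaviour) that a Type 0 adjustment carries into the running total for all earlier years, while a Type 1 adjustment is added to the running total for its single year only and is not carried. The function name is hypothetical; the rows are copied from the listing.

```python
from decimal import Decimal

ROWS = """\
14015 1021 1991 0 -0.3 -0.3 dm
14015 1021 1987 0 -0.3 -0.6 dm*
14015 1021 1964 0 -0.6 -1.2 orm*
14015 1021 1942 0 -1.0 -2.2 oda
14015 1021 1894 0 +0.3 -1.9 fds
14015 1001 1982 0 -0.5 -0.5 or
14015 1001 1967 0 +0.5 +0.0 or
14015 1001 1942 0 -0.6 -0.6 da
14015 1001 1941 1 +0.9 +0.3 rp
14015 1001 1940 1 +0.9 +0.3 rp
14015 1001 1939 1 +0.9 +0.3 rp
14015 1001 1938 1 +0.9 +0.3 rp
14015 1001 1937 1 +0.9 +0.3 rp
14015 1001 1907 0 -0.3 -0.9 rd
14015 1001 1894 0 -1.0 -1.9 rds
""".strip().splitlines()

def check_cumulative(rows):
    """Recompute the cumulative column and list rows that disagree with it."""
    running = {}  # element code -> running total of Type 0 adjustments
    mismatches = []
    for line in rows:
        station, element, year, kind, adj, cum, reason = line.split()
        expected = running.get(element, Decimal("0")) + Decimal(adj)
        if kind == "0":
            # ASSUMPTION: Type 0 ("all previous years") is carried backwards.
            running[element] = expected
        if expected != Decimal(cum):
            mismatches.append((element, year, str(expected), cum))
    return mismatches

mismatches = check_cumulative(ROWS)
print(mismatches)
```

    Decimal arithmetic is used so that tenths of a degree compare exactly. An empty list means the cumulative column is consistent with that reading of the Type field for these fifteen rows; it says nothing about whether the adjustments themselves are justified.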

    Same applies to CRU. The Darwin data that CRU released recently is almost identical to a set that Warwick Hughes was given in about 1993, until about 1991. Then it diverges by about 0.3 deg C from recent BOM online data in a way that I cannot understand. But there is no way I can check if this data set was then used by CRU with or without further adjustment, when making global models. There is too much obfuscation.

  153. Geoff Sherrington (19:55:36) :
    Jeff, you write “The really raw USA data go to NOAA / NCDC. It goes through their sausage grinder and comes out as part of the GHCN data set (in degrees C) or the whole USHCN data set in degrees F.”

    Do you really know that for GHCN?

    Um, I think that quote was from me ;-)

    I was particularly talking about the “raw USA data”. I don’t know what goes into GHCN from other countries nor how it has been “adjusted” (and sadly, I suspect few people on the planet do… probably only the person at NOAA/NCDC that works with that individual country BOM and feeds the ‘product’ into the GHCN sausage mill … )

    So if you can get your hands on that 0.3 C lower earlier data set, it is a valuable bit of forensic evidence… From what I’ve seen the Australian BOM is engaged in the same shenanigans as NCDC / CRU et al. There are 110 lines of text in the emails that look to be Australian:

    Snow-Book:~/Desktop/FOIA/mail chiefio$ grep Austral * | wc -l
    110

    So lots of “dig here” for folks Down Under. Like this email where they talk about a WUWT posting and how to cool down a warm blip in the 1940’s along with a complaint that ” Neville has never been successful getting any OZ funding to sort out pre-1910 temps” so I’d guess The Aussie BOM has a Neville?

    Snow-Book:~/Desktop/FOIA/mail chiefio$ cat 1254147614.txt
    From: Phil Jones
    To: Tom Wigley
    Subject: Re: 1940s
    Date: Mon Sep 28 10:20:14 2009
    Cc: Ben Santer

    Tom,
    A few thoughts
    [1]http://ams.allenpress.com/archive/1520-0442/preprint/2009/pdf/10.1175_2009JCLI3089.1.pdf
    This is a link to the longer Thompson et al paper. It isn’t yet out in final form – Nov09
    maybe?
    [2]http://wattsupwiththat.com/2009/09/24/a-look-at-the-thompson-et-al-paper-hi-tech-wiggle-matching-and-removal-of-natural-variables/
    is a link to wattsupwiththat – not looked through this apart from a quick scan. Dave
    Thompson just emailed me this over the weekend and said someone had been busy! They seemed
    to have not fully understood what was done.
    Have looked at the plots. I’m told that the HadSST3 paper is fairly near to being
    submitted, but I’ve still yet to see a copy. More SST data have been added for the WW2 and
    WW1 periods, but according to John Kennedy they have not made much difference to these
    periods.
    Here’s the two ppts I think I showed in Boulder in June. These were from April 09, so
    don’t know what these would look like now. SH is on the left and adjustment there seems
    larger, for some reason – probably just British ships there?
    Maybe I’m misinterpreting what you’re saying, but the adjustments won’t reduce the 1940s
    blip but enhance it. It won’t change the 1940-44 period, just raise the 10 years after Aug
    45.
    I expect MOHC are looking at the NH minus SH series re the aerosols. My view is that a
    cooler temps later in the 1950s and 1960s it is easier to explain.
    Land warming in the 1940s and late 1930s is mainly high latitude in NH.
    One other thing – MOHC are also revising the 1961-90 normals. This will likely have more
    effect in the SH.
    With the SH around 1910s there is the issue of exposure problems in Australia – see
    Neville’s paper.
    This shouldn’t be an issue in NZ – except maybe before 1880, but could be in southern
    South America. New work in Spain suggest screens got renewed about 1900, so maybe this
    happened in Chile and Argentina, but Mossmann was head of the Argentine NMS so he may have
    got them to use Stevenson screens early.
    Neville has never been successful getting any OZ funding to sort out pre-1910 temps
    everywhere except Qld.
    Here’s a paper in CC on European exposure problems. There is also one on Spanish series.
    Cheers
    Phil
    At 06:25 28/09/2009, Tom Wigley wrote:

    I cut off the quote of an earlier email – EMSmith

  154. steven mosher (10:06:50) :
    EM. When I started looking at this years ago the daily data was a real eye opener. Primarily because in California there were other sources of daily data that were pristine: daily data from the Agriculture department. Stations in the middle of crop lands. It

    Looks like a sudden end at “It”… but yes, I’d expected to see thousands of thermometers in California. The “4 at the beach” really was an eye opener… We ought to have department of Ag, and all those forest fire stations, and all the little puddle jumper airports, and …

  155. Keith Minto (14:56:16) :

    Maybe this is off topic for the thread, but I was hoping you or Basil or anyone really, may be able to answer. Why not just study the mode for each continent and Micronesia? If it is getting warmer, then the modes would be higher each year, yes?

  156. Basil,

    “As I said earlier, all these changes are well within the range of error because of the large standard deviation. We could argue all day and night about the ‘right’ way to compute a trend, because there is no clear cut answer. ‘It depends.'”

    Your articles here have put the issue of choosing trend calcs in the center of the table, both by decrying the use of other trends as being sensitive to endpoint choice, and by offering this alternate method to allegedly correct that deficiency. And yet that method is demonstrably more susceptible to such effects. Unlike other trend calcs, such as standard linear regression, your method only uses the endpoints of the data. You need to be prepared to address the issues you raise.

    “The method I’m using has its usefulness in showing how temperature varies because of natural climate variation.”

    That remains an unsupported assertion. And unquantified.

    “The variation you see in the final figure can be quantified in other ways, as well, such as with wavelet transforms, and spectral analysis.”

    Yet more obtuse analyses are not the answer, when at issue is a method that, while much simpler, seems to have successfully hidden a two-point trend.

    “At this point, our p***ing match…”

    I do not consider this a p!$$ing match. It is worrying that you do. Teamish. Your article contains a couple of large, obvious errors, and a few untested methods and concepts that require examination. You should be far more receptive to constructive criticism and peer review than you are, IMHO. I am trying to help.

    “… has taken us away from the main point of the post, and has done nothing to advance the discussion.”

    Nonsense. This discussion has been about your sd method, and that method was fully two thirds of your post.

    The other third of your post was your mistaken interpretation of GISS vs Hadcrut3. Why have you not addressed the fact that, contrary to the assumptions and claims of your post, HadCrut3 is not the antecedent to GISS, and HadCrut3 is homogenized data?

    “However you slice and dice it, the GISS homogeneity adjustment increases the trend compared to HadCRUT.”

    The use of the word ‘increases’ here is misleading. It implies that GISS starts with HadCrut3 and operates to change a cooling trend to a warming trend. That is not true.

    First, GISS does not start with HadCrut3 data. GISS starts with unadjusted data. Those data are available to you. Use them.

    Second, it is not clear that the homogenized GISS data in fact show a warming trend. I have fit linear regressions to the homogenized GISS Nashville data a couple of different ways (monthly and yearly), and it shows a small cooling trend both ways.

    Earlier, I asked you how you calculated the trend in homogenized GISS data (your second figure), and your response was to post stats for your transformed SD data. How did you arrive at a warming trend in non-sd transformed GISS data?

    “If you want to add substantively to the discussion,…”

    Snarkiness is not necessary. Nor is it supported by your position.

    “… maybe you could offer your take on my response to a comment by Steven Mosher.”

    OK:

    “Let me see if I understand. The GISS adjustment is to make Nashville “rural.” And it does this by making it appear that Nashville has warmed more than it actually has?”

    No.

    In point of fact, the GISS homogeneity adjustments for Nashville change a slight warming trend in the unhomogenized data into a slight cooling trend. I learned this by downloading the freely available unadjusted GISS temp data and running a comparison. It took all of six or seven minutes.

    “Any thoughts?”

    Yes.

    My thought is that you would have known that the GISS homogeneity adjustments for Nashville induce a cooling trend, had you bothered to download the unhomogenized GISS data and use it for your comparison, instead of using HadCrut3 data and pretending that it was unadjusted data.

    You need to correct your article.

  157. JJ,

    I wrote:

    “The method I’m using has its usefulness in showing how temperature varies because of natural climate variation.”

    You responded:

    That remains an unsupported assertion. And unquantified.

    And I wrote:

    “The variation you see in the final figure can be quantified in other ways, as well, such as with wavelet transforms, and spectral analysis.”

    To which you responded:

    Yet more obtuse analyses are not the answer, when at issue is a method that, while much simpler, seems to have successfully hidden a two-point trend.

    What is your simpler method here, to demonstrate the range and frequency of natural climate variation? A simple trend line through 130 years of data? The method is the same that Anthony and I used to quantify an effect, in global temperatures, that could plausibly be related to lunisolar influences. While I haven’t looked specifically at the Nashville data in this respect, the kind of cycles that are revealed in the final figure are similar to the pattern of drought cycles in the West that have been extensively studied, often just with frequency analysis. So it is an advance in analytical method to be able to demonstrate these cycles in the time domain, using a method that is perhaps less obtuse than wavelet analysis.

    I do not understand your hostility to this point. I may understand it to the point I take up next, but to so flatly deny any usefulness to the method I’ve used in the context of the study of the range and frequency of natural climate variation, while at the same time championing the superiority of fitting a simple linear trend suggests to me that you are trying hard to find something not to like in all of this.

    Allow me to frame this issue between us as I see it. Yes, the “average trend” over any period is more customarily estimated using linear least squares. I do not deny that. But there are a number of potential difficulties in interpreting such an “average trend.” The most important is that a fundamental assumption, that the deviations around the trend line be random, is hardly, if ever, met when fitting trend lines to temperature data. And the reason for that is quite simple: temperature is not random. There are very clear patterns of natural climate cycles in temperature data, in which it rises and falls over roughly decadal time frames. Being able to quantify and delineate the range and frequency of those cycles is a relevant task for climate science. You may dispute whether the method I’ve proposed is the best way to do that. But you may not dispute that what you seem to be championing here, simple linear trends, will not do it.

    Relatedly, because there are cycles that can be discerned in temperature data, the method of fitting a linear trend line through temperature data is easily subjected to cherry picking, and critically dependent upon start and stop dates. I mentioned in one of my replies that this problem underlies the IPCC’s Chapter 3 of AR4 computation of a linear trend in global temperature for the second half of the 20th century. As you’ve well shown, the method I’m using also appears to be subject to cherry picking, in that the result can be very different just by changing the start or stop date by a few years. But there is a difference. With linear regressions through undifferenced temperature data, there is the false sense of precision given by standard deviations that make the trend appear to be significantly different than zero. Now that is not the case here, because even linear trends through the undifferenced data are not significantly different than zero. In any case, in the method I’m using, none of the differences created by changing the start and stop date were statistically significant.

    Moving on…

    I wrote:

    “However you slice and dice it, the GISS homogeneity adjustment increases the trend compared to HadCRUT.”

    You responded:

    The use of the word ‘increases’ here is misleading. It implies that GISS starts with HadCrut3 and operates to change a cooling trend to a warming trend. That is not true.

    Where do you get the idea that my statement “implies that GISS starts with HadCRUT3?” Regardless of what HadCRUT3 represents, the statement is factually true and correct: the GISS “homogeneity adjustment” does increase the trend compared to HadCRUT3. That is the only conclusion possible from my first figure. Nowhere did I say, nor do I think I implied, that HadCRUT3 is the same as GISS before applying the homogeneity adjustment.

    On this:

    Second, it is not clear that the homogenized GISS data in fact show a warming trend. I have fit linear regressions to the homogenized GISS Nashville data a couple of different ways (monthly and yearly), and it shows a small cooling trend both ways.

    Earlier, I asked you how you calculated the trend in homogenized GISS data (your second figure), and your response was to post stats for your transformed SD data. How did you arrive at a warming trend in non-sd transformed GISS data?

    First, a clarification is called for. My second figure plots trends in the seasonal differences, and since the labels do not make that clear, I can understand how I created some confusion there.

    However, some question still remains about our respective data sets for GISS, because when I use the undifferenced, i.e. the “original”, data, I get:

    ------------------------------------------------------------
    OLS estimates using the 1546 observations 1881:01-2009:10
    Dependent variable: GIS
    HAC standard errors, bandwidth 8 (Bartlett kernel)

               coefficient    std. error     t-ratio    p-value
    ------------------------------------------------------------
    const      15.1365        0.396033       38.22      1.69E-225  ***
    time       1.75087E-05    0.000446020    0.03926    0.9687
    ------------------------------------------------------------

    Still positive, not negative as you seem to be coming up with.

    For comparison, here’s CRU:

    ------------------------------------------------------------
    OLS estimates using the 1546 observations 1881:01-2009:10
    Dependent variable: CRU
    HAC standard errors, bandwidth 8 (Bartlett kernel)

               coefficient     std. error     t-ratio    p-value
    ------------------------------------------------------------
    const      15.9017         0.389643       40.81      1.30E-247  ***
    time       -0.000307072    0.000440654    -0.6969    0.4860
    ------------------------------------------------------------
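
    For anyone wanting to rerun a regression of this shape (OLS of temperature on a monthly time index, with HAC standard errors and a Bartlett kernel), here is a minimal plain-Python sketch. Several things are assumptions: the function name is hypothetical, “bandwidth 8” is taken to mean a maximum lag of 8 with weights 1 - j/9, no small-sample correction is applied, and the input series is synthetic. It is not expected to reproduce the figures above exactly.

```python
import math
import random

def ols_trend_hac(y, max_lag=8):
    """OLS fit of y on a time index 1..n with an intercept.

    Returns (intercept, slope, hac_se_of_slope). The standard error uses
    Newey-West weights w_j = 1 - j / (max_lag + 1) (Bartlett kernel),
    with no small-sample correction.
    """
    n = len(y)
    t = list(range(1, n + 1))
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    t_dev = [ti - t_bar for ti in t]
    sxx = sum(d * d for d in t_dev)
    slope = sum(d * (yi - y_bar) for d, yi in zip(t_dev, y)) / sxx
    intercept = y_bar - slope * t_bar
    resid = [yi - intercept - slope * ti for yi, ti in zip(y, t)]
    # Score contributions for the slope: demeaned regressor times residual.
    g = [d * u for d, u in zip(t_dev, resid)]
    long_run = sum(gi * gi for gi in g)
    for j in range(1, max_lag + 1):
        weight = 1.0 - j / (max_lag + 1)
        long_run += 2.0 * weight * sum(g[i] * g[i - j] for i in range(j, n))
    se = math.sqrt(max(long_run, 0.0)) / sxx
    return intercept, slope, se

# Synthetic monthly series: seasonal cycle plus autocorrelated noise, no trend.
random.seed(1)
n_obs = 1546  # same count as the output above; the values are made up
noise, series = 0.0, []
for k in range(n_obs):
    noise = 0.3 * noise + random.gauss(0, 1.5)
    seasonal = -10.0 * math.cos(2 * math.pi * k / 12)
    series.append(15.0 + seasonal + noise)

intercept, slope, se = ols_trend_hac(series, max_lag=8)
print(f"const {intercept:.4f}  time {slope:.3e}  se {se:.3e}  t {slope / se:.3f}")

# Sanity check on noiseless data: y = 2 + 0.5 * t should be recovered exactly.
exact = ols_trend_hac([2.0 + 0.5 * k for k in range(1, 51)])
```

    Note that the slope here is per month, as in the output above, so it needs multiplying by 12 (or 1200) to compare with per-year or per-century figures quoted elsewhere in the thread.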

    I asked earlier about missing data. Let’s compare the specific values we input for August 2002. I input 26.5, which is -0.1 from the HadCRUT3 value, same as in the month preceding, and the month after. What did you use for the missing value?

    Which brings me back to the issue of what, if any, is the relationship between GISS (w/homogeneity adjustment) and HadCRUT (as released). They start off exactly the same in recent years. When they start to differ, they differ in a very systematic fashion, which might well imply that GISS is adjusting something that is close to the same, if not the same, as the HadCRUT numbers. Reading back at what I wrote, I do see where I refer to the HadCRUT numbers as “unadjusted.” On that, you wrote:

    The other third of your post was your mistaken interpretation of GISS vs Hadcrut3. Why have you not addressed the fact that, contrary to the assumptions and claims of your post, HadCrut3 is not the antecedent to GISS, and HadCrut3 is homogenized data?

    I do not know if it is the antecedent or not. The systematic nature of the differences suggests some relationship, but I do not know what it is. As for HadCRUT3 being homogenized, I’ll concede you half a point here. All the monthly data has been adjusted to some degree or another. I do believe that GISS is here making an adjustment, which they call a “homogeneity” adjustment, which is in addition to whatever adjustments are in the HadCRUT3 data. On the latter, the UK Met Office says:

    The data that we are providing is the database used to produce the global temperature series. Some of these data are the original underlying observations and some are observations adjusted to account for non climatic influences, for example changes in observations methods.

    I take this to mean they haven’t done anything to the data they receive from the various national met agencies. Now that may not be entirely correct, but I think it is undisputed that GISS’ “homogeneity” adjustment serves a purpose for which HadCRUT has clearly not done anything comparable. GISS’ adjustment is to “homogenize” urban stations so that they look more like the surrounding rural stations:

    Hansen et al. modify the GHCN/USHCN/SCAR data in two steps to get to the station data on which all their tables, graphs, and maps are based: in step 1, if there are multiple records at a given station, these are combined into one record; in step 2 they adjust the non-rural stations in such a way that their long-term trend of annual means matches that of the mean of the neighboring rural stations. Records from urban stations without nearby rural stations are dropped.

    Source: http://cdiac.ornl.gov/trends/temp/hansen/hansen.html

    You are not saying that CRU has done something comparable to GISS’ “step 2” here, are you? My take is that HadCRUT has done something similar to GISS’ “step 1,” but not its “step 2.” Since the latter is in the global temperature data set for GISS that so many of us track, but not in the HadCRUT data set, it may well explain why GISS seems to have more warming than HadCRUT. But if so, I still contend that is a perverse result: if the purpose of the adjustment is to “ruralize” the urban locations, i.e. remove UHI effects, then GISS should show less warming than HadCRUT, not more. What’s up with that?

    Basil

  158. Basil,

    “Where do you get the idea that my statement “implies that GISS starts with HadCRUT3?””

    It is implicit in the logic of the discussion of your article, and explicit in your further comments. Quoting you from above:

    “I didn’t say HadCRUT3 was “raw.” I referred to it as “unadjusted.” Now what I meant by that is that it should be the same as GISS before GISS applies its “homogeneity” adjustment.”

    Yet you now say:

    “Nowhere did I say, nor do I think I implied, that HadCRUT3 is the same as GISS before applying the homogeneity adjustment.”

    Are you channeling Michael Mann? Is Phil Jones feeding you these lines? You are currently acting like a Team player.

    Your article has several demonstrable errors, and you now appear to be in full denial mode.

    If you will not correct your article, Anthony should pull it before you cause further embarrassment to yourself and this site.

  159. JJ,

    “Your article has several demonstrable errors, and you now appear to be in full denial mode.”

    Please list, specifically, the demonstrable errors, as opposed to matters of judgment about methods or interpretation of results.

    After reviewing them, I’ll consider what to do about them.

    Basil

  160. When looking at various ways of analyzing data derived from observation and experimentation, it would be helpful to present a synthetic data model (a known signal with specified noise) to see how each analysis technique can tease out information from within the noise.

    The observation data you are using is from a turbulent system that has fluctuations over timescales ranging from below your sampling interval to longer than the available data. If you use synthetic data with properties similar to your observational data you can identify the strengths and weaknesses of the competing analysis procedures. Specify a temperature series with short and long timescale fluctuations with an underlying linear trend and see if either of the analysis techniques can pick out the trend accurately. Then apply each technique to the observational data using what you know from the synthetic analysis for interpretation.
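
    The synthetic test suggested above can be sketched roughly as follows. This is a minimal illustration, not anyone's actual analysis: the function names, the noise model (AR(1)), and every amplitude are assumptions chosen for the example. It compares an ordinary least squares fit with the mean of 12-month differences, on the assumption that the article's “sd” series is a seasonal difference.

```python
import numpy as np

def make_synthetic(n_months=1548, trend_per_decade=0.05, seasonal_amp=10.0,
                   slow_amp=0.3, noise_sd=2.0, ar1=0.3, seed=0):
    """Monthly series: known linear trend + seasonal cycle + slow wave + AR(1) noise.

    All default values are illustrative assumptions, not fitted to any station.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_months)
    trend = trend_per_decade / 120.0 * t                        # degrees per month
    seasonal = seasonal_amp * np.cos(2.0 * np.pi * t / 12.0)    # annual cycle
    slow = slow_amp * np.sin(2.0 * np.pi * t / (60.0 * 12.0))   # 60-year wave
    shocks = rng.normal(0.0, noise_sd, n_months)
    noise = np.zeros(n_months)
    for i in range(1, n_months):                                # short-timescale persistence
        noise[i] = ar1 * noise[i - 1] + shocks[i]
    return t, trend + seasonal + slow + noise

def ols_trend_per_decade(t, y):
    """Least squares slope of y on t (months), scaled to degrees per decade."""
    return np.polyfit(t, y, 1)[0] * 120.0

def seasonal_diff_trend_per_decade(y):
    """Mean 12-month difference (degrees per year), scaled to degrees per decade."""
    return np.mean(y[12:] - y[:-12]) * 10.0

if __name__ == "__main__":
    t, y = make_synthetic()
    print("true trend        :", 0.05)
    print("OLS estimate      :", ols_trend_per_decade(t, y))
    print("seasonal-diff est.:", seasonal_diff_trend_per_decade(y))
```

    Rerunning with different seeds, or with a larger slow_amp, gives a feel for how far each estimate can stray from the known trend before either method is applied to the real record.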

  161. Mark Nornberg (10:23:47) : Isn’t the data used by IPCC, the temperature records over 150 years that are compiled from several sources, spliced together, and sparser or denser over time, also data from a turbulent system that has fluctuations over timescales ranging from below their sampling interval to longer than the available data? Yet the IPCC concludes that the signal from 1950 to 2000 is from AGW, ignoring other more plausible explanations.

  162. Basil,

    The fundamental factual errors in your article are these two false assumptions, and the various combinations and permutations of them that arise throughout the document:

    1) Error: HadCrut3 is ‘unadjusted’ data. Truth: HadCrut3 is homogenized data.

    2) Error: GISS homogenized data are derived from HadCrut3, or from a file that is the same as HadCrut3. Truth: GISS homogenized data are derived from GISS combined station data, and that is nothing like Hadcrut3.

    Lest you think I am nitpicking minor differences, the Hadcrut3 file you used (#723270, incidentally) differs from the GISS unadjusted data by an average of 0.4C, and … get this … as much as 11.5C on individual monthly records. OYG!!

    The clear intent of the first part of your article was to discern the effects of the GISS homogenization adjustments, especially on trend. That sounds like a good thing to do, but that is not what you did. To determine the GISS adjustments, you need to compare GISS combined station data with GISS homogenized data. Both are available for download. Get them and use them.

    You may also wish to compare GISS adjusted data to Hadcrut3. No problem. Just correctly identify and interpret what you are doing. You are not comparing adjusted GISS data with unadjusted Hadcrut3 data. You are not assessing the GISS adjustments. You are comparing two different adjustments applied to two different unadjusted datasets. The fact that the results differ by so much is an interesting issue, when correctly identified and interpreted.

    You may want to speak to the HADcrut3 adjustments. Be careful how you do that. The truth is, we don’t know what the @!#$! those dip$#!^$ at CRU did to arrive at the file you are looking at. We don’t know what they started with, and we certainly don’t know what ‘value’ they added to it.

    We have reason to believe that the data that CRU started with was GHCN, and Nashville is in there. GHCN is not necessarily the same as GISS unadjusted, so don’t make that mistake. Probably, what CRU used was GHCN homogenized data. Maybe it was GHCN unadjusted data. We don’t know.

    Both raw and adjusted GHCN data are available for download. So if you want to look at HADcrut3 adjustments, and need an unadjusted base to compare to, you can get what is probably the best guess as to what that base was. But it is still a guess.

    Also, please document and carefully check the trends that you quote for these data. I have fit least squares trend lines to the unadjusted and homogenized GISS data, and they both differ in sign from what you report here.
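
    For anyone wanting to repeat this check, the comparison described above might look something like the sketch below. It assumes the two GISS series have already been loaded as aligned monthly arrays; the function names are made up for the example, and the missing-value flag of 999.9 is an assumption that should be checked against the downloaded files.

```python
import numpy as np

def mask_missing(values, missing_flag=999.9):
    """Replace the missing-value flag with NaN (the flag value is an assumption)."""
    v = np.asarray(values, dtype=float)
    return np.where(np.isclose(v, missing_flag), np.nan, v)

def trend_per_decade(monthly):
    """Least squares slope over the non-missing months, in degrees per decade."""
    y = np.asarray(monthly, dtype=float)
    t = np.arange(y.size)          # month index keeps gaps in the right place
    ok = ~np.isnan(y)
    return np.polyfit(t[ok], y[ok], 1)[0] * 120.0

def adjustment_summary(unadjusted, homogenized, missing_flag=999.9):
    """Size and trend of (homogenized - unadjusted) for two aligned monthly series."""
    raw = mask_missing(unadjusted, missing_flag)
    hom = mask_missing(homogenized, missing_flag)
    adj = hom - raw                # NaN wherever either series is missing
    return {
        "mean_adjustment": float(np.nanmean(adj)),
        "max_abs_adjustment": float(np.nanmax(np.abs(adj))),
        "unadjusted_trend_per_decade": trend_per_decade(raw),
        "homogenized_trend_per_decade": trend_per_decade(hom),
        "adjustment_trend_per_decade": trend_per_decade(adj),
    }
```

    Fitting the trend to the difference series sidesteps the seasonal cycle, since it cancels when both inputs come from the same station. The trends on the raw monthly values are cruder, and anomalies would be a better input for those.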

  163. JJ,

    So your “several” demonstrable errors reduce to two, one of which I’ve addressed, and the other of which is wrong.

    “1) Error: HadCrut3 is ‘unadjusted’ data. Truth: HadCrut3 is homogenized data.”

    You say it is homogenized. I say it is whatever CRU says it is, and copied the language above. Even if it is in some way adjusted (homogenized), it isn’t for the “homogeneity” adjustment of GISS, which is a purported UHI adjustment.

    “2) Error: GISS homogenized data are derived from HadCrut3, or from a file that is the same as HadCrut3. Truth: GISS homogenized data are derived from GISS combined station data, and that is nothing like Hadcrut3.”

    Did I ever actually say that “GISS homogenized data are derived from HadCRUT3, or from a file that is the same as HadCrut3”? I do think they both, in the Nashville case, go back to the same source, but I don’t think I ever made the claim as you state it. In any case, “for the record,” my purpose was to compare HadCRUT3, regardless of provenance, to GISS, with the latter’s homogeneity adjustment.

    If I’m in full denial mode, you are for some reason determined that I should fall on my sword and issue a public retraction. I don’t think I can please you, and be true to myself. All I can do is try to explain myself, as best I can.

    I am, however, concerned a bit with this:

    “Also, please document and carefully check the trends that you quote for these data. I have fit least squares trend lines to the unadjusted and homogenized GISS data, and they both differ in sign from what you report here.”

    That’s why I have asked you what you used for the missing value, though it would be surprising if any reasonable substitute for the missing value would account for this. I am more than happy to continue a dialog about this, but have no intention of engaging you further in the other matters. On this, for starters, maybe we could each just share some summary statistics for our GIS variables. For mine:

    Summary Statistics, using the observations 1881:01 – 2009:10
    for the variable ‘GIS’ (1546 valid observations)

    Mean 15.150
    Median 15.400
    Minimum -4.5000
    Maximum 30.500
    Standard deviation 8.3482
    C.V. 0.55104
    Skewness -0.13403
    Ex. kurtosis -1.2930

    Summary Statistics, using the observations 1881:01 – 2009:10
    for the variable ‘sd_GIS’ (1534 valid observations)

    Mean -0.012647
    Median 0.00000
    Minimum -10.900
    Maximum 9.5000
    Standard deviation 2.6586
    C.V. 210.22
    Skewness 0.070627
    Ex. kurtosis 0.79037

    I don’t know if you have the ability to quickly repeat exactly the same data as above, but mean, median, and standard deviation shouldn’t be too hard. Let’s see how close we are in these figures, to see if we’re using anything close to the same data.
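
    For comparison purposes, figures like those above could be reproduced along these lines. This is only a sketch: the function names are hypothetical, the skewness and kurtosis use plain moment estimators (a statistics package may apply small-sample corrections and differ slightly), and the 12-month lag for the 'sd_GIS' series is an assumption.

```python
import numpy as np

def seasonal_difference(monthly, lag=12):
    """Each month minus the same month one year earlier (lag is an assumption)."""
    y = np.asarray(monthly, dtype=float)
    return y[lag:] - y[:-lag]

def summary_stats(values):
    """Basic summary statistics over the non-missing (non-NaN) observations."""
    x = np.asarray(values, dtype=float)
    x = x[~np.isnan(x)]
    mean = x.mean()
    sd = x.std(ddof=1)                      # sample standard deviation
    dev = x - mean
    m2 = np.mean(dev ** 2)                  # second central moment
    return {
        "n_valid": int(x.size),
        "mean": float(mean),
        "median": float(np.median(x)),
        "min": float(x.min()),
        "max": float(x.max()),
        "std": float(sd),
        "cv": float(sd / abs(mean)) if mean != 0 else float("nan"),
        "skewness": float(np.mean(dev ** 3) / m2 ** 1.5),
        "excess_kurtosis": float(np.mean(dev ** 4) / m2 ** 2 - 3.0),
    }
```

    Calling summary_stats on the monthly series and on seasonal_difference of that series gives the two tables side by side. If the means, medians, and standard deviations agree to a few decimals, the inputs are very likely the same.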

    Basil

  164. Basil,

    “So your “several” demonstrable errors reduce to two, one of which I’ve addressed, and the other of which is wrong.”

    No. Two fundamental errors remain, and they propagate throughout your article, creating more errors.

    1) Error: HadCrut3 is ‘unadjusted’ data. Truth: HadCrut3 is homogenized data.

    “You say it is homogenized. I say it is whatever CRU says it is, and copied the language above.”

    CRU says it is homogenized! That is what ‘adjusted to account for non climatic influences’ means. And, as I mentioned to you earlier, those CRU homogenization adjustments are NOT small. Phlim Phlam Phil Jones added a LOT of ‘value’ to some of those data.

    “Even if it is in some way adjusted (homogenized), …”

    Then your assumption and multiple statements that it is ‘unadjusted data’, and the conclusions you draw while operating from that assumption, are wrong.

    Man up and admit that.

    “… it isn’t for the “homogeneity” adjustment of GISS, which is a purported UHI adjustment.”

    That has no bearing whatsoever, as your article is not in any way a specific address of UHI. FFS you only mention UHI once, in a list of possible anthropogenic effects, at the end of the article. Stop grasping at straws.

    Man up and admit that you were wrong.

    2) Error: GISS homogenized data are derived from HadCrut3, or from a file that is the same as HadCrut3. Truth: GISS homogenized data are derived from GISS combined station data, and that is nothing like Hadcrut3.

    “Did I ever actually say that “GISS homogenized data are derived from HadCRUT3, or from a file that is the same as HadCrut3”?”

    YES! More than once. Here:

    “I didn’t say HadCRUT3 was “raw.” I referred to it as ‘unadjusted.’ Now what I meant by that is that it should be the same as GISS before GISS applies its ‘homogeneity’ adjustment.” and again, here:

    “If you go to the GISTemp web site, you get the option of downloading its “pseudo-raw” version of the data, i.e. the data before it applies its “homogeneity” adjustment. I could have used that, instead of HadCRUT3, and I believe that the results would have been similar, if not the same.”

    You assumed that HADcrut3 was the same as GISS prior to adjustment. You assumed that substituting HADcrut3 for unadjusted GISS temp data would give the same results. The balance of your article makes sense under that assumption, and is nonsensical otherwise. Stop contradicting your own words, man up, and admit that you were wrong.

    “In any case, “for the record,” my purpose was to compare HadCRUT3, regardless of provenance, to GISS, with the latter’s homogeneity adjustment.”

    Clearly, that was not the purpose in what you wrote. Your purpose was to assess the effect of the GISS temp homogeneity adjustment, demonstrate by comparison to ‘unadjusted’ data that the GISS temp homogeneity adjustment is inadequate, and offer your ‘sd method’ as a superior substitute. That is what you were trying to do. For the record, that would have been a sensible and interesting thing to have done. And the way you used HADcrut3 data, under your assumption that HADcrut3 data were unadjusted, is perfectly consistent with that approach.

    The only problem is, your assumption that HADcrut3 data were the same as GISS temp data before the homogenization adjustment was wrong.

    That is a simple error. It is easily corrected. You have access to the unadjusted GISS temp data. All you have to do to correct your error is actually use those data, instead of using HADcrut3 data that you thought were the same but aren’t. Why will you not simply do that?

    Evidently, this is why:

    [snip]

    Correcting your error would require you to admit that you made an error. You would rather make a fool of yourself denying a simple, easily fixed error, than admit to having made a simple, easily fixed error.

    You would rather Mann up, than man up.

    Anthony should strongly reconsider further collaboration with someone who exhibits Hockey Team behaviours. This site is supposed to be dedicated to exposing and correcting instances of bad science, not committing and rationalizing them.

  165. If anybody else wants to add to this, I’ll listen. But with each exchange, your tone becomes more and more strident. Plus, the ad hominem remarks — I ignored the “Mann” reference the first time, but now will not — betray something darker in your personality you need to get control of. But it looks like everybody else is moving on. I suggest you do the same. I’m done wasting my time with you.
