GISS Hockey-Stick Adjustments

Guest Post By Walter Dnes:

There have been various comments recently about GISS’ “dancing data”, and it just so happens that, since GISS data is updated monthly, I’ve been downloading it monthly since 2008. In addition, I’ve captured some older versions via “The Wayback Machine”. Between those two sources, I have 94 monthly downloads between August 2005 and May 2014, though there are some gaps in the 2006 and 2007 downloads. Below is my analysis of the data.

Data notes

  • I’ve focused on the data up to August 2005 (the months present in every download), in order to make this an apples-to-apples comparison. My analysis consists of two parts:
    1. The net adjustments between the August 2005 download and the May 2014 download (i.e. the earliest and latest available data). I originally treated 1910-2005 as one long segment (the shaft of the “hockey-stick”). Later, I broke that portion into 5 separate periods.
    2. A month-by-month comparison of slopes of various portions of the data, obtained from each download.
  • Those of you who wish to work with the data yourselves can download this zip file, which unzips as directory “work”. Please read the file “work/readme.txt” for instructions on how to use the data.
  • GISS lists its reasons for adjustments at two webpages:
    • This page lists updates from 2003 to June 2011, in chronological order from the top of the page downwards.
    • This page lists more recent updates, up to the present, in chronological order from the bottom of the page upwards.
  • The situation with USHCN data, as summarized in Anthony’s recent article, may affect the GISS results, since the GISS global anomaly uses data from various sources, including USHCN.

In the graph below, the blue dots are the differences, in hundredths of a degree C, for the same months between GISS data as of May 2014 and GISS data as of August 2005. GISS provides data as integers representing hundredths of a degree C. The blue (1880-1909) and red (1910-2005) lines show the slope of the adjustments for the corresponding periods. Note that a slope in hundredths of a degree per year is numerically equal to degrees per century. The slopes of the GISS adjustments are…

  • 1880-1909 -0.520 C degree per century
  • 1910-2005 +0.190 C degree per century
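The per-month differences plotted above are simple to compute once each vintage has been parsed. Here is a minimal sketch, subtracting the older vintage from the newer one in the integer hundredths of a degree that GISS publishes; the dict layout and variable names are my own assumptions, not the actual GISS file format:

```python
# Minimal sketch: per-month adjustment, newest vintage minus oldest.
# Each vintage is assumed to be a dict mapping (year, month) -> integer
# hundredths of a degree C; this layout is hypothetical.

def monthly_adjustments(old, new):
    """Return [(year, month), difference] pairs for months in both vintages."""
    common = sorted(set(old) & set(new))
    return [(ym, new[ym] - old[ym]) for ym in common]

# Toy data: three months, each cooled in the later vintage.
old = {(1900, 1): -30, (1900, 2): -25, (1940, 7): 12}
new = {(1900, 1): -42, (1900, 2): -38, (1940, 7): 3}
print(monthly_adjustments(old, new))
# -> [((1900, 1), -12), ((1900, 2), -13), ((1940, 7), -9)]
```

Negative differences correspond to months that were cooled between the two downloads, i.e. the blue dots below the zero line.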

The next graph is similar to the above, except that the analysis is more granular, i.e. 1910-2005 is broken up into 5 smaller periods. The slopes of the GISS adjustments are…

  • 1880-1909 -0.520 C degree per century
  • 1910-1919 +0.732 C degree per century
  • 1920-1939 +0.222 C degree per century
  • 1940-1949 -1.129 C degree per century
  • 1950-1979 +0.283 C degree per century
  • 1980-2005 +0.110 C degree per century

The next graph shows the slopes (not adjustments) for the 6 periods listed above on a month-by-month basis, from the 94 monthly downloads in my possession.

  • 1880-1909; dark blue;
    • From August 2005 through December 2009, the GISS data showed a slope of -0.1 C degree/century for 1880-1909.
    • From January 2010 through October 2011, the GISS data showed a slope between +0.05 and +0.1 C degree/century for 1880-1909.
    • From November 2011 through November 2012, the GISS data showed a slope around zero for 1880-1909.
    • From December 2012 through latest (May 2014), the GISS data showed a slope around -0.6 to -0.65 C degree per century for 1880-1909.
  • 1910-1919; pink;
    • From August 2005 through December 2008, the GISS data showed a slope of 0.7 C degree/century for 1910-1919.
    • From January 2009 through December 2011, the GISS data showed a slope between +0.55 and +0.6 C degree/century for 1910-1919.
    • From January 2012 through November 2012, the GISS data showed a slope bouncing around between +0.6 and +0.9 C degree/century for 1910-1919.
    • From December 2012 through latest (May 2014), the GISS data showed a slope around 1.4 to 1.5 C degree per century for 1910-1919.
  • 1920-1939; orange;
    • From August 2005 through December 2005, the GISS data showed a slope between +1.15 and +1.2 C degree/century for 1920-1939.
    • From May 2006 through November 2011, the GISS data showed a slope of +1.3 C degree/century for 1920-1939.
    • From December 2011 through November 2012, the GISS data showed a slope around +1.25 C degree/century for 1920-1939.
    • From December 2012 through latest (May 2014), the GISS data showed a slope around +1.4 C degree per century for 1920-1939.
  • 1940-1949; green;
    • From August 2005 through December 2005, the GISS data showed a slope between -1.25 and -1.3 C degree/century for 1940-1949.
    • From May 2006 through December 2009, the GISS data showed a slope between -1.65 and -1.7 C degree/century for 1940-1949.
    • From January 2010 through November 2011, the GISS data showed a slope around -1.6 C degree/century for 1940-1949.
    • From December 2011 through November 2012, the GISS data showed a slope bouncing around between -1.6 and -1.7 C degree/century for 1940-1949.
    • From December 2012 through latest (May 2014), the GISS data showed a slope bouncing around between -2.35 and -2.45 C degree per century for 1940-1949.
  • 1950-1979; purple;
    • From August 2005 through October 2011, the GISS data showed a slope between +0.1 and +0.15 C degree/century for 1950-1979.
    • From November 2011 through November 2012, the GISS data showed a slope bouncing around between +0.2 and +0.3 C degree/century for 1950-1979.
    • From December 2012 through latest (May 2014), the GISS data showed a slope around +0.4 C degree per century for 1950-1979.
  • 1980-2005; brown;
    • From August 2005 through November 2012, the GISS data showed a slope of +1.65 C degree/century for 1980-2005.
    • From December 2012 through latest (May 2014), the GISS data showed a slope around +1.75 to +1.8 C degree per century for 1980-2005.
  • 1910-2005; red;
    • This is a grand summary. From August 2005 through December 2005, the GISS data showed a slope of +0.6 C degree/century for 1910-2005.
    • From May 2006 through December 2011, the GISS data showed a slope of +0.65 C degree/century for 1910-2005.
    • From January 2012 through November 2012, the GISS data showed a slope bouncing around +0.65 to +0.7 C degree/century for 1910-2005.
    • From December 2012 through latest (May 2014), the GISS data showed a slope of +0.8 C degree per century for 1910-2005.

    In seven years (December 2005 to December 2012), the rate of temperature rise for 1910-2005 was adjusted up from +0.6 to +0.8 degree per century, an increase of roughly a third.
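The month-by-month slope tracking described above can be sketched as follows. This is a toy version, assuming each archived download has already been parsed into (decimal year, anomaly in hundredths of a degree) pairs; the names and layout here are mine, not GISS’s. Because the anomalies are in hundredths of a degree, the fitted slope per year is numerically the trend in degrees per century.

```python
# Sketch: refit one period's trend from every archived download and
# watch how it drifts. `vintages` maps a download label to a list of
# (decimal_year, anomaly_in_hundredths) pairs; the layout is hypothetical.

def ols_slope(points):
    """Least-squares slope in hundredths of a degree C per year,
    which is numerically equal to degrees C per century."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    num = sum((x - mx) * (y - my) for x, y in points)
    den = sum((x - mx) ** 2 for x, _ in points)
    return num / den

def slope_history(vintages, start, end):
    """Fitted slope over [start, end] for each archived download."""
    return {
        label: ols_slope([(x, y) for x, y in series if start <= x <= end])
        for label, series in vintages.items()
    }

# Synthetic vintages: one rising at 0.6 C/century, one at 0.8 C/century.
years = [1910 + m / 12.0 for m in range(96 * 12)]
vintages = {
    "2005-08": [(x, 0.6 * (x - 1910)) for x in years],
    "2014-05": [(x, 0.8 * (x - 1910)) for x in years],
}
hist = slope_history(vintages, 1910, 2006)
# hist["2005-08"] is ~0.6 and hist["2014-05"] is ~0.8
```

Plotting the resulting slope against download date, one series per period, reproduces the kind of step changes listed above.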

Commentary

  • It would be interesting to see what the data looked like further back in time. Does anyone have GISS versions that predate 2005? Can someone inquire with GISS to see if they have copies (digital or paper) going further back? Have there been any versions published in scientific papers prior to 2005?
  • Given how much the data has changed in the past 9 years, what might it look like 9 years from now? Can we trust it enough to make multi-billion-dollar economic decisions based on it? I find it reminiscent of George Orwell’s “1984”, where:

    “Winston Smith works as a clerk in the Records Department of the Ministry of Truth, where his job is to rewrite historical documents so they match the constantly changing current party line.”


92 thoughts on “GISS Hockey-Stick Adjustments”

  1. The adjustments are necessary to get the data ‘on message’: colder in the past, warmer in the present, and that troublesome warm spell in the ’30s can’t be natural variability – it must be bad data – now ‘corrected’.

  2. Great work!

    “Can someone inquire with GISS to see if they have copies (digital or paper) going further back? ”
    I’d be shocked.

  3. You need to make a short animated movie out of that. Two frames per month. Watch the line dance.

  4. So from 1905 to 2014, the temperature adjustments have brought 0.21C of warming to the data, so the Global meteorological stations show 1905 as -0.30C and 2014 as +0.80C, a difference of 1.1C. The new rise since 1905 would be 0.89C in 2014. If we fixed the 1905 temp at -0.30C, the 2014 increase would then be +0.59C, and 0.54C/century.

    By this time I haven’t a clue what the temperature data means anymore, except that whatever the temperatures are doing, they are doing it at a rate considerably less than the IPCC models project.

    That is actually enough: the models hold all the terror, but the models are a failure. There is no catastrophic or significant harm at the observed rate of temperature rise. If you want to de-carbonize the world for ideological reasons, you can’t use temperature observations as a supporting technical reason.

  5. Most US Government data is reported to Congress and can be found in Congressional records. Those records would likely not be “adjusted”.
    Wonder what certain groups’ reaction would be if someone “corrected” the records to reflect an assumed value for UHI?

  6. I don’t see a hockey stick here. But the fact is that GISS doesn’t do much adjusting at all now. Since GHCN V3 came out, they have used the GHCN adjusted data, with Menne’s pairwise homogenization. That’s the main reason for the change.

  7. Reblogged this on Norah4you's Weblog and commented:
    Read and ponder, all you CO2-huggers:
    The next graph is similar to the above, except that the analysis is more granular, i.e. 1910-2005 is broken up into 5 smaller periods. The slopes of the GISS adjustments are…

    1880-1909 -0.520 C degree per century
    1910-1919 +0.732 C degree per century
    1920-1939 +0.222 C degree per century
    1940-1949 -1.129 C degree per century
    1950-1979 +0.283 C degree per century
    1980-2005 +0.110 C degree per century

    See the reblog below.
    So it has not been enough to “correct” the actual GISS data. For the “data” to fit the so-called computer models (which, frankly, are merely proof of ignorance of how to write system programs), they have also altered their earlier corrections whenever those corrected data no longer agreed with the CO2-threat “hypothesis”.

  8. Thanks Walter.
    I think you have shown why the weather folks need super-fast computers and run them 24/7 to keep the adjustments coming.
    ———————————————

    jaffa says:
    July 3, 2014 at 4:40 pm
    …………should say 30′s (1930′s)

    should say 1930s

  9. I don’t think the method of calculating trend in multiple segments is correct. Because the segments don’t exist in isolation from each other, they should ‘join up’, i.e. the end point of one segment should be the start point of the next.

    I’ll see if I can download the data, run the calc, and report back (I’m out for the next few hours).

  10. Mike Jonas says:
    > July 3, 2014 at 5:28 pm

    > I don’t think the method of calculating trend in multiple segments is
    > correct. Because the segments don’t exist in isolation from each other,
    > they should ‘join up’, ie. the end point of one segment should be the
    > startpoint of the next.

    The segments appear to have been adjusted separately, and I can’t see any overriding reason why the endpoints must join up. I’m taking the numbers and plotting what shows up. The acronym GIEGO (Garbage In Equals Garbage Out) comes to mind.
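The disagreement above can be made concrete with a toy example: fitting each segment independently (the approach in the post) generally produces lines that do not meet at the boundary, whereas a constrained piecewise fit would force them to. A minimal sketch with made-up data:

```python
# Sketch: independent least-squares fits on two adjacent segments of a
# synthetic series whose trend reverses at x = 50. The fitted lines do
# not meet at the boundary, illustrating the point under discussion.

def ols_fit(points):
    """Return (slope, intercept) of an ordinary least-squares line."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    slope = sum((x - mx) * (y - my) for x, y in points) / sum(
        (x - mx) ** 2 for x, _ in points
    )
    return slope, my - slope * mx

data = [(x, 0.1 * x) for x in range(50)] + [
    (x, 10.0 - 0.2 * (x - 50)) for x in range(50, 100)
]
left = ols_fit(data[:50])    # recovers y = 0.1x
right = ols_fit(data[50:])   # recovers y = 20 - 0.2x
# Evaluate both fitted lines at the boundary x = 50:
gap = (left[0] * 50 + left[1]) - (right[0] * 50 + right[1])
# gap is ~ -5: the two independently fitted lines miss each other
# by five units at the segment boundary.
```

Whether the jump at the boundary is acceptable depends on what is being measured: here each segment’s trend is of interest on its own, so the fits need not be continuous.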

  11. This is the earliest I have come across in numeric data. Global temperatures 1880-1993, Global first, NH second, SH third.

    http://cdiac.ornl.gov/ftp/trends93/temp/hansen.612

    Abstract: This data set represents temperature changes over the past century
    (1880-1993) calculated from surface air temperatures published in the
    “World Weather Records” and the World Meteorological Organization’s
    (WMO) Monthly Climatic Data for the World, supplemented by monthly
    mean station records available from NOAA’s Climate Analysis Center.
    At each gridpoint, data from nearby stations are combined to form an
    estimate of the temperature change (Hansen and Lebedeff (1987)).

    Data is presented as temperature anomalies relative to a reference
    period of 1951-1980 in degrees Celsius for the globe, Northern
    Hemisphere, and Southern Hemisphere.

    The citation for this dataset is:

    Wilson, H., and J. Hansen. 1994. “Global and hemispheric temperature
    anomalies from instrumental surface air temperature records”,
    pp. 609-614. In T.A. Boden, D.P. Kaiser, R.J. Sepanski, and F.W. Stoss
    (eds.), Trends ’93: A Compendium of Data on Global
    Change. ORNL/CDIAC-65. Carbon Dioxide Information Analysis Center, Oak
    Ridge National Laboratory, Oak Ridge, TN, USA

  12. GISS doesn’t adjust data so much as it fabricates a data-food-product via some algorithmic processes applied to GHCN / USHCN.

    I got GISTemp to run some time back, and did a load of analysis on the thing. Then figured out it was just a bad caricature of the data; and that the real “magic” was being done upstream anyway, in the GHCN.

    I’ve not looked at it in a few years, but the original work is still “up”. See:

    http://chiefio.wordpress.com/gistemp/ as an entry point.

    The tech stuff is here: http://chiefio.wordpress.com/category/gisstemp-technical-and-source-code/

    Also, for GHCN stuff: http://chiefio.wordpress.com/category/ncdc-ghcn-issues/

    I came up with what is, IMHO, a simpler and cleaner way to present the data. A load of graphs and stuff here: http://chiefio.wordpress.com/category/dtdt/

    The fundamental “issues” likely to cause variation in trend are the (referenced above by Nick Stokes) change in input data set (and, one presumes, code – though I haven’t looked lately), and that every run of GIStemp makes a load of new manipulations of all the data items based on what other data items exist in the input data set… that changes every month. No two runs the same… (Details at my site / links – with some digging).

    At this point, it is best to think of GIStemp as a sort of carnival mirror reflecting the changes made to GHCN by NOAA/NCDC in a slightly daft way (as my personal opinion of what the code does). So you need to track both of those to figure out “why”… Oh, and for several years USHCN new data was not used in GIStemp as the format changed and they didn’t get on board with it “for a few years”. I did a posting on that…

    http://chiefio.wordpress.com/2009/11/06/ushcn-v2-gistemp-ghcn-what-will-it-take-to-fix-it/

    I’d not use GIStemp ‘data’ for anything. Period.

    If you would like, you can recreate some of the past GIStemp output via using older GHCN and USHCN data for input to your own vintage copy of GIStemp. I’m willing to help you ‘make it go’ if you want (and need any help). I have the old data saved… and the old code.

  13. So it is clear that this is not random; this is “history re-engineering.” My question is: is this some lone-ranger weather guy on a political crusade? Or is there some new algorithm that is faulty? Or is there a dedicated group (in which case, why has there been no leak)? Or has this been achieved through policy decisions at a higher level, in which case those should be visible?

  14. Bill Illis says:
    > July 3, 2014 at 5:59 pm
    >
    > This is the earliest I have come across in numeric data.
    > Global temperatures 1880-1993, Global first, NH second, SH third.
    >
    > http://cdiac.ornl.gov/ftp/trends93/temp/hansen.612

    Thanks. More numbers for me to look at. I’m also poking around in the USHCN data right now, and finding some interesting stuff.

  15. It is pathetic that the government allows and the media remains silent concerning this obvious climategate charade.

  16. This is a good confirmation of the old analysis of temperature records… that the keepers of the data have adjusted the hell out of it, by cooling the past and heating the present to create a bigger trend.

  17. Bill Illis says:
    > July 3, 2014 at 5:59 pm
    >
    > This is the earliest I have come across in numeric data.
    > Global temperatures 1880-1993, Global first, NH second, SH third.
    >
    > http://cdiac.ornl.gov/ftp/trends93/temp/hansen.612

    Thanks again. I just ran a quick plot (2014 versus 1993), and I see a similar crash from 1880 to 1914 or thereabouts, followed by a gradual rise. I’ll have to look into it more deeply, to get it on the same image scale, before I can make absolute comparisons. Also, there are 1/12th as many points, because the data is annual, rather than monthly. The result is more scattered-looking.

  18. Possibly a representative from NASA could explain the reasons for the adjustments… from the witness stand… under oath.

  19. Yes, in the Arctic, GISS are fiddling the historical records. They are cooling the past in order to generate a spurious warming trend. I have studied the sites with the longest unbroken records: Ostrov Dikson in Russia and Teigarhorn in Iceland. Comparing GISS’s older records to later ones, the latter has been artificially depressed by 0.9C. See http://endisnighnot.blogspot.co.uk/2013/08/the-past-is-getting-colder.html

    Let’s take one specific place, at one specific time: Teigarhorn in January 1900. GISS has variously reported the temperature as +0.7C (in 2011), -0.2C (in 2012) and -0.7C (in 2013). My enquiry to the Icelandic Met Office yielded a nice reply, polite and thorough. A direct transcript of the pen-and-ink original, they tell me, shows +1.0C. THE PAST HAS BEEN COOLED BY 1.7C.

    Here in the UK we have no chance of holding to account the legions of bent academics, the civil servants, the champagne greens and the industrialists who are milking the public purse. But I hope our American cousins will persuade some congressman that this fraudulent misuse of data (upon which public policy is founded) is worthy of criminal investigation.

  20. Well this sort of explains, why it is, that you will never have any more information about the weather, or the climate, than the original raw experimentally observed (measured) numbers.

    All of the subsequent numerical prestidigitation, is simply an academic exercise in the field of statistical mathematics. It is not related to the weather or the climate of the earth.

    You can perform the identical algorithmic computations on the numbers on the licence plates of automobiles, that you might observe going past you on any corner on a traffic busy street, and they will tell you just as much about earth’s climate; which is nothing !

  21. MattN says:
    > July 3, 2014 at 5:17 pm
    >
    > Do they have an explanation of why they are doing this?

    Near the beginning of the post where I say
    > GISS lists its reasons for adjustments at two webpages:

    There are links to 2 webpages with their rationale…

    http://data.giss.nasa.gov/gistemp/updates/

    http://data.giss.nasa.gov/gistemp/updates_v3/

    ——————————————————————–
    From this layman’s superficial reading of the above rationale for alteration, the changes appear to be normal, necessary and benign. However, the changes would appear to reflect unfavorably on the level of certainty regarding any temperatures, the level of certainty regarding the change in temperatures over time, and the characterization of a “settled science”.

  22. Advertising the Truth and Truth in Advertising
    There is a dichotomy here which needs exploring.
    The problem stems from what the USHCN data really means and how it is managed and interpreted. The website states: “The United States Historical Climatology Network (USHCN) is a high quality data set of daily and monthly records of basic meteorological variables from 1218 observing stations across the 48 contiguous United States.”
    Steve Goddard commented that it was the “Coldest Year On Record In The US Through May 13 2014”.
    Zeke Hausfather, a data scientist currently a Senior Researcher with Berkeley Earth, chucked fuel on the fire when he wrote a series of articles,
    “How not to calculate temperatures”, parts 1, 2 and 3, stating that Goddard was wrong.
    The U.S. Historical Climatological Network (USHCN) was put together in the late 1980s, with 1218 stations chosen from a larger population of 7000-odd cooperative network stations based on their long continuous records and geographical distribution. The group’s composition has been left largely unchanged, though since the late 1980s a number of stations have closed or stopped reporting.
    And here is the crux. Mr Goddard reported real raw data, possibly with flaws in that missing temperature records were not counted. Zeke replied with an artificial model which was not designated an artificial model [see the blurb above from the website USHCN) is a high quality data set] yet treated this model data as the real data.
    Steven Mosher to his credit has consistently said that it was estimations only, whereas other commentators like Nick Stokes have said that it is a virtually true data set. Steven unfortunately ignores the fact that the USHCN is put out as a historical data set when it is neither of those two things.
    Further to this, a deeper truth is hidden. The number of stations in the USHCN [a subset of the GHCN] is 1218, originally 1219, selected in 1987 with continuous records back to 1900. A large number of stations have closed over this time, dropping the number of real stations to 833 reporting in March and April 2014.
    Zeke has further suggested the number of real stations could be as low as 650. Some stations have been added to make it up to the 833 current stations. This implies that up to 40% of the data is artificial, made up by programmes that would be as adept at making a poker machine reel spin.
    The data is adjusted in two ways, according to Zeke and Steven and Nick. The first is infilling from surrounding stations when a current temperature appears erroneous, with no comment on how far off it is allowed to be before it is infilled. The past historical data is altered so that the further back in time one goes, the lower the so-called historical record becomes, yet it is not promoted or advertised or gazetted as a guess or estimate. It is put out as the truthful correct reading. Worse, all these readings change each day as new readings are input daily or monthly [or mid-next-month for the missing stations].
    The second is a TOBS adjustment and a change of thermometers adjustment.
    This results in a subtle underlying lowering of the whole historical record, again presented as true historical data when it is anything but. Further, it enables TOBS changes to be made to all missing data: comparing a station to surrounding stations gives an average reading, but as the site itself was not working, a TOBS adjustment is presumably made for that station, since there is no proof that its observations were done at the same time as the other stations’.
    Steven Goddard’s graphs may be flawed by missing real data, he says this is small. His temperature representations are at least real and accurate data.
    Not an estimate dressed up, drag-queen style, as data – worse, as historical data – when it is neither of those things.

    USHCN addendum
    It contained a 138-station subset of the USHCN in 1992. This product was updated by Easterling et al. (1999) and expanded to include 1062 stations. In 2009 the daily USHCN dataset was expanded to include all 1218 stations in the USHCN.
    This is quite a concern. If the 1992 version only had 138 stations used for its graphs, could it be that these stations still exist and could still give a graph? Why were others discarded? How many of these best-located stations have died the death, and why? Did the addition of the new and massively infilled stations, with TOBS adjustments, cause the so-called historical rise in temperatures?

    Final note: this question of truth – what is data and what is modelling, which is historically true and which has been written by the winners – will persist until the agencies concerned label their models correctly and give raw-data graphs, warts and all, to the general public.
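The infilling step described in the comment above can be made concrete with a small sketch. This is entirely my own toy version, not the actual USHCN/NCDC code, and it also makes the commenter’s labelling point explicit: estimated values carry an “E” flag so they can never be mistaken for measurements.

```python
# Toy sketch of neighbour-based infilling with explicit flags.
# readings: dict station -> anomaly, with None for a missing report.
# Returns station -> (value, flag): 'M' = measured, 'E' = estimated.
# The layout and the flag letters are hypothetical.

def infill_month(readings):
    reported = [v for v in readings.values() if v is not None]
    mean = sum(reported) / len(reported)
    return {
        st: (v, "M") if v is not None else (mean, "E")
        for st, v in readings.items()
    }

# Station B did not report this month; it gets the neighbours' mean.
month = {"A": 0.4, "B": None, "C": 0.8}
filled = infill_month(month)
# filled["B"] is (~0.6, "E"): the value is an estimate, and says so.
```

Carrying the flag through to published graphs would answer the objection that estimates are being presented as measurements.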

  23. For those here who are not fluent in Swedish, I – since I am – can attest that the comments made by norah4you above are spot on in their understanding of the sleight of hand that has been going on.

  24. Dear F.A.H.
    My understanding is that it is properly called ‘Mann-made Global Warming.’

  25. GISS has already answered your questions at their Q&A site,

    http://data.giss.nasa.gov/gistemp/abs_temp.html

    Excerpt:
    Q. If SATs cannot be measured, how are SAT maps created ?
    A. This can only be done with the help of computer models, the same models that are used to create the daily weather forecasts. We may start out the model with the few observed data that are available and fill in the rest with guesses (also called extrapolations) and then let the model run long enough so that the initial guesses no longer matter, but not too long in order to avoid that the inaccuracies of the model become relevant. This may be done starting from conditions from many years, so that the average (called a ‘climatology’) hopefully represents a typical map for the particular month or day of the year.
    It is not real, just the latest version of the GISS Fairy Tale for your entertainment!

  26. I doubt this is the work of one individual. To mess up this badly you require a committee.

    cynical scientist said:
    I doubt this is the work of one individual. To mess up this badly you require a committee
    I second that opinion

  28. What the surface stations survey showed was that the temperature gathering network was essentially uncalibrated for about a century. No serious scientist would draw any conclusions from uncalibrated instruments/sites. All of these “adjustments” are an attempt to re-calibrate measurements after the fact, although these recalibrations don’t seem to be unbiased (to put it mildly). There is no scientific or other evidence that such an after-the-fact recalibration is doable or reliable. This is worse than taking cheese and making “processed cheese product.”

    The bottom line is that there is no “temperature data” prior to satellite measurements (which suffer from their own issues) and the USCRN network (which only has about 10 years’ worth of data, IIRC) – BEST, GISS, HADCRUT, etc., notwithstanding. All of the temperature indices are essentially models (which may reflect the modelers’ biases and preconceptions more than the historical weather).

  29. @ miked1947 on July 3, 2014 at 9:44 pm:

    Thanks for the link to GISS. However, you have to be careful because the answer given depends on what “the definition of ‘is’ is.”

    From the link:

    The Elusive Absolute Surface Air Temperature (SAT)

    The GISTEMP analysis concerns only temperature anomalies, not absolute temperature. Temperature anomalies are computed relative to the base period 1951-1980. The reason to work with anomalies, rather than absolute temperature is that absolute temperature varies markedly in short distances, while monthly or annual temperature anomalies are representative of a much larger region. Indeed, we have shown (Hansen and Lebedeff, 1987) that temperature anomalies are strongly correlated out to distances of the order of 1000 km. (emphasis added)

    So what is the definition of “strongly correlated?”

    From Hansen and Lebedeff, 1987 (Abstract):

    The temperature changes at mid- and high latitude stations separated by less than 1000 km are shown to be highly correlated; at low latitudes the correlation falls off more rapidly with distance for nearby stations. (emphasis added)

    From Hansen and Lebedeff, 1987 (pg 3 of the pdf):

    For example, in these regions [the United States and Europe] the average correlation coefficient for 1000-km separation was found to be within the range 0.5-0.6 for each of the directions defined by 45° intervals. We did not investigate whether the correlations are more dependent on direction at low latitudes.
    …..
    The 1200-km limit is the distance at which the average correlation coefficient of temperature variations falls to 0.5 at middle and high latitudes and 0.33 at low latitudes. (emphasis added)

    So the definitions map as follows:

    “[S]trongly correlated” in the GISS link maps to “highly correlated” … “at mid- and high latitudes” in the abstract. NO mention is made in the webpage of the low latitude caveat in the abstract.

    Further,

    “[H]ighly correlated” … “at mid- and high latitudes” in the abstract maps to “0.5 at middle and high latitudes and 0.33 at low latitudes” at 1200 km in the body of the text.

    I would not define correlations of 0.5 and 0.33 as “high” or “strong,” yet all of the adjustments and temperature reconstructions (including BEST, I believe) depend on this definition of “high” or “strong” correlation, IMHO. It would seem that if a common-sense definition of a “high” or “strong” correlation as 0.9 or above were used, then all of these reconstructions and/or adjustments (including, probably, BEST) would disintegrate.

    Furthermore, the statement that “absolute temperature varies markedly in short distances, while monthly or annual temperature anomalies are representative of a much larger region” also needs to be examined for logic and common sense.

    Mathematically, the temperature anomaly is defined as the difference between the absolute temperature and the average temperature over a base period (in this case 1951-1980). If we let T_anom be the temperature anomaly, T_abs the absolute temperature and T_base the average temperature over the base period, the formula can be written as follows:

    T_anom = T_abs – T_base

    Since, T_base is a constant for the purpose of calculating anomalies, it would follow that if “absolute temperature varies markedly in short distances,” so would temperature anomalies. However, they added the modifier “monthly or annual,” so, presumably they are comparing daily absolute temperatures to “monthly or annual” anomalies. If you add the modifier “monthly or annual” to absolute temperatures, wouldn’t “monthly or annual” absolute temperatures also be “representative of a much larger region,” if anomalies are supposed to be? Maybe, to be kind, their wording is very poor. I think what they may be trying to say is that the derivative (or the first difference) of temperature is “representative of a much larger region,” but given that the correlation coefficients of temperature are so poor, why would the derivative (or first difference) of temperature be any better? I have never seen any calculations of correlation coefficients of temperature derivatives (or first differences), so why should we take these statements as having any validity?
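The commenter’s closing question – whether first differences of temperature correlate any better than the temperatures themselves – is easy to probe on synthetic data. A minimal sketch, with two made-up “stations” sharing a trend but with independent short-term wiggles (nothing here is real station data):

```python
import math

# Pearson correlation of two series, and of their first differences.

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def first_diff(a):
    return [a[i + 1] - a[i] for i in range(len(a) - 1)]

# Two synthetic "stations": a shared warming trend plus independent wiggles.
a = [0.05 * i + math.sin(i) for i in range(100)]
b = [0.05 * i + math.cos(i) for i in range(100)]

r_levels = pearson(a, b)                         # dominated by the shared trend
r_diffs = pearson(first_diff(a), first_diff(b))  # dominated by the wiggles
# r_levels comes out fairly high; r_diffs comes out near zero, because
# differencing removes the shared trend the correlation was riding on.
```

This does not settle the empirical question for real stations, but it shows that first differences cannot simply be assumed to correlate better than levels.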

  30. Michael D:

    Your post at July 3, 2014 at 6:15 pm asks

    So it is clear that this is not random; this is “history re-engineering.” My question is: is this some lone-ranger weather guy on a political crusade? Or is there some new algorithm that is faulty? Or is there a dedicated group (in which case, why has there been no leak)? Or has this been achieved through policy decisions at a higher level, in which case those should be visible?

    The answer is All Of The Above.
    This answer has been public knowledge for over a decade.

    Discussion of this was part of Climategate.
    It was reported to the UK Parliamentary Inquiry into Climategate.
    It has been repeatedly subject to cover-up.

    All of my statements in this post are justified, evidenced and explained by this submission to the Parliamentary Inquiry.

    Richard

  31. for E M Smith
    This is a guess only to explain the continuing, increasing and divergent lowering of records in the past.
    The algorithm for changing TOBS in the past is still incorporated for changing stations in the present.
    This results in a subtle underlying lowering of the whole historical record, again presented as true historical data. TOBS changes are made to all missing data at stations: comparing them to surrounding stations gives average readings, but as the sites were not working, a TOBS adjustment is presumably made for those stations, since there is no proof that their observations were done at the same required time as the other stations’. Hence all current TOBS readings, and there are quite a few, have an inbuilt rise in temperature applied to the average temperatures when calculated. Even worse, this then forces backwards changes on all the past recorded TOBS stations, dropping them lower.
    Otherwise the past records would have stayed the same [Zeke says they are always adjusted each day] and only the current data would be modified. Take out the link changing backwards and the system becomes much fairer, though still historically wrong, and all USHCN graphs should be required to be labeled as estimates, not true data.

  32. Phil says:
    July 3, 2014 at 10:15 pm
    //////////////////

    I have several times made similar observations.

    I am at a loss to understand why any serious scientist would use the land-based thermometer record post 1979. Essentially, the land-based thermometer record should be ditched after 1979, being used for information prior to then simply on the basis that that is all we have. A more realistic error bandwidth should be applied, taking into account what we know about equipment, the approaches to observing and recording data, siting issues, instrument degradation, screen degradation, spatial coverage, the encroachment of UHI, station changes etc.

    There should be no attempt to splice the land based record with the satellite record.

    As you point out, the satellite data has its own issues (such as orbital decay and sensor degradation) but these are not as stark as those that invade (or should that read pervade?) the land-based record. One merely needs a sensible error bandwidth to be applied to the satellite data.

  33. Nick Stokes says at July 3, 2014 at 5:22 pm

    I don’t see a hockey stick here. But the fact is that GISS doesn’t do much adjusting at all now. Since GHCN V3 came out, they have used the GHCN adjusted data, with Menne’s pairwise homogenization. That’s the main reason for the change.

    “But the fact is that GISS doesn’t do much adjusting at all now”?
    That may be true but seeing as they don’t state:

    1. How much adjusting they still do.
    2. When they adjust.
    3. What they adjust.
    4. Why they adjust.
    5. How they adjust.
    6. What the impact of the adjustment is.
    7. What the justification for that assessment of the impact of the adjustment is…

    Well, the data is junk.

    No legitimate organisation or individual scientist can use this contaminated gibberish anymore.

  34. I understand that to produce a global temperature anomaly, various “adjustments” have to be made. However, what use is a global temperature anyway?

    If you cannot show warming (or a hockey stick or whatever shape we are supposed to be seeing) in say 200 “proper” sites globally, then making all sorts of infills and manipulations is essentially just playing games. It is reductionism at its worst: if done honestly it is meaningless, if done dishonestly it is fraudulent.

    Is it really impossible to find a large enough sample of well-sited records with a reasonable spread geographically to demonstrate what has been happening with temperatures for the last 150 years? So we have to look at 200 data sets instead of one? So what?

  35. GISS doesn’t have to do any adjusting if it uses GHCN, as this already has USHCN in it and its own similar adjustments due to massive station dropout.

  36. Let’s add some CO2 figures. We are told that in the ‘pre-industrial’ era CO2 levels were 280ppm.
    They’re now 400. That’s roughly a 43% increase in CO2.
    Looking at all the detailed and painstaking work by Walter Dnes, and despite ‘adjustments’ of GISS figures, it’s clear that there hasn’t been much of a temperature increase for all that extra CO2, has there?
    Walter asks if the data can be trusted enough to make multi-billion dollar economic decisions based on it.
    My question is: where are the politicians with the guts to acknowledge that the whole CO2 story has reached the end of the line, and that no more money should be wasted on this fairy story?

  37. John Harmon,

    I’m not so sure the Congressional record can’t be changed. After all, congress-folk are allowed to record a speech at a later date and insert it into the record when they were not even present the day the debate took place. I hope that the actual date they made the recording is prominently displayed but I have not checked.

  38. Paraphrasing Nick Stokes says at July 3, 2014 at 5:22 pm

    I don’t see Clinton having sex with interns here. But the fact is that he doesn’t do much Lewinskis at all now. Since the end of his presidency, he has become more handy, with Hillary’s impending run. That’s the main reason for the change

  39. Walter,
    Thanks for the analysis. You are showing the local trends as discontinuous lines, but that’s only fair to GISS if they really have step-wise adjustment changes at the transition times. If they don’t, would it not be fairer to present this as connected trend lines? Basically, you’d change from a fit on 6×2=12 variables to 6×1+1=7 variables.
    Frank

  40. The earliest version of the US national temperatures from the NCDC is in this report, on page 25 of Climate Assessment: A Decadal Review 1981-1990. 1934, 1921, 1953 and 1954 were warmer than the 1980s.

    The actual data numbers have long been removed from the net by the NCDC. It is just a pdf and someone would have to digitize the data.

    http://www1.ncdc.noaa.gov/pub/data/cmb/bams-sotc/climate-assessment-1981-1990.pdf

    Also, a huge (56 MB) PDF: Trends ’93: A Compendium of Data on Global Change.
    Contiguous United States temperatures from 1900-1991 are on page 689, as a table of annual and seasonal anomalies in °C. You can download the report from this page.

    http://www.osti.gov/scitech/biblio/10106351

    There is lots and lots of data in this report, some of which can still be found here – the US temps have, of course, been replaced by the NCDC in the database.

    http://cdiac.ornl.gov/ftp/trends93/

  41. “My question is: where are the politicians with the guts to acknowledge that the whole CO2 story has reached the end of the line, and that no more money should be wasted on this fairy story?”

    When the practical consequences of ‘climate change’ policies start to hit the ordinary voter in his or her pocket in a significant and easily seen manner, politicians will gain the balls to face down the pro-AGW brigade. We have already seen this in the UK, where rising domestic energy prices are now a political hot potato. Politicians are still managing to hold two conflicting ideas in their heads, namely being against ‘fuel poverty’ and being pro-AGW at one and the same time, despite their policies to combat the latter actively causing the former. But the fact that they now have two ideas in there when before they only had one is itself progress. Slowly they will cotton on that there are votes in reversing AGW policies, and once one party breaks ranks others will follow. UKIP kind of have in the UK, but it’ll take one of the traditional parties to follow suit before the dam breaks.

  42. “The bottom line is that there is no “temperature data,” prior to satellite measurements”
    Thank you, Phil.

  43. Do they have an explanation of why they are doing this?
    ============
    They have a rationalization. Human beings have an infinite capacity to rationalize any decision, good or bad, such that it is justified.

  44. So this claim is cherry-picked, “In 7 years (December 2005 to December 2012), the rate of temperature rise for 1910-2005 has been adjusted up from +0.6 to +0.8 degree per century, an increase of approximately 30%.” with 1910 being picked, presumably because it’s at the bottom of the adjustment curve. Despite the cherry-picking, I still find the method troubling, and I agree with this claim, “Given how much the data has changed in the past 9 years, what might it be like 9 years from now? Can we trust it enough to make multi-billion dollar economic decisions based on it?”

    Here’s the thing about adjustments. You need to label your units. Correctly.

    If these are temperatures, then the temperature in Central Park should be the same, whether it’s 2005 or 2014.

    That’s not happening in this data. The claim is because the data is “adjusted.” OK. There are some plausible reasons for that, but my simple mind needs an analog. Do we have any other adjusted data sources? Yes, the news always talks about “inflation adjusted dollars.” So maybe adjusted temperatures are like adjusted dollars? If so, there’s something missing. When the news quotes inflation adjusted dollars, they tell me the frame of reference, “buying power of a 2005 dollar,” for example. And the Minneapolis Fed gives me an inflation calculator so I can go to their web site and convert between 2014 dollars and 2005 dollars.

    GISS data is labeled in degrees C. Not “2014 adjusted degrees C,” and they provide no way to compare between years. Giving the benefit of the doubt and assuming that all of the adjustment methods used are “above board,” I’d still have to say that their implementation is incomplete. They need to label their units as adjusted degrees, and they owe the public a conversion tool to convert between years, the same way the Minneapolis Fed has done for inflation adjustments.

    Also, if these adjustments are valid, once a temperature is measured and adjusted once for a set of stations at a specified time, the proportionate relationship between those stations should stay the same through all future adjustments until the end of time. It’s not apparent to me whether that is happening, but I suspect not.

  45. w.r.t that last paragraph, I downloaded the data and see that it doesn’t include individual stations. With exceptions for closely ranked years, the same statement should also hold true for points in time. If month1/year1 is markedly warmer than month2/year2 when they were collected in 2005, that should also be true in 2014. Barring rounding differences, if we happen to find that month2/year2 is markedly cooler in 2005 and warmer in 2014, for any pair of months/years, then I think that should be a huge red flag, as it indicates that the size of an “adjusted degree” is not being held constant, even within a particular dataset.

    If an “adjusted degree” is bigger for one station than it is for another, even during a particular data sweep, then I don’t see how it conveys any useful information. On the other hand, if the inter-time-period ranking of months stays the same for all pairs at all times (except for rounding differences), that would offer some confidence in the adjustments.

    If I get around to it, I may run this test, but it might be a while… Don’t hold your breath waiting for me.
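    The test sketched in the comment above can be written down directly. A hedged sketch: the snapshots and the two-hundredths tolerance below are made-up illustrations, not real GISS values.

```python
# Sketch of the pairwise ordering check proposed above: for every pair of
# months present in both downloads, flag cases where one vintage says
# "markedly warmer" and the other says "markedly cooler".
from itertools import combinations

TOL = 2  # hundredths of a degree; gaps this small are treated as rounding noise

def order_flips(old, new, tol=TOL):
    """old/new map (year, month) -> anomaly in hundredths of a degree C."""
    flips = []
    for a, b in combinations(sorted(set(old) & set(new)), 2):
        d_old = old[a] - old[b]
        d_new = new[a] - new[b]
        # A flip: clearly warmer in one vintage, clearly cooler in the other.
        if (d_old > tol and d_new < -tol) or (d_old < -tol and d_new > tol):
            flips.append((a, b, d_old, d_new))
    return flips

# Toy snapshots with made-up numbers: 1934-07 vs 1998-07 flips ordering.
old = {(1934, 7): 80, (1998, 7): 72, (2005, 7): 60}
new = {(1934, 7): 60, (1998, 7): 75, (2005, 7): 61}
print(order_flips(old, new))  # one flipped pair: (1934, 7) vs (1998, 7)
```

    Any non-empty result would be the “huge red flag” described above: the size of an adjusted degree is not being held constant between dataset versions.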

  46. The people involved, the people who help Obummer keep repeating his lies, “hottest year ever,” blah blah blah, have jobs requiring them to check their personal integrity at the door. Stokes you seem to be one of them. This is the ugliest thing in America these days, and most Dems are unaware that their chosen Liar-In-Chief stoops this low. The Dem donors who fund this wretched nonsense have no shame, cannot have any pride left.

    The universities have become cesspools of mendacity. I took Econ 101 from a Communist at U of Michigan. His Teaching Assistant propounded the lies from the Club of Rome, Malthusians. I scoffed openly at this in class, he flunked me on the first hourly despite my having answered every question correctly, I dropped the class. This was in 1979! Things have not improved, seems actually far worse now. “Climate Science” indeed, not very scientific.

    This blog is doing God’s work. Expose it all to the light, watch it wither away like in the vampire movies, and the sooner the better…

  47. You extremists just can’t see the proper role for public servants adjusting the published temperatures on a daily basis so I don’t feel the hot or cold extremes so much. It’s called sustainability and trust them, they’re from the Gummint and they’re here to help.

  48. This is a critical area as it underpins the whole of CAGW. Articles I have read over the past few years suggest every dataset has been altered in a similar fashion. The suspicion is that most of the alterations are fraudulent.

    It suggests the organisations looking after this data are ‘not fit for purpose’ and the functions should be removed from these bodies.

    At the end of the day it is just a bunch of data. Most governments have statistical departments which could take over this function. Using professional statisticians would also overcome some of the other criticisms on how the data is handled. Any adjustments should be requested via a committee and the reasons for any accepted published on the web in case someone wants to check or appeal against their inclusion.

    There is another aspect of this. If any changes are fraudulent then in effect the data is being altered to provide more funding. Alternatively it may be regarded as using government funds to create propaganda. I would expect this to be illegal and making them open to a criminal prosecution.

    These kind of articles come and go and are ignored by all apart from skeptics. What I would like to see is a coordinated effort to produce a document on how these datasets have been altered and a campaign to remove the functions from the current bodies. A campaign forever until things are changed. These datasets are so fundamental to CAGW that destroying their credibility would be a massive blow. I think that this is the one battle that if won would win the war.

  49. Bill_W says:
    July 4, 2014 at 5:06 am
    John Harmon,

    I’m not so sure the Congressional record can’t be changed. After all, congress-folk are allowed to record a speech at a later date and insert it into the record when they were not even present the day the debate took place. I hope that the actual date they made the recording is prominently displayed but I have not checked
    ==========================================================
    If you can stomach watching and listening, congressional types nearly always end their spiel with:
    “I reserve the right to REVISE and extend my remarks”. A convenient out for the times they get called out being for or against an issue. Just another way of saying “I was for it before I was against it.”

  50. Changing the time of temperature observations at a given weather station makes a huge difference to the reading. Surely that is obvious. Further, the work trend has been for people to stay at work all day, rather than return home at lunch time to eat which used to happen. So older manual readings tend to be taken closer to the hottest part of the day, and newer manual readings tend to be taken early in the morning before going to work. Most recently automated stations presumably give you a full temperature profile throughout the day.

    So there’s no way that you can get a sensible trend from readings at the same place with different “Time of Observation”, which forces you to adjust it and guarantees that the adjusted temperature is going to change with time as the knowledge of how to do it improves. Thus, to cope with the change in time you would expect to increase the trend over time for the majority of stations, though a minority may reduce, and you should have the data (“Time of Observation” in the paper record) available to do this. Exactly what adjustment you should then make is going to be a matter of doing analyses and developing expertise which would then let you create rules, which you might hope would get better over time.

    The other factor which needs adjustment is the well-known UHI (“Urban Heat Island”) effect. This time you need to identify changes in individual station readings not reflected in readings in the same area (which need to include urban and rural stations), and mostly this ought to result in reducing the more recent temperatures for urban stations. The process need not be done by explicitly identifying urban stations – simply making adjustments for any station which suddenly starts to get out of line with the other stations in the region ought to cope with any UHI effects and similar changes in either direction.

    There are research papers and code available comparing “urban” with “rural” trends for the USA which determine whether there are any UHI distortions left in the data set, and the GISS temperatures come out pretty well on this analysis taking any one of four definitions of what “urban” might actually mean!
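    The neighbour-comparison idea described above can be sketched in a few lines. This is a toy illustration only: the stations, the synthetic warming rates and the 0.5 °C threshold are all hypothetical, not any agency's actual algorithm.

```python
# A minimal sketch of the neighbour-comparison idea: a station whose
# readings drift away from the average of surrounding stations is flagged.

def drift_vs_neighbours(station, neighbours, threshold=0.5):
    """station: list of annual means; neighbours: list of such lists.
    Returns True if the station-minus-neighbour-mean difference changes
    by more than `threshold` degrees between first and last decade."""
    n_mean = [sum(vals) / len(vals) for vals in zip(*neighbours)]
    diff = [s - m for s, m in zip(station, n_mean)]
    first = sum(diff[:10]) / 10
    last = sum(diff[-10:]) / 10
    return abs(last - first) > threshold

# Synthetic 50-year records: four rural stations warming slowly, one
# station with extra (UHI-like) warming on top of the regional trend.
rural = [[10.0 + 0.01 * y for y in range(50)] for _ in range(4)]
urban = [10.0 + 0.01 * y + 0.03 * y for y in range(50)]
print(drift_vs_neighbours(urban, rural))  # True: the divergent station is flagged
```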

  51. Peter:

    re your post at July 4, 2014 at 10:20 am.

    Your excuses for the ‘adjustments’ to GISS data do not wash.

    If you were right then each past datum would require being ‘adjusted’ ONCE. But past data are changed almost every month.

    The frequent GISS changes cause this.

    GISS data may be propaganda but it certainly is NOT scientific information.

    Richard

  52. From Phil.

    “””””…..The bottom line is that there is no “temperature data,” prior to satellite measurements (which suffer from their own issues) and the USCRN network, which only has about 10 years worth of data, IIRC, BEST, GISS, HADCRUT, etc. etc. notwithstanding. All of the temperature indices are essentially models (which may reflect the modelers’ biases and preconceptions more than the historical weather.)……”””””

    And the top line is that, almost concurrent with the satellite data era, is the era of those oceanic buoys that have been measuring simultaneously both the near-surface (-1 metre) oceanic water temperature and the near-surface/lower-troposphere (+3 metre) oceanic air temperature, on which Prof. John Christy et al. reported about Jan 2001 that those temps are not identical (why would they be?). Furthermore, they aren’t correlated, so all that 150 years of ocean water temps, obtained haphazardly at sea as proxies for oceanic lower-troposphere air temperatures to mingle with the other 29% of the earth’s surface that is solid ground instead of water, is just plain junk. Total rubbish.

    So lacking correlation, the correct air temps, are not recoverable from the phony baloney ersatz water Temperatures.

    So I’m in agreement. No global temperature data pre circa 1979/80; so anything actually approximating global temperatures has been getting measured for about one earth climate “interval” (30-35 years). Wait till we have maybe five under our belt.

  53. the bottom line is always the same – unless they (i.e. any of the dataset holders) can provide the genuine real RAW data, along with each and every subsequent change/adjustment and the documented reason for each and every adjustment – we, the ‘consumers’ – have absolutely no idea what we are getting.

    Now, if this was a food-type consumer product – we could take it or leave it if we don’t like it. However, this is, in effect, a WORLD-consumed ‘product’ (although force-fed product might be a better description) and we are not allowed to question it!

    I don’t have a problem with that in itself – if mugs want to use such data with blind trust – so be it. However, when said data is used and upheld as scientifically valid – that changes the game – it must be reproducible and proven as valid. Frankly, to this day, I don’t think we have reliable data – and certainly not without questions as to its history or validity! Ergo, in almost any other scientific endeavour, the ‘results’ or ‘conclusions’ based on such data would be thrown out or at best held in very low esteem (think wagonload of salt here!)………

  54. Walter Dnes (July 3, 2014 at 5:46 pm) “I can’t see any over-riding reason why the endpoints must join up.”

    Think of it as a single trend with multiple linear segments, not a set of unrelated linear trends.

    Taken separately, each trend line you show is technically correct for its period. But the period segments are typically very short and trends are dubious over short periods. More importantly, there’s a strong short cooling segment in which the start point is way above the previous end point, and the end point is way below the next start point. ie, your composite trend has large jumps in it which a trend line (even a composite trend line) shouldn’t have – not if it’s a “trend”. The trend for that particular segment is also heavily dependent on its start and end points – a modest change in end point will make a large change in trend. Lastly, the sum of all the trends over their selected periods is not equal to the trend over the total period, so the picture they give is misleading.

    It’s no big deal to calculate multi-segment trends that join up, if you have a general-purpose optimising algorithm for multiple variables. You simply do a least-squares optimisation with the temperatures at the segment ends as the variables. That’s n+1 variables for n segments. Even better, add in time as a variable at each internal segment end (ie, allow the join points to move horizontally), that’s another n-1 variables. I am horribly busy over the next few days so might not be able to do the calc for a while.
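    The joined-segments fit suggested above is indeed straightforward. In fact, with the knot times held fixed, a continuous piecewise-linear curve is linear in its knot values, so ordinary least squares suffices and no general-purpose optimiser is needed. A sketch with made-up data (the years, slopes and knot positions are illustrative only):

```python
import numpy as np

def fit_joined_segments(t, y, knots):
    """Least-squares fit of a continuous piecewise-linear trend.
    The variables are the y-values at the knots (n+1 values for n
    segments), exactly the parameterisation suggested above."""
    # Each basis function is a "hat": 1 at its own knot, 0 at the others,
    # linear in between, so the model is linear in the knot values and
    # ordinary least squares applies directly.
    A = np.column_stack([np.interp(t, knots, np.eye(len(knots))[i])
                         for i in range(len(knots))])
    knot_y, *_ = np.linalg.lstsq(A, y, rcond=None)
    return knot_y  # fitted value at each knot; segments join by construction

# Made-up data: a gentle "shaft" to 1970, then a steeper "blade".
t = np.arange(1910, 2006, dtype=float)
y = np.where(t < 1970, 0.005 * (t - 1910), 0.3 + 0.02 * (t - 1970))
print(np.round(fit_joined_segments(t, y, np.array([1910.0, 1970.0, 2005.0])), 3))
# recovers the knot values [0.0, 0.3, 1.0]
```

    Letting the join points move horizontally as well, as suggested, does make the problem nonlinear and would then need a general optimiser.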

  55. Maybe instead of “Think of it as a single trend with multiple linear segments” try “Think of it as a curve-fit with multiple linear segments”.

  56. Richard Courtney says that the GISS guys should get only one shot at adjusting the data – once they have had that – no further changes. This would be a pretty stupid approach. Here’s how it would work….

    The GISS team have already adjusted readings for Time of Observation but now realise that the new electronic min-max thermometers installed in stations on average give a maximum temperature 0.4C lower and a minimum temperature 0.3C higher than the previous manual readings. But Richard’s rule says that they have already had their shot at adjusting, so it’s not permissible to make the new change.

    And more scrutiny of the data has brought to light that a few guys were cheating in the 1920s – they went on holiday without arranging a stand-in and decided the easiest approach was to record the same temperature for a week. Sorry guys – you’ve had your one adjustment and the adjusted temperatures are now fixed in perpetuity.

    See http://reason.com/archives/2014/07/03/did-federal-climate-scientists-fudge-tem for further cases – despite the title, whoever wrote that piece has done more than take the superficial view that “adjustments are always bad – particularly adjustments in the direction we don’t like”.

  57. http://eric.worrall.name/giss/ is a 3d navigable “lunar lander” construction based on the temperature data supplied by Walter Dnes in the post above. Tested in Safari and Chrome, *might* work in IE10.

    Only click the link if you have a high spec computer running the latest OS – it uses advanced 3D / html code, which stretches the computer to the limit.

    The terrain from left to right runs from the oldest GISS snapshots to the newest snapshots on the right. From front to back is the oldest temperature anomalies (1880s) to the newest anomalies at the back.

    You can navigate with the arrow keys, up = forwards, down = backwards, left and right.

    Enjoy :-)

  58. People really ought to look into R. R is great for examining data. For example, it took me all of ten minutes (including the time to familiarize myself with some unfamiliar commands) to generate 94 images showing the changes between each iteration of the data set. I’ve uploaded that collection of images here. It takes only five or so lines of code to repeat the process.

    I’m not drawing any conclusions on this topic. I haven’t spent the time. I just would like to see people move away from things like Excel.
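    For those who prefer Python to R, the same snapshot-by-snapshot comparison takes only a few lines. The file layout assumed below (a year followed by 12 monthly values in hundredths of a degree) is a hypothetical stand-in for the actual downloads:

```python
# Snapshot-to-snapshot comparison of two dataset downloads.
import csv

def load_snapshot(path):
    """Read one monthly download into {year: [12 monthly values]}."""
    with open(path, newline="") as f:
        return {int(r[0]): [int(v) for v in r[1:13]] for r in csv.reader(f)}

def snapshot_diff(old, new):
    """Per-month change (new minus old) for years present in both snapshots."""
    return {y: [b - a for a, b in zip(old[y], new[y])]
            for y in old if y in new}

# Toy example with made-up values:
print(snapshot_diff({1998: [70] * 12}, {1998: [72] * 12})[1998][:3])  # [2, 2, 2]
```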

  59. Kev-in-Uk says the GISS team should document each change and the effect on all the data from the previous change. The implication seems to be that he thinks they take the previous set of adjusted changes and make a further set of changes on that. Of course that isn’t what happens – each time they would go back and reprocess the raw data.

    What happens is that some new anomaly comes to light in the temperature data set, leading to a requirement for a change to the adjustment process. A new version of the adjustment computer code would be written. Once this has been run against the raw data, the new version of the temperature data set would be subject to regression testing to check that the new code version has not re-introduced any of the old identified problems and that it does indeed fix the newly identified anomaly. At that point the new version of the data can be released with a note saying why the processing has changed.

    All this should be standard IT practice – nothing special about temperature data sets here.

    In fact it will be a two-step process – the raw station data is processed in the manner described above, then a separate step is run which aggregates the results with suitable weighted averaging to provide data for a region, set of grid points or globally.

    Now what is the important thing? Is it important to know what errors were present in every single one of the past versions of data and to work out exactly how correction of these errors has changed every single station reading? This would be a total waste of time.

    The important thing is to be able to show that the new set of data contains none of the original flaws or biases that the data analysts have spent so much time trying to identify. The real deliverable for openness is the new version of code used to make adjustments on the raw data. An expert reading this code could determine the method of adjustment, whether it resolved all the issues identified in the past, and whether it was likely to introduce any systematic biases of its own.

    You would hope that over time the frequency and the effect of new adjustments would go down, and this seems to be what is happening to the GISTEMP data set.

    There are also ways to validate whether the adjusted data is as good as you can make it….

    Someone else can write a completely different set of code using different techniques for resolving known anomalies in the same raw data, and this is what was done with the BEST project. BEST found that the adjustment techniques used by the NOAA were pretty good.

    You can compare with the satellite data, available only since 1979. However, this has problems of its own. While weather stations take readings from one place at one point in time, satellite sensors in the various microwave bands read the average temperature over a particular (and usually fairly broad) range of heights in the atmosphere. To get the “lower troposphere” temperature you have to add and subtract temperatures for different bands, then make allowances for orbital changes. Further, somehow you have to back-calibrate all sensors from old satellites against the better sensors on the newer satellites, since not all old satellite sensors overlapped in time with the new sensors. Some new satellite sensors have had to be disregarded because they were clearly faulty. All in all, it makes ground station adjustments look easy, which is why the RSS and UAH processing often give very different answers from the same set of available sensor readings.

    Some fairly significant problems were found with these satellite data sets in the early days, and it is still not clear how accurate they are. The resulting temperatures swing higher on the peaks and lower on the troughs of global temperature than the ground station readings. But it does provide a completely independent set of readings against which ground station readings can be compared.
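    The regression-testing workflow described earlier in this comment can be sketched as plain assertions: each previously identified problem becomes a permanent check that every new version of the adjustment code must pass. The adjust() function and the “stuck week” check below (the 1920s observer-on-holiday case mentioned above) are hypothetical illustrations, not anyone's actual code:

```python
def adjust(raw):
    """Stand-in for the adjustment code: here, drop week-long runs of
    identical readings (the 'observer on holiday' case) by marking them
    missing (None)."""
    out = list(raw)
    for i in range(len(out) - 6):
        window = out[i:i + 7]
        if None not in window and len(set(window)) == 1:
            out[i:i + 7] = [None] * 7
    return out

def regression_suite(adjusted):
    # Check: no week of identical consecutive readings survives adjustment.
    for i in range(len(adjusted) - 6):
        w = adjusted[i:i + 7]
        assert not (None not in w and len(set(w)) == 1), "stuck-week bug is back"

raw = [21.0, 20.5] + [19.0] * 7 + [22.0, 21.5]
adjusted = adjust(raw)
regression_suite(adjusted)  # passes silently
print(adjusted[2:9])        # the stuck week is now flagged as missing
```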

  60. Peter:

    At July 5, 2014 at 1:44 am you say

    Now what is the important thing? Is it important to know what errors were present in every single one of the past versions of data and to work out exactly how correction of these errors has changed every single station reading? This would be a total waste of time.

    NO! ABSOLUTELY NOT! Only a charlatan could think such a thing!
    The asserted “total waste of time” provides uncorrected error to every analysis which uses the data set. No scientist can find that acceptable.

    So, it IS “important to know what errors were present in every single one of the past versions of data”. And that can only be determined in falsifiable manner by knowing “exactly how correction of these errors has changed every single station reading”

    You follow that with this

    The important thing is to be able to show that the new set of data contains none of the original flaws or biases that the data analysts have spent so much time trying to identify. The real deliverable for openness is the new version of code used to make adjustments on the raw data. An expert reading this code could determine the method of adjustment, whether it resolved all the issues identified in the past, and whether it was likely to introduce any systematic biases of its own.

    But that is an admission of “original flaws” in the previous data, and it says the data analysts have identified those “flaws”. Why not publish the “flaws” and their effect if they truly are identified?

    Richard

  61. Ooops.

    I wrote
    The asserted “total waste of time” provides uncorrected error to every analysis which uses the data set. No scientist can find that acceptable.
    I intended to write
    The asserted “total waste of time” provides unquantified error to every analysis which uses the data set. No scientist can find that acceptable.

    Sorry.

    Richard

  62. Dear Mr. Dnes,
    I discovered in March 2012 that NASA-GISS had retroactively changed the temperature data they offered in March 2010. I had downloaded the 2010 data and put it into my archives. Between March 2012 and January 2013 I evaluated the discrepancies between the 2010 data and the 2012 data for 120 stations. The results are dealt with in a comprehensive paper. Meanwhile a German version has been successfully peer-reviewed and will be published later this year. I also prepared an English version and sent it, among others, to Mr. Watts but got no answer. I would like to send you that paper – perhaps we can cooperate – if I had your e-mail address. Please send it. Many thanks in advance and best regards
    Friedrich-Karl Ewert

  63. Richard,

    Here is the current GISTEMP log of each update where they had problems of one sort or another and what they did about it – it makes interesting scanning, though most data updates seem to go through with no problem – http://data.giss.nasa.gov/gistemp/updates_v3/.

    The pre-2011 update records are here – http://data.giss.nasa.gov/gistemp/updates/.

    Here’s an overview description of what they do with the data – http://data.giss.nasa.gov/gistemp/sources_v3/gistemp.html

    If you want the current code which does all these things it is linked to from here – http://data.giss.nasa.gov/gistemp/sources_v3/

    And here’s a FAQ page – http://data.giss.nasa.gov/gistemp/FAQ.html

  64. Peter:

    Thank you for your reply to me at July 5, 2014 at 3:46 am.

    I appreciate your links which may be of use to others who were not aware of them.

    There are two issues your post ignores.

    Firstly, the GISS changes continue to occur almost every month, and have done for years. Such major changes invalidate any assessments based on the data from earlier ‘versions’.

    Secondly, in response to your having written

    Now what is the important thing? Is it important to know what errors were present in every single one of the past versions of data and to work out exactly how correction of these errors has changed every single station reading? This would be a total waste of time.

    I replied

    NO! ABSOLUTELY NOT! Only a charlatan could think such a thing!
    The asserted “total waste of time” provides unquantified error to every analysis which uses the data set. No scientist can find that acceptable.

    So, it IS “important to know what errors were present in every single one of the past versions of data”. And that can only be determined in falsifiable manner by knowing “exactly how correction of these errors has changed every single station reading”

    Your post purports to be a response to my reply but fails to mention my reply.

    Richard

  65. Friedrich-Karl Ewert says:
    > July 5, 2014 at 2:58 am
    >
    > Dear Mr. Dnes,
    > I discovered in March 2012 that NASA-GISS changed retro-actively their

    First, a note that I do not have a university degree. My only post-secondary piece of paper is a community-college certificate in computer programming from many years ago. Beyond that, I’m an interested layman. My main talent is number-crunching data – sort of like “Harry the programmer” in ClimateGate. I don’t have the scientific/statistical background to do an in-depth analysis on my own. For instance, in this article I simply took the available data, reformatted it for importing into a spreadsheet, and plotted it. I’m willing to help within my abilities, but I wanted you to know my limitations.

    By the way, the contact form on your website is apparently “under construction”.

  66. Peter says:
    July 5, 2014 at 1:44 am
    I was actually referring to all the temp dataset holders – which have been shown elsewhere to be erroneous (can’t remember the thread, but I had to point out that BEST did use adjusted data despite Mosh and others’ protestations to the contrary).
    Returning to GISS in particular, I have no idea if GISS do adjustments ‘on’ adjustments themselves (but we know others do!), but my primary point was to say that with ALL the datasets – and remember that many of them are cross-correlated to some degree as they share station data, etc. – I have never YET seen a raw dataset followed by an adjusted dataset COMPLETE with a description of each and every ‘wave’ of adjustments. And I don’t mean a generic type description – I mean a station-by-station algorithmic and then manual QC check, followed by careful checking of details (e.g. does the station still exist even though there is missing data! see recent WUWT thread! LOL), recording of said findings, and finally an explanation of the adjustment made and why. Simply put – without such detail, any data made available or used in any scientific capacity (such as, say, ‘proving’ CAGW – lol again!) is scientifically invalidated, as the scientific method is clearly being ignored.
    When the harryreadme txt file was released it clearly showed that no one knows the situation with crutemp or hadcrut (can’t remember which one) and hence the data is unverified – or more specifically UNVERIFIABLE in any way, shape or form! (This, of course, bearing in mind that Jones stated the raw data was not available anymore!)
    I have no problem with adjustments – or more specifically ‘corrections’ – being made, but if you cannot produce a full traceability trail for the ‘data’ – and I mean FULL – forget it… without such traceability, the data and the data pushers are putting themselves up for ridicule!

  67. Well, seeing as how WUWT regularly has its facts wrong and blunders in essential methodology (sorting datasets before computing correlation) and so on, I do not see why I should immediately believe what they say here.

    [Reply: State which facts are wrong. ~ mod.]

    Paai

  68. Kev-in-UK said “However, when said data is used and upheld as scientifically valid – that changes the game – it must be reproducible and proven as valid. Frankly, to this day, I don’t think we have reliable data – and certainly not without questions as to its history or validity! Ergo, in almost any other scientific endeavour, the ‘results’ or ‘conclusions’ based on such data would be thrown out or at best held in very low esteem (think wagonload of salt here!)”

    Kev,

    Sure we’d all like perfect historical data with proof that it is valid, but for the historical readings we have got only what we have got, so have to make the best of it – you can’t go back in time to do individual station equipment quality checks. All you can do is to quality check your current equipment and then post-process and adjust the historical observations to see what you can get out of them.

    “Throwing out” is not an option – we have no better data, and “esteem” should be expressed in more statistical terms.

    Specifically, with post processing and adjustment of station readings you should not only emerge with a best estimate as to what the temperature actually was at that time, but also with a statistical calculation of the expected error in the readings. That error estimate can be used to tell you how much confidence to place in any results you get when you use the figures, and what the likely bounds of the results should be.
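    As an illustration of the point about error estimates, here is a minimal sketch (hypothetical anomaly numbers, plain ordinary least squares; not GISS’s actual method) of producing a trend together with a standard error on the slope, so a result can be quoted with error bars rather than as a bare number:

    ```python
    import math

    def trend_with_error(years, anomalies):
        """Ordinary least-squares slope with its standard error.

        Returns (slope, stderr) so a trend can be reported as
        slope +/- stderr rather than as a bare figure.
        """
        n = len(years)
        mx = sum(years) / n
        my = sum(anomalies) / n
        sxx = sum((x - mx) ** 2 for x in years)
        sxy = sum((x - mx) * (y - my) for x, y in zip(years, anomalies))
        slope = sxy / sxx
        intercept = my - slope * mx
        # Residual variance with n - 2 degrees of freedom
        resid = sum((y - (intercept + slope * x)) ** 2
                    for x, y in zip(years, anomalies))
        stderr = math.sqrt(resid / (n - 2) / sxx)
        return slope, stderr

    # Hypothetical anomalies in hundredths of a degree C
    years = list(range(1990, 2000))
    anoms = [20, 22, 19, 25, 27, 24, 30, 29, 33, 31]
    slope, stderr = trend_with_error(years, anoms)
    ```

    A real analysis would also need to account for autocorrelation in the series, which widens the error bars; this sketch only shows the principle.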

    So maybe a scientist does a sound piece of research using the adjusted data, including a detailed error calculation, and produces a well-written paper containing, maybe, a trend with error bars. What happens then? Well it probably gets published in a reputable peer-reviewed journal which only makes articles available online behind a paywall. Scientists and academics have access because their institutions subscribe, but general members of the public (who have often paid for the research in their taxes) would have to pay $20-30 per throw to get at such stuff, and who can afford it, particularly since, despite reading the abstract, you don’t necessarily know until after you have bought a paper whether it really was worth it. Such is life!

  69. Richard, splitting my response to your post this part relates to your statement “Firstly, the GISS changes continue to occur almost every month and have done this [link supplied]. Such major changes invalidate any assessments based on the data from earlier ‘versions’.”

    The link is interesting because it brings out a few points.

    Firstly, the 1987 adjustments were made more than ten years before Mann published his hockey stick paper. In those days everyone assumed that temperature variations were more or less random with no detectable long-term trend, but that in a few thousand years’ time there was likely to be an ice age. In my mind is an image of Hansen poring over the temperature data and asking himself whether anyone but a few academic researchers was ever going to be interested in a new graph of adjusted GISS historical temperature records.

    So even if you believe in some sort of AGW conspiracy post-Mann, it clearly was not in existence in 1987, so whoever was responsible for the data set changes clearly thought it was the right thing to do.

    Since it is almost impossible to attribute an ulterior motive to the 1980 / 1987 differences, why is there such keenness to do that for the 1980 / 2007 or 1987 / 2007 differences? Why attribute a change of motive?

    Secondly, the most important thing is surely whether the most recently published version is correct. The existence of older versions with known flaws does not change the confidence in the most recent version, and there should be an accompanying set of error data with it to allow error bars to be placed on results using that data.

    Thirdly, although the data set does change regularly, in general any significant set of changes is denoted by giving a data set a new version number. Here is the scheme used for the GHCN dataset which is used as source data by the GISS processing:

    ‘The formal designation is ghcnm.x.y.z.yyyymmdd where
    x = major upgrades of unspecified nature to either qc, adjustments, or station configurations and accompanied by a peer reviewed manuscript
    y = substantial modifications to the dataset, including a new set of stations or additional quality control algorithms. Accompanied by a technical note.
    z = minor revisions to both data and processing software that are tracked in “status and errata”. ‘

    It is surely only worth repeating any assessments when the x or y control number changes. This does not seem to be on a monthly basis, but rather less frequently than annually. Further, if the results from using a new subversion (change in y) are pretty similar to those from the previous subversion then again it is not worth a full assessment. In other words the guys supplying the data go out of their way to make life easy for you.
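    As a sketch of how that versioning rule could be applied mechanically (the designation format is as quoted above; the decision rule and function names are illustrative, not anything GHCN or GISS publishes):

    ```python
    import re

    def parse_ghcnm_version(designation):
        """Split a ghcnm.x.y.z.yyyymmdd designation into its parts."""
        m = re.fullmatch(r"ghcnm\.(\d+)\.(\d+)\.(\d+)\.(\d{8})", designation)
        if m is None:
            raise ValueError("not a ghcnm designation: %r" % designation)
        x, y, z, stamp = m.groups()
        return int(x), int(y), int(z), stamp

    def reassessment_needed(old, new):
        """True when x or y changed, i.e. a major upgrade or a
        substantial modification; z-only changes are minor revisions
        tracked in "status and errata"."""
        ox, oy, _, _ = parse_ghcnm_version(old)
        nx, ny, _, _ = parse_ghcnm_version(new)
        return (nx, ny) != (ox, oy)
    ```

    For example, `reassessment_needed("ghcnm.3.2.0.20140101", "ghcnm.3.2.1.20140601")` is false (a z-only revision), while a change from 3.2.x to 3.3.x would flag a re-assessment.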

  70. RichardSCourtney said ‘The asserted “total waste of time” provides unquantified error to every analysis which uses the data set. No scientist can find that acceptable.
    So, it IS “important to know what errors were present in every single one of the past versions of data”. And that can only be determined in falsifiable manner by knowing “exactly how correction of these errors has changed every single station reading”‘

    My reading of your statement is that you are saying that you wish to know about every single station reading in every single version of a temperature data set because somehow it affects the errors present in the most recent version of the data set. If that is not what you are saying then you had better correct me.

    Each new version of GISTEMP or USHCN stands alone. The metadata documents all errors, anomalies and identifiable inaccuracies which have ever been discovered with the raw or source data, and the code release for each version handles all of these. They are documented and handled not as fixes to a single weather station and/or single reading, but as a fix to a situation which is going to occur for multiple weather stations and/or readings. Further, there will be an accompanying set of error data with each release of “best estimate” temperature data.

    So we have a current list of situations handled by the current code. It is difficult to see how the subset of situations handled by previous versions of the code has any bearing at all on the best estimate temperature data or the error estimates for a current data set. Typically an old version would handle only a subset of the anomaly situations, or an inferior algorithm for a particular situation.

    In other words once BEST (or whatever process you are using to validate a data set) had done comparisons with GISTEMP (or some other dataset) at a particular version and validated the processing, then there is no more to be said about any previous versions and no point in making comparisons between older versions pre-validation and newer versions post-validation.

    Even if there is no validation process, the presence or absence of differences between the current and past versions of a data set (particularly ones from 20 years ago) does not change the validity or error estimates for the current version, though they might indeed change the error estimates for the older versions.

  71. Peter:

    Thank you for your post at July 6, 2014 at 7:24 am, which attempts to address my objections to your assertions.

    Firstly, you say to me

    My reading of your statement is that you are saying that you wish to know about every single station reading in every single version of a temperature data set because somehow it affects the errors present in the most recent version of the data set. If that is not what you are saying then you had better correct me.

    What I “wish to know” is not relevant. Scientific standards require publication of each and every change to each and every datum because the data has been changed and somebody who wants to use it may need to know the change; n.b. each and every change to each and every datum.

    The effect of what you say is for you to be claiming that an assertion of “Trust me cos I’m a scientist” is an adequate replacement for a detailed exposition of each and every change to each datum. It is not an adequate replacement.

    You make an extraordinary statement about the changes when you write

    They are documented and handled not as fixes to a single weather station and/or single reading, but as a fix to a situation which is going to occur for multiple weather stations and/or readings.

    The documentation is inadequate if it does not apply to each change. Simply, an undergraduate would get a FAIL mark on an assignment in which she said her documentation concerning changes did not report the changes to each “single weather station and/or single reading”.

    You conclude saying

    Even if there is no validation process, the presence or absence of differences between the current and past versions of a data set (particularly ones from 20 years ago) does not change the validity or error estimates for the current version, though they might indeed change the error estimates for the older versions.

    The data changes most months and previous versions differ dramatically. As I said,

    Such major changes invalidate any assessments based on the data from earlier ‘versions’.

    You now say to ignore those earlier versions, but those versions – which were used to justify political actions – were asserted to be more accurate than the changes made subsequently. Frankly, the GISS history of error estimation provides doubt concerning the validity of the present error estimates.

    Richard

  72. Richard said “Scientific standards require publication of each and every change to each and every datum because the data has been changed and somebody who wants to use it may need to know the change; n.b. each and every change to each and every datum.”

    The computer run to produce a specific version of an adjusted dataset always starts with the raw data and not with a previous version of the dataset. Thus there is no scientific or statistical requirement to document changes by station from previous versions of the dataset.

    The documentation required is the following :

    1) Raw input data
    2) Specification of the processing which will be done on the raw data.
    This will typically use language describing common conditions. For instance, it might describe how to handle missing readings from stations which would normally report. This is a generic process – there is no requirement to list in the specification all the days missing for each station and new data may come to light at some point which fixes the problem in later runs.
    3) The computer code which does the processing.
    Someone wishing to validate the process can read this code to look for problems and may wish to amend it to make it easier to check how the anomalies relating to certain stations were handled if it does not already write out interim files after each adjustment step. The code should clearly handle all the situations documented in the specification correctly. Another method of checking the code would be to write an independent set of code from scratch (which has been done for GISS adjustment code) and then just compare the two sets of output which should then be substantially the same, though not necessarily absolutely identical. It is very unlikely that the same bug will be present in two independently written versions of code although they might handle the same condition in very slightly different ways.
    4) The adjusted output data

    All this stuff is available for GISS temperature data.

    No-one in their right mind is going to write a spec which includes a detailed set of processing requirements for every individual station – it is a total and utter waste of time as the information can be extracted from a computer run if necessary. It is possible that some conditions are in the spec and apply to only a single station and are documented as such, but this is likely to be a very small number.

    The data on precisely what changes are made to each station reading during the different processing steps within a specific (normally the current) version of code are available from the computer run, if they are needed.

    Now this is a definitive and complete set of information to tell an interested party exactly how the data adjustments have been done in a particular version for each and every station if necessary.
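    As a sketch of the independent-implementation check mentioned in point 3 above – comparing two adjusted outputs that should be substantially, though not necessarily exactly, the same (the station IDs, months and values here are made up for illustration):

    ```python
    def compare_runs(run_a, run_b, tolerance=0.01):
        """Compare two adjusted outputs keyed by (station_id, month).

        Independently written implementations should agree to within
        a small tolerance; return (key, a, b) triples that do not,
        including keys present in only one of the two runs.
        """
        mismatches = []
        for key in sorted(set(run_a) | set(run_b)):
            a = run_a.get(key)
            b = run_b.get(key)
            if a is None or b is None or abs(a - b) > tolerance:
                mismatches.append((key, a, b))
        return mismatches

    # Hypothetical station/month keys with anomalies in degrees C
    run_a = {("ST001", "2005-08"): 0.62, ("ST002", "2005-08"): 0.41}
    run_b = {("ST001", "2005-08"): 0.62, ("ST002", "2005-08"): 0.55}
    mismatches = compare_runs(run_a, run_b)
    ```

    A real comparison would run over the full station set and report the distribution of differences, not just a pass/fail list.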

    You seem to be suggesting that such validation should include looking at all previous data set versions, but clearly from the above this is not necessary in the process of validation in order to meet the requirements of the users, the science and the statistics.

  73. Peter:

    You follow your fog of irrelevant detail in your post at July 7, 2014 at 4:28 pm with this statement

    You seem to be suggesting that such validation should include looking at all previous data set versions , but clearly from the above this is not necessary in the process of validation in order to meet the requirements of the users, the science and the statistics.

    No, Peter. I am making two specific points which you are avoiding.

    I copy those points from my most recent post to you at July 6, 2014 at 8:21 am although that post iterated them from earlier posts I put to you.

    Point 1 repeatedly not addressed by ‘Peter’

    What I “wish to know” is not relevant. Scientific standards require publication of each and every change to each and every datum because the data has been changed and somebody who wants to use it may need to know the change; n.b. each and every change to each and every datum.

    The effect of what you say is for you to be claiming that an assertion of “Trust me cos I’m a scientist” is an adequate replacement for a detailed exposition of each and every change to each datum. It is not an adequate replacement.

    i.e. basic scientific standards require that each and every change to each and every datum is fully described and recorded for each and every datum.

    Point 2 repeatedly not addressed by ‘Peter’

    The data changes most months and previous versions differ dramatically. As I said,

    Such major changes invalidate any assessments based on the data from earlier ‘versions’.

    You now say to ignore those earlier versions, but those versions – which were used to justify political actions – were asserted to be more accurate than the changes made subsequently. Frankly, the GISS history of error estimation provides doubt concerning the validity of the present error estimates.

    i.e. Data now provided by GISS indicates that earlier data sets from GISS were so erroneous as to be worthless, which renders worthless all analyses that used those earlier data sets. Therefore, the history of changes to GISS data suggests that all data from GISS is so erroneous as to be worthless: this suggestion is supported by the continuing process of regular changes which GISS continues to make to the data.

    I hope the issues are now clear.

    Richard

    • Richard,

      It is fairly clear that your objections to the data adjustments in the GISTEMP data sets are mainly that you do not like the results, rather than being based on a clear understanding of precisely how and why temperature data is adjusted.

      In the unlikely event you wish to learn about temperature data sets and how they are processed, rather than just making evidence-free political comments, there is an organisation http://www.surfacetemperatures.org/ which is making available temperature data all the way from sources such as images of hardcopy records to the finished adjusted data set, using the code supplied. It also welcomes involvement from outside the climatology community.

      So the opportunity is there to find out for yourself if you wish to avail yourself of it.

  74. For [b]Point 1[/b] Richard Courtney said “.. basic scientific standards require that each and every change to each and every datum is fully described and recorded for each and every datum.”

    You are making up requirements that do not exist. Here is why.

    I was responsible for the design of multiple government systems to track farm animals and their diseases in the UK. The first databases indeed had two time axes – one was the date of an event (e.g. the date a cow was born and who its parents were) and the second was the time when the system was notified of that event (e.g. one week after the birth), which had to be after the event itself, naturally. You could answer questions like “what was our view on 1st January 2005 of the count of cows and their split by breed?” The system had this capability on the source data feed, reformatted and final query versions of the database.

    However, in practice this led to a complex and expensive system to build and maintain and the capability was almost never used. The later systems retained the second time stamp on the source data feed only, eliminated the reformatted stage and the final query version became just the most up-to-date version. But should anyone want to go back to the view of a particular animal or farm on a specific date, then it was possible to do this because the information was available in the source data. A competent data analyst could either examine the source records and build a report from them, or the system team could rebuild a version of the database in another place as it would have been on a particular date allowing the standard query tools to be used against it.

    You do not appear to distinguish between the raw data and the adjustment processing. The raw data for a particular station [b]does not normally change[/b] between published versions of the adjusted data output. Perhaps occasionally someone might find an additional paper record for a station that was not available before, but this is once in a blue moon.

    The processing between versions is different – the new version processing is an incremental improvement on the old version. But the spec, code and adjusted output data is available for each version, and from that a competent person can analyse why a particular reading (e.g. global monthly average for June 2005) changed between versions. It would be a waste of expensive resource for the GISS team to go any further because you cannot really guess what a particular user of the data might want in the way of comparing versions. The user of the data can get what they want out of the data – the capability is there – but they have to be sufficiently computer skilled (which is pretty much a prerequisite for most of this stuff).
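    A minimal sketch of the kind of between-version comparison a sufficiently computer-skilled user could run for themselves (the station months and values are hypothetical; GISS publishes anomalies as integers in hundredths of a degree C):

    ```python
    def version_deltas(old, new):
        """Per-month change, new minus old, for months present in both
        published versions of one station's adjusted series.

        Values are integers in hundredths of a degree C, matching the
        format GISS uses in its anomaly tables.
        """
        return {month: new[month] - old[month] for month in old if month in new}

    # Hypothetical monthly anomalies for one station from two downloads
    v_2005 = {"1998-01": 58, "1998-02": 71, "1998-03": 49}
    v_2014 = {"1998-01": 55, "1998-02": 71, "1998-03": 53}
    deltas = version_deltas(v_2005, v_2014)
    ```

    Plotting such deltas over the whole record is essentially what the graphs in the head post do.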

    At an overview level it is pretty clear why the new version adjusted output differs from the old version output – it was because the old version did not handle documented anomaly [b]types[/b] X, Y and Z which are handled in the new version and documented as new in the new version spec.

    As far as odd changes to the most recent version output data caused by some records coming in late, this would only affect the most recent few weeks of the adjusted output, and any competent user would know this. Generally the GISS event log documents this. The easy way out of it is to avoid using the most current data (maybe for a period of a few weeks) when doing an important analysis. Or just use it but expect changes – but if you do, don’t come back and moan when the data for the most recent few weeks does change.

    There’s no “trust me I’m a scientist” involved in the comparison for a particular purpose of current with past versions, but certainly whoever does the comparison is going to have to be a competent computer programmer. All the stuff required is there.

    [Square html brackets do NOT work under the WordPress settings on this site. Use normal html coding angled brackets only. We recommend the "Test" page to verify your work. .mod]

  75. Richard Courtney’s Point 2 was “… data now provided by GISS indicates that earlier data sets from GISS are so erroneous as to be worthless which renders worthless all analyses that used the earlier data sets and, therefore, the history of the changes GISS data suggests that all data from GISS is so erroneous as to be worthless: this suggestion is supported by the continuing process of regular changes which GISS continues to make to the data.”

    (Aside – how do you highlight text here? BBcode tags don’t work. I’m trying raw HTML tags next!)

    Richard, if you are talking about the fact that anything based on the most recent couple of weeks of data changes then this is just a fact of life as data feeds can go wrong and when they do it takes a short while to fix matters. The user of the data always has to be aware of this.

    Clearly the adjusted data output also changes between versions. Assume the same model for anomalies found and corrected as you would for a system test of a new IT system, but extended over a period of years or decades rather than weeks or months.

    In the early days you will easily find gross errors and will find them at a fast rate, plus some of the more subtle errors. However, at least for an IT system test, there are going to be a few subtle errors you could not find at this point as they are masked by other errors.

    As time goes on then most of the gross errors will have been found, and the rate of finding errors will drop. If you plot total errors against time then it will be an asymptotic curve. As a project manager you can work out when you are likely to finish system test by seeing where you are on that curve. Note that you expect you will only find a proportion of errors in testing (hopefully a high proportion of them) – not all of them.
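    The asymptotic error-discovery curve described above can be sketched with a simple constant-hazard model (the model and numbers are illustrative only, not fitted to any real defect log):

    ```python
    import math

    def expected_found(total_errors, rate, t):
        """Cumulative errors found by time t under a simple saturating
        model: each remaining error is found at a constant hazard
        `rate`. The curve total*(1 - exp(-rate*t)) is asymptotic to
        `total`, so the find rate drops as testing proceeds."""
        return total_errors * (1.0 - math.exp(-rate * t))

    def fraction_remaining(rate, t):
        """Fraction of errors still undiscovered at time t."""
        return math.exp(-rate * t)
    ```

    Under this toy model, with 100 latent errors and a hazard of 0.5 per period, over 90 would be expected after five periods; where a project sits on such a curve is what tells a manager how mature the process is.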

    So where is GISS on the curve of finding and fixing anomalies in the raw data? The only examples you supplied were for 1980, 1987 and 2007, and to be honest the 1987-to-2007 comparison is pretty worthless because you need more points than these to define the curve, particularly more recent points. There were certainly significant changes visible between the three years.

    Nick Stokes (a reputable climate data analyst) has said that the GISS team make very few changes to their processing methods nowadays and that little changes in the historical adjusted data. In the absence of any evidence to the contrary we might as well believe Nick. If it is thus increasingly difficult to find new anomalies to fix, that is indicative that the adjustment process copes with most of the anomalies present in the raw data and that the adjusted data output is mature and likely to be approaching its ultimate accuracy.

    The fact that there are huge changes between, say 1987 and 2007 only tells you that 1987 was wrong – it says nothing about the state of the 2007 data, for which you would need to know the rate of changes going on around 2007.

    A V2 to V3 comparison of data around 1980 (as in your three graphs) would give a good indication of the state of maturity of the GISS feed. In the absence of any such comparison Nick’s comments indicate that the quality of the GISS feed is now high. Comparisons with 1980 and 1987 are worthless in this regard.

  76. Peter:

    Your post at July 8, 2014 at 2:11 am says

    I was responsible for the design of multiple government systems {snip}

    That explains everything; i.e. you are an example of Sir Humphrey Appleby.

    It is no wonder that all I have obtained from you is obfuscation and bloviation.

    And the fact that you messed up the task of designing “multiple government systems to track farm animals and their diseases in the UK” is no reason for GISS to also abandon the scientific method.

    Richard

  77. Peter:

    At July 10, 2014 at 12:51 pm you say to me

    It is fairly clear that your objections to the data adjustments in the GISTEMP data sets are mainly that you do not like the results, rather than being based on a clear understanding of precisely how and why temperature data is adjusted.

    NO!
    Anybody who reads the thread can see that is not true.

    I have repeatedly asked for specific justifications using as many different forms of words as I could. You have replied with evasion and obfuscation.

    I know the published details of what is done to “adjust” the data and said so when you first posted links to such procedures as one of your evasions. And I will not lower myself to share in “involvement” with it.

    What is done is not at issue. At issue are
    (a) the validity of what is done
    and
    (b) the effects of what is done.

    I have been querying (a) and (b) and you have tried to discuss details of what is done as a method to avoid the discussion.

    Richard

Comments are closed.