GISS "raw" station data – before and after

I’ve been following this issue for a few days, looking at a number of stations, and had planned a detailed post about my findings. But WUWT commenter Steven Douglas posted in comments about this curious recent change in GISS data, and it was picked up by Kate at SDA, so I need to comment on it now. This goes back to the early days of surfacestations.org in June 2007 and the second station I surveyed.

Remember Orland? That nicely sited station with a long record?

Note the graph I put in place in June 2007 on that image.

Now look at the graph in a blink comparator showing Orland GISS data plotted in June 2007 and today:

NOTE: on some browsers, the blink may not start automatically – if so, click on the image above to see it

The blink comparator was originally by Steven Douglas. However, he made a mistake in the “after” image, which I have now corrected. What you see above is a graphical fit made by aligning and scaling the bitmap images. This is why the dots and lines appear slightly smaller in the “after” image. I don’t have the 2007 GISS Orland data handy at the moment, but I do have the GISS station plot for Orland from that time and the present one, downloaded from the GISS website today. If I locate the prior Orland data, I’ll redo the blink comparator.

I believe this blink comparator representation accurately reflects the change in the Orland data, even if the dots and lines aren’t exactly the same thickness.

Douglas writes in his notice to me:

It appears that RAW station plots are no longer available, although NASA GISS (Hansen et al) do not say it in this way. Here is the notice on their site:

Note to prior users: We no longer include data adjusted by GHCN and have renamed the middle option (old name: prior to homogeneity adjustment).

I don’t know about the “renamed” option, but the RAW data appears to be NO LONGER AVAILABLE.

Here’s a detailed blink comparison of Orland. All their options now give you an “adjusted” plot of some kind. The “AFTER” in this graph shows the “adjustments” to Orland.

Here is what the GISS data selector looks like now, yellow highlight mine, click to enlarge:

Above clip from: http://data.giss.nasa.gov/gistemp/station_data/

Here is the “raw” GISS data plot of Orland I saved back in 2007:

Click for full sized

And here is another blink comparator of Orland raw -vs- homogenized data posted by surfacestations.org volunteer Mike McMillan on 12/29/2008:

click for full size

And here is the “raw” GISS data for Orland today. Please note that the vertical scale is now different because the pre-1900 data has been removed; the GISS plotting software autoscales to the most appropriate range:

click for source image from NASA GISS

Source:

http://data.giss.nasa.gov/cgi-bin/gistemp/gistemp_station.py?id=425725910040&data_set=0&num_neighbors=1

And it is not just Orland; I’m seeing this issue at other stations too.

For example, Fairmont, CA, another well-sited, well-isolated station with a long record:

Here is Fairmont “raw” from 11/17/2007:

click for full size

And here is Fairmont from GISS today:

click for source image from NASA GISS

Source:

http://data.giss.nasa.gov/cgi-bin/gistemp/gistemp_station.py?id=425723830010&data_set=0&num_neighbors=1

This raises a number of questions. For example: Why is data truncated pre-1900? Why did the slope change? The change appears to have been fairly recent, within the last month. I tried to pinpoint it using the “wayback machine”, but apparently because this page:

http://data.giss.nasa.gov/gistemp/station_data/

is forms based, the change in this phrase:

Note to prior users: We no longer include data adjusted by GHCN and have renamed the middle option (old name: prior to homogeneity adjustment).

appears to span the entire “wayback machine” archive, even prior to 2007. If anyone has a screen cap of this page prior to the change or can help pinpoint the date of the change, please let me know.

It is important to note that the issue may not be with GISS, but upstream in the GHCN data managed by NCDC/NOAA. Further investigation is needed to find out where the main change has occurred. It appears this is a system-wide change.

The timing could not be worse for public confidence in climate data.

I’ll have more on this as we learn more about this data change.

UPDATE1 from comments:

GISS also just started using USHCN_V2 last month. See under “What’s New”:

http://data.giss.nasa.gov/gistemp/graphs/

“Nov. 14, 2009: USHCN_V2 is now used rather than the older version 1. The only visible effect is a slight increase of the US trend after year 2000 due to the fact that NOAA extended the TOBS and other adjustment to those years.

Sep. 11, 2009: NOAA NCDC provided an updated file on Sept. 9 of the GHCN data used in our analysis. The new file has increased data quality checks in the tropics. Beginning Sept. 11 the GISS analysis uses the new NOAA data set. ”

supercritical
December 12, 2009 2:58 pm

Richard S Courtney (02:44:42) :
I hope you will email your post to the recently-appointed CRU Investigator, Sir Muir Russell
… and also to Professor Philip Stott, the well-connected sceptic (who also has an excellent blog):
http://web.mac.com/sinfonia1/Clamour_Of_The_Times/Clamour_Of_The_Times/Clamour_Of_The_Times.html

December 12, 2009 7:49 pm

Continuing with NASA’s tutorial on how to bake/[RE]construct hockey sticks, using nothing more than raw data, which is done in three phases.
Ingredients:
Raw GHCN Station Plot (MUST BE COOKED – DO NOT EVER SERVE RAW):
Directions:
1) VALUE ADDING
Take raw GHCN data and mix it well with USHCN corrections. This is now “value added” data, so post it for public viewing as:
“raw GHCN data+USHCN corrections”.
2) QUALITY CONTROL
Discard dangerous original raw GHCN data, making it “quality controlled”, by removing it from public view.
3) HOMOGENIZING
The value added (USHCN corrected) data will not be considered complete or useful until after it is homogenized. Do this by folding in homogeneity adjustments as appropriate or necessary. Post this for public viewing as:
“after homogeneity adjustment”
For an example of the above, using Santa Rosa (38.5 N 122.7 W) click on the following link:
http://examples.com/giss/santarosa_phased.gif (140k .gif file)
PHASED steps (.gif still files in the animation)
http://examples.com/giss/santarosa_raw.gif (54k .gif)
http://examples.com/giss/santarosa_ushcn.gif (58k .gif)
http://examples.com/giss/santarosa_homogenized.gif (53k .gif)
Gallery files of Santa Rosa station survey:
http://gallery.surfacestations.org/main.php?g2_itemId=692&g2_page=2
Original charts (all auto-adjusted to different temperature scales):
RAW: http://examples.com/giss/santarosagiss_raw.gif
(from gallery.surfacestation.org)
USHCN: http://examples.com/giss/santarosagiss_ushcn.gif
AFTER HOMOGENEITY: http://examples.com/giss/santarosagiss_homogeneity.gif
With Santa Rosa, they’ve taken what was essentially a raw set of real temperature data that showed long term steady upward trend since 1900, and turned it into something that is essentially trendless, until a final sharp upper tick at the end. AKA – one more variation of a hockey stick.
Nothing short of amazing.

Richard S Courtney
December 13, 2009 1:14 am

supercritical:
You suggest to me ( 14:58:36):
“Richard S Courtney (02:44:42) :
I hope you will email your post to the recently-appointed CRU Investigator, Sir Muir Russell”
I like that idea. Can you – or anybody else – suggest how I can do that, please?
Richard

E.M.Smith
Editor
December 13, 2009 6:48 am

JerryB (08:31:52) : Perhaps the title of this post should be changed, since GISS has been using NCDC adjusted data, not raw data, for USHCN stations for at least 8 years.
GISS does not use “NCDC adjusted”; it uses NCDC-produced GHCN “unadjusted” AND USHCN (until about a month ago, when it changed to USHCN.v2), where “unadjusted” is in fact adjusted in some ways but is labeled “Unadjusted” on their web sites… The bulk of the planet data is the GHCN “unadjusted” dataset, while USHCN covers only the USA.
In STEP0, GIStemp glues together the GHCN data and the USHCN data into a bastard mix of the two (details only to folks with strong stomachs…). Do you call that raw? Adjusted? Cooked? Half cooked? Half baked? Unadjusted? Maladjusted?
Sometimes it passes GHCN unmodified through, sometimes it passes USHCN straight through, and sometimes it “sort of averages” the two to get a smooth blend of two different offset curves. It all depends on what chunks of which it has…

Jerry
REPLY: GISS in their previous presentation advertised it as “raw” so that is where the reference comes from. -A

That is correct. From the GISS web site point of view the “GHCN unadjusted” data set was called “raw”. That’s what they called the “after STEP0” graphs on their web site.
Now it has “USHCN corrections”, but before it said something more like “Raw GHCN + USHCN combined”. The “corrections” word is a new twist…
(a half hour passes doing “QA” and checking things before hitting “submit comment”…)
Dang it all. They moved the cheese again. From:
http://data.giss.nasa.gov/gistemp/sources/gistemp.html
we have:

For US: USHCN – ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/v2/monthly
9641C_200907_F52.avg.gz
ushcn-v2-stations.txt

Notice that file name: 9641…..F52.avg.gz and that is the one I’ve been using in my USHCN.v2 test runs.
They ARE using the “cooked” “adjusted” USHCN. From the README on that server:

– “9641C_YYYYMM_F52.max.gz” if you want fully-adjusted monthly mean maximum
temperatures (with estimates for missing values);

OK, it’s using “fully-adjusted” (whatever that means, but the one with the most cooking in it…) So we do have the 1/2 degree change in 1934.
Before it said:

For US: USHCN – ftp://ftp.ncdc.noaa.gov/pub/data/ushcn
hcn_doe_mean_data.Z
station_inventory

And at that ftp site, the README says:

hcn_calc_mean_data.Z Time of Observation and Filnet Adjusted Mean Monthly
Temperature (Calculated from hcn_doe_max_data.Z and hcn_doe_min_data.Z)

So I’m left to wonder if “fully adjusted” means the same as “TOBS and Filnet”? What about those other types of adjustments? SHAP? was it?
Also, FWIW, in the other description file, status.txt, we have:

07 August 2009
Raw (unadjusted) data series and series adjusted only for the Time of
Observation bias (TOB) have been added. See the readme.txt file for
file naming conventions and data formats.

So the “raw” USHCN.v2 file is fairly new. Though this still leaves open the question of why “raw” and “fully adjusted” are both different from USHCN The Original and from GHCN.
This is just maddening.
Yes, I’m processing the “right” copies through my copy of the GIStemp code, but figuring out what the various data sets and various data set manipulations mean is the nutty bit.
So now we’re taking the GHCN “unadjusted” and the USHCN v2 “fully adjusted” and blending them, then in STEP2 applying UHI adjustments all over again?
Does anyone have any guidance as to IF NCDC definition of “fully adjusted” includes a UHI adjustment?
I’m beginning to think that it is impossible to get anything resembling “raw” out of NOAA / NCDC no matter what it is called (and no matter how often they, or GISS, call it “raw”).
I’m going to take a break before I fulminate…
Someone needs to take the samples of the “New USHCN V2 Raw” posted above and check them against the online pdfs of the paper forms and see if the USHCN.v2 “raw” is remotely like what was put on paper. (Yes, I can always hope…)
Oh, and the ftp site ftp://ftp.ncdc.noaa.gov/pub/data/ushcn also has a folder labeled “dailies’ with a 119 mb file. Perhaps it is the real raw daily data…

DJ Meredith
December 13, 2009 7:13 am

The “Team” references a paper that shows the need for correcting satellite data to more closely match ground data in the Seth Borenstein emails of 7/09. One paper is:
The Effect of Diurnal Correction on Satellite-Derived Lower Tropospheric Temperature
Carl A. Mears 1 and Frank J. Wentz 1
1 Remote Sensing Systems, Santa Rosa, CA 94501, USA.
“Satellite-based measurements of decadal-scale temperature change in the lower troposphere have indicated cooling relative to the surface in the tropics. Such measurements need a diurnal correction to prevent drifts in the satellites’ measurement time from causing spurious trends. We have derived a diurnal correction that, in the tropics, is of the opposite sign from that previously applied. When we use this correction in the calculation of lower tropospheric temperature from satellite microwave measurements, we find tropical warming consistent with that found in surface temperature and in our satellite-derived version of middle/upper tropospheric temperature.”
Why weren’t the ground data corrected to match the satellite data? If the ground data is corrected, then the satellite data is corrected to which ground data…corrected or uncorrected?

December 13, 2009 9:42 am

Most of the blink comparators I’ve seen have what look to be minor adjustments, ones that just happen to make the plots trend lightly to a more “hockey-stick” shape. But adjustments upwards to 3ºC that span decades?
http://examples.com/giss/santarosa_phased.gif
Those are ENORMOUS adjustments/corrections.
The “trend” in the corrections on most of what I’m looking at now: The farther you go into the past, the greater the adjustment or correction applied, but always essentially toward the same end. Flatten the overall trend of the “shaft” part (pre-1980-90), and make that “shaft” part lower than the recent past (1980-present).
Is it possible that something along the lines of a Mann/Briffa reconstruction has become so accepted by the collective mind that it’s actually being used to “calibrate”, or otherwise “quality control”, real temperature data from the past? It would be simple enough to do with the entire dataset – just feed in an algorithm that checks all the raw data against some governing assumption, call it error on the data’s part, discard anything that strays too far from some predetermined envelope, and adjust and correct as necessary. Automatically, no manual intervention needed. Could that be part of the “quality control” to which the raw data sets have been subjected – and without explanation, no less?
What else could possibly justify wholesale swings in past data like the ones seen in Santa Rosa – where the “shaft” is literally SLAMMED upward, smoothed and flattened to the ceiling, then “homogenized” back downward, but only those parts that are at least a few decades old?
Also, do we really need to attack the entire data set (the way they have)? A small, but extremely detailed sampling of a few of the most egregiously adjusted/corrected stations, thoroughly investigated (beginning with the original paperwork, much of which is still available in .pdf form from the servers), should be sufficient, once debunked, to call the value of the entire data set into question.

yonason
December 13, 2009 12:36 pm

CENTRAL ENGLAND – 1700’s vs 1900’s
http://c3headlines.typepad.com/.a/6a010536b58035970c0128762dba59970c-pi
“I tried to pinpoint it using the ‘wayback machine’…”
Good luck with that. They lose information faster than CRU. They are useless when it comes to important stuff. And if anyone wants them to remove material, all they need do is request it, and poof, it’s gone. I know, because I tried to find stuff that had been there, and the next year it was not, and that was about 7 years ago, so if anything they are worse now. Don’t get me wrong, I’m not saying it’s not there, only that if it is, you had better not dally in looking for it.

John N
December 13, 2009 12:43 pm

Re: EM Smith 06:48:52
“Oh, and the ftp site ftp://ftp.ncdc.noaa.gov/pub/data/ushcn also has a folder labeled “dailies’ with a 119 mb file. Perhaps it is the real raw daily data…”
These are text files with daily data. How to determine whether “raw”???
There are two versions, 1998 and 2001. I haven’t made any comparisons.
State01
11084 PRCPH I192801 …
11084 SNOWT I192801
11084 SNWD I192801
11084 TMAX F192801
11084 TMIN F192801

yonason
December 13, 2009 12:51 pm
JohnV
December 13, 2009 3:32 pm

I was finally able to finish comparing raw GHCN from Sept 2007 to GHCN from Dec 2009. Here are the results:
Out of 2.9 million station monthly temps prior to Sept 2007:
2339 (0.1%) temps were added (previously blank, now have values)
833 (0.03%) temps were removed (previously had values, now blank)
2 were changed
So whatever changes exist in GISS data are probably due to the change in GISS algorithm. The raw data from GHCN still looks to be essentially the same.
REPLY: Thanks good work, care to share? – A
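For anyone wanting to repeat this kind of tally, here is a minimal sketch in Python. It is not JohnV’s code, which was not posted; the function name compare_snapshots, the (station, year, month) key and the use of None for a blank value are all assumptions made for illustration. It also assumes each snapshot has already been parsed into a dictionary and restricted to the months of interest (for example, everything prior to Sept 2007).

```python
from typing import Dict, Optional, Tuple

# Hypothetical layout: one entry per station-month, None meaning "blank".
Key = Tuple[str, int, int]             # (station id, year, month)
Snapshot = Dict[Key, Optional[float]]  # monthly mean temperature or None


def compare_snapshots(old: Snapshot, new: Snapshot) -> Dict[str, int]:
    """Count station-months that were added, removed, changed or unchanged."""
    counts = {"added": 0, "removed": 0, "changed": 0, "unchanged": 0}
    # Walk the union of keys so a month present in only one file is still seen.
    for key in old.keys() | new.keys():
        before = old.get(key)
        after = new.get(key)
        if before is None and after is None:
            continue  # blank in both snapshots, nothing to count
        if before is None:
            counts["added"] += 1      # previously blank, now has a value
        elif after is None:
            counts["removed"] += 1    # previously had a value, now blank
        elif before != after:
            counts["changed"] += 1    # value present in both but different
        else:
            counts["unchanged"] += 1
    return counts
```

Dividing each count by the total number of station-months gives percentages of the kind quoted above.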

December 13, 2009 4:07 pm

“yonason (12:51:39) :
WHOA! This guy seems to be onto something.
http://themigrantmind.blogspot.com/2009/12/hundred-years-of-october-cooling.html
Wow. Gives new meaning to the phrase “October surprise”. I wonder what October’s counter-month, April, looks like. Usually limited heating and air conditioning in those months as well.
And the fact that he was able to glean that from existing GHCN data tells me that…
…NASA has some more adjusting and homogenizing to do. Obviously, if it’s not agreeing with the GCMs, and only agrees with the tree rings (which we already know are completely valid and reliable save for the past 60 years), then there’s probably something wrong with the data, and a compelling reason to confine the selection to other months, even if it means defining what the other months are!
Ah well, a climate cowboy’s work is never done.

yonason
December 13, 2009 4:45 pm
yonason
December 13, 2009 4:46 pm

I meant to address Steven Douglas (16:07:29) : in my last.

yonason
December 13, 2009 5:21 pm

ONE BUSY COWBOY
“The actual yearly output numbers [for the “millennium simulation -AR4”] are in the email. So, I took the column identified as global average and plotted it. What a surprise. The hypocritical hot air coming out of the climatologists that all their models show unprecedented warming is simply not true.”

D L Kuzara
December 13, 2009 7:50 pm

Reto Ruedy replied on 12/12/2009 2:41 PM
Dennis Kuzara,
As noted on our update page
http://data.giss.nasa.gov/gistemp/updates/
we switched on November 13 from USHCN-version 1 data to USHCN-version 2 data. My guess is that you must have compared the data shown on Dec 5 to data shown before Nov 13, the most recent update. The next update will be early next week.
As described in the section “Current Analysis” on
http://data.giss.nasa.gov/gistemp/
we download from the web data sets prepared by NOAA and SCAR that are available to anybody and start from there. Any changes you notice in the station data if you select “raw GHCN + USHCN corrections” occurred before we get the data and you’d have to contact NOAA for further information.
As you know, you can download all station data as they were before and after our homogeneity adjustment.
Over the US we are using satellite night light data to determine whether to adjust a record or not rather than population data, and Orland’s data were bright enough to trigger our adjustment.
NOAA provides both adjusted and non-adjusted data on
ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/v2/monthly
9641C_200907_raw.avg.gz is the file you might want (we use F52). Orland’s data are the lines starting with “046506”. Notice that this file is compressed (use gunzip to uncompress) and the data are in units of one tenth deg_Fahrenheit.
If you have trouble extracting those data, I can send them to you.
Reto Ruedy
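
The extraction described in the email can be sketched roughly as follows, assuming Python and a local copy of the file. Only two details come from the email itself: Orland’s lines start with “046506”, and the values are in tenths of a degree Fahrenheit. The function names are made up for illustration, and the column layout within each line is deliberately not parsed here; NOAA’s readme.txt is the place to check that.

```python
import gzip
from typing import Iterator


def station_lines(path: str, station_id: str) -> Iterator[str]:
    """Yield the lines of a gzip-compressed text file that start with station_id."""
    # gzip.open in text mode decompresses on the fly, so no separate gunzip step.
    with gzip.open(path, "rt", encoding="ascii", errors="replace") as handle:
        for line in handle:
            if line.startswith(station_id):
                yield line.rstrip("\n")


def tenths_f_to_c(value: int) -> float:
    """Convert a value in tenths of a degree Fahrenheit to degrees Celsius."""
    return (value / 10.0 - 32.0) * 5.0 / 9.0
```

With the file downloaded, something like list(station_lines("9641C_200907_raw.avg.gz", "046506")) would return Orland’s records as raw text lines, ready to be split according to the documented format.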

December 13, 2009 9:44 pm

“Over the US we are using satellite night light data to determine whether to adjust a record or not rather than population data, and Orland’s data were bright enough to trigger our adjustment.”
I could almost see this as a generalized explanation, but Orland was specifically discussed. Orland, with a 2000 census population of 6,281, “bright enough” to trigger an adjustment (primarily) to its pre-industrial era temperatures, including a complete discard of all pre-1900 data?
Actual adjustments to Orland’s *recent* temperature data (which is, ostensibly, what comparisons should be looking for to trigger adjustments) were negligible. The bulk of the adjustments, and they weren’t minor, were all pre-1942. How could satellite data tell us ANYTHING about data from 1900-1942, which is where all the largest adjustments were made, let alone data recorded from 1880-1900, now discarded?
Furthermore, and just as importantly, all the other data on the site appear to be well archived and still available. Why is there no ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/v1 folder?

Glenn
December 13, 2009 10:31 pm

E.M.Smith (06:48:52):
“Oh, and the ftp site ftp://ftp.ncdc.noaa.gov/pub/data/ushcn also has a folder labeled “dailies’ with a 119 mb file. Perhaps it is the real raw daily data…”
After seeing this “hcnmweb” folder placed in this index I wouldn’t be surprised that “dailies” was a to-do list from the little lady.

JJ
December 14, 2009 7:31 pm

“Nov. 14, 2009: USHCN_V2 is now used rather than the older version 1. The only visible effect is …”
It’s the invisible ones I’m worried about!

Editor
December 14, 2009 11:27 pm

Dave F (22:35:13) :
I meant they have the same “raw” data. But I am not even sure that is the case. Does the data have its own versions too? That seems very odd.

Very odd indeed, but “the data” comes in several flavors and versions… And that, IMHO, is the problem.

Gary Plyler
December 15, 2009 10:39 am

And Winston looked at the sheet handed him:
“Adjustments prior to 1972 shall be -0.2 degrees and after 1998 shall be +0.3 degrees.”
Winston wondered at the adjustment to the data. At this point, no one even knows if the data, prior to his adjustments, was raw data or already adjusted one or more times previously.
It didn’t matter. All Winston was sure of is that one of the lead climatologists needed more slope to match his computer model outputs. He punched out the new Fortran cards and then dropped the old cards into the Memory Hole where they were burned.
“There!” Winston exclaimed to himself. “Now the temperature data record is correct again. All is double-plus good.”

December 17, 2009 11:16 pm

Searching Google Books for “Climatological data: national summary” might yield interesting results.
