Something hinky this way comes: NCDC data starts diverging from GISS

I got an email today from Barry Hearn asking if I knew what was going on with the NCDC data set. It seems that it has started to diverge from GISS, and is now significantly warmer for April 2009.

What is interesting is that while NCDC went up in April, UAH and GISS both went down. RSS went up slightly, but is still much lower in magnitude, about 1/3 that of NCDC. HadCRUT is not out yet.

Here is a look at the most recent NCDC data plotted against GISS data:

[Figure: NCDC vs. GISS anomaly plot; click for larger image]

Here is a list of April Global Temperature Anomalies for all four major datasets:

NCDC  0.605 °C
GISS  0.440 °C
RSS   0.202 °C
UAH   0.091 °C

It is quite a spread, a whole 0.514°C difference between the highest (NCDC) and the lowest (UAH), and a 0.165°C difference now between GISS and NCDC. We don’t know where HadCRUT stands yet, but it typically comes in somewhere between GISS and RSS values.

Source data sets here:

NCDC

ftp://ftp.ncdc.noaa.gov/pub/data/anomalies/monthly.land_and_ocean.90S.90N.df_1901-2000mean.dat

Previous NCDC version to 2007 here: ftp://ftp.ncdc.noaa.gov/pub/data/anomalies/monthly.land_and_ocean.90S.90N.df_1961-1990mean.dat

GISS

http://data.giss.nasa.gov/gistemp/tabledata/GLB.Ts+dSST.txt

RSS

ftp://ftp.ssmi.com/msu/monthly_time_series/rss_monthly_msu_amsu_channel_tlt_anomalies_land_and_ocean_v03_2.txt

UAH

http://vortex.nsstc.uah.edu/public/msu/t2lt/tltglhmam_5.2

While it is well known that GISS has been using an outdated base period (1951-1980) for calculating the anomaly, Barry points out that the two datasets have been tracking together fairly well, which is not unexpected, since GISS uses data from NCDC’s USHCN and COOP weather station networks, along with GHCN data.

[Figure: click for larger image]

NCDC made the decision last year to update to a century-long base period; this is what Barry Hearn’s junkscience.com page said about it at the time:

IMPORTANT NOTE May 16, 2008: It has been brought to our attention that NCDC have switched mean base periods from 1961-90 to 1901-2000. This has no effect on absolute temperature time series with the exception of land based temperatures. The new mean temperature base is unchanged other than land based mean temperatures for December, January and February (the northern hemisphere winter), with each of these months having their historical mean raise 0.1 K.

At this time raising northern winter land based temperatures has not altered published combined annual means but we anticipate this will change and the world will get warmer again (at least on paper, which appears to be about the only place that is true).

So even with this switch a year ago, the data still tracked until recently. Yet all of a sudden, in the past couple of months, NCDC and GISS have started to diverge, and now NCDC is the “warm outlier”.

Maybe Barry’s concern in the second paragraph is coming true.

So what could explain this? At the moment I don’t know. I had initially thought the switch to USHCN2 might have something to do with it, but that now seems unlikely, since the entire data set would have been adjusted, not just the last couple of months.

The other possibility is a conversion error or failure somewhere. Being a US government entity, NCDC works in Fahrenheit on input data, while the other data sets work in Centigrade. Converting NCDC’s April value of 0.605 (assuming it might be in °F) to Centigrade results in 0.336 °C, which seems more reasonable.
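If that were indeed what happened, the arithmetic is simple: an anomaly is a temperature difference, so only the 5/9 scale factor applies and the 32° offset cancels out. Here is a minimal sketch of the hypothesis (the function name is mine, and this is purely illustrative, not NCDC's actual processing):

#include <stdio.h>

/* Convert a temperature DIFFERENCE (anomaly) from Fahrenheit to Celsius.
   Differences scale by 5/9 only; the 32 degree offset cancels out. */
static double anomaly_f_to_c(double anomaly_f)
{
    return anomaly_f * 5.0 / 9.0;
}

int main(void)
{
    double ncdc_april = 0.605;   /* the published April 2009 value */
    printf("If 0.605 were deg F: %.3f deg C\n", anomaly_f_to_c(ncdc_april));
    /* prints 0.336, which would be in line with the other datasets */
    return 0;
}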

Unfortunately, NCDC makes no notes whatsoever on the data they provide on the FTP site, nor even a readme file that says anything relevant about it, so it is hard to know what units we are dealing with. They have plenty of different datasets here: ftp://ftp.ncdc.noaa.gov/pub/data/anomalies/

But the readme file is rather weak.

What is clear though is that there has been a divergence in the last couple of months, and NCDC’s data went up when other datasets went down.

So, I’m posting this to give our readers a chance to analyze and help solve this puzzle. In the meantime I have an email into NCDC to inquire.

What REALLY needs to happen is for our global temperature data providers to get on the same base period, so that the data sets presented to the public don’t have such significant differences in anomaly.

Standardized reporting of global temperature anomaly data sets would be good for climate science, IMHO.

129 Comments

May 18, 2009 4:34 pm

David (15:29:56) :
It seems to be that using the term “anomaly” to describe the difference between a temperature reading and some reference point is incorrect. The term infers that change is anomalous, whereas we know that the climate is always changing. How about using “deviation” instead?
In my articles, I always try to use change, fluctuation or oscillation instead of “anomaly” or “anomalous”. “Anomaly” and “anomalous” imply standard, smooth and fixed conditions, which is not true in nature. It seems that climatology and ecology are two disciplines whose language has lately been manipulated with a fixed goal external to authentic science.

noaaprogrammer
May 18, 2009 4:38 pm

JP wrote: “One wonders how many adjustments these organizations calculate, and how many are thrown out. It is just a little too coincidental that whether they are adjusting for time of observation, or using a different time interval for the mean, we always get a little bit warmer. Science is rarely, if ever, that simple.”
au contraire – consider how simple this is:
#include <manipulate I/O>
void AGW( float dataset[], float runningaverage )
{ // This AlGorythm virtually proves that man can
  // indeed warm the globe by cooking datasets:
  int i = 0;
  int inflate = 1;
  float CO2fudgefactor = 0.1f;
  while (!endofglobe())
  { /* the operators in the next two lines are a best guess; the originals were
       eaten by the blog's comment filter (see the follow-up comment below) */
    if (dataset[i] < runningaverage)
        dataset[i] += CO2fudgefactor * inflate++;
    if (++i >= sizeof(dataset))
    { i = 0;
      inflate = 1;
    }
  }
}
Seriously though, there should be Benford-type tests for the 1st, 2nd, 3rd,… leading digits in meteorological data sets that can detect cooked data sets, just as there are in the auditing of financial records. (See http://www.mathpages.com/HOME/kmath302/kmath302.htm) However, because of physically restricted numeric ranges on any one meteorological statistic, the distributions of these digits would first have to be established so that a normalization can be carried out.
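To illustrate the idea (this is not any actual NCDC or auditing procedure, and the sample values are invented), a minimal first-digit check against Benford's expected frequencies might look like this; as noted above, a real meteorological series would need a normalization step before the comparison is meaningful:

#include <stdio.h>
#include <math.h>

/* First significant digit of a value, ignoring sign and magnitude. */
static int first_digit(double x)
{
    x = fabs(x);
    if (x == 0.0) return 0;
    while (x >= 10.0) x /= 10.0;
    while (x < 1.0)  x *= 10.0;
    return (int)x;
}

int main(void)
{
    /* made-up values standing in for a (normalized) meteorological series */
    double sample[] = { 1.2, 2.7, 1.9, 3.4, 1.1, 8.6, 1.5, 2.2, 4.9, 1.3 };
    int n = sizeof(sample) / sizeof(sample[0]);
    int counts[10] = { 0 };

    for (int i = 0; i < n; i++)
        counts[first_digit(sample[i])]++;

    printf("digit  observed  Benford expected\n");
    for (int d = 1; d <= 9; d++) {
        double observed = (double)counts[d] / n;
        double expected = log10(1.0 + 1.0 / d);   /* Benford's law */
        printf("  %d     %6.3f     %6.3f\n", d, observed, expected);
    }
    return 0;
}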

noaaprogrammer
May 18, 2009 4:40 pm

Note: The special symbols in the core of the while-loop that showed how to increase the data set values did not come through on my previous post.

May 18, 2009 4:45 pm

Anthony Watts
Re: base periods and anomalies
This issue seems to crop up every month. I’m wondering if it’s worth WUWT publishing its own adjusted anomalies for each of the 4 main metrics. They could all use the same common base period as that used by the satellites, i.e. 1979-1998. It’s a relatively trivial operation for anyone who’s already got the data in a spreadsheet. In fact, GISS has a tool which does it for you.
On the other hand, perhaps we’d better leave it, as I’ve just noticed I disagree with Zeke’s GISS adjusted anomaly for April. See
Zeke Hausfather (11:50:56) :
…GISS 0.2519 °C

Pamela Gray
May 18, 2009 5:18 pm

The NWS (related to NOAA) has CONSISTENTLY predicted warmer temperatures than what actually occurred in rural NE Oregon. Their dynamical prediction model, methinks, has a Kool-Aid hue. You don’t suppose they have decided to adjust temperature sensor data with some kind of “global warming” signature that they felt was somehow missed by rural stations? Could it be some kind of perverse UHI effect applied to low temperature outliers?

henry
May 18, 2009 5:56 pm

RW (09:21:38) :
“This is simply not meaningful. The choice of reference period is completely irrelevant to any analysis. You can trivially renormalise to whatever period you choose. If their reference period was 1880-1910, or if it was 1979-2009, it would make absolutely no difference to anything.”
I’ve been asking about the differences in the reporting periods for a while now, and I always get the same reply: it’s the TREND that’s important, not the zero.
Now this may be true, but it’s still nice to know if a .6 degree upward trend is rising from zero, or from .7 below zero.
People see the “zero” point as the “normal” point. So the choice of zero is important to see if we’re rising from normal, or returning to normal.
This echoes previous replies, in that the choice of zero is used to create as large a positive (above zero) change as possible. The more above zero we get, the scarier the news sounds.

May 18, 2009 6:10 pm

John W. (06:48:04) :
“I’d be interested in seeing the actual temperatures”

You will probably never see them; chances are there is no such raw data, because if they are changing it all the time, by now they surely aren’t able to find out which was the original data.

May 18, 2009 6:46 pm

By changing the start year for worldwide readings from 1960 to 1900, NCDC can’t have helped getting into trouble. More than 1/3 of the stations, and about 95% of those in the Arctic and Antarctic, had no humans living there or keeping any kind of station in the same place for more than two months before 1959; the inner parts of the huge areas toward the North Pole and the South Pole had never been seen by a human at all in 1900.
In many places in South America and Australia where there are stations today, no weather stations were established before the 1920s, some not until well after the 1950s.
This means that data before 1950 are in most cases completely estimated, extrapolated, etc., and have no connection to real values at all.
But there is more. Take Sweden, for example. In 1995 scholars were offered correct readings for water areas, such as lakes and rivers. They refused them, since it was easier to do computer-based calculations for the years before 1989. How do I know? My father, then retired and gone since last year, was one of those who offered detailed studies and values from 1957 on.

steven mosher
May 18, 2009 7:36 pm

If you want to know the “real” temp as opposed to the anomaly, just add
about 14C to the anomaly…. So if the anomaly is .5
you get 14+.5 = 14.5
so.
Raw temp =14.5
Anomaly =.5
There, feel better?
Nothing whatsoever turns upon the anomaly period. Nothing. It just doesn’t matter. It’s a waste of time to even raise it as an issue, a distraction from the real problems in temperature series.

REPLY:
mosh did you get the email I sent earlier? – Anthony

Richard Wright
May 18, 2009 7:47 pm

It seems to be that using the term “anomaly” to describe the difference between a temperature reading and some reference point is incorrect. The term infers that change is anomalous, whereas we know that the climate is always changing. How about using “deviation” instead?

Yes, that’s exactly my point. No one, not even Hansen, “expects” (from the definition of anomaly) future temperatures to be exactly flat and coincident with the baseline. Therefore, the use of the term “anomaly” is inappropriate and misleading. How, for example, does one describe an outlying data point when all data points are referred to as anomalies? Is the outlier an unexpected anomaly (unexpectedly unexpected)? Just because climate researchers choose to misuse a word doesn’t mean it is a good choice.
My point is simply that the term is misleading and I think the concept is as well, because it implies, through the use of a baseline, that global temperatures should be flat, which is baseless. A plot of actual temperatures would show the same variations as these plots of anomalies but without the bias of the baseline assumption. That is the bias I’m talking about and I think it is very important to understand.
One could show actual temperatures and adjust the scale so that the same plus or minus 1°C spread is shown. But consider that without this artificial, horizontal baseline, the discussions of the data would, I think, be much different. Because then one has to ask the question, what should the temperatures be? Should they be flat? Should they be rising after an ice age? Putting in a baseline is an assumption at best or a conclusion at worst. It is not necessary in order to analyze the data and in fact confuses any analysis.

Richard Wright
May 18, 2009 7:52 pm

Here’s a very simple example to show the folly of the baseline. Global temperature varies throughout any year from Winter to Summer and back to Winter. Yet average monthly temperatures are compared to a flat baseline. What could possibly be the point doing that?

Just The Facts
May 18, 2009 10:50 pm

John Peter (06:16:53) :
David L. Hagen (09:08:20) :
In terms of being able to search WUWT for old comments, for now the best tool is Google’s new Blog Search:
http://blogsearch.google.com/?hl=en&ned=us&tab=nb
For example, if you wanted to see every comment by me you could use the advanced search feature like this:
http://blogsearch.google.com/blogsearch/advanced_blog_search?as_q=&num=10&hl=en&client=news&um=1&ie=UTF-8&ctz=240&c2coff=1&as_oq=&as_eq=&as_drrb=q&as_qdr=a&as_mind=1&as_minm=1&as_miny=2000&as_maxd=18&as_maxm=5&as_maxy=2009&lr=&safe=active&q=%22just+the+facts%22+blogurl:wattsupwiththat.com
And then once you find and select the thread you are looking for hit Ctrl F to do a “find” on an Explorer or Firefox window to find the full comment within the thread.

Chriscafe
May 18, 2009 11:31 pm

Surely the major problem is that NOAA and GISS manipulate the raw data using unpublished algorithms. Occasionally these algorithms change and we are treated to a recasting of the temperature time series, usually with older temperatures lowered to increase the upward trend.
Although they say they eliminate the effects of UHI, only spaghetti code has been published by GISS to provide a clue as to how. The Climate Audit analysis of this code, mentioned in this thread by Stephen McIntyre, shows that the claimed adjustment is not achieved.
Another major problem is that the unmanipulated raw data do not appear to be available.

David
May 19, 2009 12:00 am

“Here’s a very simple example to show the folly of the baseline. Global temperature varies throughout any year from Winter to Summer and back to Winter. Yet average monthly temperatures are compared to a flat baseline. What could possibly be the point doing that?”
If all temperature readings were taken in one hemisphere then that point would be valid – but they are not. Whether the readings in each hemisphere are equally weighted is another matter.

Evan Jones
Editor
May 19, 2009 12:07 am

But what is the difference between GISS, HadCRUT and NCDC? Do they use data from different monitoring stations? Do they have different methods of making up their data… oops, I mean different methods for analysing their data? What’s the difference?
I would have thought there is only the need for one group to monitor the world’s surface stations?? I’m sure there’s a good reason?

Well, believe it or not, it’s like this: NCDC takes its data and adjusts it (much controversy there). GISS takes the fully adjusted NCDC data and “unadjusts it” via some strange algorithm and then applies its own adjustments to the mangled results. Why they do not simply start off with NCDC raw data is a mystery for the ages.
HadCRUT, as I understand it, starts off with NCDC raw data, but does not reach all over the north pole by extrapolating from the “Siberian Thought Criminal” stations, so it generally clocks in a bit cooler than NCDC or GISS.
Satellites measure lower troposphere using microwave reflection proxies, and are not surface temps. They should (acc. to AGW theory) be warming faster than surface, yet they don’t. The sats are in pole-to-pole orbit and their cameras stick out the sides, so they can’t look directly down at the poles (I also think there is a problem measuring reflections off the ice). So the N/S polar temps are a lot less certain than they otherwise might be.

May 19, 2009 3:48 am

Well, fictive data, corrected for this and that instead of coming from real temperature readings, have been used more than once by the so-called scholars.
In my “Fiction or facts climate threats readings” post I present some examples close to home, in Sweden, that say more than the so-called scholars probably understood while writing their papers.

RW
May 19, 2009 4:35 am

henry:
“People see the “zero” point as the “normal” point. So the choice of zero is important to see if we’re rising from normal, or returning to normal.”
There is no meaningful definition of ‘normal’ in this context and hence no non-arbitrary choice of reference period.
“This echos previous replies, in that the choice of zero is used to create as large a positive (above zero) change as possible. The more above zero we get, the scarier the news sounds.”
If a difference of 0.1°C makes you scared, you’re a bit more timid than most people! I find the suggestion that the baseline was chosen in the way you suggest to be obviously absurd. Given that the GISTEMP record was first produced in the early 1980s, using the previous three decades as the base period seems rather obvious. If the desire was to produce larger positive figures, why didn’t they use 1880-1910 as the reference period?
chriscafe:
“Surely the major problem is that NOAA and GISS manipulate the raw data using unpublished algorithms”
Have you ever looked at any of the papers mentioned here or here? What definition of ‘unpublished’ are you using?
evanmjones:
“Well, believe it or not, it’s like this: NCDC takes its data and adjusts it (much controversy there). GISS takes the fully adjusted NCDC data and “unadjusts it” via some strange algorithm and then applies its own adjustments to the mangled results.”
That’s pure fiction. Have you ever read GISS’s own description of the actual procedure?

Steve Keohane
May 19, 2009 5:25 am

Anthony, OT
In case you haven’t caught this yet: Scientific American 5/18/09
Trees boost air pollution–and cool temperatures–in southeast U.S
Why is the southeastern U.S. getting cooler while the rest of the globe is warming? Thank the trees, say some researchers.

May 19, 2009 5:59 am

David (00:00:48) :

“Here’s a very simple example to show the folly of the baseline. Global temperature varies throughout any year from Winter to Summer and back to Winter. Yet average monthly temperatures are compared to a flat baseline. What could possibly be the point doing that?”

If all temperature readings were taken in one hemisphere then that point would be valid – but they are not. Whether the readings in each hemisphere are equally weighted is another matter.
The point is not valid either way. Monthly anomalies are temperature departures relative to the mean temperature for that month during a given ‘base period’. GISS use the 1951-1980 period. So, if the 1951-1980 mean temperature for May is 14 deg and the temperature for May 2009 is 14.5 then the May anomaly will be 0.5. The base period is irrelevant. If you don’t like the GISS base period then use one of your own. Use the satellite base period (1979-1998) if you prefer. This gives a GISS April anomaly of ~0.2 deg, i.e. similar to the RSS anomaly.
If you have the GISS data, converting from one base period to another is trivial. But you don’t even need to do that, because GISS will do it for you. Click here ->
http://data.giss.nasa.gov/gistemp/maps/
Then select Hadl/Reyn_v2 in the Ocean drop-down menu. Enter your preferred base period (e.g. Begin: 1979, End: 1998), then click on Make Map. You should end up with this ->
http://data.giss.nasa.gov/cgi-bin/gistemp/do_nmap.py?year_last=2009&month_last=04&sat=4&sst=1&type=anoms&mean_gen=04&year1=2009&year2=2009&base1=1979&base2=1998&radius=1200&pol=reg
which, if it has worked, will show a global anomaly map. In the top right-hand corner you will find the monthly “anomaly” relative to the chosen base period. In this example it is 0.19 deg.
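For anyone who would rather do that re-baselining themselves instead of using the GISS map tool, here is a minimal sketch of the arithmetic (the array below is only a placeholder; a real run would fill it by parsing the GISS or NCDC text files): subtract from each month’s anomaly the mean anomaly of that same calendar month over the new base years.

#include <stdio.h>

#define FIRST_YEAR 1979
#define NYEARS     31     /* 1979-2009 inclusive */

/* Monthly anomalies relative to the ORIGINAL base period, in deg C.
   All zeros here; a real run would load them from the downloaded dataset. */
static double anomaly[NYEARS][12];

/* Re-express every anomaly relative to a new base period [base1, base2]. */
static void rebaseline(int base1, int base2)
{
    for (int m = 0; m < 12; m++) {
        double sum = 0.0;
        int count = 0;
        for (int y = base1; y <= base2; y++) {   /* mean of this month over the new base */
            sum += anomaly[y - FIRST_YEAR][m];
            count++;
        }
        double offset = sum / count;
        for (int y = 0; y < NYEARS; y++)         /* shift the whole series by that mean */
            anomaly[y][m] -= offset;
    }
}

int main(void)
{
    /* ... fill anomaly[][] from real data before this point ... */
    rebaseline(1979, 1998);                      /* the satellite-era base period */
    printf("April 2009 on the 1979-1998 base: %.3f deg C\n",
           anomaly[2009 - FIRST_YEAR][3]);       /* prints 0.000 until real data are loaded */
    return 0;
}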

Steve Keohane
May 19, 2009 7:23 am

Anthony, OT, still, from the above Scientific American Post comments section, someone posted this:
http://www.ncdc.noaa.gov/img/climate/research/2009/apr/05_04_2009_DvTempRank_pg.gif
It shows a year-long cooler-than-normal northwest, central and eastern US.

steven mosher
May 19, 2009 8:37 am

Anthony,
Naw, I didn’t read the mail. I’m ears deep in alligators. 20-hour days are killer.
Stuck in Asia with a bad connection. I’ll check later tonight, when I finish
work. I’m grumpy.

Richard Wright
May 19, 2009 10:26 am

The point is not valid either way. Monthly anomalies are temperature departures relative to the mean temperature for that month during a given ‘base period’. GISS use the 1951-1980 period. So, if the 1951-1980 mean temperature for May is 14 deg and the temperature for May 2009 is 14.5 then the May anomaly will be 0.5.

Thanks for clearing that up. Every time I have seen an explanation of the baseline, it is always something like “the average global temperature from 1951-1980”, which is very different from what you are describing. However, I still maintain that the term “anomaly” is poorly chosen for the reasons I previously mentioned. And I still would like to see error bars or something that gives an idea of the spread of data used to calculate the “baseline”. How are we to know if a 0.2° difference is significant? If the standard deviation of the data used to calculate the baseline were 0.2, for example, then it would not be significant. Without this information, we have no way of knowing whether we are looking at real differences or just noise (even if the data itself were not suspect).
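As a sketch of the kind of check being asked for (the base-period numbers below are invented purely for illustration): compute the spread of the base-period values for a given month and see whether the difference in question exceeds, say, two standard deviations of that spread.

#include <stdio.h>
#include <math.h>

int main(void)
{
    /* invented May temperatures standing in for a 1951-1980 style base period (deg C) */
    double base_may[] = { 14.1, 13.8, 14.3, 14.0, 13.9, 14.2, 13.7, 14.4, 14.0, 14.1 };
    int n = sizeof(base_may) / sizeof(base_may[0]);

    double mean = 0.0, var = 0.0;
    for (int i = 0; i < n; i++) mean += base_may[i];
    mean /= n;
    for (int i = 0; i < n; i++) var += (base_may[i] - mean) * (base_may[i] - mean);
    double sigma = sqrt(var / (n - 1));          /* sample standard deviation */

    double diff = 0.2;                           /* the difference in question */
    printf("baseline sigma = %.2f, difference = %.2f\n", sigma, diff);
    printf("the difference is %s two sigma of the baseline spread\n",
           fabs(diff) > 2.0 * sigma ? "outside" : "within");
    return 0;
}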

RW
May 19, 2009 2:18 pm

“And I still would like to see error bars or something that gives an idea of the spread of data ”
For GISTEMP, look here. Had you ever looked for this kind of information? It has been there all along and very easy to find.

Richard Wright
May 19, 2009 7:16 pm

“And I still would like to see error bars or something that gives an idea of the spread of data ”
For GISTEMP, look here. Had you ever looked for this kind of information? It has been there all along and very easy to find.

The graph at your link has 3 green error bars with no explanation of what they mean. Each data point should have error bars and we need to know what the error bars mean. And how would we know how much error is due to the baseline data and how much is due to the ongoing monthly measurements? Graphs that show averages without defined error bars do not allow an assessment of the meaning of the data.
I looked up Hansen’s Global Temperature Change paper that appears to be the source of the graph you linked to. Here is his explanation of error bars:

Estimated 2 sigma error (95% confidence) in comparing nearby years of global temperature (Fig. 1A), such as 1998 and 2005, decreases from 0.1°C at the beginning of the 20th century to 0.05°C in recent decades (4). Error sources include incomplete station coverage, quantified by sampling a model-generated data set with realistic variability at actual station locations (7), and partly subjective estimates of data quality problems (8).

I particularly like the last part about subjective estimates. Boy, that sounds precise and repeatable. So I guess his error bars are of the same quality as his data, which is not surprising. I would just like to see plus or minus 2 sigma based on the measurements that were averaged so we could see the spread of the data.

Just The Facts
May 19, 2009 9:47 pm

There’s no need to quibble about the differences between anomalies, 0.605 °C versus 0.440 °C, adjustments, base periods and all that, because “Earth’s median surface temperature could rise 9.3 degrees F (5.2 degrees C) by 2100, the scientists at the Massachusetts Institute of Technology found”…
http://www.reuters.com/article/latestCrisis/idUSN19448608
Can we get a copy of this “study” and have a thread where we give it a good peer review?
