Distribution analysis suggests GISS final temperature data is hand edited – or not

UPDATE: As I originally mentioned at the end of this post, I thought we should “give the benefit of the doubt” to GISS as there may be a perfectly rational explanation. Steve McIntyre indicates that he has done an analysis also and doubts the other analyses:

I disagree with both Luboš and David and don’t see anything remarkable in the distribution of digits.

I tend to trust Steve’s intuition and analysis skills, as his track record has been excellent. So at this point we don’t know what the root cause is, or even whether there is any human touch to the data. But as Lubos said on CA, “there’s still an unexplained effect in the game”.

I’m sure it will get much attention as the results shake out.

UPDATE2: David Stockwell writes in comments here:

Hi,

I am gratified by the interest in this very preliminary analysis. There are a few points from the comments above.

1. False positives are possible, for a number of reasons.

2. Even though data are subjected to arithmetic operations, distortions in digit frequency at an earlier stage can still be observed.

3. The web site is still in development.

4. One of the deviant periods in GISS seems to be around 1940, the same as the ‘warmest year in the century’ and the ‘SST bucket collection’ issues.

5. Even if in the worst case there was manipulation, it wouldn’t affect AGW science much. The effect would be small. It’s about something else. Take the Madoff fund. Even though investors knew the results were managed, they still invested because the payouts were real (for a while).

6. To my knowledge, no one has succeeded in exactly replicating the GISS data.

7. I picked that file as it is the most used – global land and ocean. I haven’t done an extensive search of files as I am still testing the site.

8. Lubos replicated this study more carefully, using only the monthly series, and got the same result.

9. Benford’s law (on the first digit) has a logarithmic distribution, and really only applies to data across many orders of magnitude. Measurement data that often has a constant first digit doesn’t work, although the second digit seems to. I don’t see why the last digit wouldn’t work; it should approach a uniform distribution according to Benford’s postulate.

That’s all for the moment. Thanks again.
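To make point 9 concrete, here is a minimal Python sketch (not the WikiChecks code, which I have not seen) of the first-digit probabilities that Benford’s law predicts, log10(1 + 1/d), next to the flat one-in-ten share that a final-digit test takes as its expectation. The function names are purely illustrative.

```python
import math

def benford_first_digit_probs():
    """Benford's law: P(first digit = d) = log10(1 + 1/d) for d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def uniform_last_digit_probs():
    """Assumption behind a final-digit test: digits 0..9 are equally likely."""
    return {d: 0.1 for d in range(10)}

if __name__ == "__main__":
    for d, p in benford_first_digit_probs().items():
        print(f"first digit {d}: {p:.3f}")
    for d, p in uniform_last_digit_probs().items():
        print(f"final digit {d}: {p:.3f}")
```

The first-digit shares fall steeply from 1 to 9, which is why temperature anomalies that all begin with the same digit or two are a poor fit for the first-digit form of the test.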


This morning I received an email outlining some work that David Stockwell has done in checking the GISS global Land-Ocean temperature dataset:

Detecting ‘massaging’ of data by human hands is an area of statistical analysis I have been working on for some time, and devoted one chapter of my book, Niche Modeling, to its application to environmental data sets.

The WikiChecks web site now incorporates a script for doing a Benford’s analysis of digit frequency, sometimes used in numerical analysis of tax and other financial data.

The WikiChecks Site Says:

‘Managing’ or ‘massaging’ financial or other results can be a very serious deception. It ranges from rounding numbers up or down, to total fabrication. This system will detect the non-random frequency of digits associated with human intervention in natural number frequency.

Stockwell runs a test on GISS and writes:

One of the main sources of global warming information, the GISS data set from NASA showed significant management, particularly a deficiency of zeros and ones. Interestingly the moving window mode of the algorithm identified two years, 1940 and 1968 (see here).

You can run this test yourself: visit the WikiChecks web site and paste the URL for the GISS dataset

http://data.giss.nasa.gov/gistemp/tabledata/GLB.Ts+dSST.txt

into it and press submit. Here is what you get as output from WikiChecks:

GISS

Frequency of each final digit: observed vs. expected

Digit            0      1      2      3      4      5      6      7      8      9   Totals
Observed       298    292    276    266    239    265    257    228    249    239     2609
Expected       260    260    260    260    260    260    260    260    260    260     2609
Variance      5.13   3.59   0.82   0.08   1.76   0.05   0.04   4.02   0.50   1.76    17.75
Significant      *      .                                         *

Statistic     DF   Obtained   Prob    Critical
Chi Square     9      17.75   <0.05      16.92

RESULT: Significant management detected. Significant variation in digit 0: (Pr<0.05) indicates rounding up or down. Significant variation in digit 1: (Pr<0.1) indicates management. Significant variation in digit 7: (Pr<0.05) indicates management.
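For readers who want to check the arithmetic, here is a rough Python sketch of a textbook Pearson chi-square test against a uniform expectation, using the observed counts from the table above. This is an assumption about the general technique, not the WikiChecks script itself; the site’s per-digit “Variance” figures may come from a slightly different formula, so the numbers printed here need not match its output exactly.

```python
# Observed final-digit counts for digits 0..9, copied from the table above.
OBSERVED = [298, 292, 276, 266, 239, 265, 257, 228, 249, 239]

# Standard chi-square critical value for 9 degrees of freedom at the 5% level.
CRITICAL_DF9_P05 = 16.919

def chi_square_uniform(observed):
    """Return (per-digit contributions, total) for a test against equal shares."""
    total = sum(observed)
    expected = total / len(observed)  # every digit gets the same expected count
    contributions = [(o - expected) ** 2 / expected for o in observed]
    return contributions, sum(contributions)

if __name__ == "__main__":
    contributions, stat = chi_square_uniform(OBSERVED)
    for digit, c in enumerate(contributions):
        print(f"digit {digit}: {c:.2f}")
    print(f"chi-square = {stat:.2f}, critical = {CRITICAL_DF9_P05}")
    print("exceeds 5% critical value" if stat > CRITICAL_DF9_P05 else "within 5% critical value")
```

Exceeding the critical value only says the digits are unlikely to be uniform; it says nothing about why.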

Stockwell writes of the results:

The chi-square test is prone to produce false positives for small samples. Also, there are a number of innocent reasons that digit frequency may diverge from expected. However, the tests are very sensitive. Even if arithmetic operations are performed on data after the manipulations, the ‘fingerprint’ of human intervention can remain.
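Stockwell’s caveat about false positives can be illustrated with a small simulation. The sketch below (my own illustration, not part of WikiChecks) draws genuinely uniform random digits and counts how often a chi-square test at the 5% level flags them anyway. The sample size, trial count, and seed are arbitrary assumptions.

```python
import random

# Standard chi-square critical value for 9 degrees of freedom at the 5% level.
CRITICAL_DF9_P05 = 16.919

def chi_square_uniform(counts):
    """Pearson chi-square statistic against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

def false_positive_rate(n_digits, n_trials, seed=0):
    """Share of trials in which purely random digits are flagged as 'managed'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        counts = [0] * 10
        for _ in range(n_digits):
            counts[rng.randrange(10)] += 1
        if chi_square_uniform(counts) > CRITICAL_DF9_P05:
            hits += 1
    return hits / n_trials

if __name__ == "__main__":
    print(false_positive_rate(n_digits=500, n_trials=400, seed=1))
```

By construction a 5% test should flag roughly one untouched dataset in twenty, so a single “significant” result is a reason to look closer rather than a verdict.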

I also ran it on the UAH data and RSS data and it flagged similar issues, though with different deviation scores. Stockwell did the same and writes:

The results, from lowest deviation to highest, are listed below.

RSS – Pr<1

GISS – Pr<0.05

CRU – Pr<0.01

UAH – Pr<0.001

Numbers such as missing values in the UAH data (-99.990) may have caused its high deviation. I don’t know about the others.
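The effect of missing-value markers is easy to see in a toy example. The Python sketch below tallies the final digit of every number in a piece of text, with an option to skip sentinel tokens such as -99.990. The sample string is made up for illustration and does not reflect the real UAH file layout; the function name and `skip` parameter are likewise hypothetical.

```python
import re
from collections import Counter

# Matches decimals such as -99.990 or 0.12, and plain integers.
NUMBER = re.compile(r"-?\d+\.\d+|-?\d+")

def final_digit_counts(text, skip=()):
    """Count the last character of each numeric token, ignoring tokens in `skip`."""
    counts = Counter()
    for token in NUMBER.findall(text):
        if token in skip:
            continue  # drop missing-value markers before counting
        counts[token[-1]] += 1
    return counts

if __name__ == "__main__":
    sample = "0.12 -99.990 0.07 -99.990 0.31"  # invented values, not real data
    print(final_digit_counts(sample))
    print(final_digit_counts(sample, skip={"-99.990"}))
```

Every sentinel left in adds a spurious zero, so a file with many missing months would show an excess of zeros that has nothing to do with the measurements.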

Not being familiar with this mathematical technique, I could do little to confirm or refute the findings, so I let it pass until I could get word of replication from some other source.

It didn’t take long. About two hours later, Lubos Motl of The Reference Frame posted results he obtained independently, using another method, when he ran some checks of his own:

David Stockwell has analyzed the frequency of the final digits in the temperature data by NASA’s GISS led by James Hansen, and he claims that the unequal distribution of the individual digits strongly suggests that the data have been modified by a human hand.

With Mathematica 7, such hypotheses take a few minutes to be tested. And remarkably enough, I must confirm Stockwell’s bold assertion.

But that’s not all. Lubos goes on to say:

Using the IPCC terminology for probabilities, it is virtually certain (more than 99.5%) that Hansen’s data have been tampered with.

To be fair, Lubos runs his test on UAH data as well:

It might be a good idea to audit our friends at UAH MSU where Stockwell seems to see an even stronger signal.

In plain English, I don’t see any evidence of man-made interventions into the climate in the UAH MSU data. Unlike Hansen, Christy and Spencer don’t seem to cheat, at least not in a visible way, while the GISS data, at least their final digits, seem to be of anthropogenic origin.

Steve McIntyre offered an explanation in the way rounding occurs when converting from Fahrenheit to Centigrade, but Lubos can’t seem to replicate the same results he gets from the GISS data:

Steve McIntyre has immediately offered an alternative explanation of the non-uniformity of the GISS final digits: rounding of figures calculated from other units of temperature. Indeed, I confirmed that this is an issue that can also generate a non-uniformity, up to 2:1 in the frequency of various digits, and you may have already downloaded an updated GISS notebook that discusses this issue.

I can’t get 4 and 7 underrepresented, but there may exist a combination of two roundings that generates this effect. If this explanation is correct, it reflects a much less unethical approach by GISS than the explanation above. Nevertheless, it is still evidence of improper rounding.
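The rounding explanation can be sketched in a few lines of Python. The example assumes readings recorded to a tenth of a degree Fahrenheit and converted to the nearest tenth of a degree Celsius; that recording precision is my assumption for illustration, and the real GISS processing chain is far more involved. It shows how a unit conversion alone can produce a 2:1 imbalance in final digits, though not the particular pattern seen in the GISS file.

```python
from collections import Counter

def tenths_f_to_tenths_c(f_tenths):
    """Convert tenths of a degree F to the nearest tenth of a degree C."""
    k = f_tenths - 320  # tenths of a degree F above freezing
    # Nearest integer to 5k/9, done in integer arithmetic; exact ties cannot occur.
    return (10 * k + 9) // 18

def final_digit_counts(start_f_tenths, n):
    """Final-digit tally of converted values for n consecutive readings."""
    return Counter(
        tenths_f_to_tenths_c(f) % 10
        for f in range(start_f_tenths, start_f_tenths + n)
    )

if __name__ == "__main__":
    counts = final_digit_counts(320, 900)  # 32.0 F up to 121.9 F
    for digit in range(10):
        print(digit, counts[digit])
```

In this setup the digits 0 and 5 turn up half as often as the others, because nine Fahrenheit steps are squeezed into five Celsius steps and the doubled-up values never land on those two digits.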

Pretty strong stuff, but given the divergence of the GISS signal from other datasets, unsurprising. I wonder if it isn’t some artifact of the GISS homogenization process for surface temperature data, which I view as flawed in its application.

But let’s give the benefit of the doubt here. I want to see what GISS has to say about it; there may be a perfectly rational explanation that will demonstrate that these statistical accusations are without merit. I’m sure they will post something on RC soon.

Stay tuned.

vg
January 15, 2009 5:45 am

Honestly me thinks this whole debate will probably be redundant if temps continue down for the next 10-20 yrs as predicted by the sun (of course they could go up) LOL

January 15, 2009 7:12 am

Dr. Roy Spencer has a new blog entry on how the IPCC models treat cloud feedbacks…
Does Nature’s Thermostat Exist? A Global Warming Debate Challenge
http://www.drroyspencer.com/2009/01/does-nature%E2%80%99s-thermostat-exist-a-global-warming-debate-challenge/

Richard P
January 15, 2009 8:14 am

Boy did Benford’s law drag some cobwebs out of the back filing cabinets. That was so many years ago that Carter was still in office. However, to those that want to “bury” this issue I say no. All questions should be open and up for discussion, if they are reasonable. Benford’s law is a well documented test for manipulation of data either from the system, fabrication, or manual adjustment. To dismiss it out of hand is not appropriate given the errors that have been made in many data sets regarding this issue. The manipulation may be valid such as unit conversion, manual data entry errors, or other unknown issues. But, it is better to know of its existence than to bury it and ignore what is happening. This is not a PR game, but a scientific inquiry and both data and analysis must be shared. Otherwise we are not following scientific methods which with the proper feedback lead to the correct answer.
In my investigations I found an interesting piece of software from Kirix. It allows for the pulling of data from a web page directly for analysis. Looks cool, but is a bit pricey. Follow the link for a look:
http://www.kirix.com/

crosspatch
January 15, 2009 8:39 am

Nylo:
“I have been making numbers with an excel sheet. ”
Might be interesting, where you find missing numbers in a recent year, see if you can identify the station from another source. Weather Underground, for example, keeps data back a few years for a lot of stations. See if you can find the missing value there, plug it in and see if it makes a difference in the outcome.
“This gives an idea of the order of magnitude of the FRAUD we are just watching.”
I wouldn’t attribute to fraud what can be explained by sloth combined with a need to validate one’s own hypothesis. Remember, the entire purpose of the GISSTEMP is to validate Hansen’s climate model.

Editor
January 15, 2009 8:55 am

John Philip (16:07:46) :

I confidently predict that this will turn out to be a non-story. As our host is a meteorologist and interested in matters climatic, and on the day that WUWT was awarded Best Science Blog, can we expect some comment on the news that the American Meteorological Society has honoured the custodian of the GISS dataset with its highest commendation, the Carl-Gustaf Rossby Research Medal (http://www.nasa.gov/centers/goddard/news/topstory/2009/hansen_ams.html)?
Newsworthy, surely?

More on this via Joe D’Aleo’s http://icecap.us/ (see his comments there) is at http://dotearth.blogs.nytimes.com/2009/01/14/weather-mavens-honor-climate-maven/ but be sure to read the comments, e.g.
http://community.nytimes.com/blogs/comments/dotearth/2009/01/14/weather-mavens-honor-climate-maven.html?permid=17#comment17

crosspatch
January 15, 2009 9:17 am

“Remember, the entire purpose of the GISSTEMP is to validate Hansen’s climate model.”
To clarify … GISSTEMP wasn’t designed “to see if the Earth was warming”, it was designed to “show that the Earth is warming”.

hunter
January 15, 2009 9:23 am

If the AGW promoters were open about their sourcing and methods this would not be a credible issue. They are not, however. They are secretive and defensive, and so the issue will remain.

Bernie
January 15, 2009 9:23 am

The deviations in the data records are interesting and should certainly be explored but I feel the notion that someone is deliberately manipulating the data is way off base. Moreover, getting excited about it and suggesting conspiracies undermines the significance of more trenchant criticisms such as siting, infilling, UHI and simply poorly specified GCMs.

G Alston
January 15, 2009 10:03 am

Richard P — However, to those that want to “bury” this issue I say no.
It’s one thing to look at numbers to see if there’s some sort of systemic bias (i.e. equipment issues) but something else entirely to carry on as if there is wrongdoing. The post should have been oriented to looking at a queer set of interesting numbers to the effect of asking how they got there naturally rather than premising nefarious intent.
The former makes sense. The latter doesn’t, especially in that the overall effect seems negligible.
REPLY: If there were not so many odd adjustments or mistakes identified in the GISTEMP record, or if Jim Hansen had decided not to come to the defense of vandals in England, then most certainly this would have been looked at with less suspicion, maybe even not at all. But the list of precedents of odd things seen thus far causes questions like this to be raised in a context that questions the credibility of the dataset and the keeper of it. I gave the benefit of the doubt on this specific issue immediately with my first posting, and like you I think that this is likely an artifact of little significance, but I still have concerns for the overall dataset integrity. – Anthony

stan
January 15, 2009 10:08 am

I posted this at CA, but Steve may snip. So:
“Luis,
You are wrong to belittle this exercise and the surface stations project. You seem to have failed to grasp what the real issue is: credibility. And there are two aspects to the credibility issue. The first is honesty and the second is competence. The possibility that data has been manipulated goes to honesty. It matters not one bit whether the possible manipulation has a significant impact on trendlines, etc. Even if the impact is tiny, if data’s been manipulated, the parties involved are dishonest and all their work should be regarded as unreliable. After all, climate scientists don’t bother to check or replicate each other’s work. If someone’s untrustworthy, their work is untrustworthy. Period. [Note, this standard is especially appropriate for one who thinks that those who disagree with him should go to jail.]
Competence, the second aspect of the credibility issue, is directly addressed by the surface stations study you disparage. Once it was shown that hundreds of stations violate basic scientific standards for placement, the burden was no longer on Watts to demonstrate some quantifiable way to correct the temperature record. The burden properly rests upon those who consider the record authoritative to demonstrate why such incredibly shoddy work has any scientific credibility at all. And further, why the people in charge of such shoddy practices should be given any credence with respect to the rest of their scientific work. Most people expect that those who endeavor to build sophisticated scientific structures using the temperature record ought to first bother to find out if the thermometers are accurate. [maybe it should be a law that climate scientists demonstrate minimal proficiency with a thermometer before receiving a government grant.]
This climate science is the driver for an extraordinary array of political policies. As Steve noted, he first got interested in the hockey stick because the “findings” were being used to drive public policy in Canada. Of course, Hansen has been at the forefront of using this science for political purposes. We may decry the politicization of the science, but we cannot deny that the two are now inextricably intertwined. So your belittling of a statistical exercise which may shed light on the honesty of a central figure to the debate and a database crucial to the scientific arguments reflects either a misunderstanding of the issues or an attempt at obfuscation.”

Andrew
January 15, 2009 10:28 am

Bernie:
The problem is that climate science is such a mess, that we don’t know what set of numbers is dubious… unless someone asks ALL the questions- questions dumb, questions old and questions “unscientific” even.
To rely on one group’s take on the climate (even if that group is a group of scientists) is a fallacy. The possibility does exist that someone is “cooking the books” to enhance the stature of themselves or their particular group they like.
So ALL the questions need to be perpetually asked, including who and who may not be biased (on purpose) in one direction or another. Do you see what I’m getting at? Perhaps one day if some actual evidence is offered, some questions can actually be answered.
Andrew ♫

Magnus
January 15, 2009 12:06 pm

Bernie 09:23. I agree with you!

Bernie
January 15, 2009 12:36 pm

Andrew:
The paranoid will always find someone who is “after” them. I understand the history and the need to ask questions … but asking questions is different from essentially accusing people of wrongdoing when there is, at best, inconclusive evidence. In the scheme of things and without an estimate as to the size and direction of any possible effects – this issue can be seen as a curiosity or another example of skeptics grasping at straws. David and Lubos should look more closely at the data but without the attributions as to intent – at least for now.

Steven Talbot
January 15, 2009 1:57 pm

I gave the benefit of the doubt on this specific issue immediately with my first posting, and like you I think that this is likely an artifact of little significance, but I still have concerns for the overall dataset integrity. –
Are you saying that was your intention with the headline? (“Distribution analysis suggests GISS final temperature data is hand edited – or not” – incidentally, was the “or not” there originally?). If so, I think your words failed you. Your headline insinuates that GISS may be responsible for “editing”. Whatever the strengths or limitations of the distribution analysis, it offers no evidence of that whatsoever.
Saying “X may have done something improper” is not a good way of expressing the benefit of the doubt over such a matter. I am very surprised that you are seemingly so unaware of the implication in your choice of words.

Richard P
January 15, 2009 2:24 pm

G Alston: “It’s one thing to look at numbers to see if there’s some sort of systemic bias (i.e. equipment issues) but something else entirely to carry on as if there is wrongdoing.”
It was not my intent to imply any wrongdoing by anyone. There are very few things that I ascribe to malice or fraud especially on complex issues. I do believe in making sure that the data is accurate, and use tests even on data I generate as a double check of accuracy and quality. My point was to make sure that we understood that it was not a problem rather than dismissing it out of hand.
My first rule of engineering for interns fresh from school was do not make any assumptions without acknowledging what they are. Many times problems were resolved because what we thought was happening turned out not to be true. By questioning those assumptions you sometimes find the problem much faster rather than going along fat, dumb and happy. Now you don’t question basic physics, ascribe to conspiracy theories, or assume that the aliens did it; however, reasonable questions should be allowed, especially if errors have been seen in the past.
My prediction for what it is worth would be that this is either a non issue or minor property of the system and will have no effect on the outcome. Of course I am making some assumptions that may be wrong, so now my interns may get some payback ;-).

George E. Smith
January 15, 2009 2:30 pm

“” Pamela Gray (21:02:34) :
I should stop trying ti type afyter a nicr glass of gnarlay head zingfandel windee. I wanted them so I went with them.
And now back to science. “”
Lordy lordy ! lady do you also arm rassle down at the local selloon ?
That there zingfandel is lumberjack rocket fuel. Can’t you find a nice lady like wine like Tawny Port or something. I betcha dring Guinness Stout too !
I’d buy the raw leather boots; with the studs on them, ‘case you gots to kick some hooligins outa that place !
We’ll mind our manners a bit more round here, case we draw on yer ire !
George

Andrew
January 15, 2009 3:07 pm

Bernie,
In climate science, the “evidence” is dependent on what you personally believe. Accusations of wrongdoing should be pervasive and plentiful in a place where the general population claims to know something it doesn’t.
Andrew

Sekerob
January 15, 2009 3:32 pm

If you want to know if the title was edited, use the time machine functions of google or simply google search. Someone copied the original post: and there is no – or not in there
http://www.bolsanobolso.com/showthread.php?t=26859
In fact, someone posted the original link elsewhere on a :
http://wattsupwiththat.com/2009/01/14/distribution-analysis-suggests-giss-final-data-is-hand-edited/ and this is how it still shows in the browser address bar.
Editing? No, Not here and certainly not without audit trail, something Steve McI would get aghasted over.

Mike Bryant
January 15, 2009 3:53 pm

I’m aghasted too… NOT.

Steven Talbot
January 15, 2009 3:58 pm

At this point, since there is evidence both ways we don’t really know and I think the title reflects that.
What evidence is there of GISS having hand edited their final data? None whatsoever. Saying “though maybe not” after the event does not make your insinuation “fair”. If I were to insinuate that your post is “evidence” of malicious denigration, then what would you think of that? I suspect you would find such an insinuation offensive. The fact of something being a possibility does not mean that it gives evidence of itself!

January 15, 2009 4:48 pm

What evidence is there of GISS having hand edited their final data? None whatsoever.

Actually, distribution analysis is evidence. It is not proof, and whether it is conclusive remains to be seen.
And as Steve McIntyre says, “…after refusing for a long time and under protest, Hansen did archive his source code, which is a mess – which is probably why he didn’t want to archive it.”
So we have a source code that’s a mess, indicating at the very least a lack of competence, and we have evidence that the data was manipulated.
Maybe there’s an innocent explanation for the distribution analysis findings. But there is no excuse for a shoddy source code. How can that result in good science?
And how can Hansen justify altering the past record — based on new temperature measurements?

Steven Talbot
January 15, 2009 5:00 pm

Actually, distribution analysis is evidence. It is not proof, and whether it is conclusive remains to be seen.
It’s evidence, but we don’t know what it’s evidence of. Insinuating that it’s evidence of GISS having hand-edited their final results is simply unfounded. Very obviously, the distribution analysis could be evidence of all sorts of things that have nothing whatsoever to do with GISS. It’s not even evidence yet of the data being “manipulated” as you state, let alone manipulated by GISS. You are simply expressing confirmation bias in your post, IMV, by referencing your other beefs with GISS. That’s humanly understandable, maybe, but you should distinguish it from scientific enquiry.

Steven Talbot
January 15, 2009 6:11 pm

Anthony,
Have you made similar complaints over at Lubos Motl’s site about his choice of the word “cheating” in the title?
Or to David Stockwell for use of the word “fraud” in his title while referencing GISS in the story?

No, I’d not been aware of those sites. I will follow up tomorrow and post my views. I think it is entirely improper to suggest fraud or cheating on the basis of what has been presented. If it were to turn out that anyone were engaged in such practices then they should face the severe consequences of their actions. However, I take it as a basic principle of human decency that one should presume innocence until guilt is proven.
I’m just wondering if your issue is specific to this blog or to the issue of choice of words in titles about the story in general.
No, my issue is to do with the suggestion of possible cause in respect of evidence which may have no relationship to the posited cause. If Hansen (or whoever) were to do this I expect you would be all over him with your criticism, and you would be right in that.
I think this ‘story’ may have contributed to doubts about GISS in ways that turn out to be entirely unjustified. You have responsibility in what you say, as do I, and as do Motl and Stockwell. I will post my opinions on their sites. I do not think it is proper to suggest malfeasance without proof of the matter.