Distribution analysis suggests GISS final temperature data is hand edited – or not

UPDATE: As I originally mentioned at the end of this post, I thought we should “give the benefit of the doubt” to GISS, as there may be a perfectly rational explanation. Steve McIntyre indicates that he has also run an analysis and doubts the others’ findings:

I disagree with both Luboš and David and don’t see anything remarkable in the distribution of digits.

I tend to trust Steve’s intuition and analysis skills, as his track record has been excellent. So at this point we don’t know the root cause, or even whether there is any human touch to the data. But as Lubos said on CA, “there’s still an unexplained effect in the game”.

I’m sure it will get much attention as the results shake out.

UPDATE2: David Stockwell writes in comments here:

Hi,

I am gratified by the interest in this very preliminary analysis. There are a few points from the comments above.

1. False positives are possible, for a number of reasons.

2. Even though data are subjected to arithmetic operations, distortions in digit frequency at an earlier stage can still be observed.

3. The web site is still in development.

4. One of the deviant periods in GISS seems to be around 1940, the same as the ‘warmest year in the century’ and the ‘SST bucket collection’ issues.

5. Even if, in the worst case, there was manipulation, it wouldn’t affect AGW science much. The effect would be small. It’s about something else. Take the Madoff fund. Even though investors knew the results were managed, they still invested because the payouts were real (for a while).

6. To my knowledge, no one has succeeded in exactly replicating the GISS data.

7. I picked that file as it is the most used – global land and ocean. I haven’t done an extensive search of files as I am still testing the site.

8. Lubos replicated this study more carefully, using only the monthly series, and got the same result.

9. Benford’s law (on the first digit) has a logarithmic distribution, and really only applies to data spanning many orders of magnitude. Measurement data, which often has a constant first digit, doesn’t work, although the second digit seems to. I don’t see why the last digit wouldn’t work; it should approach a uniform distribution according to Benford’s postulate.

That’s all for the moment. Thanks again.
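A quick note for readers unfamiliar with the distinction Stockwell draws in point 9. Here is a minimal sketch (mine, in Python; not code from WikiChecks) of the two expected distributions: logarithmic for first digits under Benford’s law, roughly uniform for the final digits of measurement data.

import math

# Benford's law for the FIRST significant digit: P(d) = log10(1 + 1/d).
# '1' leads about 30.1% of values, '9' only about 4.6%.
for d in range(1, 10):
    print(d, round(math.log10(1 + 1 / d), 3))

# FINAL digits, by contrast, are expected to be close to uniform:
# each of 0-9 should appear about 10% of the time. That uniformity is
# the null hypothesis behind the last-digit tests discussed below.
print("expected share of each final digit:", 0.10)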


This morning I received an email outlining some work David Stockwell has done checking the GISS global Land-Ocean temperature dataset:

Detecting ‘massaging’ of data by human hands is an area of statistical analysis I have been working on for some time, and devoted one chapter of my book, Niche Modeling, to its application to environmental data sets.

The WikiChecks web site now incorporates a script for running a Benford’s-law analysis of digit frequency, a technique sometimes used in the numerical analysis of tax and other financial data.

The WikiChecks site says:

‘Managing’ or ‘massaging’ financial or other results can be a very serious deception. It ranges from rounding numbers up or down, to total fabrication. This system will detect the non-random frequency of digits associated with human intervention in natural number frequency.

Stockwell runs a test on GISS and writes:

One of the main sources of global warming information, the GISS data set from NASA showed significant management, particularly a deficiency of zeros and ones. Interestingly the moving window mode of the algorithm identified two years, 1940 and 1968 (see here).

You can run this test yourself: visit the WikiChecks web site and paste the URL for the GISS dataset

http://data.giss.nasa.gov/gistemp/tabledata/GLB.Ts+dSST.txt

into it and press submit. Here is what you get as output from WikiChecks:

GISS

Frequency of each final digit: observed vs. expected

Digit            0      1      2      3      4      5      6      7      8      9   Totals
Observed       298    292    276    266    239    265    257    228    249    239     2609
Expected       260    260    260    260    260    260    260    260    260    260     2609
Variance      5.13   3.59   0.82   0.08   1.76   0.05   0.04   4.02   0.50   1.76    17.75
Significant      *      .                                         *

Statistic      DF   Obtained    Prob   Critical
Chi Square      9      17.75   <0.05      16.92
RESULT: Significant management detected. Significant variation in digit 0: (Pr<0.05) indicates rounding up or down. Significant variation in digit 1: (Pr<0.1) indicates management. Significant variation in digit 7: (Pr<0.05) indicates management.
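For anyone who wants to see the mechanics rather than trust a web form, here is a minimal sketch of this kind of last-digit chi-square test. To be clear, this is my own reconstruction from the output above, not Stockwell’s actual script, and the cleaning a real run needs is only gestured at.

from typing import Iterable

# Minimal last-digit chi-square test, reconstructed from the WikiChecks
# output above (NOT Stockwell's actual script). Null hypothesis: each
# final digit 0-9 appears with equal probability 1/10.
def last_digit_test(values: Iterable[str]):
    counts = [0] * 10
    for v in values:
        counts[int(v[-1])] += 1          # tally the final digit
    expected = sum(counts) / 10
    chi2 = sum((obs - expected) ** 2 / expected for obs in counts)
    return counts, chi2

# Toy usage on made-up anomaly values; a real run would first strip years,
# column headers, and sentinels such as UAH's -99.990 missing-value code.
sample = ["14", "-5", "23", "7", "40", "31", "-12", "58", "66", "90"]
counts, chi2 = last_digit_test(sample)
print(counts, round(chi2, 2))  # reject uniformity at 5% if chi2 > 16.92 (9 df)

The 16.92 threshold is the standard 5% critical value for a chi-square statistic with 9 degrees of freedom, matching the WikiChecks output above.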

Stockwell writes of the results:

The chi-square test is prone to produce false positives for small samples. Also, there are a number of innocent reasons that digit frequency may diverge from expected. However, the tests are very sensitive. Even if arithmetic operations are performed on data after the manipulations, the ‘fingerprint’ of human intervention can remain.

I also ran it on the UAH data and RSS data and it flagged similar issues, though with different deviation scores. Stockwell did the same and writes:

The results, from lowest deviation to highest, are listed below.

RSS – Pr<1

GISS – Pr<0.05

CRU – Pr<0.01

UAH – Pr<0.001

Sentinel values such as the missing-value code in the UAH data (-99.990) may have caused its high deviation. I don’t know about the others.

Not being familiar with this mathematical technique, I could do little to confirm or refute the findings, so I let it pass until I could get word of replication from some other source.

It didn’t take long. About two hours later, Luboš Motl of The Reference Frame posted the results he obtained independently via another method when he ran some checks of his own:

David Stockwell has analyzed the frequency of the final digits in the temperature data by NASA’s GISS led by James Hansen, and he claims that the unequal distribution of the individual digits strongly suggests that the data have been modified by a human hand.

With Mathematica 7, such hypotheses take a few minutes to be tested. And remarkably enough, I must confirm Stockwell’s bold assertion.

But that’s not all, Lubos goes on to say:

Using the IPCC terminology for probabilities, it is virtually certain (more than 99.5%) that Hansen’s data have been tampered with.

To be fair, Lubos runs his test on UAH data as well:

It might be a good idea to audit our friends at UAH MSU where Stockwell seems to see an even stronger signal.

In plain English, I don’t see any evidence of man-made interventions into the climate in the UAH MSU data. Unlike Hansen, Christy and Spencer don’t seem to cheat, at least not in a visible way, while the GISS data, at least their final digits, seem to be of anthropogenic origin.

Steve McIntyre offered an explanation involving the way rounding occurs when converting from Fahrenheit to Celsius, but Lubos can’t quite reproduce the GISS pattern from rounding alone:

Steve McIntyre has immediately offered an alternative explanation of the non-uniformity of the GISS final digits: rounding of figures calculated from other units of temperature. Indeed, I confirmed that this is an issue that can also generate a non-uniformity, up to 2:1 in the frequency of various digits, and you may have already downloaded an updated GISS notebook that discusses this issue.

I can’t get 4 and 7 underrepresented, but there may exist a combination of two roundings that generates this effect. If this explanation is correct, it is the result of a much less unethical approach by GISS than the explanation above. Nevertheless, it is still evidence of improper rounding.
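McIntyre’s rounding explanation is easy to demonstrate in principle. The sketch below (my own illustration; not McIntyre’s or Motl’s code) puts temperatures on a 0.1°F grid, converts them to Celsius, re-rounds to tenths, and tallies the final digits. The double rounding maps 18 input steps onto 10 output digits per degree, so some digits land roughly twice as often as others, the same “up to 2:1” non-uniformity Lubos describes.

from collections import Counter

# Double rounding demo: values on a 0.1 F grid become a ~0.0556 C grid
# after conversion; re-rounding that to 0.1 C squeezes 18 input steps
# into 10 output digits, so some final digits occur about twice as often.
counts = Counter()
for tenths_f in range(-1000, 1000):      # -100.0 F through +99.9 F
    f = tenths_f / 10
    c = round((f - 32) * 5 / 9, 1)       # convert, then round to 0.1 C
    counts[str(c)[-1]] += 1              # final digit of the rounded value
print(sorted(counts.items()))            # clearly non-uniform counts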

Pretty strong stuff from Lubos, but given the divergence of the GISS signal from other datasets, unsurprising. I wonder if it isn’t some artifact of the GISS homogenization process for surface temperature data, which I view as flawed in its application.

But let’s give the benefit of the doubt here. I want to see what GISS has to say about it; there may be a perfectly rational explanation that demonstrates these statistical accusations are without merit. I’m sure they will post something on RC soon.

Stay tuned.

Andrew
January 15, 2009 6:31 pm

Methinks someone may be scared.
Andrew

Eric Chieflion
January 15, 2009 6:42 pm

Steven Talbot (17:00:31), you are missing the underlying point. If Dr. Hansen’s methodologies were transparent, there would not be the accusations about which you protest.

Steven Talbot
January 15, 2009 6:47 pm

Anthony,
Well, regardless of whether or not you are surprised, I didn’t realise they were sources where I could post comments. I have already posted a comment at Stockwell’s site but am struggling with Google passwords at Motl’s. In future I guess I’d better make sure that if I make any critical comment here I have already made it at the sites of those whom you reference.
You are not presenting a case in a court of law, ‘in camera’ before a judge. You are publishing your views on your own public web site. If you do not consider that you have responsibility to avoid the insinuation, without proof, of base motives to others in such a forum, then we must accept that we have different standards.

Andrew
January 15, 2009 7:07 pm

I think it’s a joke that past temps are changed. I think changing information as you go should not be tolerated, period, by anyone. You’d think that science would police and eliminate such a practice, were it discovered.
I knew AGW was a fraud all along, but the day Governator Ahnold came out and said, “The science is in… man has created the global warming” I had an extra piece of pie for dessert that night and slept like a baby.
Andrew

Pamela Gray
January 15, 2009 7:29 pm

Sometimes data analysis can produce a false positive. Been there done that. But the only way to know is to do a different analysis to back up or contradict the first analysis. It seems that the kind of analysis done here would call for a closer inspection of how data is handled.
To tell the truth, even seasoned and educated scientists can make biased choices, all the while swearing truthfully they believed they were being objective. “Facilitated Communication” is one such phenomenon that was finally given sufficient scrutiny to discover it was not what it seemed to be and was actually fraught with human tinkering, though done with the best of intentions. What was eventually uncovered was that the human facilitators had no idea they were the ones causing the phenomena to happen, but indeed they were doing exactly that.
Those that have a fanatical belief in the dire consequences of global warming and are intricately involved in the data surrounding global temperatures and CO2 measures, to the point that they are lauded for it, are at high risk of doing the same thing: facilitating the phenomenon to occur, even though intentions may be honorable and they faithfully swear that they have been “objective” in their data handling.
This analysis should be taken seriously by global climate change scientists and should be followed up with in-house analysis to see if indeed, hands of been at work.

Pamela Gray
January 15, 2009 7:31 pm

geez, just three sips of a nice warm sherry. “…hands HAVE been at work.”

crosspatch
January 15, 2009 7:39 pm

My personal take on this whole thing:
The data cannot be expected to pass a random test because it isn’t random. The final numbers, after calculating fills, averaging, and then “adjusting”, will show significant artifacts from those processes. In fact, I would expect those artifacts to change from month to month because, for example, if you add a new month, when calculating a new “average” for a fill value, you divide by an additional number of months.
I would expect the raw input data to be more random over a long period of time (but not over short periods of, say, less than 10 years).
This particular issue is, I believe, a non-issue and tends to distract the focus from important measures such as the positive feedback built into the calculations, where recent warming increases past warming (and cooling back past the “hinge” point). For example, let’s say we get a temperature for a month from a station and it includes all daily readings; no missing values. Let’s say it is warmer than “average”. That number impacts all the calculated fill values for previous months’ missing values. It will impact the future too, because future temperatures that might be missing will be calculated with this warmer temperature. Missing values become extremely important when they happen during record warm and cold periods.
The impact of all of this is to exaggerate trends. One might say that this method of using “averages” doesn’t matter because 50% of the time the temperature will be above average and 50% below. That isn’t really true. First of all, we are using an AVERAGE, not a median. Secondly, there are natural cyclic modes that cause climate to change in the same direction for long periods of time. The 50% rule would probably only apply over periods longer than 50 years. For about 30 years you can expect temperatures above “average” and for about 30 years temperatures below “average”, with very little time at “average” temperatures. GISS happens to select the coldest period in the past 60 years as its “baseline”, further inflating the appearance of “warming”, as their “average” is actually a very cool period.
This is why I believe that our using 30 year periods of time as the “normal” baseline is incorrect. I believe we should use a period that captures the most recent entire PDO cycle (both the warm and cool phase) for Northern Hemisphere temperature baselines. And we should expect 30 years of above “average” and 30 years of “below average” temperatures as normal climate variability. This should end up being something around 60 years but it will vary in duration as cycles vary in duration.
In the meantime, what I believe would be a more responsible way for GISS or even NOAA to provide their product would be first to attempt to see if a missing value from the standard source of data is available from a different source of the same data. If so, that value should be filled from the actual measurement and not from calculated “fills”. Secondly, stations that have dropped out of the record should be reinstated *or* the ratio of urban/rural stations should be adjusted. Third, as a sanity check, the deviation of GISSTEMP from RSS and/or UAH should be monitored and if that deviation changes greatly, particularly over time, it should act as a flag for some QA to see what is going on and why the deviation has changed. Of particular concern to me are months where the GISS monthly change differs from the satellite measurements in both magnitude and direction.
Has anyone ever done an RSS / GISS anomaly plot over time to see what the differences are, if they have changed at certain points in time since 1979 or if they are diverging?
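REPLY: For readers who want crosspatch’s point about retroactive fills in miniature, here is a toy sketch. It is illustrative only; the station values and the simple mean are my own stand-ins, not the actual GISS fill algorithm.

# A missing month filled with a station average shifts whenever new data
# arrive. Toy numbers; NOT the actual GISS fill method.
readings = {1998: 14.2, 1999: 14.0, 2001: 14.3}        # year 2000 is missing
fill_2000 = sum(readings.values()) / len(readings)      # initial fill: ~14.17
readings[2002] = 15.1                                   # one new, warmer year arrives
new_fill_2000 = sum(readings.values()) / len(readings)  # past "fill" warms to 14.40
print(round(fill_2000, 2), round(new_fill_2000, 2))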

Mike Bryant
January 15, 2009 7:44 pm

I think it should be ok for me to change the numbers on my speedometer, odometer, bathroom scale, driver’s license, birth certificate and paycheck… retroactively even.

crosspatch
January 15, 2009 7:51 pm

“what do you think about this before and after graph showing two GISS temperature dataset releases that have been made public, differing only by date:”
Looks to me like there is a significant “hinge” or “pivot” point at about 1965. Temperatures after that date are adjusted warmer in the newer picture, temperatures before that date are adjusted colder.

Pamela Gray
January 15, 2009 8:06 pm

You can do that on bathroom scales??????? WHO KNEW????????

January 15, 2009 8:20 pm

Steven Talbot,
I am not as certain about previous references to Mr. David Stockwell, but I know that Luboš Motl’s interesting website has been referenced here many, many times. And not just in the recent past. I really wonder how you could visit here so often and still be unaware of his site.
Like WUWT, Dr. Motl’s The Reference Frame should be required reading for AGW proponents, so they can at least understand the opposing arguments.

Jeff Alberts
January 15, 2009 8:30 pm

Pamela Gray said: “What was eventually uncovered was that the human facilitators had no idea they were the ones causing the phenomena to happen, but indeed they were doing exactly that.”

I don’t believe they had no idea at all. Perhaps a very small number didn’t, but I think the vast majority knew exactly what they were doing.

Mike Bryant
January 15, 2009 9:01 pm

Crosspatch, I think everyone who lived in 1965 knows that it really was a “hinge point” as you call it. Things are actually much cooler now than they were before 1965, and that’s hot, to quote Paris Hilton. So I believe that Hansen may have been a resident of Haight-Ashbury. Of course this conjecture could be wrong.
Thanks for the blink temperature comparator, 1996, 2007… Steve resurrected 1934, but Big Jim buried it alive again…

G Alston
January 15, 2009 9:11 pm

Anthony:
Mr. Talbot says above — “You are not presenting a case in a court of law, ‘in camera’ before a judge. You are publishing your views on your own public web site.”
This is what I’d been thinking; mostly I was concerned about the PR aspect of all of this. It smacks of nitpicking so as to go after Dr. Hansen. It’s just not YOU. This isn’t the sort of mistake you make.
Now while I wholeheartedly agree that Hansen OUGHT to be gone after, I’d much rather see him in a wrestling smackdown (complete with folding chair blows) for things like changing past temps or encouraging criminals.
Going after him for what appears as an instrumental or algorithmic artifact is like ignoring the egregious stuff and settling for gossip about him and his intern.
And make no mistake, in the mile high view this can (and will) be seen as going after Dr. Hansen. A cheap shot. Even if that wasn’t the intent. You know that Tamino et al are going to eat this up, and this can damage hard won credibility.
That’s my view. I’ll shut up now. Thanks for listening. 🙂

Jeff Alberts
January 15, 2009 9:33 pm

So I believe that Hansen may have been a resident of Haight-Ashbury.

That would explain the combover.

Editor
January 15, 2009 10:08 pm

Nylo (02:50:29) :

Anyway, what kind of “missing values” in 1900 are being calculated from averages which include data from 2008? Are we talking of specific days in specific stations, or monthly averages of specific stations, or monthly averages of entire regions, or global monthly averages? Obviously all of them will be changed, but where does the problem start?

There was a post several months ago detailing how USHCN adjustments are made, and IIRC, Steve McIntyre made some forays into seeing how many datapoints are missing.
I didn’t find the link I wanted, but http://wattsupwiththat.com/2008/09/23/adjusting-pristine-data/ is informative.
Too late for more digging.
Temp near Concord, NH: -11.2°F. Cold, and 3rd coldest in my five years of detailed records:

mysql> select dt, hi_temp, lo_temp from daily where lo_temp <= -10;
+------------+---------+---------+
| dt         | hi_temp | lo_temp |
+------------+---------+---------+
| 2004-01-10 |     7.2 |   -11.0 |
| 2004-01-14 |     0.9 |   -13.5 |
| 2004-01-16 |    10.4 |   -11.0 |
| 2005-01-22 |    10.2 |   -13.0 |
+------------+---------+---------+
Like MattN noted, "If we get any more global warming, we’ll all freeze to death!"

Editor
January 16, 2009 5:10 am

Yay – Concord NH set a new low temp record of -22F this AM. I figure if you have to endure extreme weather, it might as well be record setting.
Here at home, about 10 miles from the Concord weather station, I reached only -17.5F. I’m in a valley where cold air pools on clear, calm nights, but trees and buildings block some sky exposure.
The Concord station is at a small airport (no regular commercial traffic) on a flat (duh!) plain near the Merrimack river valley. Under really good radiational cooling they usually get colder than I do.
My last 48 hours of data are at http://home.comcast.net/~ewerme/wx/current.htm

Steven Talbot
January 16, 2009 6:48 am

Steven Talbot, same question that I posed to Andrew.
That is:
what do you think about this before and after graph showing two GISS temperature dataset releases that have been made public, differing only by date:
http://zapruder.nl/images/uploads/screenhunter3qk7.gif
Do you think it is OK to change temperatures in the past? Do you think that type of behavior would be tolerated in, say, a medical study? A vehicle safety test? Or how about financial records?
Let’s hear what you think about it.

Firstly I think it’s not the subject of this thread and, as I have suggested to Smokey above, referencing other beefs with GISS is indicative of confirmation bias at work, IMV.
However, since you ask what I think –
1. I am unclear as to whether you are concerned about the GISS process only or about the USHCN data with which they work. I’ll presume you are considering the whole ‘package’, though I think it relevant to recognise different responsibilities.
2. As you are well aware, the most significant adjustments were described in Hansen et al. 2001: http://pubs.giss.nasa.gov/abstracts/2001/Hansen_etal.html. In brief, they encompassed USHCN adjustments for time of observation bias, station history, instrumentation changes and urban warming. The main difference between USHCN and GISS adjustments was GISS’s greater negative correction for urban bias (about -0.15°C over 100 years, compared with a USHCN urban adjustment of -0.06°C).
3. You ask if I think it’s OK to change the data from the past. I certainly do. For example, if the influence of UHI can be analysed, should it not be adjusted for? If a station has a warming bias because of its location, should that not be adjusted? Or if time of observation has systematically changed from afternoon to morning, then how can we assess a meaningful view of temperature change over time without adjusting for that?
Given that the whole thrust of your surface stations project is to examine the extent to which temperature readings are uncorrupted by spurious influences I find your questions here rather puzzling. The USHCN (GISS) adjustments have been done in order to remove spurious biases. I think that is entirely proper and, yes, I would expect exactly the same from a medical study, safety test or financial record (to give you an example of the last, I would expect the performance of financial products to be assessable in the context of inflation adjustment).
If it were to be recognised that UHI effects have been underestimated for the last thirty years, say, would you object to those temperature records being adjusted downwards to give us an unbiased assessment of the situation?
It was, of course, an incorrectly computed time of observation bias (satellite drift) that had corrupted the UAH records up to 2005. Did you object to Christy & Spencer recognising their error when it was pointed out to them and thereafter changing their own past data?

Richard P
January 16, 2009 7:34 am

Something just struck me this morning. It is -30F at my house in Norway, IA, and Dubuque, IA tied a record at -30F for today. This record for today was from 1888. Have these early records been adjusted in similar ways to the GISS data? Or are only the records for “climate change” analysis purposes changed, and not the actual readings? For if there was truly a problem with the data from these early periods, then should not the actual records be adjusted?
Just a question.

Steven Talbot
January 16, 2009 7:38 am

Incidentally, this entry on the GISS ‘Updates to Analysis’ page may be relevant to the distribution issue (apols if mentioned before) –
Aug. 11, 2008: Nick Barnes and staff at Ravenbrook Limited have generously offered to reprogram the GISTEMP analysis using Python only, to make it clearer to a general audience. In the process, they have discovered in the routine that converts USHCN data from hundredths of °F to tenths of °C an unintended dropping of the hundredths of °F before the conversion and rounding to the nearest tenth of °C. This did not significantly change any results since the final rounding dominated the unintended truncation. The corrected code has been used for the current update and is now part of the publicly available source.
http://data.giss.nasa.gov/gistemp/updates/
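REPLY: For the curious, here is a toy numeric illustration of that truncation (a sketch of mine, not the GISTEMP code) showing why the final rounding usually dominated it, as the GISS note says:

# Dropping the hundredths of a degree F before converting rarely changes
# the result once it is rounded to tenths of a degree C. Toy values only.
f = 53.67                                  # a reading in hundredths of F
f_truncated = int(f * 10) / 10             # the unintended truncation: 53.6
for value in (f, f_truncated):
    print(round((value - 32) * 5 / 9, 1))  # both print 12.0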

Steven Talbot
January 16, 2009 8:36 am

Anthony,
I am not asserting the absolute accuracy of the methodology, though I would probably go along with your word ‘representative’. Clearly there is error in the process, though whether or not that exceeds the GISS error bands I do not know.
I am well aware there are examples of poorly sited sensors, and I am sure that you report on microclimate biases that introduce spurious cooling as well as those that introduce spurious warming, such as instances of sensors influenced by tree canopy. I understand from R.P. Sr’s papers that the influence of asphalt is seasonal – cooling in winter, warming in summer. The quantification of that is tricky enough, let alone a total quantification of all biases. I do think that the GISS methodology seeks to dilute the influence of anomalous readings, but whether that is effective or not is a reasonable line of enquiry.
I look forward to a time when there are more satellites in space which have been designed specifically to assess our climate. They will doubtless have their own teething problems and systemic errors, but the closer we get to fuller and more accurate information the better.
So, I wholly agree that there is evidence of inaccuracy in the record. I don’t find that very surprising, really, but I agree that work should continue in the direction of improvement. What I do not share with many commentators here is the ready presumption that inaccuracies and adjustments are evidence of human bias in favour of showing a warming climate. I am not aware of any specific evidence of that, though it seems to be established in the minds of many as if it were a matter of fact. That, it seems to me, is more a matter of ‘faith’ than anything I have said. My view is that we should not presume cause unless it is evidenced, and that, I suggest, is a sceptical stance.

Andrew
January 16, 2009 10:00 am

When you change the record of anything, you are changing the history. Temperature is history. History is supposed to be what happened and what people thought at the time. When you give yourself the right to change the record you are giving yourself the right to change history. Anyone who wants to do this wants to hide the past for some reason. It doesn’t matter if they have a good reason or bad to want to change it. It doesn’t matter. Changing it at all is a bad idea.
I work for financial institutions in my day job, and we document everything within reason that is job related. The documentation has to reflect the history, so we can continue to learn from it, for one reason. If you change it, there is nothing to help you remember what actually happened. If a mistake is made, or something is wrong, the good and the bad are all recorded and we go correctly from there. If I ever went back and tried to change the history of what happened to my company, I would be fired, not to mention my personal relationship with the people that run the places would be destroyed. They would consider it a betrayal.
Andrew ♫