UPDATE: See the first ever CONUS Tavg value for the year from the NCDC State of the Art Climate Reference Network here and compare its value for July 2012. There’s another surprise.
Glaring inconsistencies found between State of the Climate (SOTC) reports sent to the press and public and the “official” climate database record for the United States. Using NCDC’s own data, July 2012 can no longer be claimed to be the “hottest month on record”. UPDATE: Click graph at right for a WSJ story on the record.
First, I should point out that I didn’t go looking for this problem. It was a serendipitous discovery that came from looking up the month-to-month average temperature for the CONtiguous United States (CONUS) for another project, which you’ll see a report on in a couple of days. What started as an oddity noted for a single month now seems clearly to be systemic over a two-year period. On the eve of what will likely be a pronouncement from NCDC on 2012 being the “hottest year ever”, and since what I found is systemic and very influential to the press and to the public, I thought I should make my findings widely known now. Everything I’ve found should be independently replicable using the links and examples I provide. I’m writing the article as a timeline of discovery.
At issue is the difference between temperature data claims in the NCDC State of the Climate reports issued monthly and at year-end and the official NCDC climate database made available to the public. Please read on for my full investigation.
You can see the most current SOTC for the USA here:
In that SOTC report they state right at the top:
Highlighted in yellow is the CONUS average temperature, which is the data I was after. I simply worked backwards month by month, copying each CONUS Tavg value and pasting it into a spreadsheet.
In early 2011 and late 2010, I started to encounter problems. The CONUS Tavg wasn’t in the SOTC reports, and I started to look around for an alternate source. Thankfully NCDC provided a link to that alternate source right in one of the SOTC reports, specifically the first one where I discovered the CONUS Tavg value was missing, February 2011:
That highlighted in blue “United States” was a link for plotting the 3-month Dec-Feb average using the NCDC climate database. It was a simple matter to switch the plotter to a single month, and get the CONUS Tavg value for Feb 2011, as shown below. Note the CONUS Tavg value at bottom right in yellow:
All well and good, and I set off to continue to populate my spreadsheet by working backwards through time. Where SOTC didn’t have a value, I used the NCDC climate database plotter.
And then I discovered that prior to October 2010, there were no mentions of CONUS Tavg in the NCDC SOTC reports. Since I was also recording the URLs to source each piece of data, I realized that it wouldn’t look all that good to have sources from two different URLs for the same data. So, for the sake of consistency, I decided to use only the CONUS Tavg value from the NCDC climate database plotter, since it seemed to be complete where the SOTC was not.
I set about the task of updating my spreadsheet with only the CONUS Tavg values from the NCDC climate database plotter, and that’s when I started noticing that temperatures between the SOTC and the NCDC climate database plotter didn’t match for the same month.
Compare for yourself:
NCDC’s SOTC July 2012:
Screencap of the claim for CONUS Tavg temperature for July 2012 in the SOTC:
Note the 77.6°F highlighted in blue. That is a link to the NCDC climate database plotter which is:
Screencap of the output from the NCDC climate database, note the value in yellow in the bottom right:
Note the difference. In the July 2012 State of the Climate Report, where NCDC makes the claim of “hottest month ever” and cites July 1936 as the benchmark record that was beaten, they say the CONUS Tavg for July 2012 is: 77.6°F
But in the NCDC climate database plotter output, the value is listed as 76.93°F, almost 0.7°F cooler! They don’t match.
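The size of that gap is easy to check. Here is a minimal sketch using only the two July 2012 values quoted above; the `difference` helper and the variable names are my own illustration, not anything NCDC publishes. Decimal arithmetic is used so the two-decimal values subtract exactly.

```python
from decimal import Decimal

# Values quoted in this article for July 2012 (degrees Fahrenheit).
sotc_july_2012 = Decimal("77.6")       # State of the Climate report
database_july_2012 = Decimal("76.93")  # NCDC climate database plotter


def difference(sotc, database):
    """Return SOTC minus database; a positive result means the database is cooler."""
    return sotc - database


diff = difference(sotc_july_2012, database_july_2012)
print(f"SOTC minus database: {diff}°F")
```

The same helper can be applied to any other month once both values have been looked up from the SOTC report and the plotter.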
I initially thought this was just some simple arithmetic error or reporting error, a one-off event, but then I began to find it in other months when I compared the output from the NCDC climate database plotter. Here is a table of the differences I found for the last two years between claims made in the SOTC report and the NCDC database output.
In almost every instance dating back to the inception of the CONUS Tavg value being reported in the SOTC report, there’s a difference. Some are quite significant. In most cases, the database value is cooler than the claim made in the SOTC report. Clearly, it is a systemic issue that spans over two years of reporting to the press and to the public.
It suggests that claims made by NCDC when they send out these SOTC reports aren’t credible because there are such differences between the data. Clearly, NCDC means for the plotter output they link to be an official representation to the public, so there cannot be a claim of me using some “not fit for purpose” method to get that data. Further, the issue reveals itself in the NCDC rankings report which they also link to in SOTC reports:
Note the 76.93°F I highlighted in yellow. Since it appears in two separate web output products, it seems highly unlikely that this is a “calculation on demand” error. More likely, both products simply display the value stored in the database.
Note the claim made in the NCDC July 2012 SOTC for the July 1936 CONUS Tavg temperature which is:
The previous warmest July for the nation was July 1936, when the average U.S. temperature was 77.4°F.
But now in two places, NCDC is reporting that the CONUS Tavg for July 2012 is 76.93°F, about 0.47°F cooler than the 77.4°F claimed as the previous monthly record in 1936, meaning that July 2012 by that comparison WAS NOT THE HOTTEST MONTH ON RECORD.
The question for now is: why do we appear to have two different sets of data for the past two years between the official database and the SOTC reports and why have they let this claim they made stand if the data does not support it?
There’s another curiosity. The last two months in my table above, October and November 2012, have identical values between the database and the SOTC report for those months.
What’s going on? Well, the explanation is quite simple: it’s a technology gap.
You see, despite what some people think, the nation’s climate monitoring network used for the SOTC reports is not some state of the art system, but rather the old Cooperative Observer Network which came into being in the 1890s after Congress formed the original US Weather Bureau. Back then, we didn’t have telephones, fax machines, radio, modems or the Internet. Everything was observed/measured manually and recorded by hand with pen and paper, and mailed into NCDC for transcription every month. That is still the case today for a good portion of the network. Here’s a handwritten B91 official reporting form from the observer at the station the New York Times claims is the “best in the nation”, the USHCN station in Mohonk, New York:
Note that in cases like this station, the observer sends the report in at the end of the month, and then NCDC transcribes it into digital data, runs that data through quality control to fix missing data and incorrectly recorded data, and all that takes time, often a month or two for all the stations to report. Some stations in the climate network, such as airports, report via radio links and the Internet in near real-time. They get there in time for the end of the month report where the old paper forms do not, hence the technology gap tends to favor more of a certain kind of station, such as airports, over other traditional stations.
NCDC knows this, and reported about it. Note my bolding.
NOAA’s National Climatic Data Center (NCDC) is the world’s largest active archive of weather data. Each month, observers that are part of the National Weather Service Cooperative Observer Program (COOP) send their land-based meteorological surface observations of temperature and precipitation to NCDC to be added to the U.S. data archives. The COOP network is the country’s oldest surface weather network and consists of more than 11,000 observers. At the end of each month, the data are transmitted to NCDC via telephone, computer, or mail.
Typically by the 3rd day of the following month, NCDC has received enough data to run processes which are used to calculate divisional averages within each of the 48 contiguous states. These climate divisions represent areas with similar temperature and precipitation characteristics (see Guttman and Quayle, 1996 for additional details). State values are then derived from the area-weighted divisional values. Regions are derived from the statewide values in the same manner. These results are then used in numerous climate applications and publications, such as the monthly U.S. State of the Climate Report.
NCDC is making plans to transition its U.S. operational suite of products from the traditional divisional dataset to the Global Historical Climatological Network (GHCN) dataset during in the summer of 2011. The GHCN dataset is the world’s largest collection of daily climatological data. The GHCN utilizes many of the same surface stations as the current divisional dataset, and the data are delivered to NCDC in the same fashion. Further details on the transition and how it will affect the customer will be made available in the near future.
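The “area-weighted” step NCDC describes above, going from divisional values to a state value, can be sketched as follows. This is only an illustration of area weighting in general: the division temperatures and areas are hypothetical numbers I made up, and the function is not NCDC’s actual code.

```python
def area_weighted_mean(values, areas):
    """Average `values`, weighting each one by the matching entry in `areas`."""
    if len(values) != len(areas) or not values:
        raise ValueError("values and areas must be non-empty and the same length")
    total_area = sum(areas)
    return sum(v * a for v, a in zip(values, areas)) / total_area


# Hypothetical climate divisions for one state (NOT real data):
# monthly mean temperature in °F and division area in square miles.
division_temps = [70.0, 74.0, 78.0]
division_areas = [1000.0, 1000.0, 2000.0]

state_temp = area_weighted_mean(division_temps, division_areas)
simple_mean = sum(division_temps) / len(division_temps)
print(f"Area-weighted state value: {state_temp:.2f}°F")
print(f"Unweighted mean for comparison: {simple_mean:.2f}°F")
```

Because the largest division in this made-up example is also the warmest, the weighted value comes out above the plain average. The same function could be applied one level up, treating statewide values and state areas as the inputs for a region.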
The State of the Climate reports typically are issued in the first week of the next month. They don’t actually bother to put a release date on those reports, so I can’t give a table of specific dates. The press usually follows suit immediately afterwards, and we see claims like “hottest month ever” or “3rd warmest spring ever” being bandied about worldwide in news reports and blogs by the next day.
So basically, NCDC is making public claims about the average temperature of the United States, its rank compared to other months and years, and its severity, based on incomplete data. As I have demonstrated, that data then tends to change about two months later when all of the B91 forms come in and are transcribed and the data set becomes complete.
It typically cools the country when all the data is used.
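To show the mechanism, here is a toy sketch of how an average can shift once late reports are added. Every station name and value below is hypothetical, chosen only to illustrate the arithmetic, and a plain unweighted mean stands in for NCDC’s real processing, which is more involved. It demonstrates the mechanism and is not evidence of the size or direction of the actual revisions.

```python
from statistics import mean

# Hypothetical stations (NOT real data): name, monthly mean in °F, and whether
# the report arrives in time for the first-week release.
stations = [
    ("airport A", 78.0, True),
    ("airport B", 77.0, True),
    ("paper-form station C", 75.0, False),
    ("paper-form station D", 76.0, False),
]

# Preliminary value: only the stations that reported early.
preliminary = mean(temp for _, temp, early in stations if early)

# Final value: every station, once the late forms are transcribed.
final = mean(temp for _, temp, _ in stations)

revision = final - preliminary
print(f"Preliminary: {preliminary:.2f}°F, final: {final:.2f}°F, revision: {revision:+.2f}°F")
```

With real data, the inputs would be the actual station records available on the release date versus the complete set a couple of months later.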
But, does NCDC go back and correct those early claims based on the new data? No.
While I’d like to think “never attribute to malice what can be explained by simple incompetence”, surely they know about this, and the fact that they never go back and correct SOTC claims (which drive all the news stories) suggests some possible malfeasance. If this happens for CONUS, it would seem likely that it happens for the global Tavg also, though I don’t have supporting data at the moment.
Finally, here is where it gets really, really, wonky. Remember earlier when I showed that by the claims in the July 2012 SOTC report the new data showed July 2012 was no longer hotter than July 1936? Here’s the SOTC again.
Note the July 1936 words are a link, and they go to the NCDC climate database plotter output again. Note the data for July 1936 I’ve highlighted in yellow:
July 1936 from the NCDC database says 76.43°F. Even that doesn’t match the July 2012 SOTC claim of 77.4°F for July 1936. That can’t be explained by some B91 forms late in the mail.
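With four numbers now in play, it helps to lay them side by side. The sketch below uses only the four July values quoted in this article; the dictionary layout and labels are my own, and Decimal is used so the subtraction is exact.

```python
from decimal import Decimal

# The four July values quoted in this article (°F), keyed by source, then year.
july = {
    "SOTC report": {1936: Decimal("77.4"), 2012: Decimal("77.6")},
    "database plotter": {1936: Decimal("76.43"), 2012: Decimal("76.93")},
}

# Gap between the two sources for the same month.
source_gap = {
    year: july["SOTC report"][year] - july["database plotter"][year]
    for year in (1936, 2012)
}

# 2012 minus 1936 within each source.
within_source = {
    source: values[2012] - values[1936] for source, values in july.items()
}

# The mixed comparison made earlier: database 2012 against the SOTC figure for 1936.
mixed = july["database plotter"][2012] - july["SOTC report"][1936]

for year, gap in source_gap.items():
    print(f"July {year}: SOTC minus database = {gap}°F")
for source, delta in within_source.items():
    print(f"{source}: 2012 minus 1936 = {delta}°F")
print(f"database 2012 minus SOTC 1936 = {mixed}°F")
```

Swapping in the values from any later revision of the database only requires editing the dictionary.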
So what IS the correct temperature for July 2012? What is the correct temperature for July 1936? I have absolutely no idea, and it appears that the federal agency charged with knowing the temperature of the USA to a high degree of certainty doesn’t quite know either. Either the SOTC is wrong, or the NCDC database available to the public is wrong. For all I know they both could be wrong. On their web page, NCDC bills themselves as:
How can they be a “trusted authority” when it appears none of their numbers match and they change depending on what part of NCDC you look at?
It is mind-boggling that this national average temperature and ranking is presented to the public and to the press as factual information and claims each month in the SOTC, when in fact the numbers change later. I’m betting we’ll see those identical numbers for October and November 2012 in Table 1 change too, as more B91 forms come in from climate observers around the country.
The law on such reporting:
Wikipedia has an entry on the Data Quality Act, to which NCDC is beholden. Here are parts of it:
The Data Quality Act (DQA) passed through the United States Congress in Section 515 of the Consolidated Appropriations Act, 2001 (Pub.L. 106-554). Because the Act was a two-sentence rider in a spending bill, it had no name given in the actual legislation. The Government Accountability Office calls it the Information Quality Act, while others call it the Data Quality Act.
The DQA directs the Office of Management and Budget (OMB) to issue government-wide guidelines that “provide policy and procedural guidance to Federal agencies for ensuring and maximizing the quality, objectivity, utility, and integrity of information (including statistical information) disseminated by Federal agencies”.
Sec. 515 (a) In General — The Director of the Office of Management and Budget shall, by not later than September 30, 2001, and with public and Federal agency involvement, issue guidelines under sections 3504(d)(1) and 3516 of title 44, United States Code, that provide policy and procedural guidance to Federal agencies for ensuring and maximizing the quality, objectivity, utility, and integrity of information (including statistical information) disseminated by Federal agencies in fulfillment of the purposes and provisions of chapter 35 of title 44, United States Code, commonly referred to as the Paperwork Reduction Act.
Here’s the final text of the DQA as reported in the Federal Register:
Based on my reading of it, with their SOTC reports that are based on preliminary data, and not corrected later, NCDC has violated these four key points:
In the guidelines, OMB defines ‘‘quality’’ as the encompassing term, of which ‘‘utility,’’ ‘‘objectivity,’’ and ‘‘integrity’’ are the constituents. ‘‘Utility’’ refers to the usefulness of the information to the intended users. ‘‘Objectivity’’ focuses on whether the disseminated information is being presented in an accurate, clear, complete, and unbiased manner, and as a matter of substance, is accurate, reliable, and unbiased. ‘‘Integrity’’ refers to security—the protection of information from unauthorized access or revision, to ensure that the information is not compromised through corruption or falsification. OMB modeled the definitions of ‘‘information,’’ ‘‘government information,’’ ‘‘information dissemination product,’’ and ‘‘dissemination’’ on the longstanding definitions of those terms in OMB Circular A–130, but tailored them to fit into the context of these guidelines.
I’ll leave it to Congress and other Federal watchdogs to determine if a DQA violation has in fact occurred on a systemic basis. For now, I’d like to see NCDC explain why two publicly available avenues for “official” temperature data don’t match. I’d also like to see them justify their claims in the next SOTC due out any day.
I’ll have much more in the next couple of days on this issue, be sure to watch for the second part.
UPDATE: 1/7/2013 10 AM PST
Jim Sefton writes on 2013/01/07 at 9:51 am
I just went to the Contiguous U.S. Temperature July 1895-2012 link you put up and now none of the temperatures are the same as either of your screen shots. Almost every year is different.
2012 is now 76.92 & 1936 is now 76.41 ?
That’s verified, see screencap below made at the same time as the update:
This raises the question: how can the temperatures of the past be changing?
Here’s comma-delimited data for all months of July in the record:
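For anyone who wants to work with that data, here is a minimal sketch for reading it. I am assuming a simple layout of one “year,temperature” pair per line, which may not match the actual file, and the sample contains only the two revised values quoted in the update above rather than the full record. The function name is my own.

```python
import csv
import io

# Assumed layout: one "year,temperature" pair per line. Only the two rows
# quoted in the update above are included; the full file is not reproduced.
sample = "1936,76.41\n2012,76.92\n"


def read_july_series(text):
    """Parse comma-delimited year,temperature lines into a {year: °F} dict."""
    series = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        series[int(row[0])] = float(row[1])
    return series


series = read_july_series(sample)
warmest_year = max(series, key=series.get)
print(f"Warmest July in this sample: {warmest_year} at {series[warmest_year]}°F")
```

Saving a copy of the file each time it is downloaded and running it through the same function would make any later changes to past values easy to spot.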
For now, in case the SOTC reports should suddenly disappear or get changed without notice, I have all of those NCDC reports that form the basis of Table 1 archived below as PDF files.