Guest Post by Willis Eschenbach
As many folks know, I’m a fan of good, clear, detailed data. I’ve been eyeing the buoy data from the National Data Buoy Center (NDBC) for a while. This is the data collected by a large number of buoys moored offshore all around the coast of the US. I like it because it is unaffected by location changes, time of observation, or the Urban Heat Island effect, so there’s no need to “adjust” it. However, I haven’t had the patience to download and process it, because my preliminary investigation a while back revealed that there are a number of problems with the dataset. Here’s a photo of the nearest buoy to where I live. I’ve often seen it when I’ve been commercial fishing off the coast here from Bodega Bay or San Francisco … but that’s another story.
And here’s the location of the buoy; it’s the large yellow diamond at the upper left:
The problems with the Bodega Bay buoy dataset, in no particular order, are:
• One file for each year.
• Duplicated lines in a number of the years.
• The number of variables changes in the middle of the dataset, in the middle of a year, adding a column to the record.
• Time units change from hours to hours and minutes in the middle of the dataset, adding another column to the record.
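For anyone who wants to try the cleanup themselves, here is a minimal sketch in Python (it is not the R code linked at the end of the post) of one way to cope with the four problems listed above. It rests on several assumptions that should be checked against the NDBC description file: that each yearly file carries a header line naming its columns, that the relevant columns are called MM, DD, hh, mm and ATMP, that the year is always the first field, that two-digit years mean 19xx, and that 999.0 marks a missing air temperature. The function name parse_buoy_text is made up for illustration.

```python
from datetime import datetime

MISSING = 999.0  # assumed sentinel for a missing air temperature


def parse_buoy_text(text, series=None):
    """Add the air temperatures found in one yearly file to `series`.

    `series` maps datetime -> temperature. Keying on the timestamp means a
    duplicated line cannot create a second record. Returns the series and
    the number of data rows skipped because they did not match the header.
    """
    series = {} if series is None else series
    columns = None  # column name -> position, taken from the latest header
    width = 0       # number of fields the latest header promises
    skipped = 0
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        first = fields[0].lstrip("#")
        if not first.isdigit():
            # Not a data row. Treat it as a header only if it names ATMP;
            # anything else (for example a units line) is ignored.
            names = [first] + fields[1:]
            if "ATMP" in names:
                columns = {name: i for i, name in enumerate(names)}
                width = len(names)
            continue
        if columns is None or len(fields) != width:
            # Row layout disagrees with the header: count it, don't guess.
            skipped += 1
            continue
        year = int(fields[0])
        if year < 100:
            year += 1900  # assumption: two-digit years are 19xx
        # Files without a minutes column are treated as on-the-hour readings.
        minute = int(fields[columns["mm"]]) if "mm" in columns else 0
        when = datetime(year, int(fields[columns["MM"]]),
                        int(fields[columns["DD"]]),
                        int(fields[columns["hh"]]), minute)
        temperature = float(fields[columns["ATMP"]])
        if temperature != MISSING:
            series.setdefault(when, temperature)  # first copy of a duplicate wins
    return series, skipped
```

Calling it once per yearly file with the same `series` dictionary stitches the years together. Looking columns up by name rather than by position is what lets an added variable or an added minutes column pass through harmlessly. If the layout changes mid-file without a fresh header, the rows that no longer fit are counted in `skipped` rather than silently misread, so a large count is a signal to look at that file by hand.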
But as the I Ching says, “Perseverance furthers.” I’ve finally been able to beat my way through all of the garbage and I’ve gotten a clean time series of the air temperatures at the Bodega Bay Buoy … here’s that record:
Must be some of that global warming I’ve been hearing about …
Note that there are several gaps in the data:
Year    1986  1987  1988  1992  1997  1998  2002  2003  2011
Months     7     1     2     2     8     2     1     1     4
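A tally like this can be produced mechanically from the observation timestamps. The sketch below, in Python, is one way to do it; it assumes a "gap" means a calendar month with no observations at all between the first and last record, which may not be exactly how the numbers above were counted. The function name missing_months_by_year is made up for illustration.

```python
from datetime import datetime


def missing_months_by_year(timestamps):
    """Count calendar months with no observations, per year.

    Only months between the first and the last observed month are
    considered, so time before the record starts is not counted as a gap.
    """
    present = {(t.year, t.month) for t in timestamps}
    if not present:
        return {}
    first, last = min(present), max(present)
    gaps = {}
    year, month = first
    while (year, month) <= last:
        if (year, month) not in present:
            gaps[year] = gaps.get(year, 0) + 1
        # Step to the next calendar month, rolling over at December.
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return gaps


# Invented timestamps, purely to show the shape of the result.
example = [datetime(2001, 11, 5), datetime(2002, 2, 1), datetime(2002, 4, 9)]
print(missing_months_by_year(example))
```

A month with a single surviving hourly reading counts as present here, so a stricter version might require some minimum number of observations per month.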
Now, after writing all of that, and putting it up in draft form and almost ready to hit the “Publish” button … I got to wondering if the Berkeley Earth folks used the buoy data. So I took a look, and to my surprise, they have data from no less than 145 of these buoys, including the Bodega Bay buoy … here is the Berkeley Earth Surface Temperature dataset for the Bodega Bay buoy:
Now, there are some oddities about this record … first, although it is superficially quite similar to my analysis, a closer look reveals a variety of differences. Could be my error, wouldn’t be the first time … or perhaps they didn’t do as diligent a job as I did of removing duplicates and such. I don’t know the answer.
Next, they list a number of monthly results as being “Quality Control Fail” … I fear I don’t understand that, for a couple of reasons. First, the underlying dataset is not monthly data, or even daily data. It is hourly data … so while the odd hourly record might be wrong, how could a whole month fail quality control? And second, the data is already checked and quality controlled by the NDBC. So what is the basis for the Berkeley Earth claim of multiple failures of quality control on a monthly basis?
Moving on, below is what they say is the appropriate way to adjust the data … let me start by saying, whaa?!? Why on earth would they think that this data needs adjusting? I can find no indication that there has been any change in how the observations are taken, or the like. I see no conceivable reason to adjust it … but nooo, here’s their brilliant plan:
As you can see, once they “adjust” the station for their so-called “Estimated Station Mean Bias”, instead of a gradual cooling, there’s no trend in the data at all … shocking, I know.
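For readers who want to put a number on "gradual cooling" versus "no trend", the usual yardstick is an ordinary least-squares slope of temperature against time. Here is a minimal sketch in Python; it is a plain textbook fit, not necessarily the method used for either plot above, and the figures in the example are invented rather than taken from the buoy.

```python
def linear_trend(times, values):
    """Ordinary least-squares slope of `values` against `times`.

    With times in years and values in degrees C, the result is in
    degrees C per year.
    """
    n = len(times)
    if n != len(values) or n < 2:
        raise ValueError("need at least two matching time/value pairs")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    # Slope = covariance(t, v) / variance(t); the 1/n factors cancel.
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    if sxx == 0:
        raise ValueError("all times are identical, slope is undefined")
    return sxy / sxx


# Invented numbers: a record that drops 0.1 degrees per year.
years = [1981.0, 1982.0, 1983.0, 1984.0]
temps = [12.0, 11.9, 11.8, 11.7]
print(round(linear_trend(years, temps), 3))
```

Running the same function over the raw and the adjusted series gives two slopes that can be compared directly.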
One other oddity. There is a gap in their records in 1986-87, as well as in 2011 (see above), but they didn’t indicate a “record gap” (green triangle) as they did elsewhere … why not?
To me, all of this indicates a real problem with the Berkeley Earth computer program used to “adjust” the buoy data … which I assume is the same program used to “adjust” the land stations. Perhaps one of the Berkeley Earth folks would be kind enough to explain all of this …
AS ALWAYS: If you disagree with someone, please QUOTE THE EXACT WORDS YOU DISAGREE WITH. That way, we can all understand your objection.
R DATA AND CODE: In a zipped file here. I’ve provided the data as an R “save” file. The code contains the lines to download the individual data files, but they’re remarked out since I’ve provided the cleaned-up data in R format.
BODEGA BAY BUOY NDBC DATA: The main page for the Bodega Bay buoy, station number 46013, is here. See the “Historical Data” link at the bottom for the data.
NDBC DATA DESCRIPTION: The NDBC description file is here.