Guest Post by Willis Eschenbach
As many folks know, I’m a fan of good clear detailed data. I’ve been eyeing the buoy data from the National Data Buoy Center (NDBC) for a while. This is the data collected by a large number of buoys moored offshore all around the coast of the US. I like it because it is unaffected by location changes, time of observation, or Urban Heat Island effect, so there’s no need to “adjust” it. However, I haven’t had the patience to download and process it, because my preliminary investigation a while back revealed that there are a number of problems with the dataset. Here’s a photo of the nearest buoy to where I live. I’ve often seen it when I’ve been commercial fishing off the coast here from Bodega Bay or San Francisco … but that’s another story.
And here’s the location of the buoy, it’s the large yellow diamond at the upper left:
The problems with the Bodega Bay buoy dataset, in no particular order, are:
• One file for each year.
• Duplicated lines in a number of the years.
• The number of variables changes in the middle of the dataset, in the middle of a year, adding a column to the record.
• Time units change from hours to hours and minutes in the middle of the dataset, adding another column to the record.
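For anyone who wants to try this at home, here is a rough sketch in R of one way to tackle those four problems. To be clear, this is illustrative only, not the code used for this post (that is in the zip file linked at the end); the year range, the URL pattern, and the column handling are placeholders, so check the NDBC station page and the data-description file for the real layout.

    years <- 1981:2013   # placeholder range; use the years actually listed for station 46013

    read_year <- function(yr) {
      # Placeholder URL pattern -- the real links are under "Historical Data" on the station page.
      url <- sprintf("http://example.invalid/ndbc/46013h%d.txt", yr)
      raw <- unique(readLines(url))            # drop duplicated lines within the year
      read.table(text = raw, header = FALSE, fill = TRUE,
                 comment.char = "#", stringsAsFactors = FALSE)
    }

    frames <- lapply(years, read_year)

    # Years before the format change are short a column (no minutes field), so
    # pad every year out to a common width before stacking. Where the extra
    # column really belongs in the NDBC files differs from this illustration.
    width  <- max(sapply(frames, ncol))
    frames <- lapply(frames, function(d) {
      while (ncol(d) < width) d[[ncol(d) + 1]] <- NA
      names(d) <- paste0("V", seq_len(width))
      d
    })
    buoy <- do.call(rbind, frames)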
But as the I Ching says, “Perseverance furthers.” I’ve finally been able to beat my way through all of the garbage and I’ve gotten a clean time series of the air temperatures at the Bodega Bay Buoy … here’s that record:
Must be some of that global warming I’ve been hearing about …
Note that there are several gaps in the data:

    Year:    1986  1987  1988  1992  1997  1998  2002  2003  2011
    Months:     7     1     2     2     8     2     1     1     4
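For what it's worth, here's roughly how such a table of missing months can be pulled out of the cleaned record in R. It assumes the data frame built in the sketch above, with the year and month columns renamed to yr and mo; those names are placeholders for whatever the real columns are called.

    present <- unique(buoy[, c("yr", "mo")])                      # year/month pairs that have data
    grid    <- expand.grid(yr = min(present$yr):max(present$yr), mo = 1:12)
    missing <- grid[!do.call(paste, grid) %in% do.call(paste, present), ]
    table(missing$yr)                                             # missing months per year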
Now, after writing all of that, and putting it up in draft form and almost ready to hit the “Publish” button … I got to wondering if the Berkeley Earth folks used the buoy data. So I took a look, and to my surprise, they have data from no less than 145 of these buoys, including the Bodega Bay buoy … here is the Berkeley Earth Surface Temperature dataset for the Bodega Bay buoy:
Now, there are some oddities about this record … first, although it is superficially quite similar to my analysis, a closer look reveals a variety of differences. Could be my error, wouldn’t be the first time … or perhaps they didn’t do as diligent a job as I did of removing duplicates and such. I don’t know the answer.
Next, they list a number of monthly results as being “Quality Control Fail” … I fear I don’t understand that, for a couple of reasons. First, the underlying dataset is not monthly data, or even daily data. It is hourly data … so while the odd hourly record might be wrong, how could a whole month fail quality control? And second, the data is already checked and quality controlled by the NDBC. So what is the basis for the Berkeley Earth claim of multiple failures of quality control on a monthly basis?
Moving on, below is what they say is the appropriate way to adjust the data … let me start by saying, whaa?!? Why on earth would they think that this data needs adjusting? I can find no indication that there has been any change in how the observations are taken, or the like. I see no conceivable reason to adjust it … but nooo, here’s their brilliant plan:
As you can see, once they “adjust” the station for their so-called “Estimated Station Mean Bias”, instead of a gradual cooling, there’s no trend in the data at all … shocking, I know.
One other oddity. There is a gap in their records in 1986-7, as well as in 2011 (see above), but they didn’t indicate a “record gap” (green triangle) as they did elsewhere … why not?
To me, all of this indicates a real problem with the Berkeley Earth computer program used to “adjust” the buoy data … which I assume is the same program used to “adjust” the land stations. Perhaps one of the Berkeley Earth folks would be kind enough to explain all of this …
w.
AS ALWAYS: If you disagree with someone, please QUOTE THE EXACT WORDS YOU DISAGREE WITH. That way, we can all understand your objection.
R DATA AND CODE: In a zipped file here. I’ve provided the data as an R “save” file. The code contains the lines to download the individual data files, but they’re commented out since I’ve provided the cleaned-up data in R format.
BODEGA BAY BUOY NDBC DATA: The main page for the Bodega Bay buoy, station number 46013, is here. See the “Historical Data” link at the bottom for the data.
NDBC DATA DESCRIPTION: The NDBC description file is here.
“Steven Mosher November 29, 2014 at 7:06 am”
If that is all he can say about this post, I have gone even further OFF Berkeley Earth.
This one “made my day”:
“Finally one last time.
The data isn’t adjusted.
It’s a regression. The model creates fitted values.
Fitted values differ from the actual data. Durrr
Lots of people like to call these fitted values adjusted data.
Think through it.”
Any temperature record that shows cooling must be adjusted. The adjustment process is “spring-loaded” to adjust any sudden downward change. That “sudden” change might be the natural result of a few days of missing data when the season is naturally cooling; it might be due to a change in wind direction or a change in ocean current; it could be anything. But the process is built to look for any sudden downward change and “adjust” it upward. The idea that there could be a NATURAL sudden downward change is completely alien to them. Only upward changes are “natural”, apparently.
The Bezerkeley ‘scalpel’ paper:
http://scitechnol.com/2327-4581/2327-4581-1-103.php
Extract from the summary:
Iterative weighting is used to reduce the influence of statistical outliers. Statistical uncertainties are calculated by subdividing the data and comparing the results from statistically independent subsamples using the Jackknife method. Spatial uncertainties from periods with sparse geographical sampling are estimated by calculating the error made when we analyze post-1960 data using similarly sparse spatial sampling.
Trying to pretend the data isn’t ‘adjusted’ by calling the adjustments ‘fitted data’ borders on the bizarre.
The full paper is available (for free).
If you use iterative weighting to reduce the influence of outliers, the resulting “fitted” data converges to the mean rather than eliminating a trend. The only way to eliminate a trend is to adjust the mean over a time period that is long enough to be insensitive to outliers but much shorter than the time series (e.g. several months in a decadal time series sampled daily).
What their algorithm does, I think, is turn a station with a falling trend (say) into 3 stations (say) with no trend. Any ‘statistical’ outliers are removed whether or not there is any good reason to do so.
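For readers wondering what “iterative weighting” does in practice, here is a toy R illustration, emphatically not Berkeley Earth’s code: points far from the current fit get progressively smaller weights, so the fit converges toward the bulk of the data. Note that, as argued above, downweighting the two outliers does not by itself remove the underlying trend.

    set.seed(1)
    x <- 1:120                                   # e.g. ten years of monthly values
    y <- 14 - 0.01 * x + rnorm(120, sd = 0.5)    # a gently cooling series
    y[c(30, 31)] <- y[c(30, 31)] - 4             # two cold "outliers"

    w <- rep(1, length(y))
    for (i in 1:10) {
      fit <- lm(y ~ x, weights = w)
      r   <- residuals(fit)
      s   <- mad(r)                              # robust estimate of the spread
      w   <- 1 / (1 + (r / s)^2)                 # shrink the weight of large residuals
    }
    coef(fit)   # slope reflects the bulk of the data; the two outliers are largely ignored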
Did you ask BE for comment before posting? There is that rhetorical (?) question at the bottom, so I am wondering whether someone at BE is expected to read here and act upon your invitation, or whether they didn’t/wouldn’t grace you with a reply anyway?
Willis,
Just a simple “Thank you.” I always learn something from your posts.
And I LOVE how the mighty fall with just some simple investigative questions!
I think Berkeley must have chosen only the two hours of normal temp collection times as if it were a land thermometer and used that for data.
TimTheToolMan November 29, 2014 at 5:30 am
Tim, the indicated months are not knocked out in the quality control process because of missing data. There are other months with more missing data that are included. Their own documentation says it is because they are “Regional Climatology Outliers”, that is to say, they don’t agree with the nearby land stations.
Finally, since we are not looking for daily mins or maxes but for monthly averages, you can actually knock out a good chunk of a month’s data without changing much. Remember that a month contains 30 * 24 = 720 observations; if you knock out even a quarter of them at random, you won’t get much change in the average.
I just tested this against the actual data. I looked at June 2005. The monthly mean is 11.92°C. Knocking out a quarter of the data for 1,000 trials gives averages with a standard deviation of 0.02°C and a range from 11.86°C to 11.99°C … meaning that none of the 1,000 trials were in error by more than seven hundredths of a degree.
So I’m sorry, but you are very wrong when you say that “you wouldn’t need to lose much data to put a month in question”. In fact, knocking out a full quarter of the monthly data leads to a MAXIMUM error in 1,000 trials of seven hundredths of a degree …
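If anyone wants to reproduce the shape of that little test, it is only a few lines of R. The hourly values below are a synthetic stand-in, not the real June 2005 record (which is in the zip file linked above).

    set.seed(42)
    hourly <- rnorm(720, mean = 11.92, sd = 1)              # stand-in for one month of hourly temps
    trials <- replicate(1000, mean(sample(hourly, 540)))    # keep a random 3/4 of the hours
    mean(hourly)     # the full-month average
    sd(trials)       # spread of the 1,000 subsampled averages
    range(trials)    # the worst cases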
Regards,
w.
A little further to missing data..
I’ve got myself three of the little USB temperature-and-humidity data loggers. One is dangling off my washing line in my garden, another is near the middle of a 125-acre patch of permanent grassland cow pasture (the nearest building is a holiday home, empty 11 months a year and 500 metres away). The third is a temp-only logger and is in the company of the stopcock on my farm’s water supply, ~30” underground.
I did have them taking readings every 5 minutes so as to make a good comparison with my local (3 mile away) Wunderground station which broadcasts a new reading every 5 minutes. After a couple of years, Excel on my 7yo lappy was ‘feeling the strain’ of 288 data points for every day.
Out of curiosity, as you do, I compared the daily average of the 288 points to the average of the maximum and minimum temperatures I’d recorded (just the two data values) whenever they occurred between midnight and midnight on whatever day.
To one decimal place (the loggers only record to ±0.5°C anyway), there was no difference if I used 288 points or just two data points to get the daily average. The answer was the same. It was really really surprising – 286 readings were redundant. And believe it or not, the same applies to the data coming from the Wunderground station.
It kinda makes the whole business of TOBS adjustment redundant as well dunnit and reveals what a fraud it is.
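For anyone wanting to repeat Pete’s comparison on their own logger file, the heart of it is a couple of lines of R. The synthetic day below is just a placeholder for a real column of 5-minute readings, and how closely the two numbers agree will depend on the shape of the actual daily cycle.

    set.seed(7)
    t5 <- 10 + 4 * sin(2 * pi * (1:288) / 288) + rnorm(288, sd = 0.3)  # one fake day of 288 readings
    mean(t5)                      # average of all 288 readings
    (min(t5) + max(t5)) / 2       # average of just the daily max and min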
Pete writes “there was no difference if I used 288 points or just two data points to get the daily average.”
Now that IS interesting. It does make you wonder just how justified the TOBs adjustment is. I mean, the TOBs adjustment makes sense conceptually, but I wonder whether its justification was tested against enough varied data. It’d be very agenda-driven “Climate Science” if they found the effect a few times and extrapolated it to every station regardless.
Of course the reverse could be true and your experience is the exception rather than the rule…
Actually having thought about that a little more, I’m not convinced simply taking the average is the right answer. You probably need to take the average starting at say 10am for 24 hours worth of readings and compare that to taking the average starting at 10pm for 24 hours of readings to get the “TOB” bit in there.
Oh and, for example, the min of any 24 hours can’t be less than the min at the start point.
Of course I meant
The min of any 24 hours can’t be less than the temperature at the start point.
Bah, I’ll get the statement right eventually :-S
The min of any 24 hours can only be equal to or less than the temperature at the start point. It can’t be greater (and consequently a single min may be counted twice for consecutive “days”). A similar argument applies to the max.
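The window-boundary point can be shown with a few lines of R: compute daily min/max from the same hourly series, once with “days” that start at 10:00 and once with “days” that start at 22:00, and the monthly averages of those mins and maxes come out differently. Synthetic data again, purely for illustration.

    set.seed(3)
    temp <- 12 + 5 * sin(2 * pi * (1:720) / 24) + rnorm(720, sd = 1)  # 30 days of hourly readings

    daily_minmax <- function(temp, start_hour) {
      day <- (seq_along(temp) - start_hour) %/% 24      # assign each hour to an observational "day"
      ok  <- day >= 0 & day < 28                        # keep only complete 24-hour windows
      data.frame(tmin = tapply(temp[ok], day[ok], min),
                 tmax = tapply(temp[ok], day[ok], max))
    }

    colMeans(daily_minmax(temp, 10))   # "days" cut at 10:00
    colMeans(daily_minmax(temp, 22))   # "days" cut at 22:00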
Common sense time.
Someone with a bit of gravitas in the field of meteorology should compile two lists: the first would show the common ways that temperature can suddenly drop, for example a cold front passage, thunderstorm, Santa Ana, mistral, katabatic wind, etc. The second could show the common ways the temperature might suddenly spike upward, such as a sirocco. Follow up with a discussion characterizing the relative quantities of each “disturbance” and the likelihood that “fitting” would alter valid observations due to these conditions.
I would expect that such a conversation would reveal that many more below-expected observations would need to be “fitted” than higher-than-expected observations.
Well said. Thinking back over my almost 6 decade life I can think of very few times that it has suddenly gotten “hotter” but many times when it has suddenly gotten “cooler.” The hotter episodes are limited to small short wind gusts in desert gullies or washes. The cooler episodes happen a lot here in Florida when T-storms wind through. We’d get the weird, dry cold chunks of air during tornado season in the Midwest. Cold blasts coming down off a mountain etc. Sudden drops seem to be more the norm than sudden rises in temps from a purely subjective pov.
Would a land station or a buoy station be appropriate to smear (another word?) across large stretches of ocean without stations?
After reading this post and all these comments, I thought Willis did an exceptional job.
As for Mosher in defending the BEST system? Just a “durr”. That says it all. Mosher being Mosher without enlightening us. It is not adjustment. It is some other gyration that by any other name is still a rose.
Willis writes “So I’m sorry, but you are very wrong when you say that “you wouldn’t need to lose much data to put a month in question”. In fact, knocking out a full quarter of the monthly data leads to a MAXIMUM error in 1000 trials of seven hundredths of a degree …”
There is a difference between “knowing” and having an acceptable estimate. You’re talking about the latter, but in the same breath wonder how Berkeley Earth could “lose” a whole month. Perhaps you have a better speculative answer?
TimTheToolMan November 29, 2014 at 12:33 pm
Thanks, Tim. Not sure what you mean by “knowing” in quotes, or what your point is.
Actually, the Berkeley folks lost an entire year which contains all but 13 of the 8,760 possible hourly measurements … so obviously, the loss has nothing to do with missing data.
w.
Willis writes “Not sure what you mean by “knowing” in quotes, or what your point is.”
And the point is that if they choose to drop data rather than estimate it, then a relatively small amount of data might make a month unusable.
Above, you said “Actually, the Berkeley folks lost an entire year”
But earlier you wondered “so while the odd hourly record might be wrong, how could a whole month fail quality control?”
I speculated on your monthly statement, not on how a whole year might be missing.
Pat Frank November 29, 2014 at 11:52 am
Agreed in general, but each case is different … and in any case, this does not affect the calculation of trends.
Why? What I show above is not an anomaly. It is the average of actual temperatures.
Mmmm … well, that depends on a couple things. First, are the errors symmetrical about the actual temperature? And second, what is the precision (repeatability) of the instruments?
IF the errors are symmetrical around the actual temperature, then they DO average away. And IF the instruments are precise, then the trends will be valid even though the absolute numbers may be off.
Now, according to the data you reference, the errors are in fact symmetrical, as they are given as ± 1°C (as opposed to say +0.5/-1.5°C).
Now, there are about 720 hourly measurements per month. If each one of them has a uniform error in the range of ±1.0°C, we can take for example June of 2005. It has a mean of 11.92°C. If we add a uniform random error in the range of ±1°C to the data, the standard deviation of the resulting means of 1,000 trials is 0.02°C, two hundredths of a degree … and the range is from 11.86°C to 12.00°C, which is about ± seven hundredths of a degree.
So I don’t think that your claim that the error in the monthly average data is ±1.4°C is correct.
Next, let’s assume that the error is NOT symmetrical, but instead is in the range from 0°C to 1°C. In that case, of course, the standard deviation of the mean is half the size, 0.01°C. However, the range is from 12.40°C to 12.48°C … so we have precision which is twice as good, but with an offset of half a degree C.
So even in that case, the error is not 1.4°C, but only half a degree … and that’s a worst-case scenario. In fact, errors are generally not uniformly distributed, but instead have something related to a normal distribution.
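Here is the shape of that calculation in R, with a synthetic stand-in for the June 2005 hourly record (the real data is in the zip file linked at the end of the post).

    set.seed(123)
    hourly <- rnorm(720, mean = 11.92, sd = 1)                    # stand-in for the month's hourly temps
    sym  <- replicate(1000, mean(hourly + runif(720, -1, 1)))     # symmetrical error, anywhere in +/-1 C
    asym <- replicate(1000, mean(hourly + runif(720,  0, 1)))     # one-sided error, 0 to +1 C
    c(sd = sd(sym),  range(sym))    # small spread, essentially no bias in the mean
    c(sd = sd(asym), range(asym))   # half the spread, but the whole mean shifts up ~0.5 C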
Best regards,
w.
Willis, “Now, according to the data you reference, the errors are in fact symmetrical, as they are given as ± 1°C (as opposed to say +0.5/-1.5°C).”
That’s just the way accuracy is written, Willis. It’s just the empirical 1-sigma standard deviation of the difference between a calibration known and the measurement. It doesn’t imply that the errors in accuracy are in fact symmetrical about their mean.
Any distribution of error, as unsymmetrical as one likes, will produce an empirical (+/-)x 1-sigma accuracy metric.
An accuracy of (+/-)1 C transmits that the true temperature may be anywhere within that range. But one does not know where, and the distribution of error is not necessarily random (symmetrical and normal).
Here’s an NDBC page with links at the bottom to jpeg pictures of the various buoys. On the coastal buoys in particular, one can make out the gill shield of a standard air temperature sensor, similar to those used in land stations, mounted up on the instrument rack. It’s especially evident in this retrieval picture.
Those sensors are naturally ventilated, meaning they need wind of at least 5 m/sec to remove the internal heating produced by solar irradiance.
In land station tests of naturally ventilated shields, under low wind conditions, solar heating can cause 1 C errors in temperature readings, with average long-term biases of ~0.2 C and 1-sigma SDs of ~(+/-)0.3 C. None of the distributions of day-time error were symmetrical about their means. Night-time errors tended to be more symmetrical and smaller. Average error was strongly dominated by day-time errors.
So, there isn’t any reason to discount the (+/-)1 C buoy accuracy metric as the standard deviation of a random error.
Thanks for discussing buoy temperatures, Willis. A careful and dispassionate appraisal of accuracy in marine temperatures is long overdue.
Anything that self-aggrandisingly gives itself the acronym ‘BEST’ had honestly better make sure of itself prior to sticking its neck above the parapet. Because if it’s not the ‘BEST’, then it’s going to get found out sooner or later and isn’t that a fact.
Mr Mosher’s arrival upon and very rapid about-turn and departure from the battlefield tell their own story. He did three things in his post:
He called Mr Eschenbach “Simple Willis” for his own amusement hoping that nobody would pick him up on it.
He said “The station is treated as if it were a land station. Which means it’s going to be very wrong”.
Which kind of fouls up the BEST idea.
And he said “Finally one last time. The data isn’t adjusted. It’s a regression. The model creates fitted values. Fitted values differ from the actual data. Durrr”
Regressions and fitted values are kinds of adjustments to data. Are they not?
Overall, to mix my sporting metaphors, I’d call this game, set and match to Willis Eschenbach by a knockout.
Willis writes “In fact, errors are generally not uniformly distributed, but instead have something related to a normal distribution.”
“Something related to a normal distribution” seems right, but not necessarily normally distributed around the mean. The sensor itself might behave like that, but it’s the whole measurement process that you need to consider, and that includes errors introduced by varying voltages provided by the battery, its condition, and how/when it charges.
The Team’s Torquemadas torture ocean data until they confess because without such manipulation, surface “data” sets would show cooling, since that’s what’s actually happening.
Phil Jones admits that after adjusting all the land records upward (even when adjusting for UHIs!), the ocean data then need to be upped even more so that they won’t be out of line with the cooked land books.
All the surface series, HadCRUT4, GISTEMP & BEST, are literally worse than useless, except as boogiemen to raise money for their perpetrators.
Catherine, the oceans are 71% of the planet’s surface. Until Argo, they were grossly undersampled. Some maintain they still are. Other than seafarers, we don’t experience these surface temperatures.
No matter how ‘adjusted’/fitted/homogenized, land temperatures can say nothing useful about global averages. There is a reason Earth is called the Blue Planet. Regards.
Thanks for reminding readers of this Catherine –
I was taught that:
“Assumption is the mother of all mess-ups”
Now, I start thinking that:
“Adjustment is the mother of all mess-ups”
Or rather:
“Adjustment is the father of all mess-ups”
And, when an Assumption is coupled with an Adjustment they frequently give birth to an Ad hominem.
The data needed to be adjusted because it is contrary to the GCM outputs.
But it seems clear that there is severe yellow diamond pollution near the California coast.
They’re going to cover them up with wind turbines so no one will notice them… especially the gulls and pelicans and condors and falcons and eagles…
Steve Mosher,
Come on, we are waiting for a reasoned answer. Clearly, ocean measurements should not be treated as land measurements, but on the face of it, they have been so treated. What say you (and Berkeley Earth)?
Thanks Willis. Perhaps you have done for buoy data what Watts did for the land station data, not in the siting, but in the confidence we should have in the accuracy of the data. If you have the time, I would like to see if other buoy data are also similarly adjusted, oh sorry, fitted.
The list of the buoys used by Berkeley Earth is here, you can look for yourself.
w.
Boy oh Buoy! Looks like things are getting cooler.
Note that these are hourly measurements, not predominantly min/max as all the land datasets are, and are therefore more valid measures of temperature changes over time.
In addition, they are free of the numerous local to regional scale effects on measured temperatures that exist on land.
Being fixed they are also free of any drift biases that exist with Argo.
IMO, perhaps the best air temperature trend dataset we have.
No real surprise to me they show a cooling trend.
It is a catastrophe that the IPCC (i.e. Dr Pachauri and his Panel mates), pro-global warming politicians and bureaucrats, most of the media, academia, the environmental movement, and other global warming alarmist supporters, do not give a damn about real world observational data on climate. They are only interested in climate propaganda driven by the UN.
There is now an overwhelming amount of data and research that demonstrates the IPCC’s supposition of catastrophic man-made global warming is wrong. Yet the grand deception goes on and on.
It no longer matters what the weather and temperature do anymore because, whichever way they go, the climate change charlatans just blame it all on “climate change”… global warming or cooling… droughts or floods… hurricanes or no hurricanes… winter blizzards or no winter blizzards… sea-level rise or sea-level decline… it matters not, anymore.
Pro-CAGW data isn’t data until it’s made a pass through the Gruber engine. Thus refined, it is fit to publish.
Willis, I have a question regarding short-term ocean temperature changes. With arctic air upon our area for the first time this year, I decided to look at a buoy off the Washington coast to see how cold the water was. NDBC 46087 (Neah Bay) at 11/29/14, 1720 hours: air temp of 36.3F and water temp of 52.3F. At 11/30/14, 0720 hours: air temp of 34.3F and water temp of 50.0F. The two air temp readings seemed logical, but the larger difference between the two water temp readings surprised me.
Is it unusual to have that much change in the water along the coast? Or is it normal for tides and currents to move the water temperature in larger swings than the air?
Also, I was curious whether all the buoys report temperatures in Fahrenheit?
Thanks
Thanks, Windsong. First, the buoys report in Celsius, at least the Bodega Bay buoy does on the NDBC page.
Regarding the water temperatures, the west coast of the US is a roiling, bubbling mass of water due to the deep currents that strike the coast and upwell all along the western edge of the continent. These currents vary daily both in strength and location.
This has several results. First, it is responsible for the green pea-soup water all along the coast, in contrast to the clear blue ocean water not far offshore. The deep water is rich in a variety of minerals, and as a result, it supports huge amounts of green phytoplankton.
Another result is that the various measures of the water like salinity, pH, temperature, and density are all changing on a very short timescale.
Finally, Neah Bay is at the mouth of the Strait of Juan de Fuca. As a result, it is strongly tidal. Particularly in the fall and winter you get fresh water in the inshore section. In addition, the inshore water is shallow, and thus warms up at a much different rate than the outer ocean. As a result the inshore water will often have a very different temperature than the open ocean water.
All the best. Neah Bay is a lovely spot; say hi to the north coast for me.
w.