Guest Post by Willis Eschenbach
As many folks know, I’m a fan of good clear detailed data. I’ve been eyeing the buoy data from the National Data Buoy Center (NDBC) for a while. This is the data collected by a large number of buoys moored offshore all around the coast of the US. I like it because it is unaffected by location changes, time of observation, or Urban Heat Island effect, so there’s no need to “adjust” it. However, I haven’t had the patience to download and process it, because my preliminary investigation a while back revealed that there are a number of problems with the dataset. Here’s a photo of the nearest buoy to where I live. I’ve often seen it when I’ve been commercial fishing off the coast here from Bodega Bay or San Francisco … but that’s another story.
And here’s the location of the buoy. It’s the large yellow diamond at the upper left:
The problems with the Bodega Bay buoy dataset, in no particular order, are:
• One file for each year.
• Duplicated lines in a number of the years.
• The number of variables changes in the middle of the dataset, in the middle of a year, adding a column to the record.
• Time units change from hours to hours and minutes in the middle of the dataset, adding another column to the record.
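For anyone who wants to try the cleanup themselves, here is a minimal Python sketch of one way to deal with the problems listed above. It is not the R code I used. It assumes each yearly file starts with a header line naming the columns, that the air temperature column is called ATMP, that the optional minutes column is called mm, and that 999.0 marks a missing temperature; check all of that against the NDBC description file. The two sample "files" are made up for illustration.

```python
from datetime import datetime

MISSING_ATMP = 999.0  # assumed missing-value code; verify against the NDBC docs


def parse_buoy_text(text):
    """Return {datetime: air temperature} from the text of one yearly file."""
    records = {}
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if header is None:
            # The first non-blank line names the columns; drop a leading '#'.
            header = [f.lstrip("#") for f in fields]
            continue
        if fields[0].startswith("#"):
            continue  # a units line, present only in some layouts
        # Look columns up by name, so an added column does not shift anything.
        row = dict(zip(header, fields))
        year = int(row.get("YY") or row.get("YYYY"))
        if year < 100:
            year += 1900  # two-digit years in the oldest files
        minute = int(row.get("mm", 0))  # minutes column appears partway through
        stamp = datetime(year, int(row["MM"]), int(row["DD"]),
                         int(row["hh"]), minute)
        temp = float(row["ATMP"])
        if temp != MISSING_ATMP:
            records[stamp] = temp  # keying on the timestamp drops duplicate lines
    return records


def merge_years(texts):
    """Combine any number of yearly files into one sorted time series."""
    merged = {}
    for text in texts:
        merged.update(parse_buoy_text(text))
    return sorted(merged.items())


# Made-up sample data in two different layouts.
OLD = """YY MM DD hh WD WSPD ATMP
86 01 01 00 300 5.0 11.2
86 01 01 00 300 5.0 11.2
86 01 01 01 310 4.0 999.0
"""
NEW = """#YY MM DD hh mm WDIR WSPD ATMP TIDE
#yr mo dy hr mn degT m/s degC ft
2007 01 01 00 50 290 6.0 10.4 99.00
"""

series = merge_years([OLD, NEW])
for stamp, temp in series:
    print(stamp, temp)
```

Reading columns by name rather than by position is what makes the changing number of variables harmless, and using the timestamp as a dictionary key takes care of the duplicated lines.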
But as the I Ching says, “Perseverance furthers.” I’ve finally been able to beat my way through all of the garbage and I’ve gotten a clean time series of the air temperatures at the Bodega Bay Buoy … here’s that record:
Must be some of that global warming I’ve been hearing about …
Note that there are several gaps in the data:
Year:    1986  1987  1988  1992  1997  1998  2002  2003  2011
Months:     7     1     2     2     8     2     1     1     4
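A table like the one above can be produced from the timestamps alone. The sketch below is one possible way to do it in Python, under the assumption that a month counts as a gap only when it contains no observations at all; the function name and the three sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime


def missing_months_by_year(stamps):
    """Count calendar months with no observations, per year.

    Only months between the first and last observation are considered.
    """
    present = {(s.year, s.month) for s in stamps}
    first, last = min(present), max(present)
    missing = Counter()
    year, month = first
    while (year, month) <= last:
        if (year, month) not in present:
            missing[year] += 1
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return dict(missing)


# Made-up timestamps: data in Jan 1986, Apr 1986 and Feb 1987 only.
sample = [
    datetime(1986, 1, 15, 3),
    datetime(1986, 4, 2, 12),
    datetime(1987, 2, 20, 6),
]
gaps = missing_months_by_year(sample)
print(gaps)
```

A stricter rule, such as requiring some minimum number of hourly readings per month, would give larger counts, so the definition of a gap needs to be stated whenever two gap tables are compared.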
Now, after writing all of that, and putting it up in draft form and almost ready to hit the “Publish” button … I got to wondering if the Berkeley Earth folks used the buoy data. So I took a look, and to my surprise, they have data from no less than 145 of these buoys, including the Bodega Bay buoy … here is the Berkeley Earth Surface Temperature dataset for the Bodega Bay buoy:
Now, there are some oddities about this record … first, although it is superficially quite similar to my analysis, a closer look reveals a variety of differences. Could be my error, wouldn’t be the first time … or perhaps they didn’t do as diligent a job as I did of removing duplicates and such. I don’t know the answer.
Next, they list a number of monthly results as being “Quality Control Fail” … I fear I don’t understand that, for a couple of reasons. First, the underlying dataset is not monthly data, or even daily data. It is hourly data … so while the odd hourly record might be wrong, how could a whole month fail quality control? And second, the data is already checked and quality controlled by the NDBC. So what is the basis for the Berkeley Earth claim of multiple failures of quality control on a monthly basis?
Moving on, below is what they say is the appropriate way to adjust the data … let me start by saying, whaa?!? Why on earth would they think that this data needs adjusting? I can find no indication that there has been any change in how the observations are taken, or the like. I see no conceivable reason to adjust it … but nooo, here’s their brilliant plan:
As you can see, once they “adjust” the station for their so-called “Estimated Station Mean Bias”, instead of a gradual cooling, there’s no trend in the data at all … shocking, I know.
One other oddity. There is a gap in their records in 1986-7, as well as in 2011 (see above), but they didn’t indicate a “record gap” (green triangle) as they did elsewhere … why not?
To me, all of this indicates a real problem with the Berkeley Earth computer program used to “adjust” the buoy data … which I assume is the same program used to “adjust” the land stations. Perhaps one of the Berkeley Earth folks would be kind enough to explain all of this …
w.
AS ALWAYS: If you disagree with someone, please QUOTE THE EXACT WORDS YOU DISAGREE WITH. That way, we can all understand your objection.
R DATA AND CODE: In a zipped file here. I’ve provided the data as an R “save” file. The code contains the lines to download the individual data files, but they’re remarked out since I’ve provided the cleaned-up data in R format.
BODEGA BAY BUOY NDBC DATA: The main page for the Bodega Bay buoy, station number 46013, is here. See the “Historical Data” link at the bottom for the data.
NDBC DATA DESCRIPTION: The NDBC description file is here.





So, no one anywhere knows how the planet’s temp is measured, across decades as measurement has changed (supposedly for the better), or what the planet’s temp is, but we have all been certain for decades that the planet has been warming. OK, I got it.
Willis,
Great job on getting the data and presenting it in a form that tells the story.
As an engineer I appreciate your use of raw data. If engineers corrupted the data the way the “climate scientists” do we would have structure failures and bridges falling down and processes melting the containment equipment.
In evaluating equipment failures and operating performance, engineers would never accept the data modification employed by the “team”.
Mosher doesn’t know anything about engineering, so he can’t comprehend first principles of data integrity; ergo BEST data is suspect. Yes, cAGW suffers from integrity failure just like a badly designed bridge fails structural integrity.
What Mosher knows or doesn’t know is an irrelevant and unpleasant ad hominem. Either his claims are correct or they are not, regardless of what he might or might not know.
w.
Steven Mosher November 30, 2014 at 12:42 pm
Mosh, thanks for returning to answer some of the questions. I don’t understand the last part of this. You say that improving the local detail doesn’t improve the overall regression r^2 … when you make the answer better it doesn’t improve the answer? Could you explain that?
Next, the “climate” in your equation doesn’t change with time. In other words, other than seasonal variations, the “climate” doesn’t change from year to year. Per your “Methods” paper:
So this means that for the long-term trends, the climate is irrelevant, and all that counts is the weather … an odd concept indeed. You are assuming that climate is constant … say what?
However, let’s take that all as read and see where it leads us:
I assume that the “predicted local value” is what you are calling the “climate”. But since the climate (in your framework) doesn’t vary year to year, this seems not to have much meaning. In any case, it is in no sense a “regression based prediction”, because the results of the regression are CONSTANT OVER TIME. How is something that doesn’t change over time a “prediction”?
As far as I know, nobody said that the predicted local value is “adjusted data”. In fact, it is a CONSTANT with respect to time, so it’s not adjusted anything.
But what is adjusted is all the rest, the “weather” in your framework. And before you jump on people for calling it “adjusted”, and start repeating that you don’t adjust data, that’s what YOU call it, “breakpoint adjusted data”. That’s the meat of your results, not the regression values, and that is indeed adjusted.
Not only is it adjusted, it is adjusted very poorly, as the Bodega Bay buoy data shows. You have a breakpoint in the Bodega Bay data in 2005 for a simple reason: your computer screwed up. There was no gap in the data in 2005, it was just shoddy quality control on your part. And when you found that gap, you chopped the data at that point, with absolutely no justification for doing so.
The other part is, you have this bizarre idea that chopping one continuous dataset into shorter chunks is not “adjusting” it … but when you look at the raw Bodega Bay data compared to the “breakpoint adjusted” dataset, their trends are very different. How can you claim with a straight face that you get a different trend but you haven’t “adjusted” anything?
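To make the point concrete, here is a toy Python sketch. It is emphatically not the Berkeley Earth algorithm, which aligns segments against neighbouring stations. As a deliberately crude stand-in, it cuts a synthetic, steadily cooling series at two hypothetical breakpoints and shifts each piece so that its mean matches the mean of the whole series. All numbers are invented for illustration.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def align_segments(ys, breakpoints):
    """Shift each segment so its mean equals the mean of the whole series."""
    overall = sum(ys) / len(ys)
    edges = [0] + list(breakpoints) + [len(ys)]
    aligned = []
    for start, stop in zip(edges, edges[1:]):
        segment = ys[start:stop]
        offset = overall - sum(segment) / len(segment)
        aligned.extend(y + offset for y in segment)
    return aligned


# Synthetic series: 30 years of monthly values cooling at 0.02 degrees per year.
years = [m / 12 for m in range(360)]
raw = [12.0 - 0.02 * t for t in years]

# Two hypothetical breakpoints, at month 120 and month 240.
aligned = align_segments(raw, [120, 240])

raw_trend = slope(years, raw)
aligned_trend = slope(years, aligned)
print("raw trend per year:    ", raw_trend)
print("aligned trend per year:", aligned_trend)
```

In this toy case every individual value keeps its within-segment slope, yet the trend over the full record shrinks to a small fraction of the original. Whether the real procedure behaves this way on real data depends on how the offsets are chosen, which is exactly the question being asked here.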
Hang on, I don’t follow. “Data that are predicted from a global model” is the C for Climate part of your results. We’re not looking at that, we’re looking at YOUR “breakpoint adjusted data” … and presumably, from your description, this is NOT “data that are predicted from a global model”. What am I missing here?
Next, you say we have the option of deciding how to treat buoys … and that you treat them as land stations.
?????
I’m sorry, but I don’t understand how you would tell if adding a land class to the regression “adds anything” … what would you compare it against? In other words, how can you tell if your “climate” is improved?
Next, in the “Methods” paper it says:
They say it “should be trend neutral”? That means that they’re just guessing, and they haven’t tested it. Color me unsurprised.
However, look at what happens in the Berkeley Earth Bodega Bay record. The introduction of unnecessary breakpoints converts a trend of falling temperatures into almost no trend at all. And this is further exacerbated by Berkeley Earth throwing out a fifth! of the data because it doesn’t fit their fancy algorithm. According to Berkeley Earth, before any adjustments, the trend was -2.75°C per century. After the so-called “quality control” throws out a fifth of the data for no reason … hey, guess what? The trend is reduced to -1.549°C per century … and after the “breakpoint alignment” the trend was further reduced to a mere -0.24°C per century.
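For reference, a trend quoted in °C per century is just the least-squares slope of temperature against time in years, multiplied by 100. The Python sketch below shows the arithmetic on a synthetic series whose slope was chosen to reproduce the first figure quoted above; it is not the buoy data, and the function name is my own.

```python
def trend_per_century(years, temps):
    """Least-squares trend of temps against years, in degrees per century."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(temps) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, temps))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return 100.0 * sxy / sxx


# Synthetic monthly series starting in 1981, cooling at 0.0275 degrees per year.
years = [1981 + m / 12 for m in range(360)]
temps = [12.0 - 0.0275 * (y - 1981) for y in years]

trend = trend_per_century(years, temps)
print(round(trend, 3))
```

Running the same function on the raw, quality-controlled, and breakpoint-adjusted versions of a record is a simple way to check how much each processing step changes the trend.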
Those are not my figures, they are yours … and you say that you’re not adjusting anything?
Next, the “Methods” paper says:
However, the Bodega Bay dataset has no gaps of a year … and even in the Berkeley Earth incorrect version of the same dataset there is only one gap of a full year. Despite that, you show two breakpoints in the data … why is that? Is it the result of an undocumented change in the method, and if so, are there others?
Best regards,
w.
Willis,
Sort of cuts thru it, doesn’t it. Well done.
Thanks, Willis. Food for thought about BEST; not good.
I wonder the trend would like with the climate fails included?
That like would look better if it was actually a look.
OMG – last try, promise! I wonder how that trend would look with the climate fails included . . . there!” ?
Hi everyone. I’ve been lurking these pages for quite a few years and this is my first post. Yay! Some of you might know me as The Old Bloke from the BBC bias web site; if you do, hello also. I’ve been “interested” in meteorology for 53 years now and have “seen it all” here in the U.K. I am also a pilot. Concerning the data sets from buoys, please have a look at this:
http://www.crondallweather.co.uk/uk-buoy-live-weather-data.php#.VHzfi-kqXcs
Mosher confirms here what many of us have long known about BEST’s algorithm. It is not an objective statistical treatment of actual measurements, but the purposeful creation of a pseudo-scientific fiction, whose rationale is couched in Orwellian double-speak.
Dear heavens, 1sky1, could you please put a cork in the ad hominem attacks? You have no clue what the motives of the large number of folks involved with Berkeley Earth are … and assuredly they do not all have the same motives.
Posting this kind of vitriolic nonsense, without a single scrap of evidence or science in it, does far more damage to your reputation than to theirs. Look, I think that Richard Muller is a double-plus ungood fellow for traducing Anthony in front of the US Senate … but that has nothing to do with his scientific claims.
Finally, I know Mosh, and he’s a decent, honest man doing his best to put his point of view forwards, as we all do. I certainly disagree with him often about the science … but that doesn’t make him a bad or underhanded person. It just means we disagree about the science.
If you have issues with the Berkeley Earth methods (as I certainly do) then please, bring them up. That’s what the site is for, attacking the science. But that kind of ugly personal attack has no place here on WUWT.
w.
You think it’s decent & honest to blame skeptics for Obama’s executive orders?
More evidence that Lalaland includes Berkeley.
Obama? Who said anything about Obama? QUOTE WHAT YOU DISAGREE WITH, your claim as stated makes no sense at all.
w.
I guess you missed the recent comments in which Steven jumped the shark, even by his own high standard of cartilaginous fish vaulting.
If in your bubble he’s a great guy, who am I to pop it?
So you refuse to quote what you are babbling about, or provide a link to where it might be found … and you expect folks to take you seriously? Really?
Not gonna happen. I just laugh at that kind of pretentious posturing and hand-waving. If you want to have a discussion, link to what you’re talking about, and I’m more than happy to talk about it.
But you trying to send me on a snipe hunt to find some vague unspecified comment that you read on some unknown thread somewhere at some time or another?
Sorry, I don’t fall for those kinds of childish games.
w.
Good grief, Willis, you have no idea what an actual “ad hominem attack” is. What I’m attacking is BEST’s patently tendentious algorithm and Mosher’s Orwellian defense of it here.
1sky1 December 1, 2014 at 4:26 pm
Having been the recipient of so many of them, I’m a bit of an expert on ad hominem attacks. You are attacking the motives of the people who put the dataset together, and not their science. You have claimed that they are purposefully deceiving people, which is an ad hominem attack.
w.
PS—In your words, the people who made the Berkeley Earth dataset were engaged in
And that’s ad hominem root and branch, my friend.
I’m not so sure Willis. Is it an ad hominem attack to say the IPCC are tailoring their reports for self preservation?
I do think the people at Berkeley Earth are genuinely trying to do better science but I also think they underestimate the problems with their scalpel approach and probably overestimate their ability to correctly process source data according to their stated rules.
TimTheToolMan December 2, 2014 at 4:56 am
Thanks, Tim. What Milodon said about Mosher was:
To me, this is not an ad hominem attack, either by Milodon or Mosher. Why? Because (as far as I know, since milodon hasn’t identified WTF he’s babbling about) neither one is attacking the person INSTEAD OF attacking the person’s scientific claims. Not all personal attacks are ad hominem attacks. They may be political statements, they may simply be talking about the person and their habits, they may be a host of things.
1sky1, on the other hand, said:
Note what he is doing there. He is trying to discredit the ALGORITHM by attacking the supposed motives of its creators. This is an ad hominem attack, that is to say trying to throw doubt on scientific results by throwing doubt on the scientists involved.
Best regards,
w.
BEST’s deliberate fiction, which creates the illusion that more is known than is possible from available data, relies upon two unwarranted assumptions:
1) That a global regression of observed temperature upon latitude and elevation provides a realistic criterion for evaluating and adjusting actual station data throughout the globe. The claimed R^2 of ~0.8 simply doesn’t stand up, however, when the periodic seasonal component is removed to isolate the aperiodic climate signal, rendering the projections of the regression model unfit for the purpose.
2) That decade-scale “scalpeling” can preserve bona fide low-frequency (multidecadal) climate signal components and “kriging” can establish reliable estimates of entire time-series where no measurements at all have been made. While Monte Carlo testing of “break-point” detection routines on AR(1) processes may not show a low-frequency bias, the power spectra of actual climate signals are very different from the monotonically decaying structure of such processes. Likewise, successful kriging is entirely dependent upon spatial homogeneity of temporal variation, which is seldom encountered in nature over scales greater than a few hundred miles. Yet BEST produces time-series even in locations more than 1000 miles away from the nearest station.
What is Orwellian about Mosher’s defense of BEST’s methods is not just their justification, but the characterization of actual measurements as being “wrong.”
And, of course, Muller has presented the “results” of BEST’s “findings” to Congress and the media as if they were purely the product of diligent analysis of hard empirical data, rather than of the over-reach of academic presumption.
Only someone sitting on a branch that he himself is sawing would pretend that my verifiable observations constitute ad hominem argumentation.
Willis writes “He is trying to discredit the ALGORITHM by attacking the supposed motives of its creators. This is an ad hominem attack, that is to say trying to throw doubt on scientific results by throwing doubt on the scientists involved.”
Am I not doing the same when I say the IPCC is tailoring their reports for self preservation?
Tim:
To attack personally one’s character or motives in order to distract attention away from the substantive ISSUES of a debate is indeed ad hominem argumentation. To observe the function of someone’s public stance RELATIVE to the issue is not. Nowhere in my critique of BEST’s methodology do I impute motive.
It wasn’t an ad hominem attack, but a statement of fact.
I guess you missed this comment by Mosher & the subsequent exchange:
http://wattsupwiththat.com/2014/11/14/claim-warmest-oceans-ever-recorded/
Steven Mosher
November 14, 2014 at 9:54 am
When the pause officially ends folks will go back to some other nonsense to deny what they dont need to deny:
C02 warms the planet. the question is how much.
Now, skeptics who want to make an impact ( like Nic Lewis) focus on the real question. Imagine what would happen if all skeptics learned from his example?
Instead they clown around denying basic physics. They clown around chasing the orbit of Jupiter.
They clown around complaining about anomalies and the colors of charts. Faced with clowns like this, Obama pulls out his phone and pen.
In short, some of the craziness spouted by fringe skeptics gets used to paint the whole tribe. And that
picture gets used to justify executive action. By denying basic physics fringe skeptics enabled the like of Lewandowski. They give cover for an imperial president .
Pete Ross
November 14, 2014 at 10:46 am
This comment is beyond Orwellian, blaming sceptics for Obama’s craziness.
1sky1 writes “That decade-scale “scalpeling” can preserve bona fide low-frequency (multidecadal) climate signal components”
It seems to me that a fundamental assumption of climate science is that there can be no multidecadal scale regional climate change. At least not without some other part of the planet compensating. Archaeology tells us regional climate change is real, and compensation at multidecadal scales is an assumption based on naive views that the energy can’t be retained or lost at different rates over time.