BEST practices step uncertainty levels in their climate data

Boyfromtottenham

January 29, 2015 12:50 pm

Tell the CIA! BEST is guilty of torturing the data till it confesses!

MattS

Reply to Boyfromtottenham

January 29, 2015 7:24 pm

Where do you think they got their torturers from?

Brandon Shollenberger

January 29, 2015 12:50 pm

I just got an alert about this post going up. I’m glad it has because I think these issues deserve attention, but I’m a little annoyed at myself because of the timing of my posts. When I published a little eBook giving the first part of my overview of the hockey stick debate:
http://www.amazon.com/Hockey-Stick-Climate-Wars-Introduction-ebook/dp/B00RE7K3W2/
last month, I had planned to finish the second part this month. Even when I wrote these two posts, I thought I could still manage to do it if I focused entirely on getting it done. That’s not going to happen if I want to discuss what I wrote about BEST.
Oh well. If this can spark some discussion of the BEST methodology, it’ll be worth it. I’ll just have to delay the publication a bit. And yes, I did mostly just write this comment to advertise that eBook. I know it’s crass, but I think a number of people here might genuinely be interested in it.

Rud Istvan

Reply to Brandon Shollenberger

January 29, 2015 1:27 pm

Congrats on the book.

Brandon Shollenberger

Reply to Rud Istvan

January 29, 2015 1:39 pm

Thanks!

Stephen Rasey

January 29, 2015 12:54 pm

All told, BEST’s uncertainty levels are a complete mess.
The breakpoint process is a complete mess, too.
They have reduced the temperature record to analyzing the noise after throwing away the signal.

Stephen Rasey

Reply to Stephen Rasey

January 29, 2015 1:08 pm

Since the subject of the post is error estimation, I will repost my concluding paragraphs from the link:

Returning to BEST, all those fragments of temperature records are equivalent to the band-pass seismic data. Finding the long term temperature signal is equivalent to inverting the seismic trace, but the error in the data must also accumulate as you go back in time. Since the temperature record fragments are missing the lowest frequencies, where is the low frequency control in the BEST process? In the seismic world, we have the velocity studies to control the low-frequency result.
What does BEST use to constrain the accumulating error? What does BEST use to provide valid low-frequency content from the data? What is the check that the BEST result is not just a regurgitation of modelers’ preconceptions and contamination from the suture glue? Show me the BEST process that preserves real low frequency climate data from the original temperature records. Only then can I even begin to give Berkeley Earth results any credence.
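
To put a rough number on the accumulating-error worry, here is a minimal sketch. It assumes (an assumption for illustration, not anything taken from the BEST code) that every suture between two record fragments adds an independent, zero-mean offset error with standard deviation `splice_sd`. Under that assumption the offset of the stitched record behaves like a random walk as you go back in time, so its spread grows roughly with the square root of the number of sutures unless some low-frequency control pins it down. All names and values are hypothetical.

```python
import math
import random
import statistics

def accumulated_offset_sd(splice_sd, n_splices, n_trials=4000, seed=1):
    """Empirical spread of the summed offset after n_splices independent sutures."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_trials):
        # Each suture contributes its own alignment error; the errors add up.
        totals.append(sum(rng.gauss(0.0, splice_sd) for _ in range(n_splices)))
    return statistics.pstdev(totals)

splice_sd = 0.1  # hypothetical per-suture offset error, deg C
for n in (1, 4, 25):
    simulated = accumulated_offset_sd(splice_sd, n)
    expected = splice_sd * math.sqrt(n)  # random-walk (square-root) growth
    print(f"{n:2d} sutures: simulated {simulated:.3f}, sqrt-law {expected:.3f}")
```

Whether the real sutures behave like independent errors is exactly the open question; the sketch only shows what follows if they do.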

Also refer to the June 28 – July 7, 2014 thread:
Problems with the Scalpel Method – Willis Eschenbach

Rud Istvan

Reply to Stephen Rasey

January 29, 2015 1:49 pm

Dedekind’s prior post, to which Willis was trying to get BEST to respond, was about ‘scalpeling’. Technically it is called Menne stitching in homogenization algorithms. And the inherent warming bias Dedekind explained to WUWT has been confirmed for actual stations. See Zhang et al., Effect of data homogenization…, Theor. Appl. Climatol. 115: 365-373 (2014). The bias results from the anchoring on the most recent data.

1sky1

Reply to Stephen Rasey

January 29, 2015 5:41 pm

Amen!

Mike M.

January 29, 2015 1:25 pm

It seems to me the issue here is random error versus systematic error. The former would be errors due to such things as imprecision in reading instruments and the effects of limited sampling. The latter would be things like instruments being out of calibration or measurements not being made according to protocol.
It is common in scientific papers for uncertainty estimates (when they are provided at all) to only include random errors, for the simple reason that systematic errors are very hard to estimate, because what really matters are the systematic errors you don’t know about or can’t control. The proper thing to do in such cases is to say that the error estimates are for random errors only.
It sounds like BEST is making an estimate of random errors. I think the very real issues Anthony raises pertain to possible systematic errors. So the question then is whether BEST has properly identified their error estimates as being for random errors only.
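
The distinction can be made concrete with a minimal sketch. It assumes a hypothetical instrument whose readings carry Gaussian random noise plus a fixed calibration offset; the true value, the noise level and the offset are all invented for illustration. Averaging more readings shrinks the random part, while the systematic part stays in the mean no matter how many readings are taken.

```python
import random
import statistics

def mean_reading_error(n_readings, random_sd, systematic_bias, seed=0):
    """Error of the average of n_readings noisy, biased readings."""
    rng = random.Random(seed)
    true_value = 15.0  # hypothetical true temperature, deg C
    readings = [true_value + systematic_bias + rng.gauss(0.0, random_sd)
                for _ in range(n_readings)]
    return statistics.fmean(readings) - true_value

# Same instrument, few readings versus many readings
few = mean_reading_error(10, random_sd=0.5, systematic_bias=0.3)
many = mean_reading_error(100_000, random_sd=0.5, systematic_bias=0.3)
print(f"error of mean,     10 readings: {few:+.3f}")
print(f"error of mean, 100000 readings: {many:+.3f}")
```

With many readings the error of the mean settles near the assumed 0.3 offset rather than near zero, which is why an error bar built from the scatter alone says nothing about the bias.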

Brandon Shollenberger

Reply to Mike M.

January 29, 2015 1:35 pm

Mike M., sadly, BEST doesn’t even handle random errors properly. Resampling a data set is a common way to try to estimate the uncertainty in that data set, but BEST rebaselines its results every time it resamples them. The problem with this is there is uncertainty in those baselines, and BEST doesn’t account for it (even though it has a variable to store them).
The issues are more clear if you read the posts at my site, but basically, if you want to see how much variance there is between two series, you have to be careful how you compare them. If you set both series to have the same mean for only one segment (in this case, the 1960-2010 segment), you will force the series to agree better in that segment and worse anywhere else. That causes your variance, and thus your uncertainty, to come out wrong.
And of course, if you’re going to test your methodology on a subset of your data, you have to test the entire methodology. BEST doesn’t. BEST doesn’t rerun its breakpoint calculations. That means BEST does nothing to try to establish how much uncertainty there is in one of the most important steps of its process.
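
A toy version of the baseline effect, as a minimal sketch: it assumes eight hypothetical resampled series that differ only by small slope differences (the `SLOPES` values are invented, and this is not the BEST jackknife code). Each series is shifted to have zero mean over a baseline window, and the spread across series is then measured inside and outside that window.

```python
import statistics

YEARS = list(range(1900, 2011))
# Hypothetical slope differences (deg C per year) between eight resampled series
SLOPES = [-0.004, -0.003, -0.002, -0.001, 0.001, 0.002, 0.003, 0.004]

def make_series(slope):
    """A series that drifts linearly away from the others."""
    return {year: slope * (year - 1900) for year in YEARS}

def rebaseline(series, start, end):
    """Shift a series so its mean over start..end (inclusive) is zero."""
    base = statistics.fmean([series[y] for y in range(start, end + 1)])
    return {year: value - base for year, value in series.items()}

def spread(series_list, year):
    """Standard deviation across the series in a given year."""
    return statistics.pstdev([s[year] for s in series_list])

raw = [make_series(slope) for slope in SLOPES]
narrow = [rebaseline(s, 1960, 2010) for s in raw]  # baseline on one segment
full = [rebaseline(s, 1900, 2010) for s in raw]    # baseline on the whole record

for year in (1910, 1985):
    print(year,
          f"1960-2010 baseline: {spread(narrow, year):.3f}",
          f"full-record baseline: {spread(full, year):.3f}")
```

In this toy case the 1960-2010 baseline makes the series agree almost perfectly in 1985 and disagree more in 1910 than the full-record baseline does, even though the underlying series are identical. Rerunning with a different window is also a quick way to check whether an uncertainty series depends on the choice of reference period.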

maccassar

Reply to Brandon Shollenberger

January 29, 2015 2:50 pm

Brandon
Thank you for delving into this data. I would encourage all to take a look at BEST and see what they think about the various graphs etc. There are many issues and questions that pop out just from a cursory review. Keep up the good work.

Mike M.

Reply to Brandon Shollenberger

January 29, 2015 2:52 pm

Brandon Shollenberger,
It is not clear to me that the use of a common reference period is inappropriate with respect to computing uncertainties in an anomaly series. I can see arguments on both sides. The test would be whether the uncertainty time series depends on choice of reference period. Has that been tried?
The issue with the homogenization surely seems to be a question of systematic, not random, error. So it is not obvious that it should be included in a test of random errors, although it is obvious that it is important to derive a test of the homogenization. That is, after all the issue: Does the homogenization introduce a bias into the trend?
It appears as if BEST has largely adopted the procedures used by others, rather than examining those procedures de novo. Having an “honest broker” using the same procedures would be valuable if the issue were fraud in handling the data. But that is not the case. The real issue is whether confirmation bias has led to the acceptance of procedures that would not have been accepted otherwise.

Brandon Shollenberger

Reply to Brandon Shollenberger

January 29, 2015 7:24 pm

Mike M, the math involved in showing the choice of baseline period causes the step change evident in the graph is fairly simple. I gave a simple demonstration of it in the first post I wrote on this. It shows how, if a different baseline period were used, a different set of uncertainties would have been generated.
But really, there’s no reason uncertainties should plummet just because they are during your baseline period.

Steven Mosher

Reply to Brandon Shollenberger

January 29, 2015 7:58 pm

Breakpoints are not actually very important.
The empirical breakpoint approach was pretty rigorously tested to both increases in breakpoints and decreases in breakpoints. Not much effect.
The reason for this is that globally adjustments don’t amount to much, as many of us have shown long before Berkeley ever did its project.

Brandon Shollenberger

Reply to Brandon Shollenberger

January 29, 2015 10:30 pm

Steven Mosher, it’s easy to say something “doesn’t matter,” but saying it doesn’t make it so. This is especially true when your argument is in the form:

The empirical breakpoint approach was pretty rigorously tested to both increases in breakpoints and decreases in breakpoints. Not much effect.
The reason for this is that globally adjustments don’t amount to much

There is no reason people should only care whether or not breakpoints have effects “globally.” Even if breakpoints didn’t change the global trend at all, they could still be hugely important.
Heck, if we take Zeke’s response to me as true, a significant portion of why the step change in uncertainty I highlighted exists (or at least, exists in that particular location) may well be because of two breakpoints.

Steven Mosher

Reply to Brandon Shollenberger

January 30, 2015 8:06 am

Testing makes it so.
Looking at the difference between adjusted and unadjusted makes it so.

Brandon Shollenberger

Reply to Brandon Shollenberger

January 30, 2015 10:11 am

Steven Mosher does a good job of showing what is wrong with BEST’s handling of things:

Testing makes it so.
Looking at the difference between adjusted and unadjusted makes it so.

BEST doesn’t publish results of such tests. BEST doesn’t justify its methodological decisions. All it does is say, “It doesn’t matter, trust us.” And if you don’t just trust them, they say, “Test it yourself” even though they know such testing can take at least weeks of runtime if one knows exactly what to do. People needing to familiarize themselves with the problem could need months of runtime.
The reality is BEST ought to discuss its methodological decisions so it can justify them and explain any caveats that may be involved in them. It doesn’t. It claims to be completely open and transparent, but it intentionally fails to disclose known issues with its work. And if other people try to discuss those issues, it tells them to bugger off.
Incidentally, a lot of people may not have a computer they can set aside for months solely to test BEST’s methodology to figure out the results BEST chooses not to release. I know I don’t.

Alx

Reply to Mike M.

January 29, 2015 3:20 pm

I am not following the idea that, since systematic errors are very hard to estimate and what really matters are the systematic errors you don’t know about or can’t control, they should not be considered.
Limited sampling seems to be a systemic error. Reporting the random error of guessing or deriving data due to lack of data is just hiding the systemic error of not having enough reliable data. It’s saying since we do not know how much sampling is required to get a reliable number, we’ll just assume that margin of error no matter how large or small will favor warming and cooling equally and so will ignore it.
I get that I cannot plan my driving on the premise that my speedometer will, at some unknown point in time, stop recording speed accurately (hopefully I would figure it out after 1 or 2 speeding tickets). However, I can predict how many cars will be driven improperly, since we have an extensive accident history and an annual body count due to driving not being done according to protocol, so to speak. So I believe we can indeed add the systemic error called human error to the plot.

Pat Frank

Reply to Mike M.

January 29, 2015 7:42 pm

Mike M, all the groups working on global surface air temperature — UEA/UKMet, BEST, GISS — all assume that sensor measurement error is random and averages away.
The assumption is promiscuous and entirely unjustifiable. Nevertheless, the Central Limit Theorem is assumed to apply throughout, and they clutch to it in a death grip.
They ignore systematic measurement error entirely, and until recently never even mentioned it in their papers. Available calibration experiments show that temperature sensor systematic error is large and persistent. Solar loading has the greatest impact.
I’ve published on this problem here (870 KB pdf) and here (1 MB pdf). From systematic sensor measurement error alone, the uncertainty in the 20th century global surface air temperature record is about (+/-)0.5 C.
If they ever admitted to the systematic error, obviously present, they’d end up with nothing to report. The prime evidence base of AGW would vanish. One can understand the reluctance, but it’s incompetent science regardless.
I’ve corresponded with Phil Brohan and John Kennedy at UK Met about the papers (Phil contacted me). They can’t refute the work, but have chosen to ignore it. Apparently likewise, everyone else in the field, too.

M Courtney

Reply to Pat Frank

January 30, 2015 12:56 am

Having read your first linked paper I think it is very important.
Your conclusion isn’t that the world hasn’t warmed (believable) but rather that the measurement uncertainties are underestimated and that the magnitude of the warming is unknowable. That is very relevant to the core of this website.
“The temperature sensor at each station will exhibit a unique and independent noise variance” – of course!
So you can’t just average them together to get about zero-ish. That makes sense.
In my opinion, this deserves a full post.

Pat Frank

Reply to Pat Frank

January 30, 2015 9:00 am

M Courtney, thanks for the positive feedback. After that paper was published, the response amazed me. The idea of systematic error seemed beyond the grasp of so many. There may be a follow-up. Most of the analysis is done. But I’m trying to publish now on the reliability of climate models. The reviewer responses to that submission have been, if anything, more amazing. Modelers know nothing of physical error analysis. That would be worth a post in its own right.

anng

Reply to Mike M.

January 30, 2015 3:49 am

Mike,
You make a really good point here.
Systemic errors are those which occur across all data points. For GMST the most obvious one is the post-data collection processing which is run on every item of data.
Because a huge number of weather stations are used, you’d expect the raw data to have a random distribution of instrument calibration and measurement protocol.
However, the more you standardise globally, the more possibilities you have for systemic errors arising. My understanding is that there was a big standardisation effort around about 1997, which would increase the possibility of systemic error to weather-station design, instrumental hardware decay, etc etc..
I don’t believe statistics are useable on GMST.

anng

Reply to anng

January 30, 2015 4:06 am

Mike,
You make a really good point here.
Systemic errors are those which occur across a large proportion of data points. For GMST the most obvious one is the post-data collection processing which is run on every item of data.
Because a huge number of weather stations are used, you’d expect the raw data to have a random distribution of instrument calibration and measurement protocol. (Unlike when using one instrument in a lab experiment.)
However, the more you standardise globally, the more possibilities you have for systemic errors arising. My understanding is that there was a big standardisation effort around about 1997, which would increase the possibility of systemic error to weather-station design, instrumental hardware decay, etc etc..
I don’t believe statistics are useable on GMST.

Rud Istvan

January 29, 2015 1:26 pm

BEST has other problems also, and they are not subtle.
They use regional expectations to QC outliers. To see what that does, just look at BEST 166900. Rejected 28 months of extreme cold to turn no trend into a warming trend. 166900 is Amundsen Scott at the south pole, arguably the best and certainly the most expensive station on the planet. Nearest station is McMurdo, 1300 km away and 2700 meters lower on the coast. Altogether different climate.
Concerning their website representation of the ingested data, look at BEST 157455. BEST reports two station moves in the metadata (the red diamonds), one about 1972 and one about 2007. The GHCN metadata file has nothing. Easy to check. http://Www.ncdc.noaa.gov/data-access/land-based-station-data/ Go to left side menu, click metadata, enter Puerto Casado.
Found a back door to the summarized ingested BEST data at berkeleyearth.lbl.gov/auto/Stations/TAVG/text/157455-TAVG-Data.txt. The two station moves came from ingested WMO metadata. Flagged by 1 rather than 0 in column 5. The problem is, these were in 2007 and 2013. BEST shows a 1972 station move that is not in the metadata, and not one that is.
Altogether not confidence inspiring. Yet still better than GHCN or GISS or BOM ACORN. You can spot check BEST against other stations that have plainly been improperly fiddled. There are quite a few examples in essay When Data Isnt in ebook Blowing Smoke.

Nick Stokes

Reply to Rud Istvan

January 29, 2015 1:51 pm

“The two station moves came from ingested WMO metadata. Flagged by 1 rather than 0 in column 5. The problem is, these were in 2007 and 2013.”
Isn’t it the 1 in col 6? There is one in Feb 1971, one in Sep 2005.

John Peter

Reply to Nick Stokes

January 29, 2015 2:15 pm

Does anyone have an answer to this point?

Rud Istvan

Reply to Nick Stokes

January 29, 2015 2:27 pm

You are right about column 6, I just checked. But in my .txt it still appears that the breaks header is the fifth column, not the sixth. The headers did not align well with the actual numerical columns and there was also a wrap around problem on every line when saved as a .pdf due to page width incompatibility. My apologies if I misinterpreted your output. I stand corrected on my point 2, but not point 1.

Steven Mosher

Reply to Nick Stokes

January 30, 2015 8:07 am

Yes rud doesn’t understand or care to.

Streetcred

Reply to Rud Istvan

January 29, 2015 2:27 pm

Shub Niggurath has had a good look at the Puerto Casado issue (h/t Paul Homewood) … https://nigguraths.wordpress.com/2015/01/29/the-puerto-casado-story/

Rud Istvan

Reply to Streetcred

January 29, 2015 3:13 pm

It’s a good post, saw it earlier. And it makes things even more confusing. Shub found three different station location coordinates in the BEST ingestion once he had the backdoor open, yet whether column 5 or column 6 there are only two move flags. He has Google Earthed all three locations, and all are plausible. Probably just shows how unclean all this stuff is.
Btw, I think Nick is possibly right about column 6 and I misread because of the print wrap around in the saved .pdf.
I just went and looked at the archived BEST output again. 1971 is certainly possible rather than 1972. But apparently not 2005. The flagged move is too far beyond the decade midpoint, still looks about 2007. Could be mid 2006, but not earlier. Plotting bug? Dunno.
I will repeat again what was said upthread. When you check BEST versus GISS, GHCN, BOM ACORN, it still appears more reliable and less provably biased. Reykjavik Iceland, Valentia Ireland, De Bilt Netherlands, Sulina Rumania, Rutherglen Australia, …

Bill Illis

Reply to Rud Istvan

January 29, 2015 4:12 pm

Do you think it would matter if the station was moved from the east side of Amundsen Scott station to the west side? The glacier is presumably moving some direction anyway as well as a small amount higher every year.
Amundsen Scott is staffed with 200 scientists in the summer and 50 in the winter. I’m assuming they know what they are doing and no one should be mucking around with the already quality controlled weather data from this station taken by people risking their lives. It is immoral really.
The data is here and BEST should just leave it alone. I noted this problem to Zeke over 2 years ago and nothing has been fixed. BEST has taken Amundsen Scott’s zero trend since 1957 and turned it into 1.0C of warming. That means BEST’s algorithms are biased to find breaks that occur on the downside and far, far fewer on the warming side. Hence, all of their temperatures are biased upwards. They can prove this statement wrong very easily but they will not supply the basic info to prove this wrong.
http://www.antarctica.ac.uk/met/READER/surface/Amundsen_Scott.All.temperature.html

Sciguy54

Reply to Rud Istvan

January 29, 2015 5:29 pm

Steven was very generous of his time in explaining to me how BEST dropped those cold outliers at Amundsen. The problem for me is that there is little doubt that they were perfectly valid observations of real temperature conditions. Just as in the US southeast a passing summer thunderstorm can create a cool anomaly. Or a mistral, etc. There just seem to be far more regularly occurring events which can create a valid cool outlier than a hot one. And eliminating them just may bias the results.
I suggested that BEST create a switch which would allow a user to toggle the outliers in or out of the mix to see how it affects results.
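
Such a switch is easy to sketch. The following is a hypothetical illustration, not BEST's quality-control code: `station_mean`, `regional_expectation` and `max_departure` are invented names, and the monthly anomalies are made-up values containing two cold snaps that are assumed to be genuine observations.

```python
import statistics

def station_mean(anomalies, regional_expectation=0.0, max_departure=3.0,
                 drop_outliers=True):
    """Average monthly anomalies, optionally rejecting values that sit far
    from the regional expectation."""
    if drop_outliers:
        kept = [a for a in anomalies
                if abs(a - regional_expectation) <= max_departure]
    else:
        kept = list(anomalies)
    return statistics.fmean(kept)

# Hypothetical monthly anomalies (deg C) with two real cold snaps
months = [-0.2, 0.1, 0.0, 0.3, -0.1, -6.0, 0.2, -5.5, 0.1, 0.0]
with_qc = station_mean(months, drop_outliers=True)
without_qc = station_mean(months, drop_outliers=False)
print(f"outliers dropped: {with_qc:+.2f}")
print(f"outliers kept:    {without_qc:+.2f}")
```

When the extreme values fall mostly on the cold side, rejecting them raises the station average, so running the analysis both ways shows how much of a result depends on the rejection rule.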

Steven Mosher

Reply to Sciguy54

January 29, 2015 8:00 pm

the code is there for anyone to modify.

tty

Reply to Rud Istvan

January 30, 2015 9:38 am

Don’t get cocky Steve. I’ve spot-checked BEST break-points for stations where I have information that isn’t in GHCN. The breakpoint algorithm regularly misses quite large station moves. However I can’t prove that the break-points that are inserted are actually wrong, since there is always the possibility of undocumented changes in e.g. vegetation.

Mike M.

January 29, 2015 1:27 pm

The reduction in uncertainty from the early fifties to early sixties could well be real. For instance, the IGY was during that period (1958?), resulting in a big increase in global monitoring.

Brandon Shollenberger

Reply to Mike M.

January 29, 2015 1:38 pm

Mike M., the amount of stations which exist is relevant, but it is tied to spatial uncertainty. The uncertainty I dealt with in these two posts is statistical uncertainty. I’ll quote the explanation I gave in a comment at my site. My post quoted part of the BEST code:

sp_unc = sp.(['unc_' types{m}]);
st_unc = st.(['unc_' types{m}]);
unc = sqrt( st_unc.^2 + sp_unc.^2 );
Issues with how much of the globe is covered by temperature stations would go in the sp_unc variable. The issues I’m discussing would go in the st_unc variable.
That said, I get the improved coverage for the 1960-2010 period might make someone think that is the best choice of baseline period. The problem is as soon as you choose a segment of your record to be your baseline period for comparing series (such as the eight series created in the BEST jackknife calculations), you reduce the variance of that period and inflate the variance outside that period. That distorts your uncertainty levels. Even worse, it distorts them in a way which fits your expectations (e.g. decreasing the uncertainty in the 1960-2010 period while increasing the uncertainty before it) so you’re less likely to notice it.
The bias this mistake causes may fit BEST’s assumptions, but you can’t just take any answer that fits your assumptions as proof of those assumptions. If BEST hadn’t made a boneheaded mistake here, the uncertainty of modern times would be higher.
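
For readers who don't use MATLAB, the last line of the quoted snippet is a root-sum-of-squares (quadrature) combination of the two components. Here is a minimal Python sketch of the same arithmetic; the function name and the numbers are hypothetical and chosen only to show how an understated statistical term pulls the total down.

```python
import math

def total_uncertainty(st_unc, sp_unc):
    """Combine statistical and spatial uncertainty in quadrature, year by year."""
    return [math.hypot(st, sp) for st, sp in zip(st_unc, sp_unc)]

# Hypothetical values (deg C) for three years
sp_unc = [0.04, 0.04, 0.04]  # spatial (coverage) uncertainty
st_unc = [0.09, 0.03, 0.03]  # statistical uncertainty; suppose the last two
                             # years fall inside the baseline period
print([round(u, 3) for u in total_uncertainty(st_unc, sp_unc)])
```

Because the two terms add as squares, any artificial reduction of the statistical term inside the baseline period shows up directly as a lower total for those years.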

Steven Mosher

Reply to Mike M.

January 30, 2015 8:09 am

The reduction is due to a reduction of spatial uncertainty.
This is due to the addition of stations.

Tonyb

January 29, 2015 1:29 pm

Brandon
As you have just popped over to CE you will probably have seen my reply to Mosh and his clarifications that follow
http://judithcurry.com/2015/01/28/open-thread-23/#comment-669713
Tonyb

Brandon Shollenberger

Reply to Tonyb

January 29, 2015 1:43 pm

Yup. I saw them pretty much right away. I just didn’t see that I had anything to add to them.

John in Oz

January 29, 2015 1:34 pm

quelle surprise? Not moi.

Kev-in-Uk

January 29, 2015 1:38 pm

But…But.. BEST, IIRC, is supposed to be a robust, complete scientifically and statistically valid analysis of ‘all’ available data, etc, etc. (as per the initial fanfare and blurb) and hence produces the most accurate representation of climate data possible, right?
Now I am totally Gutted……my beliefs shattered and in rags..
(/sarc – just in case anyone doesn’t detect the heartfelt sarcasm)
On a more serious note – I do hope this does not actually reflect any internal ‘purpose’ or ‘agenda’ within or by the BEST team – I have no problem with them making genuine errors (other than they should have been spotted in peer review, etc) and correcting them, as that would be ‘normal’ progressive science/advancement. But given the fanfare of the proclamations of AGW being ‘real’, etc, by those concerned, must we now seriously question their results too? (a la NASA/GISS/CRU, etc)

Brandon Shollenberger

Reply to Kev-in-Uk

January 29, 2015 1:51 pm

Kev-in-Uk, that’s the only reason I’ve spent any real time examining BEST. BEST was supposed to address all the issues skeptics had with the previous temperature records. It was supposed to use the best methodologies available.
But from the moment I started looking at it, I found obvious problems with it. The more time I spent looking into it, the worse I realized it was. There are a ton of issues this post doesn’t even begin to touch on. For instance, BEST makes a big deal about being completely transparent, but did you know there are at least seven different versions of its final results, none of which were archived? I get that BEST wants to update its results from time to time, but why wouldn’t it keep a record of its old ones for people to look at? Shouldn’t people be able to compare current results to old ones?
I don’t think BEST lives up to its hype, at all, but if you want to see it really screw up, you should take a look at this.

Alx

Reply to Kev-in-Uk

January 29, 2015 2:53 pm

BEST, IIRC, is supposed to be a robust, complete scientifically and statistically valid analysis of ‘all’ available data, etc, etc. (as per the initial fanfare and blurb) and hence produces the most accurate representation of climate data possible, right?

Well to the folks at BEST, IIRC it is.
Nothing is clear or clean-cut in the temperature business, there are a lot of problems and challenges with no easy answers and basically comes down to smart people making assumptions and then using those assumptions to take their best guess. The best and most dependable financial analysts all concluded all was hunky-dory right up to when the financial markets went belly-up across the globe.
We blue-sky new plane designs using this approach, but luckily we go much farther before we actually build planes. If climate alarmists built planes, the carnage from crashed planes would ground the airline industry for decades.
Climate alarmists have to come clean on the limitations of the temperature record before building cathedrals using it as a foundation.

dbstealey

January 29, 2015 1:45 pm

I have a problem when someone tries to cherry-pick specific years in order to make their point:
You can add a few years to the bottom graph. It will show global T is still flat.

Rud Istvan

Reply to dbstealey

January 29, 2015 1:56 pm

Michael Mann was still using that trick at the AGU in 2012. See essay An Awkward Pause.

Zeke Hausfather

Reply to dbstealey

January 29, 2015 5:22 pm

You do realize that the top figure goes through 2010, correct? I’m still somewhat confused what is cherry-picked, apart from perhaps the GWPF cherry-picking the start date of the bottom graph…

Jean Parisot

January 29, 2015 1:57 pm

How do these data sets represent or transition from spatial error to gross, linear error?

1sky1

January 29, 2015 2:01 pm

Glad to see that people are waking up to the fact that BEST’s methodology is very far from that. In fact, in terms of imposing unverified assumptions upon the data base, it’s the worst.

Dave in Canmore

January 29, 2015 2:06 pm

Good catch! But am I the only one that wonders why we don’t just use good stations that haven’t moved and forget all the stats gymnastics? The degree of absurdity in the calculation of these larger averages defies anything rational or meaningful.

bw

Reply to Dave in Canmore

January 29, 2015 2:41 pm

Correct. There is too much quality control being applied to weather station data that was never intended to be merged into a global climate assessment. See the “surfacestations.org” site that shows thermometers located next to buildings, and in the exhaust stream of air conditioners. Complete failure of sampling methodology. There is no good scientifically maintained global surface temperature network. The US now has the climate reference network, USCRN, but that has only been in place about 10 years, and is USA only.
There are a few scattered weather stations with good data, some at universities. Those show zero warming.
http://hidethedecline.eu/pages/ruti/europe/western-europe-rural-temperature-trend.php
There are a few others, some in the US and one or two in Britain. Generally, stations producing good data for a century show the 1930s as the warmest decade, with a small decline since 1940, then a slow increase from the late 1970s until 2002. Current temps are about equal to the 1930s, not higher.
There are a few scientifically maintained weather stations in Antarctica since 1958. Those all show zero warming since 1958.

1sky1

Reply to bw

January 29, 2015 5:39 pm

1934 was the warmest year only in the USA average. Using exclusively stations that pass scientific scrutiny (instead of bogus “homogenization”) the global area-average peaked somewhat earlier, but saw a DEEP decline in the 1960’s and 1970’s. It has since recovered and, in 1998 and 2010, somewhat surpassed the earlier highs. The century-long trend is quite insignificant, nevertheless.

m

Reply to Dave in Canmore

January 29, 2015 2:58 pm

Dave in Canmore wrote: “am I the only one that wonders why we don’t just use good stations that haven’t moved and forget all the stats gymnastics?”
Sounds like a good idea to me. But it would not stop the critics. “They are ignoring 98% of the data!” Just look at the recent flap over ocean pH.

Steven Mosher

Reply to m

January 30, 2015 8:13 am

Yep. First the skeptics cried about the great thermometer drop out. They demanded that all the data be used. So BEST was formed with one goal.
Use all the data.

1sky1

Reply to m

January 30, 2015 5:42 pm

Skeptics often have lodged the justified complaint that quasi-century-long time series at many stations were quite arbitrarily truncated in the latter decades by GHCN, thus forcing the use of short time series from other stations to bring regional averages up to date. This, of course, introduces the uncontrolled variable of exact measurement location into the averaging process. Contrary to what Mosher claims, the cry for “all the data” was aimed to avoid such station shuffling, rather than a blind insistence that mere scraps of record from every available station should be used.

Rud Istvan

Reply to Dave in Canmore

January 29, 2015 3:46 pm

It is a good idea, and there are quite a few even though coverage is sparse. Long records like at the Valentia Observatory in Ireland or Sulina Rumania or Hachijyo Japan where there is no UHI. Shorter records like Taksi Siberia starting in 1936. For anomalies, perhaps sufficient.
Another problem is the oceans. All early data was trade route biased. Argo is newish.
IMO a sufficient answer is satellites starting 1979. UAH and RSS. After all, even diddled records show no warming until the mid to late 1970’s. That was the global cooling scare decade, see essay Fire and Ice. Holdren’s thing then. The satellite record is now long enough to be useful, at least for purposes like calibrating climate models and their performance.

Steven Mosher

Reply to Rud Istvan

January 30, 2015 8:17 am

Do some history. Look for skeptics complaining about NCDC dropping stations.
They accused people of sample bias.
Read McKitrick.
Hell, we wrote about the great thermometer drop out here.
So goal number one of BEST from its inception was to answer the skeptics’ complaint about folks not using all the data.

0

Lawrence Todd

January 29, 2015 2:15 pm

I have said this for a long time. I knew you were a genius Stephen, you agree with me!

0

The Pompous Git

Reply to Lawrence Todd

January 30, 2015 8:51 pm

From the OED. Genius:

A demon or spiritual being in general. Now chiefly in pl. genii (the sing. being usually replaced by genie), as a rendering of Arab. jinn, the collective name of a class of spirits (some good, some evil) supposed to interfere powerfully in human affairs.

0

Alx

January 29, 2015 2:34 pm

This is a form of homogenization, a process whereby stations in the dataset are made to be more similar to one another.

What?!?
Making stations like other ones in a region completely defeats the purpose of having that station. Why even have that station, shut it down and save some money. Even worse it destroys the notion of measuring anomalies, since you are not measuring the differences in a specific station over time but instead you are comparing that specific station to other stations. That does not make sense.
Instead just admit temperature data is a messy business, with no easy answers or conclusions, and keep working it.
It’s funny, in deflate-gate there are PhDs arguing whether or not the Patriots have had significantly fewer fumbles than other teams since 2006, when the NFL allowed teams to provide their own balls. The statistical arguments, margins of error and accusations of cherry picking are identical to the arguments around temperature data-sets. In football, we are talking about a very closed, defined system, with meticulous and precise collections of team and individual statistics, and experts cannot agree on how to determine if the Patriots’ fumble rate has fallen (global warming) and if so whether it is due to cheating (man-made) or to ordinary variation across the rest of the league (natural variation). If we cannot get statistical experts to agree how to pull conclusions from a relatively simple system like football, who in their right mind believes we can be conclusive about global temperatures?

0

Lane Core Jr. (@OneLaneHwy)

Reply to Alx

January 29, 2015 7:37 pm

“Making stations like other ones in a region completely defeats the purpose of having that station.” That strikes to the heart of the homogenization issue better than any other single sentence I’ve ever seen.

0

Steven Mosher

Reply to Alx

January 29, 2015 8:53 pm

“Making stations like other ones in a region completely defeats the purpose of having that station. Why even have that station, shut it down and save some money. ”
The process does not make stations like other stations.
1. All we have is the raw data.
2. there is no independent check on any historical station.
3. All the stations are used and we estimate a surface from that data which minimizes the error.
in other words.. given all this raw data what is the best estimate ( minimize the error) we can make for this region.
This is essentially the method suggested at skeptical sites.

0

Curious George

January 29, 2015 2:44 pm

I asked BEST people if they had a professional statistician on the team. Yes, a professor of statistics. But he does not appear as an author anywhere in their papers.

0

Zeke Hausfather

Reply to Curious George

January 29, 2015 5:20 pm

Both David Brillinger and Charlotte Wickham are professional statisticians. David was not a coauthor (though he was an advisor and helped develop the approach to uncertainty calculations). Charlotte was a coauthor.

0

Curious George

Reply to Zeke Hausfather

January 30, 2015 8:25 am

Hollywood directors use a name Smithee when they don’t want to take any credits for their movie.

0

Shub Niggurath

Reply to Curious George

January 29, 2015 6:09 pm

Charlotte Wickham is. Factoid: She is the sister of Hadley Wickham, creator of ggplot2.
Having statisticians on the team does not guarantee against errors in inference and reasoning.

0

Tony

January 29, 2015 2:52 pm

+/-0.1 deg C a hundred years ago, when the recording accuracy was +/-0.5 deg C. Yeh right …

0

ferd berple

January 29, 2015 4:10 pm

Does BEST or anyone else make their raw absolute temperatures available in a simple to use format?
Something like: lat, long, date, time, temp?
I’d like to try a run using sampling theory to see if we can’t do something that none of the major temp series appear to have tried. replace all of the averaging and adjustments with randomness, then only calculate the averages as the last step in the whole process. no anomalies, no adjustments no breakpoints, no grids, no homogenizations.

0

Zeke Hausfather

Reply to ferd berple

January 29, 2015 5:18 pm

Berkeley raw data is available here: http://berkeleyearth.org/data
GHCN-M raw data is available here: ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v3/
GHCN-D raw data is available here: ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/

0

Mike Jonas

Editor

Reply to Zeke Hausfather

January 29, 2015 5:56 pm

Are they really raw? ie, are they simply the original measurements, nothing added, nothing deleted, nothing changed?
If so, is there anyone here with enough time and computer power to do a nice straightforward analysis using simple rules:
1. A station can be split at a known move or station change, and not for any other reason.
2. Stations with insufficient data to be excluded. [Criteria to be determined in advance.].
3. All data for all selected stations is used unaltered, except for obvious typos/errors. [Note: the non-obvious errors can be expected to be reasonably few, to not introduce bias, and to be unidentifiable anyway.].
4. The algorithm for global averaging and for uncertainty to be determined in advance. Wherever there are possible alternatives, simplest wins. [Every complexity introduces its own uncertainty, and simplicity is important for others to be able to reproduce results.].
5. Everything to be documented.

0

Steven Mosher

Reply to Zeke Hausfather

January 29, 2015 8:13 pm

Are they really raw? ie, are they simply the original measurements, nothing added, nothing deleted, nothing changed?
Yes. the vast majority of data is daily data from GHCN-D and GCOS.
daily data is not adjusted.
before I ever went to work for berkeley I did my own global series using GHCN-D.. daily data.
no adjustments WHATSOEVER. guess what, you get an answer within a few percentage points.
If so, is there anyone here with enough time and computer power to do a nice straightforward analysis using simple rules:
1. A station can be split at a known move or station change, and not for any other reason.
instrument changes are also used, time of observation is also used, big gaps in data are also used
2. Stations with insufficient data to be excluded. [Criteria to be determined in advance.].
unnecessary. small amounts of data mean the series has small weight.
3. All data for all selected stations is used unaltered, except for obvious typos/errors. [Note: the non-obvious errors can be expected to be reasonably few, to not introduce bias, and to be unidentifiable anyway.].
there are over 10 known QC problems. All listed
4. The algorithm for global averaging and for uncertainty to be determined in advance. Wherever there are possible alternatives, simplest wins. [Every complexity introduces its own uncertainty, and simplicity is important for others to be able to reproduce results.].
5. Everything to be documented.
It’s been done six ways since Sunday.
Let’s go back to 2010
2010…
http://wattsupwiththat.com/2010/07/13/calculating-global-temperature/
“Bloggers and researchers who have developed reconstructions so far this year include:
Roy Spencer
Jeff Id
Steven Mosher
Zeke Hausfather
Tamino
Chad
Nick Stokes
Residual Analysis
And, just recently, the Muir Russell report”
Here is the bottom line.
pick ANY data source you like. GHCN-D, GHCN-M, GCOS, CRU, whatever
pick any method you like: CAM, RSM, Least squares, Kriging.
apply adjustments or DONT apply adjustments
Use only rural or use all stations.
Guess what?
Your answers will not differ in any way that has any impact on the theory of global warming.
There was an LIA.
It is getting warmer
The question is not whether it has warmed .7C or .8C or .9C
The question is
A) how much of that warming is due to man
B) what future warming can we expect.

0

Robert B

Reply to Zeke Hausfather

January 29, 2015 11:38 pm

The question is how much of the warming since 1950 is natural and how much is due to fossil fuel use. I’ve seen estimates from modellers that it is as low as 0.18°C, so “whether it has warmed .7C or .8C or .9C” is important.

0

tonyb

Editor

Reply to Zeke Hausfather

January 29, 2015 11:48 pm

Mosh says
“There was an LIA. It is getting warmer
The question is not whether it has warmed .7C or .8C or .9C
The question is;
A) how much of that warming is due to man
B) what future warming can we expect.”
—– —— —
Despite the errors and uncertainties it is undoubtedly (and unsurprisingly) getting warmer since the LIA.
It has probably warmed by somewhere around Mosh’s estimates.
He asks two good questions in A and B)
There is a C) however, which is much more interesting.
C) Is this modern warming unusual, or merely part of a cyclical trend of rising and declining temperatures that can be traced throughout the Holocene?
tonyb

0

dbstealey

Reply to Zeke Hausfather

January 30, 2015 12:16 am

Tonyb and Mosh,
Good questions!
Maybe this graph will help answer them:
http://i.snag.gy/BztF1.jpg

0

climatereason

Editor

Reply to Zeke Hausfather

January 30, 2015 12:40 am

dbstealey
I guess we need to add on a bit for warming in that area since the core dates of 2004 (although the two warmest consecutive decades in Greenland remain the 1930’s and 1940’s according to Phil Jones).
In posing my question C) I wanted to add historical context. If today is genuinely the ‘warmest ever’ that is significant. If it isn’t, it puts today’s values into context. I can’t see any indication that the modern era is any warmer than past eras such as the MWP or Roman era.
Climatologists seem to be fixated on parsing those instrumental values which have a very short history. Climate did not begin in 1980 or even 1880. Incidentally, as you know, the general warming has been going on for much longer than the GISS start date.
So I guess there is a question D) What has caused it to warm for the last 300 years?
tonyb

0

Brandon Shollenberger

Reply to Zeke Hausfather

January 30, 2015 12:46 am

I find it weird Steven Mosher says:

There was an LIA.
It is getting warmer
The question is not whether it has warmed .7C or .8C or .9C
The question is
A) how much of that warming is due to man
B) what future warming can we expect.

I think most people would say it’s difficult to answer A or B without having a decent idea what the answer is to the question Mosher dismisses. It’s difficult to see how one can say “how much of that warming is due to man” without knowing how much warming there was.
Similarly, if we can’t tell how much warming there has been, exactly how can we decide how much warming there will be? We use our observations of changes in our world to estimate what changes there will be in the future. If we don’t know what changes there have been, we won’t know what changes there will be.
I know some people would just wave that all away, saying we don’t need to worry about small changes like that, but the difference between .7C and .9C is .2C. That could easily be a decade or two worth of warming. I think most policy makers would like to know any problem they face could be mistimed by as much as two decades.
We’re often told there is a consensus humans have caused 50+% of the observed warming. If we don’t know how much warming there’s been, how will we rate somebody who estimates humans have caused .4C of warming?
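The arithmetic behind that last question can be made concrete. The snippet below is only an illustrative sketch: the 0.4C human contribution and the three totals are the figures quoted in this comment, and the function name `attributed_fraction` is made up for the example. It simply divides the one by the other to show how the implied human share moves relative to the 50% line as the total changes.

```python
def attributed_fraction(human_warming_c, total_warming_c):
    """Share of the observed warming attributed to humans (0..1)."""
    return human_warming_c / total_warming_c


human = 0.4  # hypothetical estimate of human-caused warming, in C
for total in (0.7, 0.8, 0.9):  # candidate values for total observed warming, in C
    share = attributed_fraction(human, total)
    side = "above" if share > 0.5 else "at or below"
    print(f"total {total:.1f}C -> human share {share:.0%} ({side} 50%)")
```

The same 0.4C estimate lands on different sides of the 50% threshold depending on which total is used, which is the point being argued here.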

0

Mi Cro

Reply to Zeke Hausfather

January 30, 2015 6:59 am

Steven Mosher
January 29, 2015 at 8:13 pm

If so, is there anyone here with enough time and computer power to do a nice straightforward analysis using simple rules:
1. A station can be split at a known move or station change, and not for any other reason.
instrument changes are also used, time of observation is also used, big gaps in data are also used
2. Stations with insufficient data to be excluded. [Criteria to be determined in advance.].
unnecessary. small amounts of data mean the series has small weight.
3. All data for all selected stations is used unaltered, except for obvious typos/errors. [Note: the non-obvious errors can be expected to be reasonably few, to not introduce bias, and to be unidentifiable anyway.].
there are over 10 known QC problems. All listed
4. The algorithm for global averaging and for uncertainty to be determined in advance. Wherever there are possible alternatives, simplest wins. [Every complexity introduces its own uncertainty, and simplicity is important for others to be able to reproduce results.].
5. Everything to be documented.

This is basically what I’ve done, and because I’m not doing the same thing as what everyone else has done, I get different results.
These are the headlines:
The big swings in average surface temp are from large swings in min temp at different regional location at different times around the world.
There is no loss in night time cooling in surface data collected since the 50’s. Daily rising temp is and has been well matched by the following night’s cooling.
Normal swings in surface temp (even for periods as short as an hour, 2F/hour cooling in clear skies sub freezing temps) far exceed any possible effect from Co2.
Land use changes make a larger impact than Co2.
Changes in the amount of clouds make a larger impact than Co2.
There does appear to be a change in the rate at which annual temps change during the year, but this rate also appears to be changing direction, potentially back towards the same rate as we had in the recent past.
So, has anyone else looked at the rate of change in surface data for daily and annual temp cycles? Isn’t there a lot of useful information in this data that everyone else throws away?

0

ferdberple

Reply to Zeke Hausfather

January 30, 2015 7:19 am

If so, is there anyone here with enough time and computer power to do a nice straightforward analysis using simple rules:

There is a MUCH simpler method to calculate average temperature.
1. Start with the very basic observation that your stations change over time.
2. That any adjustments to make your stations appear “unchanged” is thus a source of error.
3. Therefore, it is useless to try and build a temperature record based on fixed stations, as you can never be sure of how much error you introduced.
4. Instead, assume that your station readings are simply random samples in time and space.
5. apply sampling theory to pick random samples that accurately recreate the spatial and temporal distribution of the earth surface over a year.
6. these samples should fit a normal distribution – check this assumption
7. calculate the average temperature and standard deviation and standard error for the year.
This result should be at least as accurate as any gridding method and has huge computational advantages. Anyone with a modern PC and a good sized drive should be able to tackle this. All that is required is a small bit of custom programming to build and analyze the samples. I’ll probably use sql, as the problem lends itself readily to analysis on a database, but many different tools should be able to do the job.
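Steps 4 to 7 above could be sketched roughly as follows. This is one possible reading, not the commenter's actual program: it uses stratified random sampling over equal-area latitude bands and calendar months as a stand-in for "recreating the spatial and temporal distribution", it ignores longitude, it silently skips any empty stratum, and the readings are synthetic numbers generated only so the code runs. The names `stratum`, `sample_year` and `summarize` are invented for the example, and the normality check in step 6 is not shown.

```python
import math
import random
import statistics


def stratum(lat_deg, month, n_bands=6):
    """Assign a reading to an (equal-area latitude band, month) cell.

    Equal intervals in sin(latitude) cover equal areas of a sphere.
    """
    s = math.sin(math.radians(lat_deg))
    band = min(int((s + 1) / 2 * n_bands), n_bands - 1)
    return band, month


def sample_year(readings, per_stratum=20, n_bands=6, seed=0):
    """Draw the same number of readings (with replacement) from every cell."""
    rng = random.Random(seed)
    cells = {}
    for lat, month, temp in readings:
        cells.setdefault(stratum(lat, month, n_bands), []).append(temp)
    sample = []
    for key in sorted(cells):
        sample.extend(rng.choices(cells[key], k=per_stratum))
    return sample


def summarize(sample):
    """Mean, standard deviation and standard error of the sample."""
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)
    return mean, sd, sd / math.sqrt(len(sample))


# Synthetic readings (latitude, month, temperature in C), purely illustrative.
data_rng = random.Random(42)
readings = []
for _ in range(5000):
    lat = data_rng.uniform(-90, 90)
    month = data_rng.randint(1, 12)
    temp = 30 * math.cos(math.radians(lat)) - 15 + data_rng.gauss(0, 3)
    readings.append((lat, month, temp))

sample = sample_year(readings)
mean, sd, se = summarize(sample)
print(f"n={len(sample)} mean={mean:.2f} sd={sd:.2f} se={se:.3f}")
```

Drawing equally from each cell is what keeps a densely instrumented region from dominating the average; whether that is an adequate substitute for gridding is exactly the question under debate in this thread.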

0

RACookPE1978

Editor

Reply to ferdberple

January 30, 2015 8:39 am

ferdberple
A long blockquote copy, but worth repeating here.

There is a MUCH simpler method to calculate average temperature.
1. Start with the very basic observation that your stations change over time.
2. That any adjustments to make your stations appear “unchanged” is thus a source of error.
3. Therefore, it is useless to try and build a temperature record based on fixed stations, as you can never be sure of how much error you introduced.
4. Instead, assume that your station readings are simply random samples in time and space.
5. apply sampling theory to pick random samples that accurately recreate the spatial and temporal distribution of the earth surface over a year.
6. these samples should fit a normal distribution – check this assumption
7. calculate the average temperature and standard deviation and standard error for the year.
This result should be at least as accurate as any gridding method and has huge computational advantages. Anyone with a modern PC and a good sized drive should be able to tackle this. All that is required is a small bit of custom programming to build and analyze the samples.

But, you’re wrong. 8<)
No programming is needed to implement your idea.
Just run the same program as-is that is already processing the floating thermometers (constantly moving, irregular-time-of-day reporting) in the ARGO buoys.

0

Steven Mosher

Reply to Zeke Hausfather

January 30, 2015 8:22 am

I didn’t dismiss the question.
You don’t understand sensitivity analysis.
Think more. Comment less.

0

dbstealey

Reply to Zeke Hausfather

January 30, 2015 10:10 am

tonyb asks:
D) What has caused it to warm for the last 300 years?

Well, that is the central question here, isn’t it? The answer is, we don’t know. Just like we don’t know the cause of the LIA.
But looking at the chart I posted above, we see that current temperatures are very normal. Nothing either unprecedented or unusual is occurring. So I refer you to the climate Null Hypothesis and Mr. Billy Ockham for a reasonable conclusion…

0

David Socrates

Reply to Zeke Hausfather

January 30, 2015 10:23 am

Dbstealey…
..
1) the chart you posted doesn’t show global temps, it shows only a single geographical location
2) the chart you posted ends in 1855, and does not include the recent warming.
3) This study —–> http://www.physics.mcgill.ca/~gang/eprints/eprintLovejoy/neweprint/Anthro.climate.dynamics.13.3.14.pdf should take care of your concerns about the null hypothesis.

0

Nick Stokes

Reply to Zeke Hausfather

January 30, 2015 2:30 pm

“climatereason January 30, 2015 at 12:40 am
I guess we need to add on a bit for warming in that area since the core dates of 2004. (although the two warmest consecutive decades in Greenland remain the 1930’s and 1940’s according to Phil Jones.”
Not since 2004, but 1855. That Allen data stops in 95BP. And P=Present=1950. The Mann Hockey Stick label is false.

0

ferdberple

Reply to Zeke Hausfather

January 30, 2015 5:32 pm

Just run the same program as-is that is already processing the floating thermometers (constantly moving, irregular-time-of-day reporting) in the ARGO buoys.

the ARGO buoys don’t satisfy point 5:
5. apply sampling theory to pick random samples that accurately recreate the spatial and temporal distribution of the earth surface over a year.
The advantage of sampling is that even if the data isn’t normally distributed, the sample will be so long as ARGO ocean temps within the year are bound by the central limit theorem. this allows you to apply lots of very well known statistical methods to your results.
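A small sketch of the central limit theorem point, with one caveat stated up front: strictly, it is the distribution of sample means that tends toward normal, not the individual sampled values. The population below is synthetic and deliberately skewed (exponential); it is not ARGO data, and the helper names `sample_means` and `skewness` are invented for the example.

```python
import random
import statistics


def sample_means(population, sample_size, n_samples, seed=0):
    """Means of repeated random samples drawn with replacement."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]


def skewness(values):
    """Population skewness: 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - m) / sd) ** 3 for v in values)


pop_rng = random.Random(7)
population = [pop_rng.expovariate(1.0) for _ in range(10000)]  # strongly skewed
means = sample_means(population, sample_size=50, n_samples=2000)

print(f"skewness of raw values:   {skewness(population):.2f}")
print(f"skewness of sample means: {skewness(means):.2f}")
```

The sample means come out much closer to symmetric than the raw values, which is what allows the standard normal-theory tools to be applied to them.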

0

Zeke Hausfather

January 29, 2015 5:16 pm

This analysis is incorrect. The baseline period chosen has only a minor impact on uncertainty (we’ve tried it with various different ones). Rather, the drop in uncertainty around 1960 is almost entirely due to a reduction in spatial uncertainty. Prior to 1960 there is no data at all in one of the world’s continents, Antarctica, which significantly increases the uncertainty in the global reconstruction.
See this discussion of the uncertainties present in Berkeley, GISS, and Hadley methods using synthetic data: http://static.berkeleyearth.org/memos/robert-rohde-memo.pdf
Also Figure 8 (and the associated discussion) in the Berkeley methods paper: http://static.berkeleyearth.org/papers/Methods-GIGS-1-103.pdf

0

Zeke Hausfather

Reply to Zeke Hausfather

January 29, 2015 5:28 pm

Specifically, note how the reduction in uncertainty in 1960 is almost entirely due to changes in spatial uncertainty, not statistical uncertainty.
http://i81.photobucket.com/albums/j237/hausfath/ScreenShot2015-01-29at52743PM_zps4ad34718.png

0

Brandon Shollenberger

Reply to Zeke Hausfather

January 29, 2015 8:54 pm

Zeke, it is inappropriate to simply claim an “analysis is incorrect” when two separate points were made and you only take issue with one. The point I am personally more troubled by is the fact that, according to BEST’s own words and code, BEST does not rerun its breakpoint calculations as part of its jackknife algorithm. This means BEST does not account for any uncertainty in its homogenization process. Nothing you say touches upon that issue. As such, the analysis as a whole cannot be incorrect.
That said, let’s consider what you say:

This analysis is incorrect. The baseline period chosen has only a minor impact on uncertainty (we’ve tried it with various different ones).

This is a claim which merits more than a passing, “We’ve examined this and it doesn’t matter.” If the impact exists at all, it is something BEST ought to have discussed at some point. And if the impact truly is minor, there ought to be some demonstration of such. Regardless, the crux of the issue is you say:

Rather, the drop in uncertainty around 1960 is almost entirely due to a reduction in spatial uncertainty. Prior to 1960 there is no data at all in one of the world’s continents, Antarctica

This is clearly not true as BEST’s website shows there is data for Antarctica prior to 1960. In fact, there doesn’t seem to be a particularly significant change in the amount of data in Antarctica around 1960.
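The jackknife concern raised at the top of this comment can be illustrated with a toy example. To be clear, this is not BEST's code or BEST's breakpoint method: `adjust` is a crude made-up stand-in for homogenization (it pulls outliers to the network median), the station values are synthetic, and the eight-group split is an assumption chosen for the sketch. The only thing it shows is the structural difference between adjusting once and then resampling, versus redoing the adjustment inside every resample.

```python
import math
import random
import statistics


def adjust(values, threshold=1.0):
    """Toy homogenization: replace values far from the median with the median."""
    med = statistics.median(values)
    return [med if abs(v - med) > threshold else v for v in values]


def jackknife_se(values, estimator, n_groups=8, seed=0):
    """Delete-a-group jackknife standard error of estimator(values)."""
    rng = random.Random(seed)
    order = list(range(len(values)))
    rng.shuffle(order)
    groups = [order[g::n_groups] for g in range(n_groups)]
    estimates = []
    for g in range(n_groups):
        dropped = set(groups[g])
        kept = [v for i, v in enumerate(values) if i not in dropped]
        estimates.append(estimator(kept))
    centre = statistics.fmean(estimates)
    variance = (n_groups - 1) / n_groups * sum((e - centre) ** 2 for e in estimates)
    return math.sqrt(variance)


# Synthetic station values; roughly 10% carry an artificial +2.0 offset.
data_rng = random.Random(1)
raw = [
    data_rng.gauss(0.8, 0.5) + (2.0 if data_rng.random() < 0.1 else 0.0)
    for _ in range(400)
]

# Adjustment computed once on all stations, then held fixed while resampling.
se_frozen = jackknife_se(adjust(raw), statistics.fmean)
# Adjustment recomputed from scratch inside every resample.
se_rerun = jackknife_se(raw, lambda kept: statistics.fmean(adjust(kept)))

print(f"adjustment frozen: {se_frozen:.4f}")
print(f"adjustment rerun:  {se_rerun:.4f}")
```

In the first call the resampling never sees the adjustment step, so any variability that step contributes cannot show up in the reported spread; in the second it can. How large the difference is for a real homogenization algorithm is not something this toy can say.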

0

Zeke Hausfather

Reply to Brandon Shollenberger

January 29, 2015 9:24 pm

Brandon,
I should have said 1955, not 1960, as that is when the dip in the graph you highlighted occurs.
I’m less familiar with the interactions between the breakpoint calculations and uncertainty bounds. I know we have experimented with a number of different parameters for breakpoint detection and looked at the results, but I’ll have to check with Robert to see how that factors into statistical uncertainty.
Regardless, the issue you point out regarding the decline in total uncertainty in the original post relates to Antarctic coverage.

0

Brandon Shollenberger

Reply to Zeke Hausfather

January 29, 2015 10:25 pm

Zeke, even if we change your stated value to 1955, your claim is not true. Berkeley Earth requires segments (after breakpoints are calculated) cover at least ten years to be used. There are exactly 54 stations in that list with 120 or more months worth of data which are labeled as “inside” the Antarctica region. Six begin prior to 1950, six more begin prior to 1961, and only four more began prior to 1970. That clearly refutes your claim there was no data prior to 1960, and it also shows there was no particularly large increase in data coverage at the time of the breakpoint.
But there are issues with how BEST’s website defines the Antarctica region so we should consider more stations in the list. Allowing for stations as far as 500km outside the region, we find 61 more stations with 120 months or more data. Adding these in increases the previous numbers to 10 before 1950, 22 more before 1961 and 10 more before 1970 though some of these clearly aren’t in Antarctica.
Here is a breakdown of the 32 stations prior to 1961:
BASE ORCADAS / SOUTH ORKN – No problems, available as early as 1903.
ISLAS ORCADA – Partial duplicate of the above station.
South Orkney – Usable from 1917-1934.
Bellingshausen AWS – Usable from 1947 on.
Faraday – Usable as early as 1944.
Deception Island South Shetan – Usable from 1947-1967.
Esperanza + Hope Bay – Usable from 1952.
Base Esperanza – Partial duplicate of the above, extending further into the future.
Adelaide Island – Unusable prior to 1962.
Signey Island South Orkney – Available as early as 1947, but rendered unusable prior to 1956 due to an “empirical breakpoint” with a small magnitude. How an “empirical breakpoint” can be calculated before there is enough data to estimate the area’s temperature is a fascinating question.
Dumont D’urville – Usable as early as 1956.
Destacamento Naval Deception – No usable data.
Est. Naval Almirante Brown – No usable data prior to 1973.
Admirality Bay – Available as early as 1951, but rendered unusable due to an “empirical breakpoint” of incredibly small magnitude in 1957. Again, how this breakpoint was calculated is beyond me.
Dest. Naval Melchior – Usable from 1951-1961.
Mawson Base – Usable from 1954 on.
Belgrano – Usable from 1955-1979.
General Belgrano – Duplicate of the above.
Belgrano I – Duplicate of the two above, with a .1 degree shift.
Base Belgrano II – Unusable prior to 1980.
McMurdo Sound NAF – Unusable prior to 1961 due to a station move.
Mirny – Usable from 1956 on.
Amundsen Scott – Usable from 1957 on.
Vostok – Usable from 1957 on.
Byrd Station – Usable from 1957-1971 (though it uses outdated data).
Scott Base – Usable from 1957 on.
S.A.N.A.E. Station – Not usable prior to 1962.
Syowa – Not usable prior to 1966.
Adare Hallett – Not usable at all.
Davis – Not usable before 1969.
I think I missed two somewhere, but that list still shows eight stations with usable data prior to 1955. The number would be 10 if not for inexplicable “empirical breakpoints” added to two of them. There are only seven stations added between 1955 and 1960. Are you saying seven stations can cause the uncertainty of the BEST record to plummet so dramatically? (I’m excluding the obvious duplicates because obvious duplicates shouldn’t be used.)
I can accept I may have been wrong about this issue, but I think assuming a methodological issue is more generous than assuming you rest so much on a mere seven stations.
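For anyone who wants to repeat the tally from the station list, the filtering described above (at least 120 months of data, then grouping by the year a record starts) might look something like this. The station tuples here are invented placeholders, not entries from the Berkeley Earth list, and the function name `count_by_start` and the bin labels are made up for the sketch.

```python
from collections import Counter


def count_by_start(stations, min_months=120):
    """Count stations with enough data, binned by the year their record begins."""
    counts = Counter()
    for name, first_year, n_months in stations:
        if n_months < min_months:
            continue  # too short to be used
        if first_year < 1950:
            counts["before 1950"] += 1
        elif first_year < 1961:
            counts["1950-1960"] += 1
        elif first_year < 1970:
            counts["1961-1969"] += 1
        else:
            counts["1970 or later"] += 1
    return counts


# Hypothetical records: (name, first year of data, months of data).
stations = [
    ("Station A", 1903, 1300),
    ("Station B", 1947, 600),
    ("Station C", 1956, 700),
    ("Station D", 1957, 90),    # dropped: fewer than 120 months
    ("Station E", 1966, 400),
    ("Station F", 1985, 250),
]

for label, n in count_by_start(stations).items():
    print(f"{label}: {n}")
```

A tally like this only counts records by length and start date; whether a given record is actually usable before a breakpoint still has to be checked station by station, as in the breakdown above.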

0

ferdberple

Reply to Zeke Hausfather

January 30, 2015 7:37 am

If you have no data for Antarctica, simply use the climate science (TM) formula:
antarctica temp = (south america temp + new zealand temp) / 2
After all, antarctica is half-way between south america and new zealand, so its temp should be the average of these two stations.
According to climate science (TM) homogenization and interpolation.

0

AndyZ

Reply to ferdberple

February 5, 2015 8:53 pm

I laughed

0

Epiphron Elpis

January 29, 2015 5:20 pm

[snip – Epiphron Elpis is yet another David Appell sockpuppet.]

0

Mike Jonas

Editor

Reply to Epiphron Elpis

January 29, 2015 6:12 pm

“I’m prepared to accept” (not “I accept”) implies the precondition that their methods are valid and transparent. OK, that should have been stated explicitly at the time, but in normal society it is perfectly acceptable to stop trusting someone when they prove to be untrustworthy. BEST, by the faults described many times here and elsewhere, have proved to be untrustworthy.
More importantly, what premise has been proved wrong? BEST too shows global warming way below model predictions, doesn’t it?

0

Epiphron Elpis

Reply to Mike Jonas

January 29, 2015 6:40 pm

We can recognize weaseling when we see it.
[Yes – Epiphron Elpis is yet another David Appell sockpuppet.]

0

Epiphron Elpis

Reply to Mike Jonas

January 29, 2015 7:19 pm

[snip – Epiphron Elpis is yet another David Appell sockpuppet.]

0

dbstealey

Reply to Mike Jonas

January 29, 2015 7:54 pm

Yes, it does. Any more questions?

0

Steven Mosher

Reply to Mike Jonas

January 29, 2015 8:47 pm

we do.
http://berkeleyearth.org/graphics/model-performance-against-berkeley-earth-data-set
Now of course, who would show how the models get it wrong?
this is one of my favorites [showing] how the models get it wrong.
http://berkeleyearth.org/graphics/model-performance-against-berkeley-earth-data-set#warming-in-the-mid-northern-hemisphere-since-1950

0

Eliza

January 29, 2015 6:14 pm

This is why BEST etc. are all BS (based on this data)
https://notalotofpeopleknowthat.wordpress.com/2015/01/26/all-of-paraguays-temperature-record-has-been-tampered-with/
It’s time these people/organizations were brought to account for damages.

0

Epiphron Elpis

Reply to Eliza

January 29, 2015 6:39 pm

[snip – Epiphron Elpis is yet another David Appell sockpuppet.]

0

Steven Mosher

Reply to Eliza

January 29, 2015 8:41 pm

0

Shub Niggurath

Reply to Steven Mosher

January 30, 2015 3:31 am

The video directly contradicts your/BEST claims for why adjustments were needed for local records like Puerto Casado.

0

Pamela Gray

January 29, 2015 7:04 pm

Any statistical analysis based on crappy ill-controlled observations needs to be taken with a grain of salt. That said, the results may indeed show a warming trend. So what? The results of said analysis cannot speak to cause and effect.
That last point is the main contention I have with the mad rush to blame what I breathe out. What causes warming trends? What causes cooling trends? What causes wetter decades? Drier decades? What causes erratic swings? The too-quick jump to the AGW cause will bite their ass big time. Too bad I will likely be pushing up daisies when the current cadre of conclusion-jumping researchers get their comeuppance.

0

Epiphron Elpis

Reply to Pamela Gray

January 29, 2015 7:20 pm

[snip – Epiphron Elpis is yet another David Appell sockpuppet.]

0

Konrad.

Reply to Epiphron Elpis

January 29, 2015 9:34 pm

Epiphron, the spirit of Prudence, Shrewdness, and Thoughtfulness.
Not so shrewd. Controlled observations is what science is all about. BEST was a crazed attempt to manufacture a warming signal from surface stations provably suffering micro and macro site degeneration. Ie: uncontrolled observations. The data was clearly unfit for purpose. That they tried to unscramble the egg with more time in the blender speaks to motive. It speaks loudly.
And the hope and expectation you tacked on the end? Well, don’t get your hopes up. I’ve looked into the crystal ball and I can tell you what the future holds –

December 2015
At the Paris climate negotiations, hampered by heavy snowfalls, the parties come to a historic agreement. To hold next year’s meeting in Barbados.
Unfortunately the delegates at Paris 2015 don’t get time to debate the IPCC’s new and improved “buck each way” position, however they agree it should be on the Barbados 2016 agenda. The next location debate already goes into extra time, with the Barbados compromise only being reached “at the 11th hour”.
And Barbados is a compromise. The Chinese stall negotiation by asking for the moon, literally. The Chinese argue that lunar orbit is the perfect place for 2016 delegates to observe the utter insignificance of human effects on climate. While other delegates generally agree that China is an impoverished developing nation that should be allowed to emit CO2 forever, the location is rejected. Other nations argue that the lunar location discriminates against other impoverished developing nations that, unlike China, do not have the space launch capability to reach lunar orbit.
Australia’s suggestion of a low cost international teleconference is also dismissed when the issue of adverse impacts on the struggling airline and pre-mix pina colada industries is raised.
It is agreed that Barbados still involves extensive first class airline travel. Also that delegates will still be able to at least view the moon. From the beach at night. While holding a pina colada in a pineapple. Finally it is the acknowledgement that, unlike December in Paris, the only umbrellas required will be purely decorative that gets the Barbados vote over the line at 3.00am.

Epiphron, you and yours should have been far more prudent. The collapse of the GoreBull Warbling hoax is going to destroy the professional Left from one side of the planet to the other. Adding radiative gases to the atmosphere in no way reduces our radiatively cooled atmosphere’s ability to cool our solar heated oceans. Your tears are as nectar.

0

Epiphron Elpis

Reply to Epiphron Elpis

January 29, 2015 9:43 pm

[snip – Epiphron Elpis is yet another David Appell sockpuppet.]

0

Mark T

January 29, 2015 7:50 pm

This might have something to do with the relationship between BEST and Mosher… just sayin.
Mark

0

Steven Mosher

Reply to Mark T

January 29, 2015 8:40 pm

actually all the code was done before I got there.
I was asked to join because I was critical of their UHI approach.
go figure.
but your conspiracy theory is noted.
did we land on the moon?

0

Konrad.

Reply to Steven Mosher

January 29, 2015 10:40 pm

“but your conspiracy theory is noted. did we land on the moon?”
Good lord. Trotting out that old wheeze!
Lewandowsky’s inane attempt to pathologise dissent toward your ridiculous hoax has been thoroughly discredited.
Some of the most prominent sceptics today were involved in the Apollo program. Several walked on the moon. In pre AGW hoax days I met the geologist from 17, Harrison Schmitt. We talked gas pressurised joint design. Smart man. Now a dedicated sceptic to your inane “adding radiative gases to the atmosphere reduces the atmosphere’s radiative cooling ability” hoax.
Did you just turn your “Snivelling Stupidity” dial all the way to 11 Steven?! Men landed on the moon. You can still bounce a laser off the reflectors they left there. JAXA has photographs of landing stages from lunar orbit. And a great number of the engineers, flight controllers and astronauts who made it happen are sceptical regarding your CO2 causes warming BS. How much more epic can your fail possibly be??
Your only fame -”First sleeper at WUWT to snap”
And I remember when you snapped. 2010. The M2010 discussion paper. You and your cronies did something most foul. You attacked a legitimate paper in meteorology because it threatened your climastrology BS. M2010 introduced the concept of horizontal flows being affected by diabatic processes (radiative cooling). You and yours just couldn’t have that. That may raise the question of radiative subsidence in tropospheric convective circulation. You are on record as one of the “Knights of Consensus” who rode out to trash that paper. The record is permanent. Your shame is forever. You did not just scrape the bottom of the barrel, you clawed through the rotting timbers at its base and got yourself elbows deep in the feculent ooze below.
Forever, Steven. Forever.

0

Steven Mosher

January 29, 2015 8:24 pm

Zeke has posted the graph above that illustrates Brandon’s misdiagnosis of the change in uncertainty.
I would expect some sort of editorial correction or update to the head post.
After all, we are practicing blog peer review here.
Here are some other comments from Dr. Rohde.
1) First off, the step-wise shift in uncertainty has nothing to do with normalization or data processing issues. The large increase in uncertainty prior to about 1950 is a simple consequence of the complete absence of weather stations in Antarctica prior to the 1950s. The ability to estimate the global land average dramatically improved once we finally started putting instruments in Antarctica to start placing some constraints on the final 10% of the Earth’s land area. If one looks in our publication where the spatial and statistical parts of the uncertainty calculation are reported separately, the step change is entirely in the spatial part (i.e. a result of reduced coverage) and isn’t related to the statistical uncertainties which have no such step at that time.
So Brandon is wrong. Here is what you see if you read our paper.
As you can see, and as Zeke and Robert explained, the jump in uncertainty happens in the spatial uncertainty. This is pretty clear in the paper.
http://i61.tinypic.com/mm3ar9.png
Continuing:
2) In the statistical calculation, the choice of a 1960-2010 baseline was done in part for a similar reason, the incomplete coverage prior to the 1950s starts to conflate coverage uncertainties with statistical uncertainties, which would result in double counting if a longer baseline was chosen. The comments are correct though that the use of a baseline (any baseline) may artificially reduce the size of the variance over the baseline period and increase the variance elsewhere. In our estimation, this effect represents about a +/- 10% perturbation to the apparent statistical uncertainties on the global land average. Again, this is completely separate from the large step-increase in uncertainty associated with the absence of Antarctic data.
3) With regards to homogenization, the comments are only partially correct. The step that estimates the timing of breakpoints is presently run only once using the full data set. However, estimating the size of an apparent biasing event is a more general part of our averaging code and gets done separately for each statistical sample. Hence the effect of uncertainties in the magnitude, but not the timing, of homogeneity adjustments is included in the overall statistical uncertainties. Conceptually, it would be desirable to rerun the breakpoint timing detection code on the subsamples as well, to capture the uncertainty in breakpoint timing. However, the effect of doing that is generally very small. Uncertainties in the magnitude of breakpoint adjustments generally contribute much more to the overall uncertainty than the typical errors in breakpoint timing. The breakpoint detection code is also quite computationally intensive because of the large number of station comparisons involved. After performing some tests on this issue, it was decided not to rerun the breakpoint detection code on each subsample due to the very small magnitude of the effect vs. large computational cost.
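As a rough illustration of the subsampling idea in point 3, here is a minimal delete-one-group jackknife sketch in Python. It is not Berkeley Earth's code: the function name `jackknife_se`, the interleaved grouping, and the use of a plain mean as the estimator are all assumptions made for the example. Anything decided once on the full data set (such as breakpoint timing, as described above) would sit outside the loop, while whatever the estimator computes is redone for every subsample.

```python
import statistics


def jackknife_se(values, n_groups, estimator=statistics.fmean):
    """Delete-one-group jackknife.

    Returns (mean of the subsample estimates, jackknife standard error).
    Hypothetical helper for illustration only.
    """
    # Deterministic interleaved grouping (an arbitrary choice for this sketch).
    groups = [values[i::n_groups] for i in range(n_groups)]

    estimates = []
    for left_out in range(n_groups):
        # Recompute the estimate with one group withheld.
        kept = [v for g, grp in enumerate(groups) if g != left_out for v in grp]
        estimates.append(estimator(kept))

    center = statistics.fmean(estimates)
    # Standard delete-a-group jackknife variance for equal-sized groups.
    variance = (n_groups - 1) / n_groups * sum((e - center) ** 2 for e in estimates)
    return center, variance ** 0.5
```

With a plain mean and one value per group, this reproduces the familiar standard error of the mean. The cost point made above also shows up here: every extra step placed inside the loop is paid once per subsample.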
I hope this helps

0

Brandon Shollenberger

Reply to Steven Mosher

January 29, 2015 10:48 pm

Steven Mosher’s comment repeats the same false argument as Zeke made above, though Zeke said 1960 (then 1955) while this comment says 1950:

1) First off, the step-wise shift in uncertainty has nothing to do with normalization or data processing issues. The large increase in uncertainty prior to about 1950 is a simple consequence of the complete absence of weather stations in Antarctica prior to the 1950s.

But as I’ve shown above, the claim there is a “complete absence of weather stations in Antarctica prior to the 1950s” is flat-out wrong. I can accept I may have misdiagnosed the cause of this step-change, but we’ve now had Zeke, Mosher and Rohde all make false claims in explaining it.

The comments are correct though that the use of a baseline (any baseline) may artificially reduce the size of the variance over the baseline period and increase the variance elsewhere. In our estimation, this effect represents about a +/- 10% perturbation to the apparent statistical uncertainties on the global land average. Again, this is completely separate from the large step-increase in uncertainty associated with the absence of Antarctic data.

It is good to know we can all agree I was right about this being a real problem. As far as I can tell, BEST has never discussed this before, so that is progress.
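For anyone who wants to see the direction of the baseline effect for themselves, here is a minimal sketch in Python. It is a toy, not BEST's procedure: the function name, the series length, the five-step baseline window and the use of pure white noise are all assumptions chosen to make the effect visible. Each simulated series is re-centered on its own baseline mean, and the spread across series is then compared inside and outside the baseline window.

```python
import random
import statistics


def spread_inside_vs_outside(n_series=2000, n_steps=20,
                             base_start=15, base_end=20, seed=1):
    """Return (mean variance inside baseline, mean variance outside baseline).

    Hypothetical toy: unit-variance white noise, each series aligned to
    its own mean over the baseline window.
    """
    rng = random.Random(seed)
    base = range(base_start, base_end)

    aligned = []
    for _ in range(n_series):
        raw = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]
        # Subtract the series' own baseline-period mean.
        offset = statistics.fmean([raw[t] for t in base])
        aligned.append([x - offset for x in raw])

    # Variance across series at each time step.
    var_by_step = [statistics.pvariance([s[t] for s in aligned])
                   for t in range(n_steps)]
    inside = statistics.fmean([var_by_step[t] for t in base])
    outside = statistics.fmean([var_by_step[t] for t in range(n_steps)
                                if t not in base])
    return inside, outside
```

For white noise with a baseline of B points, the expected variance is scaled by (1 - 1/B) inside the window and (1 + 1/B) outside it, so the alignment step alone shrinks the apparent spread in one place and inflates it in the other. How large the effect is for real temperature data depends on the baseline length and the noise structure, which this toy does not attempt to model.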
I’ll apologize for my misdiagnosis if it is confirmed I did make one. It may be true the addition of only ten or so stations in Antarctica causes the enormous shift in BEST’s uncertainty. I think that’s a worthwhile point to highlight though. I think most people can agree it is troubling so little data can have so large an effect. If I happened to think there was only one troubling point causing this, when in reality there were two, I think that could be chalked up to just being charitable.

With regards to homogenization, the comments are only partially correct. The step that estimates the timing of breakpoints is presently run only once using the full data set. However, estimating the size of an apparent biasing event is a more general part of our averaging code and gets done separately for each statistical sample.

I’ll admit this comment confuses me. I’ve been told, repeatedly, the scalpel method means the split up segments are used separately so there is no need to calculate “the size of an apparent biasing event.” If that is true, Rohde’s reasoning for why I am supposedly wrong can’t be true. In fact, his remark:

Hence the effect of uncertainties in the magnitude, but not the timing, of homogeneity adjustments is included in the overall statistical uncertainties.

shouldn’t be true at all, as there shouldn’t be any adjustments made for breakpoints.

0

Zeke Hausfather

Reply to Brandon Shollenberger

January 29, 2015 10:59 pm

I’ll reply here. While a bit of Antarctic data may be available prior to 1955, as far as I can tell none is actually used by Berkeley. See http://berkeleyearth.lbl.gov/regions/antarctica
I’m not sure why it’s not used, but the fact that Berkeley doesn’t have an Antarctica record till 1955 is the reason for the change in uncertainty.
Regarding 1950 vs. 1955 vs. 1960, it’s just an issue of different people eyeballing the charts.

0

Brandon Shollenberger

Reply to Brandon Shollenberger

January 29, 2015 11:14 pm

Zeke, it is weird you provide a link and tell me to look at it when I myself provided that link to you. Regardless, if you want to say:

While a bit of Antarctic data may be available prior to 1955, as far as I can tell none is actually used by Berkeley.

Then you’re suggesting there’s an even bigger issue than I’ve suggested. I listed what data is presented by BEST. If BEST is listing data as having been used, but simply not using it like you suggest… I don’t know what to say.

I’m not sure why it’s not used, but the fact that Berkeley doesn’t have an Antarctica record till 1955 is the reason for the change in uncertainty.

One could presume there is a minimum amount of data necessary for a region in order to perform estimates for that region. If so, BEST may be using the data I described but finding it insufficient to draw conclusions. That’s not as bad as simply disregarding data.
But it still raises a serious problem. If your uncertainty can change by so much based upon a handful of temperature stations, it stands to reason your overall results could change by meaningful amounts based upon small amounts of data as well.
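A back-of-the-envelope sketch in Python shows how that can happen for the uncertainty. If the land average is treated as an area-weighted mean and regional errors are treated as independent (a simplifying assumption, not how BEST computes its spatial uncertainty), a region covering about 10% of the land area dominates the total whenever its own uncertainty is large. The function name and every sigma value below are made up for illustration; only the 10% area share comes from the comment quoted earlier.

```python
import math


def land_average_sigma(regions):
    """Uncertainty of an area-weighted mean.

    regions: list of (area_fraction, sigma) pairs.
    Assumes independent regional errors (a simplification for this sketch).
    """
    total_area = sum(area for area, _ in regions)
    return math.sqrt(sum((area / total_area * sigma) ** 2
                         for area, sigma in regions))


# Hypothetical numbers: 90% of land is well observed (sigma 0.05 C).
# The remaining 10% is either essentially unconstrained (sigma 2.0 C)
# or loosely constrained by a handful of stations (sigma 0.3 C).
unconstrained = land_average_sigma([(0.9, 0.05), (0.1, 2.0)])
constrained = land_average_sigma([(0.9, 0.05), (0.1, 0.3)])
```

Under these invented numbers the combined uncertainty drops by roughly a factor of four once the small region is constrained at all, even though nothing changed for the other 90%. Whether the central estimate is similarly sensitive is a separate question this sketch does not answer.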

Regarding 1950 vs. 1955 vs. 1960, it’s just an issue of different people eyeballing the charts.

I find it peculiar people would say data doesn’t exist based off eyeballing charts rather than doing the simple thing of actually looking at the data. I’d understand it if you guys had said things like, “I don’t think there’s data before X,” but you didn’t. You stated it as fact even though it would take someone less than two minutes to prove you wrong.
It’s hard to take pronouncements from people/groups seriously when they get such easily verified facts wrong. If you’re wrong about something so simple, can we really trust you to get more complicated matters right?

0

Steven Mosher

Reply to Brandon Shollenberger

January 30, 2015 8:28 am

Brandon you should update your accusations.
And update the head posts

0

Brandon Shollenberger

Reply to Brandon Shollenberger

January 30, 2015 10:03 am

Steven Mosher, you say:

Brandon you should update your accusations.
And update the head posts

This shows your lazy approach to discussions. I have only written one head post in which I made an incorrect claim, and I updated it eight hours before your comment. Not only did I update that post, I wrote a new post explaining the corrections.
I would add an update to this post, but I have no control over that. This post was taken from an e-mail I wrote. I sent a follow-up e-mail explaining what is correct on this issue. It’s up to our host if he wants to add an update.
But since you brought it up, you should update the remarks you provide from Robert Rohde. You, Rohde and Zeke have all told us there is no data for Antarctica before 1950/1955/1960 (depending on the comment) even though that’s clearly not true. When it was shown that’s not true, you… did nothing. It’s a bit weird to tell people to admit mistakes while refusing to admit your own.
[Brandon:
What is the specific change you need to have us make in the “Head text” of this thread? They have to be edited differently than the individual comments? List the change clearly, and we will make that change, but, it does have to be done separately from “comments” or replies. .mod]

0

Brandon Shollenberger

Reply to Brandon Shollenberger

January 30, 2015 3:03 pm

Sorry for the slow response moderator. I didn’t see it since it was added in-line. I don’t know that I’d want anything about this post changed. I’d just add a short update to the end. Modeled after the update I added to my post on the baseline issue, I’d go with something like:

Edit: Brandon says he got his argument wrong as part of a trick. You can find his explanation here, but the short version is the baseline issue highlighted in this post is real but is not the cause of the step change the post shows. Brandon says he intentionally misdiagnosed the cause of the step change to provoke BEST into acknowledging the problems he describes are real.
Brandon apologizes to anyone who is bothered by this but wants to stress the fact it worked. BEST has now, for the first time ever, acknowledged the existence of the problems he described.

And of course, if WUWT would like to distance themselves from my actions, it is welcome to add a statement of its own. I don’t think such is necessary since my deception was just a matter of playing dumb, but I can’t complain if people are bothered by me misleading them.

0

Matthew R Marler

Reply to Steven Mosher

January 30, 2015 4:13 pm

Steven Mosher: However, the effect of doing that is generally very small. Uncertainties in the magnitude of breakpoint adjustments generally contribute much more to the overall uncertainty than the typical errors in breakpoint timing.
I am sure that is true, but it would still be worthwhile I think if the results of the tests were posted.
Thank you, and Zeke for your informative posts in this thread.

0

Steven Mosher

January 29, 2015 8:38 pm

To give you some sense of the computational load the full uncertainty calculation takes days.
DAYS to run.
So, re running breakpoint might change your uncertainty from +-.05 to +-.055 and that
would take you several extra days to compute.
of course we tested that and frankly it’s not worth the time.
Whether the uncertainty is +-.05C after a few days of computing or +-.06C after a couple weeks
isn’t really scientifically interesting. Might make for a cool blog post though.
Now of course, attributing the change in uncertainty to the jackknife methodology could have been avoided by reading the materials. Or by actually running the code. That is why it is provided. So that folks can run it, change things and test their theories about why things work the way they work.

0

Dave in Canmore

Reply to Steven Mosher

January 29, 2015 10:04 pm

Would it not take less time to research the actual stations themselves and find out their actual siting history and changes to their actual environment? Everything else honestly seems kind of backwards.

0

richard verney

Reply to Dave in Canmore

January 30, 2015 1:10 am

One would have presumed that the starting point to the compilation of this data series, would have been a thorough audit (including an on site physical review) of each station used in the collection of data that forms the series.

0

Shub Niggurath

Reply to Dave in Canmore

January 30, 2015 4:31 am

Oh no, that would be *real work*, wouldn’t it?
Much easier to run programs on computers.

0

DGH

Reply to Steven Mosher

January 30, 2015 3:05 am

Mosh Dude –
Do you go out of your way to be grammatically and otherwise incoherent? Spicoli might be an intellectual on your side of the pizza. But damn I just don’t have the munchies.
Your BEST defense invokes,
“Relax, all right? My old man is a television repairman, he’s got this ultimate set of tools. I can fix it.”

0

Carrick

Reply to Steven Mosher

January 30, 2015 11:36 am

Steven Mosher:

To give you some sense of the computational load the full uncertainty calculation takes days.
DAYS to run.

Get a faster computer or rewrite in a compiled language. If you (or anybody) is getting paid to develop the code, expect no sympathy for interpreted code running slowly.

0

Matthew R Marler

Reply to Steven Mosher

January 30, 2015 4:32 pm

Steven Mosher: To give you some sense of the computational load the full uncertainty calculation takes days.
DAYS to run.
So, re running breakpoint might change your uncertainty from +-.05 to +-.055 and that
would take you several extra days to compute.
of course we tested that and frankly it’s not worth the time.
With respect, I think that you are wrong. It would be worthwhile to use the additional computer time to generate the tables and graphs that support your claim (which I think is likely true), that re-estimating the breakpoints adds negligible uncertainty to the output estimates. As you can see from the reading, Brandon Shollenberger and others simply do not believe you.
The burden of proof for a claim rests with the claimant; in this case, the burden rests with the BEST team to support your claim.
Yes, I have done many days’ worth of extra computations like this, in response to criticisms/questions like these from Brandon Shollenberger. You have to bite the bullet and do the computations and report the results.

0
