WUWT Video – Zeke Hausfather explains the new BEST surface data set at AGU 2013

I mentioned the poster earlier here. Now I have the video interview and the poster in high detail. See below.

Here is the poster:

BEST_Eposter2013

And in PDF form here, where you can read everything in great detail:

AGU 2013 Poster ZH

 

therapist1900

[off topic-mod]

Werner Brozek

Hausfather? in title

When I get home I’ll upload to googledocs if needed

Bill Illis

Obviously, Zeke, Mosher and Robert are smart guys with the best intentions. It’s just that we cannot be sure that this methodology is producing true results.
Take 20,000? stations and splice them up into 160,000 (now shorter) records, and there is a much higher trend than the original raw records show.
Okay, that tells you the methodology produces a higher trend. How does one double-check 160,000 spliced-up records? Why does the methodology produce a higher trend? What are the individual trend-raising rates for each particular component of the methodology? Why is there no warm 1930s period for the majority of the US in the numbers? When a record was set on July 8, 1936 in my hometown in Kansas, why does that not show up in the new reconstruction?
I think best intentions do not make the results true. We need all the data and various other diagnostics to double-check the results.

Rob Dawg

At no time was an allowance for UHI mentioned. Despite this, the resultant graphs were basically exactly what could be expected from the delta-temp needed to make northern climes habitable.

Andres Valencia

There is too much confusion said the joker to the thief!
Jimi Hendrix – All Along The Watchtower
http://www.dailymotion.com/video/x7eonk_jimi-hendrix-all-along-the-watchtow_music
Weird Statistics to explain weird results!

Energetic

Anthony,
I would like to see the poster numbered 996 to the right of the BEST poster. It is obviously about the global warming hiatus. Do you know where to find it?
REPLY: I have coverage of that one planned – Anthony

Bill Illis

I always think in terms of “my backyard.”
My backyard has a thermometer and a large garden and there are definitive guidelines of when things can be planted. The dates are exactly the same as they have always been for 100 years.
One can tempt fate and try a little earlier. Nope, Mama Nature gives you a spanking.
This year, the snow melted out at its latest date ever, going back 100 years. But, planting times remained exactly the same. The solar angle and the tilt of the Earth warmed the ground up at exactly the same time as it has always done.
If it were truly 1.5C warmer, those dates would have changed by now and the snow would not have left at its latest date on record. My backyard tells the truth.

u.k.(us)

Can’t wait till the “science is settled”; then we can argue about the satellite/Argo/balloon/surface-station quantification of the “problem”.
The windmills as-built must already be having an effect.
Can’t figure out why these cost-effective solutions to saving the planet (and our grandchildren) are not being shouted from the rooftops?
It can’t be due to a lack of transparency, we’ve been reassured.
So there is that.

@Zeke Hausfather at 12/13 6:32 pm in Dec. 12 post
Stephen Rasey: the march of the thermometers meme was so 2010. Berkeley uses ~40,000 stations, and 2012 has more station data than any prior year (it increases pretty monotonically). GHCN-M version 3 (which NCDC/NASA now use) also has much more station data post-1992 than the prior version 2.
Stephen Rasey at 12/15 12:49 pm
@Zeke Hausfather at Dec 13, 6:32 pm

…..
Let’s take a look at BEST results for Iceland.
You say that BEST has over 40,000 stations. The page lists 40,747.
Dumb question #1. Are these 40,747 “stations”
A) separate thermometer locations with potentially discontinuous records, before the application of the BEST scalpel? or
B) virtual stations CREATED by taking a slice from far fewer locations, for example from 8,000 geographic locations with an average of 4 scalpel-slice “breakpoints” in each location’s record? or
C) Neither (A) nor (B).
Under (B), you get more “stations” by making more breakpoints. The claim that you have more stations implies you have more coverage and better data. But if “stations” are created by new breakpoints, more stations hints at worse data, [shorter average record lengths,] greater uncertainty and more loss of low-frequency information.
So what is closer to the truth?
(A) where you have 40,747 station thermometer records you slice into 200,000 segments or
(B) you have fewer than 10,000 thermometer locations you slice into 40,747 “stations.”

This question has been sitting unanswered on a prior post for 3 days.
Let’s see what GHCN-M Version 3.2.0 says:

GHCN-M Version 3.2.0 contains monthly climate data from weather stations worldwide. Monthly mean temperature data are available for 7,280 stations, with homogeneity-adjusted data available for a subset (5,206 mean temperature stations). Data were obtained from many types of stations. For the global component of this indicator, the GHCN land-based data were merged with an additional set of long-term sea surface temperature data; this merged product is called the extended reconstructed sea surface temperature (ERSST) data set, Version #3b (Smith et al., 2008).
Source: epa.gov

If the “march of the thermometers”, i.e. the Great Dying of the Thermometers, “is so 2010”, as you say in ridicule,
you must be implying (A);
40,000 weather stations split into about 200,000 segments via BEST breakpoints.
But it appears from GHCN-M to be (B).
About 7,000 weather stations that you slice with breakpoints into 40,747 station record segments.
So what is it Zeke? A or B?
Would 97% of general scientists and engineers agree with your assertion that BEST has “over 40,000 stations” and not 40,000 station record segments or station fragments from about 7,000 weather stations?

markx

Andres Valencia says: December 18, 2013 at 4:24 pm
There is too much confusion said the joker to the thief!
Bob Dylan – All Along The Watchtower
Apparently Dylan loved Hendrix’s version of the song; he is quoted (somewhere) as saying, “I didn’t know it at the time, but I wrote that song for Jimmy”.

Original Dylan version:
http://grooveshark.com/#!/s/All+Along+The+Watchtower/4nCzHC?src=5
I’m sure Dylan had climate change in mind when he wrote “The Times They Are a-Changin’”, and maybe “Blowin’ in the Wind” too.

Reg Nelson

I thought the argument was that the MWP, LIA & RWP were localized events and were therefore dismissed.
Yet, this study seems to focus only on the US weather stations. Am I missing something?
How can you have it both ways (logically)?

Luke Warmist

…..as long as we’re quoting songs, when I look at BEST I think of David Lee Roth in ‘Panama’ …….”We’re running a little bit hot tonight”….

Don

Cannot speak to their methods or results, but kudos to Zeke for giving Anthony crisp, coherent answers on topic that a lay person could at least follow.

Tom J

Zeke Hausfather leaves me a little bit cold (no pun intended), midway through the interview, with his explanation for the absence of error bars. Sure, one can look it up, as he says, but shouldn’t the error range be shown in the presentation? A shaded range would not have made the graphs confusing. He seems bright and articulate (and I apologize for insulting statements I made in a comment a few days ago) but is it possible he’s being a little disingenuous about the confidence in those graphs?

CodeTech

That’s odd… if you want to talk songs, climate stuff always reminds me of Mike and the Mechanics, “Taken In”…

Richard D

Stephen Rasey says: December 18, 2013 at 4:46 pm
Would 97% of general scientists and engineers agree with your assertion that BEST has “over 40,000 stations” and not 40,000 station record segments or station fragments from about 7,000 weather stations?
+++++++++++++++++++++++
deserves an answer

Sweet Old Bob

After watching the clip….. why do I sense HORSEFEATHERS?

Bad Andrew

Mosher has declared innumerable times on the internet that, according to basic physics, adding CO2 to the atmosphere makes the atmosphere warmer. Keeping those declarations in mind, who thinks he will then produce any analysis that shows otherwise?
Andrew

Merrick

Dylan also later rerecorded All Along the Watchtower in a style more similar to Jimi’s.
Another similar pairing: after Mitch Ryder and the Detroit Wheels recorded Rock & Roll, Lou Reed said that their version was much better than his, and that only after hearing it did he understand how the song was supposed to be played.

u.k.(us)

I just caught it: Anth-ny did the interview!!
Nicely done.
Questions asked, questions answered.

john robertson

Interesting talk.
If the original data came up with divergent trends of 0.5 of a degree on a regional basis, from inputs with a +/- 1 degree error range, why would anyone be surprised?
The trend is not significant, otherwise known as noise.
The BEST methodology diced and sliced this data into fragments, and reanalysed them to proclaim a trend of… warming of significance??
Yet Steven Mosher tells us on the previous (poster) post that the data is crap…
Sorry, but as any farmer knows, if you slice and dice crap, mix it up and spread it around, it’s called manure.

Bill:
Obviously, Zeke, Mosher and Robert are smart guys with the best intentions. It’s just that we cannot be sure that this methodology is producing true results.
################################################################
Well, in fact you can. We produce a FIELD that is a prediction for the temperature at any location in the US. All you have to do is find a station that we don’t use in the estimation of the field and compare it with the prediction of the field. You can go to the Oklahoma Mesonet stations. They are better sited than CRN. You can pay thousands of dollars for that data and check the prediction. Those stations were not used in estimating the field. We can also produce the field while holding out stations, then use the field to predict what we expect to find there.
This, in fact, was what we did with the first paper, using only a few thousand sites.
Finally, we know that the method is BLUE (a best linear unbiased estimator).
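(For readers who want to try the hold-out check described above, here is a minimal sketch. It is not BEST’s code: the station data are synthetic, and simple inverse-distance weighting stands in for the kriged field.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "true" CONUS-like field: temperature falls with latitude.
def true_temp(lat, lon):
    return 15.0 - 0.6 * (lat - 35.0) + 0.1 * np.sin(np.radians(lon))

# Synthetic station network with measurement noise.
n = 500
lat = rng.uniform(25, 50, n)
lon = rng.uniform(-125, -65, n)
obs = true_temp(lat, lon) + rng.normal(0.0, 0.3, n)

# Hold out 20% of stations; build the field from the remaining 80%.
held = rng.random(n) < 0.2

def predict(plat, plon):
    """Inverse-distance-weighted estimate from the stations kept in."""
    d2 = (lat[~held] - plat) ** 2 + (lon[~held] - plon) ** 2 + 1e-6
    w = 1.0 / d2
    return np.sum(w * obs[~held]) / np.sum(w)

pred = np.array([predict(a, o) for a, o in zip(lat[held], lon[held])])
rmse = np.sqrt(np.mean((pred - obs[held]) ** 2))
print(f"{held.sum()} held-out stations, RMS prediction error: {rmse:.3f} C")
```

If the field is doing its job, the held-out residuals should look like station noise rather than a systematic bias.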
###########################################
Take 20,000? stations and splice them up into 160,000 (now shorter) records, and there is a much higher trend than the original raw records show.
###################
Wrong. When you have a station that moves from being located at 1000 feet ASL to 0 feet ASL, you have 2 stations. Treating these two stations as one is wrong. Further, adjusting this station is also filled with error. So we don’t stitch two different stations together (as GISS would do) and we don’t do a SHAP adjustment as NOAA and GISS and CRU would do. We treat the stations as two stations BECAUSE they are two stations. The same if the time of observation is changed: we don’t adjust; we treat them as two stations, because when you change the observation time you have created a new station. And when the station shelter changes, we don’t pretend it is the same station. It’s not. It’s a new station and a new shelter. We don’t try to adjust this. We say what skeptics like Willis said: these are different stations.
Finally, we conduct a double-blind test to prove out the method.
#############
I think best intentions do not make the results true. We need all the data and various other diagnostics to double-check the results.
###################
The code and data have been up for years. SVN access with the password is posted.
I note that you haven’t even looked at it.

Tom J says:
December 18, 2013 at 5:32 pm
Zeke Hausfather leaves me a little bit cold (no pun intended), midway through the interview, with his explanation for the absence of error bars. Sure, one can look it up, as he says, but shouldn’t the error range be shown in the presentation?
####################
Not really. Posters are works in progress, basically showing people in an informal way what work you are doing. When the dataset gets published, then of course you would add whatever detail the reviewers wanted.
One thing Zeke left out is this:
1. We used GISS at 250 km interpolation. They don’t publish error bars, so we can’t plot what doesn’t exist.
2. MERRA does not publish uncertainty, so we can’t plot what doesn’t exist.
3. NARR doesn’t.
4. UAH doesn’t.
5. PRISM doesn’t.
6. NCDC doesn’t.
7. RSS doesn’t.
So basically we can’t plot what doesn’t exist for the other guys.
In short, all the estimates for CONUS lie on top of each other when you just look at the temporal dimension. That’s not what is interesting. Let me repeat that: the point of the dataset is not to compare the overall trend of all these datasets. The point is to examine the spatial differences. Sure, the overall trend matches, but what does it look like at higher spatial resolution? We have stations every 20 km… why average that data into 5-degree grid cells like CRU or GISS does?

Stephen Rasey:
1. There are 40,000 separate stations in the entire database.
2. There are 20K stations in this CONUS database.
GHCN Monthly is rarely used. We use GHCN Daily; daily data is not adjusted.
As for splitting, it depends on the time period.

@Mosher:
(On tablet so a bit terse…) The problems I see in your approach are twofold. First, the use of averages of intensive properties is philosophically broken (devoid of meaning; link in prior posting), and this cannot be avoided. Second, splicing induces splice artifacts. You splice more (nicely, and indirectly, after dicing…), so you will have more splice effect, not less. These are both endemic and real problems. You find trend that is not in the individual long-lived stations. Something is amiss.

@Mosher at 7:29 pm
When you have a station that moves from being located at 1000 feet ASL to 0 feet ASL you have 2 stations.
Granted, if you have the metadata to support it.
How many breakpoints are in the database?
For how many breakpoints do you have verified metadata that supports ending one station and starting a new one?
BEST does it backwards. It doesn’t have the metadata, so it hunts for likely breakpoints using the data itself and regional kriging, without knowing the physical cause.
What if the breakpoint shift is really from a recalibration?
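(To make the “breakpoints from the data alone” point concrete, here is a crude sketch of data-driven jump hunting: a simple two-sample mean-shift scan, not BEST’s actual detection algorithm.)

```python
import numpy as np

rng = np.random.default_rng(1)

# A station record with one unexplained 0.8 C jump halfway through,
# which could be a station move, a recalibration, or real climate.
series = np.concatenate([rng.normal(10.0, 0.3, 120),
                         rng.normal(10.8, 0.3, 120)])

# Scan every admissible split; keep the one with the largest
# mean difference relative to its pooled standard error.
best_k, best_stat = None, 0.0
for k in range(12, len(series) - 12):
    a, b = series[:k], series[k:]
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    stat = abs(a.mean() - b.mean()) / se
    if stat > best_stat:
        best_k, best_stat = k, stat

print(f"breakpoint candidate at index {best_k}, t-like statistic {best_stat:.1f}")
# The statistic says WHERE the jump is, never WHY. Without metadata,
# a recalibration and a station move look identical to the scan.
```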
Finally, we conduct a double blind test to prove out the method.
Link please.
I remember reading one paper from a year ago about a blind test, but that was on the kriging process on unsliced, synthetic data.
Rasey Jan 21, 2013
The Rohde 2013 paper uses synthetic, error-free data. The scalpel is not mentioned.
http://berkeleyearth.org/pdf/robert-rohde-memo.pdf

Here is the USA data using GHCN-D data that has not been homogenized. Clearly warming is cyclical and not driven by CO2, as the current peaks are similar to the 30s and 40s. http://landscapesandcycles.net/image/75158484_scaled_590x444.png
The question with any reanalysis, or any other data-assimilation method, is how it treats “change points”. As I documented in Unwarranted Temperature Adjustments: Conspiracy or Ignorance? http://landscapesandcycles.net/why-unwarranted-temperature-adjustments-.html , many data sets are adjusted because natural change points due to the Pacific Decadal Oscillation and other natural cycles are misinterpreted. The homogenization process creates new trends and treats those natural change points as “undocumented station changes”. Whether it is the GISS datasets or Berkeley’s, those “change point” adjustments need to be critically analyzed.

@E.M.Smith at 7:48 pm
Not to mention that BEST is proud of being able to work with short segments and use the least-squares slopes.
Never mind that as the length of a segment gets shorter, the slopes will get larger, either positive or negative; just look at the denominator of the slope function. The uncertainty in the slope goes through the roof. How do they use slope uncertainty in their kriging?
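(The denominator argument is easy to check numerically. A sketch with illustrative numbers, monthly sampling and 0.5 C noise, not BEST’s values:)

```python
import numpy as np

# Standard error of an OLS slope fitted to white noise (sigma = 0.5 C)
# as the record length shrinks; monthly samples, time in years.
sigma = 0.5
for years in (40, 20, 10, 5):
    t = np.arange(years * 12) / 12.0
    denom = np.sum((t - t.mean()) ** 2)   # the slope denominator
    se = sigma / np.sqrt(denom)           # slope standard error, C/yr
    print(f"{years:>3}-year segment: slope std. error = {se:.4f} C/yr")

# The denominator grows roughly as N**3, so halving a record length
# multiplies the slope uncertainty by about 2**1.5, i.e. 2.8x.
```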
If you have 40 years of a record, what is the longest-duration climate signal you might tease out? Fourier says only 40 years, but with a principal-frequency analysis you might tease out an 80-year hint, assuming the half cycle contains a significant amount of power. Now take the scalpel and cut it somewhere in half with an unspecified breakpoint shift. You want to tell me that you can still estimate an 80-year cycle from two uncorrelated 20-year signals? You’d be lucky to get back the 40-year signal you left on the cutting-room floor.
Short segments mean short wavelengths and loss of long-term trend. High-frequency content cannot predict low-frequency content. I don’t care how many thermometers you krige. And kriging with thermometers 500 to 2000 km away from the station doesn’t instill confidence in any of it. (See Iceland.)

OssQss

This is a perfect opportunity for WUWT-TV….. Anthony?
Just sayin’: how many emails can one avoid with a short “live” discussion nowadays?

Gerald Machnee

Steven Mosher says:
“When you have a station that moves from being located at 1000 feet ASL to 0 feet ASL you have 2 stations. Treating these two stations as one is wrong. Further, adjusting this station is also filled with error.”
However, you now do not have history, so you cannot say you have a 100-year record, for example. Continuous records show trends, or the non-existence of trends, until they are “adjusted”.

From the poster …
“Stations outside U.S. boarders are also used in detecting inhomogenities and in the construction of temperature fields”
First – does spelling count? Anyone care to join Climatologists Without Boarders?
Second – the rapid drop-off in station density over the Mexican border vs. the higher density over the Canadian border: can one expect the same high-resolution accuracy for San Diego / San Ysidro as for, say, Detroit?

Here are a couple of links back to “Berkeley Earth Finally Makes Peer Review…”, Jan 19, 2013.
This is for reference back to previous discussions of the same topic of the scalpel and kriging.
Phil 1/21/13 8:36 pm
Restates Fourier argument that the scalpel loses long wavelengths.
How is BEST not effectively a sort of statistical homeopathy? Homeopathy, as I understand it, is basically taking a supposedly active ingredient and diluting it multiple times until almost none of it is left and then marketing it as a cure for various ailments.
Willis 1/22/13 12:41 am
Rebuttal of Phil.
As long as there is some overlap between the fragments, we can reconstruct the original signal exactly, 100% correctly.
Phil 1/22/13 3:04am, reply to Willis 12:41 am
First, the problem is that there has to be some credible basis on which to reconstruct the original signal, such as the overlap you mention. The problem is that, as I understand the scalpel (and I may not understand it correctly), by definition there isn’t going to be an overlap between fragments of the record of a given station
“Is it the data talking or the seamstress?”
Willis 1/22/13 10:32am reply to Phil 3:04 am
However, you seem to think that we are trying to reconstruct an individual station. We’re not. We’re looking for larger averages … and those larger averages perforce contain the overlaps we need. However, the scalpel doesn’t use “neighboring stations” to put them back together. Instead, it uses kriging to reconstruct the original temperature field.
…Look, Phil, the scalpel method has problems, like every other method you might use. But that doesn’t make it inferior to the others as you seem to [think].
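(A toy sketch of what Willis’s overlap claim means in practice, not BEST’s code: two fragments with unknown offsets but a genuine overlap can be re-leveled exactly, up to one common offset. Phil’s objection is precisely that scalpel fragments of a single station have no such overlap.)

```python
import numpy as np

# A "true" signal and two fragments with overlap but unknown offsets.
t = np.arange(100, dtype=float)
truth = 10.0 + 0.02 * t + 0.5 * np.sin(t / 8.0)

frag_a = truth[:60] + 3.0    # indices 0..59, unknown offset +3
frag_b = truth[40:] - 1.5    # indices 40..99, unknown offset -1.5

# Re-level fragment B to fragment A using the mean difference
# over their common indices 40..59, then check the reconstruction.
shift = np.mean(frag_a[40:60] - frag_b[:20])
rebuilt = np.concatenate([frag_a[:40], frag_b + shift])

err = rebuilt - (truth + 3.0)   # exact up to the one shared offset
print(f"max reconstruction error: {np.abs(err).max():.2e}")
```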

I post. Willis comments. Then I work on a Band-Pass signal and seismic inversion argument.
Rasey 1/22/13 12:05 pm
BEST, through the use of the scalpel, shorter record lengths, homogenization and kriging, is honoring the fitted slope of the segments (the relative changes) more than the actual temperatures. By doing that, BEST is turning low-pass temperature records into band-pass relative-temperature segments.
With band-pass signals, you necessarily get instrument drift over time without some other data to provide the low-frequency control. I suspect BEST has drift as a direct consequence of throwing away the low frequencies and giving low priority to actual temperatures.

I then gave an illustration of seismic inversion.
… It is possible to integrate the reflectivity profile to get an Impedance profile, but because the original signal is band-limited, there is great drift [over time], accumulating error, in that integrated profile. The seismic industry gets around that drift problem by superimposing a separate low frequency, low resolution information source, the stacking or migration velocity profile estimated in the course of removing Source-Receiver Offset differences and migrating events into place. …
In a similar vein, BEST integrates scalpeled, band-pass, short-term temperature-difference profiles to estimate total temperature differences over a time span. Unless BEST has a separate source of low-frequency data to control drift, BEST’s integrated temperature profile will contain drift indistinguishable from a climate signal.
If BEST does have a separate source of low-frequency control, I still don’t know what it would be that they haven’t already minced.
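(Here is a toy demonstration of that drift. It is my sketch, not BEST’s algorithm: take a trendless noisy record, cut it into segments, keep only each segment’s fitted slope, and chain the segments back together.)

```python
import numpy as np

rng = np.random.default_rng(2)

# 100 "years" of monthly data: a flat climate plus 0.5 C noise, no trend.
n = 1200
temps = 10.0 + rng.normal(0.0, 0.5, n)

# Scalpel: 25 segments. Keep each segment's least-squares slope only,
# discarding its absolute level (the low-frequency information),
# then re-integrate by chaining segment onto segment.
level = 0.0
rebuilt = []
for idx in np.array_split(np.arange(n), 25):
    t = idx.astype(float)
    slope = np.polyfit(t, temps[idx], 1)[0]
    seg = level + slope * (t - t[0])
    rebuilt.append(seg)
    level = seg[-1]
rebuilt = np.concatenate(rebuilt)

years = np.arange(n) / 12.0
print(f"trend of original series: {np.polyfit(years, temps, 1)[0]:+.4f} C/yr")
print(f"trend of rebuilt series:  {np.polyfit(years, rebuilt, 1)[0]:+.4f} C/yr")
# The rebuilt series acquires a spurious low-frequency wander (drift)
# that the original flat record never had.
```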

thisisnotgoodtogo

[snip – enough of this name calling/bashing of people you don’t agree with. You don’t even have the courage to put your own name to your own words while bashing somebody who does and who has done some actual work. My advice – cool it or I’ll do it for you – Anthony]

ZootCadiilac

It does not matter who explains it and how many times it is repeated. Anyone with a modicum of integrity has to know that this is a bullshit methodology that has repeatedly given an inflated result.
Anyone who persists in arguing otherwise has an agenda, personal or that of an employer, which requires them to do so.
Stand up and be men. Just say we don’t know because there are not enough data. So your averaged best guess is as good as mine.

@Steven Mosher at 7:42 pm
1. There are 40,000 separate stations in the entire database.
2. There are 20K stations in this CONUS database.
GHCN Monthly is rarely used. We use GHCN Daily; daily data is not adjusted.

GHCN-Daily now contains records from over 75,000 stations in 180 countries and territories. Numerous daily variables are provided, including maximum and minimum temperature, total daily precipitation, snowfall, and snow depth; however, about two thirds of the stations report precipitation only. Both the record length and period of record vary by station and cover intervals ranging from less than a year to more than 175 years. (Source: ncdc.noaa.gov)

A third of 75,000 stations (about 25,000) is a lot less than 40,000.

GHCN-M Version 3.2.0 contains monthly climate data from weather stations worldwide. Monthly mean temperature data are available for 7,280 stations, with homogeneity-adjusted data available for a subset (5,206 mean temperature stations)

Mosher, your 40,000 number doesn’t make sense if “about two thirds” of 75,000 stations are precipitation-only. You are also asking me to believe that GHCN Monthly has significantly fewer stations than GHCN Daily has temperature stations. I need to see a link for that.
I would also very much like to see a distribution of the length of station records in the database, and a distribution of the length of records after they have been through the scalpel.

zootcadillac

Stephen Rasey says:

I would also very much like to see a distribution of the length of station records in the database, and a distribution of the length of records after they have been through the scalpel.

I would also like to see this. but for now I’ll settle for the more likely scenario of faerie folk farting rainbows and world peace. ( the world peace being a direct result of the atmospheric increase in gaseous rainbows. I was not being greedy by asking for two separate items )

Mario Lento

At around 2 minutes 50 seconds into the video, Zeke says that their results were much more consistent with reanalysis products… Evidently this gives them confidence they did it right. What are the reanalysis products? And how are they right?


zootcadillac

In addition to my previous comments: I know full well that Anthony has no truck with folk who criticise whilst posting “anonymously”. I’ve made it clear before that I post as I do from habit rather than from a wish to hide. My name is Craig Frier, should anyone ever take issue with a comment of mine. I’m more than happy to provide an address should anyone wish to berate me in person.

RoHa

@Reg Nelson,
“Yet, this study seems to focus only on the US weather stations. Am I missing something?”
I noticed this earlier. It looks as though global warming affects only Americans, and the other 95% of us don’t have to worry.

zootcadillac

Yes, @RoHa and @Reg Nelson. Whilst the continental US has to deal with the despicable affliction of global warming, the rest of us (the likes of myself in the north-west of the United Kingdom, waking up to no power after a windy night, for example) have to deal with weather, because it’s not part of the program to include those of us moving back to Arctic conditions.

Why did they start at 1850? Is it because it was the best year to start the record, due to the temperature-measuring sites and the quality of data, or because that was the end of the Little Ice Age? On that note, I have talked with some old-timers in my area (east Texas), and they told me that when their ancestors settled our area in the early 1800s the ecosystem was mainly grass plains and some savanna, but now our entire area is heavily forested (except where cleared by man). These forests grew since the end of the LIA, which means there was a significant climate change between then and now, with it being both warmer and wetter now. So why was the year 1850 chosen, when that is precisely when North America started recovering from the LIA?

Tilo

This is all a waste of time. The 100 most pristine stations in the world (unmoved, no time-of-day changes, totally rural, unhomogenized, no station-type changes) would give a more real picture of the global temperature trend than all of this super-adjusted and manipulated nonsense.

Steve Oregon

This is yet another demonstration of how WUWT is the new (and far superior) normal for “peer review”. It is the open, global, immediate and unlimited way to scrutinize that which needs scrutinizing, removing any potential for the snow jobs and pal review produced by the now-obsolete journal “peer review”.
This is what science in all forms should strive to achieve for the sake of the highest-quality outcomes. The web provides a level of participation which cannot be marginalized by those clinging to what they hoped to keep controlling, to benefit their interests at the expense of progress.

Lawrie Ayres

Tilo,
I have often wondered the same. It would surely give an accurate record. Looking at some remote Australian stations, i.e. in country areas where growth has been minimal, the record shows either no warming or, in a number of cases, slight cooling. Obviously this is a real problem for a warmist organisation like our BoM, so they adjust until they get warming. When taken to task via a request to the Auditor-General, they scrapped that data series and started another. They must have felt threatened.

Tilo

Lawrie,
You can’t get perfection. For example, peeling paint can change readings. But I think what I mentioned above would give us a truer picture. Several thousand readings with multiple layers of adjustments just have no value at all in my mind. And then BEST actually came up with a result that there is no UHI effect, yet many other studies have proven that there is, by direct empirical comparison of cities with their immediate countryside. Heck, I can confirm that on my car thermometer. In my mind, BEST and GISS are a total waste of time for even more reasons than I have already given. We should move on and let the alarmist faithful cling to their security blankets.

Konrad

The bottom line is that “BEST” is using surface station data to try to prove a warming trend in climate of less than 1C. However, the data is unfit for purpose. No amount of slicing, dicing or homogenisation can ever fix it.
The egg is scrambled and no amount of extra time in the blender can unscramble it.
Anthony’s surface stations project highlighted the extent of the problem. Without individual station metadata, there is no hope of applying the necessary corrections to individual station records. Such metadata would have to record numerous micro- and macro-site factors for each station and span the full period of the station record. It does not exist for a sufficient number of stations.
We are well past the point in the climate debate where the attempted use of surface station data speaks to motivation. It now speaks very, very loudly.