There's a new paper out today, highlighted at RealClimate: Hausfather et al., "Quantifying the Effect of Urbanization on U.S. Historical Climatology Network Temperature Records," published (in press) in JGR Atmospheres.
I recommend everyone go have a look at it and share your thoughts here.
I myself have only skimmed it, as I'm just waking up here in California, and I plan to have a detailed look at it later when I get into the office. But since the Twittersphere is already demanding my head on a plate, and would soon move on to claiming "I'm ignoring it" if denied instant gratification, I thought I'd make a few quick observations about how some people are reading something into this paper that isn't there.
1. The paper is about UHI and homogenization techniques to remove what they perceive as UHI influences using the Menne pairwise method with some enhancements using satellite metadata.
2. They don't mention station siting in the paper at all, nor do they reference the Fall et al., Pielke, or Christy papers on siting issues. So claims that this paper somehow "destroys" that work are rooted in a failure to understand that UHI and siting are separate issues.
3. My claims are about station siting biases, which are a different mechanism at a different scale than UHI. Hausfather et al 2013 does not address siting biases at all. In fact, as we showed in the draft paper Watts et al 2012, homogenization takes the well sited stations and adjusts them to be closer to the poorly sited stations, essentially eliminating good data by mixing it with bad. To visualize homogenization, imagine bowls of water with different levels of clarity due to silt: mix the clear water with the muddy water and you end up with water that is no longer pure. That leaves data of questionable purity. (A toy sketch after point 7 below illustrates the concern.)
4. In the siting issue, you can have a well sited station (Class 1, best sited) in the middle of a UHI bubble, and a poorly sited station (Class 5, worst sited) in the middle of rural America. We've seen both in our surfacestations survey. Simply claiming that homogenization fixes this is an oversimplification not rooted in the physics of heat sink effects.
5. As we pointed out in the Watts et al 2012 draft paper, there are significant differences between good data at well sited stations and the homogenized/adjusted final result.
We are finishing up the work to deal with TOBs criticisms related to our draft and I'm confident that we have an even stronger paper now on siting issues. Note that through time the rural and urban trends have become almost identical – always warming up the rural stations to match the urban stations. Here's a figure from Hausfather et al 2013 illustrating this. Note also that they have urban stations cooler in the past, something counterintuitive. (Note: John Nielsen-Gammon observes in an email that the urban stations appearing cooler in the past "is purely a result of choice of reference period." He's right. Like I said, these are my preliminary comments from a quick read. My thanks to him for pointing out this artifact. -Anthony)
I never quite understand why Menne and Hausfather think that they can get a good estimate of temperature by statistically smearing together all stations, the good, the bad, and the ugly, and creating a statistical mechanism to combine the data. Our approach in Watts et al is to locate the best stations, with the least bias and the fewest interruptions, and use those as a metric (not unlike what NCDC did with the Climate Reference Network, designed specifically to sidestep siting bias with clean, state-of-the-art stations). As Ernest Rutherford once said: "If your experiment needs statistics, you ought to have done a better experiment."
6. They do admit in Hausfather et al 2013 that there is no specific correction for creeping warming due to surface development. That's a tough nut to crack, because it requires accurate long-term metadata, something they don't have. They make claims at century scales in the paper without supporting metadata at the same scale.
7. My first impression is that this paper doesn’t advance science all that much, but seems more like a “justification” paper in response to criticisms about techniques.
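To make the mixing analogy in point 3 a bit more concrete, here is a minimal toy sketch in Python. The numbers and weights are invented, and this is not the Menne et al. pairwise algorithm itself, just an illustration of how blending a well sited record with biased neighbors drags its trend toward theirs:

```python
import numpy as np

years = np.arange(1980, 2011)
true_trend = 0.10   # assumed background trend at the well-sited station, C/decade
bias_trend = 0.25   # assumed extra warming at three poorly sited neighbours, C/decade

good = true_trend / 10 * (years - years[0])                 # well-sited record
bad  = (true_trend + bias_trend) / 10 * (years - years[0])  # each biased neighbour

# Crude stand-in for homogenization: blend the station with its neighbourhood
# mean (weights are arbitrary; the real pairwise algorithm is far more involved).
neighbourhood_mean = (good + 3 * bad) / 4
adjusted_good = 0.5 * good + 0.5 * neighbourhood_mean

def trend_per_decade(series):
    return np.polyfit(years, series, 1)[0] * 10

print(f"well-sited trend:   {trend_per_decade(good):.2f} C/decade")
print(f"poorly sited trend: {trend_per_decade(bad):.2f} C/decade")
print(f"after blending:     {trend_per_decade(adjusted_good):.2f} C/decade")
```

In this made-up example the well sited station's 0.10 C/decade trend comes out near 0.19 C/decade after blending, pulled partway toward its neighbors.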
I’ll have more later once I have a chance to study it in detail. Your comments below are welcome too.
I will give my kudos now on transparency though, as they have made the paper publicly available (PDF here), something not everyone does.



We are finishing up the work to deal with TOBs criticisms related to our draft and I’m confident that we have an even stronger paper now on siting issues.
Anthony, are you hinting that your 2012 paper is being held up because of TOBs criticisms? Or is it that you have a follow-up paper that will combine Watts-2012 draft with additional stations where TOBs needs to be addressed?
In the video of Watts et al 2012 during your Gore-a-thon-2012, I thought your solution of using only stations that had no need of a TOBs adjustment was not only a very proper treatment of the data, but an essential one. The reduction in the number of stations available will increase the uncertainty in the trends, but we must see the analysis where TOBs adjustments are unnecessary before we apply a confounding TOBs adjustment with its necessary increase in uncertainty due to method.
BTW, in the Gore-a-thon-2012 category, I don’t see the last hour covering Watts 2012 listed. Do you have the video on WUWT?
Thanks for it all.
Hi Anthony
Keep on drumming; eventually they will have to march to the beat of science-based research, not dogma.
I have done some work on Australia's temperature record keeping and plotted the RAW temperature series for a number of stations… I then located the ACORN-SAT data and have been plotting it alongside the RAW… not only do they not take into account UHIE at city-sited stations (e.g. Observatory Hill, Sydney), their data bears no resemblance to the RAW data, particularly in the earlier years.
In addition, in the Richmond NSW series, between 1958 and 2012 the ACORN-SAT data has 173 missing entries compared with their own RAW data. The RAW data in the CSV I downloaded from the BOM site are all marked 'Y' for quality, yet they have been removed from the ACORN-SAT data?
Plotting the two together shows the following obvious adjustments… below are the first 10 years of minimum temperatures recorded for each year.
Year   ACORN-SAT   RAW
1958    -4.20     -1.70
1959    -5.00     -2.80
1960    -5.00     -2.20
1961    -5.00     -2.20
1962    -4.00     -1.80
1963    -2.90     -0.60
1964    -3.90     -1.70
1965    -3.50     -1.30
1966    -5.30     -2.80
1967    -2.60     -0.60
It should be noted that in 1994 a new station came on line, and it is from that point back to 1958 that the minimums have been adjusted… however, the maximums have also been adjusted, though to a far lesser extent.
Adding up the differences over the 1958–2012 period, the adjustments to the minimums total -45.7 °C in aggregate, while the maximums aggregate to -2.3 °C over the same period… it should further be noted that -26.2 °C of the adjustments to the minimum temperatures fall in the first 11 years.
The result of this obvious tinkering is that the average annual temperatures were adjusted by a total of -19.22 °C over the same period, with -7.9 °C in the first 11 of the 55 years.
It goes without saying that the charted ACORN-SAT data starts off at a much lower point than the RAW data for both minimum and average temperatures, while the maximum, although still different, is relatively similar in comparison.
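For anyone who wants to check that tally, here is a minimal Python sketch using only the ten minimum values listed above (the full 1958–2012 comparison would need the BOM RAW CSV and the ACORN-SAT series themselves). The decade shown sums to about -23.7 °C of adjustment, consistent with the "-26.2 °C in the first 11 years" figure:

```python
# Values copied from the table above (Richmond NSW minima); the full 1958-2012
# tally would need the BOM RAW CSV and ACORN-SAT series themselves.
acorn = {1958: -4.2, 1959: -5.0, 1960: -5.0, 1961: -5.0, 1962: -4.0,
         1963: -2.9, 1964: -3.9, 1965: -3.5, 1966: -5.3, 1967: -2.6}
raw   = {1958: -1.7, 1959: -2.8, 1960: -2.2, 1961: -2.2, 1962: -1.8,
         1963: -0.6, 1964: -1.7, 1965: -1.3, 1966: -2.8, 1967: -0.6}

diffs = {year: acorn[year] - raw[year] for year in acorn}
for year, d in diffs.items():
    print(f"{year}: {d:+.1f} C")
print(f"aggregate adjustment over these ten years: {sum(diffs.values()):+.1f} C")
```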
(-;]D~
Slartibartfast says:
February 13, 2013 at 9:57 am
"But if the equipment used to measure all data is statistically similar in behavior, no devaluation is justifiable."
There are at least two parts to every measurement: the object measured and the measuring instrument. My comment is about the former not the latter. Mainstream climate science refuses to do the empirical work to determine that the objects measured are actually comparable. Anthony gives them an empirical five-fold classification of the objects. They simply cannot bring themselves to deal with Anthony’s empirical work on the objects measured. They are not engaged in science.
Ed_B says: "To evaluate the effect of CO2 upon global temperatures beyond the baseline natural warming since the Little Ice Age, we need absolutely clean data, i.e. numbers which are from a long term pristine site. If we only have 50 such land sites world wide, so be it. That's all we have to work with. Add that to the validated ocean measurements."
Sorry, Ed, perfect data won’t prove causation, no matter what the correlation. Trying to tease a trend out of extremely noisy data and then saying it proves AGW exists is a fool’s errand.
The chart shows that the study used USHCN version 2, which is a “value-added” data set.
The v2 revision from the original USHCN raw data in 2009 had the effect of increasing the slope of the curve, moving it closer to the GISS homogenized data, producing an artificial warming.
The paper merely studies whether the rural stations were as altered as the urban ones.
jorgekafkazar says:
February 13, 2013 at 10:50 am
“Sorry, Ed, perfect data won’t prove causation, no matter what the correlation. Trying to tease a trend out of extremely noisy data and then saying it proves AGW exists is a fool’s errand”
But a zilch trend at pristine sites, above the long term rise and the cyclical ups and downs since the Little Ice Age, leaves a big problem for those wanting to demonize CO2, does it not? I doubt CO2 has any effect at all that we can measure with 2 sigma statistical significance, let alone 3 sigma. Would you alter the world's economy on 2 sigma? Not me; I would want better proof than that.
Whatever became of Occam's razor?
Malamud, B. D., D. L. Turcotte, and C. S. B. Grimmond (2011), Temperature trends at the Mauna Loa observatory, Hawaii, Climate of the Past, 7, 975–983, doi:10.5194/cp-7-975-2011, http://www.clim-past.net/7/975/2011/ (CC Attribution 3.0 License).
Your comment about mixing good sites with bad:
The Mauna Loa site is considered to represent the world vis-a-vis CO2 measurements. This paper looks to see if the Mauna Loa site could also represent the world vis-a-vis temperature readings.
What is VERY interesting is that the data from 1979 – 2009 show
1) at noontime, a DROP in temperatures of -1.4 C per 100 years, while
2) at midnight, a RISE in temperatures of +3.9 C per 100 years.
The average, then, is +1.25 C per 100 years. But this is an artefact of mathematics!
What is going on at Mauna Loa? The paper theorizes that GHGs warm the ground at night, while causing additional mixing of upper air at noontime.
There is always a spin.
What about this: at noontime temperatures are dropping, period. At nighttime, WINDS have decreased.
I was at the Mauna Loa site in January of this year. It is at 11,300′ on the leeward side of Mauna Loa volcano, surrounded by a mountain of recent blocky and ropy volcanic flows. The sun is uninterrupted by clouds, which is why there is an observatory there; if there is no breeze, the ground warms up well.
The paper makes the claim that the observations bolster the CO2 narrative. But I sense a discomfort in this conclusion.
The data have been ruthlessly smoothed for the purposes of the conclusion, by the way.
"In all of the urbanity proxies and analysis methods, the differences between urban and rural station minimum temperature trends are smaller in the homogenized data than in the unhomogenized data, which suggests that homogenization can remove much and perhaps nearly all (since 1930) of the urban signal without requiring a specific UHI correction."
Of course the differences are smaller. It's like pouring some white paint into the black paint and some black paint into the white paint, and then acting as though it is a revelation that the differences between the black and the white paint are smaller after the "homogenization". When will you give up the idea that homogenization is any kind of solution for UHI? Are you so enamored of letting the computer do it for you that you can't see it's an absurd idea right from the start?
Simply UHI-correct the urban and small-town stations based on pristine rural sites with the fewest changes in location, urbanization, and reading time of day.
Look Zeke, the objective is to determine what is happening to the global climate. If you took the simple mean of the uncorrected records of the world's 100 most pristine stations – well distributed – you would have a far more trustworthy idea of that objective than all of the nonsense that you are currently doing. But then, probably nobody would fund you or publish you for doing that, right?
I think we've been having the same "homogenization" discussion for at least 4 years. Give it up. It's a bad idea – almost as bad as extrapolating arctic and antarctic shore station data across a thousand kilometers of solid ice.
Maybe Zeke could tweet to Ward and Mandia that they are misusing or misreading his results, just as they misuse / misread everything else? Then we could discuss Zeke's paper without the distraction of their offstage caterwauling chorus…
Mike McMillan,
We examine both fully-homogenized USHCN data (using both v2 and the new v2.5 methods), data that has been homogenized using only rural stations (to avoid potentially [aliasing] in any urban signal), TOBs-only adjusted data, and raw data. Figures 1 and 2 show the resulting trends in urban-rural differences for each series and each urbanity proxy (nightlights, GRUMP, ISA, and population growth) using both station pairing and spatial gridding approaches.
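For readers unfamiliar with the pairing approach described above, here is a rough sketch of the idea in Python. The stations, values, and pairings are invented, and the paper's actual analysis also uses spatial gridding and the urbanity proxies listed, so this is only a schematic of an urban-minus-rural difference trend:

```python
import numpy as np

years = np.arange(1960, 2011)
rng = np.random.default_rng(0)

def synthetic_series(trend_per_decade):
    # linear trend plus noise; purely illustrative, not real station data
    return trend_per_decade / 10 * (years - years[0]) + rng.normal(0, 0.2, years.size)

urban = {"CityA": synthetic_series(0.25), "CityB": synthetic_series(0.22)}
rural = {"FarmA": synthetic_series(0.15), "FarmB": synthetic_series(0.16)}

# Pair each urban station with its (assumed) nearest rural neighbour, difference
# the series, and fit a linear trend to the difference. A positive slope means
# the urban record is warming faster than its rural pair.
for u, r in [("CityA", "FarmA"), ("CityB", "FarmB")]:
    slope = np.polyfit(years, urban[u] - rural[r], 1)[0] * 10
    print(f"{u} minus {r}: {slope:+.2f} C/decade")
```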
"Thanks Anthony. Here is the section of our paper discussing the tests we did around the potential for homogenization 'spreading' urban signal to rural stations."
And just so you know, Anthony, as Zeke was working on this paper, your claim of "spreading" the urban bias through homogenization was at the front of Zeke's mind. We talked about it a number of times. That is why they looked at homogenizing with rural-only stations: to put that issue to bed. Their approach does that.
So, your criticism was noted. A procedure to ensure that "spreading" didn't happen was used, and folks can see the result.
UHI is real, and by using rural stations to detect and correct it you can answer the "spreading" issue. That looks like an advance.
"Look Zeke, the objective is to determine what is happening to the global climate. If you took the simple mean of the uncorrected records of the world's 100 most pristine stations – well distributed – you would have a far more trustworthy idea of that objective than all of the nonsense that you are currently doing. But then, probably nobody would fund you or publish you for doing that, right?"
###
Did that. The answer is the same.
david
"I repeat my question to you from another thread, which you have failed to answer. What is the justification for averaging anomalies from completely different baseline temperatures when these represent completely different flux levels? For example, an anomaly of 1 from a base temperature of -30 represents a change in flux of 2.89 w/m2. How do you justify averaging it with an anomaly of 1 from a baseline of +30, which would represent a change in flux of 6.34 w/m2?"
Actually, in Berkeley Earth we don't use anomalies, so your question doesn't apply.
We work in absolute temperature; no anomalies are averaged.
Thank you for playing.
cui bono,
I sent Mandia a tweet that our paper doesn’t address micro-scale siting biases, at least to the extent that they are uncorrelated with urban form.
steven mosher;
Actually, in Berkeley Earth we don't use anomalies, so your question doesn't apply.
We work in absolute temperature; no anomalies are averaged.
Thank you for playing.
>>>>>>>>>>>>>
1. My question may not apply to Berkeley Earth, but it most certainly applies to other temperature trends. Either you can justify their use of this or you can't. Not to mention that my original question to you was posed on a thread in which you explained the value of anomalies. So the truth is that while I am playing, you are just avoiding answering the question.
2. Am I to understand that you are averaging temperatures from completely different temperature regimes? If so, how do you justify averaging a temperature of +30, which would represent 477.9 w/m2, with one of -30, which would represent 167.1 w/m2? Are you of the opinion that averaging such disparate temperature ranges has any value in understanding the Earth's at-surface energy balance?
As Ernest Rutherford once said: “If your experiment needs statistics, you ought to have done a better experiment.”
Rutherford was known for his off-the-cuff disrespectful comments, and that one actually is pretty stupid: the results of statistical analyses are used to design the next experiments. For one nice example, consider the Green Revolution.
I never quite understand why Menne and Hausfather think that they can get a good estimate of temperature by statistically smearing together all stations, the good, the bad, and the ugly, and creating a statistical mechanism to combine the data.
The issue you are trying to address is this: aggregating statistically improves the estimate of the most important effects (here, changes across time) ** if ** the variation (across types of sites) is independent of the effects of interest. It is the obligation of the people using the statistical methods to show that such variation is in fact independent of the effects of interest. If you can't do that, then you have to do what Watts et al (in preparation) have done, which is to analyze the types of sites as different strata (at minimum, estimating site-by-time interactions). This is discussed in nearly all textbooks that cover experiment design, hierarchical modeling, meta-analysis, multiple linear regression, repeated-measures analyses (including time series such as growth curves), and most of the other commonly used techniques.
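A minimal sketch of that stratification point in Python (station counts, trends, and noise levels are all invented): when siting class interacts with time, the pooled estimate simply lands at a weighted mix of the strata and hides the difference.

```python
import numpy as np

years = np.arange(1979, 2009)
rng = np.random.default_rng(1)

def simulate(n_stations, trend_per_decade):
    # invented station records: a shared linear trend plus independent noise
    trend = trend_per_decade / 10 * (years - years[0])
    return trend + rng.normal(0, 0.15, (n_stations, years.size))

well_sited   = simulate(10, 0.15)   # assumed trend for the well-sited stratum
poorly_sited = simulate(30, 0.30)   # assumed trend for the poorly sited stratum

def mean_trend(records):
    return np.polyfit(years, records.mean(axis=0), 1)[0] * 10

print(f"well-sited stratum:   {mean_trend(well_sited):+.2f} C/decade")
print(f"poorly sited stratum: {mean_trend(poorly_sited):+.2f} C/decade")
print(f"pooled, all stations: {mean_trend(np.vstack([well_sited, poorly_sited])):+.2f} C/decade")
# The pooled figure lands near the 3:1 weighted mix of the strata; it is only a
# good summary if siting class and time do not interact.
```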
As Ernest Rutherford once said: “If your experiment needs statistics, you ought to have done a better experiment.”
1. Watts et al 2012 uses statistics
2. taking an average is statistics
3. testing the null requires statistics.
http://en.wikipedia.org/wiki/Rutherford_scattering
find the statistics.
or read this
http://www.lawebdefisica.com/arts/structureatom.pdf
Yup, he should have followed his own advice.
david
"1. My question may not apply to Berkeley Earth, but it most certainly applies to other temperature trends. Either you can justify their use of this or you can't. Not to mention that my original question to you was posed on a thread in which you explained the value of anomalies. So the truth is that while I am playing, you are just avoiding answering the question."
1. Your question DOES NOT apply. You should unfool yourself.
2. Anomalies do have a value; that is different from saying they are perfect.
3. They are used in other temperature series. The effect is to change the variance during the overlap period. We've published on that aspect.
4. If the process of taking anomalies changed means or trends, and you can show that, then a Nobel prize awaits you.
5. I don't have time to answer every question. So write up your Nobel prize winner and do like Zeke did. Do like McIntyre. Do like Anthony. Publish.
Matthew R Marler says:
February 13, 2013 at 12:54 pm
As Ernest Rutherford once said: "If your experiment needs statistics, you ought to have done a better experiment."
You should understand however, that one of Rutherford’s most important papers, based on the 1909 Geiger-Marsden experiment with alpha particles being shot at a very thin gold foil held in a high vacuum, showed that (1) EXPERIMENTAL results override EVERY “theory” that even the most advanced atomic scientist of the time (J J Thomson) held
and (2) EXPERIMENTAL results are NOT subject to “statistical treatment” .
If the experimental results were held to CAGW “standards” he (Rutherford) should have thrown out every reflection of each alpha particle found at 90 degrees or more. By consensus, by conventional theory, by EVERY “scientific body”, by the conclusion of the top scientists in the world – despite Leif’s demands that solar energy discussion must begin with theory, not results – there could be NO reflection of alpha particles at the high angles infrequently observed, and thus, the results must be discarded “by statistics” …
Our modern atomic nuclear theory shows who was correct.
Doug Proctor says:
February 13, 2013 at 11:35 am
The Mauna Loa site is considered to represent the world vis-a-vis CO2 measurements. This paper looks to see if the Mauna Loa site could also represent the world vis-a-vis temperature readings.
What is VERY interesting is that the data from 1979 – 2009 show
1) at noontime, a DROP in temperatures of -1.4 C per 100 years, while
2) at midnight, a RISE in temperatures of +3.9 C per 100 years.
The average, then, is +1.25 C per 100 years. But this is an artefact of mathematics!
This might be a good example of davidmhoffer's argument about actual energy: the change at midnight clearly represents smaller energy changes per degree than the change at noon. What were the actual temperatures?
one of -30 which would represent 167.1 w/m2?
>>>>>>>>>>>>>
Well, that would be the w/m2 for -40. You really should check your math before pressing "post comment", Mr. Hoffer.
(Figured I may as well call myself out before anyone else got to it)
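For anyone checking these figures, here is a short Stefan–Boltzmann sketch (an idealized blackbody surface with emissivity 1 is assumed, not real ground). It reproduces the roughly 167 w/m2 quoted for what turns out to be -40, gives about 198 w/m2 for -30, and shows why a 1-degree anomaly is worth roughly 2.9 w/m2 at the cold baseline but about 6.3 w/m2 at +30:

```python
# Idealized blackbody surface (emissivity 1) -- an assumption, not a claim
# about real ground or the atmosphere.
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def flux(t_celsius):
    return SIGMA * (t_celsius + 273.15) ** 4

for t in (-40, -30, 30):
    delta = flux(t + 1) - flux(t)   # flux change for a 1 C anomaly at this baseline
    print(f"{t:+3d} C: {flux(t):5.1f} w/m2, a 1 C anomaly adds {delta:.2f} w/m2")
```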
davidmhoffer wrote:
“Am I to understand that you are averaging temperature from completely different temperature regimes?”
Here's a graph that demonstrates the need for regime definitions.
It shows the different slopes/trends of measurements taken at different times of day:
http://www.boels069.nl/Climate/SlopePerHourDeBiltNL.pdf
The same is true when using daily Tmax, Tmin and T(max-min).