A Considered Critique of Berkeley Temperature Series

Guest post by Jeff Id

I will leave this alone for another week or two while I wait for a reply to my emails to the BEST group, but there are three primary problems with the Berkeley temperature trends which must be addressed if the result is to be taken seriously.  Now by seriously, I don’t mean by the IPCC which takes all alarmist information seriously, but by the thinking person.

Here are the points:

1 – Chopping of data is excessive. They detect steps in the data, chop the series at the steps and reassemble them. These steps wouldn’t be so problematic if we weren’t worrying about detecting hundredths of a degree of temperature change per year. Considering that a balanced elimination of up and down steps in any algorithm I know of would always detect more steps in the opposite direction of trend, it seems impossible that they haven’t added an additional amount of trend to the result through these methods.

Steve McIntyre discusses this here. At the very least, an examination of the bias this process could have on the result is required.

2 – UHI effect. The Berkeley study not only failed to determine the magnitude of UHI, a known effect on city temperatures that even kids can detect, it failed to detect UHI at all. Instead of treating their own methods with skepticism, they simply claimed that UHI was not detectable using MODIS and therefore not a relevant effect.

This is not statistically consistent with prior estimates, but it does verify that the effect is very small, and almost insignificant on the scale of the observed warming (1.9 ± 0.1 °C/100yr since 1950 in the land average from figure 5A).

This is in direct opposition to Anthony Watts’ surfacestations project, which through greater detail was very much able to detect the ‘insignificant’ effect.

Summary and Discussion

The classification of 82.5% of USHCNv2 stations based on CRN criteria provides a unique opportunity for investigating the impacts of different types of station exposure on temperature trends, allowing us to extend the work initiated in Watts [2009] and Menne et al. [2010].

The comparison of time series of annual temperature records from good and poor exposure sites shows that differences do exist between temperatures and trends calculated from USHCNv2 stations with different exposure characteristics. Unlike Menne et al. [2010], who grouped all USHCNv2 stations into two classes and found that “the unadjusted CONUS minimum temperature trend from good and poor exposure sites … show only slight differences in the unadjusted data”, we found the raw (unadjusted) minimum temperature trend to be significantly larger when estimated from the sites with the poorest exposure sites relative to the sites with the best exposure. These trend differences were present over both the recent NARR overlap period (1979-2008) and the period of record (1895-2009). We find that the partial cancellation Menne et al. [2010] reported between the effects of time of observation bias adjustment and other adjustments on minimum temperature trends is present in CRN 3 and CRN 4 stations but not CRN 5 stations. Conversely, and in agreement with Menne et al. [2010], maximum temperature trends were lower with poor exposure sites than with good exposure sites, and the differences in trends compared to CRN 1&2 stations were statistically significant for all groups of poorly sited stations except for the CRN 5 stations alone. The magnitudes of the significant trend differences exceeded 0.1°C/decade for the period 1979-2008 and, for minimum temperatures, 0.7°C per century for the period 1895-2009.

The non-detection of UHI by Berkeley is NOT a sign of a good quality result, considering the amazing detail that went into Surfacestations by so many people. A skeptical scientist would naturally be concerned by this, and it leaves a bad taste in my mouth, to say the least, that the authors aren’t more concerned with the Berkeley methods. Either surfacestations’ very detailed, very public results are flat wrong or Berkeley’s black box, literal “characterization from space” results are.

Someone needs to show me the middle ground here because I can’t find it.

I sent this in an email to Dr. Curry:

Non-detection of UHI is a sign of problems in method. If I had the time, I would compare the urban/rural BEST sorting with the completed surfacestations project. My guess is that the comparison of methods would result in a non-significant relationship.

3 – Confidence intervals.

The confidence intervals were calculated in this method by eliminating a portion of the temperature stations and looking at the noise that the elimination created. Lubos Motl described the method accurately as intentionally ‘damaging’ the dataset.  It is a clever method to identify the sensitivity of the method and result to noise.  The problem is that the amount of damage assumed is equal to the percentage of temperature stations which were eliminated. Unfortunately the high variance stations are de-weighted by intent in the processes such that the elimination of 1/8 of the stations is absolutely no guarantee of damaging 1/8 of the noise. The ratio of eliminated noise to change in final result is assumed to be 1/8 and despite some vague discussion of Monte-Carlo verifications, no discussion of this non-linearity was even attempted in the paper.
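
The weighting point can be made concrete with a small sketch. It is not the BEST code: it assumes simple inverse-variance station weights (an illustrative choice, not their actual scheme), and the function name `removal_shares` and the interleaved grouping are made up for the example. For each of eight groups it reports the share of stations removed, the share of total weight removed, and the share of raw station variance removed, so the three can be compared.

```python
def removal_shares(sigmas, n_groups=8):
    """Compare what is removed when one of n_groups station groups is dropped.

    sigmas: per-station noise standard deviations (hypothetical inputs).
    Returns one tuple per group:
      (share of stations, share of inverse-variance weight, share of raw variance)
    Assumes weights of 1/sigma^2, which is an illustration only.
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    variances = [s ** 2 for s in sigmas]
    total_w = sum(weights)
    total_v = sum(variances)
    n = len(sigmas)

    shares = []
    for g in range(n_groups):
        idx = range(g, n, n_groups)  # every n_groups-th station, starting at g
        shares.append((
            len(idx) / n,
            sum(weights[i] for i in idx) / total_w,
            sum(variances[i] for i in idx) / total_v,
        ))
    return shares
```

With unequal station noise, the second and third numbers in a tuple generally differ from the first, which is the non-linearity described above: dropping 1/8 of the stations need not remove 1/8 of the weight or 1/8 of the noise.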

Prayer to the AGW gods.

All that said, I don’t believe that warming is undetectable or that temperatures haven’t risen this century. I believe that CO2 helps warming along as the most basic physics proves. My objection has always been to the magnitude caused by man, the danger and the literally crazy “solutions”. Despite all of that, this temperature series is, statistically speaking, the least impressive on the market. Hopefully, the group will address my confidence interval critiques, McIntyre’s very valid breakpoint detection issues and a more in-depth UHI study.

Holding of breath is not advised.

145 Comments
mindert eiting
November 4, 2011 5:57 am

Dear Jeff,
How do you know that all those surface measurements reflect the earth and not the policy of the WMO? Here is a recipe for data fraud: each year you may include some new stations and some others may be dropped. Open stations as you like. However, before closing stations you should compute the slope of their private regressions. Even in a small area warming and cooling stations may co-exist. You only have to close more cooling stations than warming ones in order to create a warming world. It’s better to invest in methods of detecting data manipulation even for BEST. One method is survival analysis. Compute as dependent variable for each station the number of years it was on duty. Define as independent variables (1) latitude category, and (2) station’s melody. I defined the latter on 8000 GHCN stations as follows: compute regression slopes for the periods 1930-1969 and 1970-2009. Dichotomize them as up or down. You get four melodies, down-down, down-up, up-down, and up-up. It’s not surprising that stations in the northern hemisphere ‘live’ longer than in the tropics. But I found that their life expectancy also depends on melody, with differences of more than six times the sum of the associated standard deviations. This is just some private exploration, but you may understand that I do not trust the data even if they are called BEST.
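
For readers who want to try the "melody" classification described above, here is a minimal sketch. It is not the commenter's code: the function names, the input format (a dict of year to annual mean temperature per station) and the choice to count a zero slope as "down" are all assumptions made for illustration. It leaves out the latitude control and any significance testing.

```python
from collections import defaultdict
from statistics import mean


def ols_slope(years, temps):
    """Ordinary least-squares slope of temps regressed on years."""
    ybar, tbar = mean(years), mean(temps)
    num = sum((y - ybar) * (t - tbar) for y, t in zip(years, temps))
    den = sum((y - ybar) ** 2 for y in years)
    return num / den


def melody(record, start=1930, split=1970, end=2009):
    """Classify a station as 'down-down', 'down-up', 'up-down' or 'up-up'.

    record: dict mapping year -> annual mean temperature (hypothetical format).
    Returns None if either period has fewer than two years of data.
    A slope of exactly zero is counted as 'down' (arbitrary choice).
    """
    early = sorted(y for y in record if start <= y < split)
    late = sorted(y for y in record if split <= y <= end)
    if len(early) < 2 or len(late) < 2:
        return None
    s1 = ols_slope(early, [record[y] for y in early])
    s2 = ols_slope(late, [record[y] for y in late])
    return ("up" if s1 > 0 else "down") + "-" + ("up" if s2 > 0 else "down")


def years_on_duty(record):
    """Number of years with data, used here as the station 'lifetime'."""
    return len(record)


def mean_duty_by_melody(stations):
    """Average years on duty for each melody across a list of station records."""
    groups = defaultdict(list)
    for record in stations:
        m = melody(record)
        if m is not None:
            groups[m].append(years_on_duty(record))
    return {m: mean(v) for m, v in groups.items()}
```

Comparing the output of `mean_duty_by_melody` within a single latitude band would be a first, rough version of the check described in the comment.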

JR
November 4, 2011 6:00 am

Re: BillD
Where do you come up with the statement that “…in arctic, where UHI is clearly not a factor”?
The trends since 2002 at CRN station 4 ENE of Barrow AK and at GHCN 42570026 (which is listed as rural in GHCN) are:
CRN: -1.2 C/century
GHCN: +7.9 C/century
I didn’t cherry-pick 2002 – it is when the CRN data start. But it sure suggests that UHI is happening even in the arctic.

Jason Calley
November 4, 2011 6:03 am

Rasey says: November 3, 2011 at 9:56 pm
Thank you! That was a wonderfully understandable post about how low frequency info can be disappeared and high frequency info emphasized.

Editor
November 4, 2011 6:07 am

Muller admits that – 27% of the Global Historical Climatology Network Monthly stations are located in cities with a population greater than 50000.
and that – currently GISS allow for this (UHI) effect by making adjustments which result in a reduction in global temperatures of about 0.01C over the period 1900-2009.
Something does not add up here.
http://notalotofpeopleknowthat.wordpress.com/2011/10/23/mullers-problem-with-uhi/

Latitude
November 4, 2011 6:12 am

If rural and UHI are trending the same…
…look for a reason
It could be something as simple as CO2 makes plants darker green……..

Matt
November 4, 2011 6:14 am

Jeff,
Thanks for the response.
“Perhaps you are unaware of this but if you take the time to correctly process satellite LTL data and compare it to ground data, there is a statistically significant difference in trend. i.e. detrend sat data, scale variance, retrend, regress, examine residuals. That is really all the confirmation of UHI that I need. So when a paper is published on non-detection of UHI, it is an example of go home and do it again.”
I appreciate that there may be a statistically significant difference between satellite and land trends, but there are two important caveats on that:
1. I am not an expert on temp reconstruction and I don’t know if anyone quantifies systematic uncertainties for these reconstructions. But, when two measurements are made using very different methods, one cannot just compare the statistical uncertainties to claim a significant effect. One has to use the total uncertainty (including systematics). Otherwise you’ll detect a statistical significance that isn’t significant.
2. Even if there is a statistically significant difference between the satellite and thermometer trends, that is not “all the confirmation of UHI [you] need”. The claim that UHI accounts for the difference is purely speculative. It is not a totally unreasonable hypothesis and it goes in the right direction. But, it is speculation…and one of many possible explanations for the difference. The satellite measurements might not have siting problems, but they have plenty of challenges and unknowns of their own. Reconstructing surface temps from satellite data requires some extrapolation from models and I remind you that for much of the early history of the UAH, miscorrections for orbital decay gave the temp trend an opposite slope. So the discrepancy could be an overestimate in the thermometer record OR an underestimate in the satellite record OR a little bit of both OR just an underestimate of the error bars. There is simply no way to know absent further work (and perhaps some time). It certainly shouldn’t be “all you need” to confirm UHI.
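
The first caveat can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `trend_difference_z` is made up, the statistical and systematic uncertainties are treated as independent and added in quadrature, and any numbers fed to it would be hypothetical rather than taken from the satellite or surface records.

```python
import math


def trend_difference_z(trend_a, trend_b, stat_a, stat_b, sys_a=0.0, sys_b=0.0):
    """Difference between two trends in units of its total uncertainty.

    stat_* are statistical (1-sigma) uncertainties and sys_* are systematic
    ones; all are assumed independent and combined in quadrature.
    """
    total = math.sqrt(stat_a ** 2 + stat_b ** 2 + sys_a ** 2 + sys_b ** 2)
    return (trend_a - trend_b) / total
```

A difference that looks significant with `sys_a` and `sys_b` left at zero can drop well below the usual thresholds once even a modest systematic term is included, which is the point being made above.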

JJ
November 4, 2011 6:22 am

Richard S Courtney says:
I write to point out a confusion that seems to exist in the minds of several posters to this thread;
i.e. UHI and local anthropogenic effects on temperature are not the same thing.

Yes, they very often are, with the only difference being one of scale. A ‘rural’ thermometer may yet be influenced by proximity to asphalt and other heat collecting surfaces, heat dissipating equipment, heated buildings, etc. UHI is nothing more or less than those same impacts aggregated over a larger volume.
This is why I stated that limiting analysis of UHI to only the largest aggregations of the effect – cities – was effectively a semantic argument. They are only looking for the UHI effect in ‘urban’ areas. Nonsense. The importance of the UHI mechanism is its proximity to thermometers, not how close it is to hip-hop artists.

Pamela Gray
November 4, 2011 6:46 am

As I sit here surrounded by snow at pass level in the mountains (that I will have to negotiate later today), I am contemplating the significance of studies centered on the previous warming trend that has obviously stalled to anyone with enough sense to be able to read a thermometer.
Does it matter to my selection of coats and boots who did what to the data? Will it help me predict that the snow will be more wet, the rain less dry, the storm more extreme, or the resulting snow pack more, less, or somewhat rotten? Apparently it does to some people, mostly scientists who are trying to relocate the warming signal in a cooling oscillation. It may also matter to liberal voters who want onerous regulations over a gnats-ass trend. And it matters to statisticians who want analysis done right. To be sure, if warming comes round again, I would rather not go through the hysteria we have had as a daily meal shoved down my throat again.
But for today, my selection of a coat and boots to wear, as well as what time I should start out to safely navigate the pass, will depend on common sense. OMG. Common sense. To those of you who are mesmerized by impending doom and see only dark days ahead in the now-you-see-it, now-you-don’t temperature trend BEST has relocated, ask your grandparents what common sense means.

mindert eiting
November 4, 2011 6:52 am

A second trial (perhaps I used a forbidden word). Has the BEST team done a survival analysis? I did it for 8000 GHCN stations. Define as dependent variable the number of years a station was on duty. Define as independent variables (1) latitude category, and (2) station’s melody. Compute for the relevant stations the regression slopes for 1930-1969 and 1970-2009. Dichotomize these as up or down. You get four melodies. It’s not surprising that a station’s life expectancy depends on latitude: in the northern hemisphere stations are on duty for more years than in the tropics. Having controlled for that, suppose the expectancy depended on melody. What would you conclude? Before applying sophisticated statistics, quality controls are needed. I have my reasons not to trust these data.

kim
November 4, 2011 6:56 am

mindert 5:57.
Hum a few bars and they can fake it.
=============

Jeff Id
November 4, 2011 7:02 am

Matt,
You have made the assumption that I’ve made some blind claims which were not considered carefully. I’ve been at this for quite a while now though and have a few dozen links from the three and a half years of work put into this climate blog thing. Some of these have been run at WUWT in the past. My own global temp reconstruction which does nothing for UHI.
http://noconsensus.wordpress.com/2010/03/25/thermal-hammer-part-deux/
and comparisons of satellite and ground data including problems in both:
http://noconsensus.wordpress.com/2010/02/20/345/
http://noconsensus.wordpress.com/2009/11/09/statistical-significance-in-satellite-data/
http://noconsensus.wordpress.com/2009/01/30/tropospheric-temperature-trend-amplification/
http://noconsensus.wordpress.com/2009/01/23/bifurcated-temperature-trend/
http://noconsensus.wordpress.com/2009/10/28/satellite-temps-getting-closer/
Corrections to RSS:
http://noconsensus.wordpress.com/2009/01/19/satellite-temp-homoginization-using-giss/
The newer posts are more accurate than the old as my opinions changed and I learned. In addition to these posts, I have discussed the issues with various climate scientists and obviously have read a lot of literature on the subject. Actually, temperature reconstructions are the only thing I’m published on in climate science.
I always suggest to people that they make their own conclusions, but if you don’t spend the time looking at the differences and the magnitudes of the differences, a true familiarity cannot occur. It is my opinion that there is a real (statistically significant) difference between ground and satellite trends. Perhaps it is time for an updated post on that.
Another example you can see refuting the Berkeley UHI claim, is the difference between ground and ocean surface trends which also should not be ignored. Offsets are fine but why are trends different? How long can trends stay different? I have spent very little time on ocean data because this is a hobby. It seems a reasonable question which has been answered unsatisfactorily to date.

Mark Buehner
November 4, 2011 7:36 am

Phil Jones of all people demonstrated the UHI effect in his analysis of Chinese stations. I guess they blew right past embarrassment on that paper and moved on to pretending it doesn’t exist.

November 4, 2011 7:41 am

I understood UHI to account for 1% of land surface area. The book; “Living in the Endless City” states that it is 2% of land surface area. Characterising UHI with a simple % does not describe fully the nature of that UHI coverage and makes it easy to dismiss as too small. This approach appears to be in contrast to the view that a change in the atmosphere of 1% of 1% in terms of CO2 content is considered very significant.
Whether Urbanisation is 1% or 2%, this number does not describe the fact that nearly all urbanisation is concentrated around mid latitudes where the sun has more impact in heating the surface. That 1% or 2% now starts to look a lot bigger. Add to that, urbanisation is further concentrated to cover large areas of conurbation such as the US Eastern seaboard. UHI is not confined to the city limits and involves all ancillary activities that cities require, such as food and water.
Not long ago a watershed was passed where more of us were now living in cities. That still leaves a massive 3.5 billion living in rural areas, which is twice what the total population was in 1900.

theduke
November 4, 2011 7:52 am

Jeff Id writes re Anthony’s project and BEST: “My guess is that the comparison of methods would result in a non-significant relationship.”
Can you expand and clarify on that?

Chris S
November 4, 2011 7:56 am

WORST not BEST. Without Objective or Robust Statistical Techniques.

November 4, 2011 8:15 am

Verity
ANSWER the question
“Steven Mosher says:
November 3, 2011 at 7:57 pm
Are you open to discussion of the upper bound…..?
In broad agreement. I read Steve’s excellent post when there were no comments on it. I haven’t got back to read the comments yet (will do so this evening). The upper bound of 0.1C/decade only goes back to the start of the satellite era and cannot be extrapolated backwards.”
############
I am not suggesting an extrapolation.
Do you agree with steve and spencer and Christy or not!
Answer, do not assume that I will make the move you suggest.. That will never happen

Matt
November 4, 2011 8:47 am

Thanks Jeff,
Will look at your material. However, I might not have time to look at everything in depth, and I couldn’t find any analysis that points to UHI as the source of the discrepancy between satellite and earth-based measurement or excludes the possibility of other effects. Could you point me specifically to that analysis?
Also, one point of clarification:
We are talking about a few percent discrepancy in the slopes, right? As I’ve suggested, measures of statistical significance are a little hairy. If you don’t account for systematics, I’d be a little skeptical. But, let’s assume that there is a statistically significant effect. It looks like the effect amounts to a 3% difference in slope between satellite and earth-based reconstruction over the lifetime of the satellite record. Am I right on that? Personally (and I acknowledge that this is a subjective statement) I’d say that it is pretty impressive that such vastly different methods as satellite and thermometer reconstructions come within 3% of each other. I also feel that a 3% effect is not large enough to justify Anthony’s claims that “it cannot be credibly asserted there has been any significant ‘global warming’ in the 20th century” or that “all terrestrial surface-temperature databases exhibit signs of urban heat pollution and post measurement adjustments that render them unreliable for determining accurate long-term temperature trends” (http://scienceandpublicpolicy.org/images/stories/papers/originals/surface_temp.pdf).

November 4, 2011 8:50 am

Mr. Bradly:
Yes, that might be true that a MIN-MAX thermometer was invented in 1782. (Daniel Gabriel Fahrenheit invented the alcohol thermometer in 1709). However – considering that there was a revelation about 6 years ago that the DEW and BMEWS operators were “fudging” the winter low (and even the day time high, if cold enough) numbers because it was TOO DAMNED COLD TO GO OUT AND TAKE READINGS..
And I would BEG of you to realise COST and SOPHISTICATION. TO IMPLY that all these readings were taken with Min – Max thermometer is disingenuous and misleading at best. It is complete “temporal provincialism”. Temporal provincialism in its basic sense is to judge everything in the future and everything in the past based on what we know now.
So, for example, putting today’s abilities and “quality assurance” standards on our forebears is as illegitimate as making computer programs to predict “climate” 50 and 100 years in the future from NOW.
Sorry, don’t buy the hand waving. Min-max thermometers were NOT THE NORM in the 19th century. It was hand recorded data with the inherent errors. (So me diaries, sketches, records.)

November 4, 2011 8:51 am

Dang, I hate making trivial errors. “SHOW ME” not “So me”.
Max

Spen
November 4, 2011 8:51 am

I am still puzzled by confidence levels/error range. I assume the accuracy of the older temperature measurements was no better than +/- 0.5 deg C. Shouldn’t that degree of accuracy apply to the anomaly?

Don Monfort
November 4, 2011 9:11 am

Steve Mosher,
What’s wrong with the map? It was offered as an eyeball comparison with the lousy map in the BEST UHI paper (figure 2). Do you think that figure 2 belongs in a scientific paper? It’s mislabeled (the black dots are not the rural sites but the very-rural sites), and what’s with the totally black USA? Yes I have seen their explanation, but I am sure you could have done much better than that. And anyone should be able to compare the maps and see that the allegedly very-rural stations (black dots) are not in major areas that are obviously the least populated places on the planet. Look at Africa, S. America and Australia. Where are the black dots that represent stations far from urban areas located? Look at Australia, clearly the black dots are not concentrated in the sparsely populated interior, but in the populated coastal areas. Same story in Africa and S America. Not too many black dots in the Sahara Desert, or the Amazon. Go to the cities to find the black dots. The paper says that the US appears to be very black, but only 18% of the sites are very-rural. Again look at Africa, S America, Australia and other places with large sparsely populated areas. See much black? Are you following me here, Steve?
Look, the point is not that they picked relatively less populated areas out of their 39,000 stations. The problem is that urban, one-horse town, burg, hamlet, suburban and blah…blah…blah areas with human influence on local temperature are heavily overrepresented in the data. The truly very-rural sites are generally not there, because there are very few stations in those places. Do you believe that this study has done what Muller claims for it? That it has laid to rest climate skeptics’ doubts regarding UHI effect in estimating global warming?
Oh, but some geniuses are claiming that this paper was not about UHI. Read the title of the paper, or actually read the paper, if you have a few minutes. And read Muller’s article in the WSJ, in which he falsely claimed that BEST compared urban to very-rural, distant from urban areas, and the climate skeptics got no case on UHI any more. See how many times UHI, or references to UHI are found in those things, which you have not yet read.

November 4, 2011 9:13 am

Steven Mosher says:
November 3, 2011 at 11:21 pm
So Theo.
from 1979 to 2010.
UAH has .18C of warming
Best has .28C of warming
===========================================
Maybe this can help……
click for an ugly graph
I think it is becoming increasingly clear that UHI is a term that adds to the confusion. 🙂
I also think it is clear that the BEST team has a ways to go before they’ll be current on the discussion. Rural and not very rural? lol

Don Monfort
November 4, 2011 9:39 am

Steve Mosher,
I forgot to address this:
“you are misunderstanding what they mean.
what they mean is that sites with No built pixels are very rural ( 16,000)
all other sites are called NOT very rural.. that includes urban”
I know that includes urban. It also includes rural, because with the tool they chose to use they cannot distinguish urban from rural. Is that the right tool to use in trying to find out something about UHI? Isn’t this BEST UHI study rather amateurish? Jeff’s criticism is valid, and that is why he hasn’t heard back from Prof. Dr. Muller.

November 4, 2011 9:51 am

Jeff
“All that said, I don’t believe that warming is undetectable or that temperatures haven’t risen this century. I believe that CO2 helps warming along as the most basic physics prove”
Sorry Jeff, which basics would that be? would that be different to mine- I don’t know if the net effect is warming or cooling.
http://www.letterdash.com/HenryP/the-greenhouse-effect-and-the-principle-of-re-radiation-11-Aug-2011
these data from BEST and what not are all meaningless without the maxima and minima? – which would prove natural warming and by how much?
http://www.letterdash.com/HenryP/more-carbon-dioxide-is-ok-ok

Keith
November 4, 2011 10:03 am

Matt says:
November 3, 2011 at 9:09 pm
The strongest thing Anthony’s paper seems to be saying is that the poor quality stations tend to amplify diurnal variation. That just adds to the noise, but as Anthony admits, the noise cancels out when you get an average trend. It seems that in Anthony’s own summary of the Fall et al paper, he admits that including the poor stations does not bias the average daily temperature trends towards more warming (if anything, towards less warming).

It said the opposite: “While the 30-year and 115-year trends, and all groups of stations, showed warming trends over those periods, we found that the minimum temperature trends appeared to be overestimated and the maximum warming trends underestimated at the poorer sites.” See http://wattsupwiththat.com/2011/05/11/the-long-awaited-surfacestations-paper/
In other words, minimum temps are rising faster than trend and maximum temps slower than trend at the poorer sites, giving a smaller diurnal variation. This will reduce noise in the data, giving erroneously reduced error bars and confidence intervals, unless this is catered for through UHI/siting adjustments/corrections. Reduced diurnal variation, particularly in winter, is a key prediction of CO2 warming theory, so poor siting gives a false degree of confidence that CO2 is indeed having this forecast effect.
Too many people seem to have misunderstood and failed to pick up this key finding from the Fall et al paper, instead proclaiming that the trend of the ‘average’ ((min+max)/2) is unaffected by siting issues and that therefore the Surfacestations project was a waste of time. None so blind, etc…