NOAA's National Climatic Data Center caught cooling the past – modern processed records don't match paper records

We’ve seen, time and again, examples of the past being cooled via homogenization in GISS, HadCRUT, and other temperature data sets. By cooling the data from the past, the trend/slope of the temperature over the last 100 years increases.
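To see why cooling only the early part of a record steepens its fitted trend, consider a toy example. The numbers below are entirely invented for illustration, not real station data:

```python
import numpy as np

# A made-up, perfectly flat 100-year temperature series (deg F).
years = np.arange(1900, 2000)
temps = np.full(100, 60.0)

# Linear trend of the unmodified series: essentially zero.
slope_raw = np.polyfit(years, temps, 1)[0]

# "Cool the past": subtract 1 deg F from the first 30 years only.
adjusted = temps.copy()
adjusted[:30] -= 1.0

# The fitted trend is now positive, even though no recent value changed.
slope_adj = np.polyfit(years, adjusted, 1)[0]
print(slope_raw, slope_adj)  # ~0.0 vs ~+0.013 deg F per year
```

Any downward shift confined to the early portion of a series has this effect; the size of the slope change scales with the size of the shift and the fraction of the record it covers.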

This time, the realization comes from an unlikely source, Dr. Jeff Masters of Weather Underground via contributor Christopher C. Burt. An excerpt of the story is below:

Inconsistencies in NCDC Historical Temperature Analysis

Jeff Masters and I recently received an interesting email from Ken Towe, who has been researching the NCDC historical temperature database and came across what appeared to be some startling inconsistencies: namely, that the average state temperature records used in the current trends analysis by the NCDC (National Climatic Data Center) do not reflect the actual published records as they appeared in the Monthly Weather Reviews and Climatological Data Summaries of years past. Here is why.

An Example of the Inconsistency

Here is a typical example of what Ken uncovered. Below is a copy of the national weather data summary for February 1934. If we look at, say, Arizona for the month, we see that the state average temperature was 52.0°F.

The state-by-state climate summary for the U.S. in February 1934. It may be hard to read, but the average temperature for the state of Arizona is listed as 52.0°F. From Monthly Weather Review.

However, if we look at the current NCDC temperature analysis (which runs from 1895-present) we see that for Arizona in February 1934 they have a state average of 48.9°F, not the 52.0°F that was originally published:

Here we see a screen capture of the current NCDC long-term temperature analysis for Arizona during Februaries. Note in the bar at the bottom that for 1934 they use a figure of 48.9°.

Ken looked at entire years of data from the 1920s and 1930s for numerous different states and found that this ‘cooling’ of the old data was fairly consistent across the board. In fact, he produced some charts showing this. Here is an example for the entire year of 1934 for Arizona:

The chart above shows how many degrees cooler each monthly average temperature for the entire state of Arizona in 1934 is in the current NCDC database, compared with the actual monthly temperatures published in the original Climatological Data Summaries by the USWB (U.S. Weather Bureau) in 1934. Note, for instance, that February is 3.1°F cooler in the current database than in the historical record. Table created by Ken Towe.

Read the entire story here: Inconsistencies in NCDC Historical Temperature Analysis

================================================================

The explanation given is that they changed from the ‘Traditional Climate Division Data Set’ (TCDD) to a new ‘Gridded Divisional Dataset’ (GrDD) that takes into account inconsistencies in the TCDD.

Yet as we have seen time and time again, with the exception of a -0.05°C cooling applied for UHI (which is woefully under-represented), all “adjustments, improvements, and fiddlings” applied to the data by NCDC and other organizations always seem to result in an increased warming trend.

Is this purposeful mendacity, or just another example of confirmation bias at work? Either way, I don’t think the private citizen observers of NOAA’s Cooperative Observer Program, who gave their time and effort every day for years, really appreciate that their hard work is tossed into a climate data soup and then seasoned to create a new reality different from the actual observations they made. In the case of Arizona and the changing of the Climate Divisions, it would be the equivalent of redrawing state borders today and then claiming that fewer people lived in Arizona in 1934 because the borders have changed. That wouldn’t fly, so why should this?

Sure there are all sorts of “justifications” for these things published by NCDC and others, but the bottom line is that they are not representative of true reality, but of a processed reality.

h/t to Dr. Ryan Maue.

UPDATE: Here’s a graph showing cumulative adjustments to the USHCN subset of the entire US COOP surface temperature network done by Zeke Hausfather and posted recently on Lucia’s Blackboard:

This is calculated by taking USHCN adjusted temperature data and subtracting USHCN raw temperature data on a yearly basis. The TOBS adjustment is the lion’s share.
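The arithmetic behind such a plot is simple. A minimal sketch with stand-in numbers (not the actual USHCN files, and adjustments invented purely for illustration) looks like this:

```python
import numpy as np

# Stand-in data: raw and adjusted monthly means for one station over
# three years (rows = years, columns = months), in deg F.
raw = np.array([[50.0] * 12, [51.0] * 12, [52.0] * 12])
adj = raw + np.array([[-0.5] * 12, [0.0] * 12, [0.3] * 12])

# The plotted quantity: adjusted minus raw, averaged per year.
yearly_adjustment = (adj - raw).mean(axis=1)
print(yearly_adjustment)  # [-0.5  0.   0.3]
```

The real calculation does the same subtraction across the full USHCN station network before averaging each year.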

Editor
June 7, 2012 3:56 am

The original handwritten/typed monthly temperature and rainfall records can all be accessed here for individual stations, right back to Year Dot.
http://www7.ncdc.noaa.gov/IPS/coop/coop.html
Also, there is a very useful monthly state summary here, which shows the actual stations that make up the state average.
http://www7.ncdc.noaa.gov/IPS/cd/cd.html
I will have a closer look when I get a minute, but anyone can have a go themselves, as everything you need is there.

Blade
June 7, 2012 4:24 am

Gail Combs [June 6, 2012 at 7:12 pm] says:
At the rate the US temperature data is changing the records we will have a mile of ice sitting on New York City and the data will still be showing a warming trend.
http://jonova.s3.amazonaws.com/graphs/giss/hansen-giss-1940-1980.gif

LOL funny but true!

beng
June 7, 2012 4:42 am

****
Auto says:
June 6, 2012 at 1:56 pm
[On present trend, DC and much of Maryland will be covered by a kilometer-deep ice cap by 2025, extrapolating the cooling seen last night, using a model I made (up) earlier. /Sarc. /Not real!]
****
It’s a rather system-shocking 40F here in west MD this morning. My tomatoes/okra just can’t get any love…

Caleb
June 7, 2012 5:07 am

For those who are just waking up to the fun and games involved with “adjustments,” I refer you to the Climate Audit post of August 8, 2007. http://climateaudit.org/2007/08/08/a-new-leaderboard-at-the-us-open/
This was my personal wake-up-call.
However, once you get the hang of playing with adjustments, it can be quite helpful in terms of feeling better about your personal budget. Rather than being depressed about being broke, you can make a few adjustments-for-inflation, playing with various “proxies.” For example, the price of gold per ounce has gone up since 1865, while the price of aluminum has fallen. Pick your proxy, extend your “trend-line,” and guess what!!? You’re not broke after all!!!

Phil C
June 7, 2012 5:31 am

BTW, you should know that you’ve violated site policy by changing names. You’ve previously commented here as John Parsons and now are commenting as “atarsinc”.
Why do you care?
[REPLY: Because commenting under multiple names is sock-puppetry. Because even anonymous commenters should be accountable. Why do you feel a need to question our policy? -REP]

RockyRoad
June 7, 2012 5:33 am

So now with these revelations, the NCDC can’t claim (with any validity) to be a “Data Center”. Now they’re just the NC. And with “Data Center” gone, they can’t claim to be “National” or “Climatic” either.
Now they’re nothing. How sad.

Phil C
June 7, 2012 7:04 am

Because commenting under multiple names is sock-puppetry.
A tautology here. I’ll move on …
Because even anonymous commenters should be accountable.
Why? What does it matter? In all the posts I’ve read here for a number of years, I see no harm in someone wishing to remain anonymous. Perhaps you could offer one up. And I’m at a loss as to what specifically these people should be held to account for. It’s the content of the post, the value of the argument and not the person’s name that is relevant to the discussion.
Why do you feel a need to question our policy?
Because I’ve seen valid discussions here cut short because someone violated this policy that has nothing to do with the discussion. And it seems like every time that happens, it happens to someone who argues against the original post. Meanwhile, I see plenty of comments of an ad hominem nature (I have been subject to some of those) coming from people who don’t use their real names, yet I can’t recall an instance where the moderators stepped in to that unless the person making the comment was challenging the original author. In other words, your decision to intervene with your policy regarding names appears to be applied based on the content of the poster’s argument.
[REPLY: “Tautology?” It seems you really do need everything spelled out for you. Sock-puppetry is a form of astro-turfing. Even if you are not concerned about that sort of behavior, we are. Remaining anonymous is one thing, but making contradictory statements or constantly repeating statements that have already been addressed is another. In that sense, even anonymous commenters are accountable for what they have claimed and for their actions on this site. Your last complaint is just plain false. This morning I snipped a comment that could well be described as skeptic for sock-puppetry. You didn’t see the comment because it was snipped. Anthony and his moderators bend over backwards to give every sincere comment (and not-so-sincere, like many of yours) a fair airing. Many other sites do not. This is Anthony’s living room on the internet and he does not have to entertain every insult and innuendo people may care to utter. Don’t like it? Tough. Now, this conversation is over. Moderation policy is NOT open for discussion or debate and any further comments along this line will be deleted. That’s tough, too. -REP]

Mark
June 7, 2012 7:04 am

Myron Mesecke says:
Why is it that the older data that had less influence from man made structures, roads, changes to land, no air conditioning units, fewer parking lots and smaller airports is considered to have inconsistencies and must be “adjusted”?
If the aim were to compensate for UHI then “cooling” historical data makes no sense. You’d either want to “cool” recent data or “warm” historical data.
The “curve” given makes even less sense, it’s simply the wrong shape for any kind of UHI “compensation”.
I wonder how adjusted and unadjusted plots would compare.

Mark
June 7, 2012 7:37 am

Steven Mosher says:
1. changing the time of observation from mid day to morning or mid night to noon, WILL create a bias. Depending on the time and place that bias can be positive or negative, large or small.
Given that there are at least 5 possible ways to work out time of day things get especially tricky when you are trying to compare different sites. In plenty of parts of the world timezones are radically different from local time according to longitude even without DST. “Midnight” according to timezone may equate to “somewhere between 21:30 and 02:30” according to longitude.
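A toy illustration of the time-of-observation bias Mosher describes (all numbers invented): a max/min thermometer that is read and reset near the afternoon peak can carry one hot afternoon into the next day's reading, double-counting it.

```python
# Made-up daily maximum temperatures (deg F); the third day is unusually hot.
true_max = [70, 72, 95, 71, 70]

# With an afternoon reading time, the thermometer is reset near the daily
# peak, so the hot third day's heat is still in the instrument when the
# fourth day is read: it records max(day3, day4) instead of its own maximum.
recorded = true_max.copy()
recorded[3] = max(true_max[2], true_max[3])

bias = sum(recorded) / len(recorded) - sum(true_max) / len(true_max)
print(bias)  # +4.8: afternoon observation warm-biases this toy record
```

A morning reading time produces the mirror-image effect on minima, which is why a change in observation time introduces a step that the TOBS adjustment attempts to correct.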

Maus
June 7, 2012 7:48 am

Christopher C. Burt: 11:04 “This is why, for the sake of determining long-term trends, it is not possible to simply use the same raw data from 1934 as in, say, 2012.”
Assume this claim is true. And yet you claim that moving city hall a few blocks is sufficient to destroy our capability to detect or measure the long term trend. For both of these to be true it must be that the long-term trend is significantly less than the difference in temperatures over a distance of one-half of one mile. It follows then that to even potentially detect such a small long-term trend that there would need to be stations positioned at a minimum of 1/4 of one mile distant from another. (Nyquist, etc.)
But if that is the case there is no significance to 8 of 78 stations being in cities, or to speaking of ‘warmest counties or regions’, if the best coverage of those 78 stations is 15.3153 (rounding up) square miles out of a region comprising 113,594.08 square miles: a valid sample coverage of 13/1000ths of 1% of the area of interest.
Such that if we assume, to begin with, that the samples are randomly distributed, then everything is as you say it is. But if we assume they are not randomly distributed, then it is certain that the only thing that can be said is that temperature averages and trends, whether gridded or not, are like unicorns: a fantasy for now, but something that will make someone filthy, stinking rich if it can ever actually be implemented.

JR
June 7, 2012 8:00 am

Re: Steve Mosher 9:29pm
Please show one example of a station that was moved 500 meters or more in elevation and retained the same station ID. A move like that should result in a new ID being assigned after the move.

alex the skeptic
June 7, 2012 8:17 am

Excerpt from George Orwell’s Nineteen Eighty-Four (Part 3, Chapter 2):
>>O’Brien was looking down at him speculatively. More than ever he had the air of a teacher taking pains with a wayward but promising child.
‘There is a Party slogan dealing with the control of the past,’ he said. ‘Repeat it, if you please.’
‘”Who controls the past controls the future: who controls the present controls the past,”‘ repeated Winston obediently.
‘”Who controls the present controls the past,”‘ said O’Brien, nodding his head with slow approval. ‘Is it your opinion, Winston, that the past has real existence?’
Again the feeling of helplessness descended upon Winston. His eyes flitted towards the dial. He not only did not know whether ‘yes’ or ‘no’ was the answer that would save him from pain; he did not even know which answer he believed to be the true one.
O’Brien smiled faintly. ‘You are no metaphysician, Winston,’ he said. ‘Until this moment you had never considered what is meant by existence. I will put it more precisely. Does the past exist concretely, in space? Is there somewhere or other a place, a world of solid objects, where the past is still happening?’
‘No.’
‘Then where does the past exist, if at all?’
‘In records. It is written down.’
‘In records. And –?’
‘In the mind. In human memories.’
‘In memory. Very well, then. We, the Party, control all records, and we control all memories. Then we control the past, do we not?’
‘But how can you stop people remembering things?’ cried Winston again momentarily forgetting the dial. ‘It is involuntary. It is outside oneself. How can you control memory? You have not controlled mine!’
O’Brien’s manner grew stern again. He laid his hand on the dial.
‘On the contrary,’ he said, ‘you have not controlled it. That is what has brought you here. You are here because you have failed in humility, in self-discipline. You would not make the act of submission which is the price of sanity. You preferred to be a lunatic, a minority of one. Only the disciplined mind can see reality, Winston. You believe that reality is something objective, external, existing in its own right. You also believe that the nature of reality is self-evident. When you delude yourself into thinking that you see something, you assume that everyone else sees the same thing as you. But I tell you, Winston, that reality is not external. Reality exists in the human mind, and nowhere else. Not in the individual mind, which can make mistakes, and in any case soon perishes: only in the mind of the Party, which is collective and immortal. Whatever the Party holds to be the truth, is truth. It is impossible to see reality except by looking through the eyes of the Party. That is the fact that you have got to relearn, Winston. It needs an act of self-destruction, an effort of the will. You must humble yourself before you can become sane.<<

June 7, 2012 8:20 am

noaaprogrammer says:
June 6, 2012 at 2:55 pm
Benford’s Law requires the data to span several orders of magnitude to be valid. The temperature data or its anomalies do not do so, so the fraud-test would not work.
But your comment caused me (and I suppose many others) to learn about Benford’s Law, which is impressive!
Years ago, while making thousands of estimations of depths in my job as a petroleum geologist, I wondered if I unconsciously chose certain numbers at the 0.x meter level. I didn’t come to a conclusion – I stopped believing there was any validity to the precision as I thought about it further, so I stopped caring – but I still expect there is. Reading the thermometer would have the same intrinsic situation.
Error bars on pre-mechanical temperature readings strike me as something that should never be less than the error possible on one reading. As I understand it, you can reduce your error by a square root function if the measurement is of the same object by the same means by the same operator (if personal choice is a factor), but in temperatures, each reading is unique either by place, time or operator. There is no “true” value around which a measuring device or operator wanders randomly.
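The square-root rule mentioned here is easy to check numerically. The sketch below (with invented numbers) shows it applies only to repeated readings of one fixed quantity, which is exactly the condition that unique, one-off station readings fail to meet:

```python
import math
import random

random.seed(1)
true_value = 20.0   # the one fixed quantity being measured (deg C)
sigma = 0.5         # standard deviation of a single reading

# 100 readings of the SAME quantity: the mean's error shrinks ~ 1/sqrt(n).
n = 100
readings = [random.gauss(true_value, sigma) for _ in range(n)]
mean = sum(readings) / n
standard_error = sigma / math.sqrt(n)  # 0.05, a tenth of one reading's error

print(round(mean, 2), standard_error)
```

Each daily station reading, by contrast, measures a different quantity (a different place, time, and operator), so no such averaging-down of the single-reading error is available.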
I also wonder about the median value in temperature or other readings, as shown for the value we are supposed to note. Do we not tend to measure higher rather than lower if the day is hot, and lower than higher if the day is cold? Are our instruments not more sensitive to changes up or down? So should the “median” curve not be high or low weighted? We only center it because we think the errors are equal high/low and random. If neither assumption is true, then the “median” should be off-median.
Another: I throw this out one more time (at least until I see a response): look at the older records, wherein the error bars are wide. The median value looks like a damped wave, as if a long-period smoothing function has been applied to the data. As we come towards the present, the error bars shrink as the temperature median departs from the past. However, if we were to apply a similar long-period smoothing function to the present (presuming we had future data to allow this) the last 100 years of temperature changes would exhibit less variation and, for the 20th century, a reduced temperature rise.
More like the past.
So: have we got a statistical problem going on, in which older data is apple-like, and newer data, orange-like? Can we not really say that in the past, were we able to see the true temperature averages, we would see a similar frequency distribution, such that highs and lows could more reflect what we see recently with new, better data?
Is it more reasonable not to say that today’s temperatures are valid to compare to the post-1945 days, perhaps, but prior to then we can only say that the warmer decades were WARMER than the median value, and the cooler decades, COLDER than the median value, and that the top and bottom of the error bars was, at times, actual data, not error, and on a decadal level?
It is said by one and all (today, in the Calgary Herald, for instance) that it is now warmer than at any time since humans evolved. This is based on the current median values shown in the temperature reconstruction graphs/data used regularly by amateur and professional alike. I wonder if this is a valid statement to make when the nature of the median value is taken into account.

June 7, 2012 8:21 am

This is fraud!! No way NOAA is getting away with this.

Howling Winds
June 7, 2012 8:51 am

One thing that seems apparent in all of this is the simple notion of the accuracy of the so-called “data”. Although I am an admitted skeptic, I have a difficult time with the idea that thousands upon thousands of temperature readings, taken over hundreds of years, can have much precision at all, certainly not better than a few degrees in either direction.

rilfeld
June 7, 2012 8:54 am

The citrus band and species-suitability-by-temperature charts from the DOA over the years aren’t showing much warming. This points up that the political implications depend on “catastrophic” global warming. Nothing in the record suggests a climate change that we can’t accommodate by moving 100 miles north or south. I don’t think voters would be moved to immolate their economies on the altar of global warming if it were expressed as making North Dakota more like South Dakota, yet that’s the reality of even the “cooked” data.
Those of us of a certain age, who remember that the first “Earth Day” was based on catastrophic anthropogenic global cooling, and are amused that certain prophets of doom such as Paul Ehrlich have made a living arguing both sides of the street, called BS long ago.
I understand the call for lawyering, but the combination of sovereign immunity and the argument of most that they were following what they thought was legitimate science (‘I was only following orders’) will certainly make that futile.
Anthony, you are on the correct course: sunshine is the best antiseptic.
As “greened” economies fail, and money dries up, we’ll have a new fake crisis that requires us to give up our lives to the statists in due course….the food police seem to be on the rise at the moment.
Age, again, causes these things to be viewed as circular…..
Grocery stores use paper bags – very efficient, and reusable by consumers.
Tree huggers scream irrationally, as the pulp comes from farmed, purpose-grown trees.
We switch to plastic.
Anti-carbon folks scream, and discards kill fish and birds and clog drains, and even kill babies in spite of the multilingual printed warnings.
We switch to government mandated, inconvenient cloth bags. They harbor nasty bacteria and people hate having to manage them.
Some stores offer paper again, still from purpose-grown trees and still fully recyclable, and find, as before, that it is the most reliable, lowest-cost alternative. And disgruntled sports fans once again have something to wear to the games.

Ripper
June 7, 2012 9:34 am

Steven Mosher says:
June 6, 2012 at 9:29 pm
Others use a standard lapse rate.
===========================================================================
That is the theory, but it often is opposite to what happens in the real world.
Tell me which station here is at the higher altitude.
http://members.westnet.com.au/rippersc/hcjanjulycomp.jpg

NotSure
June 7, 2012 9:37 am

“Of these 78 sites 3 were in the city of Phoenix (Airport, USWB site, and Indian School), 3 were in Yuma (Citrus Station, USWB, and Valley site), and 2 were in Tucson (Airport and Univ. of Arizona campus). So 8 (more than 10%) of the 78 sites for the entire state were located in three of the warmest cities in the state. Furthermore, 27 of the 78 sites were in Maricopa and Yuma counties, the two warmest counties in the state that comprise 12.6% of the state’s landmass yet account for 34.6% of the observation sites.”
I don’t understand this. How do you know which cities and areas are “warmest”? You can’t compare against an unknown. Maybe some of the areas for which we have no measurements were actually warmer than the areas you say are warmest. We don’t know and can’t know. Comparing against an unknown is like division by zero, the result is undefined.

June 7, 2012 10:03 am

How funny. Every time the 1930s are adjusted they become cooler for the US. Next thing you know they will be telling us that the 1930s were a mini ice age and that our planet should be inside an ice age, but due to evil man and his artificial ways, the climate is now not in an ice age. /sarc
Boggles the mind, if I might say so. But in any regard, I don’t see how changing anything data-wise means much, especially in a state like Arizona with mountains that create micro-climate zones. You really cannot tell the “average” temperature over an arbitrary area such as the state, because frankly your guess is as good as any with all the micro-climate zones. We could figure it out for today, but for the 1930s… forget it.
And I say we “could” because if we used more thermometers that are not at airports and not in urban locations we could get a good representative average, but frankly we won’t, because every adjustment must be the same and tell the same story. Reminds me of religions and how they change history to advance their cause.

Ian W
June 7, 2012 10:15 am

I have yet to see it explained what ‘average temperature’ actually means.
Yes I realize that two numeric values are added together and then divided by two – to give a numeric mean. But behind that are several completely invalid assumptions.
Examples:
* The lowest temperature of the day is just before dawn and the highest temperature of the day is late afternoon. This leads to all these time-of-day corrections – but we all know of days where that is not the case. As someone pointed out – what about maximum and minimum thermometer readings, taken just once at midnight and then reset? But both of these approaches fail to quantify how LONG the temperature was at a certain level. The assumption is of a consistent, smooth temperature curve, which is patently false in most temperate areas of the world.
* The intent of this whole process, and the reason it has moved from a poorly funded arcane exercise to a trillion-dollar industry, is to prove (or falsify) the hypothesis that emissions from human activity lead to more heat energy (watts per square meter) being trapped in the Earth’s system. Measuring temperature alone will not provide this information – it is the incorrect metric. Creating a mathematical mean temperature is even less meaningful. What is needed are temperature and humidity observations at regular intervals. The atmospheric enthalpy and heat content in kilojoules per kilogram – at the time of observation – can then be calculated. These observations should then be repeated, perhaps at hourly intervals, and the overall daily integral of heat content in kJ/kg can be quantified. Climate ‘scientists’ really should understand the metrics that they are using. A misty 75°F afternoon in a Louisiana bayou after a thundershower has around twice the heat energy of an almost-zero-humidity 100°F afternoon in Arizona. Averaging these values is nonsensical. It IS heat energy this is all about – so-called greenhouse gases trapping outgoing longwave radiation and the trapped heat causing ‘climate change’.
Measuring temperature (and fudging the figures) and trying to show cleverness by nitpicking time-of-observation may be a useful argument with non-scientists like politicians and the media, but it is abject nonsense. So one wonders why so much time is spent by supposed experts at NCDC and NOAA fudging incorrect metrics. This would appear to be solid evidence of an intentional hoax – I don’t believe the members of these agencies are so ignorant of physics – but they seem to believe that everyone else is.
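For reference, the Louisiana-versus-Arizona comparison above can be checked with the standard psychrometric approximation for moist-air enthalpy, h ≈ 1.006·T + w·(2501 + 1.86·T) kJ/kg, with T in °C and w the humidity ratio. The 0.0188 kg/kg figure below is the approximate sea-level saturation humidity ratio at 23.9 °C, assumed here for illustration:

```python
# Moist-air specific enthalpy (standard approximation):
#   h = 1.006*T + w*(2501 + 1.86*T)   [kJ per kg of dry air, T in deg C]
def enthalpy(temp_c, humidity_ratio):
    return 1.006 * temp_c + humidity_ratio * (2501.0 + 1.86 * temp_c)

# Humid 75 F (23.9 C) air near saturation: w ~ 0.0188 kg/kg at sea level.
h_humid = enthalpy(23.9, 0.0188)
# Dry 100 F (37.8 C) air with essentially no moisture.
h_dry = enthalpy(37.8, 0.0)

print(round(h_humid, 1), round(h_dry, 1))  # ~71.9 vs ~38.0 kJ/kg
```

So the humid 75°F afternoon does carry roughly twice the energy per kilogram of air of the dry 100°F one, as the comment states.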

Hugh K
June 7, 2012 10:31 am

atarsinc says:
June 6, 2012 at 8:48 pm
Especially for Gail: What does a cold snap in Oregon have to do with AGW? Answer: absolutely nothing. Thanks for the weather report. We’re all enthralled. JP
I missed the part in Gail’s post where she compared a cold snap with AGW… could you please point me to that in her comment?
Sadly, it has come down to exactly that – being enthralled with the use of raw data. I found that refreshing. However, alarmists need not worry… this was obviously an oversight on Gail’s part, and I’m sure that before expressing her concern for farmers in the future, Gail will wait for the adjusted Fed version. Of course the farmers will still suffer the same amount of loss, but at least they won’t suffer as much anxiety after the fact, thinking it really wasn’t as cold as it really was.

June 7, 2012 10:46 am

A few things:
1) As Mosher and others have noted, some adjustments to the record are needed to correct for TOBs changes, station moves, instrumentation changes, etc. It’s actually rather impressive how often stations moved from urban rooftops to newly created airports in the U.S. in the 1930s and 1940s, resulting in some rather large step changes in temperature.
Since some adjustments are needed, you should have an adjustment method that ideally is a) algorithmic, so it can automatically detect and correct for inhomogeneities without manual (and potentially biased) observer corrections, and b) has been extensively tested against data with different types of inhomogeneities to ensure that it does not introduce systematic biases. For those who have not come across it, the Menne, Williams, and Thorne paper last year is a good example of this sort of testing and well worth a read: ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/v2/monthly/williams-menne-thorne-2012.pdf
It’s also worth pointing out that the Berkeley homogenization method, which differs quite a bit from the one employed by USHCN, produces rather similar results (with even more “cooling of the past” than USHCN): http://rankexploits.com/musings/2012/a-surprising-validation-of-ushcn-adjustments/
.
2) It’s worth pointing out that while a simple majority of the adjustments are positive, it’s far from the vast majority. About 40% of the adjustments actually cool the present or warm the past (and thus decrease the trend). You can see a distribution of USHCN adjustments in Fig. 6 of Menne et al. 2009 here: ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/v2/monthly/menne-etal2009.pdf
.
3) You really need to separate out the effects of the adjustments from the effects of different spatial interpolation and the use of anomalies. The latter should be far less controversial, as state average temperatures will be a far more accurate reflection of reality if they use the spatially-weighted anomalies rather than a simple average of non-spatially weighted absolutes.
_________
Also, Bill Illis, the data used in creating the chart shown above uses the latest USHCN data as of about three weeks ago. USHCN adjustment procedures have not been changed since 2010 or so, and the updated GHCN 3.1 data has no effect on USHCN, as they are separate (albeit overlapping) networks.

Tom Stone
June 7, 2012 11:13 am

It is like the Stalinist tactic of airbrushing “inconvenient” people from government photographs.
REPLY: No it is not anything like that. The original Monthly Weather Review is still there. If that disappears, then you’ll have a point – Anthony

John Day
June 7, 2012 11:26 am

C aka atarsinc

1. draw a tic-tac-toe grid (3 x 3 square)
2. fill in all 9 squares with a temperature reading.
3. fill in the 3 top squares again with an additional temperature reading.
4. You’ve now got 12 readings: average them. That’s the old method (TCDD).
Is that the average for the entire area? Of course it isn’t. You’re taking too many readings in the top row. To correct for the bias in the top row, you should first average the two numbers in each square of the top row, and then use those three readings with the remaining six to find your average over the entire area.

Wrong. The arithmetic mean function (the ‘expectation operator’) is idempotent and associative, so you’ll get the same result, regardless of the order or number of sub-arrangements made: E[a,b]=E[E[a,b]]=E[E[E[a,b]]] etc.
Also, the mean function (unlike the variance function) is an unbiased estimator, so increasing the weights of randomly selected terms should not change the expected value (i.e. bias = 0). (Some will pull up, others will pull down, so the changes cancel out to zero in the limit.)
The bias here, if any, seems to be caused by the way the grids were selected, if I understood the article correctly (I skimmed it). It looks like the old ‘selection bias’ problem again.
So, if the small grids tend to be around large urban areas and the larger grids tend to be much larger, relatively unpopulated areas, then the larger grids (i.e. rural areas) are “unfairly” represented in the sense that their ‘coolness’ is diminished by the influence of the warmer (UHI) urban areas, if the grids with larger area have equal weight with smaller areas.
To make it fair (“unbiased”) either the grids should be weighted by area, or equivalently, make all the grids the same size. (Or, thirdly, relocate thermometers to eliminate UHI bias)
So I have no problem with this attempt to remove bias provided it is not used to make prejudiced comparisons with the older biased statistics. In other words, we might need some “fudge factors”, to eliminate the bias. Like the ones the solar scientists use to try to make unbiased estimates with historical and current sunspot data. (But bias still exists there, according to Leif Svalgaard).
😐
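The area-weighting point can be shown with two invented “divisions”: a small warm urban cell and a large cool rural cell (areas and temperatures made up for illustration).

```python
# Invented areas (square miles) and temperatures (deg F) for two cells.
areas = [10.0, 990.0]   # small urban cell, large rural cell
temps = [75.0, 65.0]

# Equal-weight average (the naive approach) vs area-weighted average.
unweighted = sum(temps) / len(temps)
weighted = sum(a * t for a, t in zip(areas, temps)) / sum(areas)

print(unweighted, weighted)  # 70.0 vs 65.1: equal weights inflate the warm cell
```

Weighting each cell by its area (or, equivalently, using equal-sized grid cells) removes the over-representation of the small warm area.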

June 7, 2012 11:46 am

Christopher C. Burt says:
June 6, 2012 at 11:04 pm
So 8 (more than 10%) of the 78 sites for the entire state were located in three of the warmest cities in the state.

So let us presume, for argument’s sake, that these 10% of the sites were 3°F warmer than the rest of the state. Then the average would be 0.3°F higher than it ought to be if we totally neglect the area of the cities. But since the average went down 3.1°F, would that not mean the cities were 30°F higher than the countryside? However, I realize the statement below also has to be factored in:
Furthermore, 27 of the 78 sites were in Maricopa and Yuma counties, the two warmest counties in the state that comprise 12.6% of the state’s landmass yet account for 34.6% of the observation sites.
So exactly how much warmer were the extra 22% of the sites? Everything could still be legitimate, but I would be happy if an impartial person audited everything. I have just read too much negative information about adjustments to data.
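The back-of-envelope arithmetic in this comment is easy to reproduce (note the 3°F urban excess is the commenter's assumption for argument's sake, not a measured value):

```python
n_sites = 78
city_sites = 8
assumed_city_excess = 3.0  # deg F, assumed for argument's sake

# Bias in a plain unweighted average from over-sampling the warm cities:
bias = city_sites / n_sites * assumed_city_excess
print(round(bias, 2))  # ~0.31 F, far short of the observed 3.1 F discrepancy
```

On these assumptions, the over-sampled cities alone account for only about a tenth of the 3.1°F difference, which is the commenter's point.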