Comparing GHCN V1 and V3

Much Ado About Very Little

Guest post by Zeke Hausfather and Steve Mosher

E.M. Smith has claimed (see full post here: Summary Report on v1 vs v3 GHCN ) to find numerous differences between GHCN version 1 and version 3, differences that, in his words, constitute “a degree of shift of the input data of roughly the same order of scale as the reputed Global Warming”. His analysis is flawed, however, as the raw data in GHCN v1 and v3 are nearly identical, and trends in the globally gridded raw data for both are effectively the same as those found in the published NCDC and GISTemp land records.


Figure 1: Comparison of station-months of data over time between GHCN v1 and GHCN v3.

First, a little background on the Global Historical Climatology Network (GHCN). GHCN was created in the late 1980s after a large effort by the World Meteorological Organization (WMO) to collect all available temperature data from member countries. Many of these were in the form of logbooks or other non-digital records (this being the 1980s), and many man-hours were required to process them into a digital form.

Meanwhile, the WMO set up a process to automate the submission of data going forward, setting up a network of around 1,200 geographically distributed stations that would provide monthly updates via CLIMAT reports. Periodically NCDC undertakes efforts to collect more historical monthly data not submitted via CLIMAT reports, and more recently has set up a daily product with automated updates from tens of thousands of stations (GHCN-Daily). This structure of GHCN as a periodically updated retroactive compilation with a subset of automatically reporting stations has in the past led to some confusion over “station die-offs”.

GHCN has gone through three major iterations. V1 was released in 1992 and included around 6,000 stations with only mean temperatures available and no adjustments or homogenization. Version 2 was released in 1997 and added a number of new stations, minimum and maximum temperatures, and manually homogenized data. V3 was released last year and added many new stations (both in the distant past and post-1992, where Version 2 showed a sharp drop-off in available records), and switched the homogenization process to the Menne and Williams Pairwise Homogenization Algorithm (PHA) previously used in USHCN. Figure 1, above, shows the number of station records available for each month in GHCN v1 and v3.

We can perform a number of tests to see if GHCN v1 and v3 differ. The simplest one is to compare the observations in both data files for the same stations. This is somewhat complicated by the fact that station identity numbers have changed between v1 and v3, and we have been unable to locate a translation table between the two. We can, however, match stations between the two sets using their latitude and longitude coordinates. This gives us 1,267,763 station-months of data whose stations match between the two sets to a precision of two decimal places.
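As an illustration of the matching step, here is a minimal sketch (in Python rather than the Stata used for this post; the file and column names are hypothetical) of joining the two station inventories on latitude and longitude rounded to two decimal places.

```python
# Minimal sketch (not the authors' Stata code) of matching GHCN v1 and v3 stations
# by rounded latitude/longitude. File and column names are hypothetical.
import pandas as pd

def match_by_latlon(v1_inv: pd.DataFrame, v3_inv: pd.DataFrame, decimals: int = 2) -> pd.DataFrame:
    """Inner-join two station inventories on lat/lon rounded to `decimals` places."""
    v1 = v1_inv.assign(lat_key=v1_inv["lat"].round(decimals),
                       lon_key=v1_inv["lon"].round(decimals))
    v3 = v3_inv.assign(lat_key=v3_inv["lat"].round(decimals),
                       lon_key=v3_inv["lon"].round(decimals))
    # Keep only stations whose rounded coordinates appear in both inventories.
    return v1.merge(v3, on=["lat_key", "lon_key"], suffixes=("_v1", "_v3"))

# Hypothetical usage:
# v1_inv = pd.read_csv("ghcn_v1_inventory.csv")   # columns: id, lat, lon
# v3_inv = pd.read_csv("ghcn_v3_inventory.csv")
# matched = match_by_latlon(v1_inv, v3_inv)
# The matched ID pairs can then be used to line up station-months and difference the values.
```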

When we calculate the difference between the two sets and plot the distribution, we get Figure 2, below:


Figure 2: Difference between GHCN v1 and GHCN v3 records matched by station lat/lon.

The vast majority of observations are identical between GHCN v1 and v3. If we exclude identical observations and just look at the distribution of non-zero differences, we get Figure 3:


Figure 3: Difference between GHCN v1 and GHCN v3 records matched by station lat/lon, excluding cases of zero difference.

This shows that while the raw data in GHCN v1 and v3 is not identical (at least via this method of station matching), there is little bias in the mean. Differences between the two might be explained by the resolution of duplicate measurements in the same location (called imods in GHCN version 2), by updates to the data from various national MET offices, or by refinements in station lat/lon over time.

Another way to test if GHCN v1 and GHCN v3 differ is to convert the data of each into anomalies (with baseline years of 1960-1989 chosen to maximize overlap in the common anomaly period), assign each to a 5-by-5-degree lat/lon grid cell, average anomalies in each grid cell, and create a land-area weighted global temperature estimate. This is similar to the method that NCDC uses in their reconstruction.
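A minimal sketch of the gridding step just described, assuming the station anomalies (relative to the 1960-1989 baseline) have already been computed; the column names are hypothetical and this is neither NCDC's nor the post's actual code.

```python
# Minimal sketch of the gridding described above; not NCDC's or the post's code.
# `df` is assumed to have columns year, month, lat, lon and anom, where anom is the
# station anomaly relative to a 1960-1989 baseline. All names are hypothetical.
import numpy as np
import pandas as pd

def gridded_land_mean(df: pd.DataFrame, cell: float = 5.0) -> pd.Series:
    d = df.copy()
    # Assign each station to a 5x5 degree lat/lon cell (keyed by the cell's SW corner).
    d["lat_cell"] = np.floor(d["lat"] / cell) * cell
    d["lon_cell"] = np.floor(d["lon"] / cell) * cell
    # Average all station anomalies within each cell for each month.
    cells = (d.groupby(["year", "month", "lat_cell", "lon_cell"], as_index=False)["anom"]
               .mean())
    # Weight each occupied cell by the cosine of its central latitude (proportional to area).
    cells["w"] = np.cos(np.deg2rad(cells["lat_cell"] + cell / 2.0))
    return cells.groupby(["year", "month"]).apply(
        lambda g: np.average(g["anom"], weights=g["w"]))
```

Averaging within cells first, then area-weighting the occupied cells, keeps densely sampled regions from dominating the global mean.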


Figure 4: Comparison of GHCN v1 and GHCN v3 spatially gridded anomalies. Note that GHCN v1 ends in 1990 because that is the last year of available data.

When we do this for both GHCN v1 and GHCN v3 raw data, we get the figure above. While we would expect some differences simply because GHCN v3 includes a number of stations not included in GHCN v1, the similarities are pretty remarkable. Over the century scale, the trends in the two are nearly identical. This differs significantly from the picture painted by E.M. Smith: instead of the shift in input data being equivalent to 50% of the trend, as he suggests, we find that the difference in trend amounts to a mere 1.5%.

Now, astute skeptics might agree with us that the raw data files are, if not identical, overwhelmingly similar, but point out that there is one difference we did not address: GHCN v1 had only raw data with no adjustments, while GHCN v3 has both adjusted and raw versions. Perhaps the warming that E.M. Smith attributed to changes in input data might in fact be due to changes in adjustment method?

This is not the case, as GHCN v3 adjustments have little impact on the global-scale trend vis-à-vis the raw data. We can see this in Figure 5 below, where both GHCN v1 and GHCN v3 are compared to published NCDC and GISTemp land records:


Figure 5: Comparison of GHCN v1 and GHCN v3 spatially gridded anomalies with NCDC and GISTemp published land reconstructions.

If we look at the trends over the 1880-1990 period, we find that both GHCN v1 and GHCN v3 are quite similar, and lie between the trends shown in GISTemp and NCDC records.

1880-1990 trends:

GHCN v1 raw: 0.04845 C (0.03661 to 0.06024)
GHCN v3 raw: 0.04919 C (0.03737 to 0.06100)
NCDC adjusted: 0.05394 C (0.04418 to 0.06370)
GISTemp adjusted: 0.04676 C (0.03620 to 0.05731)
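The parenthetical ranges above read like ordinary least-squares confidence intervals on the trend. As a rough illustration only (not the Stata used to produce those numbers), a trend and an approximate 95% interval could be computed from an annual anomaly series along these lines; note that a simple OLS interval ignores autocorrelation in the residuals.

```python
# Rough illustration only (not the Stata used for the numbers above): an OLS trend
# and ~95% confidence interval for an annual anomaly series. `years` and `anom`
# are hypothetical inputs, e.g. the 1880-1990 global land series.
import numpy as np
from scipy import stats

def trend_with_ci(years, anom, alpha=0.05):
    res = stats.linregress(years, anom)        # slope is in anomaly units per year
    dof = len(years) - 2
    half = stats.t.ppf(1 - alpha / 2, dof) * res.stderr
    return res.slope, (res.slope - half, res.slope + half)

# slope, (lo, hi) = trend_with_ci(np.arange(1880, 1991), annual_anomaly)
# Multiply by 10 to express the trend per decade, if that is the convention wanted.
```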

This analysis should make it abundantly clear that the change in raw input data (if any) between GHCN version 1 and GHCN version 3 had little to no effect on global temperature trends. The exact cause of Smith’s mistaken conclusion is unknown; however, a review of his code does indicate a few areas that seem problematic. They are:

1. An apparent reliance on station IDs to match stations. Station IDs can differ between versions of GHCN.

2. Use of First Differences. Smith uses first differences, but he has made idiosyncratic changes to the method, especially in cases where there are temporal lacunae in the data. The method, formerly used by NCDC, has known issues and biases, as detailed by Jeff Id. Smith's implementation, and in particular his handling of gaps in the data, is unproven and may be the cause (see the sketch after this list).

3. It's unclear from the code which version of GHCN v3 Smith used.
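To make point 2 concrete, here is a minimal Python sketch (not Smith's Fortran, and not NCDC's retired implementation) contrasting the classical First Difference treatment of a gap, which resets the running difference to zero, with the gap-bridging variant Smith describes in the comments below. The worked series 10.1, missing, 10.0, 9.8 is taken from that discussion.

```python
# Illustration of the two gap-handling rules discussed in point 2. This is a sketch of
# the general idea only, not Smith's Fortran and not NCDC's implementation.

def first_differences(series, bridge_gaps=False):
    """Turn a temperature series (None = missing month) into cumulative
    first-difference anomalies.

    bridge_gaps=False: classical rule, the running difference is reset across a gap.
    bridge_gaps=True:  the variant Smith describes, holding the last valid value and
                       booking the full change when data resume.
    """
    out, last, cum = [], None, 0.0
    for v in series:
        if v is None:
            out.append(None)        # nothing to report for a missing month
            if not bridge_gaps:
                last = None         # classical rule: restart after the gap
            continue
        if last is not None:
            cum += v - last         # accumulate the observed change
        out.append(cum)
        last = v
    return out

# Worked example from the discussion: 10.1, missing, 10.0, 9.8
print(first_differences([10.1, None, 10.0, 9.8], bridge_gaps=False))  # ~[0.0, None, 0.0, -0.2] (float rounding)
print(first_differences([10.1, None, 10.0, 9.8], bridge_gaps=True))   # ~[0.0, None, -0.1, -0.3] (float rounding)
```

With the classical rule the change across the gap is discarded; with the bridging rule the full change is booked when data resume, which is why the two variants can diverge when a series has many dropouts.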

STATA code and data used in creating the figures in this post can be found here: https://www.dropbox.com/sh/b9rz83cu7ds9lq8/IKUGoHk5qc

Playing around with it is strongly encouraged for those interested.

275 Comments
June 23, 2012 12:05 am

3. From the last thread:
E.M.Smith says:
June 22, 2012 at 1:19 am
Stokes:
I use ghcn v3 unadjusted.

June 23, 2012 12:47 am

One actually needs the name of the dataset, and the code that downloads and reads it in.

June 23, 2012 2:13 am

“effectively the same” is not good enough. Again, global warming is founded upon 1/10ths of a degree. It is not founded upon large amounts of whole integers—i.e., it’s barely perceptible, especially to the untrained eye.
But more than that, using strictly “anomalies” isn’t good enough either because “global warming” scientists can be tricky with anomalies.
Have a look for yourself at these two videos. You’ll see there’s lots of play room available in actual temperature when looking only at anomalies of two, or more, data sets:
How ClimateGate scientists do the anomaly trick, PART 1

How ClimateGate scientists do the anomaly trick, PART 2

phi
June 23, 2012 2:24 am

A central feature in these comparisons is the adjustments, and the choices made about whether or not to integrate segments of the same station into a single data series.
National offices generally choose to form the longest possible series and homogenize them. I believe GHCN preserves the segmentation. This means that reconstructions based on the GHCN data operate somewhat on the principle adopted by BEST. Segments are homogenized de facto at the stage where everything is averaged (in the cases presented here, within cells). Quantifying the actual adjustments can only be done if the station series were previously merged, as in the NMS methodology. The magnitude of the actual adjustments is remarkably stable, at about 0.5 °C for the twentieth century.

Richard T. Fowler
June 23, 2012 2:24 am

“Smith’s implementation and his method of handling gaps in the data is unproven and may be the cause. ”
“3. It’s unclear from the code which version of GHCN V3 that Smith used. ”
These two statements appear to contradict each other. If the code is available, how can Smith’s “implementation and his method of handling gaps in the data” be unproven?
Zeke or Steve, would you care to elaborate? Thank you.
RTF

June 23, 2012 2:44 am

Nice curve ball Steve. What is it in the data dance world; three strikes and you’re out?
The frequency bar bell charts look to heavily favor positive anomalies in both charts. Looks to be warmed up temps well outnumber cooler mods. Any chance the cooler mods are before 1970 while those positive adjustments tend towards the end of the 20th century and the beginning of the 21st? Of course, you are avoiding showing the changes by year.
The anomaly spatially gridded line comparison charts, nice but why did you have to force the data through a grid blender first?

“…When we do this for both GHCN v1 and GHCN v3 raw data, we get the figure above. While we would expect some differences simply because GHCN v3 includes a number of stations not included in GHCN v1,…”

As I understand your gridded database, you are knowingly comparing apples to oranges and then you follow that little twist of illogic with.

“the similarities are pretty remarkable”

I must say, that last little tidbit just might be the truest thing you’ve posted. And you are brazen enough to say

“…3. It’s unclear from the code which version of GHCN V3 that Smith used…”

You’re out!

mfo
June 23, 2012 3:21 am

Saturday morning. What a time to post this response to EM :o(
The First Difference Method in comparison with others was written about by Hu McCulloch in 2010 at Climate Audit in response to an essay about calculating global temperature by Zeke Hausfather and Steven Mosher at WUWT.
http://climateaudit.org/2010/08/19/the-first-difference-method/
http://wattsupwiththat.com/2010/07/13/calculating-global-temperature/

Paul in Sweden
June 23, 2012 3:23 am

E.M. Smith, Zeke Hausfather and Steve Mosher & all of you other highly talented individuals with your own fine web sites that grind this data up – we all know who you are :),
There is a lot of work and a great deal of expenditure of time and finances going on refining the major global temperature databases for the purpose of establishing a global mean temperature trend. I imagine the same amount of resources could be dedicated towards refining various global databases regarding precipitation, wind speed, polar ice extent, sea level, barometric pressure or UserID passwd for the purpose of establishing a global mean average trend.
How do we justify the financial and resource allocations dedicated to generating and refining these global means?
I cannot fathom a practical purpose for planetary mean averages unless we are in the field of astronomy. Here on earth global mean averages for specific metrics regarding climate have no practical value(unless we are solely trying to begin to validate databases).
Regional data by zone for the purpose of agricultural and civic planning are all that I see as valuable. Errors distributed throughout entire global databases in an even manner give me little solace.

david_in_ct
June 23, 2012 4:18 am

so since you have all the data why don t u do exactly what smith did and see if u get the same plots, instead of producing a different analysis. his main point is that the past was cooled relative to the present. why not take all the station differences that u found and bin them by year, then plot a running sum of the average of the differences year by year. if he is correct that graph will be a u shape. if the graph is flat as it should be then maybe he/you can find the differences in the data/code that each of u has used.

Geoff Sherrington
June 23, 2012 5:01 am

Did this Australian comment from Blair Trewin of the BoM become incorporated in any international data set? Consequences?
> Up until 1994 CLIMAT mean temperatures for Australia used (Tx+Tn)/2. In
> 1994, apparently as part of a shift to generating CLIMAT messages
> automatically from what was then the new database (previously they were
> calculated on-station), a change was made to calculating as the mean of
> all available three-hourly observations (apparently without regard to
> data completeness, which made for some interesting results in a couple
> of months when one station wasn’t staffed overnight).
>
> What was supposed to happen (once we noticed this problem in 2003 or
> thereabouts) was that we were going to revert to (tx+Tn)/2, for
> historical consistency, and resend values from the 1994-2003 period. I
> have, however, discovered that the reversion never happened.
>
> In a 2004 paper I found that using the mean of all three-hourly
> observations rather than (Tx+Tn)/2 produced a bias of approximately
> -0.15 C in mean temperatures averaged over Australia (at individual
> stations the bias is quite station-specific, being a function of the
> position of stations (and local sunrise/sunset times) within their time
> zone.

Louis Hooffstetter
June 23, 2012 5:35 am

Informative post – thanks.
I’ve often wondered how and why temperatures are adjusted in the first place, and whether or not the adjustments are scientifically valid. If this has been adequately discussed somewhere, can someone direct me to it? If not, Steve, is this something you might consider posting here at WUWT?

wayne
June 23, 2012 5:49 am

In Figure 3: http://wattsupwiththat.files.wordpress.com/2012/06/clip_image006.png

“This shows that while the raw data in GHCN v1 and v3 is not identical (at least via this method of station matching), there is little bias in the mean. Differences between the two might be explained by the resolution of duplicate measurements in the same location (called imods in GHCN version 2), by updates to the data from various national MET offices, or by refinements in station lat/lon over time.”

Zeke, that is not a correct statement above, “there is little bias”. I performed a separation of the bars right of zero from the bars on the left of zero and did an exact pixel count of each of the two portions.
To the right of zero (warmer) there are 9,222 pixels contained within the bars and on the left of zero (cooler) there are 6,834 pixels of area within. That makes the warm-side adjustments 135% of those to the cooler side. Now I do not count that as “basically the same” or “insignificant”. Do you? Really?
It seems your analysis has a bias to warm itself, ignoring the actual data presented. The warm side *has* been skewed as E.M. was pointing out. The overlying bias is always a skew to warmer temperatures, always, I have yet in three years to see one to the contrary, and that is how everyone deems this as junk science. To some, a softer term, cargo cult science.

June 23, 2012 5:54 am

“National offices generally choose to form the longest possible series and homogenize them. I believe GHCN preserves the segmentation. This means that the reconstructions performed based on the GHCN data run slightly on the principle adopted by BEST.”
The Berkeley Earth Method does not preserve segmentations, quite the opposite. It segments time series into smaller components.

phi
June 23, 2012 6:06 am

Steven Mosher,
“The Berkeley Earth Method does not preserve segmentations, quite the opposite. It segments time series into smaller components,”
What is the opposite of BEST is the NMS methodology which aggregates segments before homogenizing. What you did with the GHCN series is between these two extremes. In fact, you’re closer to BEST because the segmentation present in GHCN generally corresponds to station moves and it is these particular discontinuities which are biased.

June 23, 2012 6:14 am

“Richard T. Fowler says:
June 23, 2012 at 2:24 am (Edit)
“Smith’s implementation and his method of handling gaps in the data is unproven and may be the cause. ”
“3. It’s unclear from the code which version of GHCN V3 that Smith used. ”
These two statements appear to contradict each other. If the code is available, how can Smith’s “implementation and his method of handling gaps in the data” be unproven?
Zeke or Steve, would you care to elaborate? Thank you.”
Sure. In EM’s post on his method he describes his method of handling gaps in the record in words. His description is not very clear, but it is clear that he doesn’t follow the standard approach used in FDM, which is to reset the offset to 0. And in his post it wasn’t clear what exact file he downloads. For example, if you read turnkey code by McIntyre you can actually see which file is downloaded because there is an explicit download.file() command. In what I could find of Smith’s it wasn’t clear.

June 23, 2012 6:20 am

Louis Hooffstetter says:
June 23, 2012 at 5:35 am (Edit)
Informative post – thanks.
I’ve often wondered how and why temperatures are adjusted in the first place, and whether or not the adjustments are scientifically valid. If this has been adequately discussed somewhere, can someone direct me to it? If not, Steve, is this something you might consider posting here at WUWT?
#################
Sure. Back in 2007 I started as a skeptic of adjustments. After plowing through piles of raw and adjusted data, and the code to do adjustments, I conclude:
A. Raw data has errors in it
B. These errors are evident to anyone who takes the time to look.
C. these errors have known causes and can be corrected or accounted for
The most important adjustment is TOBS. We dedicated a thread to it on Climate Audit.
Tobs is the single largest adjustment made to most records. It happens to be a warming adjustment.

June 23, 2012 6:32 am

atheok
My guess is that you did not look at E.M. Smith’s code.
http://chiefio.wordpress.com/2012/06/08/ghcn-v1-vs-v3-some-code/
When you look through that Fortran and the scripts.. well, perhaps you can help me and figure
out which file he downloaded. Look for a reference in that code that shows he downloaded
this file
ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v3/ghcnm.tavg.latest.qcu.tar.gz
Basically, if somebody asks me why someone comes to wrong conclusions, it could be
the wrong data or the wrong method. Basic forensics work.
Wrong data can be
1. wrong file
2. read in wrong
3. formatted wrong
Wrong method can be a lot of things. So, basically, I suggest starting at step zero when trying to figure these things out. Perhaps your Fortran is better than mine and you can find the line in that code that shows what file he downloads. It’s a simple check.

June 23, 2012 6:38 am

“A. Raw data has errors in it”
Steven Mosher, elsewhere over the years you have claimed there is no raw data.
So 2 questions:
1. Is there raw data or not?
2. How did you come to determine there were errors in it? Data is normally just data. The error occurs in the way it’s handled. Care to explain?
Andrew

June 23, 2012 6:41 am

“so since you have all the data why don t u do exactly what smith did and see if u get the same plots, instead of producing a different analysis. his main point is that the past was cooled relative to the present. why not take all the station differences that u found and bin them by year, then plot a running sum of the average of the differences year by year. if he is correct that graph will be a u shape. if the graph is flat as it should be then maybe he/you can find the differences in the data/code that each of u has used.”
Zeke has provided the code he used to do this analysis. So, you are free to go do that. If you don’t like that code, you can go use the R packages that I maintain. Everything can be freely downloaded from the CRAN repository. The package is called RghcnV3.
My preference is to avoid GHCN v3 altogether, and work with raw daily data. You get the same answers that we posted here for monthly data and avoid all the confusion and controversy surrounding GHCN v1, v2 and v3. That dataset has 26,000 stations (actually 80K when you start).

phi
June 23, 2012 6:42 am

Steven Mosher,
“A. Raw data has errors in it
B. These errors are evident to anyone who takes the time to look.
C. these errors have known causes and can be corrected or accounted for”
Corrected errors are discontinuities. The main discontinuities that cause bias are those related to station moves. They should not be regarded as errors but as corrections of increasing perturbations since the 1920s.
“The most important adjustment is TOBS. We dedicated a thread to it on Climate audit.”
Only valid for US.

June 23, 2012 6:48 am

mfo
Yes, you will find in the past that I used to be a HUGE FAN of the first difference method.
Read through that Climate Audit post. Skeptic Jeff Id convinced believers Hu and Steve
that first differences was fatally flawed. EM did not get the memo.
That is how things work. I was convinced that First differences would solve all our problems.
I was wrong. Jeff Id made a great case and everybody with any statistical sense moved on to methods exactly like those created by Roman M and JeffId. That list includes: Tamino, Nick Stokes and Berkeley Earth. See Hu’s final comment:
“Update 8/29 Just for the record, as noted below at http://climateaudit.org/2010/08/19/the-first-difference-method/#comment-240064, Jeff Id has convinced me that while FDM solves one problem, it just creates other problems, and hence is not the way to go.
Instead, one should use RomanM’s “Plan B” — see
http://statpad.wordpress.com/2010/02/19/combining-stations-plan-b/, http://climateaudit.org/2010/08/19/the-first-difference-method/#comment-240129 , with appropriate covariance weighting — see http://climateaudit.org/2010/08/26/kriging-on-a-geoid/ .”

June 23, 2012 6:51 am

Phi.
Interesting that you think Tobs only applies to the US. It doesn’t.
With regard to station moves, I prefer the BEST methodology, although in practice we know that explicit adjustments give the same result.

June 23, 2012 6:59 am

Andrew
So 2 questions:
1. Is there raw data or not?
2. How did you come to determine there were errors in it? Data is normally just data. The error occurs in the way it’s handled. Care to explain?
Andrew
###############
Philosophically there is no raw data. Practically, what we have is what you could call
“first report.” So, I’m using “raw” in the sense that most of you do.
2. How do you determine that there are errors in the data? Good question.
Here are some examples: Tmin reported as being greater than Tmax, Tmax reported as being less than Tmin, temperatures of +15000C being reported, temperatures of -200C
being reported. There are scads of errors like this: data items being repeated over and over again. In a recent case where I was looking at heat wave data we found one station reporting freezing temperatures. When people die in July in the Midwest and a station’s “raw data” says that it is sub-zero, I have a choice: believe the doctor who said they died of heat stroke or believe the raw data of a temperature station. Hmm. Tougher examples are subtle changes like
a) station moves
b) instrument changes
c) time of observation change
d) and, toughest of all, gradual changes over time to the environment
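A minimal sketch of the simpler, gross-error part of the screening described above; the thresholds and column names here are illustrative assumptions, not the limits any agency actually uses.

```python
# A minimal sketch of this kind of gross-error screening; thresholds and column names
# are illustrative assumptions, not the limits any agency actually uses.
import pandas as pd

def flag_gross_errors(df: pd.DataFrame) -> pd.Series:
    """Boolean mask of obviously bad monthly records (columns tmax, tmin in deg C)."""
    bad = df["tmax"] < df["tmin"]                      # max below min is impossible
    bad |= (df["tmax"] > 60) | (df["tmax"] < -90)      # outside any plausible surface value
    bad |= (df["tmin"] > 60) | (df["tmin"] < -90)
    bad |= df["tmax"].rolling(6).std() == 0            # the same value repeated for months
    return bad
```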

pouncer
June 23, 2012 7:00 am

Hi Steve,
Does this analysis address the point of “fitness for purpose”? The purpose of all such historic reviews, as I understand it, is to approximate the changes in black body model atmospheric temperatures for use in a (changing) radiation budget. The “simple physics” is simple. Measuring the data is more complicated.
Chiefio claims differences over time are of comparable size (a) between versions of the data set, (b) as “splice” and other artifacts of measuring methods, (c) deliberate adjustments intended to compensate for the data artifacts, and (d) actual physical measures.
If the real difference over a century is under two degrees and the variations for versions, data artifacts, and adjustments distort measurement of that difference, how can that difference be claimed to decimal point accuracy? (Precision, I grant, from the large number of measurements. But Chiefio’s point that the various sources of noise are NOT random and therefore can NOT be assumed to cancel is, as far as I can tell, not explicitly addressed.) If the intended purpose does require that level of accuracy and if the measurement does not provide it, can the data set be said to be useful for that purpose? (Useful for many other purposes, including those for which it was originally gathered, don’t seem to me to be germane.)
I see your analysis as a claim that the differences make little difference. I agree. But we are talking about very little differences in the whole picture.

phi
June 23, 2012 7:08 am

Steven Mosher,
“Interesting that you think Tobs only applies to the US. It doesn’t.”
If you say this, it’s because you have a case in mind. Do you have a reference?
“With regard to station moves, I prefer the BEST methodology.”
It has the disadvantage of not allowing to assess the magnitude of the adjustments.
“although in practice we know that explicit adjustments give the same result.”
Yes, explicitly or implicitly all global temperatures curves are homogenized.

June 23, 2012 7:08 am

steven mosher says:
…….
Hi Steven
Thanks for comment on the other thread. I noticed the Santa Fe BEST (page 10) shows similar spectral response, but I am not certain if using 5yr smoothing is a good idea.

June 23, 2012 7:13 am

Richard Fowler,
Here, perhaps this can help somewhat. This is EM’s description of what he does.
“2) Missing data handling. For classical First Differences, if there are missing data, you just re-zero and reset. As there are a lot of missing data in some places, that gives some very strange results. I just assume that if you have January for some years in a row, and leave out a couple, then get some more, that somewhere in between you passed through the space between the two. So if you had a series that was +1/10, +1/10, -1/10, missing, missing, +2/10; I just hold onto the last anomaly and wait while skipping missing data. When I hit the +2, I just account for all the change in that year. So you would have 1/10, 2/10, 1/10, 3/10 as the series. This, IMHO, more accurately reflects the reality of the temperatures recorded. That last year WAS 2/10 higher than the prior value, so throwing it away and using a zero is just wrong. In this way this code is much more robust to data dropouts and also more accurate.”
So, my concern is this.
1. We know from Jeff Id’s fine work (Jeff is the skeptic who tore Steig’s work to shreds) that first differences is a flawed method.
2. EM departs from this method and “invents” his own approach.
That approach is untested (have a look at the synthetic tests that Jeff Id did on first differences).
If you ask me why EM gets odd results, quite logically I can only point to two possibilities:
data or method. Assuming he used GHCN v1 raw and GHCN v3 raw, that logically leaves method as the reason. I look at his method and I see that he uses a method that has been discredited by leading skeptics, and that he has made untested changes to the method (while trying to say it’s “peer reviewed”). I kinda shrug and suggest that maybe there is an issue there. For me, I use the better methods as suggested by Jeff and Roman. I used to think first differences was the best. I was wrong. One thing I have always appreciated here and at Climate Audit is people’s insistence that we use the best methods.

June 23, 2012 7:17 am

Sure thing vuk.
You can expect some updates to that Santa Fe chart in the coming months. I suspect folks who do spectral analysis will be very interested.

June 23, 2012 7:22 am

Steven Mosher,
“Interesting that you think Tobs only applies to the US. It doesn’t.”
If you say this, it’s because you have a case in mind. Do you have a reference?
#############
yes. do more reading and post your results.
“With regard to station moves, I prefer the BEST methodology.”
It has the disadvantage of not allowing to assess the magnitude of the adjustments.
###############
Of course you can assess the magnitude. Its a switch in the code you can turn off
or turn on.
“although in practice we know that explicit adjustments give the same result.”
Yes, explicitly or implicitly all global temperatures curves are homogenized.
####
The wonderful thing about having new data is that you can actually test a method
by withholding data. You can test the ability with and without.

Pamela Gray
June 23, 2012 7:39 am

Question. Are the stations between the two data versions still the exact same stations or has there been station drop out or upgrades (and in some cases deteriorating stations) as time went by? Keeping due diligence over the study plots (tracking how they have changed with careful observations), and having a gold standard control group that is kept in pristine condition to compare them with is vital before homogenization methods can be developed. I don’t think that has been done to any extent. Therefore the raw and homogenized results are probably worthless. The homogenization methods are a shot in the dark.

June 23, 2012 7:40 am

(Moderator — feel free to snip, but I think this is relevant)
I tried to compare BEST to Environment Canada data for one station near where I live.
I realized the data in BEST was crappy.
http://sunshinehours.wordpress.com/2012/03/13/auditing-the-latest-best-and-ec-data-for-malahat/
“To start with I am looking at one station. In BEST it is StationID 7973 – “MALAHAT, BC”. In EC it is station MALAHAT which is Station_No 1014820.
I am comparing the BEST SV (Single Valued) data to the BEST QC (Quality Controlled) data.
The first minor problem is that the EC data has records from the 1920s and 1930s that BEST does not have (that I have found). That’s no big deal. The next problem is that out of 166 Month/Year records, not one of them matched exactly. BEST SV and QC data is to 3 decimal points while EC is to 1.
For example. Jan 1992 has QC = 5.677, as does SV, while EC = 5.8. Close. But not an exact match.
However, the real problem is that there are 5 records that have been discarded between SV and QC. Two out of the five make no sense at all, and one is iffy.
Where it says “No Row” it means BEST has discarded the record completely between SV and QC.
1991 is iffy. EC has it as 4.5, SV has 3.841. Close, but not that close.
1993 makes no sense at all.
2002 is fine. That’s a huge error. But where the heck did BEST get the -13.79 number in the first place?
2003 is fine. But again, where the heck did BEST get the -4.45 number in the first place?
Finally, 2005 makes no sense at all. There is little difference between -1.1 and -1.148. Certainly most records are that different.
And those are just the discarded records!
There are another 48 records with a difference of 0.1C or greater, and here are the greater-than-0.2C ones.”

DocMartyn
June 23, 2012 7:41 am

steven mosher, I have, as always, a simple request.
In your figure 3 you have all the stations that have been mathematically warmed and cooled during the revision, and also have their location.
If it is not too much trouble could you color-code them and plonk them on a whole Earth map and let us see if all the warm ones are clustered around the areas where we are observing ‘unprecedented’ warming?
Call me a cynical old-fool, but a bit of warming here and a bit of cooling there and pretty soon you can be talking Catastrophic Anthropogenic Global Warming.

June 23, 2012 7:45 am

pouncer.
“fit for purpose”
From my standpoint the temperature record has nothing to do with knowing that radiative physics is true. We know that from engineering tests. If you add more GHGs to the atmosphere you change the effective radiating level of the atmosphere, and that over time will result in the surface cooling less rapidly.
So it depends upon what purposes you are talking about. People are also confused about what the temperature record really is and what it can really tell us and how it is used. Chief amongst those confusions is the idea that we claim to know it to within tenths or hundredths.
Let me see if I can make this simple. Suppose I do a reconstruction and I say that the
“average” global anomaly in 1929 is .245 C. What does that mean?
It means this: if you find a hidden data record running from 1929 to the present and calculate its anomaly, the best estimate of its 1929 anomaly will be .245C. That is, this estimate will minimize the error in your estimate. We collect all the data that we do have
and we create a field. That field actually is an estimate that minimizes the error, such that if you show up with new data my estimate will be closer to the new data than any other estimate.
The global average isn’t really a measure of a physical thing. You can compare it to itself and see how it’s changing; it’s only a diagnostic, an index. Of course in PR land it gets twisted into something else.
So fit for purpose? You can use it for many purposes. You could use it to tell a GCM modeler that he got something wrong. You could use it to calibrate a reconstruction and have a rough guess at past temperatures. You could use it to do crude estimates of sensitivity. It’s fit for many purposes. I wouldn’t use it to plan a picnic.
Let me give you a concrete example. I’ve come across some new data from the 19th century:
hundreds of stations never seen before, millions of new data points preserved as photos of written records on microfiche. What method would you use to predict the temperatures I hold in my hand? Well, the best method you have is to form some kind of average of all the data that you hold and use that to estimate the data that I hold in my hands.
The best averaging method is not first differences. It’s not the common anomaly method. It’s not the reference station method. The best method is going to be something like Jeff Id’s method, or Nick Stokes’ method, or the Berkeley Earth method. When they estimate .245678987654C
that estimate will be closer to the data that I hold than any other method. The precision of that guess has nothing to do with the quality of the data, it has to do with minimizing the error of prediction given the data.

June 23, 2012 7:57 am

Hi Doc,
One reason why Zeke provided the data and code in the public drop box is to allow people with questions to answer the questions for themselves. By releasing the data and the code we effectively give people the power to prove their points.
When I fought to get Hansen to release the code and when I sent FOIA requests to Jones, it wasn’t to get them to answer my questions. It was to get their tools so I could ask the questions I wanted to ask.

Rob Dawg
June 23, 2012 8:05 am

• “We can perform a number of tests to see if GHCN v1 and 3 differ. The simplest one is to compare the observations in both data files for the same stations. This is somewhat complicated by the fact that station identity numbers have changed since v1 and v3, and we have been unable to locate translation between the two. ”
——
Simply a stunning admission. What judge in any court would, upon hearing the prosecutors admit to this, not throw the case out of court?

June 23, 2012 8:25 am

Mosher: “Well, the best method you have is to form some kind of average of all the data that you hold and use that to estimate the data that I hold in my hands.”
I wouldn’t use all data, I would use data for a reasonably sized region and compare the data and see what the differences are.
For example, if all of your old data was 1C or 2C warmer than NOAA’s adjusted old data, then the odds are your data is right.

Michael R
June 23, 2012 8:30 am

I certainly do not have the kind of expertise to take sides in the argument. One thing that does give me pause, however, is that I have seen the type of analysis done in Figure 3 previously, also used as evidence of no bias in adjustments; however, as was pointed out then, that kind of graph isn’t supportive of anything.
Should most or all of those adjustments that are on the right side of the bell curve happen to be in the more recent time and the opposite true for earlier time, then you artificially create a warming trend and indeed could make a huge warming trend while still showing equally cool and warming adjustments.
I cannot comment on the rest of it, but I would caution the use of the argument used in the above paragraph, as the last time I saw it used was in an attempt to mislead people, which automatically makes me untrusting of the following argument, which may very well be unfortunate rather than unwarranted.

June 23, 2012 8:31 am

sunshinehours, the Env Canada data has quality control flags that you should apply before doing any comparison. If you download my package CHCN (be sure to get the latest version) that might help you some.
Let’s start with Environment Canada for Jan 1992. You really didn’t tell people everything,
now did you? Here is the actual data from Environment Canada:
Time Tmax Tmean Tmin
1992-01 8.60 E 5.80 E 2.60 E
As you note, Environment Canada has 5.8 and BEST has 5.67.
BEST does not use Environment Canada as a source.
If you use my package, however, you can look at Environment Canada data.
When you do, here is what you find. What you find is that you failed to report
the quality flags. See that letter E that follows the Tmax and Tmin and Tmean?
That flag means the value in their database is ESTIMATED.
Now do the following calculation: (8.6+2.6)/2, or (Tmax + Tmin)/2. See what you come up with?
Is it 5.8? Nope. Looks like 5.6 to me. Without looking at BEST data (I’m tired of correcting your mistakes) I would bet that the source for the BEST data is daily data. Environment Canada also has daily data for that site; it has daily data from
1920-01-01/2012-05-16
Looking at the Environment Canada Excel file for that site and examining the quality flags, you should note that a very, very large percentage of the figures you rely on have a quality flag of
“E” or “I”:
E stands for estimated
I stands for incomplete
Folks can verify that by looking at the csv file for the station.
Station Name MALAHAT
Province BRITISH COLUMBIA
Latitude 48.57
Longitude -123.53
Elevation 365.80
Climate Identifier 1014820
WMO Identifier 71774
TC Identifier WKH
Legend
[Empty] No Data Available
M Missing
E Estimated
B More Than One Occurrence and Estimated
I The value displayed is based on incomplete data
S More than One Occurrence
T Trace
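For readers who want to reproduce this comparison, here is a minimal sketch (hypothetical column names, and not the CHCN package) of dropping flagged months before computing monthly means.

```python
# Minimal sketch (hypothetical column names; this is not the CHCN package) of dropping
# Estimated/Incomplete flagged months before computing monthly means for a comparison.
import pandas as pd

def clean_monthly_means(path: str) -> pd.DataFrame:
    df = pd.read_csv(path)  # assumed columns: date, tmax, tmax_flag, tmin, tmin_flag
    suspect = {"E", "I", "M", "B"}                    # estimated / incomplete / missing
    good = ~df["tmax_flag"].isin(suspect) & ~df["tmin_flag"].isin(suspect)
    out = df[good].copy()
    out["tmean"] = (out["tmax"] + out["tmin"]) / 2    # monthly mean as (Tmax + Tmin)/2
    return out
```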

June 23, 2012 8:37 am

1. “I’m using “raw” in the sense that most of you do.”
If it’s not philosophically “raw,” then “raw” is not a very descriptive term for what it is. That’s a big problem.
2. “There are scads of errors like this”
Sounds like the data may not be very meaningful if there continues to be “scads” of errors in it.
Andrew

Pamela Gray
June 23, 2012 8:38 am

Steven, a significant portion of your time as a scientist should be spent explaining what you have said. To refuse to do so by telling others to figure it out for themselves seems a bit juvenile and overly dressed in “ex-spurt” clothes. If questions come your way, kindly doing your best to answer them seems the better approach, especially with such a mixed audience of arm-chair enthusiasts such as myself and truly learned contributors. I for one appreciate your post. Don’t spoil it.

Ripper
June 23, 2012 8:55 am

The first minor problem is that the EC data has records from the 1920s and 1930s that BEST does not have (that I have found).
=======================================================
Same thing here in Western Australia.
Instead of looking at the records used, it is time to look at the records that are not used.
It appears to me that the records with warm years in Western Australia in the 1920-40’s are just not used.
If someone can point me to where GHCN or CRU uses the long records from, e.g., Geraldton town, which starts in 1880
http://members.westnet.com.au/rippersc/gerojones1999line.jpg
or Kalgoorlie post office that starts in 1896
http://members.westnet.com.au/rippersc/kaljones1999line.jpg
I would be most grateful.
Or Carnarvon post office that starts in 1885… etc.
It appears to me that in the SH Phil Jones used very little data that was available from 1900 to 1940-50 odd, but selected different stations in each grid cell for 1895-1899.
E.g., instead of using those years from the above-mentioned stations, he filled the grid cell with 1897-1899 (despite that record going to 1980) from Hamelin Pool.

Pamela Gray
June 23, 2012 9:03 am

Station dropout has always intrigued me. Here’s why. Station dropout may have occurred in a non-random fashion. Therefore the raw data collected from stations over time may have a non-random bias in them before they were homogenized, and homogenization may then have made the data even more biased.
How would one determine this? One way would be comparing ENSO-driven temperature trend patterns from analogue years and dated station dropout plots. I would start with the US since ENSO patterns are fairly well studied geographically. Overlaying a map of dated station dropout on temperature and precipitation pattern maps under El Nino, La Nina, and neutral ENSO conditions through the decades may be quite revealing and possibly demonstrate explained variance being related to non-random station dropout interacting with ENSO oscillations and patterns. If so, homogenization may have only served to accentuate the bias in the raw data itself.

climatebeagle
June 23, 2012 9:21 am

Rob Dawg says: June 23, 2012 at 8:05 am
“Simply a stunning admission.”
Exactly, that jumped out at me as well; how on earth did anyone think that was a good idea?
Thanks for posting it, it would be good if both sides could continue the dialog to resolution, whatever any outcome may be, (e.g. both have valid points or acceptance that one (or both) analysis is wrong/weak/strong/right/…)

Pamela Gray
June 23, 2012 9:25 am

By the way, local low-temperature records in Oregon are falling right and left, as are precipitation records. Why? Not because of global cooling but because of the ENSO oceanic and atmospheric conditions we currently have in and over the great pond to the West of us that uniquely affect our weather patterns. These oceanic and atmospheric conditions have been in place, more or less, for several years now, resulting in several predictable changes in flora and fauna response and broken records related to lows over the same number of years in Oregon.
What is interesting and connected to my earlier comment, is that these same conditions result in year in and year out heat waves and drought elsewhere. Which leads me to restate: Station dropout patterns over time need further study in relation to ENSO conditions and the fairly well established geographic temperature trends tied to ENSO conditions. I think the raw data may be biased.

dp
June 23, 2012 9:25 am

Start over and show where Smith’s error is. All you’ve shown is you have arrived at a different result with different methods. No surprise. I could do a third and be different again. It would not show that Smith or you are right (or wrong).

Andrew Greenfield
June 23, 2012 9:27 am

There is NO significant warming when will they learn?
http://www.drroyspencer.com/wp-content/uploads/UAH_LT_1979_thru_May_2012.png
There is no hotspot in the TLT

E.M.Smith
Editor
June 23, 2012 10:00 am

I haven’t had time to look through the comments, but to address a couple of points quickly:
The version of v3 which I used is:
ghcn.tavg.v3.1.0.20120511.qcu.dat
The version used ought not to have much effect. If it does, then there are far more variations in the data than IMHO could be reasonably justified.
The “attack” on station matching is just silly. I went out of my way, especially in comments, to make clear that the “shift” was not a result of individual data items, but a consequence of the assemblage of records. It isn’t a particular record that has changes, it is WHICH records are collected together. As I don’t do “station matching”, there isn’t any consequence from station IDs changing. The “country code” portion changes dramatically, but the WMO part less so. I do make assemblages of records by “country code”, assuring that the correct mapping of countries is done, as a different approach to getting geographically smaller units to compare. That map is at:
http://chiefio.wordpress.com/2012/04/28/ghcn-country-codes-v1-v2-v3/
In large part, this article asserts ‘error’ in doing something that I specifically asserted I did not do. Match stations or compare them data item by data item. If asked, I would freely respond that most individual records from individual stations will have highly similar, if not identical, data for any given year / month. (There are some variations, but they are small). It ignores what I stated was the main point; that the collection of thermometer records used changes the results in a significant way.
In essence, “point 1” is irrelevant as it asserts a potential “error” in something I specifically do not do. Station match.
The chart showing an equal distribution to each side of zero for non-identical records ignores the time distribution. (I’m not making an assertion about what the result from a station by station comparison on the time domain, just pointing out that it is missing). For the assemblage of data, the time distribution of change matters. The deep past shows more warm shifting while the middle time shows more cold, then swapping back to warmth more recently. The data in aggregate have a cooler time near the periods used as the “baseline” in codes like GIStemp and by Hadley, but is warmer recently (and in the deep past). It would be beneficial to make that “by item” comparison 3 D with a time axis. (No, I’m not asserting what the result would be. As it is a “by item” comparison, it says nothing about the impact of the data set in total).
Point 2 has some validity in that I do handle data drop out differently from the classical way. I simply hold onto the last value and wait for a new data item to make the comparison. So, for example, if Jan 1990 has data, but Jan 1989 does not, while Jan 1988 does again, the classical method would give 0 0 0 as the anomaly series. (Since it would ‘reset’ on the dropout and each ‘new set’ starts with a zero). That is, IMHO, fairly useless especially in data with a large number of data dropouts (as are common in many of the stations). Lets say for this hypothetical station the temps are 10.1 missing 10.0 (and on into 9.8, 10.4 etc.). I would replace the 10.1 with a zero (as FD has ‘zero difference’ in the first value found) replace the second value with zero (as it is missing, there is ‘no difference’ found) and then the third value be compared and a 1/10 C difference found. It says, in essence, these two Januaries have changed by -0.1 C so regardless of what happened in the middle year, the slope is 0.1 C over 2 years. A value of -0.1 C is entered for the 3rd value. So which is more accurate? To say that NO change happened over 3 years that start with 10.0 and end with 10.1, or say that a change of 0.1 happened? I think that on the face of it the ‘bridging’ of dropouts is clearly more accurate; though it would benefit from more formal testing.
For individual stations, this takes what would otherwise become a series of “resets” (and thus, artificial “splice artifacts” when they are re-integrated into a trend) and instead has one set of anomalies that starts with the most recent known value, ends with the last known value, and any dropout is bridged with an interpolated slope. It will clip out bogus excursions caused by resets on missing data (as the new end point data gets effectively ignored in classical FD) and will preserve what is a known existing change between known data. Using the data above, classical FD would find 0, 0, 0, -0.2 for an overall change of 0.2 cooler moving left to right over 4 years. By bridging the gap instead of taking a reset, my method gets 0, 0, -0.1, -0.2 and finds an overall change of -0.3 going from 10.1 to 9.8 (which sure looks a whole lot more correct to me…)
But if you want to hang your hat on that as “causal”, go right ahead. It won’t be very productive, but a thorough examination of the effects on larger assemblies of data would be beneficial. So while I’d assert that point 2 is, well, pointless; being a new minor variation on the theme it could use more proving up.
Per point 3, I’ve given you the version above. Point now null. (And frankly, if you wish to assert that particular versions of GHCN v3 vary that much, we’ve got bigger problems than I identified with version changes.)
In the end, this ‘critique’ has one point with some validity (but it isn’t enumerated). That is the “grid / box” comparison of the data. However, what it ignores is the use of the Reference Station Method in codes such as GIStemp (and I believe also in NCDC et al. as they reference each other’s methods and techniques – though a pointer to their published codes would allow me to check that. If their software is published…) So, you compare the data ONLY in each grid cell and say “They Match pretty closely”, and what I am saying is “The Climate Codes may homogenize and spread a data item via The Reference Station Method over 1200 km in any one step, and may do it serially so up to 3600 km of influence can be had. This matters.”
Basically, my code finds the potential for the assemblage to have spurious influence via the RSM over long distances and says “Maybe this matters.” while your ‘test’ tosses out that issue up front as says ‘if we use a constraint not in the climate codes the data are close’. No, I don’t know exactly how much the data smearing via the RSM actually skews the end product (nor does anyone else, near as I can tell) but the perpetually rosy red arctic in GIStemp output where there is only ice and no thermometers does not lend comfort…
So as a first cut look at the critique, what I see is a set of tests looking at the things I particularly did not do (station matching) and then more tests looking at things with very tight and artificial constraints (not using the RSM as the climate codes do) that doesn’t find the specific issue I think is most important as it excludes the potential up front. Then a couple of spurious points ( such as version of v3) that are kind of obviously of low relevance.
My assertion all along has been that it is the interaction between records and within records that causes the changes in the data to “leak through” the climate codes. That RSM and homogenizing can take new or changed records and spread their influence over thousands of kilometers. That data dropouts and the infilling of those dropouts from other records can change the results. That it is the effective “splicing” of those records via the codes (in things like the grid / box averages) that creates an incorrect warming signal. By removing that homogenizing, the RSM, and the infilling, you find that they have no impact. What a surprise… Then, effectively, I’m criticized because my code doesn’t do that.
Well this behaviour is by design. I want to see what the impact of the assemblage is when ‘smeared together’ since that is what the climate codes do. Yes, I use a different method of doing the blending (selection on ranges of country code and / or WMO number) than ‘grid box’ of geographic dimensions area weighted. This, too, is by design. It lets me flexibly look at specific types of effect. (So, for example, I compared inside Australia with New Zealand. I can also compare parts of New Zealand to each other, or the assembly of Australia AND New Zealand to the rest of the Pacific – or most any other mix desired). Since one of the goals is to explore how the spreading of influence of record changes in an assemblage (via things like RSM and homogenizing) can cause changes in the outcome, this is a very beneficial thing.
This critique just ignores that whole problem and says that if we ignore the data influence of items outside their small box there is no influence outside their small box.
Oddly, the one place where I think there is a very valid criticism of the FD method is completely ignored. Perhaps because it is generic to all FD and not directed at me? Who knows… FD is very sensitive to the very first data item found. IF, for example, our series above started with 11.5, then 10.1, missing, 10.0, 9.8 etc. We get an additional -1.4 of offset to the whole history. That offset stays in place until the next year arrives. If it comes in at 10.1, then that whole offset goes away. The very first data item has very large importance. I depend on that being ‘averaged out’ over the total of thermometers, but it is always possible the data set ends in an unusual year. (And, btw, both v3 and v1 ought to show that same unusual year…) But any ‘new data’ that showed up in 1990 for v3 that was missing in v1 when it was closed out will have an excessive impact on the comparison.
At some point I’m planning to create a way to “fix that” but haven’t settled on a method. The ‘baseline’ method is used in this critique (and in the climate codes) and it has attractions. However, when looking for potential problems or errors in a computer code, it is beneficial to make the comparison to a ‘different method’ as that highlights potential problems. That makes me a bit reluctant to use a benchmark range method. Doing a new and self created method is open to attacks (such as done here on the bridging of gaps technique) even if unwarranted. One way I’ve considered is to just take the mean of the last 10% of data items and use it as the ‘starting value’. This essentially makes each series benchmarked via the first 10% of valid data mean. But introducing that kind of complexity in a novel form may bring more heat than light to the discussion.
In the end, I find this critique rather mindlessly like so many things in ‘climate science’. Of the form “If we narrow the comparisons enough we find things in tiny boxes are the same; so the big box must be the same too, even if we do different things in it.” Makes for nice hot comments threads, but doesn’t inform much.
I’m now going to have morning tea and come back to read the comments. I would suggest, however, that words like “error” and “flawed” when doing comparisons of different things in different ways are mostly just pejorative insults, not analysis. I don’t assert that the ‘grid / box’ method is in ‘error’ or ‘flawed’ because it will miss the impact of smearing data from an assembly of records over long distances. It is just limited in what it can do. Similarly it is not an ‘error’ or ‘flawed’ to use a variable time window in FD. (That is, in effect, what I do. Standard FD just accepts the time window in the data frequency, such as monthly, I just say ‘the window may be variable length, skip over dropouts’.) It is just going to find slightly different things. And, IMHO, more accurate things. It is more about knowing what you are trying to find, and which tool is likely to find it. As I’m looking for “assemblage of data impacts”, using a variable “box size” to do various aggregations (via CC/WMO patterns) lets me do that; while using a fixed geographic grid box without RSM and homogenizing will never find it. That isn’t an error or flaw in non-homogenized non-RMS small grid boxes, it is just a limitation of the tool. That the tool is used wrongly (if your goal is to find those effects) would be the error.
So, in short, the analysis here says I didn’t find what I wasn’t looking for, and they don’t find what I did look for because they don’t look for it.

June 23, 2012 10:05 am

Steve and Zeke, you’ve gone above-and-beyond here, and it’s much appreciated. I realize that you don’t owe us anything else, and I think it’s a good idea to push people to delve into the data for themselves.
At the same time, I have quite a few data sets on my laptop where I could bang out a new graph in a minute or two, but it would take you quite some time to figure out my R code, which data files I’ve given you, what you have to obtain from elsewhere, etc. In like manner, I’m wading through your Stata (a truly ugly language, like SAS, SPSS, gretl, and that whole command-oriented genre of statistical programs), trying to figure out what you’ve done, what files you’ve supplied, what files you have not, etc. I then suspect I’ll have to make some enormous data downloads, and finally, once I’ve reproduced your results in R, I’ll be able to see how your Figure 3 histogram changes over time and location.
I think it’s a very intelligent observation that an apparent canceling of adjustments over the entire period might actually obscure a trend of adjustments over time, or over latitude, etc. You don’t have to pursue that line of reasoning to all of its boundaries, but I would guess that you could fairly quickly do a lattice-style histogram by, say, decades to settle the issue to a first approximation.
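For what it’s worth, here is a sketch of that decade-by-decade check in Python (pandas/matplotlib rather than Stata; the file name and columns are hypothetical placeholders for wherever the matched station-month data ends up):

# Hedged sketch: split the v1-vs-v3 differences by decade and histogram each,
# to see whether a near-symmetric overall distribution hides a drift over time.
# "ghcn_v1_v3_matched.csv" and its columns (year, v1_temp, v3_temp) are assumed.
import pandas as pd
import matplotlib.pyplot as plt

matched = pd.read_csv("ghcn_v1_v3_matched.csv")
matched["diff"] = matched["v3_temp"] - matched["v1_temp"]
matched["decade"] = (matched["year"] // 10) * 10
nonzero = matched[matched["diff"] != 0]

groups = list(nonzero.groupby("decade"))          # assumes more than one decade of data
fig, axes = plt.subplots(nrows=len(groups), figsize=(6, 2 * len(groups)), sharex=True)
for ax, (decade, grp) in zip(axes, groups):
    ax.hist(grp["diff"], bins=100, range=(-2, 2))
    ax.set_title(f"{decade}s  (n={len(grp)}, mean={grp['diff'].mean():.3f} C)")
plt.tight_layout()
plt.show()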
Thanks again!

phi
June 23, 2012 10:05 am

Steven Mosher,
OK, that’s right: TObs applies only to the US. You have nothing.
“Of course you can assess the magnitude. Its a switch in the code you can turn off
or turn on.”
No, you can’t disable this with BEST. What you say does not make sense.

June 23, 2012 10:38 am

phi
You’re starting to catch on, phi.

June 23, 2012 10:42 am

On a tangent to this: Thermometer bulbs shrink over time; see the chart at the top of page 98:
http://tinyurl.com/7jngj73
Its contribution might depend on how often thermometers are replaced (?)

June 23, 2012 10:45 am

Mosher, I used Environment Canada monthly summaries.
Again and again you miss the main point:
“But where the heck did BEST get the -13.79 number in the first place.”

E.M.Smith
Editor
June 23, 2012 10:46 am

Steven Mosher says:
June 23, 2012 at 12:47 am
One actually needs the name of the dataset. and actually the code that downloads and reads it in.

The name is in the last comment. The “code” that downloads it is the FTP client built into most browsers. The code that reads it is published in the source code posting. (Not very remarkable code. A fixed-format FORTRAN read… following the file layout from the data source.)
Frankly, I think that’s a bit of a Red Herring as any download of v3 ought to be substantially the same and ‘how to read it’ comes from their web site. But yes, it’s nice to have the specifics.
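For anyone who wants to reproduce the read step without FORTRAN, here is a rough Python sketch. It assumes the fixed-width layout described in the v3 README (11-character station ID, 4-character year, 4-character element, then twelve value-plus-flags groups, with -9999 marking missing); check the README shipped with the file before trusting the exact columns.

# Rough sketch of reading a GHCN-M v3 .dat file.  Layout assumed from the v3
# README: ID in columns 1-11, year 12-15, element 16-19, then twelve 8-character
# groups (5-char value in hundredths of a degree C plus 3 flag characters).
def read_ghcn_v3(path, element="TAVG"):
    records = {}
    with open(path) as f:
        for line in f:
            if line[15:19] != element:
                continue
            station = line[0:11]
            year = int(line[11:15])
            monthly = []
            for m in range(12):
                start = 19 + m * 8
                value = int(line[start:start + 5])
                monthly.append(None if value == -9999 else value / 100.0)
            records[(station, year)] = monthly
    return records

# Example (file name illustrative):
# data = read_ghcn_v3("ghcnm.tavg.v3.qcu.dat")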
@Amino Acids in Meteorites:
A good point that is often obscured. In the 2009 era version of GIStemp that I analyzed, for example, temperatures are kept AS temperatures right up until the very last “grid / box” anomaly creation stage. So all the RSM and Homogenizing are done on recorded temperatures, not anomalies (though it looks like an obscure kind of anomaly is temporarily created in step 2 IIRC when it does some of the UHI “adjustment” – but it’s based on an assemblage of records that are a new collection with each change of the data set… It picks and chooses out of the available data until it has ‘enough’ records, so change what records are in, you get a different ‘selection’. This can be up to 1200 km away from the target station and may contain ‘homogenized’ data from a further 1200 km away, so limiting things to inside a small grid / box misses that.)
Only at the bitter end are “anomalies” created, but these are NOT between readings on a thermometer, nor readings on any specific thermometers. They are created between two eras of the “grid / box”: one from the “baseline” and one from the “present”. In the current version of GIStemp they have 16,000 grid/boxes, yet not nearly enough thermometers to have even ONE in most grid boxes. So most of the “anomalies” are being created between two values that are both a “polite fiction” where there is no real temperature data. (These grid / boxes are filled with values based on, yes, data from up to 1200 km away that may have been UHI “adjusted” from a further 1200 km away, based on data that may have been homogenized from up to 1200 km away, that is based on ‘corrected and adjusted monthly averages’ based on looking at “nearby ASOS” stations in the QA process up to an unknown distance away…)
IMHO, that is why it is important to look at effects from blending data from outside any individual grid / box. Since it is what the climate codes like GIStemp do (or did in 2009-10 when I worked on it) and if you ignore that, you miss something important.
I’ll watch the videos a bit later, they look interesting…
@Phi:
The segments WERE separate in v2, but in v3 the data come ‘pre-homogenized’. (Which likely also means the 1200 km ‘reach’ of GIStemp may have been moved upstream to NCDC, but I’ve not gone through the newer GIStemp to see if they removed their homogenizing or just left it in to rehomogenize.)
T. Fowler:
My method of handling the gaps is not in the peer reviewed literature, so is “unproven”. I think it is trivially obvious to do “no bad thing” and fix some problems in the peer reviewed form. But technically, yes, it is “unproven” and it ought to be examined closely (as any change in process or code ought to be closely examined. Bugs and ‘issues’ can be incredibly subtle things…)
:
That’s the ‘grid / box’ issue… and the composition of change issue… Perhaps I ought to have read comments before writing the quick response, look like folks already covered some of it 😉
@mfo:
I have “variable awake time” and Anthony had even odds that I was awake then (as I have been most of the prior week) or not (as today). The world is not time-synchronous anyway, so while I appreciate the sentiment, it’s just how life is. Besides, this lets me get a nice comment thread to read 😉
(Time to put milk and sugar in the tea… back in mo…)

June 23, 2012 10:49 am

Wrong, phi.
Start with Canada, which adjusts for TOBs.

June 23, 2012 10:53 am

Pamela, you might be interested in the tests of homogenization algorithms in blind studies. Experiment trumps your speculation.

June 23, 2012 11:05 am

Rob, the point about there being no translation table between version 1 and 3 really concerns E.M.’s comparison. You can’t simply match on ID number. In all cases you must check the location, the name, the actual data, and then the ID. IDs can change as suppliers correct or upgrade their systems. Relying on ID alone is a known recipe for disaster. Been there, done that.

June 23, 2012 11:09 am

Mosher, I looked at the daily data for Malahat. There was one E value and 4 M (missing).
The average is 5.825925926 with the E value, 5.95 without. EC used 5.8 for the monthly summary value.
http://www.climate.weatheroffice.gc.ca/climateData/dailydata_e.html?timeframe=2&Prov=BC&StationID=65&mlyRange=1920-01-01|2005-04-01&Year=1992&Month=1&Day=22
But I am still really curious where BEST got -13.79C for Dec 2002 from this:
http://www.climate.weatheroffice.gc.ca/climateData/dailydata_e.html?timeframe=2&Prov=BC&StationID=65&mlyRange=1920-01-01|2005-04-01&Year=2002&Month=12&Day=22
All the means were above zero (and one value was missing).

June 23, 2012 11:17 am

Michael R, if the differences were distributed in time as you suggest, the trends would be different. That is the point of comparing trends.

A C Osborn
June 23, 2012 11:20 am

sunshinehours1, I am a bit surprised about the value being lower, as I have found hundreds of winter values in BEST that are higher than the summer values, when they should be at least 5-10 degrees lower, maybe even 20 degrees lower. The data is riddled with those kinds of errors.

Pamela Gray
June 23, 2012 11:30 am

Steven, observation should precede experimentation. My speculation should properly begin with observation in classic experimental design. If observed (the 1st step following speculation) dropout and ENSO patterns show correlation, homogenization (the experiment – the 2nd step) would be better informed. It is okay for you to speculate that temperatures have risen in response to CO2. It is okay for me to speculate otherwise. But what of the 1st and 2nd steps in research methods? Have you done the necessary observations first or did you skip to experimentation?

Willis Eschenbach
Editor
June 23, 2012 11:31 am

Steven, first, my congratulations to you and Zeke for an interesting and meticulously documented post. It is clear, well researched and presented, and eminently checkable, just like science should be done.
Heck, contrary to your usual practice, you even actually answered a few questions. However, you seem to be falling back into your bad habits when you say in the comments:
steven mosher says:
June 23, 2012 at 7:22 am

Steven Mosher,

“Interesting that you think Tobs only applies to the US. It doesn’t.”

If you say this is that you have a case in mind. Have you a reference?

#############
yes. do more reading and post your results.

I gotta say, that habit of yours of making cryptic responses that simultaneously don’t answer anything, insult the person asking the question, and pretend to great knowledge on your part, is getting really old.
Phi has posed an interesting question. He has asked you, as is bog-standard scientific practice, simply for a reference for your claim. Either answer the man’s question or admit that you don’t have a reference. The kind of response you have given is both meaningless and damaging to your reputation.
w.

June 23, 2012 11:35 am

Could we get this by decade: “Figure 3: Difference between GHCN v1 and GHCN v3 records matched by station lat/lon, excluding cases of zero difference.”

paddylol
June 23, 2012 11:37 am

I know enough about statistics to know that global sea surface temperatures in all global temperature data sets are modeled and extrapolated from an inadequate number of data sources. Since 70% of the planet’s surface is ocean, the extrapolated data are speculative and readily manipulated.

JT
June 23, 2012 11:52 am

@Mosher
@Smith
Question: if you took each raw temperature measurement and plotted it against the time when the measurement was made, from the earliest known temperature measurement to the latest, so as to create a complete scatterplot of available raw data, what would it look like?
Question: if you computed the trend through the whole of the above-described scatterplot using ordinary least squares regression, what would the slope of that trend be?
Question: if you computed a set of trends through that scatterplot, beginning with the first month of data only (i.e. a one-month trend) and recomputing the trend with one additional month’s data points added each time, again using OLS, to create a sequence of slopes of increasing temporal length, and then plotted the value of each slope against the date of the month added in order to compute it, what would that graph look like?
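For the third question, here is a sketch of the expanding-window calculation in Python (illustrative only; 'times' and 'temps' stand for however one chooses to assemble the full scatter of raw readings):

# Expanding-window OLS slopes, recomputed as each additional month of data is
# added, then plotted against the end date of each window.
import numpy as np
import matplotlib.pyplot as plt

def expanding_slopes(times, temps, step_months=1):
    """OLS slope of temps vs times (decimal years) for each expanding window."""
    order = np.argsort(times)
    t, y = np.asarray(times)[order], np.asarray(temps)[order]
    month_edges = np.arange(t[0], t[-1], step_months / 12.0)[1:]
    ends, slopes = [], []
    for edge in month_edges:
        mask = t <= edge
        if mask.sum() < 3:
            continue                      # need a few points before a slope means anything
        slope, _ = np.polyfit(t[mask], y[mask], 1)
        ends.append(edge)
        slopes.append(slope)
    return np.array(ends), np.array(slopes)

# ends, slopes = expanding_slopes(times, temps)
# plt.plot(ends, slopes); plt.ylabel("OLS slope, C per year"); plt.show()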

phi
June 23, 2012 12:21 pm

Chas,
“Thermometer bulbs shrink over time”
It’s an interesting point, I addressed it there:
http://rankexploits.com/musings/2012/a-surprising-validation-of-ushcn-adjustments/#comment-95708
E.M.Smith,
“The segments WERE separate in v2, but in v3 the data come ‘pre-homogenized’.”
So we actually have no unhomogenized global temperature, because that would require unadjusted aggregate series.
Steven Mosher,
“start with canada which adjusts for tobs”
Well, do you have a reference which shows that “the most significant adjustment is Tobs” for Canada?

E.M.Smith
Editor
June 23, 2012 12:36 pm

@Paul in Sweden
I’m a volunteer, so not a lot of resources consumed by me 😉
But yes, the amount of global waste over the fantasy of a Global Average Temperature is astounding. That it could instead be used to absolutely fix and solve a large number of significant problems is a tragedy. That the “answers” that come from AGW Panic are ‘exactly wrong’ is heart wrenching. (For example, if we did Coal To Liquids or Gas To Liquids, we could be free of OPEC oil in 5 to 10 years, fairly easily, and at a lower cost than gasoline was here last week. Similar fixes to global “fuel poverty” are just as available. Then the “oil wars” could end… as just ONE example of how the misallocation of resources is a travesty.)
FWIW, the very notion of a Global Average Temperature is based in a fundamental error of Philosophy of Science. It is an intrinsic or intensive property. You simply can not average two temperatures from different things and have any meaning in the result. It is an obscure, but vitally important point; that is consistently ignored by the entire Climate Panic Industry…
http://chiefio.wordpress.com/2011/09/27/gives-us-this-day-our-daily-enthalpy/
http://chiefio.wordpress.com/2011/07/01/intrinsic-extrinsic-intensive-extensive/
Take two pans of water, one at 0 C and the other at 20 C, average temperature is 10 C. Mix them. What is the resultant temperature? You do not know. What are the relative masses of water in the two pans? Is that 0 C frozen or melted? Same problem in climate and temperatures. Average a place with 1 C change from below zero to above zero and 2 feet of snow melt with a desert that has a 1 C change the other way, you get “no change”, yet tons of snow melt heat of fusion moved… It is just wrong to use average temperatures for heat flow problems. But “it’s what they do”…
So I’m occasionally a bit “bummed” that so much time is spent examining “average of temperatures” since it really is an “angels and pins” question; even when I do it. But as that is where the “debate” is mired, it’s the only place to participate. Just somewhere in the back of your mind remember that GAT (or any other average of temperatures) is just meaningless.
@david_in_ct:
I could say “They won’t do that as it will find what I found”, but that would be cheeky… 😉
In defense of what they have done: In forensics you always want to look at things from a different point of view. Watching the magician from the audience front row doesn’t illuminate any “issues” as well as looking from back stage (or in the basement under the stage…)
So it IS important to have a divergent look at things.
My only complaint about this critique, really, is that “part two” is typically to look at things in the same way, then “part three” is to “compare and contrast” followed by “why are the things that are different, different? What is going on?”. That is where learning and wisdom come, and that is where the “perp” is typically caught and the Magician has his trick “found out’. All those are missing here.
No curiosity at all about why a ‘grid / box’ method finds similarity while a FD variable assortment method finds interesting patterns of divergence. Just “He must be wrong because he didn’t do it my way”. It isn’t about which way is “right”, it is about how each method illuminates what can be happening in the climate codes. That this ‘grid / box’ method finds similar data, yet the climate codes smear data all over, and my method that blends data also finds difference; that ought to cause much more curiosity about the nature of the data blending and smearing in the climate codes. Instead it just devolves into a more tribal ‘fling poo’ display. “I’m right, he is in error”. Not very useful, IMHO.
So the “look at it different” is a valid step, but only a first step.
It could be that the data are being carefully manicured in just such a way that the distribution of changes about a mean, ignoring date, gives a surprisingly symmetrical curve. (I’d expect some kind of small bias in ‘accidental’ non-directed adjustment…) It could be that the ‘grid / box’ averages are so close because the data are being manicured up front to assure that result (but knowing that the actual effect will be different when run through an RSM or Homogenizing blending step). Or it could just be a reasonable and randomly collected set of data that has those properties, but tickles an unexpected ‘bug’-like behaviour when spread around. It could be malice, so a more paranoid forensic mindset ought to be applied; or it could be simple error, and a less paranoid and more subtle analytical examination ought to be applied. Or, in fairness, it might just be that FD has a different result than the grid / box / baseline method (and that then turns into an argument over which is more ‘accurate’).
We just don’t know. (And “Trust me” from the data set creation folks is NOT acceptable, not now that Hansen has testified that breaking the law is justified if your goal is “good” enough.)
But even IF it is an artifact of FD vs grid/box/baseline: That, then, just raises the question of how do we benchmark the grid/box/baseline codes in use so as to prove THEY are not out of whack with the warming they find? If 1/2C to 1C of “warming” depends on method, how do you prove the “error bars” are less than 1/10 C when you can not run a benchmark through the codes? Or prove the data unbiased to that precision?
At any rate, taking a different POV is valid, just incomplete.
@Geoff Sherrington
That’s the kind of “devil in the details” stuff that gets ignored in the kind of examination done in this critique. It is the kind of thing that the forensics mindset is dedicated to finding. It is why doing a broad comparison of data sets is an important thing to do, and why past versions ought to all be archived and publicly available (but are not).
@Wayne:
Pixel counting? Talk about dedication 😉 Nicely done, though…
Yes, things always cool the past and warm the present (with the only exception being the occasional warming of the very deep past where the data are thrown away in HadCRUT and GIStemp.)
My question can be recast in the context of your finding as “To what extent does the small induced bias found in a non-spreading grid/box examination become a larger bias when spread via homogenizing, UHI “adjusting” and infilling via the RSM?” The critique here can not answer that question. My examination points out that there’s plenty of opportunity for selected data items to have more impact when spread outside the grid / box. I ask the question “Is that enough to account for the imputed ‘warming’?” Others say “It can not be a problem, trust us.” I’d rather not trust. I’d rather verify…
@Phi:
It is exactly that ‘joining of segments’ as a kind of ‘splice artifact’ that IMHO can be a source of issues. In places like Marble Bar Australia, GIStemp finds a ‘warmer than ever’ trend; yet the record temperature there has never been matched. IMHO that is evidence for a “splice artifacty” behaviour in GHCN/GIStemp interactions. So my approach is to look for the potential for that kind of artifact stimulation in the data. (Segments with a low start point and a high end point). FD, being more sensitive to end states, will react more to those “high ends” and show the rest of the past lower in comparison. It will illustrate the “splice artifact” potential in the data.
The critique here takes deliberate action to suppress that effect (uses a baseline from a semi-random spot in the middle) then simply asserts that the various climate codes can’t have a sensitivity to the splice as they do something similar. I find that inadequate (especially in the face of the known data smearing methods and the specific examples of places like Marble Bar where the artifact effect is clearly present.) So theory runs headlong into existence proof…
No, I’ve not been able (yet) to prove exactly how the artifact effect gets through the codes. (GIStemp is brittle to station change so hard to benchmark) but to simply assert it does not matter because a “toy world” version of the code shows only a small part of it is, well, inadequate.
@all:
I see Steven Mosher is more interested in tossing insults than thinking. Spending more time with the Warmers, eh?
BTW, I’ve generally been pretty careful to say I use a variation on a peer reviewed method and NOT to claim that that variation was peer reviewed. Please do not assert that I lie about that. I’ve said FD is peer reviewed, and it is; and then I point out where I do something different from the classical form. That, too, is true.
Per “Didn’t Get The Memo”:
Steve seems to have not read my statements about wanting the end point sensitivity in the look for splice artifact potential. Oh well. He also seems to have ignored my response in the prior thread where I quoted Jeff Id saying 10,000 segment compares had near zero error but down in the 50 range there was an issue; and where I pointed out that using FD on over 6000 thermometers, by month, gives over 72,000 segments and is well over 10,000. Oh well.
Steve seems fixated on what finds the perfect Global Average Temperature (ignoring that it is impossible and ignoring that it is meaningless even if found). Not “got the memo” on intrinsic properties I guess. Oh Well.
Here’s a hint, Steven: drop the insult tone. It makes you look vindictive and contributes nothing. I’d also suggest being careful hanging out with Warmers. It is slowly making your presentation negative and bitter in public.
Personally, I don’t think there will ever be a “right way to go”. Just different methods with different uses and different “issues”, so usable for different things. The world is not a nail, even if you love your hammer…
:
I spent a while trying to find “raw” data. It doesn’t exist. There are, as Steve calls it, “First report” data. And it has errors in it.
I noted one where I found 144 C for a station in the USA.
Now Steve is happy to say ~’but we can find that 144 C and change it to something valid’. I think “Hmmm… So we catch the 144 C, but do we catch the 40 C that ought to be 38 C?” There is a system that looks for values some large number of sigmas off and replaces them with an average of “nearby” ASOS stations (think hot airports; and being an average, it will by definition suppress things like large down spikes), but this does nothing to catch whole classes of bad data.
That the electronic instruments seem to regularly “fail high” to insane values, THEN are caught and fixed, implies a probable slow drift to higher readings that would not be caught, THEN a catastrophic fail high. A periodic calibration is supposed to prevent / catch this, but that would just result in a series of segments with ‘start low, end high’ effects that, in smearing / splicing, can induce bias into the result.
So while some folks are happy to just say “Trust me, we fix it good.”, I’m less willing to accept that. The data from the electronic thermometers that “sucked their own exhaust” and heated from the humidity measurement device is still in the record, for example.
:
One clarification:
While I do think all the adjustment and such of individual data items is an issue, I think that the larger issue is the potential for changes in which records are in, vs out, of the data set to have impacts. What I show in the v1 vs v3 comparison is that they DO have impacts, and those can be large.
What the critique here does is say in essence, if we avoid the question of station records in a small area having influence over others we don’t find that problem. Yes, don’t look for it, you won’t find it. Then the assertion is left that the various climate codes are also insensitive to the effect (yet we know that they do data spreading).
IMHO that is the larger issue, not things like tossing out a 144 C and replacing it with an interpolation from the neighbors…
Basically, my major point is just that “The way you look at the data has more effect than the presumed Global Warming”. So is GIStemp right, or wrong? You can not know. It is not possible to benchmark their code and do comparative runs on different data. But you can put comparative sets of data through different systems and get an idea of the general range of sensitivity. That shows a range greater than the Global Warming signal. So ‘choice of method’ can have more ‘warming’ than the GW signal. Choice of data set can have more ‘warming’ than the GW signal. I find that “a problem”. Others like to say “Trust us, we have it right”. This critique says “Here is A way that is less sensitive”. Which is sort of a “so what” kind of moment for me…
@Pamela Gray:
IMHO, the “QA method” is even worse than that. It takes a collection of “nearby” ASOS stations at airports and uses them to replace any data found too far from the expected value. IMHO this will incorporate Airport Heat Islands by default AND via averaging them, prevent any low going extreme values from showing up. (An average always suppresses range).
For homogenizing, the techniques vary, but largely use the same kind of “average a bunch” to get a value. That will always tend to put in mid-values and not what would be there in the real world of actual high or low range values. And what if “the bunch” have “issues”? Like Airport Heat Island? Well, you get that too. (And very large percentages of present GHCN data come from Airports, while by definition none came prior to 1914).
The whole issue of how potentially wrong data is detected and handled is, IMHO, not very good. IIRC the threshold for tossing a “bogus” value was a large fixed temperature in one code ( so low excursions will be tossed more often than high excursions… Inspection of the data shows cold spikes are more extreme than high spikes). In others it is several std deviations. That will miss a whole lot of “somewhat off” errors (like those electronic thermometers that “sucked their own exhaust”).
It’s a modestly large source of error that is largely ignored, IMHO.
(Time for another break… back soon.)

Richard T. Fowler
June 23, 2012 12:50 pm

E.M., thanks for your response to my point that Steve Mosher also responded to. I think you’ve done some very good work here.
RTF

Chuck Nolan
June 23, 2012 1:09 pm

Is the data “Fit for purpose”?
First you have to know what their “purpose” is.
The people who want to use this data want to take control of energy. All production, manufacturing, development etc would be with their approval. They want to use it to stop fossil fuel use. They want to use it to stop third world advancement. They want to use it to tell you where you can live, what you can buy and how to live your life. All this based on “the adjusted data”.
So I ask, is it fit for purpose?

June 23, 2012 1:18 pm

steven mosher says: June 23, 2012 at 7:17 am
You can expect some updates to that Sante fe chart in the coming months. I suspect folks who do spectral analysis will be very interested.
I will look forward to your results.
I suggest separating the two hemispheres: the South is less volatile (ocean inertia and the CPC flywheel effect), while the North is affected by gyres and is more in sympathy with the GMF.
Till then this is what I get
http://www.vukcevic.talktalk.net/NH+SH.htm
When done I’ll email you magnetic data, so you can have some fun with it

wayne
June 23, 2012 1:27 pm

EM: “Pixel counting? Talk about dedication 😉 Nicely done, though…”
That’s me stepping all the way back to Newton, physically counting areas under the curve when no closed form is available (or, in this case, when the chart is all you have to work with) but that burning question presses you to answer it! It’s just a quick and easy C# program to count each color’s pixels. Pick up any chart and you can either color under the curve, or, like this one, color the bars, and presto, you’ve got the approximate multiple integrations. Does work. And if you ever try it, use mspaint to first convert the png or jpg to 16-color or 256-color format and pick only pure colors so you can easily identify which count goes with which color. A threshold lets me exclude numerous colors that have less than ‘n’ pixels, which makes it real easy. Your limits are on the axes already.
But tell me, is that your Figure 3? If so you should have the exact ratio and I’m curious just how close this method comes. Only if you get the time.
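For anyone wanting to try the same trick without C#, here is a rough Python/Pillow sketch of the idea (the file name is illustrative; min_pixels plays the role of the "fewer than n pixels" filter described above):

# Sketch of the pixel-counting approach: quantize a chart image to a small
# palette, count pixels per color, and treat the counts as relative areas.
from collections import Counter
from PIL import Image

def color_pixel_counts(path, n_colors=16, min_pixels=50):
    img = Image.open(path).convert("RGB")
    img = img.quantize(colors=n_colors).convert("RGB")   # force a small, pure palette
    counts = Counter(img.getdata())
    # Drop colors below the pixel threshold (anti-aliasing fringe, axis text).
    return {color: n for color, n in counts.items() if n >= min_pixels}

# for color, n in sorted(color_pixel_counts("figure3.png").items(), key=lambda kv: -kv[1]):
#     print(color, n)
# Ratios of the counts give relative areas; the axis limits supply the scale.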

Bill Illis
June 23, 2012 1:47 pm

Are we to take away from this that there are no significant adjustments to the GHCN data?
The USHCN adjustments are nearing +0.5C, but the GHCN to NCDC Land difference is just 0.004C per decade?
This requires confirmation, because that is what your charts and data are showing.
——————–
Perhaps you could re-try the analysis with the original 1992 GHCN data held at the CDIAC.
http://cdiac.ornl.gov/ftp/ndp041/
You can probably just unzip the Temp.data file (22.8 MB unzipped) and replace it in your programs.

June 23, 2012 2:06 pm

EM Smith says
“the perpetually rosy red arctic in GIStemp output where there is only ice and no thermometers does not lend comfort…”
For those who don’t know what he means, here’s a video that will help you get a start on understanding:

June 23, 2012 2:17 pm

Phi, these bulb contractions do not seem to be as large as those you mention, but 0.1 C in the first 4 years (in what is presumably a top-notch thermometer) is big enough to explain all of the ‘global warming’ if the thermometers were broken roughly every 5 years.
I have a “Meteorological Observer’s Handbook 1939” (HMSO) and there is not one mention of calibration cards or any suggestion that thermometers should be sent for recalibration after a period of time. There are paragraphs on methods to avoid breakages (“(your) coat should be held with the left hand” etc.).
I have the 1946 amendment list, which just adds “Maximum thermometers with solid stems (not sheathed) are very likely to break above the bulb. They should therefore be gripped above the bulb rather than in the middle when setting”.
It might be that they did not consider recalibrating the thermometers simply because they tended to have a short service life.
Broken mercury/spirit thread failures are ‘upwards’ too. Perhaps there is a case for trimming away data prior to ‘missing values’ and trimming for a few years afterwards ?

Philip Bradley
June 23, 2012 2:27 pm

steven mosher says:
Interesting that you think Tobs only applies to the US.

Phi said the adjustment only applies to the USA. Which adjustment method is used is determined by how and when thermometer reading practices changed and when automation occurred.
These vary by country, and hence the adjustment (method) needed also varies by country. Karl’s adjustment method is specific to the USA. If it is used elsewhere, then that is questionable.

E.M.Smith
Editor
June 23, 2012 2:38 pm

@Pamela Gray:
The stations are not at all the same collection. Some are the same, others not. Station counts are quite different (so stations must be different). I provide a “count” of individual station records used in making any given ‘report’ of anomalies from which I make the graphs. You can find those counts in the individual reports that are posted in the individual examinations of regions and areas here:
http://chiefio.wordpress.com/v1vsv3/
In particular, for the set as a global whole here:
http://chiefio.wordpress.com/2012/06/01/ghcn-v1-vs-v3-1990-to-date-anomalies/
Where the 1990 ‘count’ for v1 is 3400 (down from 3503 in 1989) and the v3 ‘count’ is 4703 in 1990 (down from 4929 in 1989). So, of necessity, there are at minimum 1303 “different” stations in v3 than in v1. (There will be more, as station changes do not show up in the broad ‘count’. There will also be instrument changes that do not show up as either a count or a station change, that were flagged with ‘duplicate number’ in v2 but are now lost in the homogenizing of v3).
It is that constantly mutating content of the thermometer list that is, IMHO, the biggest problem. Any chemist who has done calorimetry will tell you that screwing around constantly changing the thermometers gives a load of error and splice artifacts that simply can not be reliably removed. And make no mistake about it, what the climate codes claim to do is use past temperature records to do a calorimetry on the Earth, showing net heat gain. (Just without accounting for mass, mass flow, phase changes, and thermometer changes…. Yeah, that bad…)
My comparison looks for how those changes can have impact on the results. This critique says if they look at small enough batches, they don’t change much. But the climate codes smear the data all over and only ‘batch it’ in the end steps; so IMHO the method used here will fail to find that problem in the data.
@sunshinehours1:
There is an accepted fallacy in ‘climate science’ circles. That fallacy is that if you average things enough, you can get any precision you like. (Yes, an average can have far greater precision than the individual data items that go into it… but that’s not the point… read on…) It is true that if you average a bunch of readings of some value, the random error in that sample will be reduced. It is not true that the systematic error will be reduced.
An example:
You have a Liquid In Glass (LIG) thermometer read by a human. It is at about 4 feet off the ground, most folks are taller than that. So they look at the meniscus in the glass, find it above a whole degree mark (call it 95.4 F) and dutifully put on the record / report “95 F” (they only recorded whole degrees F in the USA for a very long time. Historically the directions even said that if you missed a value, you could just make one up; but once I linked to those directions at NOAA, they rapidly evaporated… One can only wonder why…)
Now there are two ways to illustrate the potential for error here. One is to say you change to a short person reading the thermometer, so they see the meniscus at 95.6 and report 96 F on the form. That one is pretty clear. You could easily have a series that is prone to that shifting when staff has turnover. (Or new training happens on how to look at the meniscus straight on).
The other is more interesting, IMHO. Say the station is now replaced with an electronic gizmo that reports “95.4”. You now have 0.4 F of “warming” that comes just from the change of process (how temperatures are reported). As you can not go back and “fix” the past records to greater precision than exists, this structural bias will just be there forever. So, say, your people regularly just looked at the meniscus and reported the last line it crossed. On average, the new electronic thing will be 1/2 F higher in what it reports. (Note that I am NOT saying the directions to the people were to do that, only that it is a very common thing for folks to do. Truncate instead of round.)
The more important point here is just that you do not know if that kind of structural error is in the data. It might be. It might not be. So averaging can remove random error (one person rounds up, the next rounds down) but can not remove a structural bias where most folks truncate because it is easy and a few round.
Nor can it remove the error from an old LIG that was regularly 1/2 F low being replaced by a new electronic one that is ‘spot on’, so you get a 1/2 F “lift” to the data. As the prior data were reported in whole degrees F, you can not find that “lift” by inspection of the data.
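A small Python simulation of that point, with made-up numbers (not a claim about any particular station):

# Averaging many readings shrinks random error, but a systematic reporting
# change (truncating whole degrees vs. rounding) survives the average intact.
import math
import random

random.seed(1)
true_temps = [60 + 30 * random.random() for _ in range(100_000)]   # "true" readings, deg F

truncated = [math.floor(t) for t in true_temps]   # observer writes the last whole degree crossed
rounded   = [round(t) for t in true_temps]        # electronic sensor, effectively rounds

true_mean = sum(true_temps) / len(true_temps)
bias_trunc = sum(truncated) / len(truncated) - true_mean
bias_round = sum(rounded) / len(rounded) - true_mean
print(f"truncation bias: {bias_trunc:+.3f} F")    # stays near -0.5 no matter how many readings
print(f"rounding bias:   {bias_round:+.3f} F")    # stays near  0.0
# Switching practice from the first to the second adds roughly 0.5 F of apparent
# "warming" that no amount of averaging will remove.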
IMHO, a lot of the ‘step function warmer’ that happens at the 1987-1990 area is from exactly those kinds of structural errors in the “splice” when the “duplicate number” on the segments changed showing a change of process and / or instruments.
Also, IMHO, anyone who reports or uses temperatures to more than whole degrees F is showing that they either do not understand the problem, or are deliberately choosing to ignore it. ( I occasionally choose to ignore it, as it mostly just gets the True Believers In Over Averaging False Precision tossing rocks at me; but I occasionally point it out…)
So for me, that BEST puts values out to 3 decimal places in their data set tells me immediately that they are making some fundamental errors around the nature of averages, precision, and the false precision from expecting averages to remove all error instead of just random error.
(IFF they put a disclaimer on the set somewhere that they know the precision is false precision but are passing through the results of calculations for others to have the exact result, that is a reasonable thing to do. Saying “Only the whole degrees are trusted, the 1/10ths are good to about 1/2 C, and the rest is false precision but there for you to deal with” is a more flexible way of providing such a data product, and is fine. Saying “that 1/1000 place is accurate” is just wrong and indicates lack of understanding. So it really depends on the data set notes.)
FWIW, the point you illustrate is part of why I think the data are “an issue”. So much of it wanders back and forth by 1/2 C / 1 F type values and is riddled with structural errors that you just can’t really say anything about temperatures in the fractional degree range of a large average. But the True Believers will vociferously defend their divination powers to all sorts of precision…
@DocMartyn:
Nice point… WHICH station matters quite a bit when things start ‘reaching’ for data up to 1200 km away and using ‘selected’ corrections. There is one station in the middle of Europe that is key to GIStemp: the UHI adjustments in its area are largely based on it. The code even puts in a ‘special’ set of data for that station so it becomes the ‘longest record’ in the area (it looks for ‘longest’ to determine which station gets priority in adjusting the others…)
So there are key stations with far more power than others.
Per “Fit For Purpose”:
I probably ought to have made it more clear that specifically the purpose I see as “the issue” is feeding into “Serial Averager” and “Serial Homgenizer” codes like GIStemp and expecting to get anything reasonable out.
In particular, codes which give individual stations and individual data items ‘reach’ out to large distances to modify the contents of other records or areas. The potential for hidden “splice artifact” like effects from picking up those higher end data and giving them long reach into other records is just too problematic.
I suppose my ‘critique of this critique’ mostly comes down to that point: I’m looking at what the data are used for in the AGW world (feeding GIStemp, NCDC Adjusted, Hadley) that do those kinds of ‘spread the wealth’ process and find the data unfit for that kind of process. This critique finds it suited for use inside small boxes. So? Not the point at all…
Dawg & Climatebeagle:
They do like to periodically shuffle the deck and make it hard to do comparisons…. It was that violation of the standards in things like accounting (where I have more experience) that first got me a bit ruffled. It’s just really dodgy technique, at minimum. While it makes me think of constantly moving walnut shells and peas, I can’t prove it is anything other than gratuitous change from lack of caring.
For some changes, like the Country Code map, it makes more sense. Countries come and countries go. But even there, they could have kept the unchanged group, well, unchanged. Instead a complete renumber was done. Near as I can tell they just picked a map and started assigning numbers in sequence each time. Not very sophisticated.
So “malice” or “stupidity”… heck of a choice.
And I notice in a further comment that Steven still hasn’t noticed that I don’t DO station matching on WMO number. ( I don’t do it at all. I compare groups of stations in matched country codes, or in the first digit of it, Region. I don’t expect that a North American station will suddenly wander off to Asia…. ) Nothing like having someone all in a tizzy about you doing wrongly something you don’t do at all. No idea how to answer that. It’s a “Have you stopped beating your wife?” issue structurally. “I never did it at all” is the only thing to say…
:
BINGO! again… As Steven points out, there are a very large number of flags for various degrees of flaky. Guess why they need them 😉
IMHO so much manipulation is done to the data between collection and end use that the end result is full of all sorts of “pretty values”, but they have lost usable meaning.
At one time I started listing all the steps of changes and couldn’t get them all listed. Even the “base data” are full of “estimated” and “interpolated” and other kinds of “not really data” flags.
But that’s for another day…
R:
Good point…
@Ripper:
Ah, the light dawns!
That kind of ‘selective loss and infill’ is what I think is at the heart of how GIStemp and related do a ‘splice artifacty’ smearing process (across long distances…) and get the effect they do.
That is exactly why I will ONLY compare a thermometer with itself, and only within a given month series and span dropouts without a reset. It breaks that kind of “delete and infill” effect. That is why it is important to deal with the dropouts in a way that negates them.
One of the things that I first ran into looking at v2 and GIStemp was the way some locations seemed to have dropouts just at convenient times that would be ‘filled in’ from nearby places, where the record was more suitable, by ‘warming by splicing’. That is especially visible in the Pacific where several islands that are “dead flat” with one thermometer had their data suddenly end, and the nearest place (Tonga or Fiji I think…) had something like 6 thermometers that changed over time (so splice opportunities) and when spliced gave a nice warming trend. So that “trend” gets spread into the ‘grid box’ of the nearby island and compared to its past flat, and presto, instant warming grid / box.
Congratulations on spotting it too!
And that, BTW, is why it is important to compare OUTSIDE the prescription of small grid/boxes and why I made a system that lets me do those kinds of ‘variable areas’… (And why the method used in this critique can not find that ‘issue’…)
@Pamela Gray:
Nice idea… Hmmm… someone has been thinking 😉
Also useful is to find those islands with data truncation and get current data and see if they have continued their flat trend (or even just if the present Wunderground value matches the historical reports, modulo the Airport Heat Island…)
@dp:
Another person ‘gets it’ about the nature of this critique 😉
I’d only add that the method for comparison ought to also have the goal of showing how not doing infill on dropouts, and how examining data spreading, could reveal “issues” in the typical climate codes that do in-fill, homogenizing, and Reference Station Method type spreading…
@Anthony:
I’m fine with the “put it up when you get it”. No worries.
“Reality just is. -E.M.Smith” and the comparison will show what it shows, be I looking or not. Similarly, my comparison shows what it shows. It is the answer to “what do they show?” that is interesting, not exactly when they show it…
As I’ve now caught up with my first response to comments here, I’m taking a lunch break, then I’ll come back and look at what is in subsequent comment.

June 23, 2012 2:52 pm

phi says:
June 23, 2012 at 12:21 pm
Chas,
“Thermometer bulbs shrink over time”
It’s an interesting point, I addressed it there:
http://rankexploits.com/musings/2012/a-surprising-validation-of-ushcn-adjustments/#comment-95708

phi, I would submit there are other factors not addressed regarding accuracy over time (re: ‘glass thermometer bulb shrinkage’) besides “thermometers which suffered from a slow contraction of the mercury containers”, and these are in the category of the ‘shrinkage-expansion’ hysteresis glass exhibits, to wit I quote the following from here:

The next most significant cause of error comes from the glass, a substance with complex mechanical properties. Like mercury, it expands rapidly on heating but does not contract immediately on cooling. This produces a hysteresis which, for a good glass, is about 0.1% of the temperature change.
A good, well-annealed thermometer glass will relax back over a period of days. An ordinary glass might never recover its shape. Besides this hysteresis, the glass bulb undergoes a secular contraction over time; that is, the thermometer reading increases with age, but fortunately the effect is slow and calibration checks at the ice point, 0°C, can correct for it.

The chief cause of inaccuracy the above reference cites is: “that not all of the liquid is at the required temperature due to its necessary presence in the stem. Thus, the thermometer is also sensitive to the stem temperature.” (An air-measuring thermometer obviously would be ‘immersed’ in the air it is intended to measure.)
Unfortunately, Chas’ linked material does not address ‘shrinkage’; it may have been directly on pg 95 which did not show in the preview I was allowed by Google.

June 23, 2012 3:13 pm

It’s late and my brain fuzzes. But my every instinct is that EMS is right and Steve and Zeke are not even beginning to investigate what EMS is saying.
My instincts in this are based on my past assemblage of several different formidable pieces of work, not least the excellent work of John Daly, and that of Illarionov in the videos reposted above by Amino Acids in Meteorites. Each project in my assemblage showed serious evidence of unquantified UHI. Logic says that this unquantified UHI has wrecked the “homogenization” from the start – to say nothing of dropped thermometers and smearing results over unjustifiably large areas, especially polar ones.

phi
June 23, 2012 3:18 pm

Chas, _Jim,
“these bulb contractions do not seem to be as large as those you mention, but 0.1 C in the first 4 years…”
In fact, my interpretation is: discontinuities in the raw data are overwhelmingly coolings. It is known that this is mainly related to station moves. It’s very annoying for climatologists because it can be explained only by significant perturbations from urbanization which are not corrected. Therefore, they look to other possible causes (glass contraction, change of hours of observation, etc.). It is likely that these effects actually exist, but their importance is clearly overstated. Anyway, in the case of glass contraction, the break is generally corrected but not the slow warming preceding it. Still a little boost to anthropogenic warming.
Philip Bradley,
“These vary by country and hence the adjustment (method) needed also varies by country.”
Steven Mosher claimed that most adjustments came from TObs, not only in the US. TObs adjustments are generally made in all countries, but they are small and the problem is usually totally different. In this regard, the US is in fact a special case. Very curious.

Gary Pearse
June 23, 2012 3:18 pm

I’m with Paul in Sweden regarding all the homogenizing and correcting and extrapolating into a globe of gridboxes to arrive at some highly doubtful global average temperature, from which we can integrate grid boxes of anomalies to show Global Warming. If it’s significant Global Warming you are trying to show, you could choose an area or cluster of areas of the globe where there are abundant thermometers (10 to 20% of the globe should be enough) and forget about the oceans (yeah, I know, 75% or so, yadda yadda). If we are going to warm significantly, this selected area will show it. If it is unequivocal after 40 years (the time we have already been worrying about AGW), then it will become obvious. Here, I consider hundredths or tenths of a degree insignificant, and even 1 degree over a century insignificant, which is our experience so far. If you have another purpose (whatever that might be) for knowing whether the temperature on average has increased several hundredths of a degree to a couple of tenths of a degree over a decade, then by all means carry on. I believe the effort and expenditure made to date have been hugely misspent resources (all in, probably approaching a trillion bucks: research and windmills and government taxation and policies) if it is for determining whether we are all going to fry real soon; we could have done something about people who have already died of things we could have fixed with the cash. We’re not like frogs being slowly heated up to boiling without noticing.

June 23, 2012 3:37 pm

Rather than “unquantified” I should have said “insufficiently-quantified”.

Philip Bradley
June 23, 2012 3:42 pm

phi, the real issue with the TOB adjustment is that Karl used an estimating method for the time of observation (one that even he accepts results in an adjustment that could be wrong by as much as 25%), when the time of observation was recorded and is available on the paper records. A method using the recorded time of observation rather than an estimate would give us a more accurate value for the adjustment needed.
I assume this was originally done to save money, and then, as is commonly the case in climate science, they doggedly stick to a method that gives them the result they want.

E.M.Smith
Editor
June 23, 2012 3:50 pm

@Willis:
Nice to know I’m not the only one to notice…
@Paddylol:
Then you ought to be really interested in the way codes like GIStemp submit every record to a variety of ‘in-filling’ and homogenizing and RSM based ‘spreading around’… There are giant dropouts in the data where it just fills in the grid / box (like Indonesia during a war or two…)
Exactly why I don’t do that and why I did a version of FD that bridges the dropout.
@JT:
There is no ‘raw data’ in this discussion. By definition, the monthly averages are a computed thing. Even if you find daily values, they have typically been QA fixed and sometimes homogenized.
With that said, the next problem is that the data are temperatures, so they range from minus a lot to very hot. Plot it all up and you get a wide band of mush that is a narrow point at the start of time (one thermometer in Europe) and gets wider and warmer (and flatter) over time as the Equatorial Zones get added. A trend line through it will mostly show the discovery of the rest of the world by Europeans… 😉
If you make them “anomalies”, then you must answer “vs what?”. That’s what I did, and I ONLY compared any one data item to the same thermometer (so what StationID it has makes no difference; it is only compared to itself regardless of number), and I only compare within the same month (so ‘like to like’ in time as well). I can likely make a plot through that for you “in a while”. If I do I’ll put it up at my site and add a comment here.
I did a general ‘bulk comparison’ of v1, v2, and v3 actual temperatures that shows a little bit of how the data flatten over time as more coverage comes to the more stable tropics. Not what you want, but a hint in that direction.
http://chiefio.wordpress.com/2012/05/24/ghcn-v1-v2-v3-all-data/
Mostly it shows the “summer months” getting a tiny bit less hot (as more Southern Hemisphere and equatorial data enter the series) and the “winter months” getting a bit warmer.
It’s an interesting chart, but not useful for climatology, just for seeing how the composition of the data set changes over time.
@Phi:
I can’t say there is “NO” such set available ( I’ve not done an extensive search) but I can say that GHCN v3 comes pre-homogenized and with the ‘splices’ between “Duplicate Numbers” built in…
@Richard T. Fowler:
You are most welcome. Just glad it’s of use to someone.
@vukcevic :
I really enjoy the correlations you find and the graphs you make. Don’t know exactly what to make of them (which makes what do what…) but it hints at something really interesting lurking out there in causality land…
@Wayne:
All the figures here are the product of the poster. My charts and figures are on my site.
@Amino Acids in Meteorites:
Thanks! Very nicely illustrated the problem…
:
A very nice example of “structural errors” that will NOT be averaged away…
FWIW, I think that in many cases the early LIG thermometer stations were just set up and then you didn’t touch the thing for a long time. In some cases, the same LIG thermometer was used for decades (centuries?) especially in some classical / historical cases. It is one of those dangling “loose ends” where each “thermometer” is really a variety of instruments that vary over time and may be one long lived instrument or an endless series of “splices” of different instruments depending on location. ( I’d expect the Stevenson Screens in places with annual hurricane / cyclone visits were replaced far more often than the one on the wall of an Observatory at a University…) So each “record” will be idiosyncratic with respect to instrument change and calibration issues over time…
But some folks are sure it will all just ‘average out’ and give 0.001 of precision 😉
With this, I think I’ve caught up with comments and can actually go visit my own blog for once 😉

E.M.Smith
Editor
June 23, 2012 4:01 pm

Gak! Spoke too soon…
Hello Lucy!
@Lucy Skywalker:
Thanks for the endorsement via instinct 😉
It’s that ‘assembly’ process that’s the issue. This critique avoids the ‘assembly’, so it doesn’t find the issue (and thus is not really a critique of a method that sets out to do that and does find it, IMHO).
@Phi:
Yup. The devil is in the splices (of various kinds).
@Gary Pearse:
That’s why I do the “by region” graphs (and eventually the ‘by country’ graphs) that show wildly divergent changes by region and by country. It just hollers “not CO2” and clearly indicates “data artifact” and “local land use” issues.
And very much in agreement on the incredible waste of resources in the AGW “issue”.
Heck, with the $5 Million “wet kiss” to Mann for surviving a whitewashing one could provide a rocket stove to just about everyone in Madagascar and both save their forest from further destruction for fuel wood AND save the eyesight of huge numbers of women.
Saw on the news that $2 Billion of US dollars were being pledged in Rio for more UN Climate Boondoggles. The amount of real good that could be done with that is so great, and the sheer waste of it there so pathetic…

June 23, 2012 4:58 pm

Here’s another short video (34 seconds) showing where GISS does not have stations taking actual temperature data. It will be the black areas on the globe. In these areas they use the areas around the black holes to do estimates of what the temperature would be in the black holes. And as EM Smith is pointing out, these estimates, in some cases, are estimates of estimates from farther-away stations.
Maybe we can pass the hat and send some money to NASA so they can buy temp stations for these black hole areas. 😉

ferd berple
June 23, 2012 5:19 pm

Effectively the raw data points are randomly distributed. Gridding has removed the randomness and in the process changed the mean and deviation of the data set. This is a form of selection bias. Similar to what is done with the tree ring circus.

ferd berple
June 23, 2012 5:25 pm

Saw on the news that $2 Billion of US dollars were being pledged in Rio for more UN Climate Boondoggles. The amount of real good that could be done with that is so great, and the sheer waste of it there so pathetic…
==================
Look on the bright side. It wasn’t the $100 billion Obama pledged 3 years ago.
Unfortunately the sad fact is that this money has to come from other programs that actually could save lives. Instead tens of thousands of people die every month from preventable causes, as money that could have been used to save them is squandered on politically correct climate science repackaged as sustainable development.
Rather than save tens of thousands of people every month today, politicians and scientists do nothing and pledge our money to save tens of thousands of people in the distant future. By which time the money will be long gone and no one will be saved.
The real problem is that these people are totally ineffective at anything more than lining their own pockets and proclaiming to the heavens how righteous they are.

ferd berple
June 23, 2012 5:47 pm

Gridding to fill in missing data is statistical nonsense. It assumes that the missing points are the average of the surrounding data, without any basis in fact. You are much better off working with the raw data, missing points and all, for statistical analysis than you are with the processed data.
Want to test this for yourself? Take a non linear series like the Fibonacci series. Randomly delete points from the series. Do a statistical analysis on the result. Now fill in the missing data points with averages of the surrounding points. Repeat your statistical analysis. In every case the results will be more representative of the original series using the data before you filled in the missing points.
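A minimal sketch of that test in Python (illustrative only; points are deleted at isolated positions so each gap has two valid neighbours):

# Delete points from a non-linear series, then compare the mean computed by
# (a) simply ignoring the gaps and (b) infilling each gap with the average of
# its two neighbours -- the assumption simple gridding/infilling makes.
def fibonacci(n):
    seq = [1, 1]
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])
    return seq

true_series = fibonacci(30)
series = list(true_series)
for i in range(3, len(series) - 1, 4):            # knock out isolated interior points
    series[i] = None

observed = [x for x in series if x is not None]                    # (a) drop the gaps
infilled = [x if x is not None else (series[i - 1] + series[i + 1]) / 2
            for i, x in enumerate(series)]                         # (b) neighbour average

print("true mean:    ", sum(true_series) / len(true_series))
print("dropped mean: ", sum(observed) / len(observed))
print("infilled mean:", sum(infilled) / len(infilled))
# For a Fibonacci series the neighbour average of an isolated gap is about 12%
# too high (F(n-1)/F(n) tends to 1/phi), so the "completed" series carries
# invented, biased values that later statistics treat as real measurements.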
This is the nonsense of gridding. It fills in data based on an assumption that is less accurate than the data before you filled it in. You need to know the relationship over time for how the missing points interact with their neighbors before you can fill them in. Nothing says this will be an average. Thus, if you use an average you are making the data less accurate, not more accurate.
Any time you hear the word gridding, put your hand over your wallet.

Paul in Sweden
June 23, 2012 6:03 pm

EM Smith, thank you for taking the time to reply. Your volunteer service, as well as the volunteer service of so many others, has not only been noticed but has made a difference.
“FWIW, the very notion of a Global Average Temperature is based in a fundamental error of Philosophy of Science. It is an intrinsic or intensive property. You simply can not average two temperatures from different things and have any meaning in the result. It is an obscure, but vitally important point; that is consistently ignored by the entire Climate Panic Industry…”
Agreed.
Moving on to data quality:
If we were moving a top-level general ledger commercial banking system from one set of data centers to another, and independent auditors were complaining that accounts were inexplicably off between the new system and the old one, and the response from internal auditors at an ops meeting was:
“This shows that while the raw data in [GHCN v1] old general ledger and [v3] new general ledger is not identical (at least via this method of station matching), there is little bias in the mean.”
There would be a seemingly long pause, followed by language that should not be repeated, and a twofold scramble: identify the individual regional discrepancies, quantify them, and put teams on them to estimate reconciliation time, all the while other teams would be evaluating the prudence of initiating the meticulously planned fallback and restore plan. Not for nothin', but I can tell you that versions of that scenario happen at all of the major financial institutions several times a day in order to produce a close-of-day database. This is done because major financial decisions must be based on quality data, and if the books are off, heads roll and people are JAILED. The concept that Asia has multiple accounts inexplicably down 40 percent but Europe inexplicably has multiple accounts up 40 percent, and somehow this is OK because "there is little bias in the mean", is something that just does not fly.
Chiefio, I totally agree with your "a set of data that are not 'fit for purpose'" statement, but others do not seem to be very concerned. I do not know; it is not like someone would actually spend hundreds of billions of dollars, task tens of thousands of people in multiple countries, enact punitive laws and turn whole governments & economies upside down based on a decision that even included this irreconcilable data. Right? Surely they can definitively and empirically isolate and quantify each of the first-order known (natural & anthropogenic) climate forcings. Right?
With regard to GHCN v1 & v3, the data needs to be reconciled. There are climate scientists out there who actually do climate science that contributes towards regional agricultural and civic planning in addition to the advancement of our general understanding of climate science.
'There is little bias in the mean' does not change the fact that the data is corrupted and the regional products cannot be validated. The data is all well and good for academics and MMORPGs on taxpayer-supplied supercomputers, but it would never stand up to commercial and regulatory standards. It is useless for practical applications and should not be included in any policy decision.
Our historic climate records are important.

ferd berple
June 23, 2012 6:29 pm

One of the most powerful concepts in information processing is the NULL. Different from zero, NULL means we don't know the value in this cell. Zero means we know the value and it is 0. You can always spot the rookies in data processing: they forever try to replace NULL with zero or some other value, because they don't know how to deal with it.
Gridding is no different. A rookie mistake. Rather than telling the statistics that you don't know what value should be in the cell, you are telling a statistical lie. You are saying you know the value, and it is X. As a result, every statistical analysis you perform downstream will underestimate the error. Your data will appear more accurate than it actually is, because you have not been truthful about your data.
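For what it's worth, here is a small hypothetical numpy sketch of that point about NULLs: excluding missing values, replacing them with zero, and infilling them with the mean all give different means and different apparent standard errors. The simulated data and the 25% missing rate are made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
true_values = rng.normal(loc=10.0, scale=3.0, size=200)

# Simulate missing observations: mark ~25% as NaN (the database NULL analogue).
observed = true_values.copy()
observed[rng.random(200) < 0.25] = np.nan

honest = observed[~np.isnan(observed)]                                   # keep NULLs unknown: exclude them
as_zero = np.where(np.isnan(observed), 0.0, observed)                    # rookie fix: NULL -> 0
as_mean = np.where(np.isnan(observed), np.nanmean(observed), observed)   # infill with the mean

for label, data in [("exclude NULLs", honest), ("NULL as zero", as_zero), ("NULL as mean", as_mean)]:
    stderr = data.std(ddof=1) / np.sqrt(len(data))   # naive standard error of the mean
    print(f"{label:>14}: mean={data.mean():6.2f}  std={data.std(ddof=1):.2f}  stderr={stderr:.3f}")
```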

KenB
June 23, 2012 6:30 pm

A most interesting post; it's way past time for Steve Mosher to engage and demonstrate, and thanks for doing so, please continue. Chiefio, your forensic approach of checking and re-checking is needed, even if only to break down the issues so that they can be understood by all, rather than just serve an ivory tower of convinced individuals who then dictate that we should take their groupthink as reality.
Otherwise this sort of data manipulation and declaration only promotes a monetary search for a new Godlike world temperature, better or BEST than all the others, "trust us", and in the end what have we got? Problem solved? Fixed? Not likely, but send more money to fix the unfixable, i.e. tax till it hurts!

Venter
June 23, 2012 8:06 pm

Chiefio,
Thanks for an excellent set of replies, laying down clearly what you said and making it easy for everyone to understand. Straightaway on that score, you've shown more science, data, facts and humility compared to the "go check for yourselves" or "go study more" bullshit espoused by Mosher.
And lastly, thanks for showing that in typical climate science fashion, Mosher set up a strawman and pretended to demolish it, completely missing the gist of what your initial post said.
I work in the healthcare field, handling clinical trials being one of my responsibilities. If I use or present data like this GHCN data to prove claims and get FDA approvals in healthcare, I’ll be up before the beak in half a jiffy, charged with data manipulation and fraud.
And we have "ex-Phurts" here who are smarter than everybody else by their own self-conferred status, defending such data. It's a joke. These "ex-phurts" would be turfed out of the gate in any professional industry that pays people to be efficient.

David Falkner
June 23, 2012 8:35 pm

I seem to remember a graph in Berkeley’s original release that showed many stations in CONUS cooling and many warming. Apparently, averaging them together gives a warming signal. But does that really reflect reality?

David Falkner
June 23, 2012 8:45 pm

@ Mosher & Stokes,
Comparing trends does lend some semblance of proof, but trends can match for different reasons. GISS and GHCN can match trends for entirely different reasons. Does that make them right?

E.M.Smith
Editor
June 23, 2012 9:25 pm

@JT:
I computed all the anomalies, put them as a csv file, and loaded it into OpenOffice as a spreadsheet.
All that is fine.
Then I asked it to make a chart, and it crashes.
It would seem that a 24 MB file is too big to chart 😉
I can do a graph of the AVERAGE anomalies in any given year, but the whole data set as anomalies (or as temperatures) is just too big for OO.

E.M.Smith
Editor
June 23, 2012 10:40 pm

@Ferd Berple:
As a side note, one of my college dorm roomies was named Fred. We all called him Ferd (which I think he applied to himself). I have fond associations with “Ferd”… 😉
The explanation given for the infilling behaviour is a paper done by Hansen, IIRC, that justifies the Reference Station Method. I've read the paper. It basically tests a limited set of stations in a short period of time and shows that a reasonable prediction of one can be made from another up to 1200 km away.
This is then used as justification for filling in any missing temperature from ANY thermometer up to 1200 km regardless of relationship changes over time. AND doing it recursively.
So if the comparison period is when the PDO is in the cold phase, cold phase relationships can be used to fill in data when we are in the hot phase. (And vice versa).
Now think about that a minute. During the warm phase we had a very flat jet stream. West coast and East coast both neutral warm. Now in the cold phase, the west coast is quite cold, and the east coast is having tropical air pulled up over it… Yet the RSM says you can use the former relationship to fill in missing data during the latter relationship. Think that has opportunities for "fill in" and "homogenizing" and "UHI Correction" (all based on RSM) to enable "dropouts" to have "selective influence"?
The other one I like is that here in California, we have the interesting inverse relationship between San Francisco and the Central Valley during the summer. When the Central Valley gets very hot, air rises, and pulls cooling fog over S.F. During the winter, the cold just flows over everywhere, but SF is usually warmer than inland via water moderation. So cold: SF warmer. Hot: SF colder.
Now the average of that activity can be used to fill in missing data. Even though either regime might or might not be present at any given time. (Often it’s a 3 day oscillator during the summer. One HOT day, then the air starts to move. Day two still hot, but with a breeze. Day three, cooling air arrives. (Day four it kind of halts and then heat starts to build again…) )
So you get a value that is “the average” but during non-average times…
Then that method gets used recursively. First for infilling. Then for UHI “adjustment”. Then for “grid / box” filling in so you can make anomalies… No paper ever justifies serial use. It is not a peer reviewed behaviour…
IMHO, the RSM needs to be subjected to validation testing in multiple geographies (where I suspect it will fail in some) and in multiple long duration regime change times (so hot vs cold PDO or AMO or Indian Ocean Dipole or AO or …) and as a recursive use; and shown to be invalid in those cases. That would invalidate most of the ‘climate codes’ IMHO. and likely the GHCN now too.
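For readers unfamiliar with the idea, here is a toy sketch of the kind of distance-weighted infill being discussed. It is not GISS's actual Reference Station Method code, just an illustration of estimating a missing anomaly from stations within 1200 km, using a simple linear distance weight and made-up station values:

```python
import math

def km_between(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (haversine)."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def fill_from_neighbours(target, stations, radius_km=1200.0):
    """Estimate a missing anomaly at `target` (lat, lon) from neighbouring
    station anomalies, weighted linearly down to zero at `radius_km`."""
    num = den = 0.0
    for (lat, lon, anomaly) in stations:
        d = km_between(target[0], target[1], lat, lon)
        if d < radius_km:
            w = 1.0 - d / radius_km
            num += w * anomaly
            den += w
    return num / den if den > 0 else None   # None = still no data in range

# Hypothetical neighbouring stations: (lat, lon, anomaly in deg C)
neighbours = [(37.8, -122.4, 0.3), (38.5, -121.5, 1.4), (36.6, -121.9, 0.5)]
print(fill_from_neighbours((37.4, -122.0), neighbours))
```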
@Paul in Sweden:
You are most welcome. I’d had other plans for what I was going to do today, so it’s nice to know that changing them was of benefit…
Early in life I was the "Night Auditor" at a hotel. I had to close the books each night. If the tabulating machine was off by 1 penny from the books, I could not close. (On one occasion I found a discarded ledger card in the trash for $26.10 (IIRC). It was the amount by which the cash register was off from the posting machine (tabulator). That was a 'single queen sized bed room' then. I'd recognized the value, figured someone blew a posting and didn't put it in the errata book, and dug through the trash until I found the torn-up card evidence.)
It is things like that which make me cringe looking at what can be done in “climate science” …
I did computer book keeping for companies, including transitioning systems. Didn’t always have to be ‘to the penny’, but pretty darned close…
Then I was manager of QA for a compiler company. Talk about hard core… Think anyone would be happy with “Well, the math suite has some jitter in the low order half of the digits, but the mean result calculated is close”… Or “Well, when we use the Float package it does OK, but using the integer package doesn’t work as well, so don’t use it on that data. Didn’t you get the memo?…” (Yes, I bite my tongue a lot around ‘climate scientists’…)
I’ve also done “Qualified Installations” for a drug company for FDA compliance. I can state with absolute certainty that the data processing and archival process applied to the GHCN would FAIL the FDA requirements for even the most trivial drug approval. Even a new aspirin coating.
Maybe I’m just expecting too much… ( Then I remind myself folks are wanting to play “Bet The Global Economy” based on it …)
@Ferd Berple:
Hadn’t thought about it that way… Yes, NULL is your friend. (Must have been spending too much time in GHCN land where -9999 is the missing data flag 😉
The basic problem, IMHO, began back about 1970 when Hansen first started trying to do the whole GIStemp thing. At that point, I think they realized that the spatial coverage was just crap and the data had so many dropouts that it was useless. Then they had to confront the data quality issues and the horrible precision in the recording of the data.
If you look at the history of the "science" involved (and read the code with an eye to the era of FORTRAN to date it 😉 ) you can see the "fixes" being layered on… At the end, they got a number, but were foolish enough to believe it…
So First Differences does this "reset" on a data dropout. There are so many dropouts that lots of places just make crap results on a FD run. (That's WHY I made the change I did to span dropouts. The thesis being that "even if I'm missing 3 January values in a row, if THIS January is 1/2 C lower, it's lower, and that is a usable fact.") The other folks ran off to this "baseline" method, averaging a couple of decades of values and then using that average to stand in for 'normal'. All prior to having anomalies, so it 'has issues' in terms of intrinsic properties…
Then the RSM gets invented to try and “fill in” some of the missing bits. Eventually when they realize the geographic coverage is crap, they use RSM to smear what thermometers they do have over 1200 km away into other grid boxes. Now they can do anomalies for each grid box. Never mind that most of them are entirely void of any actual data.
So in my approach, I deliberately avoid all of those behaviours. A thermometer is ONLY compared to itself. Dropouts are seamlessly bridged without making up any values at all and with preservation of the actual changes measured over time. NO averaging of a temperature is ever done (only anomalies are averaged). The data are never stretched into empty "grids". Instead I can take a "cut" of the data based on the Country Code (first 3 digits of the StationID), or the Region (continent, the first digit of Country Code), or even parts of the WMO number if desired (the actual station ID, the later digits of the StationID). This lets you say "Give me the data you have for a place that has data", but doesn't try to change where it covers. Basically, every step I saw that smelled like a 'kludge' to me in GIStemp, I replaced with something that was not a kludge. (Except that I left in the "splice artifact" character that comes from blending records in the final report step, and I left in the FD tendency to be sensitive to the first data item in a series – for reasons I've explained above, wanting to measure the splice risk.)
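A minimal sketch of the gap-spanning First Differences idea as described here (not E.M. Smith's actual code): each calendar month of a station is differenced only against the last valid value of that same month, and dropout years are simply skipped rather than infilled. The station values below are hypothetical:

```python
def first_differences(series):
    """First differences for one station and one calendar month.
    `series` is a list of (year, value) with value None on a dropout.
    A missing year is simply spanned: the next valid value is differenced
    against the last valid one, and no value is ever invented."""
    diffs = []
    last_valid = None
    for year, value in series:
        if value is None:
            continue                                  # span the gap: do nothing
        if last_valid is not None:
            diffs.append((year, round(value - last_valid, 2)))
        last_valid = value
    return diffs

# Hypothetical January means (deg C) for one station, with two dropout years.
januaries = [(1990, 4.1), (1991, 3.8), (1992, None), (1993, None), (1994, 3.6), (1995, 4.4)]
print(first_differences(januaries))
# -> [(1991, -0.3), (1994, -0.2), (1995, 0.8)]
```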
Then things get a bit speculative. I would speculate that during this process, folks noticed that by shifting things around they could select the result. It isn't a large leap from "No matter what I do, the result is sensitive to how I select stations" to "Gee, I can select stations to get the result I want." No, no proof. But the pattern of data dropouts, station dropouts, and the fact that we know of three nations where the local Met Office says "Hang on, use ALL the data and we get a cooler result" does look mighty suspicious…
Oh, and on Steven's assertion that the StationIDs all changed… if he looks a bit closer he may find that the Country Code part is all different, but the Region is the same, and that the WMO number mostly is the same modulo an added 'zero' in a sub-station field and the lack of a Duplicate Number. Yes, some have changed WMO numbers, but not many. More distressing is that some with the same WMO number have different LAT and LONG values. So matching on LAT/LONG may not be so perfect either… I didn't mention it before because it isn't a very big deal. But my first 'cut' at this had me spending about a week trying to build a StationID map (that's about 3/4 done) and noticing all the patterns. I looked at matching on LAT/LONG and it still has some errors in the match. That's why I don't even try to match on StationID, I just let you select subsets by range. (So 5 gets you Oceania – Pacific Islands. 513 in v1 or 501 in v3 gets you Australia. You can go to things like ^501[1-4] to get a particular subset of WMO numbers, which tend strongly to be grouped by geography on the first digit or two.) I suspect I know way more about the structure of StationID than any person ought 😉
This technique lets you grab collections of stations from "areas of interest" and compare "group to group". So ^513[1-2] from v1 will tend to get stations from the same sub-geography of Australia as ^501[1-2] will get from v3. It is a nice little technique for comparing, say, "stations around Tahiti" to "Tahiti". Very, very useful for inspecting things like "IF I take out this one group of stations with a load of thermometer change, what does the rest of the region look like?" It isn't a problem, it is a giant advantage, and it avoids the grid / box trap.
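A toy illustration of that prefix-selection technique: the regex pattern style comes from the comment above, but the station IDs and anomaly values below are invented purely to show the mechanics:

```python
import re

# Hypothetical (StationID, anomaly) rows; the leading digits stand in for the
# region / country / WMO-block structure described above.
rows = [
    ("50112345000", 0.4),
    ("50124345001", -0.1),
    ("50186338000", 0.2),   # outside the [1-2] block, excluded below
    ("51391234000", 0.3),   # different country-code prefix, excluded below
]

def select(rows, pattern):
    """Return only the rows whose StationID matches the regex prefix."""
    rx = re.compile(pattern)
    return [(sid, v) for sid, v in rows if rx.match(sid)]

# e.g. the "^501[1-2]" style cut mentioned in the comment
subset = select(rows, r"^501[1-2]")
print(subset)   # -> [('50112345000', 0.4), ('50124345001', -0.1)]
```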
@KenB:
Thanks! Sometimes I wonder… Like, I figure Steven must “have a life” and is likely having a nice Saturday Night out. Me? I’m trying to put a plot line through 24 MB of data points :-0
Falkner:
Verity Jones at Digging In The Clay has a very nice set of postings showing warming vs cooling stations and changes over time. She and TonyB even have a nice data base interface on line last time I looked. WELL worth time spent there:
http://diggingintheclay.wordpress.com/
Then click on the “sister website” graph on the right hand side for the database plotting tool.
So we have stations warming and stations cooling and different regions going in different directions. And an average of all that means???? Yeah, me too 😉
@All:
I may check in again a bit later, but I’m more likely to go to bed soon. It’s been a long day…

Carrick
June 23, 2012 10:40 pm

Willis:

Heck, contrary to your usual practice, you even actually answered a few questions

Serious question here:
How many questions are they required to answer? I don’t expect d****bags deserve any answer at all, for example.
As for Smith, he wears his bias on his sleeve; his "independent statistical analysis" (NOT) and whatever else is, unsurprisingly, wrong (bias does that to you, it makes very smart people stupid and prone to confirmation bias).
OTOH, people like phi and Pamela Gray need to quit leaning on other people and do their own homework, that's a fact, especially if they are going to frequently comment on certain topics. When phi argued with Mosher, one of the authors of BEST's code, over BEST's capabilities, that was the funniest moment on the thread for me. Right up there with phi claiming that tree rings make better thermometers than real ones. There's a difference between skepticism and boneheadedness, enough said there.
So this is meant as a serious question, Willis? How clueless does a person need to be before we are allowed to blow them off?
I admit I have a very low threshold here, hence I don't do front-page blogging. Not at this point in my life, not while working up to 80 hours a week on my own, very engaging and satisfying, research. Noobs just don't interest me much, especially when they are chock full of their own "answers" already.

E.M.Smith
Editor
June 23, 2012 11:04 pm

@Venter:
Just noticed your comment.
Thanks! I try… Compulsive Service Personality Disorder 😉
I’ve had a long time goal of trying to speak clearly and generically about complex things. In my opinion, anyone can “get it” about complex and technical things if they are described in clear terms. I’ve not found any concept that was so abstract that it required jargon. Jargon can be faster, and I’ve used it sometimes when needed for speed or precision; but generally just thinking clearly about things for a few minutes can come up with something more understandable.
I’m also modestly intolerant of snobs and folks who like to play “Gotcha!”… Which kind of makes me not want to be like them 😉
So you've had experience with the FDA, eh? Painful, huh! 😎 It amazes me how many fields have standards that are vastly higher than "climate science", yet its practitioners seem to think we are being petty for expecting things like, oh, a Golden Master date on a key data file, or revision control, or archives with revisions, or benchmark suites or regression testing suites or… All just SOP in most fields.
But don’t be too hard on Mosher. He’s been hanging out with Warmers and I think it has slowly been reducing his ‘lukewarmer’ independent thinking. He’s just become convinced that “If you just do it exactly the way THEY do it, it works just like they say!”…
His approach to doing the “testing” / critique would be valid if my goal was to make a One True Global Average Temperature like everyone else. It would be valid if I was doing “by station” compares. IMHO, all that happened was he ran to ‘defense’ assuming I was “doing what everyone does” before actually looking at what I said, what I do, and why I’m doing it. But that takes time. I’d guess about 2 days for someone with his skill level (if he already knows FORTRAN it would help) and I think he just didn’t want to put in that time. It is a common thing for folks to do when they are very close to an issue and someone comes at it from a new direction.
That he had trouble understanding my description of handling dropouts tells me he was distracted or not putting much time into it. How hard is it, really, to understand “On a drop out, do nothing and proceed to the next valid data item.”? Just ‘span the gap’ 😉 But it was ‘different’, so ‘not what everyone else does’, so ‘an issue’… I figure he was just up late making the graphs and writing code for this posting…

June 24, 2012 1:26 am

I reckon this issue is now ripe for the lads at Climate Audit.
Thank you Steve and Zeke for making us all sit up and think these issues through more precisely.
Just as the HS was shown to be an unavoidable statistical result of the Team's method of selection, I suspect CA would show that the warming is an unavoidable statistical result of dropping stations etc. But I am open to disproof.
Carrick?

June 24, 2012 1:42 am

Louis Hooffstetter says: “I’ve often wondered how and why temperatures are adjusted in the first place, and whether or not the adjustments are scientifically valid. If this has been adequately discussed somewhere, can someone direct me to it? If not, Steve, is this something you might consider posting here at WUWT?”
A good review is Peterson et al.: Homogeneity adjustments of in situ atmospheric climate data: A review, Int. J. Climatol., 18, 1493–1517, 1998.
http://onlinelibrary.wiley.com/doi/10.1002/%28SICI%291097-0088%2819981115%2918:13%3C1493::AID-JOC329%3E3.0.CO;2-T/abstract
I recently published a blind validation of the most-used and most advanced homogenisation algorithms. This article also includes the references of the articles describing these algorithms in detail.
http://www.clim-past.net/8/89/2012/cp-8-89-2012.html
To accompany this article, I wrote a blog post with a (hopefully) more easy to read introduction on the main reasons for inhomogeneities in the historical climate record and the main ideas behind the homogenisation algorithms:
http://variable-variability.blogspot.com/2012/01/homogenization-of-monthly-and-annual.html
I hope these links help you find your way into the scientific literature.

phi
June 24, 2012 1:57 am

Carrick,
“When phi argued with Mosher, one of the authors of BEST’s code, over BEST’s capabilities, that was the funniest moment on the thread for me.”
The funniest interventions could be yours. We discussed with Steven Mosher the ability to disable the implicit homogenization in BEST. This implicit homogenization is the result of the segment adjustments. If you disable this setting, there are simply no results.
“Right up there with phi claiming that tree rings make better thermometers than real ones.”
Yes, this is the case, proven for tree-ring densities in the medium term (10-100 years). You still have a lot to learn.

Editor
June 24, 2012 2:04 am

E.M.Smith says:
June 23, 2012 at 10:40 pm says
“Verity Jones at Digging In The Clay has a very nice set of postings showing warming vs cooling stations and changes over time. She and TonyB even have a nice data base interface on line last time I looked.”
Thanks for the plug – actually it is KevinUK who is the database and mapping expert, not Tonyb.
Original post – http://diggingintheclay.wordpress.com/2010/01/18/mapping-global-warming/
Update – http://diggingintheclay.wordpress.com/2010/10/08/kml-maps-slideshow/
I know Kevin has done more recent work putting all this on Google Maps http://www.climateapplications.com/MapsNCDC2.asp but the data for the USA hasn’t been completed yet and we’ve not written anything up on the blog – too busy with the day jobs.

June 24, 2012 3:24 am

Well, clearly Mosher & Co have done us a favor by showcasing how scientists are mostly clueless about data management. EMS's analysis of the whole mess is clear and obvious. He has clearly shown GIGO in action.
The replicating temperature smearing is beyond belief. In fact, all numbers that show any form of 'global' temperature are actually 20% of the world's measured temperatures (of doubtful quality themselves) smeared out over the entire globe.
It beggars belief. I once had to write a model calculating the dBA of every train at any given time, at any given location, at any given height, at any given distance, over the entire national railroad track.
The data used in Climate 'Science' would be akin to me putting the noise profile of standard track and an intercity commuter train in a database and using them to calculate the noise of a high-speed train traveling at 200 miles per hour.

June 24, 2012 4:47 am

Carrick
Speaking of bias………
There’s been talk of GISS in this thread. Do you see any bias in data handling in these?
Does GISTemp change? Part 1 (6:53 min)

.
Does GISTemp change? Part 2 (11:09 min)

June 24, 2012 6:36 am

Don't let Carrick fool you with his non-scientific intimidation tactics. He and Mosher are good ol' boy buddies. As we've seen in Climate Science countless times, Warmer tribalism will trump objective scrutiny every time.
Andrew

June 24, 2012 7:01 am

Carrick: “As for Smith, he wears his bias on his sleeve”
Mosher has clearly stated numerous times that he believes if CO2 has increased it must have warmed the earth. Therefore he works hard to find some magical formula showing that crap data proves the earth is warming.
He has no interest in the third of stations even Mueller admitted were cooling.
He has no interest in bright sunshine data which HAS changed up and down since 1900.
And he certainly has no interest in anyone criticizing his “proof”.
He is like an alchemist insisting that one day, with the right code, he can turn crap data into gold.

June 24, 2012 7:01 am

Mosher and Zeke? Crickets…

Mariana Britez
June 24, 2012 7:09 am

So Mosher was involved with the BEST project? Now I'm 100% convinced it's C***. No wonder the guy is turning warmist. I would say stick to investigations of Gleick etc.; you're really good at that stuff LOL

A C Osborn
June 24, 2012 7:39 am

Verity Jones says: but the data for the USA hasn’t been completed yet and we’ve not written anything up on the blog – too busy with the day jobs.
You need to spend some of the $Millions that BIG OIL has been paying you all these years and give up the day job.
Sarc off/

Carrick
June 24, 2012 7:40 am

Bad Andrew, how am I intimidating anybody? I just call BS when I see it. I can't say that I'm more than an acquaintance of Steven's. Hardly some old buddy system, and if there's anybody anti-intellectual playing games with the truth, it's you for making these wild claims.
This sort of analysis is much easier to screw up and get a result like Smith finds; it is much, much more difficult to screw up and find a result like Zeke and Steven find, that it "doesn't make much difference". Seems like the gauntlet is thrown for Smith to put up or shut up. We have counter evidence; the ball is in his court to explain how code that has been heavily regression tested, like Zeke & Steven's, is wrong, and something he slung together to prove something he already knew is right.
I'll note that predictably phi is still trying to argue about how the BEST software functions (while barely being software literate) and still claiming that (the much smaller geographical coverage and non-uniform response to temperature provided by) MXD is a better representation of temperature than thermometers can provide.
Amino Acids—think about the problem this way. Look at the geographic distribution of warming, then think about what happens to your global mean trend when you add in more stations at northern climes. When you adjust for differences in the "land-only" algorithms, BEST and GISTEMP get very similar findings, since they have the largest geographical coverage, so this is believable. If you want to do an "apples to apples" comparison, look at the 40-50° N zonal average, land only; how does this compare across algorithms? Does it make a flip of a difference?
But my question is really for Willis. How much engagement, in his opinion, is required with people who have such strong confirmation biases, like Bad Andrew, that they apply completely different standards to people who feed them what they want to hear than to people who don't, and who blow off anybody who raises doubt about their beliefs as a "true believer" in any case?
What’s to be gained with engagement here? I do think “do your own homework” is a reasonable retort to people who aren’t going to be touched by reason.

Pamela Gray
June 24, 2012 7:56 am

Victor, I read your blog post. Very interesting. What are your thoughts regarding non-random station dropout that may have over-emphasized ENSO-related geographic decadal oscillations? Would that not bias the raw data? Remember that these oscillations make some areas colder and some areas warmer, depending on the ENSO decadal pattern we are in. These patterns also drive changes in day versus night highs and lows, sunshine days, early versus late onset of seasonal temperature and precipitation changes, etc. If the decadal ENSO/station dropout conflation is a source of inhomogeneity, it would be a big one, would you agree?

June 24, 2012 8:01 am

“people who have such strong confirmation biases like bad andrew”
Carrick, you just made an unfounded accusation. You don't know what my biases are, if I have any. You can't know. On the other hand, for years I've seen you and Mosher defend each other's positions in blog comments while calling people with other opinions d-bags. Typical Birds of a Warmer Feather is what the evidence indicates.
Andrew

Carrick
June 24, 2012 8:08 am

sunshine:

Mosher has clearly stated numerous times that he believes if CO2 has increased it must have warmed the earth. Therefore he works hard to find some magical formula showing that crap data proves the earth is warming.

Well that’s what you believe so I can see why you’d believe everybody else thinks like you.
Mosher, like any rational person with a science background, understands that there is a direct forcing from CO2 that causes warming. His view is accepted by every skeptic I know with science training, including Jeff Id (hardly a froth-at-the-mouth global warmer).
People who don't understand radiative physics AT ALL can choose to deny it, but absence of knowledge is not the same as knowledge of absence… and in any case it's been demonstrated beyond any reasonable doubt, to the point where the very strong critic of the IPCC, Steve Fitzgerald, makes a living off of selling devices that utilize the same exact physics that tends to cause climate to warm as more CO2 is added. I don't know of a stronger proof for an effect than "it works and it is a viable economic product."
Does that mean that CO2 causes warming? Probably, but the direct effect is only about 1°C per doubling of CO2. Does it demonstrate that warming is substantive enough that we need to change our global economies to mediate it? No. I think Mosher has said similar things too. Has the IPCC nailed the most likely sensitivity? Probably not; many of us think they are quoting values that are too high, including questionable studies on sensitivity to increase the range of uncertainty.

He has no interest in the third of stations even Mueller admitted were cooling.

More confirmation bias, unskeptical thinking on your part. The marble diagram of the US shown by Mueller’s group is badly flawed and misleading. But you accept it uncritically because it feeds a story you want to hear.
Here’s Mueller’s US only figure done right. (Red is warming blue is cooling.) This is 1/3 of the stations, if by stations we mean stations that operated over the entire 1940-2010 period. Taken straight from the same data set Mueller’s group used to produce their figure (to be fair to them, it has been misconstrued by people like you who have used it to interpret something different than what they meant. The figure I produced on the other hand is meant to allow you to make the comparison that you wanted to make.)
And here’s a histogram of trends for land stations both for US only and for global.
We learn three things from this: Climate has noise (who knew?), but also that most stations globally have shown warming in the sense that a scientist would use the word, namely their trend in temperature is positive from 1940-2010. (That is, we don't look at one point at the end and one point at the beginning to determine if a noisy series is exhibiting warming; we look at the regressed slope of the data.) Third: Confirmation bias is a dangerous thing, and you need to apply at least as much critical thinking to data and analysis that support you as you do to data and analysis that disagree with your views.
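As an illustration of the distinction being drawn between endpoint differencing and a regressed slope, here is a short sketch on a synthetic noisy station series; the 0.01 C/yr trend and 0.5 C noise level are arbitrary illustrative numbers, not values from any real station:

```python
import numpy as np

rng = np.random.default_rng(42)
years = np.arange(1940, 2011)

# A hypothetical noisy station series: a small underlying trend plus noise.
true_trend = 0.01                      # deg C per year (illustrative only)
series = true_trend * (years - years[0]) + rng.normal(0.0, 0.5, size=years.size)

# "Endpoint" warming: last value minus first value (noise-dominated).
endpoint = series[-1] - series[0]

# Regressed slope: least-squares fit over the whole record.
slope = np.polyfit(years, series, 1)[0]

print(f"endpoint difference: {endpoint:+.2f} C")
print(f"regressed trend:     {slope * 10:+.3f} C per decade")
```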

Pamela Gray
June 24, 2012 8:15 am

What concerns me about confirmation bias is the speed at which climate scientists were convinced to study CO2-related issues while still not having completed all that was needed to research natural drivers, and certainly not all that was needed to research the quality of the multiple sets of temperature data, be they proxies or sensors.
From what I can see at my armchair, AGW scientists were bred and funded from a bias point of view, even though these same scientists may claim they have no bias. If this bias were not the case, we would be seeing a lot more articles from them reporting on their studies of natural drivers, much of which is not clearly understood and admittedly poorly represented in “models”.

Carrick
June 24, 2012 8:21 am

Bad Andrew:

Carrick, you just made an unfounded accusation

Straight from the mouth of the guy who just made unfounded accusations.

don’t know what my biases are, if I have any.

Au contraire, I doubt anybody who has seen your writing is unaware of your biases. I doubt many will mistake you for Gavin Schmidt, for example.
As to “If I have any”? Really??? You’re an android now and not a human??? Fascinating.
Humans have biases; it's how we function cognitively, and it's why science is designed with the notion in mind that we have biases and has to be self-correcting against them.

On the other hand, for years I’ve seen you and Mosher defend each other’s position in blog comments while calling people with other opinions d-bags.

OK, give us a link where Mosher and I defended each other while calling people with other opinions “d-bags”. In those exact terms.
Truthfully, I wasn't even thinking of this thread when I wrote that, but of times when I haven't engaged people whose opinions I disagree with.
You made the accusation; it seems like it's your responsibility to prove it or withdraw it. Since I know you can't, I'll say in advance that your comment in a nutshell demonstrates the type of anti-intellectual games you personally engage in: pseudo-intellectual arguments followed by blanket, unsupported (and unsupportable) accusations.

June 24, 2012 8:39 am

To Mosher, Zeke and Carrick, for example… All I would say is that in the 50 years I have been around, the temperature isn't any warmer than I remember when I was 5 years old. It is not as warm as the lovely long hot summers of the '30s when my Mum and Dad were enjoying their youth. Sea level is the same as it has always been. But apparently CO2 is much higher than when I was a lad. Not that I have noticed…
Sorry to all you lukewarmers and warm-mongers – but I don’t see a problem. And I don’t care how you play with the figures.

June 24, 2012 8:48 am

“Truthfully I wasn’t even thinking of this thread when I wrote”
Evidence that you aren’t the greatest thinker, either. 😉
Andrew

June 24, 2012 8:48 am

Carrick
You can do all sorts of things to make it look like nothing biased is happening with GISTemp. You are showing the bias you think others are showing. You do the thing you accuse others of doing. I’ve seen your type of arguments so many times. You want to say anything that doesn’t show the results you are looking for is wrong and whatever shows what you are looking for is right.
There is no way GISTemp is a true record of temperature on earth.
You also have to take into account that the head of the department handling GISTemp is an environmental activist. He says so himself. If you truly want unbiased data then you'll have to end all appearance of bias. In other words, you'll have to look at data that is not handled by an environmental activist. Even if you believe his data is unbiased, you still cannot reference it, since doing so could give the appearance that you want data handled by an activist.
There’s other data sets. You’re better off not referencing BEST and GISTemp. Use the others.

June 24, 2012 8:59 am

Carrick
Funny how you're trying to guide the argument away from obvious issues. Like, for example, how GISTemp data does not use ARGO buoys anymore. It looks like the environmental activist running GISS did not like the cooling trend shown in the ARGO buoys—even though ARGO has the best coverage of the oceans. He dropped ARGO and went to an inferior data set.
And even if that wasn't the reason the environmental activist dropped ARGO, it can still, legitimately, be said that's why he did it, because that's how it appears. So if you want to give the appearance that you are not biased, it's the better part of wisdom to steer away from GISTemp and not defend it.

Bill Illis
June 24, 2012 9:06 am

Here are the changes made to the Land Temperature Record by the NCDC from Version 2 to the current Version 3 they are using.
They cooled the 1930s by around -0.1C and warmed the recent months by +0.05C.
http://img18.imageshack.us/img18/8978/changencdclandv2tov3.png
This is clearly a systematic change versus random homogeneity adjustments.
Now what were the changes from Version 1 to Version 2? Where is the original Version 1?
And I don't understand how this systematic change does not show up in Zeke's charts. Is there a difference between GHCN and what the NCDC eventually uses as the actual reported temperature?
Here is the data.
ftp://ftp.ncdc.noaa.gov/pub/data/anomalies/monthly.land.90S.90N.df_1901-2000mean.dat
ftp://ftp.ncdc.noaa.gov/pub/data/anomalies/usingGHCNMv2/monthly.land.90S.90N.df_1901-2000mean.dat

Carrick
June 24, 2012 9:09 am

And I'm out of here. Not because it hasn't been real. Kitchen remodel in progress. I may or may not get back here later. One can state one's views sometimes, present evidence for them, and move on.
Cheers.

June 24, 2012 9:12 am

Carrick, I didn’t misconstrue anything. I downloaded the data and mapped it.
http://sunshinehours.wordpress.com/2012/03/18/cooling-weather-stations-by-decade-from-1880-to-2000/
“That is, we don’t look at one point at the end and one point at the beginning to determine if a noisy series is exhibiting warming,”
I look at 5-year averages. Many US states are cooler over the last 5 years than periods as far back as the early 1900s.
http://sunshinehours.wordpress.com/2012/06/24/usanoaa-5-year-averages-plotted-using-all-montly-anomalies/
The climate cycle is up and down. Plotting trends using annual anomalies from as many stations as possible HIDES the climate signal.
The climate signal is up and down and up and down.

June 24, 2012 9:30 am

sunshinehours1
Carrick is (I hate to use the term because it’s used so much, but it does apply) cherry picking data.

June 24, 2012 9:33 am

Carrick says:
June 24, 2012 at 9:09 am
"One can state one's views sometimes, present evidence for them, and move on."
What you actually did is cherry picked data that suited your paradigm. Then you said if anyone doesn’t agree with you they are biased. Then you left.

June 24, 2012 9:53 am

Carrick’s probably gone for a beer – non carbonated – with Mosher and Zeke.

Venter
June 24, 2012 9:54 am

Chiefo,
Yes, I'm up against FDA and TGA and EMEA / EDQM all the time and work on filing DMFs and CTDs for obtaining approvals. Every bit of data and every experiment, whether positive or negative, has to be recorded and documented, and no "in-filling" is allowed. Every small change has to be documented and DMFs updated constantly. Every single move and process has to be validated independently, quite rigorously. That's what makes me laugh when so-called educated gents come and spout BS about the way GHCN and other temperature data are recorded and handled, or I would say mishandled. What they are condoning would be called out as fraudulent practice in every hard scientific field.

Pamela Gray
June 24, 2012 10:45 am

Carrick, love marble graphs. However, that graph needs to be viewed using unhomogenized raw-from-the-sensor data, and correlated with ENSO oscillations over time spans defined as El Nino, La Nina, and Neutral, and probably with multi-variate parameters, i.e. PDO, AO and AMO oscillations. Then turned into a movie with the marbles changing colors as they cool or warm year by year, with the background color scheme changing according to analogue ENSO years. In addition, each station listed needs to be given a numerical error-bar value related to its degradation/equipment changes over time, as roll-over popups.

Editor
June 24, 2012 10:48 am

Carrick says:
June 23, 2012 at 10:40 pm

Willis:

Heck, contrary to your usual practice, you even actually answered a few questions

Serious question here:
How many questions are they required to answer? I don’t expect d****bags deserve any answer at all, for example.

So this is meant as a serious question, Willis? How clueless does a person need to be before we are allowed to blow them off?

Asking someone for a citation to their claim is not a “clueless question”, Carrick. It is everyday scientific practice.
How many requests for citations do they need to answer? Well … about as many as the number of uncited claims that they make. In this particular example, Steven Mosher claimed that the TOBS adjustments didn’t just apply to the US. Citing his authority for this would have been a trivial thing to do, but instead he wants to play “Go Fish” …
I’ve played that game, Carrick, and here is how it turns out all too often. Someone refuses to answer my request for a citation, so I go to look. I find something, and return to discuss it. The person who told me to go fish says ‘No, that’s not the citation I was thinking of’, so we’re back to square one. So I refuse to play it any more. Mosher is one of the larger offenders in this area.
One of the things that makes science work is transparency. Part of that transparency is to cite your sources for your claims. That’s why there are long lists of references at the bottom of scientific papers. If you do not do so, people are well justified in asking for those citations. If you refuse to provide them upon request, I’ve gotten to the stage where I just point and laugh.
Finally, as to whether "douchebags" deserve any answer, I just re-read the comments from the person "phi" asking Mosh the question. I find nothing to indicate that he is a troll, a nit-picker, or a "douchebag". But even if he were, HE'S JUST ASKING FOR A FREAKIN' CITATION TO A SCIENTIFIC CLAIM.
So yes, he definitely deserves an answer, even if he is just the janitor.
w.

E.M.Smith
Editor
June 24, 2012 11:16 am

Interesting set of comments since I was last here.
As I need to get ready for a visit to a local church, I’m not going to do my usual canonical response.
Looks like Carrick tosses some smears, flung some “poo insults”, and when he saw “no joy” packed up and left. Fine with me. Never saw much reason to indulge in “poo fling contests” as things are not more pleasant nor cleaner at the end. (Better to keep just enough distance and watch what the wind does 😉
Attempts at distraction to “The Radiative Model” again, too. I’m bored with argument of the form:
“If we ignore conduction, evaporation, convection, and condensation: radiation dominates!”
The world is a spherical heat pipe with water as the working fluid. Radiation is irrelevant below the top of the atmosphere. At those levels, added CO2 causes more heat loss.
http://wattsupwiththat.com/2012/06/19/a-demonstration-of-negative-climate-sensitivity/
http://chiefio.wordpress.com/2011/07/11/spherical-heat-pipe-earth/
Per GIStemp:
Having been through the code and having it running on LINUX; it is just a giant “pot stir” on top of GHCN. Does yet more smearing and blending and yet more changing of data. Then does an “anomaly creation” for grid / boxes at the very last step that compares fictional value to fictional value. Don’t see much merit in it at all.
Oh, and created by a guy who testified that it was a GOOD thing to break the law if your cause was good enough… Yeah, I'll trust that Moral Compass to not break the ethics of science when he thinks he is saving the world… /sarc off
Per Church:
Just so folks know… I’m not a strongly religious type (though I have my moments and did pick up a D.D. at one point.) The spouse is. Mostly I find religion and comparative religion very interesting and the historical record in books like the Bible rather accurate (as recent archeology digs have shown). So we “collect churches” from time to time going to different ones just to see how each does things. (Some are pretty strange, others fun, some somber, others songs and play… and a few just flat out confusing and alien in foreign languages.) It can be a fun hobby and gets me away from computers and climate for a while…
@Verity:
Oh Dear! I did swap TonyB for Kevin… My apologies! What can I say? It had been a long day…
I’ll check back in this evening.

phi
June 24, 2012 12:13 pm

Willis Eschenbach,
Thank you.
But I would have said:
“So yes, he definitely deserves an answer, ESPECIALLY if he is just the janitor.”

June 24, 2012 12:54 pm

Pamela Gray says: “Victor, I read your blog post. Very interesting. What are your thoughts regarding non-random station dropout …”
Thank you Pamela Gray. I have studied homogenisation, therefore I feel qualified to make a statement about that.
I did not study non-random station dropout. Intuitively I would be very much surprised if it were possible to change the trends by removing a small fraction of the climatological stations. After homogenisation, the trends in neighbouring stations are quite consistent. Thus removing one will probably not change the average regional trend much. Do you have any study that indicates that such a thing is possible? You could study it by taking the homogenised GHCN data, removing the x percent of stations with the smallest trend, and seeing what happens. I would expect you get a larger effect if you do this with the homogenised data than with the raw data (and then homogenise it, of course).
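A toy version of the test Victor suggests, run on made-up per-station trends rather than the real GHCN data (the 0.15 ± 0.20 C/decade distribution and the drop fractions are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical per-station trends (deg C/decade) for one region.
station_trends = rng.normal(loc=0.15, scale=0.20, size=500)

def mean_trend_after_dropping_coolest(trends, drop_fraction):
    """Remove the fraction of stations with the smallest trends and
    return the mean trend of those that remain."""
    keep = np.sort(trends)[int(drop_fraction * trends.size):]
    return keep.mean()

print(f"all stations:       {station_trends.mean():+.3f} C/decade")
for frac in (0.05, 0.10, 0.30):
    biased = mean_trend_after_dropping_coolest(station_trends, frac)
    print(f"drop coolest {frac:4.0%}:  {biased:+.3f} C/decade")
```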
For the US there is a small climate reference network, set-up at pristine locations. The series is still only a few years long, but Matthew Menne and Claude William (NOAA) found that the trends averaged over the US of this reference network matched those of the homogenised data of the full network. Which is an indication that dropout is not a serious problem for the trends in the average temperature over a large region.
For the dropout to be non-random, you would need a conspiracy of hundreds of people in all countries of the Earth. Do you really see this as realistic? As a comment to the original E.M. Smith post, I already explained why I do not believe in a conspiracy. Unfortunately this comment was censored.
For the global mean, I do not expect many problems, but station dropout is a problem for studying regional climate and changes in extreme weather. Just at the moment, the Czech Republic is closing down a third of its stations to balance the national budget. Which stations will be closed is normally decided by the meteorologists and the financial department. Climatologists are not the powerful elites you guys seem to think they are. If you can make some noise and are able to fight the station dieback, you will have the climatologists on your side. Maybe we could try to get UNESCO on our side; the climatological network is part of our human heritage.
E.M. Smith wrote: “For homogenizing, the techniques vary, but largely use the same kind of “average a bunch” to get a value. ”
Please educate yourself. Your description is completely wrong.
http://variable-variability.blogspot.com/2012/01/homogenization-of-monthly-and-annual.html

phlogiston
June 24, 2012 1:19 pm

Should we not just scrap the corrupt thermometer record and look for reliable recent proxies (not tree rings)? Or make a rocket big enough to put up a satellite which peeks at earth out of a lead box so its electronics and CCDs don't get roasted by solar and cosmic rays?

Pamela Gray
June 24, 2012 3:06 pm

I am speculating. I think station drop-out and change could very well be non-random in terms of ENSO patterns but not because of a conspiracy. I think station dropout has to do with abandoned stations in less populated areas of the US. If you look at ENSO analogue years, you will find that areas that are highly sensitive to ENSO multi-decadal changes also happen to be in low-population areas. There is potential there for abandoned station drop-out removing sensors that would have otherwise recorded the large decadal ups and downs of ENSO effects on temperature.

DocMartyn
June 24, 2012 3:17 pm

“Victor Venema
I did not study non-random station dropout. Intuitively I would be very much surprised if it would be possible to change the trends by removing a small fraction of the climatological stations. After homogenisation, the trends in neighbouring stations are quite consistent. Thus removing one will probably not change the average regional trend much”
Well, if you make sure all the cooling stations left are inside very closely spaced clusters of warming stations, and make sure that the ones you remove are near the many voids, you make the voids warmer.
You then make sure all the warming stations removed are inside very closely spaced clusters of warming stations, and that the ones you keep are near the many voids; you make the voids even warmer.
It is placing warming stations, and removing cooling stations, adjacent to void regions that makes all the difference.
If you find a cooling station all alone, a long distance from everyone else, get rid of it (intuitively 'knowing' it's crap), and the created void is averaged from the surrounding stations.
If you find a rapidly warming station all alone, a long distance from everyone else, keep it (intuitively 'knowing' it's crap), and the non-void is averaged into the surrounding stations.
You can look for selection bias quite quickly: just look at the nearest-four-station distances in the dropped data. If the distance for the dropped colder stations is bigger than for the dropped warming stations, you have a smoking gun.
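A rough sketch of that smoking-gun check, using randomly generated station positions and drop/trend flags rather than real GHCN metadata; it only shows the mechanics of comparing nearest-four-station distances between dropped cooling and dropped warming stations:

```python
import math, random

def km_between(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    r = 6371.0
    p1, p2 = math.radians(a[0]), math.radians(b[0])
    dp, dl = math.radians(b[0] - a[0]), math.radians(b[1] - a[1])
    h = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))

def mean_nearest4(station, others):
    """Mean distance from `station` to its four nearest neighbours."""
    dists = sorted(km_between(station["pos"], o["pos"]) for o in others if o is not station)
    return sum(dists[:4]) / 4

# Hypothetical station list: position, sign of trend, and whether it was dropped.
random.seed(3)
stations = [{"pos": (random.uniform(30, 50), random.uniform(-120, -80)),
             "warming": random.random() < 0.6,
             "dropped": random.random() < 0.3} for _ in range(200)]

def group_mean(warming):
    """Mean nearest-4 distance over dropped stations with the given trend sign."""
    group = [s for s in stations if s["dropped"] and s["warming"] == warming]
    return sum(mean_nearest4(s, stations) for s in group) / len(group)

print(f"dropped cooling stations, mean nearest-4 distance: {group_mean(False):6.1f} km")
print(f"dropped warming stations, mean nearest-4 distance: {group_mean(True):6.1f} km")
```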

Carrick
June 24, 2012 3:22 pm

Willis, thanks for the response. I would phrase it slightly differently, whether a person “deserves” an answer depends a bit on how many times he’s been told the same information. Wouldn’t you agree?
I am not myself loath to give out references to literature I think is pertinent, so generally I do agree with your sentiment.

Carrick
June 24, 2012 3:31 pm

sunshine:

What you actually did is cherry picked data that suited your paradigm. Then you said if anyone doesn’t agree with you they are biased. Then you left.

… because of “Kitchen remodel in progress”, as I mentioned. But nice job of being a completely dishonest sleaze in your description of what transpired.
As for the other, I didn't cherry-pick the data; I chose it based on objective criteria, to minimize homogenization issues among other reasons. This goes under the rubric of "quality control", and the criteria and methodology I used are both credible and in general use in other branches of science.
You don't like what you see because it doesn't give the answer you like, but regardless, you're wrong about the claim that 1/3 of Mueller's data (really GHCN) has negative trends, and so was he, though as I mentioned you didn't understand the caveats associated with that claim. You were uncritical of reports that favored your prior beliefs, which makes you credulous, not a skeptic at all, and now you aren't able to honestly admit error when it's pointed out to you.
As to the last, when I said people were biased, I meant everybody is biased, and that includes me; we know this in science, and we account for it in the process of the scientific method.
Cheers… Lunch/dinner stop from tiling. Then back to the kitchen.
[Moderator’s Note: Rare agreement with Carrick: some people (probably not moderators) have lives outside of commenting on WUWT. Carrick, good luck with the tiling! If I try that myself, I now know where to go for advice. -REP]

gallopingcamel
June 24, 2012 3:34 pm

Sadly, this discussion is way above my pay grade.
However it is inspiring to see a debate carried out in such depth with minimal name-calling or appeals to authority. Even the usually grumpy Steve has been mostly civil. It probably means that there is some mutual respect among the principals (Steve, Zeke and Chiefio) sadly lacking in most discussions of climate change issues.

davidmhoffer
June 24, 2012 3:44 pm

Pamela Gray;
I didn't study station drop-out per se, but a few years ago I pulled apart NASA/GISS in considerable detail to compare the trends of grid cells with continuous data to those that "came and went". I fully expected that the increase in grid cells with data, and the subsequent decline, would account for some portion of the temperature trend. I proved myself wrong (to my chagrin).

June 24, 2012 4:10 pm

1) The paragraph Carrick claims I wrote, was not written by me. It was written by Amino Acids.
Maybe Carrick and Editor should check …
Victor: “For the dropout to be non-random, you would need a conspiracy of hundreds of people in all countries of the Earth. ”
Like the average elevation dropping 46 meters from 1940 to 2000?

June 24, 2012 4:12 pm

To Carrick and Moderator :
Amino Acids wrote the paragraph attributed to me.
But calling me a dishonest sleaze is definitely consistent with Carrick’s modus operandi of verbally bullying people he is losing arguments to.

Manfred
June 24, 2012 4:16 pm

wayne says:
June 23, 2012 at 5:49 am
In Figure 3: http://wattsupwiththat.files.wordpress.com/2012/06/clip_image006.png
“This shows that while the raw data in GHCN v1 and v3 is not identical (at least via this method of station matching), there is little bias in the mean. Differences between the two might be explained by the resolution of duplicate measurements in the same location (called imods in GHCN version 2), by updates to the data from various national MET offices, or by refinements in station lat/lon over time.”
Zeke, that is not a correct statement above, "there is little bias". I performed a separation of the bars to the right of zero from the bars to the left of zero and did an exact pixel count of each of the two portions.
To the right of zero (warmer) there are 9,222 pixels contained within the bars, and to the left of zero (cooler) there are 6,834 pixels of area within. That makes the warm-side adjustments 135% of those to the cooler side. Now I do not count that as "basically the same" or "insignificant". Do you? Really?
—————————————————————-
Hi, Mosher, I find it a bit distressing to see you again ignore a clear and very easy to understand issue with your analysis. I don’t think by doing this you are up to the standards of this website.

June 24, 2012 4:41 pm

steven mosher says:
Here are some examples: Tmin is reported as being greater than Tmax, Tmax is reported as being less than Tmin, temperatures of +15000C being reported, temperatures of -200C
being reported. There are scads of errors like this: data items being repeated over and over again. In a recent case where I was looking at heat wave data, we found one station reporting freezing temperatures. When people die in July in the Midwest and a station's "raw data" says that it is sub-zero, I have a choice: believe the doctor who said they died of heat stroke or believe the raw data of the temperature station.

To me, this speaks to the unreliability of the record more than anything else. How can unreliable data be used to make reliable models?
BTW – It’s an honest question, from a non-scientist who’s trying to understand, surely that’s a worthwhile question to answer?

Carrick
June 24, 2012 4:57 pm

sunshine, I apologize for the misattribution.

June 24, 2012 5:07 pm

I wonder if homogenization of the data or some other algorithm has removed some of the more extreme negative anomalies from more recent data, because even in the NOAA data the low extremes seem to have stopped.
Here is a pattern I find in many US states and some of the western provinces. I’ll use Washington States as an example.
Before 1987, there were 11 months with an anomaly of more than -10F (two of which were over -15F). After 1987 there were none. The number of large positive anomalies did not go up. And those never went over 8.
http://sunshinehours.wordpress.com/2012/06/24/usanoaa-5-year-averages-plotted-using-all-montly-anomalies/
BC is similar:
http://sunshinehours.wordpress.com/2012/06/24/british-columbia-environment-canada-5-year-averages-plotted-using-all-monthly-anomalies/
Or maybe large negative anomalies just stopped occurring.

Gene
June 24, 2012 5:17 pm

Re: thermometer bulb shrinkage
It is news to me, although it does sound plausible. The only data point I can throw at this issue is this: the oldest mercury thermometer I possess is Jenaer Normalglas 18/III made in 1940, -38 .. +46C, graduated to 0.2C. I just compared it across its entire range to an electronic instrument with a K-type thermocouple and 0.1C precision. Their readings are identical to +/-0.2C, with no perceptible bias.

June 24, 2012 5:36 pm

TonyG says:
June 24, 2012 at 4:41 pm (Edit)
steven mosher says:
Here are some examples: Tmin is reported as being greater than Tmax, Tmax is reported as being less than Tmin, temperatures of +15000C being reported, temperatures of -200C
being reported. There are scads of errors like this. Data items being repeated over and over again. In a recent case where I was looking at heat wave data we found one station reporting freezing temperatures. When people die in July in the midwest and a station’s “raw data” says that it is sub zero, I have a choice: believe the doctor who said they died of heat stroke or believe the raw data of a temperature station.
To me, this speaks to the unreliability of the record more than anything else. How can unreliable data be used to make reliable models?
BTW – It’s an honest question, from a non-scientist who’s trying to understand, surely that’s a worthwhile question to answer?
#############################################
A couple points.
1. The frequency of these types of errors is low.
2. Even with these errors left in, you get roughly the same answer.
3. This data is not used to build models. Models are built from first principles, not data.
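(A minimal sketch of the kind of gross-error screening described above, assuming a pandas DataFrame of monthly records with columns tmin, tmax and tavg in degrees C; the column names and thresholds are assumptions, not the actual GHCN quality-control code.)

import pandas as pd

def flag_gross_errors(df: pd.DataFrame) -> pd.DataFrame:
    """Boolean flags for records that fail simple physical sanity checks."""
    flags = pd.DataFrame(index=df.index)
    flags["tmin_gt_tmax"] = df["tmin"] > df["tmax"]                 # min above max
    flags["out_of_range"] = (df["tavg"] < -90) | (df["tavg"] > 60)  # physically implausible
    # value unchanged for three consecutive records (a crude "repeated data" check)
    flags["repeated_value"] = df["tavg"].diff().eq(0) & df["tavg"].shift().diff().eq(0)
    return flags

# Example: flag_gross_errors(records).sum() counts how many rows trip each check.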

DR
June 24, 2012 5:43 pm

3. This data is not used to build models. Models are built from first principles, not data

Actually, very little is built from first principles.

June 24, 2012 5:43 pm

Gene,
Having calibrated hundreds of thermometers [mercury & alcohol], and B, J, R, K, and S-type thermocouples, etc., plus PRT’s and just about every other electronic temperature measuring device invented, I can assure you that a good mercury thermometer is more accurate, linear, repeatable and reliable than the others. If bulb deformation happens, I’ve seen no evidence of it. We would often use a known accurate mercury thermometer to do a quick verification of an electronic thermometer; but not vice-versa. Hysteresis is more of a problem in electronic instruments than in mercury thermometers.
WUWT had an article on thermometer calibration a year or two ago, which I can’t locate for some reason. But it was taken from this blog, which has lots of good info, and it shows how complex even calibrating a simple mercury thermometer can be.
That is one reason why I place great reliance on the Central England Temperature [CET] record, which is based on a mercury thermometer and covers the past several hundred years. It shows the planet emerging from the LIA along the same long term trend line with no acceleration of temperatures, even though there has been about a 40% rise in CO2 during that time. That is extremely strong evidence that the claimed warming effect of CO2 is greatly exaggerated.

June 24, 2012 6:01 pm

Carrick
You did cherry pick data.
You told me to compare some land data to some other land data. You did not include ocean temperature. And ocean temperature is where GISTemp fails.
To pick only Meuller and Hansen’s work and say they are correct but not other data sets shows a clear bias. Why can’t you see that? Would it be because of your own bias? You cannot see you are cherry picking. These arguments over which data set is valid always go in these same circles.
Also, name calling should have no place in any discussion other than elementary school yards.
And, apparently, you aren’t really gone.

Mike G
June 24, 2012 6:02 pm

Smokey, are you aware of any work comparing the response time of modern electronic thermometers to traditional mercury max/min thermometers? I suspect there is a built-in high bias in the modern record due to the quicker response of electronic thermometers.

June 24, 2012 6:05 pm

Carrick
I am thinking over my comments and trying to see how “sleaze” is an appropriate description for any of them. It could be true that I could have worded two things differently. But even in their current wording they are not “sleaze”. You really need to do a reassessment of your opinion of people in the world that don’t agree with you.

June 24, 2012 6:16 pm

steven mosher
“2. Even with these errors left in, you get roughly the same answer.”
“Roughly” does not count. Roughly leaves too much play in the data. Roughly here and roughly there makes for useless data. Again, “manmade global warming” is based on tenths of a degree, which is roughly the same as there being no “manmade global warming”.

Editor
June 24, 2012 6:20 pm

Carrick says:
June 24, 2012 at 7:40 am

… But my question is really for Willis. How much engaging in his opinion is required with people who have such strong confirmation biases like bad andrew that they apply completely different standards to people who feed them what they want to hear than people who don’t, and anybody who raises doubt about their beliefs is blown off as a “true believer” in any case?

Good question, Carrick. For me, in general a request for a citation to a claimed fact should always be answered, regardless of the questioner. It is important for a couple of reasons. First, it is an issue of transparency, traceability, and accountability, which is why there is an entire section of references at the end of every scientific paper.
But more to the point, the questioner is likely not the only one who is interested in a request for a citation. For example, in this case I’d like to know why Steven made the claim that he made about the time of observation adjustment, and I suspect I’m not the only one.
So in answer to your question, “How much engaging in [my] opinion is required with people who have such strong confirmation biases …”, if what they are asking for is a simple citation, in general that is a valid scientific question no matter who is asking it, and it should be answered.
w.

June 24, 2012 6:24 pm

“[Moderator’s Note: Rare agreement with Carrick: some people (probably not moderators) have lives outside of commenting on WUWT. Carrick, good luck with the tiling! If I try that myself, I now know where to go for advice. -REP]
I would hope you are not including his “a completely dishonest sleaze” as part of that rare agreement.
[REPLY: Excellent point. No. It would be a very good thing if all commenters, including Carrick, would refrain from personalizing things. -REP]

Editor
June 24, 2012 6:24 pm

Carrick says:
June 24, 2012 at 3:22 pm

Willis, thanks for the response. I would phrase it slightly differently, whether a person “deserves” an answer depends a bit on how many times he’s been told the same information. Wouldn’t you agree?
I am not myself loath to give out references to literature I think is pertinent, so generally I do agree with your sentiment.

Carrick, maybe I’m not following your point, but if you have evidence that “phi” has been given this same information before, please bring it forward. Absent such evidence, Steven should have just posted a reference to support his claim and moved on.
And no, there’s no reason to do that more than once.
w.

June 24, 2012 6:28 pm

How many requests for citations do they need to answer? Well … about as many as the number of uncited claims that they make. In this particular example, Steven Mosher claimed that the TOBS adjustments didn’t just apply to the US. Citing his authority for this would have been a trivial thing to do, but instead he wants to play “Go Fish” …
#############################
Using Google and doing some reading is all it would take. I’ve explained this and cited it on several occasions. I get tired of doing other people’s work. I get tired of lazy people who are not interested in the truth, who expect others to do their work for them.
A SIMPLE GOOGLE ON TIME OF OBSERVATION BIAS will get you the mere
beginnings of the literature on this. Then read the papers, because the papers have bibliographies going back decades.
But here goes yet again. The US is SOMEWHAT unique in this matter because, unlike other countries, we had no standard observation time. However, we are not entirely alone in this regard. There are several other countries who have the same issue. A few follow.
Japan
http://sciencelinks.jp/j-east/display.php?id=000020000500A0108818
Or if any of you started to look at the sources of Crutem4 ( none of you have ) you would have found
this right away
Canada:
http://www.ec.gc.ca/dccha-ahccd/default.asp?lang=en&n=70E82601-1
This website provides monthly, seasonal and annual means of the daily maximum, minimum and mean temperatures from the Second Generation of Homogenized Temperature datasets which now replace the first generation datasets.
A First Generation of Homogenized Temperature datasets were originally prepared for climate trends analysis in Canada. Non-climatic shifts were identified in the annual means of the daily maximum and minimum temperatures using a technique based on regression models (Vincent, 1998). The shifts were mainly due to the relocation of the station, changes in observing practices and automation (Vincent and Gullett, 1999). Adjustments for the identified shifts were applied to monthly and daily maximum and minimum temperatures (Vincent et al. 2002). Observations from nearby stations were sometimes combined to create long time series that are useful for climate change studies.
The Second Generation of Homogenized Temperature datasets were recently prepared to provide a better spatial and temporal representation of the climate trends in Canada. In this new version, the list of stations was revised to include stations with long-term temperature observations covering as many of the recent years as possible. New adjustments were applied to the daily minimum temperatures in order to address the bias due to a change in observing time (Vincent et al. 2009). Techniques based on regression models were used to detect non-climatic shifts in temperature monthly series (Wang et al. 2007; Vincent 1998). A new procedure based on a Quantile-Matching (QM) algorithm was applied to derive adjustments (Vincent et al., 2012; Wang et al. 2010).
Want More?
Well, there is also Australia, which was posted here. They changed observation time in
1964, and in their latest product I believe they made a few site-specific adjustments for TOBS.
See section 8 here
http://cawcr.gov.au/publications/technicalreports/CTR_049.pdf
And Norway.
Nordli, P.Ø. 1997. Adjustments of Norwegian monthly means of daily minimum temperature.
KLIMA Report 6/97, Norwegian Meteorological Institute, Oslo
And almost every long temperature series (see the 4 long European stations) has TOB adjustments.

June 24, 2012 6:32 pm

Willis Eschenbach
” if what they are asking for is a simple citation, in general that is a valid scientific question no matter who is asking it, and it should be answered.”
What you are saying is pretty simple to understand.
I could speculate as to why someone would not want to give a cite reference: Maybe there isn’t one. Or maybe the reference has been called into question before. Or maybe those being asked aren’t understanding what is being requested. Or maybe they are expecting you, and everyone else, to just believe them without question. Or, maybe they don’t understand the simple scientific procedure that is so obvious to you, and me, and most people looking for truth.
But even if any of my guesses aren’t correct there still is no call for the name calling and character assassination that Carrick is engaging in.

June 24, 2012 6:33 pm

Amino Acids in Meteorites says:
June 24, 2012 at 6:16 pm (Edit)
steven mosher
“2. Even with these errors left in, you get roughly the same answer.”
“Roughly” does not count. Roughly leaves too much play in the data. Roughly here and roughly there makes for useless data. Again, “manmade global warming” is based on tenths of a degree, which is roughly the same as there being no “manmade global warming”.
###########################
I will give you an example. In GHCN daily there are two measurements of 15000C.
Guess what happens when you average that with thousands of other measurements
and take the trend over 150 years?
Nothing. Nothing if you leave it in, even less when you take that outlier out.
So, if the fact that there are some bad “raw” data makes you want to dump the whole
dataset, then almost no knowledge is possible.
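(A quick numerical illustration of that point, under stated assumptions: roughly 5,000 stations reporting monthly anomalies for 150 years, with one station-month corrupted to 15000C. The bad value is diluted first in that month’s average across stations and then again in the 150-year least-squares trend. The numbers below are synthetic, not GHCN.)

import numpy as np

rng = np.random.default_rng(0)
n_months, n_stations = 150 * 12, 5000
months = np.arange(n_months)
true_trend = 0.7 / (100 * 12)                  # 0.7 C per century, expressed per month

# Synthetic anomalies: a shared trend plus station-level weather noise.
anoms = true_trend * months[:, None] + rng.normal(0.0, 1.5, (n_months, n_stations))
corrupted = anoms.copy()
corrupted[1200, 42] = 15000.0                  # one absurd raw value

def trend_per_century(monthly_means):
    return np.polyfit(months, monthly_means, 1)[0] * 100 * 12

print("trend, clean data:   %.4f C/century" % trend_per_century(anoms.mean(axis=1)))
print("trend, with outlier: %.4f C/century" % trend_per_century(corrupted.mean(axis=1)))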

Reply to  steven mosher
June 25, 2012 6:49 am

steven mosher said:
So, if the fact that there are some bad “raw” data makes you want to dump the whole dataset, then almost no knowledge is possible.
So why not just drop the bad data, instead of making adjustments? You still never explained how you know what to adjust TO. So far, I haven’t heard any justification for making adjustments.

June 24, 2012 6:39 pm

DR says:
June 24, 2012 at 5:43 pm (Edit)
3. This data is not used to build models. Models are built from first principles, not data
Actually, very little is built from first principles.
####################################
Model E code is online. I started reading it back in 2007. It’s 100K LOC. Suggest you
get started.
If you want an introduction to models,
start here. Simple enough. Two types of models:
http://www.newton.ac.uk/programmes/CLP/seminars/082310001.html
Read more; comment less.

June 24, 2012 6:39 pm

“[REPLY: Excellent point. No. It would be a very good thing if all commenters, including Carrick, would refrain from personalizing things. -REP]”
Thanks.
I hope none of what I have said in this thread was in bad taste. As I think over a couple of my comments I see I could have worded a couple things differently. How I put them could easily be construed in a way I didn’t mean. Lesson one: don’t be in a hurry to write your comment on the internet!

Pamela Gray
June 24, 2012 6:41 pm

David, did you correlate it with ENSO geographic location? The oscillations we so often talk about are not global, they are regional. If station dropout was also regional, we have a potential artifact.

June 24, 2012 6:43 pm

steven mosher
I don’t think EM Smith’s work in this case is about “raw” data.

June 24, 2012 6:48 pm

Willis.
I have pointed phi to the sources before. He refuses to read them or to acknowledge anything.
Here, a month ago:
http://rankexploits.com/musings/2012/a-surprising-validation-of-ushcn-adjustments/#comment-95737
where you will find the reference to Japan as the first link.
But he is not interested in looking at the actual data, actual papers, actual code.
He is not interested in the fact that the skeptics John Daly and Jerry B looked into the TOBS matter themselves. He is only interested in derailing the conversation.

June 24, 2012 6:53 pm

Jimmy Haigh says:
June 24, 2012 at 7:01 am (Edit)
Mosher and Zeke? Crickets…
##############
No. Projects.

June 24, 2012 6:54 pm

JT says:
June 23, 2012 at 11:52 am (Edit)
@Mosher
@Smith
question: if you took each raw temperature measurement and plotted it against the time when the measurement was made from the earliest known temperature measurement to the latest known temperature measurement so as to create a complete scatterplot of available raw data, what would it look like?
##############################
Zeke uploaded the data. Have at it.

M
June 24, 2012 6:58 pm

steven mosher says:
June 24, 2012 at 6:39 pm
DR says:
June 24, 2012 at 5:43 pm (Edit)
3. This data is not used to build models. Models are built from first principles, not data
—————————————————
“Bauer et al. used a large aerosol effect and still needed a large deforestation warming to bring her results in line with the Mann et al. reconstruction (in fact, it was done specifically for that reason)”
http://di2.nu/foia/foia2011/mail/1891.txt

June 24, 2012 7:01 pm

Pamela Gray says:
June 23, 2012 at 8:38 am (Edit)
Steven, a significant portion of your time as a scientist should be spent explaining what you have said. To refuse to do so by telling others to figure it out for themselves seems a bit juvenile and overly dressed in “ex-spurt” clothes. If questions come your way, kindly doing your best to answer them seems the better approach, especially with such a mixed audience of arm-chair enthusiasts such as myself and truly learned contributors. I for one appreciate your post. Don’t spoil it.
############################
Pamela. I am not a scientist. I create tools for others to investigate data that has been made freely available. I provide answers and help to anyone who shares my commitment to the following
1. use your real name when you post
2. post your data
3. post your code
I provide those tools so that people can see for themselves and do their own damn work.
I routinely get requests to “do this chart” or “do that chart”. Guess what? I’m not your chart monkey. I’m not your PhD director. Read some damn papers, use Google Scholar, read bibliographies. Read more, comment less. Answer questions for yourself UNLESS YOU CAN’T.
I give you all the tools you need to answer the questions you have. If I spend time showing you the library or coding up a chart for you, that is LESS TIME I have to build a tool for a guy who wants to do his own work or who wants to work with my software.

June 24, 2012 7:03 pm

I don’t think EM Smith’s work in this case is about “raw” data.
############
since GHCN v1 is unadjusted and he claims to use GHCN v3 unadjusted, we can probably agree this is about unadjusted data.

June 24, 2012 7:07 pm

sunshine:
“He has no interest in the third of stations even Mueller admitted were cooling.”
This is a very crude misunderstanding of a bad chart.
What’s more, sunshine knows this is wrong, since I spent a considerable amount of time looking at this issue. A point which Carrick can attest to, since he gave me some really interesting ideas.
However, remember this is sunshine. He posts data that is estimated and does not disclose that fact.

Pamela Gray
June 24, 2012 7:08 pm

This issue related to ENSO parameters possibly conflating with station dropout as well as station degradation has to do with what we are experiencing in NE Oregon. Every year for the past 7 years, we have experienced much cooler Spring and Summer temperatures according to our ranch thermometers, corroborated by late starts on much of the agricultural products, and decreased insect populations ubiquitous to our area. We have also experienced a massive surge in cold-water-loving sports fish.
Of interesting note, one of the local extreme NE Oregon weather stations that has a long term record has undergone significant deterioration and has not recorded this cooling, while others not so deteriorated in nearby counties have.
NE Oregon is like the canary in the coal mine, being situated in a geographic area highly sensitive to ENSO oscillations. So much so that for decades peas will disappear from our fields then surge back, simply because peas are what we can, or cannot, grow under the highly sensitive conditions we are facing. Salmon and steelhead make their way up our alpine rivers in similar decadal oscillating fashion, falling to barely countable numbers year after year, then rising to such numbers that fishing season stretches for weeks and weeks year after year. And we are not the only canary. ENSO patterns result in other highly sensitive geographic areas across the US.
If station dropout concentrated itself to these ENSO-sensitive geographic areas, it could have quite an effect on anomalies, I would think. I think there are socially related reasons for this. Unstable climate areas tend to not encourage population growth, and station dropout may be more frequent in areas not described as population centers.

June 24, 2012 7:19 pm

dp says:
June 23, 2012 at 9:25 am (Edit)
Start over and show where Smith’s error is. All you’ve shown is you have arrived at a different result with different methods. No surprise. I could do a third and be different again. It would not show that Smith or you are right (or wrong).
##################################
1. To do that with certainty would require EM to post his data and code in a turnkey fashion
like Steve McIntyre, Zeke, Nick Stokes, Willis, and I do.
2. The flaws in First Differences, established by the skeptic Jeff Id, should be enough to
give you pause. My question to you is why didn’t you spot that he had used
a discredited method? Was it the fact that you liked the answer that made you
drop your skepticism? Was it the fact that he called it “peer reviewed” that
bamboozled you?
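(For readers who have not followed the earlier Climate Audit and Air Vent discussions, a minimal sketch of the first-differences idea as usually described, following Peterson et al. 1998, not E.M. Smith’s particular implementation: difference each station year over year, average those differences across stations, then cumulatively sum. Because any error in a single year’s mean difference persists through the cumulative sum, and gaps break the chain, the result can drift, which is the heart of the criticism referred to above.)

import numpy as np

def first_differences(annual, years):
    """annual: 2-D array (n_years, n_stations), np.nan where a station-year is missing."""
    d = np.diff(annual, axis=0)            # year-over-year change per station (nan if a gap)
    mean_d = np.nanmean(d, axis=1)         # average change across stations reporting both years
    # Cumulative sum rebuilds a regional series relative to the first year;
    # a year with no overlapping stations contributes nothing (nan treated as zero).
    series = np.concatenate(([0.0], np.nancumsum(mean_d)))
    return years, series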

June 24, 2012 7:24 pm

“Steven Mosher claimed that most adjustments came from Tobs not only in US. TObs adjustments are generally made ​​in all countries but they are weak and the problem is usually totally different. In this regard, US is in fact a special case. Very curious.”
The US is not a special case. It’s probably the largest case, but the problem exists in other countries as well: Japan, Canada, Norway, perhaps a bit in Australia, and in all the extra-long series that go back to 1800 and beyond. Thankfully, with the Berkeley approach we don’t have to explicitly do this adjustment.

June 24, 2012 7:30 pm

sunshine:
“Mosher clearly has stated numerous times he believes if CO2 has increased it must have warmed the earth. Therefore he works hard to find some magical formula that proves crap data proves the earth is warming.”
Wrong. Basic physics tells us that CO2 is a greenhouse gas. If CO2 increases, then, all other things being equal, the earth will warm. How much? That is the really hard question.
The temperature record has little to say about this in the short term.
Is the earth warming? Yes, there was a Little Ice Age. I don’t know anyone who argues that
it is cooler now than in the LIA. but go ahead, fire away with your best data.

davidmhoffer
June 24, 2012 7:31 pm

Pamela Gray says:
June 24, 2012 at 6:41 pm
David, did you correlate it with ENSO geographic location?
>>>>>>>>
No, I didn’t. (I was just beginning my climate self education and had no idea what ENSO was back then). I looked at it strictly from the perspective of any major grouping of drop outs (or the reverse earlier in the record).

June 24, 2012 7:43 pm

“The funniest interventions could be yours. We spoke with Steven Mosher of the ability to disable the implicit homogenization in BEST. This implicit homogenization is the result of the segments adjustments. If you disable this setting, there is simply no results.”
Huh? Once again phi is talking about something he knows nothing about.
You can turn the scalpel on or off. Readers HERE should know that, because they will remember that when Muller first ran some tests on station quality (Anthony’s issue) those tests
were run with the scalpel off.
At some point people will address the following issues.
1. Sunshine tried to pass estimated data from Env Canada off on you without mentioning it.
2. EM tried to pass his version of First Differences off on you as being peer reviewed,
and none of you, not a single one of you, even had the memory to realize that
we had discussed that method on Climate Audit and Jeff Id’s Air Vent.
3. Phi, anonymously, continues to hijack threads with requests for citations that
a) are right on Google
b) are in the bibliographies of papers he claims to have read
c) have been given to him before.
4. We get the same answer as GISS and CRU using
a) no GHCN v3 data
b) methods developed by skeptics
All that said, there are some valid criticisms of GISS and CRUTEM, but they are
scientifically uninteresting. They are fascinating nitpicks, but in the end, using different data and better methods (skeptic-approved methods; gosh, it was everything we asked for) we get the same answer. Yup, it’s warmer now than in the LIA. We know that with a little more certainty
than we did before. More temporal coverage and more spatial coverage. Transparent code.
Open data. Everything we asked for. What nobody expected was that the answer would change so little.

June 24, 2012 7:54 pm

Mosher: “he posts data that is estimated and does not disclose that fact”
Again, as I said above, the monthly data I got from EC did not say that.
However, my post also said that wasn’t the most important issue. It was data appearing in BEST that had no basis in reality.
But I think I had a legitimate question as to where a -13.7C data point for Malahat came from in the BEST data. Even Nick Stokes admitted it appeared out of nowhere. You continue to try and divert people’s attention from mysterious data.
I got 4355 stations with a cooling trend from 2000 to 2011, most in the USA.
Further back the number drops but it is still significant.
http://sunshinehours.wordpress.com/2012/03/18/cooling-weather-stations-by-decade-from-1880-to-2000/
How many do you get by decade? Map them.
Anyway, frankly I think it is disgusting BEST only shows data from 1950 and totally ignores the warmer 20s and 30s. What a joke.

June 24, 2012 7:55 pm

Lucy
“Like showing the HS is an unavoidable statistical result of the Team’s method of selection, I suspect CA would show that the warming is an unavoidable statistical result of dropping stations etc. But I am open to disproof.”
we have offered that proof in many forms, even on this blog.
1. We did a global average using around 100 of the longest stations: same answer
2. We ran the average with random selections of data: same answer
3. We ran a test that only included stations that survived the drop out: same answer
4. We ran with completely different datasets: same answer
We ran different methods. We ran sensitivity tests. Everything.
Now, here is what we did find. I will call it Carrick’s find since he did most of the work and made the best suggestions.
When you have missing months, stations tend to COOL. That’s right. This came out of me looking at the cooling-station phenomenon. The more complete the record, the warmer the station will be. Carrick had some theories about this that I’m going to get to test someday when the day gets extended to 36 hours.
One other thing that remains is the latitude bias that Carrick has identified. I think there may be a coastal station bias as well, that is, too many stations on coasts will depress the record.

June 24, 2012 8:17 pm

Mosher: “I don’t know anyone who argues that it is cooler now than in the LIA. but go ahead, fire away with your best data.”
1) Washington State. The last 5 years’ average temperature is cooler than ~1898 to 1908.
Or by “now” do you mean 1998? It isn’t 1998 anymore.
You can check here: http://www.ncdc.noaa.gov/oa/climate/research/cag3/wa.html
Arizona?
http://sunshinehours.wordpress.com/2012/06/24/arizonanoaa-5-year-averages-plotted-using-all-monthly-anomalies/
2) While the US temperature record starts at 1895, assuming that it was cooler before 1895 is not supportable. Looking at the Arizona graph or Washington, what would they look like if the data went back to 1800?

Michael R
June 24, 2012 8:19 pm

michael r,
if the differences were distributed in time as you suggest, the trends would be different.
That is the point of comparing trends.

That wasn’t the point at all. I am a layman. I come to sites such as this to find information and to digest it in as much capacity as I can, given that fact. What this ends up coming down to is usually one of two things –
1. Simple to follow ideas and
2. Trust
Because at the end of the day, to take what is being said and have faith in the person who said it really does underpin the average person’s knowledge of science, particularly in matters that are very complex.
Now one of my first forays into researching this science was Real Climate. In spending some time there I found so many arrogant posts – not just by the commenters, but by the moderators as well in the comments – to any question or differing opinion, and experienced this first hand. That initial experience had me so turned off to the people there that now I do not trust any of them. They could argue the sky was blue, but I wouldn’t believe it.
It is true that you do not need to be liked to do good science, but when it is laymen such as me and the general public, and especially many of the politicians setting the policies on that science, it is important that we trust those giving their opinions.
The point, though it was long to get there, was that the last time I had seen the analysis used in the first section of this post, it was done intentionally to mislead people reading it into believing what was being said, by an intentional twisting of data to suit their purpose. To be honest, I read the first paragraph, saw the first graphs and immediately had the gut reaction to ignore the rest of the post as not worth my time because of that. As I said, this is unfortunate as the rest may well be a valid post, and you may not have had that intention in mind when you posted it.
All I suggested was caution when posting that kind of information, because it doesn’t help anyone, particularly if that example is used as the lead-in to the rest of the argument. You stated that it is meaningless without comparing trends. The thing is, if you read the initial post from a third party point of view, you will note you made a clear distinction that these were two different methods to both show there is not a problem:

This shows that while the raw data in GHCN v1 and v3 is not identical (at least via this method of station matching), there is little bias in the mean. Differences between the two might be explained by the resolution of duplicate measurements in the same location (called imods in GHCN version 2), by updates to the data from various national MET offices, or by refinements in station lat/lon over time.

Followed by:

Another way to test if GHCN v1 and GHCN v3 differ

That says to me that you hold the first method to be substantial in showing no bias independently of the second method – which is an issue, as that is not exactly true. Now I as a layman then have to ask the question: should I trust the second section of the article?

Michael R
June 24, 2012 8:22 pm

Please excuse the typing errors, my keyboard is old and if I do not whack the key it can just miss letters. Also I have no idea why I keep adding a t onto “laymen” o_O.

Carrick
June 24, 2012 8:26 pm

Amino, I admit I did misconstrue your comments, and I apologize to all present in this conversation for my role in escalating and personalizing this. Let’s leave it at that, ok?
You said:

You did cherry pick data.

No, cherry picking is something else entirely. I selected data based on trying to reliably estimate a trend. Briefly, cherry picking would be selecting data to produce a desired outcome. I had no idea when I did this analysis what I would find for sure. I repeated the Mueller “marble plot” analysis obviously because I questioned the result shown in his plot, and I designed the analysis based on how best to test this… that includes a common baseline (data present over the selected interval), not so much data “in sample” missing that the trend could not be reliably estimated, etc.

You told me to compare some land data to some other land data. You did not include ocean temperature. And ocean temperature is where GISTemp fails.

Um, you were talking about Mueller, who is responsible for BEST, which is land only. Why do you think we should consider ocean data for a land-only reconstruction? And why are you bringing in GISTEMP to a discussion about Mueller’s data (really GHCN in the figure in question)?

June 24, 2012 8:41 pm

Willis
“Asking someone for a citation to their claim is not a “clueless question”, Carrick. It is everyday scientific practice.”
1. This thread is not about TOBS. I have had this discussion about TOBS over and over again
with the anonymous phi (bad andrew always shows up). Now, you often manage
the discussion on your posts by telling people what is on topic and off topic. You
don’t see me coming on to your threads and saying “hey willis, that seems on topic to me!”
No, I don’t, because I respect your decision to engage those who you believe are engaging
in good faith. You put good hard time into your posts and commenting on your posts.
Trolls waste your time and they waste my time. Every second I spend on a troll is time
away from the real issues: First Differences as a method, and the changes between GHCN v1
and GHCN v3.
2. Phi has been told before, when he tried to hijack other threads, where he could look
to find the answer to his question. It’s not that hard. Anybody who works in this field
and actually reads papers and reports knows that the US is not the only country
that has an issue with time of observation. It’s received the most publicity, to be sure,
but one test I use to tell if another party is really interested in the topic is
whether they have read the core literature. Or are they popping up to ask their pet off-topic
question to derail the conversation? Anybody who googles time of observation bias
will find the reference to Japan. Anybody who looked through CRUTEM4 sources
(a sure test of interest in the subject) would find the case of Canada. Anybody
who reads WUWT and is interested in the topic would have read the source
documents about the Australian case. Derailers, thread jackers, are easy to
spot. They are typically anonymous. They talk about one topic and not in much
depth.
He has also been told that the long series have been adjusted for TOBS. I assume
that somebody truly interested in this issue would have read about the
most important series that we have: CET. You have read the CET papers; you didn’t forget the sections about adjusting for TOB changes. I expected you, Willis, to remember that Armagh has also been adjusted for time of observation, as I recall you wrote about
that station here on WUWT, and I assume you read the underlying literature. Anyone who has looked at that data and paper would have read about adjustments for TOBS.
Basically, I have no time for people who refuse to read, especially when they have been given the information before, especially when their request is off topic. For those who can’t remember, I have a bit more patience.

June 24, 2012 8:42 pm

sunshine. If you want to change the topic of this thread for yet another time and bring up your crazy denial of the LIA, let me suggest that you take it to your blog.

Carrick
June 24, 2012 8:43 pm

Pamela Gray:

Carrick, love marble graphs. However, that graph needs to be viewed using unhomogenized raw sensor data, and correlated with ENSO oscillations over time spans defined as El Nino, La Nina, and Neutral, and probably with multi-variate parameters, i.e. PDO, AO and AMO oscillations. Then turned into a movie with the marbles changing colors as they cool or warm year by year, with the background color scheme changing according to analogue ENSO years. In addition, each station listed needs to be given a numerical error bar value related to its degradation/equipment changes over time as roll-over popups.

The point of performing a 70-year trend estimate is to remove short-period fluctuations to expose whether long-term warming is present or not. If you did an annual estimate, all you’d be seeing is climate noise.
I did an analysis of the effect of climate noise on temperature trend as a function of the integration period (e.g., interval of measurement of the trend), and as you can see for very short periods, the trend is telling you nothing at all interesting about warming or cooling of climate. 30 years is the minimum period I would recommend for reliably estimating temperature trends.
By the way I also did 1970-2010 similar result.
We could do what you are suggesting in your movie (in fact something like this has been done), but I’m not sure what you’d learn from it.
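(An illustration of the point about short windows, as a sketch on synthetic data rather than any particular station: fit trends of different lengths to a series with a known 0.7 C/century trend plus red “weather” noise and look at how widely the short-window trends scatter.)

import numpy as np

rng = np.random.default_rng(1)
n = 140 * 12                                   # 140 years of monthly values
true_slope = 0.7 / (100 * 12)                  # C per month

noise = np.zeros(n)
for i in range(1, n):                          # simple AR(1) red noise
    noise[i] = 0.6 * noise[i - 1] + rng.normal(0.0, 0.25)
series = true_slope * np.arange(n) + noise

for years in (5, 10, 30, 70):
    w = years * 12
    trends = [np.polyfit(np.arange(w), series[s:s + w], 1)[0] * 1200
              for s in range(0, n - w, 12)]    # slide the window one year at a time
    print(f"{years:3d}-year windows: std of fitted trends = {np.std(trends):.2f} C/century "
          f"(true trend 0.70)")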

June 24, 2012 8:43 pm

Carrick
It’s land data. You advocate it. What’s up with this putting it off on the man that made the data you prefer?

June 24, 2012 8:46 pm

Carrick says:
June 24, 2012 at 8:26 pm
“Amino, I admit I did misconstrue your comments, and I apologize to all present in this conversation for my role in escalating and personalizing this.”
Thanks for the apology. I accept it. And I appreciate it.
One thing I’d like to clear up is it did look like you were departing from the thread and not just temporarily to do some tiling. But it is easy to write a comment that appears to mean something other than it meant. I’m pretty sure we’ve all done it.

Carrick
June 24, 2012 8:51 pm

sunshine (double checked attribution this time >.< ):

1) Washington State. The last 5 years’ average temperature is cooler than ~1898 to 1908.
Or by “now” do you mean 1998? It isn’t 1998 anymore.

Generally the period given is circa 1350 to 1850. See my comments above about needing to be careful, when comparing temperatures, to use a long-enough averaging period to remove short-period climate noise. I didn’t show this, but the period you need to average over becomes larger as you make the geographical area smaller. 10 years would be too short for full global temperature; it’s virtually worthless for very small (on a global scale) geographical areas like Washington State.

June 24, 2012 8:57 pm

Mosher: “he posts data that is estimated and does not disclose that fact”
Again, as I said above, the monthly data I got from EC did not say that.
############
Then before you come onto this web site to stalk me and derail the conversation,
please read the web site where you get your data. Or you can use the software
that I wrote for people like you to read the data. That software was written for your benefit,
because of questions you asked about Environment Canada. It allows you
to download all the data and to analyze it.
But do not come onto a thread where something else is being discussed, throw up
data that you have not checked (when I wrote a program that allows you to check),
and make me waste my time correcting your mistakes.
Now, in case the thread nanny Willis comes along and explains that I should help you
undo your mistake (when I’ve already written a software package that allows you
to do it), let me give you the links. Click on the second one to get the CSV,
or use my program to download and organize all of Env Canada. It was only a few weeks’ work.
http://www.climate.weatheroffice.gc.ca/climateData/monthlydata_e.html?timeframe=3&Prov=BC&StationID=65&mlyRange=1920-01-01|2005-04-01&Year=1920&Month=01&Day=01
http://www.climate.weatheroffice.gc.ca/climateData/bulkdata_e.html?timeframe=3&Prov=BC&StationID=65&mlyRange=1920-01-01|2005-04-01&Year=1920&Month=1&Day=01&format=csv
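(A minimal sketch of fetching the second link above in Python, assuming the bulk-data URL still resolves. The Environment Canada CSV starts with several station-metadata lines before the column header, so the sketch just saves the file and prints the first lines rather than assuming a fixed layout; the output file name is arbitrary.)

import urllib.request

# The bulk-data link quoted above, with the "|" in mlyRange percent-encoded.
url = ("http://www.climate.weatheroffice.gc.ca/climateData/bulkdata_e.html"
       "?timeframe=3&Prov=BC&StationID=65&mlyRange=1920-01-01%7C2005-04-01"
       "&Year=1920&Month=1&Day=01&format=csv")

with urllib.request.urlopen(url) as resp:
    text = resp.read().decode("utf-8", errors="replace")

with open("ec_station65_monthly.csv", "w", encoding="utf-8") as f:
    f.write(text)

for line in text.splitlines()[:15]:            # eyeball the preamble and header
    print(line)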

Carrick
June 24, 2012 9:00 pm

Amino:

It’s land data. You advocate it. What’s up with this putting it off on the man that made the data you prefer?

I’m not sure where you are getting that I am “advocating” land data or what you mean by “putting it off on the man that made the data” I prefer. GHCN, the data set I used, was made by many thousands of researchers not one, so who is the “man that made the data” that I prefer???
Please make sense.
I was responding to a specific issue that related to “Mueller’s data”, which is GHCN, which is land-only data. I used an analysis method that selected stations that could reliably be used to estimate temperature for a given period to test that data. I’m not sure what’s left to discuss besides that, other than to conclude 1/3 of the GHCN stations do not actually show cooling (this is the data set Mueller’s figure was showing, and again it specifically does not include ocean data).
The original comment made above about the “1/3 of the data” is wrong. We should move on to discussing something else. If you want to discuss ocean data (e.g. GISTEMP’s extrapolation into the Arctic Ocean during wintertime months from land-only stations), I’m game. But it’s a different subject… and let’s wrap up this other topic by agreeing about this much at least.

Editor
June 24, 2012 9:04 pm

steven mosher says:
June 24, 2012 at 6:48 pm

Willis.
I have pointed phi to the sources before. He refuses to read them or to acknowledge anything.
Here, a month ago:
http://rankexploits.com/musings/2012/a-surprising-validation-of-ushcn-adjustments/#comment-95737
where you will find the reference to Japan as the first link.
But he is not interested in looking at the actual data, actual papers, actual code.
He is not interested in the fact that the skeptics John Daly and Jerry B looked into the TOBS matter themselves. He is only interested in derailing the conversation.

And in this case he has succeeded in derailing the conversation … but only because you were unwilling to give a simple link. In addition, as I mentioned above, there are others, like myself, interested in the same question who didn’t follow your previous discussion.
So next time, how about you just give the freakin’ link and move on, and not turn it into some kind of battle on principle? How hard is it to give a single link?
w.

June 24, 2012 9:07 pm

Mosher: “sunshine. If you want to change the topic of this thread for yet another time”
You asked and I quote: “I don’t know anyone who argues that it is cooler now than in the LIA. but go ahead, fire away with your best data.”
And for the 3rd or 4th time you dodge the key point: Spurious data did appear in BEST.
And more importantly, as I’ve shown with a few graphs: Starting BEST graphs in 1950 is dishonest when the distant past is just as warm as it is today (and I’m talking 5-year averages, not 1 data point).
It isn’t 1998 anymore. It has cooled. Why it has cooled is fascinating. But your blinkers are on. You have a project to protect.
California: http://sunshinehours.wordpress.com/2012/06/24/californianoaa-5-year-averages-plotted-using-all-monthly-anomalies/

June 24, 2012 9:08 pm

“But I think I had a legitimate question as to where a -13.7C data point for Malahat came from in the BEST data. Even Nick Stokes admitted it appeared out of nowhere. You continue to try and divert peoples attention from mysterious data.”
1. Are you using the right dataset? Last time we went through this you made the mistake of using
the raw data. Remember that?
2. Download the program BerkeleyEarth.
3. Using that program, download the quality-controlled data.
4. Using that program, read in the sources.txt file. This will tell you the data sources
for every line of data.
5. Then realize that values that look “out of range” are handled by the last two
steps of quality control in the programs. In short, if the data is garbage it
goes through one last quality check and then a kriging process. Just
because a data element is in the file does not mean that it gets used.
There is a regional consistency test and kriging weighting.
6. Download the package CHCN and look at all the Env Canada data.
To recap: you are thread jacking. I have discussed this before with you and pointed you to
the software tools I wrote so that you can do this without making mistakes.

June 24, 2012 9:13 pm

And in this case he has succeeded in derailing the conversation … but only because you were unwilling to give a simple link. In addition, as I mentioned above, there are others, like myself, interested in the same question who didn’t follow your previous discussion.
##########################
Willis, you handle derailers your way; I handle them my way. Phi has been told, the last time he derailed the conversation, where to look. You yourself are well enough read in this field (CET and Armagh) to know that the US is not the only record that does TOBS adjustments. Did you honestly forget that?

June 24, 2012 9:16 pm

Carrick: ” that you use a long-enough averaging period to remove short-period climate noise.”
I used 1971-2000 for the 30-year period. But it isn’t the value, it is the value compared to other values.
And Washington State isn’t the only region warmer at the turn of the 20th century than it is in the last 5 years: California and Oregon too.
Other regions were a lot warmer in the 20s or 30s than the last 5 years.
NOAA: “The Little Ice Age (or LIA) refers to a period between 1350 and 1900”
http://www.ncdc.noaa.gov/paleo/ctl/resource1000.html

June 24, 2012 9:25 pm

sunshine
Once again with the derailing of the conversation. I guess Willis is the only one here who gets to tell commenters that their comments are off topic.
“I got 4355 stations with a cooling trend from 2000 to 2011, most in the USA.”
Without a doubt. You will find cooling stations throughout the record. But these are the important questions:
1. How many of those records are complete? Carrick had a wonderful finding that as you
look at COMPLETE records, records that have no missing months, the fraction
of cooling stations goes down. Higher quality data appears to mean something.
This is a cool mystery.
2. How many of those trends are statistically significant? What kind of correction
for autocorrelation did you use? Basically, what I found in poking around with this
was that about 15% of long stations were cooling (10% if you used complete records),
and this number diminished further if you ask for statistical significance.
Here is a CLUE. When a warmist says “It’s warming” we all rightly pounce on them
and ask where the error bars are. It seems fair to ask that question when one
finds cooling stations.
Now don’t get me wrong, I think cooling stations are very interesting, especially
those in urban heat islands. In fact there are a couple of papers written on the effect.
You want the links? Willis has them. He is the link nanny.
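(On the autocorrelation correction asked about in point 2: a common approach is to inflate the ordinary-least-squares standard error of the trend by an effective-sample-size factor based on the lag-1 autocorrelation of the residuals. A minimal sketch of that general technique, not the specific method anyone in this thread used.)

import numpy as np

def trend_with_ar1_se(y):
    """Return (slope per time step, AR(1)-adjusted standard error) for a 1-D series."""
    x = np.arange(len(y))
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]            # lag-1 autocorrelation
    n_eff = len(y) * (1 - r1) / (1 + r1)                     # effective sample size
    se = np.sqrt(np.sum(resid**2) / (len(y) - 2) / np.sum((x - x.mean()) ** 2))
    return slope, se * np.sqrt(len(y) / max(n_eff, 2.0))

# A station only counts as significantly cooling if its trend is more than about
# two adjusted standard errors below zero.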

Editor
June 24, 2012 9:26 pm

steven mosher says:
June 24, 2012 at 8:41 pm

Willis

“Asking someone for a citation to their claim is not a “clueless question”, Carrick. It is everyday scientific practice.”

1. This thread is not about TOBS. I have had this discussion about TOBS over and over again with the anonymous phi (bad andrew always shows up). Now, you often manage the discussion on your posts by telling people what is on topic and off topic. You don’t see me coming on to your threads and saying “hey willis, that seems on topic to me!” No, I don’t, because I respect your decision to engage those who you believe are engaging in good faith. You put good hard time into your posts and commenting on your posts. Trolls waste your time and they waste my time. Every second I spend on a troll is time away from the real issues: First Differences as a method, and the changes between GHCN v1 and GHCN v3.

Thanks for that explanation, Steven. If the issue is that phi’s question is off topic, that’s fine. As you point out, I say things are off topic on my threads. And so can you … but:
1. In this case, the person who introduced the TOBS topic was you, saying:

Sure. Back in 2007 I started as a skeptic of adjustments. After plowing through piles of raw and adjusted data and the code to do adjustments, I conclude:
A. Raw data has errors in it.
B. These errors are evident to anyone who takes the time to look.
C. These errors have known causes and can be corrected or accounted for.
The most important adjustment is TOBS. We dedicated a thread to it on Climate Audit.
TOBS is the single largest adjustment made to most records. It happens to be a warming adjustment.

The fact that you were the person who introduced the topic into the thread means that it is far from obvious that you will later consider it off-topic …
2. You didn’t do what you correctly point out that I do, which is to tell a poster that their comment is off topic. Instead, you first brought up the topic of TOBS, and then when asked if you had a reference for your claims about TOBS you said:
“yes. do more reading and post your results.”
No matter how many times I read that, I don’t get “TOBS is off-topic for this thread” out of it. Nor do I get “I told you that before, phi, and you ignored it” out of it, nor do I get “phi, you are a troll, I’m going to ignore you.” We’re not mind readers out here.
My point is simple. Once again, your cryptic posting style has done nothing but cause confusion and dissension. If the issue is that phi’s question is off-topic, say so. If the issue is that phi has asked that question before and ignored the answer, say so. If the issue is that you think phi is a troll, say so.
Because saying what you said just makes it look like you have something to hide, even though you don’t, and even though you have valid reasons for your actions. And that is something that you don’t need.
You are a brilliant man, Steven, I’ve learned lots from you, and I’m a huge fan of your work and your insights… but your posting style, not so much.
w.

June 24, 2012 9:30 pm

Bruce, err sunshine, if you want to continue your LIA denial, please go to this thread.
There you have a link, so the nanny will be happy
http://wattsupwiththat.com/2012/06/24/hh-lamb-climate-present-past-future-vol-2-in-review-part-i/#more-66176

June 24, 2012 9:37 pm

Carrick
Again, and really, for the last time, you are biased. You prefer Meuller and Hansen. My goodness man, look around you.

gallopingcamel
June 24, 2012 9:38 pm

Just when I thought that Mosher had mellowed a little he starts ranting.
Methinks Steve doth protest too much. Calm down and listen just this once.

phlogiston
June 24, 2012 9:44 pm

“Hit ‘EM where they ain’t” I guess must be Steve Mosher’s principle in, astonishingly, refusing point blank to reply to the detailed rebuttal by EM Smith while filling the thread with musings about life in general. This looks like a mixture of cowardice and arrogance – in any case there is no doubt that it constitutes an admission of defeat.
This spectacular intellectual defeat of Mosher and Hausfather by EM Smith is reminiscent of the outcome of the famous Oxford debate on evolution by natural selection between Thomas Huxley and Samuel Wilberforce at the Oxford University Museum in 1860 (http://en.wikipedia.org/wiki/1860_Oxford_evolution_debate) – paradoxically perhaps considering the Chiefio’s faith and presumed atheism on the other side. Such is history. This is what happens when true intellect and honesty confront establishment dogma and a dishonest defence of special interest.

Carrick
June 24, 2012 9:44 pm

sunshine:

NOAA: “The Little Ice Age (or LIA) refers to a period between 1350 and 1900

1900 is the extreme outer edge, and I think you would have to admit that an interval that extends beyond this isn’t typical of LIA weather.
Regarding the interval selection, if you’ll listen to and absorb what is being said regarding the issues about reduced geographic regions, you’ll sharpen your arguments as a result. You may or may not get the conclusions you prefer, but that’s not the point of your analysis, right? It’s to learn the truth.

June 24, 2012 9:48 pm

Sunshine:
“And for the 3rd or 4th time you dodge the key point: Spurious data did appear in BEST.”
Let me explain this to you yet again.
The QC data file goes through ADDITIONAL checks. Those checks happen in code.
I can use GHCN daily as an example. For the GHCN daily data set, every day of data
has a spatial consistency check.
Want a link?
ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt
Too lazy to read? Here is the text:
QFLAG1 is the quality flag for the first day of the month. There are
fourteen possible values:
Blank = did not fail any quality assurance check
D = failed duplicate check
G = failed gap check
I = failed internal consistency check
K = failed streak/frequent-value check
L = failed check on length of multiday period
M = failed megaconsistency check
N = failed naught check
O = failed climatological outlier check
R = failed lagged range check
S = failed spatial consistency check
T = failed temporal consistency check
W = temperature too warm for snow
X = failed bounds check
Now, when Berkeley Earth reads this data in, it applies MOST OF but not ALL OF
these QA flags.
Do you want to see what QA flags have been applied to the data in question?
1. Get the package BerkeleyEarth.
2. Download the QC data.
3. Read the flags.txt file and see what flags have been applied.
Now, the spatial quality flags are typically not applied. WHY?
A. The Berkeley data includes more sources.
B. If you have more sources, your spatial consistency test can be better.
Spatial consistency is determined not just for GHCN daily but for all the data around that site.
That process happens as a part of the kriging process, where the weather error is minimized in an iterative process.
At some point in the future I hope to add two datasets that will allow you to see the final data:
1. post-scalpel data: this is like 170K station segments
2. final total field values.
One of the drawbacks of using this kriging process is that you can’t easily
see what happens to outliers, and you can’t easily see what the final weighted values are.
So for example, if you have a value that is -13.7 and all the surrounding stations
have other values, the weighting process will deweight unreliable data. It’s an iterative process
that searches to minimize the weather noise.
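(For anyone who wants to honour those QFLAG columns themselves, a minimal sketch of reading a GHCN-Daily .dly file and keeping only day values whose quality flag is blank. The fixed-width offsets follow the readme linked above, but check the current readme before relying on them; values are in tenths of a degree C for TMAX/TMIN, with -9999 meaning missing.)

def read_dly_passing_qc(path, element="TMAX"):
    """Return (year, month, day, degrees C) tuples that did not fail any QA check."""
    records = []
    with open(path) as f:
        for line in f:
            if line[17:21] != element:
                continue
            year, month = int(line[11:15]), int(line[15:17])
            for day in range(31):
                chunk = line[21 + day * 8: 29 + day * 8]     # VALUE, MFLAG, QFLAG, SFLAG
                value, qflag = int(chunk[0:5]), chunk[6]
                if value == -9999 or qflag != " ":
                    continue                                  # missing, or failed a check
                records.append((year, month, day + 1, value / 10.0))
    return records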

June 24, 2012 9:49 pm

Carrick
You give the impression BEST is untouched raw data. But they had their method of handling the data. And the data is not untouched and pure.
Also, who were the peer reviewers of Mueller’s BEST? Why do I say Mueller? He is the face of it. We all know it. Why distract like that?

June 24, 2012 9:56 pm

mosher, thanks for the link on the LIA. One of the links (http://notalotofpeopleknowthat.wordpress.com/2011/11/11/what-was-life-like-in-the-little-ice-agepart-ii/) in that article had this to say:
“The late 1870’s were equally cold in China and India , where up to 18 million died from famines caused by cold, drought and monsoon failure.
The cold snap persisted into the 1880’s and 1890’s when large ice floes formed on the Thames.”
Remember mosher, stuff did happen before 1950, even if BEST disagrees.
And if people do think I tried to hi-jack the thread, I apologize, but my first post on this thread did start with: “(Moderator — feel free to snip, but I think this is relevant)”

Carrick
June 24, 2012 10:01 pm

Amino:

Again, and really, for the last time, you are biased. You prefer Meuller and Hansen. My goodness man, look around you.

I have absolutely no idea what you are talking about. I prefer Mueller and Hansen. As in “Gentlemen prefer Meuller [sic] and Hansen?”
What???
And OK I’m looking around. Now what?
Look, this all started out as a discussion between sunshine and myself (which is how I got you and him confused) over Mueller’s data, starting here. I think you’ve lost the thread somewhere, because you are adding a lot of content to this thread that isn’t there.
I wrote something critical of Mueller that hardly shows a preference for him. I didn’t make any particular claims about GISTEMP other than a cautionary note about the need to compare apples to apples. (As you add more stations, if the new stations are at higher latitude, you expect a small, as in *yawn*, upwards shift in temperature, just like what has been seen.)
Beyond that, of course I’m biased. We all have biases. But what does that have to do with anything we’re discussing here? How are my biases manifesting themselves in a way that distracts from the original thread? If you think I prefer (or am biased in favor of) the BEST temperature reconstruction or the GISTEMP (land-only) ones, that would be a mistake on your part.
Try and stay on track.

Carrick
June 24, 2012 10:02 pm

Amino, I’m looking at GHCN not BEST.

Carrick
June 24, 2012 10:05 pm

And I was criticizing a finding of Mueller and BEST in my comment to sunshine. And I’ve consistently said that BEST is based on GHCN. As to where I’ve “given the impression that BEST is untouched raw data,” I’ll have to ask for a link or a retraction there.
You keep making wild claims; you need to ante up now with factual examples or admit you were mistaken.

June 24, 2012 10:06 pm

Carrick
What’s up with all the arm waving and smoke screens? I did say it wrong. I should have said the data set Mueller has identified himself with. Why be so petty? But I think that’s in your nature. You didn’t do name calling this time. But the same condescension of name calling comes through anyway.
If you want to talk cherry picking, as you did in a comment above where you told me to compare one area of data in BEST to the same area in another data set, that’s fine. It is cherry picking. Not sure why you can’t see that it is.
I think you are not open to any real discussion. It is clear you have your beliefs. You are not here to discuss but to tell us why you think people who don’t agree with you are wrong.

June 24, 2012 10:08 pm

Carrick: “1900 is the extreme outer edge, and I think you would have to admit that an interval that extends beyond this isn’t typical of LIA weather.”
So some people say. But HADCRUT3 NH says it cooled from ~1875 to 1912 or so.
http://www.woodfortrees.org/plot/hadcrut3nh/from:1876/to:1912/plot/hadcrut3nh/from:1876/to:1912/trend
Again, I think there are climate myths suggesting that late-1800s to early-1900s temperatures were a cold starting point and that it was even colder before 1895, when in fact, going back in time, it might have been warmer all the way back to the 1870s, and might well have been significantly warmer than today.
As for you not liking the size of the regions I am looking at, I disagree. I think a climate signal should be as visible in a place as big as BC or as small as California or Washington or Oregon. And if the past was warmer, then we can question why BEST ignores pre-1950 data.
The massaging of data that EM Smith is trying to highlight is essential to understanding whether modern warming is a myth caused by adjustments or just a small blip in a long history of climate ups and downs.

June 24, 2012 10:09 pm

Carrick
Not really sure what you are looking for.
Are you saying BEST is raw, untouched data?
Are you saying it was altered?
Can't be sure why you want some kind of retraction.

June 24, 2012 10:10 pm

Carrick
You do give a clear impression that BEST is true to real-world temperature. That is where you give the impression it is raw data.

June 24, 2012 10:19 pm

Carrick
this is what you said:
“When you adjust for differences in the “land-only” algorithms, BEST and GISTEMP get very findings, since they have the largest geographical coverage, so this is believable.”
http://wattsupwiththat.com/2012/06/22/comparing-ghcn-v1-and-v3/#comment-1016898
I can’t stay with you any longer Carrick. You’re inconsistent. Maybe someone else would like to dance in circles with you.

Carrick
June 24, 2012 10:21 pm

Amino, I’m asking for a url link to where I said what you are characterizing me as saying. Or an admission that your characterizations are in error.
I will claim I never said “BEST is raw untouched data”, in fact I never said anything particular about the BEST reconstruction (other than a comparison of it to GISTEMP).
In which comment, for example, do I "give a clear impression of BEST that it is true to real world temperature"? Can you point this out to me and others?

Carrick
June 24, 2012 10:25 pm

Sunshine:

As for you not liking the size of the regions I am looking at, I disagree. I think a climate signal should be as visible in a place as big as BC or as small was California or Washington or Oregon.

I didn’t say I “didn’t like it”. I said you had to be careful because the smaller the geographic region, the larger the effect of regional scale variability on the temperature record, and that *you* need to consider this when making your arguments.
And I know you aren’t going to like this one:

But HADCRUT3 NH says it cooled from ~1875 to 1912 or so.

I don't think HADCRUT is reliable before 1950. I have a factual basis for making that judgment; it could be flawed, it's something that Steven Mosher and I both separately want to look at, and I can go into it if you are interested.
This is back to the issue… if you don’t think 1970-2010 is reliable, why do you trust 1850-1912, where the global geographical coverage was much worse, and the quality controls in place and instrumentation were much more primitive?
Being skeptical means being skeptical all of the time, even at times when it appears to hurt your arguments. As I said, the objective should be about arriving at the truth, not about who can toss out the superior rhetoric.

Carrick
June 24, 2012 10:53 pm

Amino:

I can’t stay with you any longer Carrick. You’re inconsistent. Maybe someone else would like to dance in circles with you.

All I’m asking you do to is be honest about how you characterize my positions, and if you mischaracterized them through misunderstanding admit to that, and not to continue to attack me with mischaracterizations of my views.
This appears to be as close as you can get to substantiating your somewhat wild claims. I did say:

“When you adjust for differences in the “land-only” algorithms, BEST and GISTEMP get very findings, since they have the largest geographical coverage, so this is believable.

Same geographical coverage = same answers: the algorithms give consistent results when they are expected to. You'd expect this (my point) and this is what you find (my point). What this does not imply is that GISTEMP is up to some "funny stuff" in their adjustments. Nor does it imply that the temperature series is necessarily better or worse than any other.
What this statement does not say is any of these claims of yours,
1) BEST uses raw temperatures,
2) that raw temperatures are “true” temperatures,
3) that I (like most gentlemen apparently ;-)) prefer Hansen and Mueller,
4) fill in the blank(s); I've lost track of all the silly claims you've made about me at this point.
You can't even honestly admit that you overgeneralized my comments. So, having accused me of slinking off (when I TOLD you I was going to remodel my kitchen and might be back, assuming I didn't perish, e.g. running the wet saw), you now slink off without even that admission?
It’s all good, on your way then.

June 24, 2012 10:53 pm

Carrick, let's put it this way. Do I think the LIA ended in 1850 on the dot? No.
Do I trust HADCRUT? Not necessarily. But it does coincide with some of the Greenland data I’ve seen.
Upernavik: Warmest November 1878, warmest Dec 1873 etc
http://www.arctic.noaa.gov/reportcard/greenland_1873-2011_stats_vs_1981-2010_table_htc3.pdf
I suspect it was warmer in the 1870s in many regions than it is now.
“the smaller the geographic region, the larger the effect of regional scale variability on the temperature record, and that *you* need to consider this when making your arguments.”
My argument is simple. A GAT is like a sausage maker attempting to mash all the climate records into one number … say 42 … and make it appear that global temperatures are this relatively stable, upwardly rising 2-D graph.
The number 42 says nothing about climate. There is nothing unusual about our current climate. If anything the unusual part is how calm it appears and how small the fluctuations are from month to month.
Climate is chaotic and in many US states the last 5 years aren’t #1 or #2 and in some states it isn’t #3, or #4 either.
Why? It isn’t global warming.

Carrick
June 24, 2012 10:55 pm

Amino, if you wanted to know which series I thought were most credible as actual representations of global mean temperature, I'd put NCDC at the top and ECMWF close behind. GISTEMP has too many ad hoc steps for me; BEST is land only, so it's not even a global temperature reconstruction, and anyway I'm not a big fan of the way they implemented kriging (the assumption of radial symmetry is neither proven in their paper nor likely to be true).

June 24, 2012 10:56 pm

Amino Acids in Meteorites says:
June 24, 2012 at 9:49 pm
Carrick
You give the impression BEST is untouched raw data. But they had their method of handling the data. And the data is not untouched and pure.
##############################################
Let me see if I can help, using Google.
You can use it to find things like this:
http://berkeleyearth.org/data/
At the bottom of this page is a piece of English:
"Source files
The source files we used to create the Berkeley Earth database are available in a common format here."
That word "here" is a link.
It magically brings you to another place:
http://berkeleyearth.org/source-files/
These are the source files. That means these are the sources used to compile the dataset.
Now if you look at all of those datasets you will see some (like CRUTEM4) that contain data series that are adjusted. And if you look at the sources for CRUTEM4
http://www.cru.uea.ac.uk/cru/data/temperature/crutem4/station-data.htm
you will see other sources… and if you follow those sources down you find things like this
http://www.ec.gc.ca/dccha-ahccd/default.asp?lang=en&n=70E82601-1
which is derived from data like this
http://www.climate.weatheroffice.gc.ca/climateData/monthlydata_e.html?timeframe=3&Prov=BC&StationID=65&mlyRange=1920-01-01|2005-04-01&Year=1920&Month=01&Day=01
Got that?
Now the really hard part is taking all these disparate sources and creating a master file.
Because… that same source for CRU can also be a source for GHCN Daily, and that same source could be a source for GHCN v2 monthly and for GHCN v3. The process of resolving all the sources happens in the merge step. Trust me, merging different datasets is not a fun job.
Maybe I'll do a post on that. In the end there are a few sources that contribute huge portions of the data. GHCN Daily is one of those; since it's daily data it's about as "raw" as you get. I think it may be helpful to folks to do an entire post on the various sources used and how the final dataset is built. That's a fair amount of work, and I've already given months of time away writing and testing software that allows people to do that for themselves if they are truly interested.
All that said there is another initiative underway to compile another comprehensive dataset
http://www.surfacetemperatures.org/
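To make the merge step concrete, here is a minimal base-R sketch of the idea: stack the sources and keep one value per station-month. The column names, the toy values, and the tie-breaking rule (prefer the lowest source code) are my own illustration, not Berkeley's actual schema or merge logic.

# Two toy "sources" reporting the same station-month; columns and values are illustrative only
src_a <- data.frame(id = 7973, yrmo = c("2002-11", "2002-12"),
                    temp = c(5.1, -13.7), source = 22)
src_b <- data.frame(id = 7973, yrmo = "2002-12",
                    temp = -3.2, source = 3)

merged <- rbind(src_a, src_b)

# Resolve duplicates per station-month; this toy rule simply prefers the lowest source code,
# which is NOT what the Berkeley merge actually does
merged <- merged[order(merged$id, merged$yrmo, merged$source), ]
single_valued <- merged[!duplicated(merged[, c("id", "yrmo")]), ]
single_valued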

June 24, 2012 11:00 pm

yes amino the link please..
link nanny, can you come and make amino link?

Carrick
June 24, 2012 11:04 pm

Amino:

If you want to talk cherry picking, as you did in a comment above where you told me to compare one area of data in BEST to the same area in another data set that’s fine. It is cherry picking. Not sure why you can’t see it is.

My final comment here, unless something interesting enough pops up.
Again, this is not cherry picking… comparing the same latitudes is an apples-to-apples comparison. (This allows us to compare how the algorithms changed the reconstruction, instead of how changes in geographical distribution changed the reconstruction.)

I think you are not open to any real discussion. It is clear you have your beliefs. You are not here to discuss but to tell us why you think people who don’t agree with you are wrong.

I am, actually; anybody who knows me would tell you that. Don't expect me to meekly agree that I'm wrong, though, when you haven't proven it, or (for the last time) to agree to inaccurate characterizations of my views, especially ones like the laundry list you've given above, pulled apparently from thin air.
Bye.

Carrick
June 24, 2012 11:12 pm

Steven, as any astute observer will notice, BEST uses adjusted rather than raw data.
And as anybody who understands the issues with the data would recognize, properly adjusted data will give a more accurate picture of climate than unadjusted data. That doesn't mean we accept the adjustments carte blanche, and of course you haven't: you've spent considerable effort (as has Zeke) understanding the effects of the adjustments on the global reconstructed temperature series.
I did not claim BEST uses raw data; I would not claim that raw data is necessarily a better representation of climate than adjusted data; because it is land only, I would never mistake it for a global temperature series; and, as I've said, while it has some good ideas, its implementation has some details that need to be tested and possibly tweaked.

June 24, 2012 11:20 pm

EM
‘I noted one where I found 144 C for a station in the USA.
Now Steve is happy to say ~’but we can find that 144 C and change it to something valid’. I think “Hmmm… So we catch the 144 C, but do we catch the 40C that ought to be 38 C?””
Wrong. To quote Willis: where did I ever say that?
If I find 144C, that value is DROPPED.
For the 40C that should be 38C? Those errors can be caught as well. There is extensive literature on this, in which temperature series are "corrupted" with bogus values and the algorithm is then tested to see whether it can find the errors.
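As a toy illustration of that kind of benchmark (seeding a series with bogus values and checking whether a simple screen flags them; this is nothing like the actual pairwise algorithm, just the idea):

set.seed(1)
# Synthetic monthly series: seasonal cycle plus noise, 20 years of data
n      <- 240
month  <- ((seq_len(n) - 1) %% 12) + 1
temps  <- 10 + 8 * sin(2 * pi * month / 12) + rnorm(n, sd = 1)

corrupted      <- temps
corrupted[100] <- 144              # the obviously impossible value
corrupted[150] <- temps[150] + 2   # the subtle "40C that should be 38C" case

# Robust screen: flag values far from the same calendar month's median
z <- ave(corrupted, month, FUN = function(x) (x - median(x)) / mad(x))
which(abs(z) > 5)   # flags the 144; the subtle +2 C error is left for neighbour-based checks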

June 24, 2012 11:32 pm

Bruce
“The number 42 says nothing about climate. There is nothing unusual about our current climate. If anything the unusual part is how calm it appears and how small the fluctuations are from month to month.
Climate is chaotic and in many US states the last 5 years aren’t #1 or #2 and in some states it isn’t #3, or #4 either.”
The climate does not exist. "Climate" is a word people use to point to LONG TERM STATISTICS about weather.
"Global warming" is not spatially uniform, and we don't expect it to be spatially uniform. We don't expect the whole globe to warm up at the same time or cool down at the same time.
That is WHY the LIA doesn't end on a given day. That is why it is more severe in some places and less severe in others. You might even find a spot today that for the past 10 years was the same temp as it was in 1850. But generally, over space and time, the planet has been warming.
Is there anything unusual about the warming we have seen? First, that's an ill-posed question. Second, it's really not interesting. The question is: can we explain the warming we see?
"Natural variation" is not an explanation; it's renaming the phenomenon.
"It's the sun!" is an explanation.
"It's the sum of all forcings!" is an explanation.
"It's not unusual" is not an explanation.
"It's aerosols" is an explanation.

June 24, 2012 11:36 pm

Carrick
“Steven, as any astute observer will notice, BEST uses adjusted rather that raw data.”
Ya, I didn't think you had made that claim, and I don't know how Amino got that notion.
Ideally, I'm hoping that people learn to apply a better terminology than "raw data".
The surface temperature folks are adopting the level 0, level 1, etc. terminology,
where level 0 is the "first report", the actual written form if it exists.

June 24, 2012 11:41 pm

here sunshine. contemplate this

June 24, 2012 11:47 pm

sunshine
“And if people do think I tried to hi-jack the thread, I apologize, but my first post on this thread did start with: “(Moderator — feel free to snip, but I think this is relevant)”
So basically you knew that this thread was about a comparison between GHCN v1 and GHCN v3, and you figured that your post was off topic, but you'd try to sneak in something that you've tried to get away with before… posting data that you know is ESTIMATED and not being straightforward about your source.
Nice, Bruce. I also like the way you explained to people that, the first time you tried this, I busted you for using data that was not the QC data.

June 24, 2012 11:52 pm

EM
“Oh, and on Steven’s assertion that the StaionIDs all changed… ”
Once again, please find the quote where I said they all changed. You won't find it, because they don't all change. But enough of them change that you cannot use them as a reliable method for station disambiguation or identification.

June 24, 2012 11:58 pm

vukcevic says:
June 23, 2012 at 1:18 pm
steven mosher says: June 23, 2012 at 7:17 am
You can expect some updates to that Santa Fe chart in the coming months. I suspect folks who do spectral analysis will be very interested.
I will look forward to your results.
I suggest separating the two hemispheres: the South is less volatile (ocean inertia and the CPC flywheel effect), while the North is affected by gyres and is more in sympathy with the GMF.
Till then this is what I get
http://www.vukcevic.talktalk.net/NH+SH.htm
When done I’ll email you magnetic data, so you can have some fun with it
##########################################
I’ll see what I can do to separate the data for you.
Since you seem keen on making new discoveries and sharing your work,
I'll put that at the top of my list.

June 25, 2012 1:09 am

Sunshine
So that everyone can understand your question: you come onto a thread that is about GHCN version 1 (which is used by no one) and GHCN v3,
and ask me,
“But I think I had a legitimate question as to where a -13.7C data point for Malahat came from in the BEST data.”
Well, let's see. As I pointed out to others, there are many sources that BEST uses, and those sources have sources.
In the post you link to, you claim to have looked at two BEST datasets:
The “single value” dataset and the QC dataset
http://sunshinehours.wordpress.com/2012/03/13/auditing-the-latest-best-and-ec-data-for-malahat/
The -13.7C figure is one you found in the single value dataset.
As I explained to you before, when you MISTOOK this data for the QC data, the single-value dataset is the first dataset after the merge. That is, all the sources are combined into a merged dataset, duplicates are identified, and you end up with a dataset that has a single value for every station-month and a source for that data. Hence the name "single value".
So, your question is: how did this data point get into the single-valued dataset?
Answer: the data point was present in one of the sources of data.
And the point you jacked this thread around for was what? BEST reads in all the sources and then compiles a final QC dataset. You pick out a dataset that hasn't been QC'd and ask me what the source is for one month of data?
And you and the moderators and all the other readers think that this is a relevant question?
Now, I've taken my own free time to write software that allows you and anyone else to answer this question. You think that this answer will somehow be relevant to the work that EM Smith and Zeke and I did on GHCN versions 1 and 3? Say what? Relevant how? And then you beat me up because I won't do the work for you?
So, not only do I have to write the software to allow you to do it for yourself, that is not good enough? You want me to fetch that answer for you?
Good god, where is Willis the thread nanny to tell me where my responsibilities start and end.
Download my code.
Download the data.
Use the command readSources().
Find the stations you want.
Here are the potential sources for that particular month:
7973 1 2002.792 3 22 0 0 0 0 0 0 0 0 0 0
7973 1 2002.875 3 22 0 0 0 0 0 0 0 0 0 0
7973 1 2002.958 3 22
The source is source 22.
Now you can read the source descriptions. I wrote a function for that as well.
Here is what it reads: all the sources.
Can you find source 22?
1: US First Order Summary of the Day
2: US Cooperative Summary of the Day
3: Global Historical Climatology Network – Daily
4: Global Summary of the Day
5: Original Manuscript (from USSOD)
11: MAPSO (from USSOD)
13: Unknown / Other (from USSOD)
14: ASOS (from USSOD)
15: US Cooperative Summary of the Day (from GHCN)
17: US Preliminary Cooperative Summary of the Day, keyed from paper (from GHCN)
18: CDMP Cooperative Summary of the Day (from GHCN)
19: ASOS, since 2006 (from GHCN)
20: ASOS, 2000-2005 (from GHCN)
21: US Fort Data (from GHCN)
22: GCOS or other official Government Data (from GHCN)
23: High Plains Regional Climate Center (from GHCN)
24: International Collection, personal contacts (from GHCN)
25: Monthly METAR Extract (from GHCN)
26: Quarantined African Data (from GHCN)
27: NCDC Reference Network / USHCN (from GHCN)
28: Global Summary of the Day (from GHCN)
29: US First Order Summary of the Day (from GHCN)
34: Scientific Committee on Antarctic Research
35: Hadley Centre Data Release
36: US Cooperative Summary of the Month
37: US Historical Climatology Network – Monthly
38: World Monthly Surface Station Climatology
42: Australian data from Australian Bureau of Met (from GHCN)
44: Ukraine update (from GHCN)
46: NCDC: US Cooperative Summary of the Day
47: NCDC: US First Order Summary of the Day
49: NCDC: CDMP Cooperative Summary of the Day
50: NCDC: Undocumented Summary of the Day
51: NCDC: US Cooperative Summary of the Day – Preliminary
52: NCDC: RCC-Preliminary Summary of the Day
53: GSN Monthly Data
54: Monthly Climatic Data of the World
55: GCOS Monthly CLIMAT Summaries
56: Global Historical Climatology Network – Monthly v3
57: Monthly Climatic Data of the World – Preliminary (from GHCN3)
58: GHCN-M v2 – Single Valued Series (from GHCN3)
59: UK Met Office (from GHCN3)
60: Monthly Climatic Data of the World – Final
62: CLIMAT / non-MCDW (from GHCN3)
63: USHCN v2 (from GHCN3)
64: World Weather Records (from GHCN3)
65: GHCN-M v2 multiple series 0 (from GHCN3)
66: GHCN-M v2 multiple series 1 (from GHCN3)
67: GHCN-M v2 multiple series 2 (from GHCN3)
68: GHCN-M v2 multiple series 3 (from GHCN3)
69: GHCN-M v2 multiple series 4 (from GHCN3)
70: GHCN-M v2 multiple series 5 (from GHCN3)
71: GHCN-M v2 multiple series 6 (from GHCN3)
72: GHCN-M v2 multiple series 7 (from GHCN3)
73: GHCN-M v2 multiple series 8 (from GHCN3)
76: World Weather Records
77: Colonial Archive
79: Colonial Era Archive (from GHCN3)
81: USSOD-C transmitted (from GHCN)
82: USSOD-C paper forms (from GHCN)
83: European Climate Assessment (from GHCN)
And the source data is here, on the NCDC site:
ftp://ftp.ncdc.noaa.gov/pub/data/gcos/
Now that phi has got his links to countries other than the US that have done TOBS adjustments, and now that you know the source of December 2002 for Malahat, BC,
do you have any intelligent questions?
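For anyone who would rather script that last lookup than eyeball it, a trivial base-R sketch (the field positions are inferred from the three rows quoted above and may not match the real file exactly):

# One record for Malahat, December 2002, as quoted above; field 5 appears to be the source code
rec <- c(7973, 1, 2002.958, 3, 22)

# A couple of entries from the source table above
source_names <- c("3"  = "Global Historical Climatology Network - Daily",
                  "22" = "GCOS or other official Government Data (from GHCN)")

source_names[as.character(rec[5])]   # returns the description for source 22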

phi
June 25, 2012 1:19 am

I would have much to answer, but given my difficulty with English it is beyond my powers.
Willis, thank you again for your comments; they seem to me quite appropriate.
I will be brief and take only two points that seem important to me in connection with this thread.
1. Steven Mosher claimed that TOBS was the main adjustment made. That remains, despite references, an unproven assertion outside the US. That matters for this thread because TOBS adjustments are the least problematic, and in my interpretation the divergence revealed by EM Smith is driven primarily by station moves.
2. Implicit homogenization in BEST is related to the segment adjustments. It is true that one can disable the scalpel, but as I said earlier, the NMSs aggregate the existing segments; disabling the scalpel only prevents part of the implicit homogenization, and probably the least decisive part.

Steve Richards
June 25, 2012 2:46 am

The benefit of posting links is that readers can read and understand the reason for thinking in a certain way.
If I were to google TOBS etc. and read 10 sources, I still might not have found the correct source/paper that caused a particular comment or thought process.
A link to paper X, 2001 would immediately put all readers of this site in sync with the writer.
It may encourage others to reply with "have you read paper Y, 2009, which contradicts paper X…".
In the race to publish papers and blogs, it is extremely helpful if people are told what sources have informed opinions.

E.M.Smith
Editor
June 25, 2012 3:22 am

As Mosher has his panties in a bunch over my minor variation on First Differences, instead of answering comments here this evening, I re-did the anomaly code to do Classical First Differences and then re-ran it on both v1 and v3 “all data”. Calculated the difference between them, and plotted it on the same chart with my dP or dT method. ( i.e the ‘bridge the gap’ method).
Unfortunately, the difference was not as large as I had hoped. I was expecting more induced error in Classical FD from the gratuitous resets on data dropouts. Either there are fewer of them than I thought, or the average error tends to average out more than I expected ( i.e. random rather than systematic). In any case, not much difference between the two methods.
In recent years, near zero, increasing to about 0.05 C for most of it. At about 1850 (the earliest data used that I’ve found for various climate codes – that being Hadley – GIStemp uses 1880) the difference has expanded to about 0.12 C. I still hope the method can be shown more accurate (and superior) on smaller sets of data, or those with larger dropouts. But at least this ought to answer any doubts about dramatic impact from the change.
Chart of comparison here:
http://chiefio.files.wordpress.com/2012/06/classic-fd-vs-dt-or-dp.png
I’ll come back tomorrow and try to catch up on any comments here, then.
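For readers who have not met the method being argued about: classical First Differences works on year-to-year changes rather than on the temperatures themselves, averaging the changes across stations and then cumulating them. A toy base-R sketch of the idea (not E.M. Smith's code, and with the crudest possible handling of a gap):

# Toy annual means for three stations (rows = years, columns = stations); station b has a gap
temps <- cbind(a = c(10.0, 10.2, 10.1, 10.4, 10.3),
               b = c( 5.0,  5.1,   NA,  5.3,  5.2),
               c = c(20.0, 20.1, 20.2, 20.1, 20.4))

d         <- apply(temps, 2, diff)       # first differences per station (NA around the gap)
mean_diff <- rowMeans(d, na.rm = TRUE)   # average the differences across stations
composite <- cumsum(c(0, mean_diff))     # cumulate to get a composite anomaly series
composite
# How the gaps are handled (reset, drop, or interpolate) is exactly the point being argued here.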

June 25, 2012 4:02 am

Steven Mosher, you said
“willis,……..You yourself are well enough read in this field ( CET and Armagh) to know that the US is not the only record that does TOBS adjustments.”
I don't think that the CET has a specific TOBS adjustment in it; Parker says:
‘Manley took considerable care to compensate his monthly series for changes in observation time. Much of this compensation was implicit, through his use of overlaps between stations to make adjustments for changes of site’
http://www.metoffice.gov.uk/hadobs/hadcet/Parker_etalIJOC1992_dailyCET.pdf

June 25, 2012 5:13 am

Pamela Gray says: “I think station dropout has to do with abandoned stations in less populated areas of the US. “
An interesting speculation. It would be nice if you would study it and try to quantify the effect.
sunshinehours1 says: “Like the average elevation dropping 46 meters from 1940 to 2000?”
If the new lower stations are treated as new stations, this will have no effect because the trends are computed on the anomalies.
If the new lower locations are merged with the data of an older long station, you would remove this effect by homogenization: either when the change is made, by performing several years of parallel measurements as the WMO advises, or, if that was not possible, afterwards by statistical homogenization. This community is funny: one half is against homogenization and wants to use raw data, and the other half points to problems in the raw data which you could remove by homogenization. Whatever the climatologist does, it is wrong.
http://en.wikipedia.org/wiki/Lernaean_Hydra
DocMartyn says: “Well if you make sure all the cooling stations left are inside very closely spaced clusters of warming stations and make sure that the ones you remove are near by the many voids, you make the voids warmer. …you have a smoking gun.”
Yes you could make the effect of non-random station drop out stronger by also taking the density of stations into account. But please quantify this effect before you call it a smoking gun. Currently, I would call it an incense stick, at best.
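A tiny numerical illustration of Victor's first point, that an absolute offset (elevation, siting, whatever) drops out once each station is converted to anomalies about its own mean (toy numbers, base R, nothing to do with any real station):

years   <- 2000:2009
warming <- 0.02 * (years - 2000)        # a common trend of 0.02 C/yr

high_site <- 8  + warming               # cool, high-elevation station
low_site  <- 12 + warming               # warm, low-elevation station, 4 C offset

anom <- function(x) x - mean(x)         # anomaly about each station's own mean
composite <- rowMeans(cbind(anom(high_site), anom(low_site)))

coef(lm(composite ~ years))["years"]    # recovers ~0.02 C/yr; the 4 C offset has no effect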

June 25, 2012 5:40 am

I went to Steve McIntyre’s site and did a search on “Berkeley best”. I hadn’t been to his site in quite a while. I was just curious. He sees problems with BEST also.
http://climateaudit.org/?s=berkeley+best

June 25, 2012 5:47 am

Carrick
If you want to avoid giving the impression that you think BEST and GISTemp are true to real world temps you should word things more carefully than this:
“BEST and GISTEMP get very findings, since they have the largest geographical coverage”
If you think BEST and GISTemp have a bias then come out and say it. Don’t act like you think both can be possible.

A C Osborn
June 25, 2012 5:47 am

I have mentioned this before to Steven Mosher on other sites, he keeps talking about QC datasets and mentions BEST in the same breath.
Perhaps he can explain why so many of their northern-hemisphere stations have average temperatures in mid-winter (Jan/Feb/Mar) as high as or higher than those in mid-summer (June/July/Aug).
I do not know if any of the other QC datasets suffer with the same problem, but I know for certain that BEST does.
Quality Control that cannot identify the Improbable and maybe impossible is not fit for use.

June 25, 2012 5:53 am

Carrick says:
June 24, 2012 at 10:21 pm
“I will claim I never said “BEST is raw untouched data”,”
And I never said you said it either. I said you gave the impression of it. You changed what I said. Now can you see your bias?
More of your dancing in circles.

June 25, 2012 6:12 am

steven mosher says:
June 24, 2012 at 10:56 pm
“Let me see if I can help”
You were not able to help. You went off on a tangent. You digressed. You did not address what I actually said.
I said Carrick gave the impression. I did not say a Google search to find what BEST really is gave the impression. You and Carrick are turning what I said into something I did not say. Odd things happening here.

John Doe
June 25, 2012 7:01 am

steven mosher says:
June 23, 2012 at 6:20 am
“Tobs is the single largest adjustment made to most records. It happens to be a warming adjustment.”
How convenient.

June 25, 2012 7:08 am

Victor: “If the new lower stations are treated as new stations …”
Or they just stopped using the data from some higher elevation stations.
We don’t know. Maybe the people who are creating the GAT should mention it and explain what happened.

June 25, 2012 7:15 am

Mosher, I knew I was comparing SV to QC. I said so on my blog back in March:
“I am comparing the BEST SV (Single Valued) data to the BEST QC (Quality Controlled) data.”
http://sunshinehours.wordpress.com/2012/03/13/auditing-the-latest-best-and-ec-data-for-malahat/
Mosher: “Answer: the data point was present in one of the sources of data.”
Why? It bears no relation to reality.

June 25, 2012 7:24 am

Mosher: “but generally over space and time the planet has been warming.”
… and cooling and warming and cooling …
Then why do you change the topic when we try and find out which parts are cooling now and why?

June 25, 2012 7:28 am

I want to apologize to folks for not actively participating in this thread earlier. I was away at a retreat this weekend, and (unbeknownst to me beforehand) they did not have working wifi… Steve has done a pretty good job addressing concerns, but here are responses to some outstanding comments:
Wayne,
While there is a slightly positive mean bias in the differences, they are relatively small, as upwards of 90% of the differences are zero. Any systematic bias between the two sets would show up in Fig. 4, but as you can see, the difference in century-scale trends is only ~1.5% and well within the error bars for each.
I’m rather swamped this week, but I’ll see if I can do an analysis of differences between the two over time. I’m also trying to get a station_id conversion from NCDC so I’m not just trying to match based on lat/lon.
Bill Illis,
The differences between NCDC's record in GHCN v2 and v3 relate primarily to the adjustments. This post (and analysis) deals solely with unadjusted data.

June 25, 2012 7:53 am

For folks rehashing the station dropout debate, there are a number of data points that should be reassuring:
1) As we discovered when we first examined this, stations that dropped out of GHCN v2 around 1992 had a slightly higher trend pre-1992 than stations that did not drop out (due in part to better sampling of higher latitudes). This is the opposite of what you would expect to see if cooling stations were purposefully dropped.
2) GHCN v3 added in a bunch of new data post-1992 but the record over that period did not change appreciably.
3) Alternative datasets with much greater coverage over that period that experience no decrease in station count (GSOD/ISH, GHCN Daily, Berkeley) show effectively the same results as GHCN.
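The first of those checks is easy to reproduce in miniature. A base-R sketch with synthetic data standing in for GHCN (the column names and the station behaviour are invented for the illustration, with the dropped stations deliberately given the steeper pre-1992 trend):

set.seed(42)
# Synthetic long-format table: 40 stations, half of which stop reporting after 1992.
ghcn <- do.call(rbind, lapply(1:40, function(i) {
  yrs   <- 1950:(if (i <= 20) 1992 else 2010)
  slope <- if (i <= 20) 0.012 else 0.008
  data.frame(id = i, year = yrs,
             anom = slope * (yrs - 1950) + rnorm(length(yrs), sd = 0.3))
}))

pre      <- subset(ghcn, year <= 1992)
kept_ids <- unique(ghcn$id[ghcn$year > 1992])

trend_of <- function(d) unname(coef(lm(anom ~ year, data = d))["year"])
trends   <- sapply(split(pre, pre$id), trend_of)   # pre-1992 trend per station
is_kept  <- names(trends) %in% kept_ids

c(dropped = mean(trends[!is_kept]), kept = mean(trends[is_kept]))  # dropped > kept, by construction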

June 25, 2012 9:34 am

Zeke: “As we discovered when we first examined this, stations that dropped out of GHCN v2 around 1992 had a slightly higher trend pre-1992 ”
What do you mean by trend? 1950-1992?

phi
June 25, 2012 9:34 am

John Doe,
Anyway, this is wrong. TOBS isn't the largest single adjustment. Steven Mosher gave a link that contradicts it: http://sciencelinks.jp/j-east/display.php?id=000020000500A0108818
According to the summary, if you evaluate the effect on trends in average temperatures, you see that it should be about 0.1 °C over the twentieth century. That is absolutely not the dominant cause.

Bill Illis
June 25, 2012 10:08 am

Zeke Hausfather says:
June 25, 2012 at 7:28 am
The differences between NCDC’s record in GHCN v2 and v3 relates primarily to the adjustments. This post (and analysis) deals solely with unadjusted data.
——————-
That's why you have a chart above showing GHCN V3 raw, GHCN V1 raw, NCDC and GISTemp as virtually identical.
NCDC adjusted the record in May 2011, increasing the trend by 0.15C, yet somehow all the raw and adjusted records (all versions through time) are identical. The adjustments go higher, but all the raw and adjusted versions are still identical. That is not physically possible (unless the raw data files were also changed).

June 25, 2012 11:06 am

Bill Illis,
Version 3 adjustments did not increase the trend by 0.15 C. It's more like 0.1 C per century, which is 0.01 C per decade.
See http://rankexploits.com/musings/2010/ghcn-version-3-beta/ and
http://moyhu.blogspot.com/2010/09/beta-version-of-ghcn-v3-is-out.html
The net effect of GHCN adjustments on global temps is rather small compared to the magnitude of the trends.
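The size of that net effect is trivial to check once one has a raw and an adjusted global series side by side; a base-R sketch with made-up series (the ~0.1 C/century offset is built into the toy data, it is not derived from GHCN):

set.seed(7)
years    <- 1900:2010
raw      <- 0.006 * (years - 1900) + rnorm(length(years), sd = 0.1)  # toy "raw" global anomalies
adjusted <- raw + 0.001 * (years - 1900)                             # toy adjustment adding ~0.1 C/century

trend <- function(y) 100 * unname(coef(lm(y ~ years))["years"])      # trend in C per century
c(raw = trend(raw), adjusted = trend(adjusted), effect = trend(adjusted) - trend(raw))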

June 25, 2012 11:15 am

Adjustments in data have other uses besides altering the trend. Adjustments in data can provide new record breaking temps which are useful fodder for headlines. Adjustments in data can also be carefully done in order to bring regional temp. curves into some resemblance to “global temperature averages.”

Reply to  Zeke
June 25, 2012 11:26 am

Zeke
Adjustments in data have other uses besides altering the trend. Adjustments in data can provide new record breaking temps which are useful fodder for headlines. Adjustments in data can also be carefully done in order to bring regional temp. curves into some resemblance to “global temperature averages.”
Still not understanding how you determine what to adjust TO. From everything I’m seeing, you need to have the end in mind in order to make the adjustment, which inherently biases the adjustment in the direction you perceive it should be. Again, why not just dump the bad data instead of adjusting it?

June 25, 2012 11:18 am

Zeke … I love that graph. I call it the 1950 seesaw.
Adjusted data is cooler before 1950 and warmer after 1950. You can see the colors change from bottom to top.
Cool the past … warm the present. Change the trend.

phi
June 25, 2012 11:34 am

All this would be fun if it was not serious. Adjustments based on effective algorithms are systematically important in the regional case studies (about 0.5 ° C for the twentieth century) but conveniently disappear for global comparisons. There is obviously something wrong somewhere. At this point, I think the problem lies in the implicit homogenization. What is not adjusted in the series is in cells with aggregation of segments. It is quite different when the regional averages are obtained by aggregated long time series. In these cases the extent of homogenization is fully visible. That’s the whole point of CruTem when you can dispose of the raw and adjusted data.

Pamela Gray
June 25, 2012 11:36 am

Zeke, you misinterpret my thinking. I am not focused on trying to show that dropped stations had cooled or warmed or did neither (I have no biased thinking along those lines except I think the null hypothesis has not been convincingly proved to be negative regardless of whether or not dropped or current stations have warmed or cooled).
I am focused on what effect ENSO parameters may have had on station drop, and I have several unanswered questions. Was station drop random? Yes or no, and what analysis did you do to consider this? Did you consider geographic ENSO effects? What were the latitude and longitude coordinates of the dropped stations, and how did they sit geographically within the effects of ENSO patterns known to exist? And what were their artifact/degradation parameters? What are the coordinates of the remaining and added stations, and how do they sit geographically within the effects of ENSO patterns known to exist? What are their artifact/degradation parameters?
Why is this possible confluence between sensor location and ENSO parameters important? I know at least part of the answer to this last question, but do you have thoughts on it?

June 25, 2012 11:54 am

Zeke, why do the colors switch in the 1992 graph?
For the most part, blue on top or equal to red until about 1950 and then they switch places with red on top clearly at the end – 1992.

June 25, 2012 12:03 pm

Bill Illis,
Let me clarify a tad. GHCN v3 adjustments (vs. raw) are ~0.1 C per century. They are not an increase from GHCN v2 per se: http://i53.tinypic.com/23l1bb7.png

David
June 25, 2012 12:31 pm

John Doe says:
June 25, 2012 at 7:01 am
steven mosher says:
June 23, 2012 at 6:20 am
“Tobs is the single largest adjustment made to most records. It happens to be a warming adjustment.”
How convenient.
============================================
This does not concern me, except that I have never read an "elevator" summary of this adjustment. It appears logical that old records, which did not automatically record the day's high and low, would, when changed to instruments that always record the high and low, see the high go up and the low go down. The net effect?? But there are apparently other aspects to this???

David
June 25, 2012 12:34 pm

FWIW, the article was a response to EM Smith's blog post, but most of Mosher's comments appear to NOT address his comments. I assure you that one can have a reasonable dialogue with EM Smith, but it may have to start by asking him some questions about what he means.

June 25, 2012 1:14 pm

Zeke, why did you pick 1992 for the cutoff when 1975 was the peak year for stations?
http://rankexploits.com/musings/wp-content/uploads/2010/09/Screen-shot-2010-09-17-at-3.32.09-PM.png

Manfred
June 25, 2012 1:32 pm

Zeke Hausfather says:
June 25, 2012 at 11:06 am
Bill Illis,
Version 3 adjustments did not increase the trend 0.15 C. Its more like 0.1 C per century, which is 0.01 C per decade.
See http://rankexploits.com/musings/2010/ghcn-version-3-beta/ and
http://moyhu.blogspot.com/2010/09/beta-version-of-ghcn-v3-is-out.html
The net effect of GHCN adjustments on global temps is rather small compared to the magnitude of the trends.
————————————–
Nobody would care if this were a singular adjustment with an additional warming effect. But it appears to be a continuous process of adjustments all in the same direction, which is statistically very unlikely.
On top of that, points criticized elsewhere, such as Mennian warming and UHI warming, are still unresolved.
Land temperatures increasingly deviate to the upside from sea surface temperatures.
Land temperature trends are not compatible with satellite observations, and their trend is much too high.
The recent after-1940s sea surface adjustment in HadSST4 is an appalling joke.

gallopingcamel
June 25, 2012 3:31 pm

Zeke & Steve,
Zeke mentioned the issue of “Station Drop Off”.
Earlier in the comments, Amino Acids in Meteorites (June 23, 2012 at 2:13 am) linked to videos that seem to say that, at least for Russia, it makes little difference how many weather stations you use as long as you choose them carefully. The presenter shows how the plot of temperature against time looks with 4 stations, 37 stations, 121 stations and 468 stations. The plots agree closely.
Which of these explanations best fits the facts:
AA – It really does not matter how many weather stations there are.
BB – Someone is selecting stations carefully to get plots that match a CAGW hypothesis.
I ask the question in all humility, having failed to figure it out for myself.

Bill Illis
June 25, 2012 5:41 pm

You can compare each individual month between Version 2 and Version 3 produced by the NCDC here.
http://www.ncdc.noaa.gov/ghcnm/time-series/
I would prefer if they would allow one to chart every month on one graph but the NCDC doesn’t like to give much away. You will get a chart that looks like this one for April. Green is Version 2 and Blue is Version 3.
http://img401.imageshack.us/img401/971/multigraph.png
While they look similar, the 1920s and 1930s are cooled by 0.1C and the recent periods are warmed by 0.05C in a systematic pattern (all months show this).

E.M.Smith
Editor
June 25, 2012 5:58 pm

@GallopingCamel:
It can be both… (that is, it can be an inclusive "or" as well as an exclusive "or"…)
IMHO, all it "really takes" to know if we are warming or cooling is something on the order of 3 or 4 stations per continent, as long as they are long-lived stations with little change of location, instruments, and surroundings. TonyB (IIRC) has looked at that and found a fair number of stable, long-lived thermometers. They show cyclical warming and cooling and a climb out of the LIA, and not much else. (Uppsala, Sweden has a very long-term graph and is a well-tended instrument:
http://www.smhi.se/sgn0102/n0205/upps_www.pdf ; it shows 1710-30 as roughly the same as now.) The writings of Linnaeus have some detractors who claim he could not grow the plants he said he grew in the cold climate of Sweden, but when he was around, the graph shows it was not so cold… so there is "anecdotal evidence" for the curve, and a curve that supports the written record.
One of the first findings I had was that long-lived records showed no warming, while all the warming "signal" was concentrated in short-lived segments (largely taking off in 1987-1990, on a change of modification flag / duplicate number in GHCN v2). So several folks have found the same thing: warmer in the past, very cold in the "1800 and Froze To Death / The Year Without A Summer" era, then back to about the same warm level now as in 1710-1730.
At the same time, that Russian video that found all the curves matching went on to show that they all had UHI issues and that this accounted for a large part of the warming (the rest, IMHO, is that "dip" between 1700 and 1900).
So if you do a lousy job of correcting UHI, start your data in the middle 1800s, and then leave out the non-volatile stations (or swamp them with enough 'averaged in' stations at airports and other rapidly heating places), "all your curves will match" and they "all will show warming". That is also exactly what is done with codes like GIStemp and HadCRUT / CRUTEM. So "selecting to produce warming". The only part that is not knowable is "deliberate or accidental?". It is not possible to attribute motive. (It could simply be that "coverage" is "enough" in someone's mind in the mid 1800s, so that's when they start; and that they think airports are rural, as the "population" at an airport is typically zero on census figures, and many turn off their lights at night or are surrounded by bare dirt – hotter than grass, btw – away from the city lights, so darker than the urban core.)
We've seen that UHI is very poorly fixed (and even very poorly demonstrated by all the folks who try doing averages based on the GHCN and similar "rural" classifications), even though anyone with bare feet and no hat can tell you it is one heck of a lot hotter on the flight line (where the thermometers are at airports, near the runway) than in the shady forest nearby. So endless "proof" that UHI is insignificant is produced by folks who simply trust the metadata when the metadata tells lies. (THE largest US Marine Air Field is classed as "rural" despite having tens of thousands of Marines and huge numbers of flights and being the "Crossroads of the Marines"… The barracks do not count as "population" and they do night ops with the lights off, so it is a dark place. Just hot… And it is in GHCN.) So are the folks showing "No UHI, move along, move along" being malicious or just too dumb to see their error? No way to tell. But IIRC it was Tempelhof that started as a grass parade field in the 1700s, was a global jet port (used during the Berlin Airlift), and has now been converted to an urban shopping mall. It is much warmer as hectares of concrete and tarmac than when it was a grass field… And it is in the GHCN. (It was still an A for Airstation last I looked, a few years after conversion to a mall…)
So could the data be selected to match the CAGW hypothesis? Certainly. In fact, it might even be "by definition", given all the airports in the record… and using airports was a conscious act. (The Pacific Ocean coverage is soon to be 100% airports… One interesting sidelight: I found a 'dip' in the Pacific temperatures that matched the 1970s major recession on the Arab Oil Embargo… Just sayin'….) But that does not tell you if the selection matches the result to CAGW by design, or as an accidental consequence that led to the theory being made…
In short: It doesn’t matter how many stations you have; as good clean long lived non-UHI stations show no warming and Airports show lots of warming. Any collection of either shows about the same pattern, just two different patterns between them. We get “Warming” in the aggregate due to there being very few long lived non-UHI stations and a whole lot more Airports, especially recently, in the average of the two ‘trends’.
@David:
I’m happy to have them arguing the merits of various orthogonal things, less stuff for me to deal with that way 😉
Don't know that there are many questions to ask / answer. The methods, data, code, etc. are all public. It does what it does. I've described what it does. Don't see much I can add after that… And questions like "what software was used to download the ftp dataset", held up as a 'flaw' on my part (for not 'publishing it'), are really just silly. The answer is "whatever ftp software you like" and "from the NCDC data repository"… so fewer of those makes my day easier.
Heck, I'm still trying to catch up with comments (and won't get to them all now, just have a moment for a couple of quick ones)… so fewer for me to answer is a feature right now 😉
Heck, nobody even said much about the posted "not much difference between FD and dT" graph, and that was a good solid day of work to recreate the whole run with Classical FD and put it on a graph with the dT results. At a day per significant question (even if I think the difference between FD and dT is insignificant) it can take a lot to deal with questions… So most of a day and night of work gets the sound of crickets in response… I guess that's a good thing.
BTW, given how Mosher is all hot and bothered about FD having been found to have limitations, one must ask: "Have all the climate science papers that use it been withdrawn now?" Last I looked, it was used in several of the papers that justify many of the tools and adjustments in use. It was in some paper that underlay the RSM IIRC (but it was a while ago I read that batch… so not sure just which ones…)
Finally, yes, one can have a reasonable conversation with me. BUT, I generally ignore folks who rant, play “Gotcha! Games”, and are generally Trollish. Or are just too grumpy. “Life is too short to drink bad wine.” and some folks are just a keg of vinegar… I’m a softy, though, so anyone with a clear and honest need for help usually will get anything I can provide.
At any rate, that’s all I have time for at the moment. Still catching up after a night of no sleep and time to make dinner… But now I have the code written to do some nice A/B compares of Classical FD and dT to measure strengths and weaknesses. I’m still pretty sure mine is demonstrably better; now I just need to go demonstrate 😉

June 25, 2012 6:18 pm

Bill Illis: “You can compare each individual month between Version 2 and Version 3”
The classic seesaw around 1950. Blue version 3 colder until 1950 and then warmer after 1950.

June 25, 2012 9:19 pm

E.M.Smith
I, for one, do appreciate your comments.
And no one can say you shy away from replying. Also, no one can say your posts that make it into WUWT are poorly thought out and presented.

Venter
June 25, 2012 9:51 pm

Thanks, Chiefio,
Your explanations have been simple, concise and clear for anyone with open eyes and an open mind to see.
The verbal semantics of the "experts" here did not address your post; they are discussing things tangential to what your post was about. In effect this post, as a rebuttal to your post, does not address any of the key issues in your post and dances around unnecessary fringe arguments. Effectively the response is a strawman copout.

June 26, 2012 1:44 am

Mike (ChiefIO),
Firstly, thank you for the compliment of confusing me with TonyB. Tony has done superb work in highlighting just how ignorant of history our current (as opposed to Hubert Lamb's) generation of so-called climate scientists is. I'm proud to have my name mentioned in the same post as his any day.
As you know, some of us (former non-questioners of CAGW) have been around a long time in the climate debate. In my case, I first started my (now) skeptical-of-CAGW education with John Daly, thanks to Tim Lambert calling John Brignell of Numberwatch a 'crank'. I am what I call a ClimateAudit 'lifer', in that I've been following Steve Mc's threads (with John A's assistance) on CA almost from its birth. Consequently, I probably know 'Moshpit' better than most, and I've certainly had many a debate with him (and his fellow (luke)warmista mates like Zeke H, Nick S, Carrick and Ron B) over at Lucia's Blackboard.
Moshpit often gives people the impression that he is an accomplished software developer, when in fact, if you seek him out on LinkedIn, you see that he is much more of a software project manager. He has nothing like the software development experience (particularly with complex enterprise-level IT systems) that you and I have, Mike. FWIW, he is, however, an accomplished R programmer with a self-proclaimed long memory. Along with Zeke, Ron and the man with a reversed well-known sunglass brand name, he has done lots of what I call 'helicopter'-level analysis of the various temperature record datasets (in particular USHCN and GHCN), but has done very little, compared to you and me and Verity and TonyB (to name only a few), of the 'ground'-level forensic analysis of the individual station records.
Moshpit (Steve Mosher) is a firm believer that by 'anomalising' and 'gridding' the data you can cure it of all its problems (like its lack of spatial/temporal coverage, missing data, etc.) and come up with a mean global surface temperature that demonstrates that man is having a significant effect on our planet's climate through our continued emissions of CO2. IMO, along with many other 'warmers', it is he who is the real 'denier' and not us skeptics. He is a denier of what is largely if not wholly overwhelming historical evidence of cyclic, spatially varying, significant multi-decadal natural climatic variability within our planet's climate. I particularly get annoyed when this significant variation in temperature at any given weather station is referred to as 'noise', and when analysts like Moshpit are content to 'adjust it' and/or average it out.
Based on the historical evidence available, do I believe that our planet has warmed in the last hundred years or so? IMO, yes it has. Has man been the primary cause of that warming? Possibly yes, but not because of our emissions of an odourless, tasteless trace gas essential to all life on our planet; rather, more likely due to the effect we have on temperature monitoring instruments. Should we be concerned about this 'warming'? Most definitely not, as common sense should dictate that we do a better (less sloppy) job of observing our planet and fully account for any changes we observe within it, before jumping to the conclusion that they are man-made.
KevinUK

gallopingcamel
June 26, 2012 9:52 pm

Kevin UK writes:
“…..historical evidence of cyclic spatially varying significant multi-decadal natural climatic variability within our planet’s climate.”
I hope he means what I think he does. Historically (reasonably reliable for the last 3,000 years, but with diminishing credibility before that), the climate has varied. Warm periods have been characterized by the rise of empires (Egyptian, Minoan, Roman, etc.); the cold periods have corresponded with incredible human suffering and depopulation. While historical climate variations have been quite dramatic, one cannot attribute them to CO2, which varied very little until the last 70 years.
Given that climate has such huge consequences for our species, it seems strange to witness a busload of statisticians (Tamino, Zeke, RomanM, JeffId, McIntyre and many, many more) who can't agree about the global temperature since Fahrenheit invented his thermometer in 1714.
Even stranger is the fact that Michael Mann with his tree-mometers that defy history is given a moment’s consideration by “Scientists”.

Arno Arrak
June 27, 2012 3:48 pm

It is possible to compare these data with satellite-measured temperatures from 1979 on. When you do that, you discover that all of the temperature values after 1980 have been given a false upward trend that does not exist in the real world. You can see this falsification with the naked eye even on the small-scale graph of Figure 4. All you need to do to see this fakery is to locate the super El Nino of 1998. It is the high peak at 2000 in their graph. On both sides of it are two V notches. They are La Ninas that on the satellite record are even, but in this graph the right-hand notch is one third of a degree higher than the left one. This is a fake increase of temperature of 0.3 degrees in two years, or 15 degrees per century. If you then look to the right of it you see that there are two peaks higher than the super El Nino of 1998. They are a spurious peak at 2007 and the El Nino of 2010. According to satellite records they are both 0.2 degrees lower than the super El Nino, but here they are shown 0.1 degree higher instead. That is the same phony 0.3-degree boost we observed with the two La Nina periods, applied across the remaining twenty-first century. Since we are dealing with digital records, it is clear that this kind of bias has to be built into the computer program whose output we are looking at. The technique of the Big Lie, first introduced in Mein Kampf, is alive and well in the climate change world. It is a colossal fraud and must be stopped and investigated. In the meantime, the only believable global temperature values are those produced by satellite temperature measurements. I suggest that only satellite temperatures should be used when global temperature values are needed. There are two sources: UAH and RSS. They are slightly different, but both can serve as an approximation to a real global temperature that can be believed in.

E.M.Smith
Editor
July 1, 2012 10:34 pm

@Amino Acids in Meteorites:
Thanks! I try.
@Venter:
Thanks! I try to be clear. It isn’t hard to translate opaque jargon to straight language. I usually find the exercise clears a lot of fuzzy thinking from the bafflegab in the process 😉
@KevinUK:
You are being very kind.
IMHO, if you don’t know what is happening at the individual station level, you don’t know what’s happening. That is why I started with just looking AT temperatures. For individual stations and for aggregations. (For which I took a lot of heat from folks accusing me of not knowing to do anomalies when my purpose was to ‘measure the data’ not find a GAT.)
I first got that habit back in my old FORTRAN IV class. We were deliberately fed crap data to burn into our brains that checking the data was STEP ONE. If you don't know your data, you don't know crap… So I started with looking AT the temperature data. Never thinking anyone would expect that to be the last step… or toss rocks over it. That was where I first started seeing "Odd Things", like some going up and others going down, and some months rising while others fall for the same instrument… Just not 'generalized warming'…
So it is a long slow incremental process of building from the lowest level to the highest, one brick at a time.
IMHO, the major fault of all the codes I’ve seen from “Climate Scientists” has been to assume they have a working theory and good data and run with it. Starting from the “helicopter view” and building backwards to justifications. I start with the data and ask it what it has to say. No theory, just open ears and eyes… Later, after it has spoken, I might come up with a theory (like that ’87-90 point where the Duplicate Number changes and everything takes a 1/2 C jump at just that transition… my theory is equipment and processing changes with onset then. Because THAT is what the individual data look like and what the aggregates look like.)
Ah, yes, the “Anomaly Grid / Box Magic Sauce” cure all… If only they DID do anomalies prior to doing all the adjusting and homogenizing et. al. If only they DID keep thermometers in their own grid /box and not smear them out 1200 km to 3600 km. Theory, meet reality…
@GallopingCamel:
I think he’s saying there are a lot of natural cycles going on, some very long.
As GAT is a ‘polite fiction’ it will be subject to which bits of fiction one chooses to use and how the story is written… so not that surprised that each author gets a different story 😉
And don’t get me started on Treemometers… and where the bear does what bears do when a bear does his do do… (Nitrogen transport from salmon runs via bears is the largest factor in fertilizing in the Pacific North West IIRC the paper. A “favorite tree” will grow more than one less suited to bear attracting…)
@Arno Arrak:
Interesting… I'd been all set to say that the satellite record didn't overlap enough to be useful (all of 12 years, I think), but that 'peak matching' says otherwise… Not looking at trend lines, looking at individual relative data point positions. Hmmm…. Good catch!
FWIW, I’ve done a bit longer evaluation of the use of First Differences here:
http://chiefio.wordpress.com/2012/06/26/wip-on-first-differences/
which is more in-depth, but confirms what I’d said earlier: it is the right tool for what I want to do, which is to compare two sets of data over very long-term trends, with minimal dependency on any given time period (no “baseline” given excess weight), and with anomaly creation as the very first step.
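For readers unfamiliar with the method, here is a minimal sketch of the classical First Difference idea (average the year-to-year differences across whatever stations report in both years, then cumulate the averaged differences into a composite series). The data layout, function name, and toy station values below are hypothetical illustrations, not E.M. Smith’s actual code:

```python
# Minimal sketch of the classical First Difference method for combining
# station records. Illustrative only; station data are hypothetical.
import numpy as np

def first_difference_composite(stations, years):
    """Average year-to-year differences across stations, then cumulate."""
    composite = [0.0]                          # anchor the series at zero
    for y0, y1 in zip(years[:-1], years[1:]):
        # use only stations that report in both years of the pair
        diffs = [rec[y1] - rec[y0]
                 for rec in stations.values()
                 if y0 in rec and y1 in rec]
        step = float(np.mean(diffs)) if diffs else 0.0
        composite.append(composite[-1] + step)
    return np.array(composite)                 # relative (anomaly-like) series

# Hypothetical toy data: {station_id: {year: annual mean temperature}}
stations = {
    "A": {1950: 10.1, 1951: 10.3, 1952: 10.2, 1953: 10.4},
    "B": {1951: 8.0, 1952: 7.9, 1953: 8.2},
}
print(first_difference_composite(stations, [1950, 1951, 1952, 1953]))
```

Since each yearly step uses only the stations reporting in both years, no reference period gets extra weight, which is the property being argued for above.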

Which raises the rather amusing question of “Have all those papers been withdrawn for using First Differences? Hmmmmm?”. Perhaps those authors will wish to ‘have a conversation’ with Steven Mosher about his claims and decide how best to withdraw THEIR works. Right after that, I’ll consider it… /sarcoff>;
Near as I can tell, FD is still used, the papers not withdrawn, and it is just that the limitations of the method and any quirks it might have ought to be kept in mind when you use it (rather like all methods of doing things…)
[…]
http://climateaudit.org/2010/08/19/the-first-difference-method/

Changes that are likely to cause a level shift in the series, such as a TOBS or equipment change or a station move, should simply be treated as the closing of the old station and the creation of a new one, thereby eliminating the need for the arcane TOBS adjustment program or a one-size-fits-all MMTS adjustment.

Missing observations may simply be interpolated for the purposes of computing first differences (thereby splitting the 2 or more year observed difference into 2 or more equal interpolated differences). When these differences are averaged into the composite differences and then cumulated, the valuable information the station has for the long-run change in temperature will be preserved.

So closing a series and opening a new one, as in classical FD, would in fact HIDE exactly the thing I’m trying to measure: how much impact comes from those changes. It also looks like interpolation as a gap-spanning technique is acceptable and gives BETTER long-term trends. The only difference between interpolation and what I do is that I put the ‘span’ into the exact date where the temperature shows up again, preserving the actual structure of the data. The trend ought to be unaffected.
Again, my stated purpose was exactly that: to find a better long-term trend representation in the data and NOT to lose that trend due to data drop-outs.
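As a toy illustration of the two gap treatments being compared here (hypothetical numbers, one station, one missing year), both the equal-split interpolation and the ‘span the gap’ approach recover the same end-to-end change; only the path through the gap differs:

```python
# Sketch contrasting two ways of handling a data gap in first differences.
# Hypothetical annual values for a single station with 2002 missing.
import numpy as np

temps = {2000: 12.0, 2001: 12.3, 2003: 12.9, 2004: 13.0}

# (a) Interpolation: split the 2-year observed difference (2001 -> 2003)
#     into two equal yearly differences.
gap_diff = temps[2003] - temps[2001]              # 0.6 over two years
diffs_interp = [temps[2001] - temps[2000], gap_diff / 2, gap_diff / 2,
                temps[2004] - temps[2003]]

# (b) Gap-spanning: put the whole 0.6 difference at the date the record
#     resumes (2003); the missing year is shown as a zero step here purely
#     so the two series have the same length.
diffs_span = [temps[2001] - temps[2000], 0.0, gap_diff,
              temps[2004] - temps[2003]]

print(np.cumsum(diffs_interp))   # ~[0.3 0.6 0.9 1.0]
print(np.cumsum(diffs_span))     # ~[0.3 0.3 0.9 1.0]
# Both end at +1.0 C relative to 2000, so the long-run change is preserved.
```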
There is further discussion of CAM methods and the need to toss out data from short segments and to use a baseline (thus giving the stations used in forming that baseline set added influence in the results), all of which I specifically set out to avoid. And succeeded at avoiding.

And is the suggested ‘better’ method (“Plan B”) free of all issues?

As stated, it gives equal weights to all stations. But estimating it by GLS with an appropriate covariance matrix would be straightforward.
One small drawback is that adding new data can change earlier estimates in the combined series because the latest values will add new information on station differences. However, these differences will generally be relatively small.
True — but in live time this just means that you have to settle on a set of stations (with at least 10 years or so of readings), compute offsets, and then go with that formula for several years. 5 or 10 years later, you come out with Version 2 of your index, with new stations added and slightly modified offsets for the old stations.
And so on.
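As a rough illustration of the screen-and-offset recipe quoted just above, a much-simplified version might look like the sketch below. This is a plain offset average for illustration only, not the GLS “Plan B” itself; the data-frame layout, function name, toy values, and screening threshold are all hypothetical:

```python
# Simplified offset-style combination: screen out short records, estimate a
# fixed offset per station against a provisional composite, then average.
import numpy as np
import pandas as pd

def combine_with_offsets(df, min_years=10):
    """df: rows = years, columns = stations, values = annual mean temps."""
    df = df.loc[:, df.count() >= min_years]       # drop short records
    composite = df.mean(axis=1)                   # provisional composite
    offsets = df.sub(composite, axis=0).mean()    # per-station mean offset
    return df.sub(offsets, axis=1).mean(axis=1)   # offset-adjusted average

# Hypothetical toy frame: 5 years x 3 stations (NaN = missing value).
data = pd.DataFrame(
    {"A": [10.0, 10.2, 10.1, 10.4, 10.3],
     "B": [8.0, 8.1, np.nan, 8.3, 8.4],
     "C": [np.nan, np.nan, 12.0, 12.1, np.nan]},
    index=[2000, 2001, 2002, 2003, 2004])
print(combine_with_offsets(data, min_years=4))    # station C is screened out
```

Note that the screening step is exactly the part objected to below: any station shorter than the threshold contributes nothing to the combined series.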

Now that doesn’t sound so good if your goal is absolutely minimal changes to the data used and you wish to use EVERY data item in the set that it is possible to use. “Measuring the data” is NOT the same as “making up a number that I think matches reality the best”, especially when it comes to rules like “must have 10 years of data” or the station doesn’t get used as part of the comparison base set.
That, BTW, is one of my complaints about the various CAM-based methods. Extra weight is given to some stations that are held to have special merit due to a particular length of coverage in a particular span of time. That presents opportunities for those data to have a “special effect”, and for changes in a given instrument during those years to have greater impact on the comparison.

So, having gone back and revisited the reasoning that led to my choice of First Differences and a decision to ‘span the gap’, I find that I’m quite happy with it. It specifically avoids the issues of CAM, and it does not hide the impact of short segments, missing data, or the actual long-term trends in the data. It does not ‘leave out’ some stations and does not give some stations more impact than others.
In short, it does just what I wanted it to do: let me compare two sets of data (all of the data) directly and see how they differ from each other. NOT find a hypothetical “Best Global Average Temperature”. NOT create some “polite fiction” based on a theory. Just comparing the two sets of data using ALL the data, minimally changed.