Much Ado About Very Little
Guest post by Zeke Hausfather and Steve Mosher
E.M. Smith has claimed (see full post here: Summary Report on v1 vs v3 GHCN) to find numerous differences between GHCN version 1 and version 3, differences that, in his words, constitute “a degree of shift of the input data of roughly the same order of scale as the reputed Global Warming”. His analysis is flawed, however, as the raw data in GHCN v1 and v3 are nearly identical, and trends in the globally gridded raw data for both are effectively the same as those found in the published NCDC and GISTemp land records.
Figure 1: Comparison of station-months of data over time between GHCN v1 and GHCN v3.
First, a little background on the Global Historical Climatology Network (GHCN). GHCN was created in the late 1980s after a large effort by the World Meteorological Organization (WMO) to collect all available temperature data from member countries. Many of these were in the form of logbooks or other non-digital records (this being the 1980s), and many man-hours were required to process them into a digital form.
Meanwhile, the WMO set up a process to automate the submission of data going forward, setting up a network of around 1,200 geographically distributed stations that would provide monthly updates via CLIMAT reports. Periodically NCDC undertakes efforts to collect more historical monthly data not submitted via CLIMAT reports, and more recently has set up a daily product with automated updates from tens of thousands of stations (GHCN-Daily). This structure of GHCN as a periodically updated retroactive compilation with a subset of automatically reporting stations has in the past led to some confusion over “station die-offs”.
GHCN has gone through three major iterations. V1 was released in 1992 and included around 6,000 stations with only mean temperatures available and no adjustments or homogenization. Version 2 was released in 1997 and added a number of new stations, minimum and maximum temperatures, and manually homogenized data. V3 was released last year and added many new stations (both in the distant past and post-1992, where Version 2 showed a sharp drop-off in available records), and switched the homogenization process to the Menne and Williams Pairwise Homogenization Algorithm (PHA) previously used in USHCN. Figure 1, above, shows the number of station records available for each month in GHCN v1 and v3.
We can perform a number of tests to see if GHCN v1 and v3 differ. The simplest one is to compare the observations in both data files for the same stations. This is somewhat complicated by the fact that station identity numbers changed between v1 and v3, and we have been unable to locate a translation table between the two. We can, however, match stations between the two sets using their latitude and longitude coordinates. This gives us 1,267,763 station-months of data whose stations match between the two sets with a precision of two decimal places.
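The matching and differencing step is easy to sketch. The following is a rough Python illustration of the approach just described; it is not the STATA code linked at the end of this post, and the file names and column layout are assumptions for the example:

```python
import pandas as pd

# Hypothetical flat files with one row per station-month; the real GHCN
# files are fixed-width, so a parsing step is assumed here.
v1 = pd.read_csv("ghcn_v1_monthly.csv")  # columns: id, lat, lon, year, month, temp
v3 = pd.read_csv("ghcn_v3_monthly.csv")  # same layout, different station ids

# Station ids differ between versions, so match on coordinates
# rounded to two decimal places instead.
for df in (v1, v3):
    df["lat2"] = df["lat"].round(2)
    df["lon2"] = df["lon"].round(2)

# (stations sharing identical rounded coordinates are not treated specially
# in this sketch)
matched = v1.merge(v3, on=["lat2", "lon2", "year", "month"],
                   suffixes=("_v1", "_v3"))

# Difference of matched observations (the distribution behind Figures 2 and 3).
diff = matched["temp_v3"] - matched["temp_v1"]
print(len(matched), "matched station-months")
print("share of identical observations:", (diff == 0).mean())
print(diff[diff != 0].describe())
```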
When we calculate the difference between the two sets and plot the distribution, we get Figure 2, below:
Figure 2: Difference between GHCN v1 and GHCN v3 records matched by station lat/lon.
The vast majority of observations are identical between GHCN v1 and v3. If we exclude identical observations and just look at the distribution of non-zero differences, we get Figure 3:
Figure 3: Difference between GHCN v1 and GHCN v3 records matched by station lat/lon, excluding cases of zero difference.
This shows that while the raw data in GHCN v1 and v3 is not identical (at least via this method of station matching), there is little bias in the mean. Differences between the two might be explained by the resolution of duplicate measurements in the same location (called imods in GHCN version 2), by updates to the data from various national MET offices, or by refinements in station lat/lon over time.
Another way to test whether GHCN v1 and GHCN v3 differ is to convert the data of each into anomalies (with baseline years of 1960-1989 chosen to maximize overlap in the common anomaly period), assign each station to a 5-degree by 5-degree lat/lon grid cell, average the anomalies in each grid cell, and create a land-area weighted global temperature estimate. This is similar to the method that NCDC uses in their reconstruction.
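In code, that gridding recipe looks roughly like the sketch below (again a Python illustration rather than the actual STATA analysis; the data layout and the cosine-latitude area weighting are assumptions):

```python
import numpy as np
import pandas as pd

def gridded_land_anomaly(df):
    """df: rows of station, lat, lon, year, month, temp.
    Returns an annual global land anomaly series (simplified sketch)."""
    # 1. Anomalies against each station's 1960-1989 monthly means.
    base = df[df["year"].between(1960, 1989)]
    clim = base.groupby(["station", "month"])["temp"].mean().rename("clim")
    df = df.join(clim, on=["station", "month"])
    df["anom"] = df["temp"] - df["clim"]

    # 2. Assign each station to a 5x5 degree grid cell.
    df["latcell"] = np.floor(df["lat"] / 5) * 5 + 2.5
    df["loncell"] = np.floor(df["lon"] / 5) * 5 + 2.5

    # 3. Average within each cell, then area-weight cells by cos(latitude).
    cells = (df.groupby(["year", "month", "latcell", "loncell"])["anom"]
               .mean().reset_index())
    cells["w"] = np.cos(np.deg2rad(cells["latcell"]))
    monthly = cells.groupby(["year", "month"]).apply(
        lambda g: np.average(g["anom"], weights=g["w"]))
    return monthly.groupby(level="year").mean()
```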
Figure 4: Comparison of GHCN v1 and GHCN v3 spatially gridded anomalies. Note that GHCN v1 ends in 1990 because that is the last year of available data.
When we do this for both GHCN v1 and GHCN v3 raw data, we get the figure above. While we would expect some differences simply because GHCN v3 includes a number of stations not included in GHCN v1, the similarities are pretty remarkable. Over the century scale the trends in the two are nearly identical. This differs significantly from the picture painted by E.M. Smith; indeed, instead of the shift in input data being equivalent to 50% of the trend, as he suggests, we see that differences amount to a mere 1.5% difference in trend.
Now, astute skeptics might agree with me that the raw data files are, if not identical, overwhelmingly similar, but point out that there is one difference I did not address: GHCN v1 had only raw data with no adjustments, while GHCN v3 has both adjusted and raw versions. Perhaps the warming that E.M. Smith attributed to changes in input data might in fact be due to changes in adjustment method?
This is not the case, as GHCN v3 adjustments have little impact on the global-scale trend vis-à-vis the raw data. We can see this in Figure 5 below, where both GHCN v1 and GHCN v3 are compared to published NCDC and GISTemp land records:
Figure 5: Comparison of GHCN v1 and GHCN v3 spatially gridded anomalies with NCDC and GISTemp published land reconstructions.
If we look at the trends over the 1880-1990 period, we find that both GHCN v1 and GHCN v3 are quite similar, and lie between the trends shown in GISTemp and NCDC records.
1880-1990 trends (°C per decade):
GHCN v1 raw: 0.04845 C (0.03661 to 0.06024)
GHCN v3 raw: 0.04919 C (0.03737 to 0.06100)
NCDC adjusted: 0.05394 C (0.04418 to 0.06370)
GISTemp adjusted: 0.04676 C (0.03620 to 0.05731)
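For reference, trends like those above can be estimated by an ordinary least squares fit to the annual gridded anomaly series, with the parenthesized ranges read as confidence intervals on the slope. The sketch below is a rough Python illustration of that calculation; it is not the STATA code linked at the end of the post, and the exact interval method used there is an assumption:

```python
import numpy as np
from scipy import stats

def decadal_trend(years, anoms, start=1880, end=1990):
    """OLS slope in degrees per decade with an approximate 95% interval."""
    years = np.asarray(years)
    anoms = np.asarray(anoms)
    keep = (years >= start) & (years <= end)
    res = stats.linregress(years[keep] / 10.0, anoms[keep])
    # two-sided 95% interval from the standard error of the slope
    t = stats.t.ppf(0.975, keep.sum() - 2)
    return res.slope, res.slope - t * res.stderr, res.slope + t * res.stderr
```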
This analysis should make it abundantly clear that the change in raw input data (if any) between GHCN version 1 and GHCN version 3 had little to no effect on global temperature trends. The exact cause of Smith’s mistaken conclusion is unknown; however, a review of his code does indicate a few areas that seem problematic. They are:
1. An apparent reliance on station IDs to match stations. Station IDs can differ between versions of GHCN.
2. Use of First Differences. Smith uses first differences, but he has made idiosyncratic changes to the method, especially in cases where there are temporal lacunae in the data. The method, formerly used by NCDC, has known issues and biases, detailed by Jeff Id. Smith’s implementation and his handling of gaps in the data are unproven and may be the cause; a sketch of the classical method appears after this list.
3. It’s unclear from the code which version of GHCN V3 Smith used.
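For readers unfamiliar with the method: classical first differences builds a station series by summing year-to-year changes, and starts a fresh segment whenever a gap interrupts the record. The following is a toy Python illustration of that classical behaviour only; it is not Smith’s code, nor the exact NCDC implementation:

```python
def classical_first_differences(years, temps, max_gap=1):
    """Classical first-difference reconstruction for a single station.

    years, temps: the available annual observations in order (gaps allowed).
    When consecutive observations are more than max_gap years apart the
    running sum restarts, i.e. no offset is carried across the gap.
    Toy sketch for illustration only."""
    series = {years[0]: 0.0}
    total = 0.0
    for i in range(1, len(years)):
        if years[i] - years[i - 1] > max_gap:
            total = 0.0          # gap: start a fresh segment
        else:
            total += temps[i] - temps[i - 1]
        series[years[i]] = total
    return series
```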
STATA code and data used in creating the figures in this post can be found here: https://www.dropbox.com/sh/b9rz83cu7ds9lq8/IKUGoHk5qc
Playing around with it is strongly encouraged for those interested.
Zeke Hausfather says:
June 25, 2012 at 7:28 am
The differences between NCDC’s record in GHCN v2 and v3 relate primarily to the adjustments. This post (and analysis) deals solely with unadjusted data.
——————-
That’s why you have a chart above showing that GHCN V3 Raw, GHCN V1 Raw, NCDC and GISTemp are virtually identical.
NCDC adjusted the record in May 2011, increasing the trend by 0.15C, yet somehow all the Raw and adjusted records (all versions through time) are identical. The adjustments go higher but all the Raw and adjusted versions are still identical. Not physically possible (unless the Raw data files were also changed).
Bill Illis,
Version 3 adjustments did not increase the trend 0.15 C. It’s more like 0.1 C per century, which is 0.01 C per decade.
See http://rankexploits.com/musings/2010/ghcn-version-3-beta/ and
http://moyhu.blogspot.com/2010/09/beta-version-of-ghcn-v3-is-out.html
The net effect of GHCN adjustments on global temps is rather small compared to the magnitude of the trends.
sunshinehours1/Bruce,
See this thread: http://rankexploits.com/musings/2010/effect-of-dropping-station-data/
And this in particular: http://i81.photobucket.com/albums/j237/hausfath/Picture55-2.png
Adjustments in data have other uses besides altering the trend. Adjustments in data can provide new record breaking temps which are useful fodder for headlines. Adjustments in data can also be carefully done in order to bring regional temp. curves into some resemblance to “global temperature averages.”
Zeke
Adjustments in data have other uses besides altering the trend. Adjustments in data can provide new record breaking temps which are useful fodder for headlines. Adjustments in data can also be carefully done in order to bring regional temp. curves into some resemblance to “global temperature averages.”
Still not understanding how you determine what to adjust TO. From everything I’m seeing, you need to have the end in mind in order to make the adjustment, which inherently biases the adjustment in the direction you perceive it should be. Again, why not just dump the bad data instead of adjusting it?
Zeke … I love that graph. I call it the 1950 seesaw.
Adjusted data is cooler before 1950 and warmer after 1950. You can see the colors change from bottom to top.
Cool the past … warm the present. Change the trend.
All this would be fun if it were not serious. Adjustments based on effective algorithms matter systematically in the regional case studies (about 0.5 °C for the twentieth century) but conveniently disappear in the global comparisons. There is obviously something wrong somewhere. At this point, I think the problem lies in the implicit homogenization. What is not adjusted in the individual series is handled instead in grid cells through the aggregation of segments. It is quite different when the regional averages are obtained from aggregated long time series; in those cases the extent of the homogenization is fully visible. That is the whole point of CRUTEM, where you have both the raw and adjusted data at your disposal.
Zeke, you misinterpret my thinking. I am not focused on trying to show that dropped stations had cooled or warmed or did neither (I have no biased thinking along those lines except I think the null hypothesis has not been convincingly proved to be negative regardless of whether or not dropped or current stations have warmed or cooled).
I am focused on what effect ENSO parameters may have had on station drop and have several unanswered questions. Was station drop random? Yes or no, and what analysis did you do to consider this? Did you consider geographic ENSO effects? What were the latitude and longitude coordinates of the dropped stations and how did they sit geographically within the effects of ENSO patterns known to exist? And what were their artifact/degradation parameters? What are the coordinates of the remaining and added stations and how do they sit geographically within the effects of ENSO patterns known to exist? What are their artifact/degradation parameters?
Why is this possible conflation between sensor location and ENSO parameters important? At least I know part of the answer to this last question, but do you have thoughts on this?
Zeke, why do the colors switch in the 1992 graph?
For the most part, blue on top or equal to red until about 1950 and then they switch places with red on top clearly at the end – 1992.
Bill Illis,
Let me clarify a tad. GHCN v3 adjustments (vs. raw) are ~0.1 C per century. They are not an increase from GHCN v2 per se: http://i53.tinypic.com/23l1bb7.png
John Doe says:
June 25, 2012 at 7:01 am
steven mosher says:
June 23, 2012 at 6:20 am
“Tobs is the single largest adjustment made to most records. It happens to be a warming adjustment.”
How convenient.
============================================
This does not concern me except that I have never read an “elevator” summary of this adjustment. It appears logical that with past old records, which did not automatically record the day’s high and low, a switch to instruments that always capture the high and low would push the recorded highs up and the recorded lows down. The net effect?? But there are apparently other aspects to this???
FWIW, the article was a response to EM Smith’s blog post, but most of Mosher’s comments appear NOT to address his comments. I assure you that one can have a reasonable dialogue with EM Smith, but it may need to start by asking him some questions on what he means.
Zeke, why did you pick 1992 for the cutoff when 1975 was the peak year for stations?
http://rankexploits.com/musings/wp-content/uploads/2010/09/Screen-shot-2010-09-17-at-3.32.09-PM.png
Zeke Hausfather says:
June 25, 2012 at 11:06 am
Bill Illis,
Version 3 adjustments did not increase the trend 0.15 C. It’s more like 0.1 C per century, which is 0.01 C per decade.
See http://rankexploits.com/musings/2010/ghcn-version-3-beta/ and
http://moyhu.blogspot.com/2010/09/beta-version-of-ghcn-v3-is-out.html
The net effect of GHCN adjustments on global temps is rather small compared to the magnitude of the trends.
————————————–
Nobody would care if this were a singular adjustment with an additional warming effect. But it appears to be a continuous process of adjustments all in the same direction, which is statistically very unlikely.
On top of that, points criticized elsewhere, such as Mennian warming and UHI warming, are still unresolved.
Land temperatures increasingly deviate to the upside from sea surface temperatures.
Land temperature trends are not compatible with satellite observations and their trend is much too high.
The recent post-1940s sea surface adjustment in HadSST4 is an appalling joke.
Zeke & Steve,
Zeke mentioned the issue of “Station Drop Off”.
Earlier in the comments, (Amino Acids in Meteorites, June 23, 2012 at 2:13 am) linked to videos that seem to say that at least for Russia it makes little difference how many weather stations you use as long as you choose them carefully. The presenter shows how the plot of temperature against time looks with 4 stations, 37 stations, 121 stations and 468 stations. The plots agree closely.
Which of these explanations best fits the facts:
AA – It really does not matter how many weather stations there are.
BB – Someone is selecting stations carefully to get plots that match a CAGW hypothesis.
I ask the question in all humility having failed to figure it out for myself.
You can compare each individual month between Version 2 and Version 3 produced by the NCDC here.
http://www.ncdc.noaa.gov/ghcnm/time-series/
I would prefer if they would allow one to chart every month on one graph but the NCDC doesn’t like to give much away. You will get a chart that looks like this one for April. Green is Version 2 and Blue is Version 3.
http://img401.imageshack.us/img401/971/multigraph.png
While they look similar, the 1920s and 1930s are cooled by 0.1C and the recent periods are warmed by 0.05C in a systematic pattern (all months show this).
@GallopingCamel:
It can be both…. (that is, it can be an inclusive “or” as well as an exclusive “or”…)
IMHO all it “really takes” to know if we are warming or cooling is something on the order of 3 or 4 stations per continent, as long as they are long lived stations with little change of location, instruments, and surrounds. TonyB (IIRC) has looked at that and found a fair number of stable long lived thermometers. They show cyclical warming and cooling and a climb out of the LIA and not much else. (Uppsala, Sweden has a very long term graph and is a well tended instrument: http://www.smhi.se/sgn0102/n0205/upps_www.pdf; it shows 1710-30 as roughly the same as now.) The writings of Linnaeus have some detractors who claim he could not grow the plants he said he grew in the cold climate of Sweden, but when he was around, the graph shows it was not so cold… so there is “anecdotal evidence” for the curve and a curve that supports the written record.
One of the first findings I had was that long lived records showed no warming while all the warming “signal” was concentrated in short lived segments. (Largely taking off in 1987-1990 on a change of modification flag / duplicate number in GHCN v2). So several folks have found the same thing. Warmer in the past, very cold in the “1800 and Froze To Death / The Year Without A Summer” era, then back to about the same warm level now as in 1710-1730.
At the same time, that Russian Video that found all the curves matching went on to show that they all had UHI issues and that accounted for a large part of the warming ( the rest, IMHO, is that “dip” between 1700 and 1900).
So if you do a lousy job of correcting UHI, start your data in the middle 1800s, and then leave out the non-volatile stations (or swamp them with enough ‘averaged in’ stations at Airports and other rapidly heating places), then “all your curves will match” and they will “all show warming”. That is also exactly what is done with codes like GIStemp and HadCRUT / CRUTEM. So “selecting to produce warming”. The only part that is not knowable is “Deliberate or accidental?”. It is not possible to attribute motive. (It could simply be that “coverage” is “enough” in someone’s mind in the mid 1800s so that’s when they start, and that they think Airports are rural as the “population” at an airport is typically zero on census figures and many turn off their lights at night, or are surrounded by bare dirt – hotter than grass btw – away from the city lights so darker than the urban core.)
We’ve seen that UHI is very poorly fixed (and even very poorly demonstrated by all the folks who try doing averages based on the GHCN and similar “rural” classifications), even though anyone with bare feet and no hat can tell you it is one heck of a lot hotter on the flight line (where the thermometers are at airports, near the runway) than in the shady forest nearby. So endless “proof” that UHI is insignificant is produced by folks who simply trust the metadata when the metadata tells lies. (THE largest US Marine Air Field is classed as “rural” despite having tens of thousands of Marines and huge numbers of flights and being the “Crossroads Of The Marines”… The barracks do not count as “population” and they do night ops with the lights off so it is a dark place. Just hot… And in GHCN.) So are the folks showing “No UHI, move along, move along.” being malicious or just too dumb to see their error? No way to tell. But IIRC it was Tempelhof that started as a grass parade field in the 1700s, was a global jet port (and used during the Berlin Air Lift), and has now been converted to an urban shopping mall. It is much warmer as hectares of concrete and tarmac than when it was a grass field… And it is in the GHCN. (It was still an ‘A’ for Airstation last I looked, a few years after conversion to a mall…)
So could the data be selected to match the CAGW hypothesis? Certainly. In fact, it might even be “by definition” given all the airports in the record… and using airports was a conscious act. (The Pacific Ocean coverage is soon to be 100% Airports… one interesting sidelight: I found a ‘dip’ in the Pacific temperatures that matched the 1970s major recession on the Arab Oil Embargo… Just sayin’….) But that does not tell you if the selection matches the result to CAGW by design, or as an accidental consequence that led to the theory being made…
In short: It doesn’t matter how many stations you have; as good clean long lived non-UHI stations show no warming and Airports show lots of warming. Any collection of either shows about the same pattern, just two different patterns between them. We get “Warming” in the aggregate due to there being very few long lived non-UHI stations and a whole lot more Airports, especially recently, in the average of the two ‘trends’.
@David:
I’m happy to have them arguing the merits of various orthogonal things, less stuff for me to deal with that way 😉
Don’t know that there are many questions to ask / answer. The methods, data, code, etc. are all public. It does what it does. I’ve described what it does. Don’t see much I can add after that… And questions like “what software was used to download the ftp dataset”, held up as a ‘flaw’ on my part (not ‘publishing it’), are really just silly. The answer is “whatever ftp software you like” and “from the NCDC data repository”… so less of those makes my day easier.
Heck, I’m still trying to catch up with comments (and won’t get to them all now, just have a moment for a couple of quick ones).. so fewer for me to answer is a feature right now 😉
Heck, nobody even said much about the posted “not much difference between FD and dT” graph, and that was a good solid day of work to recreate the whole run with Classical FD and put it on a graph with the dT results. At a day per significant question (even if I think the difference between FD and dT is insignificant) it can take a lot to deal with questions… So most of a day and night of work gets the sound of crickets in response… I guess that’s a good thing.
BTW, given how Mosher is all hot and bothered about FD having been found to have limitations, one must ask: have all the climate science papers that use it been withdrawn now? Last I looked, it was used in several of the papers that justify many of the tools and adjustments in use. It was in some paper that underlay the RSM IIRC (but it was a while ago I read that batch… so not sure just which ones…)
Finally, yes, one can have a reasonable conversation with me. BUT, I generally ignore folks who rant, play “Gotcha! Games”, and are generally Trollish. Or are just too grumpy. “Life is too short to drink bad wine.” and some folks are just a keg of vinegar… I’m a softy, though, so anyone with a clear and honest need for help usually will get anything I can provide.
At any rate, that’s all I have time for at the moment. Still catching up after a night of no sleep and time to make dinner… But now I have the code written to do some nice A/B compares of Classical FD and dT to measure strengths and weaknesses. I’m still pretty sure mine is demonstrably better; now I just need to go demonstrate 😉
Bill Illis: “You can compare each individual month between Version 2 and Version 3”
The classic seesaw around 1950. Blue version 3 colder until 1950 and then warmer after 1950.
E.M.Smith
I, for one, do appreciate your comments.
And no one can say you shy away from replying. Also, no one can say your posts that make it into WUWT are poorly thought out and presented.
Thanks Chiefio,
Your explanations have been simple, concise and clear for anyone with open eyes and an open mind to see.
The verbal semantics of the “experts” here did not address your post and are discussing things tangential to it. In effect this post, as a rebuttal to yours, does not address any of the key issues in your post and is dancing around unnecessary fringe arguments. Effectively the response is a strawman copout.
Mike (ChiefIO),
Firstly, thank you for the compliment of confusing me with TonyB. Tony has done superb work in highlighting just how ignorant of history our current (as opposed to Hubert Lamb’s) generation of so called climate scientists are. I’m proud to have my name mentioned in the same post as him any day.
As you know, some of us (former non-questioners of CAGW) have been around a long time in the climate debate. In my case, I first started my (now) skeptical-of-CAGW education with John Daly, thanks to Tim Lambert calling John Brignell of Numberwatch a ‘crank’. I am what I call a ClimateAudit ‘lifer’ in that I’ve been following Steve Mc’s threads (with John A’s assistance) on CA almost from its birth. Consequently I probably know ‘Moshpit’ better than most and I’ve certainly had many a debate with him (and his fellow (luke)warmista mates like Zeke H, Nick S, Carrick and Ron B) over at Lucia’s Blackboard.
Moshpit often gives the impression to people that he is an accomplished software developer when in fact, if you seek him out on LinkedIn, you see that he is much more of a software project manager. He has nothing like the software development experience (particularly with complex enterprise level IT systems) that you and I have, Mike. FWIW, he is however an accomplished R programmer with a self proclaimed long memory. Along with Zeke, Ron and the man with a reversed well known sunglass brand name, he has done lots of what I call ‘helicopter’ level analysis of the various temperature record datasets (in particular USHCN and GHCN) but, compared to you and I and Verity and TonyB (to name only a few), very little ‘ground’ level forensic analysis of the individual station records.
Moshpit (Steve Mosher) is a firm believer that by ‘anomalising’ and ‘gridding’ the data you can cure it of all its problems (like its lack of spatial/temporal coverage, missing data etc), and come up with a mean global surface temperature that demonstrates that man is having a significant effect on our planet’s climate through our continued emissions of CO2. IMO, along with many other ‘warmers’, it is he who is the real ‘denier’ and not us skeptics. He is a denier of what is largely if not wholly overwhelming historical evidence of cyclic, spatially varying, significant multi-decadal natural climatic variability within our planet’s climate. I particularly get annoyed when this significant variation in temperature at any given weather station is referred to as ‘noise’ and when analysts like Moshpit are content to ‘adjust it’ and/or average it out.
Based on the historical evidence available, do I believe that our planet has warmed in the last hundred years or so? IMO, yes it has. Has man been the primary cause of that warming? Possibly yes, but not because of our emissions of an odourless, tasteless trace gas essential to all life on our planet, but rather more likely due to the effect we have on temperature monitoring instruments. Should we be concerned about this ‘warming’? Most definitely not, as common sense should dictate that we do a better (less sloppy) job of observing our planet and fully account for any changes we observe within it before jumping to the conclusion that they are man-made.
KevinUK
Kevin UK writes:
“…..historical evidence of cyclic spatially varying significant multi-decadal natural climatic variability within our planet’s climate.”
I hope he means what I think he does. Historically (= reasonably reliable for the last 3,000 years but with diminishing credibility before that) the climate has varied. Warm periods have been characterized by the rise of empires (Egyptian, Minoan, Roman etc.); the cold periods have corresponded with incredible human suffering and de-population. While historical climate variations have been quite dramatic, one cannot attribute them to CO2, which varied very little until the last 70 years.
Given that climate has such huge consequences for our species it seems strange to witness a busload of statisticians (Tamino, Zeke, RomanM, JeffId, McIntyre and many many more) who can’t agree about the global temperature since Fahrenheit invented his thermometer in 1714.
Even stranger is the fact that Michael Mann with his tree-mometers that defy history is given a moment’s consideration by “Scientists”.
It is possible to compare these data with satellite measured temperatures from 1979 on. When you do that you discover that all of the temperature values after 1980 have been given a false upward trend that does not exist in the real world. You can see this falsification with the naked eye even on the small scale graph of Figure 4. All you need to do to see this fakery is to locate the super El Nino of 1998. It is the high peak at 2000 in their graph. On both sides of it are two V notches. They are La Ninas that on the satellite record are even, but in this graph the right hand notch is one third of a degree higher than the left one. This is a fake increase of temperature of 0.3 degrees in two years, or 15 degrees per century. If you then look to the right of it you see that there are two peaks higher than the super El Nino of 1998. They are a spurious peak at 2007 and the El Nino of 2010. According to satellite records they are both 0.2 degrees lower than the super El Nino, but here they are shown 0.1 degree higher instead. That is the same phony 0.3 degree boost we observed with the two La Nina periods, applied across the remaining twenty-first century. Since we are dealing with digital records it is clear that this kind of bias has to be built into the computer program whose output we are looking at. The technique of the Big Lie, first introduced in Mein Kampf, is alive and well in the climate change world. It is a colossal fraud and must be stopped and investigated. In the meantime, the only believable global temperature values are those produced by satellite temperature measurements. I suggest that only satellite temperatures should be used when global temperature values are needed. There are two sources: UAH and RSS. They are slightly different but both can serve as an approximation to a real global temperature that can be believed in.
@Amino Acids in Meteorites:
Thanks! I try.
@Venter:
Thanks! I try to be clear. It isn’t hard to translate opaque jargon to straight language. I usually find the exercise clears a lot of fuzzy thinking from the bafflegab in the process 😉
@KevinUK:
You are being very kind.
IMHO, if you don’t know what is happening at the individual station level, you don’t know what’s happening. That is why I started with just looking AT temperatures. For individual stations and for aggregations. (For which I took a lot of heat from folks accusing me of not knowing to do anomalies when my purpose was to ‘measure the data’ not find a GAT.)
I first got that habit back in my old FORTRAN IV class. We were deliberately fed crap data to burn into our brains that checking the data was STEP ONE. If you don’t know your data, you don’t know crap… So I started with looking AT the temperature data. Never thinking anyone would expect that to be the last step… or toss rocks over it. That was where I first started seeing “Odd Things” like some going up and others going down and some months rising while others fall for the same instrument… Just not ‘generalized warming’…
So it is a long slow incremental process of building from the lowest level to the highest, one brick at a time.
IMHO, the major fault of all the codes I’ve seen from “Climate Scientists” has been to assume they have a working theory and good data and run with it. Starting from the “helicopter view” and building backwards to justifications. I start with the data and ask it what it has to say. No theory, just open ears and eyes… Later, after it has spoken, I might come up with a theory (like that ’87-90 point where the Duplicate Number changes and everything takes a 1/2 C jump at just that transition… my theory is equipment and processing changes with onset then. Because THAT is what the individual data look like and what the aggregates look like.)
Ah, yes, the “Anomaly Grid / Box Magic Sauce” cure all… If only they DID do anomalies prior to doing all the adjusting and homogenizing et. al. If only they DID keep thermometers in their own grid /box and not smear them out 1200 km to 3600 km. Theory, meet reality…
@GallopingCamel:
I think he’s saying there are a lot of natural cycles going on, some very long.
As GAT is a ‘polite fiction’ it will be subject to which bits of fiction one chooses to use and how the story is written… so not that surprised that each author gets a different story 😉
And don’t get me started on Treemometers… and where the bear does what bears do when a bear does his do do… (Nitrogen transport from salmon runs via bears is the largest factor in fertilizing in the Pacific North West IIRC the paper. A “favorite tree” will grow more than one less suited to bear attracting…)
@Arno Arrak:
Interesting… I’d been all set to say that Satellite record didn’t overlap enough to be useful ( all of 12 years I think) but that ‘peak matching’ says otherwise… Not looking at trend lines, looking at individual relative data point positions. Hmmm…. Good catch!
FWIW, I’ve done a bit longer evaluation of the use of First Differences here:
http://chiefio.wordpress.com/2012/06/26/wip-on-first-differences/
which is more in depth, but confirms what I’d said earlier: It is the right tool for what I want to do, which is compare two sets of data over very long term trends and with minimal dependency on any given time period (no “baseline” with excess weight) and with anomaly creation as the very first step.
So closing a series and opening a new one as in Classical FD would, in fact, HIDE exactly the thing I’m trying to measure: how much impact comes from those changes. It also looks like interpolation as a gap spanning technique is acceptable and gives BETTER long term trends. The only difference between interpolation and what I do is that I put the ‘span’ into the exact date where the temperature shows up again, preserving the actual structure of the data. Trend ought to be unaffected.
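Roughly, the contrast between the two gap treatments looks like this (just a toy sketch to show the idea, not my actual analysis code):

```python
def fd_span_gap(years, temps):
    """First differences that 'span the gap': the full change across any
    missing years is applied at the year where data resumes, so the
    long-term offset is kept instead of restarting a new segment as
    classical FD would. Toy sketch only."""
    series = {years[0]: 0.0}
    total = 0.0
    for i in range(1, len(years)):
        total += temps[i] - temps[i - 1]   # difference taken across the gap
        series[years[i]] = total
    return series
```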
Again, my stated purpose was exactly that; to find a better long term trend representation in the data and to NOT lose that trend due to data drop outs.
There is further discussion of CAM methods and the need to toss out data of short segments, and to use a baseline (thus giving those stations used in forming that set added influence in the results). All things I specifically set out to avoid. And succeeded at avoiding.
So having gone back and revisited the reasoning that led to my choice of First Differences and a decision to ‘span the gap’, I find that I’m quite happy with it. It avoids specifically the issues of CAM and it does not hide parts of the impact of short segments, missing data, or the actual long term trends in the data. It does not ‘leave out’ some stations and does not give some stations more impact than others.
In short, it does just what I wanted it to do. Let me compare two sets of data (all of the data) directly and see how they are different from each other. NOT find a hypothetical “Best Global Average Temperature”. NOT create some “polite fiction” based on a theory. Just comparing the two sets of data using ALL the data minimally changed.