Guest post by Frank Lansner
IPCC – How not to compare temperatures – if you seek the truth.
Numerous issues are intensely discussed when it comes to the IPCC's illustrations of historic temperatures. Here, for example, is the illustration from the IPCC Third Assessment Report:

Fig 1. Taken from IPCC TAR
In short, we have heard of problems with:
1) the Mann material,
2) the Briffa material,
3) the cherry-picking done by the IPCC to predominantly choose data supporting a colder Medieval Warm Period,
4) problems joining proxy data with temperature data mostly obtained from cities or airports,
5) cutting proxy data off when it doesn't fit temperatures from cities,
6) creating and using programs that induce global warming in the data, and finally
7) reusing, for example, the Mann and Briffa data endlessly (Moberg, Rutherford, Kaufmann, AR4, etc.).
But I believe another banal error needs more attention:
8) A faulty comparison.
Imagine for a moment that none of the above-mentioned problems 1)–7) has any impact, and let us focus on the comparison itself. The proxy data suffer from two impacts:
A) “Technical averaging” – the effect of summarizing many series of data.
Check out what happens when many temperature datasets are summarized. As an example, the cooling episode 8,200 years ago:

Fig 2.
Data taken from: http://wattsupwiththat.com/2009/04/11/making-holocene-spaghetti-sauce-by-proxy/
The white graph with the red squares is the resulting average: the more temperature sets are added together, the flatter the average tends to become. Notice, for example, how many of the datasets clearly have a down-peak between 8,000 and 9,000 years ago; but the timing of these datasets is slightly off, and so the down-peak is almost gone from the average.
So, to some degree, we can expect multi-proxy studies to yield a flattened overall graph, as the sketch below illustrates.
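A minimal sketch of the effect, using my own made-up synthetic series rather than real proxy data: ten series share the same cold dip, but each one dates it slightly differently, and the dip nearly vanishes from the average.

```python
import numpy as np

# Ten synthetic "proxy" series share the same cold dip near 8,200 years
# before present, but each series dates it slightly differently.
years = np.arange(7000, 10000, 10)        # years before present
rng = np.random.default_rng(0)

series = []
for _ in range(10):
    offset = rng.uniform(-200, 200)       # dating error for this proxy
    dip = -1.0 * np.exp(-((years - (8200 + offset)) / 80.0) ** 2)
    series.append(dip + rng.normal(0, 0.1, years.size))

average = np.mean(series, axis=0)

print("Deepest dip in an individual series: %.2f K" % min(s.min() for s in series))
print("Deepest dip in the average:          %.2f K" % average.min())
# The average's dip is markedly shallower: the dating offsets smear the
# event out, which is the "technical averaging" effect described above.
```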
B) “Direct averaging” – on top of the technical averaging, the data series are often smoothed further using 30-, 40- and 50-year Gaussian filters.
The result of averaging by A) and B) is that the variability of the IPCC graphs on a decadal timescale is limited to just tenths of a degree K. But if there were any real temperature peaks on a decadal timescale in the Medieval period, we would hardly see them in the typical data series the IPCC shows.
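To see how much a filter of that width suppresses a short-lived peak, here is a minimal sketch; the 40-year width and the 1 K peak are illustrative assumptions, not any specific study's settings.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

# A hypothetical 1 K warm peak roughly a decade wide...
years = np.arange(1000, 1400)
peak = 1.0 * np.exp(-((years - 1200) / 5.0) ** 2)

# ...run through a Gaussian filter about 40 years wide
# (sigma = width / 2.355 converts the FWHM to the Gaussian sigma).
smoothed = gaussian_filter1d(peak, sigma=40 / 2.355)

print("Peak before filtering: %.2f K" % peak.max())      # 1.00
print("Peak after filtering:  %.2f K" % smoothed.max())  # a few tenths of a K
```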
Is this a problem?
Well, it certainly becomes a problem if these “super-averaged” data are compared with data that is NOT nearly as “super-averaged”. And this faulty comparison is exactly what the IPCC makes.
The IPCC's “super-averaged” proxy data are typically compared to “observed” temperatures, that is, recent temperatures not at all subjected to the same degree of averaging.
Technical averaging – type A) – largely does not happen for observed temperatures, so what about type B), the direct averaging or filtering?
Well, for the IPCC graph shown in Fig 1 above, the IPCC text says: “All series were smoothed with a 40-year Hamming-weights lowpass filter, with boundary constraints imposed by padding the series with its mean values during the first and last 25 years.”
Explanation: if your data end in year 2000, then the last genuine 40-year averaged/filtered point on the graph would be a point at 1980, the average of 1960-2000, near a +0.2 K anomaly. But the IPCC graph for observed temperatures ends at +0.43 K around year 2000. This much more resembles the normal five-year average of GISS data for year 2000:

Fig 3. GISS temperatures as illustrated in 2001.
So for IPCC/Mann et al. to get a year-2000 temperature as high as +0.43 K, they must have used just a normal 5-year average; a longer averaging period would yield a lower temperature for the last year.
So, when the IPCC wrote “with boundary constraints imposed by padding the series with its mean values during the first and last 25 years”, they mean: “We don't apply a genuine 40-year average/filter in the last 25 years…!”
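Here is a minimal sketch of that boundary handling as I read the quoted text – my own reconstruction, not the IPCC's actual code: a 40-point Hamming-weights lowpass filter, with each end of the series padded using the mean of its first/last 25 years.

```python
import numpy as np

def hamming_lowpass(series, width=40, pad_years=25):
    """40-point Hamming lowpass with mean-padding at both ends."""
    weights = np.hamming(width)
    weights /= weights.sum()
    pad = width // 2
    padded = np.concatenate([
        np.full(pad, series[:pad_years].mean()),    # pad with early mean
        series,
        np.full(pad, series[-pad_years:].mean()),   # pad with late mean
    ])
    smoothed = np.convolve(padded, weights, mode="valid")
    return smoothed[:len(series)]                   # drop the one extra sample

# Hypothetical series: flat for 175 years, then 0.5 K of warming
# concentrated in the final 25 years.
temps = np.concatenate([np.zeros(175), np.linspace(0.0, 0.5, 25)])
smoothed = hamming_lowpass(temps)

print("Raw final value:      %.2f K" % temps[-1])
print("Smoothed final value: %.2f K" % smoothed[-1])
# Because the padding uses the (warm) mean of the final 25 years, the
# smoothed endpoint lands far closer to the raw peak than a genuine
# centered 40-year average, whose last valid point sits 20 years earlier.
```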
So the bottom line is: the IPCC compares “super-averaged” temperatures of the Medieval period with a peak in modern temperatures subjected only to a 5-year average.
The IPCC basically compares a peak in recent temperatures with super-averaged Medieval data, in which peaks are far more suppressed, in order to conclude how much warmer it is today than in the MWP.
This is a problem!
From the illustration below, the peak after 1998 appears to some degree related to the big 1998 El Niño peak; here from appinsys:

Fig 4.
So, were there no El Niño peaks in the Medieval period that could have affected the comparison with recent temperatures? Yes, there were: http://co2science.org/articles/V12/N5/C2.php
So we have every reason to believe that there were also temperature peaks in the Medieval period – peaks that might well resemble the recent El Niño peak.
So there is no excuse for the IPCC to compare a modern temperature peak with averaged Medieval temperatures.
This is banal, of course, and even the IPCC must have been aware of it, one should think.
Here is an illustration where the single year 2004 of observed temperature data is explicitly used in comparison with the super-averaged Medieval temperature data:

Fig 5. (from here)
Cheers!
Depending on the scales you use for each axis, you could make that graph look as flat as a pancake or like a silhouette of the Alps.
Somebody needs to proofread this stuff before posting it. Legitimate points get lost in poor grammar, spelling, and use of language.
Good point!
Small note: A few places after figure 4 the text said “where” instead of “were”.
Dang. Now I want spaghetti.
Everyone needs to read the books “How to Lie with Statistics” by Darrell Huff and “Freakonomics” by Dubner and Levitt.
As I recall, that last Wikipedia chart above had 8 of the 10 curves generated by card-carrying hockey team members.
Not to mention using Mike’s Trick, and clipping off the unhelpful data.
Nicely done, Frank! Thanks!
Pretty much indecipherable to this reader. My quick read leaves me with the impression that the IPCC and the usual cast of characters have done bad things with the numbers. This is new news? Maybe I shouldn’t have read the thing immediately after struggling for hours with my balky computer which left me “crabby”, but I didn’t quite get it and the piece didn’t excite enough energy to cause me to work at it.
Good going, Anthony –
Whether intentional or not, treating recent data differently is not cool, not without mentioning that this is the case. A 40-year average is really a curve-smoothing operation in itself. If that is done for most of the years, on top of the blending of many proxies like Anthony shows, yes, you're going to get a super-smoothed curve, with VERY few peaks and valleys.
The extreme example of long-period averaging would be averaging ALL the years into one value, which would be a flat horizontal line. The more years in the rolling average, the flatter it gets; the fewer years, the spikier it gets.
It IS pretty heinous to then tack on the less-blended (i.e., multiple data types) and shorter rolling average data from the later years and then act like the spikes mean anything.
I'm no expert, but when I plot even single-source data with rolling averages, I don't even TRY to include the last years. If I do a 10-year average, I stop the rolling-average graph 5 years short – at the last year that has real data covering its entire 10-year span. AND I always show both the non-rolling and rolling graphs, so people can see what is going on.
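For what it's worth, a minimal sketch of that practice, with made-up numbers: compute the smoothed curve only where the full window has real data, so it stops short of each end.

```python
import numpy as np

# A centered 10-year rolling average computed only where the full
# window has data: the smoothed curve stops ~5 years short of each end.
def centered_rolling(series, window=10):
    weights = np.ones(window) / window
    return np.convolve(series, weights, mode="valid")   # no padding at all

rng = np.random.default_rng(1)
temps = rng.normal(0.0, 0.3, 60)        # 60 years of made-up anomalies

smoothed = centered_rolling(temps)
print("Raw years:      %d" % temps.size)     # 60
print("Smoothed years: %d" % smoothed.size)  # 51: about window/2 lost per end
```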
Half of a 40-year averaging window ending today is just the 20 years during which, they say, the temps have been rising.
From my experience, I’d rate this post VERY high in real world graphing.
But I have to wonder why Steve M hasn’t pointed this out.
OOPS! I thought this was from Anthony. Sorry!
My kudos should have gone to Frank.
This post is written in good quality Danglish, which should not be difficult for native English speakers to understand.
As a native speaker of a world language, one must be able to decipher its use by non-native speakers.
I have seen far worse examples in reports and blog entries linked-to from WUWT.
/Mikkel
That’s an aspect I hadn’t considered, thanks Frank.
This Hamming filtering, with its 25-period mean padding at either end, means it matters a whole lot whether the series was near a high point or a low point at either end. That leaves a hidden way to artificially bias data if the endpoints themselves do not truly have a Gaussian distribution about the mean. That is, if the beginning points of multiple sets tend to be up, they will tend to be artificially smoothed downward due to the mean padding; if the ending points of all sets tend to be down, they will tend to be artificially smoothed up. That's a great place to prove bias, if it exists, when many sets of data are merged. Thanks again Frank.
I'm suspicious that the 1998 El Niño was used as a smokescreen to jack up the global average temperatures. Worth keeping in mind when investigating the “adjustments” done on temperature records.
Addendum to: wayne (23:00:15)
I think I just made a loose statement above. After some thought, I realized it matters greatly exactly which ‘mean’ the IPCC is speaking of. Is it the mean of each data set's 25 years of data at either end, or is it the mean of the endpoints of the collection of all data sets? I wonder exactly which one they were referring to in: “padding the series with its mean values during the first and last 25 years.”
Mikkel (22:51:19) :
[…] As a native speaker of a world language, one must be able to decipher its use by non-native speakers. […]
I call it ‘trying to make it correct’ before ‘calling it wrong’.
I think the whole idea of averages is now suspect.
Try this one by William M. Briggs: “Do not smooth time series, you hockey puck!”
http://wmbriggs.com/blog/?p=195
I think the best thing that can be said about the IPCC, Mann, and so on is that there has been no statistical review by high-level mathematicians within their own organisations.
It has been left to external review by sceptics – whatever that means – who are in fact experienced mathematical-statistics scientists.
Thank you Frank. Poxies must be submitted to the same filtering torture as other sets, whether it's a 40- or 5-year setup. Otherwise it's gibberish.
A mate and I once created a roulette program, which we tried at the local casino. We made money, but we argued all the way home about the program. I said I had an issue with the random number generator we used; I think it was set constant.
So we checked it and fixed the bug; we were lucky not to have lost our trousers. However, we never experimented at the casino again.
And climate seems to me a very big, complex casino.
AArgh.
Oops I mean proxies.
OT
Powerful earthquake shakes California
Two people have died and at least 100 have been injured in a powerful earthquake that hit on Sunday and was felt across two countries and three states, swaying buildings from Los Angeles to Phoenix to Tijuana.
http://earthquake.usgs.gov/earthquakes/recenteqsww/Maps/10/245_30.php
I think that Frank Lansner has touched upon something quite important, which explains to a certain degree how the IPCC has managed to remove the Medieval Warm Period from their temperature record. The persistent effort to make the latter part of the 20th century and the last ten years look warmer than any time over the last 1000 years is not just a scientific issue but very much a real problem with substantial economic consequences. For us living in the United Kingdom it is now becoming an even more immediate and expensive problem. Just read Christopher Booker's latest column in The Telegraph here to understand how the real cost of “Climate Change” will soon become unbearable for more and more industry; the flight to China will intensify, with additional unemployment in the pipeline, for no gain.
http://www.telegraph.co.uk/comment/columnists/christopherbooker/7550164/Climate-Change-Act-has-the-biggest-ever-bill.html
Looks like this is one to be filed under “weather is not climate”. Averaging the proxies will of course have a further smoothing effect, on top of the 40-year averaging and the use of the “Hamming-weights lowpass filter”, heightening the contrast with a few warm years between 1995 and 2004. Can anyone enlighten us as to the rationale for this gloriously named methodology, normally seen in digital signal processing? It seems to be a long way from home, and as far as I can see there is no innocent reason why it should have been used for weather observations.
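For what it's worth, Hamming weights are just a standard raised-cosine window from digital signal processing. A minimal sketch of how they differ from a plain 40-year boxcar average:

```python
import numpy as np

# Hamming weights taper toward the window's edges, so years near the
# centre of the 40-year window count more than years near its ends.
width = 40
hamming = np.hamming(width)
hamming /= hamming.sum()                 # normalize to a weighted average
boxcar = np.ones(width) / width          # plain 40-year moving average

print("Centre weight, Hamming: %.4f" % hamming[width // 2])  # heavier
print("Centre weight, boxcar:  %.4f" % boxcar[width // 2])   # 0.0250
print("Edge weight, Hamming:   %.4f" % hamming[0])           # much lighter
```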
It’s a very good spot – well done Frank.
If you calculate an average P/E ratio for a set of stocks, you calculate:
sum(price * shares) / sum(EPS * shares), where all prices and EPS are brought to the same currency unit. You do NOT use average(price/EPS). I know this because I wrote the software for one of the largest investment banks to perform these kinds of calculations.
Now look at Fig. 2. All the lines are temperature proxies which are calculated as (something / temp conversion factor) where ‘something’ might be dO18 or tree ring width or whatever. The temperature conversion factor might be a single value or a lookup value from a table if the conversion is non-linear. An example might be (tree ring width/(width per degC)) giving you a proxy temperature series.
Although you have converted everything to a temperature line, you can't just average these temperatures. The ‘proper’ calculation would be sum(measure)/sum(conversion factors), which would be nonsense because the units are different for each proxy. So the average in Fig 2 is nonsense. The fact that the proxies don't all change at the same time tells you that the temperature proxies are not very good. In fact, I think the only thing you could learn from Fig 2 is how the proxies are correlated.
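A minimal sketch of the P/E point above, with hypothetical numbers for two stocks:

```python
# Two hypothetical stocks in the same currency.
prices = [100.0, 10.0]   # price per share
shares = [1e6, 1e8]      # shares outstanding
eps    = [5.0, 1.0]      # earnings per share

# Correct aggregate P/E: total market value over total earnings.
aggregate_pe = (sum(p * s for p, s in zip(prices, shares))
                / sum(e * s for e, s in zip(eps, shares)))

# Naive approach: averaging the per-stock ratios.
naive_pe = sum(p / e for p, e in zip(prices, eps)) / len(prices)

print("Aggregate P/E: %.2f" % aggregate_pe)  # ~10.48
print("Naive average: %.2f" % naive_pe)      # 15.00
```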
Capn Jack, that was a Freudian slip. I'm sure you meant ‘poxies’!
Frank,
if I am reading this correctly, would we expect the temperature plateau shown in Fig. 4 for the decade we are just now concluding to be subsumed as time rolls on, once this decade is treated the same as those prior to 1980?
The raw data for my region in New Zealand, obtained from NIWA, do not support the claims of this current decade being a hot one; in fact, this summer has become the third one from this decade to join the ranks of the 20 coldest summers of the past 55 years. I only have data from 1954 onwards, but none of the southern summers from the 1950s make the list. There is a fairly even spread across all of the following decades. The top two coldest summers are the Pinatubo-affected ones of 1991-92 and 1992-93. No surprises there. 1975-76 comes in a close third.
The fact that 1975-76 comes in at 3rd coldest is quite interesting to me when I see that the previous summer is the clear winner in the hottest-summer stakes, ahead of 1998-99 and 2007-08. As people will no doubt comment, just another fine example of natural variation.
Cheers
Coops.
Like Pete Olson, I have often taken exception to the atrocious spelling and use of grammar in the posts here, but my complaint is limited to the Americans who obviously spent their time in grammar school picking their noses and failing to learn spelling, punctuation, etc. When some fairly obviously non-native English speaker posts here, I ignore all of the obvious non-native English usages and try to reach the mind of the poster.
People who live in grass houses shouldn’t stow thrones.
Our ‘world’ would be much poorer without the foreigners here who do their level best to get their points across. And, while we're talking about this, who among the critics I refer to can actually communicate in any language other than the one which I will, for the sake of fairness, call “Americanese”? Hey, folks, some of the ‘Strines’ here can't even communicate in that language. Anyone want to see them gone? Didn't think so. Reserve your vituperation for the American speakers who somehow can't speak American, and leave the English and others alone. L
[Agree. But the most egregious offenders are those who misuse idioms. ☺ ~dbs]