# Crowdsourcing A Full Kernel Cascaded Triple Running Mean Low Pass Filter, No Seriously…

Image Credit: Climate Data Blog

By Richard Linsley Hood  – Edited by Just The Facts

The goal of this crowdsourcing thread is to present a 12 month/365 day Cascaded Triple Running Mean (CTRM) filter, inform readers of its basis and value, and gather your input on how I can improve and develop it. A 12 month/365 day CTRM filter completely removes the annual ‘cycle’, as the CTRM is a near Gaussian low pass filter. In fact it is slightly better than Gaussian in that it completely removes the 12 month ‘cycle’, whereas a true Gaussian leaves a small residual of it in the data. This new tool is an attempt to produce a more accurate treatment of climate data and see what new perspectives, if any, it uncovers. This tool builds on the good work by Greg Goodman, with Vaughan Pratt’s valuable input, on this thread on Climate Etc.

Before we get too far into this, let me explain some of the terminology that will be used in this article:

—————-

Filter:

“In signal processing, a filter is a device or process that removes from a signal some unwanted component or feature. Filtering is a class of signal processing, the defining feature of filters being the complete or partial suppression of some aspect of the signal. Most often, this means removing some frequencies and not others in order to suppress interfering signals and reduce background noise.” Wikipedia.

Gaussian Filter:

A Gaussian Filter is probably the ideal filter in time domain terms. That is, if you consider the graphs you are looking at to be like the ones displayed on an oscilloscope, then a Gaussian filter is the one that adds the least distortion to the signal.

Full Kernel Filter:

Indicates that the output of the filter will not change when new data is added (except to extend the existing plot). It does not extend up to the ends of the data available, because the output is in the centre of the input range. This is its biggest limitation.

Low Pass Filter:

A low pass filter is one which removes the high frequency components in a signal. One of its most common usages is in anti-aliasing filters for conditioning signals prior to analog-to-digital conversion. Daily, Monthly and Annual averages are low pass filters also.

Cascaded:

A cascade is where you feed the output of the first stage into the input of the next stage and so on. In a spreadsheet implementation of a CTRM you can produce a single average column in the normal way and then use that column as an input to create the next output column and so on. The value of the inter-stage multiplier/divider is very important. It should be set to 1.2067. This is the precise value that makes the CTRM into a near Gaussian filter. Dividing each stage length by this value and rounding to whole months gives, for example, 12, 10 and 8 months for the three stages in an Annual filter.
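As a small illustration of that arithmetic, the sketch below derives the three stage lengths from the first one. The function name is hypothetical, and rounding to the nearest whole sample is an assumption; the article only states the resulting 12, 10 and 8 month values.

```python
# Inter-stage divider quoted in the article.
STAGE_RATIO = 1.2067

def ctrm_stage_lengths(first_length, ratio=STAGE_RATIO):
    """Return three running-mean lengths, each the previous one divided by ratio.

    Rounding to the nearest whole sample is an assumption made for this sketch.
    """
    return [round(first_length / ratio ** stage) for stage in range(3)]

print(ctrm_stage_lengths(12))   # monthly data, annual filter -> [12, 10, 8]
print(ctrm_stage_lengths(365))  # daily data, annual filter -> [365, 302, 251]
```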

Triple Running Mean:

The simplest method to remove high frequencies or smooth data is to use moving averages, also referred to as running means. A running mean filter is the standard ‘average’ that is most commonly used in Climate work. On its own it is a very bad form of filter and produces a lot of arithmetic artefacts. Adding three of those ‘back to back’ in a cascade, however, allows for a much higher quality filter that is also very easy to implement. It just needs two more stages than are normally used.
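The cascade described above can be sketched in a few lines of plain Python. This is not the R code or spreadsheet referred to in the article, only a minimal stand-in with hypothetical function names, using the 12, 10 and 8 month stage lengths for monthly data.

```python
import math

def running_mean(values, length):
    """Plain running mean; returns len(values) - length + 1 points."""
    return [sum(values[i:i + length]) / length
            for i in range(len(values) - length + 1)]

def ctrm(values, lengths=(12, 10, 8)):
    """Cascade: the output of each stage is the input of the next."""
    out = list(values)
    for length in lengths:
        out = running_mean(out, length)
    return out

# Synthetic monthly series: a slow drift plus a pure 12 month cycle.
monthly = [0.01 * t + math.sin(2 * math.pi * t / 12) for t in range(120)]
smoothed = ctrm(monthly)
print(len(monthly), len(smoothed))  # 120 93
```

On this synthetic input the 12 month cycle averages to zero in the first stage, so only the drift survives. With even stage lengths the centre of each output point falls halfway between two input months, which matters when lining the result up against the original dates.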

—————

With all of this in mind, a CTRM filter, used either at 365 days (if we have that resolution of data available) or 12 months in length with the most common data sets, will completely remove the Annual cycle while retaining the underlying monthly sampling frequency in the output. In fact it is even better than that, as it does not matter if the data used has been normalised already or not. A CTRM filter will produce the same output on either raw or normalised data, with only a small offset in order to address whatever the ‘Normal’ period chosen by the data provider. There are no added distortions of any sort from the filter.
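The raw-versus-normalised point can be checked on synthetic data. In the sketch below everything is invented for illustration: the series, the 10 year base period used for the monthly ‘normals’, and the function names. The reason the two outputs differ only by a constant is that the set of monthly normals repeats every 12 months, and a 12 month running mean of such a sequence is just its average.

```python
import math

def running_mean(values, length):
    return [sum(values[i:i + length]) / length
            for i in range(len(values) - length + 1)]

def ctrm(values, lengths=(12, 10, 8)):
    out = list(values)
    for length in lengths:
        out = running_mean(out, length)
    return out

# Synthetic "raw" monthly temperatures: mean level, slow drift, annual cycle.
months = range(240)
raw = [14.0 + 0.002 * t + 5.0 * math.cos(2 * math.pi * t / 12) for t in months]

# Monthly normals from a base period (first 10 years, an arbitrary choice).
normals = [sum(raw[m:120:12]) / 10 for m in range(12)]
anomalies = [raw[t] - normals[t % 12] for t in months]

# Difference between the two filtered curves at every output point.
offset = [r - a for r, a in zip(ctrm(raw), ctrm(anomalies))]
print(offset[0])                  # the constant offset between the curves
print(max(offset) - min(offset))  # spread of that offset, effectively zero
```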

Let’s take a look at what this generates in practice. The following are UAH Anomalies from 1979 to Present with an Annual CTRM applied:

Fig 1: UAH data with an Annual CTRM filter

Note that I have just plotted the data points. The CTRM filter has removed the ‘visual noise’ that month to month variability causes. This is very similar to the 12 or 13 month single running mean that is often used, however it is more accurate as the mathematical errors produced by those simple running means are removed. Additionally, the higher frequencies are completely removed while all the lower frequencies are left completely intact.

The following are HadCRUT4 Anomalies from 1850 to Present with an Annual CTRM applied:

Fig 2: HadCRUT4 data with an Annual CTRM filter

Note again that all the higher frequencies have been removed and the lower frequencies are all displayed without distortions or noise.

There is a small issue with these CTRM filters in that CTRMs are ‘full kernel’ filters as mentioned above, meaning their outputs will not change when new data is added (except to extend the existing plot). However, because the output is in the middle of the input data, they do not extend up to the ends of the data available as can be seen above. In order to overcome this issue, some additional work will be required.
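How much of each end goes missing follows directly from the stage lengths: each running mean of length N shortens the series by N - 1 points. The sketch below does that arithmetic. The function name is hypothetical, and the 15 year stage lengths (180, 149, 124 months) are an assumption obtained by applying the same divide-by-1.2067-and-round rule as for the annual filter.

```python
def ctrm_end_loss(lengths):
    """Return (total samples lost, samples lost at each end) for a cascade."""
    lost_total = sum(lengths) - len(lengths)
    return lost_total, lost_total / 2

print(ctrm_end_loss((12, 10, 8)))      # annual filter, monthly data -> (27, 13.5)
print(ctrm_end_loss((180, 149, 124)))  # 15 year filter, monthly data -> (450, 225.0)
```

Under those assumptions a 15 year cascade on monthly data stops 225 months, or 18.75 years, short of each end of the record.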

The basic principles of filters work over all timescales, thus we do not need to constrain ourselves to an Annual filter. We are, after all, trying to determine how this complex load that is the Earth reacts to the constantly varying surface input and surface reflection/absorption with very long timescale storage and release systems including phase change, mass transport and the like. If this were some giant mechanical structure slowly vibrating away we would run low pass filters with much longer time constants to see what was down in the sub-harmonics. So let’s do just that for Climate.

When I applied a standard time/energy low pass filter sweep against the data I noticed that there is a sweet spot around 12-20 years where the output changes very little. This looks like it may well be a good stop/pass band binary chop point. So I chose 15 years as the roll off point to see what happens. Remember this is a standard low pass/band-pass filter, similar to the one that splits telephone from broadband to connect to the Internet. Using this approach, all frequencies of any period above 15 years are fully preserved in the output and all frequencies below that point are completely removed.
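Readers who want to inspect the roll off for themselves can compute the gain of the cascade at any period. The sketch below uses the standard result that an N point running mean passes a sinusoid of frequency f (in cycles per sample) with gain sin(πfN) / (N sin(πf)); the gain of the cascade is the product of the three stage gains. The function names and the list of periods are illustrative choices, not part of the original analysis.

```python
import math

def running_mean_gain(length, period):
    """Gain of a length-point running mean for a sinusoid of the given period (in samples)."""
    x = math.pi / period
    return math.sin(length * x) / (length * math.sin(x))

def ctrm_gain(lengths, period):
    """Gain of the cascade: the product of the individual stage gains."""
    gain = 1.0
    for length in lengths:
        gain *= running_mean_gain(length, period)
    return gain

annual = (12, 10, 8)  # stage lengths in months
for period in (6, 12, 24, 60, 120, 720):
    print(period, round(ctrm_gain(annual, period), 4))
```

The gain at exactly 12 months is zero because the first stage averages over one whole cycle. Passing `(180, 149, 124)` as the lengths (an assumed rounding of the 15 year cascade) gives the corresponding figures for the longer filter.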

The following are HadCRUT4 Anomalies from 1850 to Present with a 15 CTRM and a 75 year single mean applied:

Fig 3: HadCRUT4 with additional greater than 15 year low pass. Greater than 75 year low pass filter included to remove the red trace discovered by the first pass.

Now, when reviewing the plot above some have claimed that this is a curve fitting or a ‘cycle mania’ exercise. However, the data hasn’t been fit to anything, I just applied a filter. Then out pops some wriggle in that plot which the data draws all on its own at around ~60 years. It’s the data what done it – not me! If you see any ‘cycle’ in the graph, then that’s your perception. What you can’t do is say the wriggle is not there. That’s what the DATA says is there.

Note that the extra ‘greater than 75 years’ single running mean is included to remove the discovered ~60 year line, as one would normally do to get whatever residual is left. Only a single stage running mean can be used as the data available is too short for a full triple cascaded set. The UAH and RSS data series are too short to run a full greater than 15 year triple cascade pass on them, but it is possible to do a greater than 7.5 year which I’ll leave for a future exercise.

And that Full Kernel problem? We can add a Savitzky-Golay filter to the set, which is the Engineering equivalent of LOWESS in Statistics, so should not meet too much resistance from statisticians (want to bet?).

Fig 4: HadCRUT4 with additional S-G projections to observe near term future trends

We can verify that the parameters chosen are correct because the line closely follows the full kernel filter if that is used as a training/verification guide. The latest part of the line should not be considered an absolute guide to the future. Like LOWESS, S-G will ‘whip’ around on new data like a caterpillar searching for a new leaf. However, it tends to follow a similar trajectory, at least until it runs into a tree. While this is only a basic predictive tool, which estimates that the future will be like the recent past, the tool estimates that we are over a local peak and headed downwards…
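The end-point behaviour of a Savitzky-Golay style smoother can be sketched as a local least-squares polynomial fit to the last window of data, evaluated at the final point. The window length, the polynomial order and the function names below are all assumptions made for illustration; the parameters behind Fig 4 are not given in this article.

```python
def fit_polynomial(xs, ys, order):
    """Least-squares polynomial coefficients (lowest power first) via normal equations."""
    n = order + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    rows = [[sum(x ** (i + j) for x in xs) for j in range(n)]
            + [sum(y * x ** i for x, y in zip(xs, ys))]
            for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]

def sg_end_estimate(values, window, order=2):
    """Fit a polynomial to the last `window` points and evaluate it at the final point."""
    ys = values[-window:]
    half = (window - 1) / 2
    xs = [i - half for i in range(window)]  # centred x keeps the fit well conditioned
    coeffs = fit_polynomial(xs, ys, order)
    return sum(c * half ** k for k, c in enumerate(coeffs))

# A smooth quadratic is reproduced at the end point by a second order fit.
series = [0.01 * t * t - 0.3 * t + 2.0 for t in range(50)]
print(sg_end_estimate(series, window=21), series[-1])
```

Because only the most recent window constrains the fit, each new data point can shift the end estimate, which is the ‘whip’ described above.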

And there we have it. A simple data treatment for the various temperature data sets, a high quality filter that removes the noise and helps us to see the bigger picture. Something to test the various claims made as to how the climate system works. Want to compare it against CO2? Go for it. Want to check SO2? Again fine. Volcanoes? Be my guest. Here is a spreadsheet containing UAH and an Annual CTRM and R code for a simple RSS graph. Please just don’t complain if the results from the data don’t meet your expectations. This is just data and summaries of the data. Occam’s Razor for a temperature series. Very simple, but it should be very revealing.

Now the question is how I can improve it. Do you see any flaws in the methodology or tool I’ve developed? Do you know how I can make it more accurate, more effective or more accessible? What other data sets do you think might be good candidates for a CTRM filter? Are there any particular combinations of data sets that you would like to see? You may have noted the 15 year CTRM combining UAH, RSS, HadCRUT and GISS at the head of this article. I have been developing various options at my new Climate Data Blog and based upon your input on this thread, I am planning a follow up article that will delve into some combinations of data sets, some of their similarities and some of their differences.

About the Author: Richard Linsley Hood holds an MSc in System Design and has been working as a ‘Practicing Logician’ (aka Computer Geek) to look at signals, images and the modelling of things in general inside computers for over 40 years now. This is his first venture into Climate Science and temperature analysis.

Kirk c
March 16, 2014 1:30 pm

That is so cool!

Lance Wallace
March 16, 2014 1:42 pm

The CO2 curve (seasonally detrended monthly) can be fit remarkably closely by a quadratic or exponential curve, each with <1% error for 650 or so consecutive months:
http://wattsupwiththat.com/2012/06/02/what-can-we-learn-from-the-mauna-loa-co2-curve-2/
The exponential has a time constant (e-folding time) on the order of 60 years (i.e., doubling time for the anthropogenic additions from the preindustrial level of 260 ppm of about 60*0.69 = 42 years).
Can the Full Kernel Triple thingamajig provide any further insight into the characteristics of the curve? For example, due to various efforts by governments to reduce CO2 emissions, can we see any effect on the curve to date? I tried fitting the exponential curve only up to the year 2005, then 2010, finally to September 2013) and there was a very small movement toward lengthened e-folding time (60.26, 60.94, 61.59 years). But also, theoretically, there is some relation between atmospheric CO2 and CO2 emissions, but one has to assume something about the lag time between CO2 emissions and CO2 concentrations. Can the Triple whatsis somehow compare the two curves and derive either a lag time or an estimate of how much of the CO2 emissions makes it into the observed atmospheric concentrations? Simply assuming that all the CO2 emissions make it into the atmosphere in the following year gives an R^2 on the order of 40% (IIRC).

geran
March 16, 2014 1:45 pm

Since this is your first foray into “climate science”, Richard, let me help with the basics you will need to know.
Climate modeling is about as close to science as are video games. (Except in the better video games, there are sound effects.) Climate modeling works like this: Enter input, run program, no catastrophic results, enter new input and re-start. Continue until you get catastrophic results. Then, publish results and request more funding.
If your program never achieves a catastrophic result, adjust input data until proper results are achieved. (Ever heard of a hockey stick?)
If someone questions your knowledge of science, produce numerous graphics, in color, of your results.
It’s a veritable gold mine!

Arno Arrak
March 16, 2014 1:45 pm

Why bother with HadCrut and GISS? They are worthless. Use UAH and RSS, but get rid of that annual version and use monthly versions. And draw the trend with a semi-transparent magic marker. See Figure 15 in my book “What Warming?”

RichardLH
March 16, 2014 1:58 pm

Lance Wallace says:
March 16, 2014 at 1:42 pm
“Can the Triple whatsis somehow compare the two curves and derive either a lag time or an estimate of how much of the CO2 emissions makes it into the observed atmospheric concentrations?”
Not really. Low pass filters are only going to show periodicity in the data and the CO2 figure is a continuously(?) rising curve.
It is useful to compare how the CO2 curve matches to the residual after you have removed the ~60 ‘cycle’ and there it does match quite well but with one big problem, you need to find something else before 1850 to make it all work out right.
Volcanos are the current favourite but I think that getting just the right number and size of volcanos needed is stretching co-incidence a little too far. Still possible though.

Eliza
March 16, 2014 2:00 pm

Arno 100% correct GISS, HADCRUT are trash data I don’t understand why any posting using that data can be taken seriously *”adjustments”, “UHI ect…”. This is just feeding the warmist trolls.

RichardLH
March 16, 2014 2:03 pm

Arno Arrak says:
March 16, 2014 at 1:45 pm
“Why bother with HadCrut and GISS? They are worthless. Use UAH and RSS, but get rid of that annual version and use monthly versions.”
They are two of the only series that stretch back to 1850 (which unfortunately the satellite data does not) and with Global coverage.
They do match together quite well though with some odd differences.
What good would a monthly view do when trying to assess climate? Assuming you accept that climate is a long term thing, i.e. longer than 15 years. Monthly is down in the Weather range.

RichardLH
March 16, 2014 2:06 pm

Eliza says:
March 16, 2014 at 2:00 pm
“Arno 100% correct GISS, HADCRUT are trash data”
You might like to consider the fact that any long term adjustments will show up in the residual, greater than 75 years curve, and, if they have indeed occurred, would only serve to flatten that part of the output.
The ~60 year wriggle will still be there and needs explaining.

Steven Mosher
March 16, 2014 2:21 pm

Giss and hadcrut are not the only series.
Ncdc. Berkeley. Cowtan and way.
Im using ctrm in some work on gcrs and cloud cover. Thanks for the code richard.
Raise your objections now to the method…folks.

Dr Norman Page
March 16, 2014 2:30 pm

Very nice elucidation of the 60 year cycle in the temperature data, Now do the same for the past 2000 years using a suitable filter. A review of candidate proxy data reconstructions and the historical record of climate during the last 2000 years suggests that at this time the most useful reconstruction for identifying temperature trends in the latest important millennial cycle is that of Christiansen and Ljungqvist 2012 (Fig 5)
http://www.clim-past.net/8/765/2012/cp-8-765-2012.pdf
For a forecast of the coming cooling based on the 60 and 1000 year quasi periodicities in the temperatures and the neutron count as a proxy for solar “activity ”
see http://climatesense-norpag.blogspot.com
this also has the Christiansen plot see Fig3. and the 1000 year cycle from ice cores Fig4
The biggest uncertainty in these forecasts is the uncertainty in the timing of the 1000 year cycle peak.
In the figure in Richards post it looks like it is about 2009. From the SST data it looks like about 2003. See Fig 7 in the link and NOAA data at
ftp://ftp.ncdc.noaa.gov/pub/data/anomalies/annual.ocean.90S.90N.df_1901-2000mean.dat
It is time to abandon forecasting from models and for discussion and forecasting purposes use the pattern recognition method seen at the link
http://climatesense-norpag.blogspot.com .

J Martin
March 16, 2014 2:31 pm

What is the point of all this, what is the end result, projection / prediction ?

Arno Arrak
March 16, 2014 2:31 pm

Lance Wallace says on March 16, 2014 at 1:42 pm:
“The CO2 curve (seasonally detrended monthly) can be fit remarkably closely by a quadratic or exponential curve, each with <1% error for 650 or so consecutive months…"
So what. CO2 is not the cause of any global warming and is not worth that expenditure of useless arithmetic. The only thing important about it is that it is completely smooth (except for its seasonal wiggle) during the last two centuries. It follows from this that it is physically impossible to start any greenhouse warming during these two centuries. We already know that there has been no warming for the last 17 years despite the fact that there is more carbon dioxide in the air now than ever before. That makes the twenty-first century greenhouse free. Since this constant addition of CO2 is not causing any warming it follows that the theory of enhanced greenhouse warming is defective. It does not work and should be discarded. The only theory that correctly describes this behavior is the Miskolczi theory that ignorant global warming activists hate. But the twentieth century did have warming. It came in two spurts. The first one started in 1910, raised global temperature by half a degree Celsius, and stopped in 1940. The second one started in 1999, raised global temperature by a third of a degree in only three years, and then stopped. Here is where the smoothness of the CO2 curve comes in. Radiation laws of physics require that in order to start an enhanced greenhouse warming you must simultaneously add carbon dioxide to the atmosphere. That is because the absorbency of CO2 for infrared radiation is a fixed property of its molecules that cannot be changed. To get more warming, get more molecules. This did not happen in 1910 or in 1999 as shown by the Keeling curve and its extension. Hence, all warming within the twentieth century was natural warming, not enhanced greenhouse warming. Consequently we now have the twentieth and the twenty-first centuries both entirely greenhouse free. Hence that anthropogenic global warming that is the life blood of IPCC simply does not exist.
To put it in other words: AGW has proven to be nothing more than a pseudo-scientific fantasy, result of a false belief that Hansen discovered greenhouse warming in 1988.

Hlaford
March 16, 2014 2:37 pm

The N-path filters make the best notch filters in sampled domains. They fell under the radar for the other kinds of filtering, but here you’d have 12 paths and a high pass at each of them for the monthly data.
I’ve seen a paper where a guy explains how he removed the noise made by vuvuzela trumpets using N-path filtering.

RichardLH
March 16, 2014 2:39 pm

Steven Mosher says:
March 16, 2014 at 2:21 pm
“Giss and hadcrut are not the only series. Ncdc. Berkeley. Cowtan and way.
Im using ctrm in some work on gcrs and cloud cover. Thanks for the code richard.”
No problem – it’s what Greg and Vaughan thrashed out translated into R (for the CTRM).
I know there are other series but I wanted to start with the most commonly used ones first. I have treatments for some of the others as well. At present I am working on a set of Global, Land, Ocean triples as that has thrown up some interesting observations.
It is a great pity that the temperature series are not available in a form that is easy to read into R, but require, sometimes, a lot of code just to turn them into data.frames. Makes it a lot more difficult to post the code to the ‘net in an easy to understand form.
The ~60 years ‘cycle’ shows up strongly in all of them so far. 🙂

jai mitchell
March 16, 2014 2:39 pm

Since this constant addition of CO2 is not causing any warming it follows that the theory of enhanced greenhouse warming is defective. It does not work and should be discarded.
. . .hilarious. . .
http://www.skepticalscience.com//pics/oceanheat-NODC-endof2013.jpg

RichardLH
March 16, 2014 2:41 pm

J Martin says:
March 16, 2014 at 2:31 pm
“What is the point of all this, what is the end result, projection / prediction ?”
It allows you to look at how the system responded to the inputs over the last 150+ years with fewer mathematical errors hiding the details. You can use the S-G trend as a loose guide to the immediate future trend if you like.

Gary
March 16, 2014 2:42 pm

How much useful information is being lost by filtering out the high frequency “noise” ? In other words, how do you judge what the most effective bandwidth of the filter is?

RichardLH
March 16, 2014 2:44 pm

Hlaford says:
March 16, 2014 at 2:37 pm
“The N-path filters make the best notch filters in sampled domains.”
There are many good notch filters out there, but that is not what this is. In fact it is the exact opposite. A broadband stop/pass filter that will allow ALL of the frequencies above 15 years in length to be present in the output. No need to choose what to look for, anything that is there, will be there.

RichardLH
March 16, 2014 2:49 pm

Gary says:
March 16, 2014 at 2:42 pm
“How much useful information is being lost by filtering out the high frequency “noise” ? In other words, how do you judge what the most effective bandwidth of the filter is?”
Well what you lose is, Daily, Weather, Monthly, Yearly and Decadal. What you keep is multi-decadal and longer. Which do you consider to be relevant to Climate?
If you are interested in the other stuff you can run a high pass version instead and look at that only if you wish. Just subtract this output signal from the input signal and away you go.
No data is truly lost as such, it is just in the other pass band. The high and low pass added together will, by definition, always be the input signal.
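The subtraction described in this reply can be sketched as follows. The names are hypothetical, the series is synthetic, and the 15 year stage lengths (180, 149, 124 months) are an assumed rounding. The only subtlety is alignment: the low pass output is shorter than the input, so the input has to be trimmed by the same amount at each end before subtracting.

```python
import math

def running_mean(values, length):
    return [sum(values[i:i + length]) / length
            for i in range(len(values) - length + 1)]

def ctrm(values, lengths):
    out = list(values)
    for length in lengths:
        out = running_mean(out, length)
    return out

def split_bands(values, lengths):
    """Return (aligned input, low pass, high pass) over the span the filter covers."""
    low = ctrm(values, lengths)
    lost = sum(lengths) - len(lengths)
    if lost % 2:
        raise ValueError("filter centre falls between samples; cannot align exactly")
    offset = lost // 2
    aligned = values[offset:offset + len(low)]
    high = [v - l for v, l in zip(aligned, low)]
    return aligned, low, high

# Synthetic monthly series: slow drift plus a 5 year wiggle.
series = [0.001 * t + math.sin(2 * math.pi * t / 60) for t in range(1000)]
aligned, low, high = split_bands(series, (180, 149, 124))
print(len(series), len(low))  # 1000 550
```

For the annual 12, 10, 8 cascade the centre falls between two months, so this sketch raises an error there rather than silently shifting the data by half a sample.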

David L. Hagen
March 16, 2014 3:01 pm

Thanks Richard for an insightful way of exposing the 60 year cycle.
1) Has an algebraic factor been found behind the 1.2067 factor or is this still empirical per von Pratt?
2) Does using the more accurate year length of 365.26 days make any difference?
3) Suggest showing the derivative of the CTRM curve.
That would show more clearly that the rate of warming is declining. If that derivative goes though zero, that would give evidence that we are now entering the next “cooling” period vs just flattening in warming.
PS suggest amending “with a 15 CTRM and” to “with a 15 CTRM year and” to read more clearly.
Interesting interaction you had with TallBloke.

Mike McMillan
March 16, 2014 3:10 pm

Well, it certainly gets rid of the 1998 peak problem. GISS would be appreciative.

Bernie Hutchins
March 16, 2014 3:14 pm

Is it possible to humor some of us old-timer digital filter designers by specifically saying what the (Finite) Impulse Response of the CTRM is; in basic terms such as what simpler elements are being cascaded (convolved for the IR response, multiplied for the frequency response, etc.)? This is the key to understanding filtering – old or new. Thanks.

RichardLH
March 16, 2014 3:15 pm

Dr Norman Page says:
March 16, 2014 at 2:30 pm
“Very nice elucidation of the 60 year cycle in the temperature data, Now do the same for the past 2000 years using a suitable filter.”
Bit of a challenge finding a thermometer series going back that far. 😉
Proxy series all come with their own problems. They are rarely in a resolution that will allow the ~60 year signal to be seen.
I do have one which is the Shen PDO reconstruction from rainfall which does it quite well back to the 1400s.
Shen, C., W.-C. Wang, W. Gong, and Z. Hao. 2006.
A Pacific Decadal Oscillation record since 1470 AD reconstructed
from proxy data of summer rainfall over eastern China.
Geophysical Research Letters, vol. 33, L03702, February 2006.
ftp://ftp.ncdc.noaa.gov/pub/data/paleo/historical/pacific/pdo-shen2006.txt
Looks like the ~60 year is present a long way back then. As to any longer cycles, there the problem is resolution and data length. In most cases the ‘noise’ is so great you can conclude almost anything and not be proved wrong by the data unfortunately.

Graeme W
March 16, 2014 3:16 pm

I have a couple of questions regarding the initial filter.
1. A 365 day filter has an ongoing problem due to leap-years. There are, roughly, 25 leap days in a century, which pushes a 365 day filter almost a month out of alignment. How is this catered for? It’s not a problem with a 12 month filter, but then you hit the problem that each month is not equal in length.
2. You’re doing filtering on anomalies to remove the seasonal component. Aren’t the anomalies supposed to do that themselves, since they’re anomalies from the average for each part of the season. That is, the anomalies themselves are trying to remove the seasonal component, and the filter is also trying to remove the seasonal component. How do ensure that any result we get isn’t an artifact of these two processes interacting?
Please forgive me if these questions show my ignorance, but they’ve been a concern of mine for awhile.

Steve Taylor
March 16, 2014 3:17 pm

Richard, are you related to the late, great John Linsley Hood, by any chance ?

Clay Marley
March 16, 2014 3:20 pm

I am not too fond of the triple running mean filter, mainly because it is not easy to implement generically in Excel. By that I mean if I wanted to change the number of points in one stage, it is a fair amount of work. Also there are three smoothing parameters to tinker with rather than a single number of regression points, and I have not seen any good theory on how to pick the three ranges to average over.
I prefer either a LOESS or Gaussian filter for this kind of data. Both can be implemented easily in Excel with macros. Both have the advantage of being able to handle points near either end of the data set. LOESS also works when the points are not equally spaced. By selecting the proper number of regression points, either one will be able to produce results virtually indistinguishable from the triple running mean filter. Also with LOESS it is possible to estimate the derivative at any point.
I have not experimented much with the Savitzky-Golay filter yet because it requires recalculating the coefficients with each change in the order or number of regression points. Can be done of course, just makes the macros more complicated. And I am not sure it is worth the effort. One of these days I’ll probably get around to it.
Any smoothing like this is mainly just a data visualization tool. How it looks is largely subjective. Rarely do I see a smoothed signal being used in later analysis, such as correlating it with something; and when I do it raises a red flag.

Steven Mosher
March 16, 2014 3:21 pm

RichardLH
“They are two of the only series that stretch back to 1850 (which unfortunately the satellite data does not) and with Global coverage.”
like I said. wrong

RichardLH
March 16, 2014 3:23 pm

David L. Hagen says:
March 16, 2014 at 3:01 pm
“Thanks Richard for an insightful way of exposing the 60 year cycle.”
Lots of people claim it is not there. This just shows that you cannot really ignore it that easily.
“1) Has an algebraic factor been found behind the 1.2067 factor or is this still empirical per von Pratt?”
Vaughan has a precise set of reasoning which he gave on Greg’s thread over at JCs. It is not empirical but pure mathematical.
” 2) Does using the more accurate year length of 365.26 days make any difference?”
I have found that there may be a 4 year signal in some data which would suggest that it may well be. I haven’t looked in detail at the high pass portion of the signal yet.
“3) Suggest showing the derivative of the CTRM curve.”
I hate Linear Trends and all their cousins. They only hold true over the range they are drawn from and hide rather than reveal as far as I am concerned. Use the residual, greater than 75 years, as a better alternative for any long term values. You can run an S-G at that length but please be aware of S-G tendencies to be strongly influenced by the ‘end’ data.
“That would show more clearly that the rate of warming is declining. If that derivative goes though zero, that would give evidence that we are now entering the next “cooling” period vs just flattening in warming.”
I think that the S-G 15 year shows that quite well in any case.

RichardLH
March 16, 2014 3:24 pm

Steve Taylor says:
March 16, 2014 at 3:17 pm
“Richard, are you related to the late, great John Linsley Hood, by any chance ?”
My Uncle. He got me interested in Audio (and Science) a long time back and was the originator of the 1.3371 value I was originally using in my first CTRM digital work.

March 16, 2014 3:27 pm

RLH The Christiansen data I linked to are I think archived. Read the paper and see why I think they do provide a temperature record which is good enough to be useful for analysis. It should replace the Mann hockey stick in the public perception of the NH climate over the last 1000 years.
Again check Figs 3 and 4 at http://climatesense-norpag.blogspot.com.
Forecasting is fairly straightforward if you know where you are on the 1000 year quasi-periodicity.
If you don’t take it into consideration you are lost in a maze of short term noise.

RichardLH
March 16, 2014 3:30 pm

Bernie Hutchins says:
March 16, 2014 at 3:14 pm
“Is it possible to humor some of us old-timer digital filter designers by specifically saying what the (Finite) Impulse Response of the CTRM is”
Basically the triple cascade allows for a near to Gaussian response curve. This can be seen in greater detail on Greg’s thread where the various curves can be compared. CTRM is just slightly better than Gaussian in that it completely removes the corner frequency whereas a true Gaussian does leave a small portion still in the output.

RichardLH
March 16, 2014 3:33 pm

Mike McMillan says:
March 16, 2014 at 3:10 pm
“Well, it certainly gets rid of the 1998 peak problem. GISS would be appreciative.”
Does throw up another question though. Why is it that GISS and HadCRUT are so far apart in the middle? They are close together at the start and the finish, why so different in the middle? I am not sure GISS (or HadCRUT) will thank me that much.

garymount
March 16, 2014 3:34 pm

You use the term signal but you haven’t provided a definition for “signal”. Is this data? Is “signal” an entity of some kind? Is there a signal in the probabilities of rolling a dice? I just haven’t come across the word “signal” very often in any of the studies over all the decades of my personal studies and work, perhaps because other terminology is used. My studies have included electronics, but it didn’t involve signal processing. I also spent 10 years full time work in drawing electrical schematic diagrams, about 3 decades of computer science, and I somewhat recently spent several years studying calculus every single day, so now I call myself a mathematician.
If we are using computers to work with a signal, that means it is probably converted to discretized data, consisting of finite numbers, though you can use semantic computation, though I don’t think that’s what I see happening.
Sorry for being somewhat pedantic, but I’m just not comfortable with calling the temperature of the global surface a signal, so I’m asking for a better description of “signal”.

david moon
March 16, 2014 3:36 pm

It’s hard to tell what this is from the description in words. Can you show a block diagram or provide a spreadsheet? Also just showing the time response to a step input, or the frequency response would be helpful.
It sounds like it’s an FIR filter with no feedback of the output. If so, you should be able to multiply out the nested means and the coefficient and get 12 coefficients, one for each tap of an FIR.
One quibble: 75 years is period, not a frequency. It would be more correct to say “…frequencies with periods greater than 75 years…”
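The "multiply out the nested means" idea can be sketched as follows. This is only an illustration, assuming monthly data and running-mean windows of 12, 10 and 8 months (12 divided successively by the 1.2067 inter-stage value and rounded). Those widths and the name `ctrm_kernel` are assumptions made for the example, not something read off the spreadsheet.

```python
import numpy as np

def ctrm_kernel(widths=(12, 10, 8)):
    """Multiply out cascaded running means into a single FIR impulse response.

    The default widths are an assumption (12 / 1.2067 ~ 10, 10 / 1.2067 ~ 8).
    """
    kernel = np.array([1.0])
    for w in widths:
        # Convolving with a boxcar of height 1/w is one running-mean stage.
        kernel = np.convolve(kernel, np.ones(w) / w)
    return kernel

h = ctrm_kernel()
print(len(h))          # number of taps in the equivalent single FIR filter
print(np.round(h, 4))  # symmetric, bell-shaped weights that sum to 1
```

Under these assumed widths the equivalent filter has 12 + 10 + 8 - 2 = 28 taps rather than 12, because each stage widens the overall window.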

RichardLH
March 16, 2014 3:39 pm

Graeme W says:
March 16, 2014 at 3:16 pm
“I have a couple of questions regarding the initial filter.
1. A 365 day filter has an ongoing problem due to leap-years. There are, roughly, 25 leap days in a century, which pushes a 365 day filter almost a month out of alignment. How is this catered for? It’s not a problem with a 12 month filter, but then you hit the problem that each month is not equal in length.”
True. The advantage of a continuous CTRM filter is that the full sampling frequency is preserved in the output so any error is very tiny. If you were to sub-sample to Annual then the problem you see would be present. This does not have that problem.
“2. You’re doing filtering on anomalies to remove the seasonal component. Aren’t the anomalies supposed to do that themselves, since they’re anomalies from the average for each part of the season. That is, the anomalies themselves are trying to remove the seasonal component, and the filter is also trying to remove the seasonal component. How do ensure that any result we get isn’t an artifact of these two processes interacting?”
Fortunately it doesn’t matter if you do this on Temperature or Anomalies, you end up with the same curve, just a different label to the left. The problem with ‘normals’ is that there are only 30 additions to make the normal. That will leave some error term in that which will then be present in all of the Anomaly outputs. Using a CTRM removes that residual as well rather nicely.
“Please forgive me if these questions show my ignorance, but they’ve been a concern of mine for awhile.”
Mine too which is why I use CTRM now as it is much mathematically purer.

RichardLH
March 16, 2014 3:46 pm

Steven Mosher says:
March 16, 2014 at 3:21 pm
“RichardLH
“They are two of the only series that stretch back to 1850 (which unfortunately the satellite data does not) and with Global coverage.”
like I said. wrong”
OK. To be precise, there is only one set of thermometers from which the various treatments are drawn.
Some included more thermometers than the others, some less. I have comparisons between most of them and none come out particularly favourably. I have BEST, C&W, NCDC and others. All show surprising disparities with each other with no easy explanations.
I’ll get round to them in time but the next set is Land, Ocean and General triples for the sets I do have to hand right now.

RichardLH
March 16, 2014 3:48 pm

Dr Norman Page says:
March 16, 2014 at 3:27 pm
“RLH The Christiansen data I linked to are I think archived.”
Thanks, I will. Do you know if the data is on the ‘net and the url? Or is it just the paper?

Bernie Hutchins
March 16, 2014 3:51 pm

RichardLH says: March 16, 2014 at 3:30 pm
“Bernie Hutchins says:
March 16, 2014 at 3:14 pm
“Is it possible to humor some of us old-timer digital filter designers by specifically saying what the (Finite) Impulse Response of the CTRM is”
Basically the triple cascade allows for a near to Gaussian response curve. This can be seen in greater detail on Greg’s thread where the various curves can be compared. CTRM is just slightly better than Gaussian in that it completely removes the corner frequency whereas a true Gaussian does leave a small portion still in the output.”
Fine – thanks. But we need a “formula” or design procedure that leads to h(n) = …. so that we can calculate the frequency response in a known way (sum of frequency domain cosines in this case). It seems to me that SG with its optimal flatness around DC couldn’t be beaten. Because the data is noisy, spurious resonances (if any) matter. Hence the advantage of SG.

RichardLH
March 16, 2014 3:51 pm

garymount says:
March 16, 2014 at 3:34 pm
“You use the term signal but you haven’t provided a definition for “signal”. Is this data?”
You are quite right. I do tend to use the term loosely. As far as I am concerned a ‘signal’ is any variation in the temperature data, long term. I tend to class Weather, Monthly, Annual as ‘noise’.
Just my background. Sorry if it confuses.

david moon
March 16, 2014 3:54 pm

Assuming monthly data points: 1, 2, etc. Is this how it works?
M1 = mean(1,2,3,4)
M2 = mean(5,6,7,8)
M3 = mean(9,10,11,12)
M4 = 1.2067 x mean(M1,M2)
M5 = 1.2067 x mean(M2,M3)
OUTPUT = 1.2067 x mean(M4,M5)
Shift data one place and continue…

garymount
March 16, 2014 3:55 pm

RichardLH says: March 16, 2014 at 3:51 pm
– – –
Thanks Richard, now I feel better 🙂

RichardLH
March 16, 2014 3:57 pm

david moon says:
March 16, 2014 at 3:36 pm
“It’s hard to tell what this is from the description in words. Can you show a block diagram or provide a spreadsheet?”
…Here is a spreadsheet (http://wattsupwiththat.files.wordpress.com/2010/07/uah-with-annual-ctrm.xlsx) containing UAH and an Annual CTRM, from above.
“Also just showing the time response to a step input, or the frequency response would be helpful.”
If you want frequency response graphs and the like then Greg’s thread on JC’s is a better place. I left out a lot of the technical detail as the post got way too long otherwise.
Remove bits
…all this is at Judith Curry’s site.
http://judithcurry.com/2013/11/22/data-corruption-by-running-mean-smoothers/
Data corruption by running mean ‘smoothers’
Posted on November 22, 2013 by Greg Goodman
or visit Greg’s own site for the same article.
http://climategrog.wordpress.com/2013/05/19/triple-running-mean-filters/

“One quibble: 75 years is period, not a frequency. It would be more correct to say “…frequencies with periods greater than 75 years…””
Again, sorry for the sloppy terminology. I tend to use the two without considering the exact wording required. I think in one and have to write in the other!

RichardLH
March 16, 2014 4:00 pm

david moon says:
March 16, 2014 at 3:54 pm
“Assuming monthly data points: 1, 2, etc. Is this how it works?”
The spreadsheet with its colour coding may help you understand what is going on better.
http://wattsupwiththat.files.wordpress.com/2010/07/uah-with-annual-ctrm.xlsx
Basically you run a single mean over the data. then you use that column as the input to the next stage and so on. Hence the cascaded bit.
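The "use that column as the input to the next stage" description can be sketched in a few lines. This is a minimal illustration, not the spreadsheet itself: the window widths of 12, 10 and 8 months and the synthetic test series are assumptions made for the example.

```python
import numpy as np

def running_mean(x, width):
    """Plain running mean; only positions with a full window are returned."""
    return np.convolve(x, np.ones(width) / width, mode="valid")

def ctrm(x, widths=(12, 10, 8)):
    """Cascade: each stage's output column is the next stage's input."""
    out = np.asarray(x, dtype=float)
    for w in widths:
        out = running_mean(out, w)
    return out

# Synthetic monthly series (hypothetical): a slow trend plus an annual cycle.
months = np.arange(240)
series = 0.01 * months + 2.0 * np.sin(2 * np.pi * months / 12)
smooth = ctrm(series)
print(len(series), len(smooth))
```

Each stage shortens the series, which is the full kernel limitation: with these widths the output stops 13.5 months short of each end of the data. On this synthetic series the annual sine disappears and only the trend remains.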

A Crooks of Adelaide
March 16, 2014 4:02 pm

Hi,
Just a note on my own observations with respect to the top graph
What I see is a sixty year sine curve
What I also see is a pattern of two peaks followed by a deep trough – two peaks followed by a deep trough etc. The troughs repeat every 7.5 years next one due in 2016 the peaks are every 3.75 years
I use http://www.climate4you.com/images/AllCompared%20GlobalMonthlyTempSince1979.gif
as my satellite source.
If you plot El Chichon and Pinatubo on the graph you will see how much volcanic events disrupt this pattern (http://www.esrl.noaa.gov/gmd/grad/mloapt.html), knocking the shoulders off the plateaus of the running average.
Sure, there is an oddity around 2005 where the peaks have their own little troughs; too bad.
The graphs I use are:
A = 0.18*SIN(((YEAR-1993)/60)*2*3.14159)+0.2
B = 0.1*COS(((YEAR-1982)/7.5)*2*3.14159)
C = 0.25*COS(((YEAR -1980)/3.75)*2*3.14159)
See you in 2016 to see how close I get!

Arno Arrak
March 16, 2014 4:07 pm

RichardLH on March 16, 2014 at 2:03 pm
Richard, you are hung up on useless arithmetic and are defending falsified temperature curves. Both HadCRUT and GISS showed presence of a non-existent warming in the eighties and nineties. My nose tells me that Hansen is behind this. I pointed it out in my book and two years later both sources changed their data retroactively to make it conform to satellites which do not show this warming. They did this secretly and no explanation was given. To coordinate it there had to be a trans-Atlantic communication link between them. This group also includes NCDC which I did not mention before.
Another thing you talk about is volcanos: ” Volcanos are the current favourite but I think that getting just the right number and size of volcanos needed is stretching co-incidence a little too far.” Forget about volcanoes, there is no volcanic cooling. The hot gases from an eruption ascend to the stratosphere where they warm it at first. In about two years time it turns to cooling but this stays up there too and never descends into the troposphere. The alleged “volcanic cooling” is caused by an accidental coincidence of expected cooling with the location of a La Nina cooling that is part of ENSO. The entire global temperature curve is a concatenation of El Nino peaks with La Nina valleys in between. It has been that way ever since the Panamanian Seaway closed. Many so-called “experts” don’t know this and would like nothing better than to detrend it out of existence. But by so doing they are destroying information and information about the exact location of El Nino peaks is required to evaluate the effect of volcanoes. In general, if an eruption is coincident with an El Nino peak it will be followed by a La Nina valley which promptly gets appropriated for a “volcanic cooling” incident. This happened with Pinatubo. But ENSO and volcanism are not in phase and it can happen that an eruption coincides with a La Nina valley. It will then be followed by an El Nino peak exactly where the volcanic cooling is supposed to happen and volcanologists are still scratching their heads about it. This is what happened with El Chichon in Mexico that was equally as strong as Pinatubo. This lack of understanding leads to more stupidity with models. Looking at a CMIP5 feather duster I noticed that all of the threads that completely disagree with one another about the future dip down together at the locations of El Chichon and Pinatubo volcanoes, no doubt because they have volcanic cooling coded into their software. 
This story is actually no secret because it has been in my book all along but these “climate” scientists just don’t do their homework and are still ignorant about it.

A Crooks of Adelaide
March 16, 2014 4:08 pm

I guess I needed to add the important bit on how I combine the formulae Sorry about that:
The overarching trend is a sixty year cycle: = A
The moving 20-month average adds a 7.5 year cycle attenuated by the truncation of
the positive peaks of the 7.5 year cycle : = A + (IF B>0, 0, ELSE = B)
The monthly average combines a 7.5 year cycle with a 3.75 year cycle (i.e. twice the
7.5 year cycle) to capture the pattern where every second trough in the 3.75 year COS
function is significantly deeper the others : = A + (3/4) * B + C
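The three formulae and the two combinations above can be written out as a short sketch. The function name `crooks_curves` and the year range are hypothetical, and `np.pi` is used in place of 3.14159. This reproduces the curves as described; it does not test whether they fit the data.

```python
import numpy as np

def crooks_curves(year):
    """Return (A, smoothed model, monthly model) for decimal years."""
    year = np.asarray(year, dtype=float)
    a = 0.18 * np.sin((year - 1993) / 60 * 2 * np.pi) + 0.2   # 60 year cycle
    b = 0.1 * np.cos((year - 1982) / 7.5 * 2 * np.pi)         # 7.5 year cycle
    c = 0.25 * np.cos((year - 1980) / 3.75 * 2 * np.pi)       # 3.75 year cycle
    smoothed = a + np.minimum(b, 0.0)   # A + (IF B>0, 0, ELSE B)
    monthly = a + 0.75 * b + c          # A + (3/4) * B + C
    return a, smoothed, monthly

years = np.arange(1979, 2017, 1 / 12)   # monthly steps, range chosen for illustration
a, smoothed, monthly = crooks_curves(years)
```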

RichardLH
March 16, 2014 4:09 pm

Bernie Hutchins says:
March 16, 2014 at 3:51 pm
“Fine – thanks. But we need a “formula” or design procedure that leads to h(n) = …. so that we can calculate the frequency response in a known way (sum of frequency domain cosines in this case).”
Well for Gaussian then Greg refers to this for the full mathematical derivation.
http://www.cwu.edu/~andonie/MyPapers/Gaussian%20Smoothing_96.pdf
For frequency responses then
http://climatedatablog.files.wordpress.com/2014/02/fig-1-gaussian-simple-mean-frequency-plots.png
and
http://climatedatablog.files.wordpress.com/2014/02/fig-2-low-pass-gaussian-ctrm-compare.png
give the various filter responses graphically.

RichardLH
March 16, 2014 4:11 pm

A Crooks of Adelaide says:
March 16, 2014 at 4:02 pm
The point of this is that it is NOT a curve fit of any form. It is a simple low pass treatment of the data. Any ‘cycle’ you see, you need to explain. This just shows what is there.

RichardLH
March 16, 2014 4:13 pm

Arno Arrak says:
March 16, 2014 at 4:07 pm
“Richard, you are hung up on useless arithmetic and are defending falsified temperature curves. Both HadCRUT and GISS showed presence of a non-existent warming in the eighties and nineties.”
Low pass filters are not useless arithmetic. They are the way that you can deal with the data you have all the time. You use them all the time in Daily, Monthly and Yearly form. This is just an extension of that.
If you believe that the data has been fiddled, then this should be an ideal tool for you to demonstrate that.

Bernie Hutchins
March 16, 2014 4:39 pm

RichardLH says:
March 16, 2014 at 4:09 pm
Thanks Richard – that’s what I thought.
But when you cascade Moving Averages (running means, rectangles, “boxcars”) you get all the nulls of the original rectangles in the cascade. When you cascade Gaussians, you get essentially the smallest bandwidth of the set (longest average). With SG, you get an attractive wide low-pass bandwidth (optimally flat) at the expense of less high-pass rejection (never a free lunch).
SG is easy to do. By coincidence I posted an application note on it just a month ago. Note my Fig. 6b for example compared to the two frequency responses you posted for me.
http://electronotes.netfirms.com/AN404.pdf
Without due regard for the frequency response, you COULD BE seeing “colored noise” due to a spurious resonance. You are certainly NOT seeing something totally spurious, because for one thing, we can “see” the 60 year periodicity already in the original data. Less noise – but no new information?

AlexS
March 16, 2014 4:42 pm

One more instance where people want to produce omelets without eggs.
This answer below show everything is wrong about the author’s attitude:
“They are two of the only series that stretch back to 1850 (which unfortunately the satellite data does not) and with Global coverage.”

March 16, 2014 4:47 pm

It would be helpful to know the following:
1. What does this method tell us that a linear trend does not tell us?
2. How clear is it from the method that there is a ~60-year cyclicity in the data?
3. What tests have you performed to see whether this ~60-year cycle is synchronous with the ~60-year cycles of the great ocean oscillations? Or with the ~60-year planetary beat?
4. What are the main reasons why this method is better than other methods now in use?
5. Given that orbital characteristics have already removed any seasonal influence from the data, what need is there to remove it again?
6. Are you advocating this method as a replacement for the linear trends now used by the IPCC etc.?
7. Do you propose to get a paper on the applicability of this method to climate measurements into the learned journals, and to persuade the IPCC to adopt it?
Many thanks

RichardLH
March 16, 2014 4:49 pm

Bernie Hutchins says:
March 16, 2014 at 4:39 pm
“Thanks Richard – that’s what I thought.
But when you cascade Moving Averages (running means, rectangles, “boxcars”) you get all the nulls of the original rectangles in the cascade.”
But that is the reason for the 1.2067 inter-stage value. It places the nulls into the centre of the previous errors and thus the cascade flattens the whole thing to Gaussian, or very nearly so.
Vaughan was kind enough to agree with me when I pointed out that digitisation and range errors dominate over any residual errors with just the three stages.
S-G is all very well, but it does have a tendency to be quite ‘whippy’ at the ends. Full kernel filters never change when new data is added, they just extend. Thus certainty comes from their output, not uncertainty.
So I use CTRM for the majority of the data with S-G only for the ends, and verify one against the other for the parameters.
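The point about where the nulls fall can be checked numerically. The sketch below assumes monthly data and stage widths of 12, 10 and 8 months (12 divided by 1.2067 twice, then rounded). The function names are hypothetical. A running mean of width w has nulls at periods w, w/2, w/3 and so on, and a cascade multiplies the stage responses together.

```python
import numpy as np

def boxcar_response(freq, width):
    """Magnitude response of one running mean at freq (cycles per sample)."""
    k = np.arange(width)
    return abs(np.sum(np.exp(-2j * np.pi * freq * k)) / width)

def cascade_response(freq, widths=(12, 10, 8)):
    """Response of the cascade: the product of the stage responses."""
    return float(np.prod([boxcar_response(freq, w) for w in widths]))

for period in (12, 10, 8):
    print(period, cascade_response(1.0 / period))   # all effectively zero

# The 12 month mean alone still passes about a fifth of an 8 month cycle.
print(boxcar_response(1.0 / 8, 12))
```

With these assumed widths the third stage's null lands on the 8 month period, close to the middle of the first side lobe of the 12 month mean, which is one way to read "places the nulls into the centre of the previous errors".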

RichardLH
March 16, 2014 4:50 pm

AlexS says:
March 16, 2014 at 4:42 pm
“One more instance where people want to produce omelets without eggs.
This answer below show everything is wrong about the author’s attitude:
“They are two of the only series that stretch back to 1850 (which unfortunately the satellite data does not) and with Global coverage.””
Show me a Global data set that extends further back and I will use it.

RichardLH
March 16, 2014 5:10 pm

Monckton of Brenchley says:
March 16, 2014 at 4:47 pm
“It would be helpful to know the following:
1. What does this method tell us that a linear trend does not tell us?”
Linear trends are to my mind almost the most useless of statistics. They are not really valid outside of the range from which they are drawn. A continuous function, such as a filter, is a much better guide as to what is actually happening in the available data. CTRM filters are the most accurate, simple, continuous function you can use.
“2. How clear is it from the method that there is a ~60-year cyclicity in the data?”
If there IS a ~60 year cycle in the data, then it will be present and demonstrable, by measuring peak to peak or mid point to mid point. With only two such cycles available in the data it is right at the edge of any such decision though. Nearly in ‘toss a coin’ land but signal decoders work like this all the time.
” 3. What tests have you performed to see whether this ~60-year cycle is synchronous with the ~60-year cycles of the great ocean oscillations? Or with the ~60-year planetary beat?”
I have quite a few graphs of PDO, AMO/NAO on http://climatedatablog.wordpress.com/. I am in the early days of drawing them all together and that is hopefully going to be part of my next article.
“4. What are the main reasons why this method is better than other methods now in use?”
Mathematically purer than single running means and not much more difficult to create/use.
“5. Given that orbital characteristics have already removed any seasonal influence from the data, what need is there to remove it again?”
Normal/Anomaly only has 30 additions so leaves a surprisingly large error term in that sort of work. This removes all of that quite cleanly. It can be run on either normalised or raw data and will produce the same high quality output for both.
” 6. Are you advocating this method as a replacement for the linear trends now used by the IPCC etc.?”
Absolutely.
Linear Trend = Tangent to the Curve = Flat Earth.
” 7. Do you propose to get a paper on the applicability of this method to climate measurements into the learned journals, and to persuade the IPCC to adopt it?”
My academic days are well past now. It may well be worth while trying to get a paper published though.

davidmhoffer
March 16, 2014 5:11 pm

Since temperature doesn’t vary linearly with w/m2, this method is as useless as calculating an average temperature is in the first place. Convert all the temperature data to w/m2 and then look at all the trends with all the filters you want. Since averaging temperature data is absurd in the first place, all this accomplishes is a sophisticated analysis of an absurd data set.
The fascination which so many seem to have with trends of averaged temperature data is beyond me. It tells us precisely nothing about the energy balance of the earth, no matter how you filter it.

Bernie Hutchins
March 16, 2014 5:13 pm

RichardLH says:
March 16, 2014 at 4:49 pm
Thanks again Richard
True enough you may be approximating Gaussian. Convolving many rectangles tends Gaussian of course. So? You are then using low-pass region that is rapidly falling from the start instead of one (SG) that could be optimally flat. What is your reason for choosing one over the other?
And I don’t understand your “S-G is all very well, but it does have a tendency to be quite ‘whippy’ at the ends.” What is “whippy”? Are you saying there are end effects on the ends – that is no surprise. Nobody really knows how to handle ends – just to caution end interpretation. Then you say: “So I use CTRM for the majority of the data with S-G only for the ends…” So you want the whippy behavior?
(Someone around here at times says smoothed data is not THE data. )

March 16, 2014 5:20 pm

RLH
Here’s the link. You will have to read the paper carefully to see which data was used in the paper and how. Check on Christiansen
at
ftp://ftp.ncdc.noaa.gov/pub/data/paleo/contributions_by_author/
To look at the longer wavelengths you really need FFT and wavelet analysis of the Holocene data. See Fig 4 at http://climatesense-norpag.blogspot.com
The 1000 year periodicity looks good at 10,000, 9000, 8000 and 7000, then the resonance fades out and comes back in at 2000, 1000 and 0.

RichardLH
March 16, 2014 5:31 pm

davidmhoffer says:
March 16, 2014 at 5:11 pm
“Since temperature doesn’t vary linearly with w/m2, this method is as useless as calculating an average temperature is in the first place.”
Since matter integrates the incoming power over its volume/density/composition and alters its temperature to suit I disagree.

RichardLH
March 16, 2014 5:40 pm

Bernie Hutchins says:
March 16, 2014 at 5:13 pm
“True enough you may be approximating Gaussian. Convolving many rectangles tends Gaussian of course. So? You are then using low-pass region that is rapidly falling from the start instead of one (SG) that could be optimally flat. What is your reason for choosing one over the other?”
S-G alters over the whole of its length with new data. A CTRM will never alter, only extend. CTRM is very simple to do, S-G is a lot more complex.
Mathematically CTRM is much, much purer than a single running mean and only requires two extra stages, with reducing windows, to get near Gaussian. With just three stages digitisation/rounding errors are larger than the error terms left when compared to a true Gaussian response.
I try to avoid placing too much hard decision making on filters other than full kernel ones and use S-G for outline guidance only.

RichardLH
March 16, 2014 5:40 pm

Dr Norman Page says:
March 16, 2014 at 5:20 pm
“RLH Here’s the link .”
Thanks.

davidmhoffer
March 16, 2014 5:47 pm

RichardLH;
Since matter integrates the incoming power over its volume/density/composition and alters its temperature to suit I disagree.
>>>>>>>>>>>>>.
If the temperature of the earth were uniform, you’d be right. But it isn’t. I suggest you read Robert G Brown’s various articles on the absurdity of calculating an average temperature in the first place. It can easily be demonstrated that the earth can be cooling while exhibiting a warming average temperature, and vice versa.
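A toy calculation shows the kind of case being described here. It is a sketch under strong assumptions: two equal-area regions radiating as ideal black bodies, with made-up temperatures. It only shows that the average temperature and the average radiated power can move in opposite directions; it does not settle which quantity should be filtered.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def mean_temperature(temps_k):
    """Arithmetic mean of the regional temperatures (K)."""
    return sum(temps_k) / len(temps_k)

def mean_emission(temps_k):
    """Mean black-body emission (W/m^2) over equal-area regions."""
    return sum(SIGMA * t ** 4 for t in temps_k) / len(temps_k)

before = [250.0, 300.0]   # hypothetical cold and warm regions, in kelvin
after = [260.0, 291.0]    # cold region warms by 10 K, warm region cools by 9 K

print(mean_temperature(before), mean_temperature(after))   # average rises
print(mean_emission(before), mean_emission(after))         # emission falls
```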

March 16, 2014 5:53 pm

Another way to filter or factor out a strong annual signal is to do a running annual difference (jan 1978-jan 1977 etc). This has the advantage of preserving a signal-to-noise ratio and reveals the relative strengths of longer cycles. If you have hourly data, you can do day to day differences. The limitation is the time resolution of the data.
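A running annual difference is easy to sketch. The example below assumes monthly data, so the lag is 12 samples. The function name and the synthetic series are hypothetical.

```python
import numpy as np

def annual_difference(x, lag=12):
    """Difference each value with the value one year (lag samples) earlier."""
    x = np.asarray(x, dtype=float)
    return x[lag:] - x[:-lag]

# Synthetic monthly series (hypothetical): a slow trend plus an annual cycle.
months = np.arange(120)
series = 0.01 * months + 2.0 * np.sin(2 * np.pi * months / 12)
diff = annual_difference(series)
print(diff[:3])   # the annual cycle cancels; what is left is the change per year
```

Note that the output is a rate of change per year rather than a smoothed level, so it is not directly comparable with a running mean of the same series.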

March 16, 2014 6:08 pm

I think your blog is one of the top conservatarian blogs out there and I put you in my links. Keep up the good work. http://the-paste.blogspot.com/ , http://thedailysmug.blogspot.com/

Bernie Hutchins
March 16, 2014 6:11 pm

RichardLH says:
March 16, 2014 at 5:40 pm
Thanks Richard –
(1) Somehow I think you and I may not be talking about the same thing. My idea of a FIR filter (which is standard) is that it is fixed length, and convolves with the signal. A moving average and a SG do this, for example. New data is what enters the window one step at a time. The filter is not “redesigned” with each step. Isn’t this what you do? Is your system LTI? Your spreadsheet doesn’t tell me much. Diagrams and equations are often useful in signal processing of course – not just output graphs.
(2) SG is NOT “a lot more complex.” You didn’t say why the SG flat passband was not something to admire. Neither did you tell me what “whippy” meant. I have never heard that term used.
(3) Since you are smoothing (averaging) you are destroying data. If you tell me you see a periodicity of 60 years, I say – yes – I see it in the data itself. If I can’t see it already, but it comes out of the filter, I would want you to show that it emerges in excess of what would be expected just by resonance. So the larger issue is perhaps that if you want to use smoothing and therefore DESTROY information, are we wrong to ask you to define in detail what your smoothing involves, and to demonstrate why it is better than something better understood?

bones
March 16, 2014 6:12 pm

Lance Wallace says:
March 16, 2014 at 1:42 pm
The CO2 curve (seasonally detrended monthly) can be fit remarkably closely by a quadratic or exponential curve, each with <1% error for 650 or so consecutive months:
http://wattsupwiththat.com/2012/06/02/what-can-we-learn-from-the-mauna-loa-co2-curve-2/
The exponential has a time constant (e-folding time) on the order of 60 years (i.e., doubling time for the anthropogenic additions from the preindustrial level of 260 ppm of about 60*0.69 = 42 years).
——————————————————
Last time I looked, the atmospheric CO2 concentration was increasing at about 5% per decade, which gives it a doubling time of about 140 years.
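The two doubling times quoted above come from two different pieces of arithmetic, sketched below with hypothetical function names. An e-folding time converts to a doubling time by multiplying by ln 2, and a fixed percentage growth per decade doubles after ln 2 / ln(1 + rate) decades.

```python
import math

def doubling_time_from_efold(efold_years):
    """Doubling time (years) for exponential growth with this e-folding time."""
    return efold_years * math.log(2)

def doubling_time_from_growth(rate_per_decade):
    """Doubling time (years) for a fixed fractional growth per decade."""
    return 10 * math.log(2) / math.log(1 + rate_per_decade)

print(doubling_time_from_efold(60))      # about 42 years
print(doubling_time_from_growth(0.05))   # about 142 years
```

As worded, the first figure appears to refer to the excess over the preindustrial level and the second to the total concentration, so the two numbers are not necessarily measuring the same thing.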

RichardLH
March 16, 2014 6:38 pm

Bernie Hutchins says:
March 16, 2014 at 6:11 pm
“Thanks Richard –
(1) Somehow I think you and I may not be talking about the same thing. My idea of a FIR filter (which is standard) is that it is fixed length, and convolves with the signal. A moving average and a SG do this, for example. New data is what enters the window one step at a time. ”
A CTRM is just an extension of the standard moving average. Nothing more. The same rules apply. New data enters, old data leaves. The window is a stacked set which has a nearly Gaussian distribution to its weighting values if you work them all out.
“(2) SG is NOT “a lot more complex.” You didn’t say why the SG flat passband was not something to admire. Neither did you tell me what “whippy” meant. I have never heard that term used.”
If you compare one set with another as time evolves then you will see how the S-G has a tendency for its outer end to move around in an almost animal like manner with new data. That Wiki link has a nice animation which shows the underlying function as it passes up a data set, which displays it quite well. Too many animations of images in the work I have done elsewhere I suppose, it just looks like a caterpillar to me 🙂 Sorry.
http://en.wikipedia.org/wiki/File:Lissage_sg3_anim.gif
“(3) Since you are smoothing (averaging) you are destroying data.”
Wrong. You are assigning stuff to a pass band or a stop band. In fact if you take the output and subtract it from the original data (as in 1-x) you end up with the high pass filter version. Data is never destroyed. High Pass output plus Low Pass output always equals the original data set.
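The subtraction described here can be sketched as below. To keep the low pass output lined up with an actual sample, the example uses an odd-length kernel built from hypothetical window widths of 13, 11 and 9. That choice, the function name and the random test data are all assumptions made for the illustration.

```python
import numpy as np

def split_bands(x, kernel):
    """Split x into low pass and complementary high pass parts.

    Only the interior, where the full kernel fits, is returned.
    """
    x = np.asarray(x, dtype=float)
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd so the output aligns with a sample")
    half = len(kernel) // 2
    low = np.convolve(x, kernel, mode="valid")
    centre = x[half:len(x) - half]   # the original samples the output lines up with
    high = centre - low              # the "1 - x" high pass version
    return centre, low, high

kernel = np.ones(1)
for width in (13, 11, 9):            # hypothetical odd widths
    kernel = np.convolve(kernel, np.ones(width) / width)

rng = np.random.default_rng(0)
data = rng.normal(size=200)
centre, low, high = split_bands(data, kernel)
print(np.allclose(low + high, centre))
```

The two outputs add back to the original only while both are kept; the low pass output on its own does not contain the high pass part.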

RichardLH
March 16, 2014 6:40 pm

davidmhoffer says:
March 16, 2014 at 5:47 pm
Pulsed input is also integrated by matter as well so I will still differ.

RichardLH
March 16, 2014 6:43 pm

fhhaynie says:
March 16, 2014 at 5:53 pm
“Another way to filter or factor out a strong annual signal is to do a running annual difference (jan 1978-jan 1977 etc). This has the advantage of preserving a signal-to-noise ratio and reveals the relative strengths of longer cycles. If you have hourly data, you can do day to day differences. The limitation is the time resolution of the data.”
A single stage filter, either running average or difference, has horrible mathematical errors in its output.
One look at the frequency response will tell you that.
http://climatedatablog.files.wordpress.com/2014/02/fig-1-gaussian-simple-mean-frequency-plots.png
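The leakage visible in that plot can also be estimated numerically. The sketch below compares the worst-case response above the 12 month null for a single 12 month running mean and for a three stage cascade. The stage widths of 12, 10 and 8 months and the function name are assumptions made for the example.

```python
import numpy as np

def magnitude_response(kernel, freqs):
    """|H(f)| of an FIR kernel at the given frequencies (cycles per sample)."""
    n = np.arange(len(kernel))
    return np.abs(np.exp(-2j * np.pi * np.outer(freqs, n)) @ kernel)

single = np.ones(12) / 12
cascade = single
for width in (10, 8):                 # assumed second and third stage widths
    cascade = np.convolve(cascade, np.ones(width) / width)

# From the 12 month null up to the Nyquist frequency of monthly data.
stop_band = np.linspace(1 / 12, 0.5, 2000)
worst_single = magnitude_response(single, stop_band).max()
worst_cascade = magnitude_response(cascade, stop_band).max()
print(worst_single, worst_cascade)
```

Under these assumptions the single running mean lets through more than a fifth of the amplitude at its worst side lobe, while the cascade stays below one percent everywhere in that band.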

A Crooks of Adelaide
March 16, 2014 6:45 pm

RichardLH says:
March 16, 2014 at 4:11 pm
A Crooks of Adelaide says:
March 16, 2014 at 4:02 pm
The point of this is that it is NOT a curve fit of any form. It is a simple low pass treatment of the data. Any ‘cycle’ you see, you need to explain. This just shows what is there.
I’m more interested in what’s left after you take the low pass out. It’s the short term cycles that determine if the monthly “Arghhh its getting hotter!” or “Arrgh its getting colder!” is significant or not.
And I don’t go along with this “if you can’t explain the cycle, it doesn’t exist” story. They were predicting eclipses in the Bronze Age and very useful it proved too. You have to find the cycles first – then you start to think what causes them

RichardLH
March 16, 2014 6:50 pm

Dr Norman Page says:
March 16, 2014 at 5:20 pm
” To look at the longer wavelengths you really need FFT and wavelet analysis of the Holocene data ”
The problem with FFTs and Wavelets is two fold.
1) Large amounts of noise make longer wavelengths very difficult to pick out. You end up with very broad peaks which all COULD be signal.
2) They are both very bad if the signal is made up of ‘half wave’ mixtures. Such as a mixture of 2, 3 and 4 years half waves in some random combination. Or, say, 55, 65, 75 year half waves mixed up to make a 65 year average. Nature, when not operating as a tuned string which is most of the time, has a habit of falling into just such combinations.
The noise is the problem really. In both value and phase. There are no nice clean signals which we can work with.

March 16, 2014 6:52 pm

Off the subject:
Most of the time I just read here, then go elsewhere on the net and seek out those who need to join in and link back here. The first word of this post, “Crowdsourcing”, struck me.
What we are missing is a “Crowdsourceing” person from the warming side to focus more voters to this site and others like this site and or to pull people out of their daily lives and into the struggle.
Back in say 2006 to 2008 old Al Gore was all about and never ever shut his yap.
Seems we need to find a way to get someone as polarizing Al Gore together with others like him to once more and get them all back into the fray and go all bombastic and over the top with wild claims. Think up a way to gig his or some other ones of their over sized egos and get them in the press and then use that to pull in more people of reason to the battle ground.
Not that these types will ever go away but we need to get this lie based fraud redistribution of wealth pulled way down to earth.

RichardLH
March 16, 2014 6:53 pm

A Crooks of Adelaide says:
March 16, 2014 at 6:45 pm
“You have to find the cycles first – then you start to think what causes them”
Well I suspect that we have the Daily, Monthly and Yearly cycles pinned down quite well now. It’s the natural cycles longer than 15 years that I am interested in. Nothing much except something big at ~60 years between 15 and 75 years as far as I can tell.

Bart
March 16, 2014 7:00 pm

RichardLH says:
March 16, 2014 at 1:58 pm
“It is useful to compare how the CO2 curve matches to the residual after you have removed the ~60 ‘cycle’ and there it does match quite well but with one big problem, you need to find something else before 1850 to make it all work out right.”
There’s a bigger problem. It doesn’t really match well at all. The match is between the integral of temperature and CO2.
And, that’s GISS. The match with the satellite record is even better.
CO2 is not driving temperatures to any level of significance at all. It is, instead, itself accumulating due to a natural process which is modulated by temperatures. Once you remove the trend and the ~60 year periodicity, which have been around since at least 1880, well before CO2 is believed to have increased significantly, there is very little left of temperature to be driven.

March 16, 2014 7:03 pm

Richard. They all use different stations

RichardLH
March 16, 2014 7:15 pm

Steven Mosher says:
March 16, 2014 at 7:03 pm
“Richard. They all use different stations”
As I said, they are all drawn from the same set of thermometers. The selection differs, set to set. When looking at the oldest ones at greater than 150 years or so, the choices available are fairly restricted though.
I have comparisons between the various offerings. They do differ and in some surprising ways. Lots of work still to do 🙂

RichardLH
March 16, 2014 7:21 pm

Bart says:
March 16, 2014 at 7:00 pm
I was pointing out that whilst there may be a superficial match, it fails when comparing to earlier data so therefore may be suspect on those grounds alone.
In any case, unless the temperature starts rising quite quickly fairly soon, the CO2 case gets weaker by the month.

Bernie Hutchins
March 16, 2014 7:22 pm

RichardLH said in part March 16, 2014 at 6:38 pm
“Too many animations of images in the work I have done elsewhere I suppose, it just looks like a caterpillar to me 🙂 Sorry.
http://en.wikipedia.org/wiki/File:Lissage_sg3_anim.gif
Richard – Thanks yet again. That helps. Glad we are LTI.
But you are completely wrong (misled by that animation I think) about whipping. None of those yellow polynomials in the animation you posted whipping about are ever even computed. Instead the SG fit is a FIXED FIR filter. Now, to be honest, the first time I learned this, the instructor’s words were exactly: “Astoundingly, this is an FIR filter.” I WAS astounded – still am. No whipping.
[ Aside: In general, polynomials as models of signals are going to be unsuited. Signals are mainly horizontal, while polynomials are vertically in a hurry to get to + or – infinity. Your whipping. ]
All you need do is populate a non-square matrix with integers (like -5 to +5 for length 11), raise these integers to successive powers for the rows below, take the least-squares (pseudo) inverse for the non-square matrix, and use the bottom row as the impulse response. (It is basically “sinc like”.) That’s the nuts and bolts – no polynomial fitting, and always the same answer for a given length and order. See my previous app note link. Yea – magic!
So it is no more complicated (once rather easily computed) than Gaussian. And it’s a much better filter.
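A minimal sketch in R of the recipe just described, for anyone who wants to check it. The function name `sg_weights` and the matrix layout are my own assumptions: here each sample offset is a row and each power is a column, so the smoothing weights come out as the first row of the pseudo-inverse rather than the bottom row; which row you read depends only on how the powers are ordered.

```r
# Savitzky-Golay smoothing weights from a least-squares pseudo-inverse.
# Layout assumption (hypothetical helper, not Bernie's SG.m): one ROW per
# sample offset, one COLUMN per power of that offset.
sg_weights <- function(half_width, order) {
  offsets <- -half_width:half_width          # e.g. -5..5 for length 11
  A <- outer(offsets, 0:order, "^")          # columns: offsets^0, offsets^1, ...
  pinv <- solve(t(A) %*% A) %*% t(A)         # least-squares (pseudo) inverse
  as.numeric(pinv[1, ])                      # row belonging to the constant term
}

w7 <- sg_weights(3, 3)    # order 3, length 7
print(round(w7 * 21, 6))  # weights times 21: -2 3 6 7 6 3 -2
```

The weights depend only on length and order, never on the data, which is the sense in which the fit is a fixed FIR filter.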

Matthew R Marler
March 16, 2014 7:23 pm

Does anybody know what the result will look like when the “correct” smoothing method is used?
Is this better than piecewise polynomial regression or smoothing by projection on b-splines with knots estimated from the data (Bayesian adaptive regression splines as used by R. E. Kass with neuronal data analysis)?
It might be a useful pedagogical exercise to have a booklet of graphs, with each of 4 or more temperature series smoothed by many different methods, not excluding polynomials and trig polynomials (with fixed and estimated periods) of up to order 10..

davidmhoffer
March 16, 2014 7:34 pm

RichardLH
Pulsed input is also integrated by matter as well so I will still differ.
>>>>>>>>>>>>>>>.
It most certainly is not. The tropics are net absorbers of energy, the arctic regions are net emitters. Massive amounts of energy are moved from tropics to arctic regions by means of air and ocean currents, and these are not integrated by matter as they would be if the sole mechanism was conductance. You’ve got an interesting way of looking at the data, just apply it to data that is meaningful.

Stephen Rasey
March 16, 2014 7:45 pm

A beautiful example of frequency content that I expect to be found in millennial scale uncut temperature records is found in Lui-2011 Fig. 2. In China there are no hockey sticks. The grey area on the left of the Fig. 2 chart is the area of low frequency, the climate signal. In the Lui study, a lot of the power is in that grey area.

Lui-2011 Fig.2 Power Spectrum shows large power density at frequencies below 0.05/yr (i.e. cycle times >= 20 years). They note significant power spikes at 1324, 800, 199, 110 Years (and by visual estimate, ~70 and 27 yrs, too).

Copied from a post at ClimateAudit Oct. 31, 2011 Best Menne Slices
Lui sees a minor 60 (or 70) year cycle. But it seems less significant than the others.

Bernie Hutchins
March 16, 2014 7:46 pm

Matthew R Marler said in part: March 16, 2014 at 7:23 pm
“Does anybody know what the result will look like when the “correct” smoothing method is used?”
No. Nobody knows! Unless you have a physical argument or other evidence, curve fitting is near useless. And the physics – let’s call it “incomplete”. Again, data fit to a curve is not THE data. Curve fitting destroys information. You can always get the proposed curve (again), but you can’t go back.
Many years ago I did a physics lab experiment and plotted data on every possible type of graph paper I could find in the campus store. My professor (Herbert Mahr) was kind enough to compliment my artistic efforts, but then assured me that none of it necessarily MEANT ANYTHING. Indeed.

pat
March 16, 2014 8:06 pm

meanwhile, the MSM/CAGW crowd are sourcing this today:
16 March: Phys.org: Greenland implicated further in sea-level rise
An international team of scientists has discovered that the last remaining stable portion of the Greenland ice sheet is stable no more.
The finding, which will likely boost estimates of expected global sea level rise in the future, appears in the March 16 issue of the journal Nature Climate Change…
“Northeast Greenland is very cold. It used to be considered the last stable part of the Greenland ice sheet,” explained GNET lead investigator Michael Bevis of The Ohio State University. “This study shows that ice loss in the northeast is now accelerating. So, now it seems that all of the margins of the Greenland ice sheet are unstable.”
Historically, Zachariae drained slowly, since it had to fight its way through a bay choked with floating ice debris. Now that the ice is retreating, the ice barrier in the bay is reduced, allowing the glacier to speed up—and draw down the ice mass from the entire basin…
http://phys.org/news/2014-03-greenland-implicated-sea-level.html

John West
March 16, 2014 8:17 pm

Sorry guys but this is absolutely worthless. There’s no data-set worth an iota pre-satellite era and post-satellite era lacks enough data to be useful. Even if there were a data-set that was accurate back to circa 1850 that’s still an incredibly short snippet of time wrt Milankovitch Cycle lengths and even the supposed Bond Cycle lengths.

wbrozek
March 16, 2014 8:33 pm

John West says:
March 16, 2014 at 8:17 pm
Sorry guys but this is absolutely worthless.
In that case, how would you prove that the warming that is occurring now is not catastrophic?
NOAA and Santer suggested straight lines should be used. Perhaps curves should be used, but it may be harder to define when models are useless in terms of sine waves.

Truthseeker
March 16, 2014 8:42 pm

jai mitchell says:
March 16, 2014 at 2:39 pm
. . .hilarious. . .
—————————————————-
You quote skepticalscience, and you laugh at other people?
Now that is hilarious.

DocMartyn
March 16, 2014 8:50 pm

Richard, does the fact that months are of unequal length screw the plot?

Lance Wallace
March 16, 2014 9:17 pm

bones says:
March 16, 2014 at 6:12 pm
“Last time I looked, the atmospheric CO2 concentration was increasing at about 5% per decade, which gives it a doubling time of about 140 years.”
Bones, recall we are looking at the doubling time of the anthropogenic contribution. One must subtract the constant value of preindustrial times, as I mentioned. The best-fit exponential says that value is about 256.7 ppm. It has been more than 42 years since the first Mauna Loa measurements in March of 1958 so we should see the doubling occurring around the year 2000. In fact that first observation in 1958 was about 314.4 ppm–subtracting 256.7 we get 57.7 ppm added to the preindustrial. When that hits 115.4 we have our doubling. That happened in December 2001, when the observed value was 372.2 ppm. (subtracting 115.4 gets us back to the preindustrial level.) Ergo, observed doubling time is 43 years, pretty close to the calculated 42.
If you want to know the time until the preindustrial concentration is doubled, extending this present best-fit exponential into the future shows it crossing a value of 512 ppm CO2 in 2050. Some say the preindustrial concentration was 280 ppm, so the doubling to 560 ppm occurs in 2061. Note this is not a prediction, just a report on what the present best-fit 3-parameter exponential through Sept 2013 shows. On the other hand, considering the track record of this fit (<1% error for some 650 consecutive months), I wouldn't bet against it.
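For anyone who wants to check the arithmetic in the two doubling calculations, here is a minimal sketch in R. It uses only the figures quoted above (5% per decade, 256.7 ppm baseline, 314.4 ppm in March 1958, December 2001); the function and variable names are mine, and nothing here refits the exponential.

```r
# Doubling time of a quantity growing at a fixed fraction per decade.
doubling_time <- function(growth_per_decade) {
  10 * log(2) / log(1 + growth_per_decade)
}
print(doubling_time(0.05))          # whole-concentration doubling, roughly 140 years

# Doubling of the excess above the fitted preindustrial baseline.
baseline  <- 256.7                  # ppm, quoted best-fit constant
first_obs <- 314.4                  # ppm, March 1958
excess    <- first_obs - baseline   # ppm above baseline in 1958
doubled_level <- baseline + 2 * excess
print(doubled_level)                # level at which the excess has doubled

# Elapsed time from March 1958 to December 2001, in years.
elapsed <- (2001 + 11 / 12) - (1958 + 2 / 12)
print(elapsed)
```

The two doubling times differ so much because one doubles the whole concentration and the other doubles only the part above the fitted baseline.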

Richard
March 16, 2014 9:18 pm

Hello Richard,
I have had an opportunity to look at your spreadsheet, btw, thanks for sharing it. I have some observations and questions about the methods used.
1. You use three non causal cascaded FIR filters, with 8, 10, and 12 taps respectively.
2. These are just plain averaged filters without any weighting.
3. While doing this will provide you a pseudo Gaussian response, why not use some weighting parameters for a Gaussian response?
4. By using even number taps you are weighting the averages to one side or the other in the time series, and this will affect your phase relationship.
5. Were any windowing corrections considered?
6. Were there any Fourier analysis performed on the data before and after filtering?
I use filters on data all of the time in my work, and am always concerned when someone does filtering and fails to mention Fco, Order, and Filtering Function (i.e. Bessel, Butterworth, Gaussian, etc).
Thanks again,
Richard

FrankK
March 16, 2014 9:55 pm

jai mitchell says:
March 16, 2014 at 2:39 pm
“Since this constant addition of CO2 is not causing any warming it follows that the theory of enhanced greenhouse warming is defective. It does not work and should be discarded.”
. . .hilarious. . .
——————————————————————————————————–
Even more hilarious is if you do the calcs for the volume of ocean down to the depth quoted then the graph you quote represents only a few hundredths of 1 degree Centigrade rise in temperature. i.e. 2/5ths of bugger-all. And there is no evidence that the CO2 bogey is responsible.

Bernie Hutchins
March 16, 2014 9:56 pm

RichardLH
Regarding our Savitzky-Golay Discussion: Here is a less cryptic but just single page outline of the SG.m program (first 20 trivial lines at most!) , offered with the Matlab pinv function, and without it, making it possible with any math program that will invert an ordinary square matrix. The example is order three (order 3 polynomial), length seven (7 points moving through smoother). The URL of the app note is on the jpg and in the thread above.
http://electronotes.netfirms.com/SGExample.jpg
Bernie

March 16, 2014 10:02 pm

In examining share (stock) prices I use a multiple moving average tool (in the Metastock program). I have given the short term moving averages (less than 30 days) a different color to the long term moving averages (30 to 60 days). When the long term and short term cross, it is a signal of a change which can be used as a buy or sell (with other signals). Have you considered other time periods for your filter and putting them on the same graph? Maybe that will point to the unwarranted adjustments made to data or a significant natural shift which many say occurred around 1975 (at least in the Pacific & nothing to do with CO2).
Talking of CO2, there were measurements going back to the early 1800’s, see here http://www.biomind.de/realCO2/realCO2-1.htm . There may be questions about accuracy but results show variations and throw doubt on the ASSUMED constant CO2 levels prior to 1966.

Jeef
March 16, 2014 10:02 pm

On the Y axis: CO2 concentration
On the X axis: temperature over time
X and Y are not necessarily connected. Unless funding depends on a shallow statement in your research paper conclusion to allow this.

Bernie Hutchins
March 16, 2014 10:27 pm

wbrozek says in part March 16, 2014 at 8:33 pm
“In that case, how would you prove that the warming that is occurring now is not catastrophic?”
You ask a good question and since no one else is apparently still up, I will give it a shot.
In physics, particularly with regard to chaotic non-linear systems (like the climate), but still constrained, there is really no such thing as “proof”. There is instead evidence. The various so-called “global temperature” series, although surely imperfect, are evidence (not proof) that the catastrophic “hockey stick” warming is likely wrong.
The closest thing to establishing the truth is almost certainly the 2nd Law of Thermodynamics. Essentially heat moves from where there is more to where there is less. If the path of flow is not obviously in place, Nature is OBLIGED to provide it. In consequence, certain negative feedback thermostatting mechanisms are mandatory. No engineer or HVAC technician required. The great physicist (is there any other kind of physicist) Eddington famously and amusingly said, often quoted here I believe:
“If someone points out to you that your pet theory of the universe is in disagreement with Maxwell’s equations—then so much the worse for Maxwell’s equations. If it is found to be contradicted by observation—well these experimentalists do bungle things sometimes. But if your theory is found to be against the second law of thermodynamics I can give you no hope; there is nothing for it but to collapse in deepest humiliation.”
Any folks we know!

Editor
March 17, 2014 12:22 am

First, Richard, thanks for your work. Also, kudos for the R code, helped immensely.
My first question regarding the filter is … why a new filter? What defect in the existing filters are you trying to solve?
Now, be aware I have no problem with seeking better methods. For example, I use my own method for dealing with the end points problem, as I discussed here.
However, you say:

A 12 month/365 day CTRM filter completely removes the annual ‘cycle’, as the CTRM is a near Gaussian low pass filter. In fact it is slightly better than Gaussian in that it completely removes the 12 month ‘cycle’ whereas true Gaussian leaves a small residual of that still in the data.

Mmmm … if that’s the only advantage, I’d be hesitant. I haven’t run the numbers but it sounds like for all practical purposes they would be about identical if you choose the width of the gaussian filter to match … hang on … OK, here’s a look at your filter versus a gaussian filter:

As you can see, the two are so similar that you cannot even see your filter underneath the gaussian filter … so I repeat my question. Why do we need a new filter that is indistinguishable from a gaussian filter?
Next, you go on to show the following graphic and comment:

Fig 3: HadCRUT4 with additional greater than 15 year low pass. Greater than 75 year low pass filter included to remove the red trace discovered by the first pass.
Now, when reviewing the plot above some have claimed that this is a curve fitting or a ‘cycle mania’ exercise. However, the data hasn’t been fit to anything, I just applied a filter. Then out pops some wriggle in that plot which the data draws all on its own at around ~60 years. It’s the data what done it – not me! If you see any ‘cycle’ in graph, then that’s your perception. What you can’t do is say the wriggle is not there. That’s what the DATA says is there.

There is indeed a “wiggle” in the data, which incidentally is a great word to describe the curve. It is a grave mistake, however, to assume or assert that said wiggle has a frequency or a cycle length or a phase. Let me show you why, using your data:

The blue dashed vertical lines show the troughs of the wiggles. The red dashed vertical lines show the peaks of the wiggles.
As tempting as it may be to read a “cycle” into it, there is no “~ 60 year cycle”. It’s just a wiggle. Look at the variation in the lengths of the rising parts of the wiggle—18 years, 40 years, and 41 years. The same is true of the falling parts of the wiggle. They are 29 years in one case and 19 years in the other. Nothing even resembling regular.
The problem with nature is that you’ll have what looks like a regular cycle … but then at some point, it fades out and is replaced by some other cycle.
To sum up, you are correct that “what you can’t do is say the wriggle is not there”. It is there.
However it is not a cycle. It is a wiggle, from which we can conclude … well … nothing, particularly about the future.
Best regards,
w.

Greg Goodman
March 17, 2014 12:34 am

Bart : http://s1136.photobucket.com/user/Bartemis/media/CO2GISS.jpg.html?sort=3&o=0
That is very interesting. The short term (decadal) correlation of d/dt(CO2) to SST has been discussed quite a bit and is irrefutable. The ice core records seem to show direct CO2 vs temp correlation. So the question is where in between does it change? There will be a sliding mix as the response changes from one extreme to the other.
If earlier data diverges it may be the first indication of the centennial relationship or equally likely an indication of spurious data adjustments. That would be worth investigating.
Do you have a similar graph that goes back to say 1850?

Greg Goodman
March 17, 2014 12:48 am

Willis: “As you can see, the two are so similar that you cannot even see your filter underneath the gaussian filter … so I repeat my question. Why do we need a new filter that is indistinguishable from a gaussian filter?”
It is not indistinguishable; as Richard correctly says, it totally removes an annual signal. You attempt to show this is “indistinguishable” by using them to filter temperature “anomaly” data that has already had most of the annual signal removed.
Now do the same with actual temps and you will see the difference. The advantage of having a filter that can fully remove the huge annual cycle is that you don’t need to mess about with “anomaly” data which themselves leak a, possibly inverted, 12mo signal as soon as the annual cycle changes from that of the reference period.
The one defect I see with triple RM filters is that they are a three pole filter and thus start to attenuate the signal you want to keep ( gaussian being similar has the same defect).
SG has a nice flat pass band, as does the lanczos filter that I also provided code and graphs for in my article.
Perhaps Richard could convert the lanczos into R code too 😉
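A rough way to put numbers on the difference under discussion is to compare the gain of each filter at a 12 month period. The sketch below is in R and rests on assumptions of mine: the CTRM widths are taken as 12, 10 and 8 months, the Gaussian is given the same standard deviation as the combined CTRM kernel and truncated at about four sigma, and the helper names are hypothetical. A different choice of Gaussian width would give a different residual.

```r
# Gain of an n-point running mean at frequency f (cycles per month).
box_gain <- function(n, f) Mod(sum(exp(-2i * pi * f * (0:(n - 1))))) / n

# A cascade multiplies the gains of its stages.
ctrm_gain <- function(f, widths = c(12, 10, 8)) {
  prod(sapply(widths, box_gain, f = f))
}

# Gain of a truncated, normalised Gaussian kernel with the given sigma.
gauss_gain <- function(f, sigma, half_width = ceiling(4 * sigma)) {
  n <- -half_width:half_width
  w <- exp(-n^2 / (2 * sigma^2))
  w <- w / sum(w)
  Mod(sum(w * exp(-2i * pi * f * n)))
}

# Assumption: match the Gaussian sigma to the CTRM kernel's standard deviation
# (variance of an n-point box is (n^2 - 1) / 12, and variances add in a cascade).
sigma <- sqrt(sum(c(12, 10, 8)^2 - 1) / 12)

print(c(ctrm = ctrm_gain(1 / 12), gauss = gauss_gain(1 / 12, sigma)))  # annual
print(c(ctrm = ctrm_gain(1 / 60), gauss = gauss_gain(1 / 60, sigma)))  # 5 year
```

Under these assumptions the two gains are nearly the same at a 5 year period, while at 12 months the CTRM gain is zero to rounding error and the Gaussian passes a few percent. That would be hard to see on anomaly data and easier to see on data that still carries a large annual cycle.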

Greg Goodman
March 17, 2014 12:57 am

Here is an example of the lanczos filter used on daily satellite TLS data, to remove annual signal.
http://climategrog.wordpress.com/?attachment_id=750

Greg Goodman
March 17, 2014 1:10 am

Comparing Lanczos to two (very similar) variations of the triple running mean : http://climategrog.wordpress.com/?attachment_id=659

Greg Goodman
March 17, 2014 1:25 am

Willis: “As tempting as it may be to read a “cycle” into it, there is no “~ 60 year cycle”. It’s just a wiggle. ”
There is a long term change in there too Willis. Cooling at end of 19th c , warming since beginning of 20th. When you add a slope to a pure cosine you will find that it shifts the peaks and troughs, when you have differing slopes behind it they will get moved back and forth. That is the cause of the irregularity of the intervals you have noted.
Once again, you display your lack of understanding and over-confidence in your own abilities, then start telling others the way it is. You, like our host, seem to have adopted “the science is settled” attitude to cycles. Just remember that science is never settled and keep an open mind.
It is true that there is no guarantee that this will continue but so far it seems to fit quite well to the “pause” now becoming a down turn since 2005.
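The point about a slope shifting the peaks and troughs of a cosine can be worked out directly. The sketch below, in R, is an idealised illustration only: it assumes a single pure cosine plus one constant slope, the function name `rise_fall` is hypothetical, and it makes no claim about what actually produces the intervals seen in HadCRUT4.

```r
# Rise and fall durations of amplitude * cos(2*pi*t/period) + slope * t.
# Turning points satisfy sin(2*pi*t/period) = slope * period / (2*pi*amplitude).
rise_fall <- function(period, slope, amplitude = 1) {
  a <- slope * period / (2 * pi * amplitude)
  stopifnot(abs(a) < 1)          # a steeper slope leaves no turning points at all
  shift <- asin(a)               # phase by which each peak and trough moves
  c(rise = period * (pi + 2 * shift) / (2 * pi),
    fall = period * (pi - 2 * shift) / (2 * pi))
}

print(rise_fall(60, 0))          # no slope: 30 and 30
print(rise_fall(60, pi / 60))    # upward slope: 40 and 20
print(rise_fall(60, -pi / 60))   # downward slope: 20 and 40
```

With a constant slope the rise and fall always add up to the period, so in this idealised case unequal full cycles would need the slope itself to change over time.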

Greg Goodman
March 17, 2014 1:48 am

Nice article Richard.
One criticism is the 75y RM. Having roundly accepted how bad and distorting these damn things are you still put one in there. And what it shows is not that much use even if we could believe it.
In general, when the kernel length appears to be a problem you will get a “smoother”, non distorted result with 3RM of half the period. That will not help remove a 60 year periodicity.
BTW, I think this ‘cycle’ is closer to a full-wave rectified cosine : abs(cos(x)) than a fully harmonic oscillation. Heavy filtering tends to round things off a bit two much and everything ends up looking like cosines.
This seems to fit arctic ice, though it’s way too soon to see how this up tick will go.
http://climategrog.wordpress.com/?attachment_id=783
Similar abs_cos form in AMO and cyclone ACE:
http://climategrog.wordpress.com/?attachment_id=215
My money is on circa 128y abs_cos rather than 60y full cosine. Fourier techniques will tend to mask this too unless the analyst looks carefully at the presence of harmonics.
Great article though. The traffic on WUWT should be helpful in getting greater awareness of the defects and distortions of runny means.

Greg Goodman
March 17, 2014 1:49 am

two much = too much 😉

Kelvin vaughan
March 17, 2014 1:59 am

Filter it enough and you will end up with a straight line showing a slow rise in temperature.

george e. smith
March 17, 2014 2:05 am

Filters throw away information.
The raw data, is the most information you can ever have.
Filters simply delude one into believing that something else is happening; other than what the measured sampled data values tell.

Greg Goodman
March 17, 2014 2:16 am

“Filters throw away information.”
No, filters separate information. What you “throw” away is your choice.
“Filters simply delude one into believing that…”
Delusion is in the eye of the beholder. That some may draw wrong conclusions or make inappropriate analyses is not a reason never to analyse data.
This whole “the data IS the data” meme is totally ignorant. You are not going to learn much by staring at a noisy dataset with a massive annual signal except that the data is noisy and has a strong annual cycle. If that is all you feel competent to do then by all means stay where you feel comfortable. That does not mean that no one ever got any useful information and that centuries of expertise in data processing are nothing but delusional.
Get real.

Greg Goodman
March 17, 2014 2:24 am

Bernie Hutchins. “… the instructor’s words were exactly: “Astoundingly, this is an FIR filter.” I WAS astounded ”
So am I.
I’ve always been dubious of LOESS filters because they have a frequency response that changes with the data content. I’ll have to have a closer look at SG.
It does have a rather lumpy stop band though. I tend to prefer Lanczos for a filter with a flat pass band ( the major defect of gaussian and the triple RM types ).

Orson
March 17, 2014 2:38 am

The back-and-forth comments from our more mathematically learned friends is quite excellent for us noobs. All too often, technical analysis goes under the radar of ordinary minds. Thanks, Anthony, for giving us a chance to see some ‘worksheets’ being used…up?

March 17, 2014 2:45 am

I am very grateful to RichardLH for his courteous answers to my questions.
But I should like more detail on why the particular function he has chosen is any better at telling us what is happening in the data than the linear trends used by the IPCC etc. No trend on stochastic data has any predictive value. And we already know from the linear trend on global temperature that the rate of warming is very considerably below what was predicted – indeed, that there has been no global warming to speak of for about a quarter of a century. So, what does your curve tell us about global mean surface temperature that we do not already know?

Editor
March 17, 2014 2:52 am

Greg Goodman says:
March 17, 2014 at 12:48 am

Willis:

“As you can see, the two are so similar that you cannot even see your filter underneath the gaussian filter … so I repeat my question. Why do we need a new filter that is indistinguishable from a gaussian filter?”

It is not indistinguishable; as Richard correctly says, it totally removes an annual signal. You attempt to show this is “indistinguishable” by using them to filter temperature “anomaly” data that has already had most of the annual signal removed.
Now do the same with actual temps and you will see the difference. The advantage of having a filter that can fully remove the huge annual cycle is that you don’t need to mess about with “anomaly” data which themselves leak a, possibly inverted, 12mo signal as soon as the annual cycle changes from that of the reference period.

Not true in the slightest, Greg. You really should try some examples before uncapping your electronic pen.

Once again, the CTRM filter underneath the gaussian filter is scarcely visible … so I repeat my question. Why do we need a new filter that is indistinguishable from a gaussian filter?
w.

cd
March 17, 2014 3:04 am

Richard or others
Nice concise post. I have three questions – sorry.
1) Are the cascade filters based on recursive operations using some basis functions? If yes, then is this not akin to curve fitting? I’ve probably misunderstood but I think you’ve taken issue with this before.
2) In the field I work in we tend to use the Butterworth filter extensively; but in light of the recent post on “signal stationarity” in WUWT, does such an approach seem inappropriate if the composed data series is the sum of many non-stationary signals. I suspect that there will be major issues with significant localised phase shifts (something I understand to be a problem with the Butterworth filter even with “perfect” data sets).
3) Finally, for the purposes of quant operations, is the filtered data any use beyond signal analysis? For example, even if one uses the best filtering method, does the resulting processed series not take with it significant bias: when one tries to statistically quantify something like correlation between two filtered data sets. This seems a little open ended I know, but you see this type of analysis all the time.

Editor
March 17, 2014 3:12 am

Greg Goodman says:
March 17, 2014 at 1:25 am

Willis: “As tempting as it may be to read a “cycle” into it, there is no “~ 60 year cycle”. It’s just a wiggle. ”
There is a long term change in there too Willis. Cooling at end of 19th c , warming since beginning of 20th. When you add a slope to a pure cosine you will find that it shifts the peaks and troughs, when you have differing slopes behind it they will get moved back and forth. That is the cause of the irregularity of the intervals you have noted.

And you know that this “is the cause of the irregularity” exactly how?

Once again, you display your lack of understanding and over-confidence in your own abilities, then start telling others the way it is.

Me? I’m the one saying it’s premature to claim cycles with periods in the temperature record. You’re the one trying to tell us the way it is. You’re the one claiming that there are regular cycles in the temperature data. I’m the one saying we don’t know the way it is, and until we do it’s foolish to ascribe it to some mythical “approximately sixty year” cycles …
w.
PS—A fully worked out example, showing the “differing slopes” and how they change a pure cosine wave into the shape shown by the HadCRUT4 data would advance your cause immensely … remember, according to you, you can do it using only a single pure cosine wave plus “differing slopes behind it”. Note that you need to have upswings of 18 years, 40 years, and 41 years. You also need to have downswings of 29 years in one case and 19 years in the other.
I say you can’t do that with only the things you described, a single pure cosine wave plus differing slopes. I say you’re just waving your hands and making things up in an unsuccessful attempt to show that I’m wrong.
But hey, I’m willing to be convinced … break out the pure cosine wave and the differing slopes and show us how it’s done!

steveta_uk
March 17, 2014 3:12 am

Richard, can you design a filter to remove comments from the people who don’t believe the data in the first place?
Why they bother commenting is a mystery to me ;(

RichardLH
March 17, 2014 3:32 am

davidmhoffer says:
March 16, 2014 at 7:34 pm
“RichardLH
Pulsed input is also integrated by matter as well so I will still differ.
>>>>>>>>>>>>>>>.
It most certainly is not.”
It most certainly is. Both inputs and outputs are. That was your original point and I still dispute your claim.
“The tropics are net absorbers of energy, the arctic regions are net emitters. Massive amounts of energy are moved from tropics to arctic regions by means of air and ocean currents, and these are not integrated by matter as they would be if the sole mechanism was conductance.”
Mass transport, phase change and the like are different beasts. They have their own rules but none of them alter what I said.
“You’ve got an interesting way of looking at the data, just apply it to data that is meaningful.”
Thank you, I do.

D. Cohen
March 17, 2014 3:33 am

What happens when you apply the same filters to the sunspot data over the last 150 or so years?

RichardLH
March 17, 2014 3:42 am

Stephen Rasey says:
March 16, 2014 at 7:45 pm
“A beautiful example of frequency content that I expect to be found in millennial scale uncut temperature records is found in Lui-2011 Fig. 2.”
Ah – the power spectrum graphs.
“Lui sees a minor 60 (or 70) year cycle. But it seems less significant than the others.”
The problem with all these sort of studies is the same as with FFTs and Wavelets (see above). Great if you have noise free data. The more the noise you have, the less the usefulness.
Also that point about ‘half wave’ mixing applies. Nature rarely does things with full sine waves (unless it is a tuned string or the like). Most stuff is a lot more complex and very, very noisy/erratic.
Stuff comes and goes into that noise and makes it very difficult to see reliably. Proxy data is usually worse as each one adds its own variety of noise to pollute the answer.
So with a long proxy record there are some with the resolution required to see a ~60 year record. Shen is one ftp://ftp.ncdc.noaa.gov/pub/data/paleo/historical/pacific/pdo-shen2006.txt which is a rainfall re-construction of the PDO. http://climatedatablog.wordpress.com/pdo/
I am always looking for others with the required resolution. That is one of the points of this thread.

RichardLH
March 17, 2014 3:45 am

Bernie Hutchins says:
March 16, 2014 at 7:46 pm
“Unless you have a physical argument or other evidence, curve fitting is near useless. And the physics – let’s call it “incomplete”. Again, data fit to a curve is not THE data. ”
But this is most definitely NOT a curve fitting exercise! Indeed this is the exact opposite. If the curve the data draws by low pass filtering does not match your theory then your theory is wrong. This helps (I hope) in that endeavour.

RichardLH
March 17, 2014 3:47 am

John West says:
March 16, 2014 at 8:17 pm
“Sorry guys but this is absolutely worthless. ”
IYHO presumably. This does allow quite a detailed look at the temperature series that are available. It allows for meaningful comparisons between those series.

RichardLH
March 17, 2014 3:50 am

DocMartyn says:
March 16, 2014 at 8:50 pm
“Richard, does the fact that months are of unequal length screw the plot?”
True I adjust them all to a 1/12 spacing (these are really tiny bar graphs with a width of the sample rate) so there is some jitter associated with that. I don’t think it will affect the outcome in any detectable way.

RichardLH
March 17, 2014 3:57 am

Richard says:
March 16, 2014 at 9:18 pm
“I have had an opportunity to look at your spreadsheet, btw, thanks for sharing it.”
Thank you.
“I have some observations and questions about the methods used.
1. You use three non causal cascaded FIR filters, with 8, 10, and 12 taps respectively.”
Correct
” 2. These are just plain averaged filters without any weighting.”
Correct
” 3. While doing this will provide you a pseudo Gaussian response, why not use some weighting parameters for a Gaussian response?”
Because creating a function that provides a true Gaussian Kernel and then iterating that kernel over the input doesn’t provide any more accuracy. Occam’s Razor really.
“4. By using even number taps you are weighting the averages to one side or the other in the time series, and this will affect your phase relationship.”
True but for 12 months there is no choice. The spreadsheet does take account of this as the middle column is one row higher to help reduce this. In the end it is only a small phase shift which you can correct for if you believe it is required.
” 5. Were any windowing corrections considered?”
The 1.2067 inter-stage multiplier/divider corrects all/most of the square wave sampling errors if that is what you are asking.
” 6. Were there any Fourier analysis performed on the data before and after filtering?”
No.
“I use filters on data all of the time in my work, and am always concerned when someone does filtering and fails to mention Fco, Order, and Filtering Function (i.e. Bessel, Butterworth, Gaussian, etc).”
As I said, a CTRM is a near (very, very near) Gaussian function.
For frequency responses see Greg’s work
http://climatedatablog.files.wordpress.com/2014/02/fig-2-low-pass-gaussian-ctrm-compare.png
http://climatedatablog.files.wordpress.com/2014/02/fig-1-gaussian-simple-mean-frequency-plots.png
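A minimal R sketch of the cascade discussed above, for readers who prefer code to a spreadsheet. The function names `running_mean` and `ctrm` and the synthetic series are assumptions made for illustration; only the 12/10/8 widths and the 1.2067 divider come from the discussion. It is a sketch under those assumptions, not the spreadsheet's exact implementation.

```r
# Centred running mean of width n (plain, unweighted average).
# stats::filter returns NA wherever the window is incomplete.
running_mean <- function(x, n) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 2))
}

# Cascaded triple running mean: three passes whose widths shrink by the
# inter-stage divider (1.2067 in the discussion above).
ctrm <- function(x, period = 12, divider = 1.2067) {
  widths <- round(period / divider^(0:2))   # 12, 10, 8 when period = 12
  for (n in widths) x <- running_mean(x, n)
  x
}

# Synthetic monthly series (illustrative only): slow trend plus a pure annual cycle.
months <- 0:239
series <- 0.01 * months + sin(2 * pi * months / 12)
smoothed <- ctrm(series)

# With even widths, stats::filter places the extra tap forward in time, so each
# pass is offset by half a sample; three passes give 1.5 samples in total.
valid <- !is.na(smoothed)
residual <- smoothed[valid] - 0.01 * (months[valid] + 1.5)
max(abs(residual))   # the annual cycle is gone; only rounding noise remains
```

The half-sample offset per even-width pass is the small phase shift mentioned in the answer to point 4; how it is compensated (for example by shifting one stage by a row) is a design choice left open here.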

RichardLH
March 17, 2014 4:00 am

Bernie Hutchins says:
March 16, 2014 at 9:56 pm
“Regarding our Savitzky-Golay Discussion: Here is a less cryptic but just single page outline of the SG.m program (first 20 trivial lines at most!) ”
Well, R already contains a Savitzky-Golay filter, which I use in a multi-pass form:

# “I ran a 5 pass-multipass with second order polynomials on 15 year data windows as per the Savitzky-Golay method.” Nate Drake PhD
SavitzkyGolay <- function(data, period = 12, passes = 5)
{
  # Window length must be odd; polynomial order is left at the sgolayfilt default (p = 3)
  f1 <- period * 2 + 1
  smoothed <- data
  # Apply the same Savitzky-Golay smoother repeatedly (multi-pass)
  for (i in seq_len(passes)) {
    smoothed <- signal::sgolayfilt(smoothed, n = f1)
  }
  smoothed
}

RichardLH
March 17, 2014 4:14 am

Willis Eschenbach says:
March 17, 2014 at 12:22 am
“First, Richard, thanks for your work. Also, kudos for the R code, helped immensely.”
Thank you.
“My first question regarding the filter is … why a new filter? What defect in the existing filters are you trying to solve?”
Simplicity and accuracy.
“Mmmm … if that’s the only advantage, I’d be hesitant. I haven’t run the numbers but it sounds like for all practical purposes they would be about identical if you choose the width of the gaussian filter to match … hang on … OK, here’s a look at your filter versus a gaussian filter:…As you can see, the two are so similar that you cannot even see your filter underneath the gaussian filter … so I repeat my question. Why do we need a new filter that is indistinguishable from a gaussian filter?”
Actually it is just slightly better than a Gaussian. It completely removes the 12 month cycle rather than leaving a small sliver of that still in the output.
http://climatedatablog.files.wordpress.com/2014/02/fig-2-low-pass-gaussian-ctrm-compare.png
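The difference can be written down from the standard frequency responses. This is a sketch: $f$ is in cycles per sample, $\sigma$ is the standard deviation of an untruncated Gaussian kernel in samples, and both symbols are introduced here for illustration. The linear phase term of the running mean is omitted.

```latex
H_N(f) = \frac{\sin(\pi f N)}{N \sin(\pi f)}, \qquad
H_N\!\left(\tfrac{k}{N}\right) = 0 \quad (k = 1, \dots, N-1)

H_{\mathrm{CTRM}}(f) = H_{12}(f)\, H_{10}(f)\, H_{8}(f)
\;\Rightarrow\; H_{\mathrm{CTRM}}\!\left(\tfrac{1}{12}\right) = 0

H_{\mathrm{Gauss}}(f) = e^{-2 \pi^2 \sigma^2 f^2} > 0 \quad \text{for all } f
```

The 12-point stage puts an exact zero at one cycle per 12 samples, and the product keeps it, whereas the Gaussian response only decays towards zero.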
“There is indeed a “wiggle” in the data, which incidentally is a great word to describe the curve. It is a grave mistake, however, to assume or assert that said wiggle has a frequency or a cycle length or a phase.”
The choice of words was because I know I can’t prove a ‘cycle’ with what we have. Doesn’t mean you cannot observe what is there though.
“Let me show you why, using your data: The blue dashed vertical lines show the troughs of the wiggles. The red dashed vertical lines show the peaks of the wiggles. As tempting as it may be to read a “cycle” into it, there is no “~ 60 year cycle”. It’s just a wiggle. Look at the variation in the lengths of the rising parts of the wiggle—18 years, 40 years, and 41 years. The same is true of the falling parts of the wiggle. They are 29 years in one case and 19 years in the other. Nothing even resembling regular.”
Hmmm. I would question your choice of inflexion points. To do it properly it would probably be best to de-trend the curve first with the greater than 75 years line (not a straight line!) to get the central crossing points and then do any measurements. Peak to Peak is always subject to outliers so is usually regarded as less diagnostic. But as there are only two cycles this is all probably moot anyway. If there is anything else mixed in with this other than pure sine waves then all bets are off for period, phase and wave shape.
I just display what is there and see where it goes.
“The problem with nature is that you’ll have what looks like a regular cycle … but then at some point, it fades out and is replaced by some other cycle. ”
The interesting thing is when you do comparisons to some proxy data with the required resolution.
Then out pops some possible correlation that does need addressing.
“To sum up, you are correct that “what you can’t do is say the wriggle is not there”. It is there. However it is not a cycle. It is a wiggle, from which we can conclude … well … nothing, particularly about the future. ”
Well the 15 year S-G trend says the immediate future is downwards. You could conclude that, if the ~60 year ‘cycle’ repeats, then the downwards section is going to be 30 years long. Time alone will tell.

March 17, 2014 4:18 am

I have used Fourier analysis successfully to detect periodic and pseudo-periodic effects on data. When viewed in the frequency space, the peaks in the Fourier Transform identify such effects, while shallows, regions in the transform space where peaks are notably absent, identify ‘sweet spots’ where the cutoffs for filters can be tuned.
When applied to sunspot numbers, for example, there is a group of peaks corresponding to 11-13 year periods and a secondary group at about a 100 year period, but relatively little amplitude in between.
That is about the limit of applicability of the Fourier transform, however. The periodicity inherent in the mathematics makes it useless for extrapolations or for dealing with secular trends. This will allow optimization of tuned filters, however.

davidmhoffer
March 17, 2014 4:19 am

RichardLH;
It most certainly is.
>>>>>>>>>>>>>>>>>>>>>
How sad that both warmists and skeptics have become so obsessed with the measurement of average temperature and the analysis of it that we’ve completely forgotten what the original theory was that we were trying to prove or disprove. The theory is that doubling of CO2 increases downward LW by 3.7 w/m2. So, to determine if the theory is correct or not, we run right out and analyze a metric that has no linear relationship to w/m2 at all, and rationalize that somehow this is OK because the earth integrates it out for us.
One of the first things one ought to learn about understanding a problem is to use data that is as closely related to the problem as possible. Temperature is one order of data removed from the actual problem. Supposing that it is “integrated” is just such a supposition. It cannot be verified except by looking at the root data first and see if that theory holds. If we designed planes and bridges using second order data, there’d be a lot of engineers with their @ss on the line trying to explain why they did such an incredibly stupid thing and wound up responsible for the deaths of so many people.
But this is climate science so wild @ss guesses based on 2nd order data are OK for both advancing and refuting the theory.
pfffffft.

RichardLH
March 17, 2014 4:21 am

Greg Goodman says:
March 17, 2014 at 1:48 am
“Nice article Richard.
One criticism is the 75y RM. Having roundly accepted how bad and distorting these damn things are, you still put one in there. And what it shows is not that much use even if we could believe it.”
Thanks. You and Vaughan got me into using the correct 1.2067 value in the first place.
The single mean for the 75 year is Hobson’s choice really. The data is just not long enough for a full triple pass. Most of the error terms fall outside the pass band hopefully. I could use a 75 year S-G and, if I were to do any de-trending work, that is what I would probably use.
What it does show quite clearly is that all the ‘cycle’ is gone at 75 years, which was the main point. Whatever is there, it lies between 15 and 75 years and looks to be ~60 years long.
“Great article though. The traffic on WUWT should be helpful in getting greater awareness of the defects and distortions of runny means.”
Thanks again. Single means are just SO bad.

RichardLH
March 17, 2014 4:23 am

Kelvin vaughan says:
March 17, 2014 at 1:59 am
“Filter it enough and you will end up with a straight line showing a slow rise in temperature.”
Actually you don’t. You end up with the greater than 75 year curve which is at the limit of what the data shows.

RichardLH
March 17, 2014 4:26 am

george e. smith says:
March 17, 2014 at 2:05 am
“Filters throw away information. The raw data, is the most information you can ever have. Filters simply delude one into believing that something else is happening; other than what the measured sampled data values tell.”
Not true. Low pass/High pass filters/splitters always add together to produce the whole of the input signal. No data is ever ‘lost’. You just get to look at a sub-section of it.
If you want the High pass values do a 1-x (i.e. take the output from the 15 year low pass and subtract it from the input signal making sure you get the phase right – which will be a challenge for the 12 month) and you will get that data instead.
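A minimal R sketch of that complement, assuming a synthetic input and a single odd-width running mean as the low pass so that alignment is trivial (the names `running_mean`, `low` and `high` are illustrative, not from the original spreadsheet or R code):

```r
# Centred running mean; odd widths are exactly centred on each sample.
running_mean <- function(x, n) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 2))
}

set.seed(1)
x <- cumsum(rnorm(200))        # synthetic random-walk input, illustrative only

low  <- running_mean(x, 15)    # low-pass output (NA at both ends)
high <- x - low                # high-pass output: input minus low pass

# Where the low pass is defined, the two parts add back to the input.
ok <- !is.na(low)
all.equal(low[ok] + high[ok], x[ok])
```

With even widths such as 12, the low-pass output sits half a sample off the input grid per pass, so it has to be re-aligned before subtracting; that is the phase challenge mentioned above.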

RichardLH
March 17, 2014 4:32 am

Monckton of Brenchley says:
March 17, 2014 at 2:45 am
“I am very grateful to RichardLH for his courteous answers to my questions. ”
No problem.
“But I should like more detail on why the particular function he has chosen is any better at telling us what is happening in the data than the linear trends used by the IPCC etc.”
A linear trend is the ultimate in removing useful information.
Decide for yourself if this treatment of the UAH data for Land, Ocean and Combined provides a better insight into what happened in the last 34 years or a set of OLS straight lines.
http://climatedatablog.files.wordpress.com/2014/02/uah-global.png
“So, what does your curve tell us about global mean surface temperature that we do not already know?”
That things are much more naturally cyclic than the IPCC would like to concede. The biggest thing that the IPCC try to avoid is “how much of all of this is natural?”

RichardLH
March 17, 2014 4:33 am

Willis Eschenbach says:
March 17, 2014 at 2:52 am
“Once again, the CTRM filter underneath the gaussian filter is scarcely visible … so I repeat my question. Why do we need a new filter that is indistinguishable from a gaussian filter?”
Try that again without the full data points to make the difference almost two flat lines and you will see.

RichardLH
March 17, 2014 4:38 am

cd says:
March 17, 2014 at 3:04 am
“Nice concise post. I have three questions – sorry.”
Thanks and no problem.
“1) Are the cascade filters based on recursive operations using some basis functions? If yes, then is this not akin to curve fitting? I’ve probably misunderstood but I think you’ve taken issue with this before.”
No – this is most definitely NOT curve fitting. It is just a high quality low pass filter that is simple to implement. Much purer mathematically than the single running means often used.
“2) In the field I work in we tend to use the Butterworth filter extensively; but in light of the recent post on “signal stationarity” in WUWT, does such an approach seem inappropriate if the composed data series is the sum of many non-stationary signals. I suspect that there will be major issues with significant localised phase shifts (something I understand to be a problem with the Butterworth filter even with “perfect” data sets).”
This is pure Gaussian – or even slightly better than Gaussian in that it completely removes the 12 month ‘cycle’ without any additional distortions.
“3) Finally, for the purposes of quant operations, is the filtered data any use beyond signal analysis? For example, even if one uses the best filtering method, does the resulting processed series not take with it significant bias: when one tries to statistically quantify something like correlation between two filtered data sets. This seems a little open ended I know, but you see this type of analysis all the time.”
Well the option exists to do straight comparisons with other similarly filtered signals to see what differences, co-incidence exists without moving to statistics. You know what they say about statisticians 🙂

RichardLH
March 17, 2014 4:39 am

steveta_uk says:
March 17, 2014 at 3:12 am
“Richard, can you design a filter to remove comments from the people who don’t believe the data in the first place? Why they bother commenting is a mystery to me ;(”
Beyond my capabilities I’m afraid.

RichardLH
March 17, 2014 4:40 am

D. Cohen says:
March 17, 2014 at 3:33 am
“What happens when you apply the same filters to the sunspot data over the last 150 or so years?”
I haven’t looked at the sunspot data yet.

RichardLH
March 17, 2014 4:43 am

March 17, 2014 at 4:18 am
“I have used Fourier analysis successfully to detect periodic and pseudo-periodic effects on data.”
FFTs suffer great problems with noisy data, especially at the longer wavelengths as you approach the data series length. Given the noise in the temperature data series, I would be very cautious with anything that has fewer than about 5 (or better still 10) cycles in it when using FFTs.
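A rough R illustration of why the long-wavelength end is awkward, assuming a monthly record of about 164 years (roughly 1850 to 2014; the record length is an assumption made for this sketch). The only periods a discrete Fourier transform resolves are the record length divided by a whole number:

```r
n_months <- 164 * 12                 # assumed record length in months
k <- 1:6                             # lowest non-zero Fourier bins
periods_years <- n_months / k / 12   # period represented by each bin, in years
round(periods_years, 1)              # 164, 82, 54.7, 41, 32.8, 27.3
```

Under that assumption a ~60 year feature falls between the 82 year and 54.7 year bins and completes fewer than three cycles in the record, well short of the 5 to 10 cycles suggested above.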

RichardLH
March 17, 2014 4:46 am

davidmhoffer says:
March 17, 2014 at 4:19 am
“How sad that both warmists and skeptics have become so obsessed with the measurement of average temperature and the analysis of it that we’ve completely forgotten what the original theory was that we were trying to prove or disprove. The theory is that doubling of CO2 increases downward LW by 3.7 w/m2. So, to determine if the theory is correct or not, we run right out and analyze a metric that has no linear relationship to w/m2 at all, and rationalize that somehow this is OK because the earth integrates it out for us.”
I notice that you kinda skipped the part where I refuted your claims. Matter integrates incoming and outgoing power, continuous and pulsed, and shows the result as temperature. Or are you claiming different?
“pfffffft.”
Indeed.

Edim
March 17, 2014 4:51 am

Richard, just used your woodfortrees temperature graph and changed to ssn:
http://www.woodfortrees.org/plot/sidc-ssn/plot/sidc-ssn/mean:220/mean:174/mean:144/plot/sidc-ssn/mean:720

davidmhoffer
March 17, 2014 4:52 am

RichardLH;
I notice that you kinda skipped the part where I refuted your claims. Matter integrates incoming and outgoing power, continuous and pulsed, and shows the result as temperature. Or are you claiming different?
>>>>>>>>>>>>>>>>>>
On a long enough time scale maybe. On the time scale you are using here, absolutely not.
A perturbation to the system which changes the transport of energy from tropics to poles will result in a smaller change in temperature to the tropics and a larger one in the arctic regions. Average the temperatures out on a time scale too short to integrate them out and you get a result that is over represented by changes to cold temperatures and under represented by changes to warm temperatures.
We’re talking about a perturbation to the system that is decades to centuries long by the most conservative estimates. So no, you cannot rely on matter to integrate the result for you on time scales that you are using.

RichardLH
March 17, 2014 5:10 am

davidmhoffer says:
March 17, 2014 at 4:52 am
“On a long enough time scale maybe. On the time scale you are using here, absolutely not.”
On any time scale from milliseconds (or below) on upwards actually. Individual chunks of matter do this all the time. Which was the point I was refuting and you keep dodging.
“A perturbation to the system which changes the transport of energy from tropics to poles will result in a smaller change in temperature to the tropics and a larger one in the arctic regions. Average the temperatures out on a time scale too short to integrate them out and you get a result that is over represented by changes to cold temperatures and under represented by changes to warm temperatures.”
I agree that to get to a steady state through mass transport with the varying inputs we have will take a very long time and probably be cyclic. ~60 years would fall into that category. The greater than 75 years would be another.
“We’re talking about a perturbation to the system that is decades to centuries long by the most conservative estimates. So no, you cannot rely on matter to integrate the result for you on time scales that you are using.”
Well as that is how it does it I am not sure you are supporting your case.

RichardLH
March 17, 2014 5:14 am

Edim says:
March 17, 2014 at 4:51 am
“Richard, just used your woodfortrees temperature graph and changed to ssn:
http://www.woodfortrees.org/plot/sidc-ssn/plot/sidc-ssn/mean:220/mean:174/mean:144/plot/sidc-ssn/mean:720 ”
That would have been Greg’s graph with those values.
I would have used
http://www.woodfortrees.org/plot/sidc-ssn/plot/sidc-ssn/mean:360/mean:298/mean:204
which does raise some interesting questions.

Steve Taylor
March 17, 2014 5:15 am

Richard,
John Linsley Hood was one of the greatest analogue electronic design engineers the UK has ever produced. I wish I’d known him. I spent a lot of my time at university back in the 80’s building his designs when I should have been using my project budget on, well, my project.
I am sad he’s gone, but I don’t think he’ll be forgotten.

cd
March 17, 2014 5:20 am

RichardLH
With regard to your answer to my question 2. I appreciate that the Butterworth filter (frequency domain) can be seen as an approximation of a Gaussian convolution given complementary parameters. And I’m not talking about the relative merits of either. I guess what I’m really asking is: Is there a problem in designing a filter based on spectral profile (“hard” approach ), as opposed to eye-balling the parameter in the host domain (“soft” approach), if the composed signal (the raw data) is the sum of many non-stationary signals?
In my field of study, the Butterworth is effectively a passband filter where the higher frequencies are typically removed, according to the filter parameters which are determined from the series’ spectral profile. But the recent discussion, with respect to signal stationarity, would deem such an approach foolhardy rather than proper, and yet it’s used extensively.
I was interested in hearing your views on this if you have time.
I hope it doesn’t seem off topic.

davidmhoffer
March 17, 2014 5:21 am

RichardLH;
Well as that is how it does it I am not sure you are supporting your case.
>>>>>>>>>>>>
Doubling of CO2 changes the effective black body temperature of earth by precisely zero. There is your integration. Yet it increases downward LW flux. The latter is not integrated on the time scales that you are using, and you can’t measure it as an average by measuring and averaging temperature instead because temperature varies with the 4th root of P.
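The fourth-root point can be illustrated numerically with the Stefan-Boltzmann law, P = σT⁴. This is a sketch only: the two temperatures are hypothetical, both surfaces are treated as ideal black bodies, and the variable names are invented for the example.

```r
sigma <- 5.670374419e-8                   # Stefan-Boltzmann constant, W m^-2 K^-4
temps <- c(tropics = 300, arctic = 250)   # hypothetical surface temperatures, K

flux <- sigma * temps^4                   # emitted flux at each temperature, W m^-2

mean_temp      <- mean(temps)                    # simple average of temperatures
effective_temp <- (mean(flux) / sigma)^(1 / 4)   # temperature that matches the mean flux

# Temperature change for the same 3.7 W/m^2 flux change: dT = dP / (4 sigma T^3)
dT <- 3.7 / (4 * sigma * temps^3)

mean_temp        # 275 K
effective_temp   # higher than the simple average
dT               # larger for the colder region
```

Under these assumptions the averaged temperature and the temperature implied by the averaged flux differ, and the same flux change moves the colder region more than the warmer one, which is the non-linearity being argued about here.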

RichardLH
March 17, 2014 5:25 am

Steve Taylor says:
March 17, 2014 at 5:15 am
“John Linsley Hood was one of the greatest analogue electronic design engineers the UK has ever produced. I wish I’d known him. I spent a lot of my time at university back in the 80′s building his designs when I should have been using my project budget on, well, my project.
I am sad he’s gone, but I don’t think he’ll be forgotten.”
Between them he and my father (still alive) brought me up to be interested in engineering and science from a very early age. I doubt he will be forgotten. He used two very simple tools in his work, a precise notch [filter] and a very pure sine wave generator. Allowed him to look in great detail at the work he did with the results you saw.

RichardLH
March 17, 2014 5:26 am

EDIT; a precise notch filter

Edim
March 17, 2014 5:53 am

One is tempted to conclude,
“It’s the Sun stupid!”

Matthew R Marler
March 17, 2014 6:12 am

RichardLH: But this is most definitely NOT a curve fitting exercise! Indeed this is the exact opposite.
Using a word from Willis Eschenbach’s post (the word, I am not attributing an idea to him, though I agree with his main point), what you have done is indistinguishable from curve fitting. “Filtering” is nothing more than fitting data by a method that uses a set of basis functions, and then separating the results into two components (as said by Greg Goodman), one fit by the method and the basis functions, the other not.
If you know the noise, you can design a filter to reveal the signal. If you know the signal, you can design a filter to reveal the noise. When, as with modern temperature series, you have no claim to know either, you can fit dozens, maybe hundreds (Donoho calls the method “basis pursuit”) of filters equally well to the same data set, each claiming some superiority in some cases. With some filters, “the wiggle is there”, as you wrote; with some filters the wiggle is not there. Whether the “wiggle” is generated by an underlying process that persists and can be described by physical laws can not be determined by looking at the filtering results on the extant data.

RichardLH
March 17, 2014 6:20 am

cd says:
March 17, 2014 at 5:20 am
“I guess what I’m really asking is: Is there a problem in designing a filter based on spectral profile (“hard” approach ), as opposed to eye-balling the parameter in the host domain (“soft” approach), if the composed signal (the raw data) is the sum of many non-stationary signals?”
I guess what I am really saying is that by only making a simple 15 years low pass corner frequency choice I am not making any (or very few) assumptions about all of this and what is or is not present.
If you run a standard ‘sweep’ up the available bandwidth, then there is a ‘sweet spot’ between 12-20 years where the output doesn’t change much.
Typically that is always going to be a good choice for a binary chop point, hence the 15 year corner.
The rest is what then comes out.

David L. Hagen
March 17, 2014 6:22 am

Richard LH
For a future endeavor, you might find it interesting to explore the Global Warming Prediction Project and how their automatic model results compare with your filtered data.

RichardLH
March 17, 2014 6:23 am

davidmhoffer says:
March 17, 2014 at 5:21 am
“Doubling of CO2 changes the effective black body temperature of earth by precisely zero. There is your integration. Yet it increases downward LW flux. The latter is not integrated on the time scales that you are using, and you can’t measure it as an average by measuring and averaging temperature instead because temperature varies with the 4th root of P.”
I get you don’t think that measuring temperatures or seeing how they evolve over time is a useful exercise. But your logic is faulty as to the methodology of how energy is absorbed/emitted from an object and the temperature of that object over time. IMHO.

RichardLH
March 17, 2014 6:31 am

Matthew R Marler says:
March 17, 2014 at 6:12 am
“Using a word from Willis Eschenbach’s post (the word, I am not attributing an idea to him, though I agree with his main point), what you have done is indistinguishable from curve fitting.”
It is not. Curve fitting is deciding a function and then applying it to the data. No functions were harmed in the making of these graphs. 🙂
” “Filtering” is nothing more than fitting data by a method that uses a set of basis functions, and then separating the results into two components (as said by Greg Goodman), one fit by the method and the basis functions, the other not.”
Bandpass splitting, which is what low/high pass filters do, is more a binary choice type of exercise. No decisions are made or needed about what is or isn’t there. Only that it should be longer or shorter than the corner frequency.
“If you know the noise, you can design a filter to reveal the signal. If you know the signal, you can design a filter to reveal the noise.”
Signal = Greater than 15 years
Noise = Less than 15 years
Filter = CTRM or Gaussian
“When, as with modern temperature series, you have no claim to know either, you can fit dozens, maybe hundreds (Donoho calls the method “basis pursuit”) of filters equally well to the same data set, each claiming some superiority in some cases.”
Individual frequencies can be targeted or even groups of frequencies. That is not what is being done here. A binary chop, nothing more, nothing less.
“With some filters, “the wiggle is there”, as you wrote; with some filters the wiggle is not there. Whether the “wiggle” is generated by an underlying process that persists and can be described by physical laws can not be determined by looking at the filtering results on the extant data.”
Well whatever your logic there is something with some power in the 15 to 75 year bracket that the data says is present and needs explaining.

RichardLH
March 17, 2014 6:33 am

Edim says:
March 17, 2014 at 5:53 am
“One is tempted to conclude, “It’s the Sun stupid!””
One would need a physical method to connect one to the other in a cyclic fashion. So far nothing has made it past the required scrutiny. Those that fit the cycle seem to lack the power and vice versa.

RichardLH
March 17, 2014 6:41 am

David L. Hagen says:
March 17, 2014 at 6:22 am
“For a future endeavor, you might find it interesting to explore the Global Warming Prediction Project and how their automatic model results compare with your filtered data.”
As soon as I see Linear Trends you have lost me.
Linear Trend = Tangent to the curve = Flat Earth.
It is just that sort of narrow minded thinking that leads one down blind alleyways and into blind corners.
Nature never does things in straight lines, it is always curves and cycles and avalanches. Almost never in sine waves either unless we are talking about orbits and the like. Ordered/Constrained Chaos is at the heart of most things.

Henry Clark
March 17, 2014 6:44 am

Now the question is how I can improve it. Do you see any flaws in the methodology or tool I’ve developed?
Because HADCRUT4 data by CRU is greatly incorrect (such as within parts of the 1930-1980 period), as is that from Hansen’s GISS, unfortunately any analysis based on HADCRUT4 is also greatly incorrect. While such has the global cooling scare of the 1960s-1970s occur without substantial cooling beforehand in the global or Northern Hemisphere average, as if it just happened with little basis, I would challenge you or anyone (if inclined to defend it, though you might not be) to find any publication made prior to the CAGW movement of the 1980s onward which shows the 1960s-1970s without showing far more cooling relative to the warmer period of the 1930s-1950s.
For instance, a 1976 National Geographic’s publication of the temperature record of scientists of the time, in the following link, is one I have literally seen in paper form in a library, unlike those repeatedly rewritten-later electronic versions which by strange coincidence (not!) happen to be respectively by a group infamous in Climategate and by a department which has been under the direction of someone so much an activist as to have been arrested repeatedly in protests:
See that and other illustrations of original temperature data around 40% of the way down in http://tinyurl.com/nbnh7hq
Are there any particular combinations of data sets that you would like to see?
While it would be a significant amount of work, digging up original non-rewritten temperature data for history up through the 1970s (prior to the political era, back when there wasn’t motivation to fudge it), digitalizing it, and then carefully joining it onto later temperature data (from more questionable sources but when no alternative) would be a better starting point. Joining together two datasets like that isn’t in theory ideal but the best option in practice; several years of overlap could be used to help check the method. The Northern Hemisphere average, the Southern Hemisphere average, and the global average could best be each generated, as there are reasons, for instance, Antarctic temperatures follow different patterns than the arctic (as the prior link implies).
Of course, like me, you might not have time to do so in the near future even if you desired. But that is something needed yet utterly lacking, as browsing enough skeptic websites indirectly illustrates. The ideal for convenience of other analysis would be both producing plots and producing a list of the values by year (e.g. a data spreadsheet).

davidmhoffer
March 17, 2014 6:45 am

RichardLH;
But your logic is faulty as to the methodology of how energy is absorbed/emitted from an object and the temperature of that object over time. IMHO.
>>>>>>>>>>>>>>>>>>
By your logic, there is no need to average temperature across the earth for analysis at all. The matter having done the integration, all that is required according to you is a single weather station which will over time be representative of the earth’s energy balance. Good luck with that.

Henry Clark
March 17, 2014 6:53 am

To add, regarding this:
I would challenge you or anyone (if inclined to defend it, though you might not be) to find any publication made prior to the CAGW movement of the 1980s onward which shows the 1960s-1970s without showing far more cooling relative to the warmer period of the 1930s-1950s.
The Berkeley BEST set made in the modern day by someone who pretended to be a skeptic without pro-CAGW-movement bias but was an environmentalist (as found out in looking at some of his writing beforehand IIRC) does not even remotely meet that challenge. Only original paper publications (or a clear scan not looking rewritten in any way) made prior to the existence of the global warming movement would count.

RichardLH
March 17, 2014 6:55 am

Henry Clark says:
March 17, 2014 at 6:44 am
“Because HADCRUT4 data by CRU is greatly incorrect (such as within parts of the 1930-1980 period), as is that from Hansen’s GISS, unfortunately any analysis based on HADCRUT4 is also greatly incorrect. ”
Well they all draw from the same set of physical thermometers. You can take a look at the BEST database, which draws from a wide range of providers, to get a wider picture if you like.
This is more intended to analyse what is there in those series and compare and contrast them. The head figure is one where I have aligned the 4 major series over the 1979 onwards era and shown how they agree and disagree.
A later set from just UAH (CRU is pending) which compares the Land, Ocean and Combined is quite revealing.
http://climatedatablog.files.wordpress.com/2014/02/uah-global.png

RichardLH
March 17, 2014 6:56 am

davidmhoffer says:
March 17, 2014 at 6:45 am
“By your logic, there is no need to average temperature across the earth for analysis at all. The matter having done the integration, all that is required according to you is a single weather station which will over time be representative of the earth’s energy balance. Good luck with that.”
Straw man alert. You can answer that one yourself.

davidmhoffer
March 17, 2014 6:59 am

RichardLH.
Straw man alert. You can answer that one yourself.
>>>>>>>>>>>>>
No, you answer it. Does matter integrate energy inputs and outputs as you have argued? If so, why is more than a single weather station required for analysis?

RichardLH
March 17, 2014 7:06 am

davidmhoffer says:
March 17, 2014 at 6:59 am
“No, you answer it. Does matter integrate energy inputs and outputs as you have argued? If so, why is more than a single weather station required for analysis?”
Bully or what?
Well if you are going to get all technical on that then you would require a statistically representative sample of the various sub-environments that are present on the Earth’s surface.
I’ll start with these area based graphs/sampling sets as a reasonable first pass.
http://climatedatablog.wordpress.com/uah/
Probably needs to be augmented by some point sampling values as well (coming up with a CRU combined analysis).
Then you might be able to get close to the true picture of how fast/slow the integration methodology in the various materials are across the whole input surface, day to day and month to month.

Greg Goodman
March 17, 2014 7:08 am

Henry Clark: “See that and other illustrations of original temperature data around 40% of the way down in” http://tinyurl.com/nbnh7hq
Very interesting ! In that graph early 60’s is very much the same as early 20th c. Late 19th even cooler rather than warmer as now shown. That does not tell us either is correct.
Since Hansen did his little air-con con trick in 1988, I have no reason to think that he would not be equally misleading with his constant warming adjustments.
It would also not fit a 60 year cycle.
But what we can see is that the long term changes we are seeking to explain are primarily what the various adjustments have stuck in there rather than what the measurements actually were, whether that is for better or for worse.

Greg Goodman
March 17, 2014 7:15 am

MR Marler. “Filtering” is nothing more than fitting data by a method that uses a set of basis functions, and then separating the results into two components (as said by Greg Goodman),
I said nothing of the sort. Don’t use my name to back up your ignorant claims.

davidmhoffer
March 17, 2014 7:17 am

RichardLH;
Probably needs to be augmented by some point sampling values as well (coming up with a CRU combined analysis).
>>>>>>>>>>>>>
Probably? In other words, you don’t know.
Willis ran an article some time back on the average temperature of the moon not matching the Stefan-Boltzmann black body temperature. Does this mean SB Law is wrong? No. It means that averaging temperature in such a manner as to accurately represent the energy balance of the moon is near impossible. That’s for a body that is airless and waterless. Doing the same for the earth is orders of magnitude more complex.
I suggest you read Willis’ article as well as the various musings of Robert G Brown on these matters.

Henry Clark
March 17, 2014 7:25 am

RichardLH says:
March 17, 2014 at 6:55 am
“Well they all draw from the same set of physical thermometers.”
So did the data published prior to the 1980s, but that can be seen to be drastically different. When depicting average changes of a tiny fraction of 1% in absolute temperature, of tenths of a degree, they depend utterly on the interpolation between widely spaced apart stations, the choice of specific stations used, and, when applicable, hidden adjustments. The practical, convenient way to be certain of no bias in favor of the CAGW movement, without spending thousands of hours personally examining everything, is just to use data published prior to its existence.
In addition to the examples in my prior link, there are others, such as the Northern Hemisphere average history of the National Academy of Sciences, illustrated at http://stevengoddard.files.wordpress.com/2014/03/screenhunter_637-mar-15-11-33.gif in http://stevengoddard.wordpress.com/2014/03/15/yet-another-smoking-gun-of-data-fraud-at-nasa/ , which is utterly different from the CRU/BEST rewritten versions of NH average as well as global average history.
“The head figure is one where I have aligned the 4 major series over the 1979 onwards era and shown how they agree and disagree.”
Obviously, and similar has been seen before; the undercover CAGW movement supporters on Wikipedia publish a similar plot. However, assuming for the sake of argument that RSS or UAH are relatively accurate, correspondence with them from 1979 onwards has absolutely jack to do with disproving rewriting of the pre-1979 section, which is easier to get away with.
If you wish to argue this, then try to meet my challenge:
Find (and link or upload) any publication made prior to the CAGW movement of the 1980s onward which shows the 1960s-1970s without showing far more cooling relative to the warmer period of the 1930s-1950s.
Since you’re arguing this is merely a matter of them all using the same thermometers, that should be easy rather than impossible. Again, what is shown must have been published back then, not a BEST publication of decades later for instance.

Greg Goodman
March 17, 2014 7:25 am

cd. “In my field of study, the Butterworth is effectively a passband filter where…”
Many of these filters, commonly used in electronics, are not really applicable to relatively short time series. They can be programmed but usually by recursive formulae. That means they need a long ‘spin-up’ period before they converge and give a reasonably stable result that is close to intended characteristics. The spin up in practice is usually nearly as long as the data for climate!
Also they mostly have really lumpy stop-band leakage.
In electronic applications it may take a few hundredths of a second to settle, then be in continuous use. Not really the same thing as climate data. Hence they are not generally much help.
That is why FIR ( finite impulse response ) filters are usually applied here.
My favourite for this use is Lanczos, unless you need to get really close to the end of the data.

RichardLH
March 17, 2014 7:27 am

davidmhoffer says:
March 17, 2014 at 7:17 am
“Probably? In other words, you don’t know.”
Don’t put words incorrectly into my mouth.
I was pointing out that area (actually volume) sampling instruments are all very well but can properly be supplemented by point sampling instruments. They both have different time sampling methodologies so that integration alone is non trivial.
Again we are wandering off the original point where you claimed that measuring temperature was useless and I pointed out that integration by matter into temperature over time made your claim invalid.

Henry Clark
March 17, 2014 7:28 am

My reply is stuck in moderation at the moment, perhaps from some word in it, probably going to appear later.

davidmhoffer
March 17, 2014 7:40 am

RichardLH;
Again we are wandering off the original point where you claimed that measuring temperature was useless and I pointed out that integration by matter into temperature over time made your claim invalid.
>>>>>>>>>>>>>>>
No, we’re not wandering at all. I’m explaining why and you’re coming up with excuses that you can’t defend with anything other than explanations that begin with “probably” and muse about the integration being “non trivial” but never actually answering directly the points I’ve raised.
I suggest again that you read the information that I’ve pointed you to.

Greg Goodman
March 17, 2014 7:41 am

Richard: “This is pure Gaussian – or even slightly better than Gaussian in that it completely removes the 12 month ‘cycle’ without any additional distortions.”
It does not have the ugly distortions of the inverted lobes in the stop band that make such a mess with simple running mean but to be accurate you should recognise that progressively attenuating all frequencies right from zero is a distortion (other than the intended “distortion” of removing higher frequencies).
Nothing’s perfect, it’s always a compromise. That is why at least knowing what the filter does is so important when choosing one, and why this article is so helpful. Just be realistic about how good Gaussian or triple running means are.

Bart
March 17, 2014 7:46 am

Greg Goodman says:
March 17, 2014 at 12:34 am
“Do you have a similar graph that goes back to say 1850?”
I only really trust the CO2 data to 1958. Proxies are… proxies, i.e., not direct measurements. There are no means truly to verify them.
But, this was a hangup with Ferdinand Engelbeen. He was keen to point out that, if you accept the ice core measurements as accurate, the relationship should predict too low a CO2 level before 1958, as here.
Besides the fact that I do not trust the ice core data to deliver a faithful reproduction of CO2 levels, I pointed out it was moot anyway, because knowing what happened since 1958 is enough to discount the influence of human inputs during the era of most significant modern rise.
But, if one really must insist on matching a dodgy set of data farther back, there is no reason that the relationship
dCO2/dt = k*(T – To)
must be steady. The fact that it has been since 1958 is actually quite remarkable. But, there could easily have been a regime change in, say, about 1945 which altered the parameters k and To. I showed the effect of such a regime change to To could make the records consistent here. What could that signify? Possibly an alteration in the CO2 content of upwelling waters at about that time. Maybe Godzilla emerging from hibernation stirred up a cache of latent CO2 at the bottom of the ocean. Who knows?
But, in any case, it is moot. Knowing what happened since 1958 is enough to discount the influence of human inputs during the era of most significant modern rise.
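The relationship dCO2/dt = k*(T – To) discussed above can be explored numerically. The following is only a minimal sketch: the function name, the values of k and To, the step size and the temperature series are all made-up placeholders, not fitted to any data set, and the optional switch of To at a chosen step is just one way to mimic the "regime change" idea.

```python
def integrate_co2(temps, k, t0, co2_start, dt=1.0, t0_after=None, switch_index=None):
    """Euler integration of dCO2/dt = k * (T - To).

    temps        : temperature anomaly per step (hypothetical data)
    k, t0        : rate constant and equilibrium temperature (assumed values)
    t0_after     : optional new To applied from switch_index onward
    Returns the CO2 level at every step, starting with co2_start.
    """
    co2 = [co2_start]
    for i, temp in enumerate(temps):
        baseline = t0
        if t0_after is not None and switch_index is not None and i >= switch_index:
            baseline = t0_after
        co2.append(co2[-1] + k * (temp - baseline) * dt)
    return co2


# Hypothetical example: constant anomaly of 0.5 with To = -0.5 and k = 2.0 per step
steady = integrate_co2([0.5] * 10, k=2.0, t0=-0.5, co2_start=315.0)
# Same input, but To moves from -0.2 to -0.5 at step 5 (a "regime change")
shifted = integrate_co2([0.5] * 10, k=2.0, t0=-0.2, co2_start=315.0,
                        t0_after=-0.5, switch_index=5)
print(steady[-1], shifted[-1])
```

With a constant temperature the integrated level rises in a straight line, and a change in To simply changes the slope from that step onward, which is why a single shift of To can reconcile two otherwise inconsistent segments.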

Bart
March 17, 2014 7:54 am

RichardLH: To the point of your post, I would say that you need a filter with near unity gain in the passband a little above 1/60 years^-1, which falls off rapidly thereafter. The Parks-McClellan algorithm was all the rage back when I was an undergraduate, and I think the Remez exchange algorithm still forms the kernel of many modern filter design algorithms. Free code for the algorithm may be readily found on the web, though it is generally in FORTRAN.

Greg Goodman
March 17, 2014 7:56 am

Thanks Bart. That was not a challenge to what you said, you just mentioned something about it varying earlier and I presumed you had a graph that went further back that may have been interesting. I agree pre-1958 is a different story in CO2 records.
Those who wish to infer something from ice-cores showing stable millennial scale relationships and modern climate are comparing apples to oranges. The magnitude of the long term , in-phase relationship does not tell us anything about the magnitude of the short term orthogonal relationship.
Ferdi did suggest some ice core data for circa 1500 with 50 year resolution, that may be relevant but I won’t digress too far into that discussion here.
Your graph was very interesting. I may have another look at that. Thanks.

RichardLH
March 17, 2014 8:00 am

davidmhoffer says:
March 17, 2014 at 7:40 am
“No, we’re not wandering at all. I’m explaining why and you’re coming up with excuses that you can’t defend with anything other than explanations that begin with “probably” and muse about the integration being “non trivial” but never actually answering directly the points I’ve raised.
I suggest again that you read the information that I’ve pointed you to.”
Hmmm. I point out that matter integrates energy over time for both inward and outward flows and you never address that point.
Then you raise a straw man about how many sampling points are needed above one. I answer that in a reasonable way and you bluster on.
Read it. Not interested.

RichardLH
March 17, 2014 8:03 am

Greg Goodman says:
March 17, 2014 at 7:41 am
“It does not have the ugly distortions of the inverted lobes in the stop band that make such a mess with simple running mean but to be accurate you should recognise that progressively attenuating all frequencies right from zero is a distortion (other than the intended “distortion” of removing higher frequencies).
Nothings perfect , it’s always a compromise. That is why at least knowing what the filter does is so important when choosing one. and why this article is so helpful. Just be realistic about how good gaussian or triple running means are.”
Well as you can always get the ‘other half’ out by doing 1-x to get the high pass filter it is pretty good for a few lines of code 🙂
True there will always be some blurring of frequencies around the corner value and those will show up in some measure in both pass and stop outputs instead of just one but as you say, nothing is perfect.

RichardLH
March 17, 2014 8:08 am

Bart says:
March 17, 2014 at 7:54 am
“RichardLH: To the point of your post, I would say that you need a filter with near unity gain in the passband a little above 1/60 years^-1, which falls off rapidly thereafter. The Parks-McClellan algorithm was all the rage back when I was an undergraduate,”
Thanks for the suggestion but the frequency response curve is way too ‘ringy’ for me.
http://en.wikipedia.org/wiki/File:Pmalgorithm.png
That is the main problem with most of the higher order filters, there are none that respond well to square wave input. Gaussian is probably the best as far as that goes, hence a near Gaussian with only a few lines of code seemed best.
If you want to use true Gaussian then it will do nearly as well. Just stick with S-G rather than switch disciplines to LOWESS in that case.

rgbatduke
March 17, 2014 8:15 am

“Does throw up another question though. Why is it that GISS and HadCRUT are so far apart in the middle? They are close together at the start and the finish, why so different in the middle? I am not sure GISS (or HadCRUT) will thank me that much.”
GISS has a (broken, often backwards) UHI correction. HADCRUT does not. Probably other reasons as well. The UHI correction used by GISS maximally affects the ends.
You cannot really say that they are close at the start and finish and different in the middle, because they are anomalies with separately computed absolute bases. That is, there is no guarantee that the mean global temperature computed from the corrected GISS data will correspond to that computed from the uncorrected HADCRUT4 data. Consequently, you can shift the curves up or down by as much as 1C. So perhaps they should be adjusted to be close in the middle, and maximally different at the start and finish.
There are multiple data adjustments in the two series, and while there is substantial overlap in data sources, the data sources are not identical. And then, as Steve M. notes, there are other long running series with their OWN adjustments and (still overlapping) data sources. It is amusing to see how different they are when plotted on the same axes via e.g. W4T, and then to imagine how their similarities and differences would change if one moved them up or down as anomalies around their distinctly computed global average temperatures on an absolute scale. It is even more amusing to consider how they would change if one actually used e.g. personal weather station data that is now readily available on a fairly impressive if somewhat random grid at least in the United States to compute an actual topographical UHI correction with something vaguely approximating a valid statistical/numerical basis for all of the series. It is more amusing still to consider what the series would look like with error bars, but that, at least, we will never see because if it were ever honestly plotted, the game would be over.
This is a partial entrée towards not exactly an explicit correction to your approach, but rather a suggestion for future consideration.
If one does look at your figure 2 above (HADCRUT4 vs smoothed), you will note that the removed noise scales from the right (most recent) to the left (oldest). This is, of course, a reflection of the increasing underlying uncertainty in the data as one goes into the more remote past. In 1850 we knew diddly about the temperature of the world’s oceans (which cover 70% of the “surface” in the global average surface temperature) and there were whole continental-sized tracts of the planet that were virtually unexplored, let alone reliably sampled for their temperature.
This leaves us with several chicken-and-egg problems that your filter may not be able to compensate/correct for. The most important one is systematic bias like the aforementioned UHI, or systematic bias because places like the Arctic and Antarctic and central Africa and central Australia and Siberia and most of the Americas or the world’s oceans were pitifully poorly represented in the data, and some of those surface areas make dominant contributions to the perceived “anomaly” today (as maps that plot relative anomaly clearly show). Your smoothed curve may smooth the noisy data to reveal the underlying “simple” structure more clearly, but it obviously cannot fix systematic biases, only statistically neutral noise. I know you make no such claim, but it is important to maintain this disclaimer as the differences between different global average temperature anomalies are in part direct measures of these biases.
The second is that your smoothed curve still comes with an implicit error. Some fraction of the removed/filtered noise isn’t just statistical noise, it is actual measurement error, method error, statistical error that may or may not be zero sum. It might be worthwhile to do some sort of secondary computation on the removed noise — perhaps the simplest one, create a smoothed mean-square of the deviation of the data from the smoothed curve — and use it to add some sort of quasi-gaussian error bar around the smoothed curve. Indeed, plotting the absolute and signed difference between the smoothed curve and the data would itself be rather revealing, I think.
rgb
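The "smoothed mean-square of the deviation" idea above can be sketched in a few lines. This is an illustration under stated assumptions only: a plain centred running mean of odd width stands in for the CTRM output, the function names are hypothetical, and the series is synthetic with deliberately larger scatter in its first half.

```python
import math

def centred_mean(x, width):
    """Centred running mean of odd width; None where the window does not fit."""
    half = width // 2
    out = [None] * len(x)
    for i in range(half, len(x) - half):
        out[i] = sum(x[i - half:i + half + 1]) / width
    return out

def residual_band(x, smooth, width):
    """Residuals (data minus smooth) and their centred running RMS."""
    resid = [None if s is None else xi - s for xi, s in zip(x, smooth)]
    half = width // 2
    band = [None] * len(x)
    for i in range(half, len(x) - half):
        window = resid[i - half:i + half + 1]
        if all(r is not None for r in window):
            band[i] = math.sqrt(sum(r * r for r in window) / width)
    return resid, band

# Synthetic series: linear trend plus alternating "noise" that is five
# times larger in the first half than in the second (made-up numbers).
n = 200
data = [0.01 * i + (1.0 if i < n // 2 else 0.2) * (-1) ** i for i in range(n)]
smooth = centred_mean(data, 13)          # stand-in for the CTRM output
resid, band = residual_band(data, smooth, 13)
print(round(band[50], 3), round(band[150], 3))
```

Plotting `smooth` plus and minus `band` would give the kind of quasi-Gaussian envelope suggested, wider where the older data are noisier. Like the smoother itself, it says nothing about systematic bias.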

RichardLH
March 17, 2014 8:17 am

Greg:
Your request for a Lanczos filter in R seems to have already been met.
http://stackoverflow.com/questions/17264119/using-lanczos-low-pass-filter-in-r-program

GlynnMhor
March 17, 2014 8:22 am

A better result can be obtained using an Ormsby zero phase filter:
http://www.xsgeo.com/course/filt.htm
Industrial scale treatment of seismic data uses Ormsby filters, applied in the Fourier domain, almost exclusively to frequency limit the data.
The frequency content and filter slopes can be established in the design of the filter operator, and the operator is always run from end to end of the data without truncation of the output.

RichardLH
March 17, 2014 8:23 am

rgbatduke says:
March 17, 2014 at 8:15 am
“GISS has a (broken, often backwards) UHI correction. HADCRUT does not. Probably other reasons as well. The UHI correction used by GISS maximally affects the ends.
You cannot really say that they are close at the start and finish and different in the middle, because they are anomalies with separately computed absolute bases.”
Well for that alignment I did a simple OLS for the whole 1979-today period for each source and then adjusted the offsets in the various series to suit. UAH, RSS, GISS and HadCRUT.
http://climatedatablog.wordpress.com/combined/
has the whole story and that was going to be the heart of my next posting (hopefully).
Added some well known proxies in the mixture to extend the data set a few more years backwards into the past as well 🙂
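One possible reading of the alignment step described above, as a minimal sketch: fit an OLS line to each series over the common period, then shift each series so that its fitted line meets the reference line at the middle of the period. The function names and the numbers are hypothetical and are not real UAH, RSS, GISS or HadCRUT values.

```python
def ols_line(t, y):
    """Ordinary least squares fit y ~ slope * t + intercept."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean

def align_to_reference(t, reference, others):
    """Shift each series so its OLS line meets the reference line at mid-period."""
    t_mid = (t[0] + t[-1]) / 2
    ref_slope, ref_icpt = ols_line(t, reference)
    ref_mid = ref_slope * t_mid + ref_icpt
    aligned = {}
    for name, series in others.items():
        slope, icpt = ols_line(t, series)
        offset = ref_mid - (slope * t_mid + icpt)
        aligned[name] = [v + offset for v in series]
    return aligned

# Made-up monthly anomalies on different baselines
months = list(range(120))
series_a = [0.0015 * m - 0.10 for m in months]
series_b = [0.0015 * m + 0.25 for m in months]
aligned = align_to_reference(months, series_a, {"b": series_b})
print(round(aligned["b"][0] - series_a[0], 6))
```

Only the offsets are changed; the slopes of the individual series are left alone, so any disagreement in trend between sources stays visible after alignment.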

RichardLH
March 17, 2014 8:25 am

GlynnMhor says:
March 17, 2014 at 8:22 am
“A better result can be obtained using an Ormsby zero phase filter:
http://www.xsgeo.com/course/filt.htm”
It would be interesting to see how the output from that differed (if at all) from the CTRM. I suspect that given the data length and the time periods required the output will be nearly identical.

Bart
March 17, 2014 8:31 am

RichardLH says:
March 17, 2014 at 8:08 am
“Thanks for the suggestion but the frequency response curve is way too ‘ringy’ for me.”
That’s just an heuristic diagram. You can design for the amount of ripple, and can make it arbitrarily small for arbitrary length of the filter.
That is much better than a filter design which starts attenuating at any frequency above zero, like the Gaussian filter.

RichardLH
March 17, 2014 8:43 am

Bart says:
March 17, 2014 at 8:31 am
“That’s just an heuristic diagram. You can design for the amount of ripple, and can make it arbitrarily small for arbitrary length of the filter.”
But as far as I know that requires extra input data series length, which is the one thing we do not have.
You actually want full attenuation of zero to corner frequency, i.e. a brick wall filter, but that is precisely what a single running mean is and look what it does in the frequency domain.
http://climatedatablog.files.wordpress.com/2014/02/fig-1-gaussian-simple-mean-frequency-plots.png
Gaussian seems the best compromise all round, and CTRM is a very simple way to get that.
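The frequency-domain behaviour being argued over can be checked directly from the textbook gain of a running mean, sin(πfw) / (w·sin(πf)). The sketch below assumes cascade widths of 12, 10 and 8 months, which is an assumption about the CTRM stages rather than something stated in this comment; the function names are hypothetical.

```python
import math

def running_mean_gain(period, width):
    """Gain of a running mean of `width` samples at a given period (in samples)."""
    f = 1.0 / period                      # cycles per sample
    return math.sin(math.pi * f * width) / (width * math.sin(math.pi * f))

def cascade_gain(period, widths=(12, 10, 8)):
    """Gain of several running means applied one after another."""
    gain = 1.0
    for w in widths:
        gain *= running_mean_gain(period, w)
    return gain

# Columns: period in months, single 12-month mean, assumed 12/10/8 cascade
for period in (120, 60, 24, 12, 8, 6):
    print(period, round(running_mean_gain(period, 12), 4), round(cascade_gain(period), 4))
```

Both filters have a null at exactly 12 months. The single mean swings negative (inverted lobes) at shorter periods, while the cascade stays close to zero there. The cascade also attenuates slightly even at a period of 120 months, which is the progressive roll-off from zero frequency raised earlier in the thread.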

Clay Marley
March 17, 2014 8:47 am

The OP talks about using this triple mean filter to remove the seasonal component of data, but then showed an example using anomaly data, which already has the seasonal component removed.
The easiest and best way to remove seasonal components from seasonal data is to do what everyone else does, use anomalies. Then smooth that if desired with LOESS or Gaussian, which as a few have pointed out, can be made virtually indistinguishable from the triple mean filter.
I’d be more interested to see how filters work on not-quite-so-seasonal data. Sunspots for example, have varying periods so anomalies do not work. Willis has effectively criticized several papers that do a very poor job of smoothing sunspots.
Also, data on total land ice and snow, while seasonal, do not work well with anomalies. The reason is the peak snow coverage is very sharp and varies over a range of 5-6 weeks each year, leaving periodic spikes in the anomaly data.

RichardLH
March 17, 2014 8:57 am

Clay Marley says:
March 17, 2014 at 8:47 am
“The OP talks about using this triple mean filter to remove the seasonal component of data, but then showed an example using anomaly data, which already has the seasonal component removed. The easiest and best way to remove seasonal components from seasonal data is to do what everyone else does, use anomalies. ”
The problem with that is that the ‘normal’ is only constructed with, say, 30 samples, often with built-in errors in those samples as they are sub-sampled single running means themselves. This then gets translated into errors in the ‘normal’ which then get stamped into all the ‘Anomaly’ sets produced using it from then on.
An Annual filter removes those included errors and produces a mathematically more accurate result at monthly sampling rate.
If you want a look at this applied directly to temperature rather than anomalies then try
http://climatedatablog.wordpress.com/cet/
which shows the CET with just such a treatment. You could turn that into an Anomaly of self set by subtracting the overall average if you like.

Bernie Hutchins
March 17, 2014 9:16 am

A “zero-phase” filter is not possible with time-series data as it would generally be anti-causal. So when a signal processing engineer says zero-phase he/she usually means a FIR filter with impulse response that is even-symmetric with respect to the center. This is properly called “linear phase”, or constant time delay. The frequency response is a Fourier Series in the frequency variable – just a weighted sum of cosines in this case. A moving average is linear phase, for example, as is Gaussian or SG.
Butterworth responses are traditionally IIR and not FIR, and are not linear phase. It is however possible to design a “Butterworth” linear phase by taking a suitable Butterworth characteristic, sampling the magnitude response, substituting in (imposing) a linear phase, and solving for the impulse response. The impulse response can be just the inverse Discrete-Time Fourier Transform (DTFT is slow form of FFT), which is called “frequency sampling” in the textbooks, but this often still has passband ripples (again because it is the sum of cosines after all). So a generalized form of frequency sampling is used where a very large (over-determined) set of magnitude samples is used and the inversion is done by least squares (again the “pseudo-inverse”) down to a reasonable length FIR filter. Works great.
Parks-McClellan is another good option with well defined passband and stopband ripples (Chebyshev error criterion). It is inherently FIR and linear phase. It also has no closed form derivation, hence the PM Algorithm. The book Tom Parks wrote with Sid Burrus “Digital Filter Design” is classic. Filter design is well established and readily available.
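The statement that an even-symmetric FIR filter has a frequency response that is "just a weighted sum of cosines" can be verified numerically. This is a small sketch with an arbitrarily chosen 5-point triangular kernel and hypothetical function names; it evaluates the response with the taps indexed about the centre, so the constant delay is already taken out.

```python
import cmath
import math

def dtft_centred(h, f):
    """Frequency response of an odd-length FIR h, indexed about its centre tap."""
    m = len(h) // 2
    return sum(hk * cmath.exp(-2j * math.pi * f * (k - m)) for k, hk in enumerate(h))

def cosine_sum(h, f):
    """Same response as h[centre] + 2 * sum(h[centre + k] * cos(2*pi*f*k))."""
    m = len(h) // 2
    return h[m] + 2.0 * sum(h[m + k] * math.cos(2 * math.pi * f * k) for k in range(1, m + 1))

# Example even-symmetric kernel (a 5-point triangular smoother, chosen arbitrarily)
kernel = [1 / 9, 2 / 9, 3 / 9, 2 / 9, 1 / 9]
for f in (0.0, 0.1, 0.25, 0.4):
    response = dtft_centred(kernel, f)
    print(f, round(response.real, 6), round(abs(response.imag), 12), round(cosine_sum(kernel, f), 6))
```

The imaginary part vanishes to rounding error at every frequency, so the only phase the filter introduces is the fixed delay of half its length. That is the property shared by a moving average, a Gaussian and a CTRM when their output is plotted at the centre of the window.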

RichardLH
March 17, 2014 9:29 am

Bernie Hutchins says:
March 17, 2014 at 9:16 am
There are many good filter designs out there with their own particular characteristics. Few offer the benign characteristics of Gaussian though.
http://en.wikipedia.org/wiki/Gaussian_filter
…“It is considered the ideal time domain filter”…
Being able to implement a filter of that type with only a few lines of code (or addition steps in the case of a spreadsheet) is very difficult. CTRM manages to do so.
Also we have the requirement of a desperately short data series. Most work in other disciplines that use filters of different types have long ‘run up’ times as Greg mentions above.
So a combination of factors which means (pun) that a CTRM is a simple, Occam’s Razor, filter choice 🙂

cd
March 17, 2014 9:50 am

Greg Goodman
They can be programmed but usually by recursive formulae.
I’m guessing this is what’s needed in the time/spatial domain? I’ve only ever used the Butterworth filter in the frequency domain where as I’m sure you know things are a lot “easier”.
That means they need a long ‘spin-up’ period before they converge and give a reasonably stable result that is close to intended characteristics.
Again, my use of them is obviously far more presumptuous than yours. I simply use them as they were intended as passband filters; you obviously have greater experience here and understand the nuances better, but in the frequency domain there is no recursion and no convergence, at least not how I use them ;).
I guess, and I’d be interested in hearing your view. It is often suggested, and for argument’s sake we assume an almost exhaustive data set, that carrying out processing in the frequency domain is the correct approach to applying filters; where the data series is possibly/likely the composite of other periodic series. But as the recent post relating to signal stationarity (on WUWT, where you commented) suggested, doing anything according to the “global” spectrum of a signal is fraught with danger. Therefore, applying any filter in the frequency domain is in itself creating local artifacts if the underlying periodic components are non-stationary (e.g. data series composed from several “frequency-modulated” components, I use the term loosely here for brevity).
I’m not pretending to be an expert here, but that recent post seemed somewhat contrived and a little light on the implications.

brians356
March 17, 2014 10:16 am

Willis Eschenbach (Editor)
March 17, 2014 10:40 am

RichardLH says:
March 17, 2014 at 4:14 am

Willis Eschenbach says:
March 17, 2014 at 12:22 am

“First, Richard, thanks for your work. Also, kudos for the R code, helped immensely.”

Thank you.

“My first question regarding the filter is … why a new filter? What defect in the existing filters are you trying to solve?”

Simplicity and accuracy.

You’ll have to define “accuracy” for me in terms of a smoother … how are you measuring that?
The same is true of “simplicity”. Simplicity of what? Implementation? In what computer language?

“Mmmm … if that’s the only advantage, I’d be hesitant. I haven’t run the numbers but it sounds like for all practical purposes they would be about identical if you choose the width of the gaussian filter to match … hang on … OK, here’s a look at your filter versus a gaussian filter:…As you can see, the two are so similar that you cannot even see your filter underneath the gaussian filter … so I repeat my question. Why do we need a new filter that is indistinguishable from a gaussian filter?”

Actually it is just slightly better than a Gaussian. It completely removes the 12 month cycle rather than leaving a small sliver of that still in the output.
http://climatedatablog.files.wordpress.com/2014/02/fig-2-low-pass-gaussian-ctrm-compare.png

As I showed above, your claim about the 12 month cycle is simply not true. Let me repeat it here.

There is no visible difference between your smoother and a standard gaussian smoother. Nor is there any 12 month cycle in the residuals between them (yours minus gaussian). Your claim is falsified.
And as your own link clearly shows, there is only a tiny difference between yours and gaussian, and it’s way out on the edge of the rolloff … color me unimpressed. When a difference makes a difference that is less than a line’s width on the screen, I don’t care.

“There is indeed a “wiggle” in the data, which incidentally is a great word to describe the curve. It is a grave mistake, however, to assume or assert that said wiggle has a frequency or a cycle length or a phase.”

The choice of words was because I know I can’t prove a ‘cycle’ with what we have. Doesn’t mean you cannot observe what is there though.

I do observe what’s there. What I don’t do is call it a cycle.

“Let me show you why, using your data: The blue dashed vertical lines show the troughs of the wiggles. The red dashed vertical lines show the peaks of the wiggles. As tempting as it may be to read a “cycle” into it, there is no “~ 60 year cycle”. It’s just a wiggle. Look at the variation in the lengths of the rising parts of the wiggle—18 years, 40 years, and 41 years. The same is true of the falling parts of the wiggle. They are 29 years in one case and 19 years in the other. Nothing even resembling regular.”

Hmmm. I would question your choice of inflexion points. To do it properly it would probably be best to de-trend the curve first with the greater than 75 years line (not a straight line!) to get the central crossing points and then do any measurements. Peak to Peak is always subject to outliers so is usually regarded as less diagnostic. But as there are only two cycles this is all probably moot anyway. If there is anything else mixed in with this other than pure sine waves then all bets are off for both period, phase and wave shape.

So you’re taking up the Greg Goodman style of debate, where you just make some off-the-wall claim and then neglect to actually go see if it is true?
Look, Richard. I did the work to produce my estimate of the inflection points, which clearly shows there is not a regular cycle of any kind.
Now if you think that it is not correct because I didn’t “detrend the curve first with the greater than 75 years line (not a straight line!)”, then put your money where your mouth is and do the legwork to show us that your claim is true. I did it for my claim … your turn.
I don’t even know what kind of a “75 years line (not a straight line!)” you’d recommend we use to detrend it. So do the work and come back and show us just how much your (unknown) procedure changes the cycle lengths … my prediction is, not much …

I just display what is there and see where it goes.

No, you display what is there and then extrapolate it unjustifiably.

“The problem with nature is that you’ll have what looks like a regular cycle … but then at some point, it fades out and is replaced by some other cycle. ”

The interesting thing is when you do comparisons to some proxy data with the required resolution.
Then out pops some possible correlation that does need addressing.

That’s your evidence? A graph of the smoothed swings of the PDO? Look, there’s no regular cycle there either. I don’t have your data or code, but by eye, the Chen data (red line) has cycles of lengths 90, 90, 110, 60, and 60 years … is that a “possible correlation that does need addressing”? Is that another of the famous “approximately sixty year” cycles? Or is it an “approximately ninety years” cycle?
In any case, I don’t know if it’s a “possible correlation” that needs addressing, because you haven’t said what it was supposed to be correlated with. Is there another wiggle with cycles of 90, 90, 110, 60, and 60 years that I don’t know about?

“To sum up, you are correct that “what you can’t do is say the wriggle is not there”. It is there. However it is not a cycle. It is a wiggle, from which we can conclude … well … nothing, particularly about the future. ”

Well the 15 year S-G trend says the immediate future is downwards.

Yes … but your method (and a gaussian smooth and a loess smooth), on the other hand says it’s just gone level. Who you gonna believe?

You could conclude that, if the ~60 year ‘cycle’ repeats, then the downwards section is going to be 30 years long. Time alone will tell.

Look, Richard. Either a cycle is 60 years, or it is not. Whenever anyone starts waving their hands and saying that there is an “approximately 60 year cycle”, I tune out because there’s no such thing.
Let me go over the bidding again. If we measure from peak to peak, there are two cycles in the HadCRUT4 data, of length 48 years and 60 years. If we measure trough to trough, there are two cycles in the data, of length 70 years and 61 years.
Now if you wish to call a wiggle with “cycle lengths” of 61, 48, 70, and 60 years an “approximately 60 year cycle”, and then project that putative “cycle” out for 30 years, I can’t stop you.
I will, however, point out that such behavior is a sign of incipient cyclomania, and in the event it persists longer than four hours you should seek immediate medical assistance …
w.

John West
March 17, 2014 10:42 am

RichardLH says:
”This does allow quite a detailed look at the temperature series that are available. It allows for meaningful comparisons between those series.”
A detailed look at manipulated junk is still manipulated junk (IMHO). I’ll concede it does allow for comparing one manipulated junk data-set to another manipulated junk data-set and thereby revealing they’re different manipulated junk datasets. (JMHO)
Just look at your figure 4 between 1945 and 1970. Does that look like something that would cause an ice age scare?
Look, I really do appreciate your effort but I’m afraid those that have betrayed us and objectivity have ruined any attempt at data analysis for many more years, if not decades, to come.

John West
March 17, 2014 10:54 am

wbrozek says:
”In that case, how would you prove that the warming that is occurring now is not catastrophic?”
That’s the problem, we can’t prove it’s not and they can’t prove it is. But we have the satellites now for overwatch, so if it does something dramatic like pause (… LOL …), no, it’d have to be more dramatic than that (like drop significantly), then it’d be extremely difficult for them to explain.

RichardLH
March 17, 2014 11:22 am

Willis Eschenbach says:
March 17, 2014 at 10:40 am
“You’ll have to define “accuracy” for me in terms of a smoother … how are you measuring that?
The same is true of “simplicity”. Simplicity of what? Implementation? In what computer language?”
Mathematically more accurate than a single mean, mathematically simpler than a Gaussian.
“As I showed above, your claim about the 12 month cycle is simply not true. Let me repeat it here. There is no visible difference between your smoother and a standard gaussian smoother. Nor is there any 12 month cycle in the residuals between them (yours minus gaussian). Your claim is falsified.”
http://climatedatablog.files.wordpress.com/2014/02/fig-2-low-pass-gaussian-ctrm-compare.png
Then these two lines would lie one over the other and they don’t. Small difference only, true, but there nonetheless. Claim substantiated.
“And as your own link clearly shows, there is only a tiny difference between yours and gaussian, and it’s way out on the edge of the rolloff … color me unimpressed. When a difference makes a difference that is less than a line’s width on the screen, I don’t care. ”
So you try and get a Gaussian response curve in a spreadsheet with only two extra columns and no macros. Or only a few lines of R to do the same thing. Look, if you don’t like CTRM – use Gaussian instead. No skin off my back. Move on to where it is really important, the 15 year corner frequency and why that displays a ~60 year signal. All available frequencies above 15 years drop out to something in that bracket and a final residual at longer than 75 years.
“So you’re taking up the Greg Goodman style of debate, where you just make some off-the-wall claim and then neglect to actually go see if it is true?”
No. Please don’t put words in my mouth. Ask and I will respond.
“Look, Richard. I did the work to produce my estimate of the inflection points, which clearly shows there is not a regular cycle of any kind. Now if you think that it is not correct because I didn’t “detrend the curve first with the greater than 75 years line (not a straight line!)”, then put your money where your mouth is and do the legwork to show us that your claim is true. I did it for my claim … your turn.”
Hmmm. You were the one proposing to measure two ‘cycles’ of data and expecting to see a precise answer, not me. I understand all too well what is possible and/or not possible here. I consistently say ~60 years because that is all that can reasonably be stated. In fact if you do subtract the 75 year line you get a non-symmetrical wave shape with a shorter ‘top half’ to a longer ‘bottom half’ but with just two samples I would not stake my life on it. Too many other possibilities including wave shape or other longer ‘cycles’ could also be the reason.
“I don’t even know what kind of a “75 years line (not a straight line!)” you’d recommend we use to detrend it. So do the work and come back and show us just how much your (unknown) procedure changes the cycle lengths … my prediction is, not much … ”
Well I would have thought the blue line on the graph was a clue! The main problem is that is only a single running mean and has loads of distortions in it. I could run a S-G at 75 years which would be much more likely to produce something reasonable.
I can produce the band pass splitter circuit you obviously want by subtracting one output from the other but I thought that was a step too far as yet. I’ve only just brought this subject up!
Now you are demanding it and sort of hinting I don’t know what I am doing. Not very respectful.
“No, you display what is there and then extrapolate it unjustifiably.”
An S-G curve is very, very well respected. The nearest equivalent in statistics is LOWESS (which some would claim was based on it in any case). If you wish to discount LOWESS then do so for S-G. Otherwise the projection stands.
“That’s your evidence? A graph of the smoothed swings of the PDO? Look, there’s no regular cycle there either. I don’t have your data or code, but by eye, the Chen data (red line) has cycles of lengths 90, 90, 110, 60, and 60 years … is that a “possible correlation that does need addressing”? Is that another of the famous “approximately sixty year” cycles? Or is it an “approximately ninety years” cycle?”
If you would like the code, just ask. You could probably figure it out given that I have a simple example with RSS as a link in the article (see above). Just replace the source data with the Shen data and JISAO data and use a de-trended HadCRUT and you will be there.
Pretty good match with the available thermometer data in their overlap periods.
“In any case, I don’t know if it’s a “possible correlation” that needs addressing, because you haven’t said what it was supposed to be correlated with. Is there another wiggle with cycles of 90, 90, 110, 60, and 60 years that I don’t know about?”
As the thermometer data only goes back to the 1800s that is all the correlation that is possible.
“Yes … but your method (and a gaussian smooth and a loess smooth), on the other hand says it’s just gone level. Who you gonna believe?”
You are obviously looking at different graphs to me then. It is not a LOWESS, it is an S-G. Verified against the Gaussian for parameter choice ‘proof’.
“Look, Richard. Either a cycle is 60 years, or it is not. Whenever anyone starts waving their hands and saying that there is an “approximately 60 year cycle”, I tune out because there’s no such thing.”
Nature has a habit of never being clockwork precise – other than for orbits and then only for a simple two body solution.
“Let me go over the bidding again. If we measure from peak to peak, there are two cycles in the HadCRUT4 data, of length 48 years and 60 years. If we measure trough to trough, there are two cycles in the data, of length 70 years and 61 years.”
As I noted above, it is quite possible that the half cycles are of different periods. That is the way nature often does this stuff. Way too early to tell for sure of course, but this is just a first step.
“Now if you wish to call a wiggle with “cycle lengths” of 61, 48, 70, and 60 years an “approximately 60 year cycle”, and then project that putative “cycle” out for 30 years, I can’t stop you. I will, however, point out that such behavior is a sign of incipient cyclomania, and in the event it persists longer than four hours you should seek immediate medical assistance …”
Thank you for your concern. However that does not make the observations go away. The data has a wriggle in it that needs explaining at around ~60 years and, so far, none of the proposed physics does that.

Bart
March 17, 2014 11:26 am

Willis Eschenbach says:
March 17, 2014 at 10:40 am
“Either a cycle is 60 years, or it is not.”
There are no precise cycles in nature. Even our best electronic oscillators wander in frequency over time, and those are meticulously designed with conscious intent. But, they still stay within the general neighborhood of a central frequency, and the oscillations can be extrapolated forward in time on that basis, with the prediction becoming gradually more uncertain the longer the extrapolation interval.
A lot of us saw the ~60 year cycle in the data more than a decade ago. People with their heads in the sand insisted it was “cyclomania”. The rest of us predicted it would show a turnaround mid-decade. Guess what? It did. Denial of its existence is no longer tenable.

RichardLH
March 17, 2014 11:27 am

John West says:
March 17, 2014 at 10:42 am
Well we have to work with what is there, not what we would like to be there. Just because you don’t like what there is, there is no reason to discard it completely.
You do realise that if a natural ~60 year ‘wriggle’ is there in the data then AGW is a lot less than is currently calculated, don’t you?

Bernie Hutchins
March 17, 2014 11:29 am

Greg Goodman said March 17, 2014 at 2:24 am in part:
“I’ve always been dubious of LOESS filters because they have a frequency response that changes with the data content. I’ll have to have a closer look at SG. It does have a rather lumpy stop band though. I tend to prefer Lanczos for a filter with a flat pass band ( the major defect of gaussian and the triple RM types ).”
I am not familiar enough with LOESS to comment. If the frequency response changes with the data, it is not a Linear Time-Invariant (LTI) filter. SG is LTI. True enough the trade-off for a flat passband is a bumpier stop-band. Always is – No free lunches.
As for Lanczos, if I recall correctly it is an impulse response that is nothing more than a truncated sinc. As such it is subject to Gibbs Phenomenon transition band ripples – perhaps 23% or so – hardly a flat passband.
These points are well-considered in the signal processing literature. There is a danger to using a “built-in” without understanding it in detail. Hope I’m not being unfair – don’t intend to be.

RichardLH
March 17, 2014 11:32 am

Bart says:
March 17, 2014 at 11:26 am
“A lot of us saw the ~60 year cycle in the data more than a decade ago. People with their heads in the sand insisted it was “cyclomania”. The rest of us predicted it would show a turnaround mid-decade. Guess what? It did. Denial of its existence is no longer tenable.”
Well you do have to stack up volcanos, SO2 and CO2 in just the right order and magnitude with no lag and you CAN get there, if you try really hard 🙂

RichardLH
March 17, 2014 11:42 am

Willis Eschenbach says:
March 17, 2014 at 10:40 am
P.S. Care to re-do your plot without the green line so that a true comparison can be made? If I plotted temperature data with the full yearly cycle still in it then any deviations of the Anomaly would be hard to spot as well.
Play fair at least.

Bernie Hutchins
March 17, 2014 12:04 pm

Smoothing is generally the destruction of at least some of the data. If you argue that you still have the original data stashed away – then true enough (obviously) you haven’t lost anything completely. If you argue that you can recover the original from the smoothed alone, in the general case, you are wrong. Inversion is often at least ill-conditioned and usually impossible due to singularities. Most useful filters have what are intended to be stopbands. As such, they generally have nulls (unit-circle zeros). The inverse filter has unit-circle poles and is useless.
A moving average has nulls. SG has nulls. Continuous infinite-duration Gaussian, I believe, has no nulls, but I’m not sure about a truncated Gaussian which is convolved with a sinc? Successive passes (cascading) through moving-average rectangles have nulls – they accumulate.
Smoothing may be curve fitting (such as intentional interpolation), but it need not be. But if you claim insight from a smoothed curve, all you can really claim is that the data provides some evidence for what you feel is a proper model of the data, and the rest you assume is noise, worthy of being discarded. This is certainly circular reasoning at least in part. Such a claim requires considerable caution and full analysis and disclosure.
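The remark about accumulating nulls can be checked numerically. The following is a minimal sketch, not anyone's posted code: it uses the textbook gain of an N-point running mean, |sin(πfN) / (N sin(πf))|, and assumes window lengths of 12, 10 and 8 samples (12 divided by the 1.2067 factor quoted elsewhere in this thread, rounded). The function names are illustrative only.

```python
import math

def boxcar_gain(n, period):
    """Gain of an n-point running mean for a sinusoid of the given period (in samples)."""
    f = 1.0 / period  # frequency in cycles per sample
    return abs(math.sin(math.pi * f * n) / (n * math.sin(math.pi * f)))

def cascade_gain(lengths, period):
    """Gain of several running means applied one after another (the gains multiply)."""
    gain = 1.0
    for n in lengths:
        gain *= boxcar_gain(n, period)
    return gain

if __name__ == "__main__":
    lengths = (12, 10, 8)  # assumed window lengths, in months
    for period in (8, 10, 12, 60, 180):
        print(period, cascade_gain(lengths, period))
```

Each stage contributes its own nulls (at periods N, N/2, N/3, ...), so the cascade is null wherever any one stage is, which is why no inverse filter exists at those frequencies.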

RichardLH
March 17, 2014 12:17 pm

Bernie Hutchins says:
March 17, 2014 at 12:04 pm
“Smoothing is generally the destruction of at least some of the data. If you argue that you still have the original data stashed away – then true enough (obviously) you haven’t lost anything completely. If you argue that you can recover the original from the smoothed alone, in the general case, you are wrong. ”
Not true in the case of a digital low pass/high pass band pass splitter filter such as a CTRM. You can do a 1-x by subtracting one band from the input source and get the other.
Has to be the case because there is nowhere else for the data to go! In digital terms only of course, analogue could have some losses 🙂
(Rounding errors and the like excepted of course, but we can assume they are not likely to be even close to a dominant term – quantisation errors come much, much higher than that).
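The "1-x" splitter described here can be sketched as follows. This is an illustration under stated assumptions rather than the article's code: odd window lengths (13, 11, 9) are used purely so that each output sits exactly on a sample, and `centred_mean` and `split_bands` are hypothetical names.

```python
def centred_mean(x, n):
    """Centred running mean of odd length n; None where the window is incomplete."""
    half = n // 2
    out = [None] * len(x)
    for i in range(half, len(x) - half):
        window = x[i - half:i + half + 1]
        if None not in window:
            out[i] = sum(window) / n
    return out

def split_bands(x, lengths=(13, 11, 9)):
    """Return (low, high) where low is the cascaded mean and high = x - low."""
    low = x
    for n in lengths:
        low = centred_mean(low, n)
    high = [None if lo is None else xi - lo for xi, lo in zip(x, low)]
    return low, high

if __name__ == "__main__":
    series = [0.01 * i + ((i * 7) % 11) / 10.0 for i in range(120)]
    low, high = split_bands(series)
    print(sum(1 for v in low if v is not None), "of", len(series), "points have a full kernel")
```

The two bands add back to the input by construction, because the high band is defined as the input minus the low band. That reconstruction needs the high band (or the original series) to be kept; the low band alone is not enough.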

Bernie Hutchins
March 17, 2014 12:28 pm

Richard –
Please read what I wrote, and then think. You said:
“Not true in the case of a digital low pass/high pass band pass splitter filter such as a CTRM. You can do a 1-x by subtracting one band from the input source and get the other.”
I said:
“Smoothing is generally the destruction of at least some of the data. If you argue that you still have the original data stashed away – then true enough (obviously) you haven’t lost anything completely.”
You say you haven’t thrown data away BECAUSE you kept a copy of the input. Didn’t I say that?

RichardLH
March 17, 2014 12:56 pm

Bernie Hutchins says:
March 17, 2014 at 12:28 pm
“Please read what I wrote, and then think.”
I did. People are always suggesting that a simple low pass filter throws information away. That is not the case here (and rarely is in climate work). The input digitisation/rounding/truncation errors are by far and away the largest term. Anything to do with the calculations in the filter are in the dust (assuming reasonable floating point operations anyway). There are only a few adds and three divisions to get to any single output value. If you do the calculations of error propagation you will see.
It becomes a sort of mantra that people roll out to say why this is not a valid treatment without considering if that is truly the case.
Sorry, you just pushed one of my buttons and I responded, probably in an ill-considered way. Apologies.
If we were talking about 16 or 24 bit audio or video streams I would agree that this is not completely lossless, but with the data streams we have in climate, filter errors are the least of our worries.

Jeff Patterson
March 17, 2014 1:46 pm

RichardLH says:
March 16, 2014 at 1:58 pm
Lance Wallace says:“Can the Triple whatsis somehow compare the two curves and derive either a lag time or an estimate of how much of the CO2 emissions makes it into the observed atmospheric concentrations?”
RichardLH :Not really. Low pass filters are only going to show periodicity in the data and the CO2 figure is a continuously(?) rising curve.
An LPF, by definition, passes frequencies below its cut-off frequency. Since D.C. (i.e. frequency = 0) is always below the cut-off, the above statement by RLH is incorrect. A ramp-like input into an LPF will cause a lagged ramp-like output, with the lag determined by the filter’s group delay characteristic.
Note that it is incorrect, I think, to speak of an LTI filter (such as the one described here) as causing “distortion”, which as a term of signal processing art refers exclusively to artifacts caused by system non-linearity. If you can sum the output of your filter with the residual (i.e. the rejected signal) and reconstruct the input, no distortion has occurred.
Another way of saying this is that an LTI filter cannot create energy at a given frequency that is not present in the input signal, even if the filter response peaks badly at that frequency. Such peaking/sidelobe/passband ripple may accentuate frequencies you wish to extinguish (and so in a sense “distort” the variance present at that frequency relative to some other frequency band of interest), but this is not distortion as the term is normally used.
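The lagged-ramp behaviour is easy to see with a trailing (causal) running mean, whose group delay is (N - 1)/2 samples. This is a small sketch with a hypothetical helper name; a centred kernel such as the one in the article would instead place the output in the middle of the window.

```python
def trailing_mean(x, n):
    """Causal running mean: the value at i averages x[i-n+1] .. x[i]; None before that."""
    out = [None] * len(x)
    for i in range(n - 1, len(x)):
        out[i] = sum(x[i - n + 1:i + 1]) / n
    return out

if __name__ == "__main__":
    ramp = [float(i) for i in range(40)]
    smoothed = trailing_mean(ramp, 12)
    # For a 12-point window the output ramp trails the input by (12 - 1) / 2 samples.
    for i in (11, 20, 39):
        print(i, ramp[i], smoothed[i], ramp[i] - smoothed[i])
```

The ramp comes out as a ramp of the same slope, shifted in time, which is the sense in which the filter passes the D.C. and near-D.C. content unchanged in shape.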

March 17, 2014 1:54 pm

Bart, you are right about there being no precise cycles in nature. The 60 year cycle is very obvious as a modulation of the approximate 1000 year quasi periodicity which comes and goes as it beats with other cycles. See Fig 4 at
http://climatesense.norpag.blogspot.com
It looks good at 10000, 9000, 8000, 5000?, 2000, 1000 and 0 (the present warming).
The same link provides an estimate of the possible coming cooling based on the 60 and 1000 year periodicities and the neutron count as the best proxy for solar “activity”.

Bernie Hutchins
March 17, 2014 2:00 pm

Richard –
Except as you can construct a proper inverse filter, smoothing (filtering, reduction of bandwidth) will remove information, and here you can’t construct this inverse. You say that you have not lost information because you kept the input separately. That doesn’t count!
Nobody is talking about numerical issues of the processing. Measurement issues (systematic) are a problem as is often discussed on this blog. It is very difficult to see how smoothing helps these.
And we all already agree we can see the apparent 60 year periodicity. What does your method offer that is better, let alone new?
If you are instead suggesting that you are not discarding any important information, then make that case, both in showing that you have chosen the correct parameters for a correct model, and that you understand the errors involved.

Editor
March 17, 2014 2:02 pm

Bart says:
March 17, 2014 at 11:26 am

Willis Eschenbach says:
March 17, 2014 at 10:40 am

“Either a cycle is 60 years, or it is not.”

… A lot of us saw the ~60 year cycle in the data more than a decade ago. People with their heads in the sand insisted it was “cyclomania”. The rest of us predicted it would show a turnaround mid-decade. Guess what? It did. Denial of its existence is no longer tenable.

That’s a marvelous anecdote, Bart, but kinda short on links. The high point of the detrended HadCRUT4 data you are using occurred in 2003, which means that “predictions” after that point are hindcasts. So to examine your claim that “a lot” of people predicted what happened “more than a decade ago”, what we need are links to contemporaneous accounts of say three of your and other folks’ predictions made pre-2003 that global temperatures would peak by “mid-decade”.
I mean if “a lot” of you were making such predictions as you claim, surely linking to two or three of them isn’t a big ask.
w.
PS—Here is what I think about cycles:
Does the temperature go up and down? Yes, it does, on all timescales. They are best described as wiggles.
Do these wiggles imply the existence of cycles stable enough to have predictive value? I’ve never found one that could survive the simplest out-of-sample testing, and believe me, I’ve looked very hard.
Y’all seem to think I’m refusing to look at cycles. I’m not.
I’m refusing to look at cycles AGAIN, because over the decades I’ve looked at, tested, and tried to extrapolate dozens and dozens of putative cycles, and investigated reports of others’ attempts as well. I wrote an Excel spreadsheet to figure out the barycentric orbit of the sun, for heaven’s sake, you don’t get more into cycles than that. I was as serious a cyclomaniac as the next man, with one huge, significant exception … I believe in out-of-sample testing …
Net result?
Nothing useful for predictions. After testing literally hundreds of natural timeseries datasets for persistent cycles, I found nothing that worked reliably out-of-sample, and I found no credible reports of anyone else finding anything that performed well out-of-sample. Not for lack of trying or lack of desire to find it, however. I’d love to be shown that something like the Jupiter-Saturn synodic period was visible in climate data, and I’ve looked hard for that very signal … but I never encountered anything like it, or any other cycles that worked for that matter.
So, as an honest scientist, I had to say that if the cycles are there, nobody’s found them yet, and turn to more productive arenas.
w.

brians356
Reply to  Willis Eschenbach
March 17, 2014 2:22 pm

Willis,
I’m still waiting [for] someone to produce a hunk of hide or bone from a Sasquatch, as well.
“Where has all the rigor gone? Long time pa-a-ssing …”

brians356
March 17, 2014 2:23 pm

” … for someone …”

Editor
March 17, 2014 2:29 pm

RichardLH says:
March 17, 2014 at 12:56 pm

People are always suggesting that a simple low pass filter throws information away. That is not the case here (and rarely is in climate work).

That doesn’t seem right to me. Any filter which is not uniquely invertible loses information.
For example, while we can get from the numbers {1,2,3,4,5} to their mathematical average of 3, there is no way to be given the average of “3” and work backwards to {1,2,3,4,5} as a unique answer. So even ordinary averaging is not uniquely invertible, and thus it loses information. The same is true for the gaussian average, or a boxcar average, or most filters.
w.
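The non-uniqueness argument can be made concrete with a short sketch. The second half assumes a plain 12-point running mean (names are illustrative): any 12-sample sinusoid added to the input disappears from the output, so two different inputs map to the same smoothed series.

```python
import math
from statistics import mean

def running_mean(x, n):
    """Running mean over every full window of n samples."""
    return [sum(x[i:i + n]) / n for i in range(len(x) - n + 1)]

# Two different sets of five numbers with the same average.
print(mean([1, 2, 3, 4, 5]), mean([3, 3, 3, 3, 3]))

# Two different series with (numerically) the same 12-point running mean.
trend = [0.01 * i for i in range(60)]
wiggly = [t + math.sin(2 * math.pi * i / 12) for i, t in enumerate(trend)]
largest_gap = max(
    abs(p - q) for p, q in zip(running_mean(trend, 12), running_mean(wiggly, 12))
)
print(largest_gap)  # rounding-level difference only
```

Given only the smoothed series, there is no way to tell which of the two inputs produced it.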

March 17, 2014 2:34 pm

Willis, I repeat my comment to Bart from above:
“Bart, you are right about there being no precise cycles in nature. The 60 year cycle is very obvious as a modulation of the approximate 1000 year quasi periodicity which comes and goes as it beats with other cycles. See Fig 4 at
http://climatesense.norpag.blogspot.com
It looks good at 10000, 9000, 8000, 5000?, 2000, 1000 and 0 (the present warming).
The same link provides an estimate of the possible coming cooling based on the 60 and 1000 year periodicities and the neutron count as the best proxy for solar “activity”.”
These quasi periodicities in the temperature data suggest what periodicities in the driver data are worth investigating. You do not need to understand the processes involved in order to make reasonable forecasts – see the link above.
The chief uncertainty in my forecasts is in knowing the timing of the peak in the 1000 year quasi periodicity. As Bart says, there are no precise cycles; this one seems to oscillate from, say, 950 to 1050 years. Looking at the state of the sun, I think we are just past a synchronous peak in both the 60 and 1000 year periodicities. I’d appreciate your view on this.

Greg Goodman
March 17, 2014 2:47 pm

Bernie H: “As for Lanczos, if I recall correctly it is an impulse response that is nothing more than a truncated sinc. As such it is subject to Gibbs Phenomenon transition band ripples – perhaps 23% or so – hardly a flat passband. ”
I went into some detail about all this on the CE thread that Richard linked at the top. I included links to freq spectra of all discussed filters including Lanczos, and provided an example code implementation in the update, if anyone wants to try it.
Here is a comparison of the freq response of the symmetric triple RM that Richard is providing, a Lanczos and a slightly different asymmetric triple RM that minimises the negative lobe that inverts part of the data. (It’s a fine difference, but while we’re designing a filter, we may as well minimise its defects.)
http://climategrog.wordpress.com/2013/11/28/lanczos-filter-script/
Close-up of the stop-band ripple:
http://climategrog.wordpress.com/?attachment_id=660
There’s a link in there that provides a close-up of the detail of the lobes for those interested.
Yes there is a little bump in the pass band but it’s not the gregarious ripple that is seen in SG Butterworth, Chebychev, etc., and yes it’s a symmetric kernel so linear phase response.
I used a three lobe lanczos because it does not have too much Gibbs effect. You can obtain narrower transition band at the expense of accepting more ripple and over-shoot. Horses for courses as always in filter design.
The basic idea with Lanczos is to start with the ideal step function frequency response: the ‘ideal’ filter. But this requires an infinite length ‘sinc function’ as the kernel to achieve it. As soon as you truncate the kernel you smear (convolve) the nice, clear step you started with by another sinc fn in the frequency domain.
Lanczos minimises the latter defect by fading out the truncation of the kernel rather than cutting it off. How quickly you cut it off, or fade it out determines the balance between the ideal step and the ringing. Lanczos was quite genius at maths and calculated this as the optimum solution. So it is somewhat more subtle than “nothing more than a truncated sinc”.
You can’t do it in three columns in a spreadsheet but it can be defined fairly easily as a macro function and it’s not that hard to code. It’s still a simple kernel based convolution. In fact it is no more complicated to do than a gaussian.
Gaussian or 3RM are useful, well-behaved low-pass filters but they make a pretty poor basis for high-pass or band-pass since they start to attenuate right from zero. If you want to make compound filters something with at least some portion of flat passband is much better. That is why I provided a high-pass Lanczos in my article too.
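For readers who want to try a kernel of this kind, here is a minimal sketch of a Lanczos-windowed low-pass. It is not the script linked above: it assumes the common textbook form (ideal sinc kernel multiplied by a wider sinc window and cut at the window's first zero), and the parameter names `cutoff_period` and `lobes` are hypothetical.

```python
import math

def sinc(x):
    """Normalised sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def lanczos_low_pass(cutoff_period, lobes=3):
    """Symmetric low-pass kernel: ideal sinc times a Lanczos (sinc) window.

    cutoff_period is in samples; the kernel stops at the first zero of the window.
    """
    fc = 1.0 / cutoff_period                 # cut-off in cycles per sample
    half = int(lobes * cutoff_period / 2)    # taps on each side of the centre
    taps = []
    for k in range(-half, half + 1):
        x = 2.0 * fc * k                     # argument of the ideal sinc kernel
        taps.append(sinc(x) * sinc(x / lobes))
    total = sum(taps)
    return [t / total for t in taps]         # normalise to unit gain at D.C.

def gain(kernel, period):
    """Gain of a symmetric kernel for a sinusoid of the given period (in samples)."""
    half = len(kernel) // 2
    return abs(sum(w * math.cos(2 * math.pi * (k - half) / period)
                   for k, w in enumerate(kernel)))

if __name__ == "__main__":
    kernel = lanczos_low_pass(12)
    for period in (120, 24, 12, 6, 4):
        print(period, gain(kernel, period))
```

Raising `lobes` narrows the transition band at the cost of a longer kernel and more ripple, which is the trade-off described above.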

Greg Goodman
March 17, 2014 3:06 pm

@Willis:
Comparison of freq resp of gaussian and triple RM:
http://climategrog.wordpress.com/?attachment_id=424
It can be seen gauss leaks about 4% of the nominal cut-off that the 3RM blocks completely. This can be a useful difference in the presence of a strong annual cycle. If you use a thick enough pencil, and scale your graph to see the cycle that was so big you wanted to get rid of it, it may not be that noticeable.
When the 4% does not matter, there’s nothing wrong with gaussian, it’s a good filter.
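The size of the leak depends on how the Gaussian is matched to the triple running mean. The sketch below is one way to check it, under assumptions that are mine rather than those behind the linked plot: window lengths of 12, 10 and 8 months, and a Gaussian given the same standard deviation as the cascaded kernel and truncated at four standard deviations.

```python
import cmath
import math

def convolve(a, b):
    """Full discrete convolution of two weight lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def ctrm_kernel(lengths=(12, 10, 8)):
    """Equivalent single kernel of running means applied in cascade."""
    kernel = [1.0]
    for n in lengths:
        kernel = convolve(kernel, [1.0 / n] * n)
    return kernel

def kernel_sigma(kernel):
    """Standard deviation of the kernel weights, in samples."""
    centre = sum(k * w for k, w in enumerate(kernel))
    return math.sqrt(sum((k - centre) ** 2 * w for k, w in enumerate(kernel)))

def gaussian_kernel(sigma, half_width):
    """Sampled, truncated Gaussian normalised to unit sum."""
    w = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-half_width, half_width + 1)]
    total = sum(w)
    return [v / total for v in w]

def gain(kernel, period):
    """Magnitude of the kernel's frequency response at the given period."""
    f = 1.0 / period
    return abs(sum(w * cmath.exp(-2j * math.pi * f * k) for k, w in enumerate(kernel)))

if __name__ == "__main__":
    ctrm = ctrm_kernel()
    sigma = kernel_sigma(ctrm)
    gauss = gaussian_kernel(sigma, int(4 * sigma))
    print("CTRM gain at 12 months:    ", gain(ctrm, 12))
    print("Gaussian gain at 12 months:", gain(gauss, 12))
```

With this matching the cascade's gain at 12 months is zero to rounding error, while the Gaussian's is small but non-zero, of the order of a few percent. A different matching rule (for example equal half-power points) would give a somewhat different figure.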

steverichards1984
March 17, 2014 3:09 pm

RichardLH:
Excellent post. A good description of an easy to use filter to help expose underlying behaviour.
I find it sad that some complain about cycles when you said: “If you see any ‘cycle’ in graph, then that’s your perception. What you can’t do is say the wriggle is not there”
I hope the crowd sourcing aspect will enable the multiplier (1.2067) to be refined more quickly with more R users on to it.
I find it sad that people say “where is your code and where is the data you used” when it is on your linked website.
I hope you continue to develop these ideas.
If it is significantly better than FFTs as you suggest, then I can see its usage moving into other technical areas.
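For anyone who wants to experiment with the multiplier outside a spreadsheet or R, here is a minimal Python sketch of the cascade. It assumes each successive window is the previous one divided by 1.2067 and rounded (giving 12, 10 and 8 for monthly data); that rounding rule and the function names are assumptions for illustration, not the article's code.

```python
import math

def running_mean(x, n):
    """Running mean over every full window of n samples (no padding at the ends)."""
    return [sum(x[i:i + n]) / n for i in range(len(x) - n + 1)]

def ctrm_windows(first=12, factor=1.2067):
    """Window lengths for the three stages, assuming divide-and-round."""
    second = round(first / factor)
    third = round(second / factor)
    return first, second, third

def ctrm(x, first=12, factor=1.2067):
    """Cascaded triple running mean; output is shorter than x by the kernel length - 1."""
    out = x
    for n in ctrm_windows(first, factor):
        out = running_mean(out, n)
    return out

if __name__ == "__main__":
    # Synthetic monthly series: slow trend plus a pure annual cycle.
    series = [0.01 * i + math.sin(2 * math.pi * i / 12) for i in range(120)]
    smooth = ctrm(series)
    print(ctrm_windows(), len(series), len(smooth))
    print(smooth[:3])  # the annual cycle is gone; only the trend remains
```

Because only full windows are used, the output is a "full kernel" result: adding new data extends it but never changes existing values. With even window lengths the centre of the 28-point equivalent kernel falls between two samples, so `smooth[i]` corresponds to month `i + 13.5` of the input.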

Greg Goodman
March 17, 2014 3:28 pm

Willis: “So even ordinary averaging is not uniquely invertible, and thus it loses information. The same is true for the gaussian average, or a boxcar average, or most filters.”
That is true for straight average and RM since all the weights are equal. It is not the case for a gaussian or other kernel where the weights are unique.
The reverse process is called deconvolution. It’s basically a division in the frequency domain in the same way a convolution is multiplication in the freq domain. For example you can ‘soften’ a digital image with a gaussian ‘blur’ ( a 2-D gaussian low-pass filter ). You can then ‘unblur’ with a deconvolution. There are some artefacts due to calculation errors but in essence it is reversible.
This can be taken further to remove linear movement blur, or to refocus an out of focus photo. Yeah, I’d spent years telling people once you had taken a blurred shot there was NOTHING you could do with photoshop or anything else to fix it because the information just wasn’t there to recover. Wrong!
It is even possible to remove quite complex hand shake blurs, if you can find a spot in the photo that gives you a trace of the movement. This is also the way they initially corrected the blurring caused by the incorrect radius on the Hubble space telescope mirror. They found a dark bit of sky, with just one star in the field of view. They took a (blurred) image of it and then deconvolved all further shots using the FT of the blurred star image. That got them almost back to design spec and tided them over until they could get someone up there to install corrective optics.
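A one-dimensional sketch of the idea, under simplifying assumptions of my own: the series is treated as circular (periodic), there is no noise, and a slow O(N²) DFT is used to keep the example dependency-free. The helper names are hypothetical. Division in the frequency domain works when the kernel's spectrum has no zeros, as for the narrow Gaussian here, and fails for a 12-point running mean, whose spectrum has exact nulls.

```python
import cmath
import math

def dft(x, inverse=False):
    """Plain O(N^2) discrete Fourier transform (inverse includes the 1/N factor)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * math.pi * j * k / n) for k in range(n))
           for j in range(n)]
    return [v / n for v in out] if inverse else out

def gaussian_weights(sigma, half_width):
    w = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-half_width, half_width + 1)]
    total = sum(w)
    return [v / total for v in w]

def circular_kernel(weights, n):
    """Wrap a kernel onto a ring of n samples with its middle tap at index 0."""
    half = len(weights) // 2
    ring = [0.0] * n
    for k, w in enumerate(weights):
        ring[(k - half) % n] += w
    return ring

def blur(x, ring):
    """Circular convolution, done as multiplication in the frequency domain."""
    spectrum = [a * b for a, b in zip(dft(x), dft(ring))]
    return [v.real for v in dft(spectrum, inverse=True)]

def deblur(y, ring, floor=1e-9):
    """Deconvolution by division in the frequency domain; fails on spectral nulls."""
    response = dft(ring)
    if min(abs(h) for h in response) < floor:
        raise ValueError("kernel has a spectral null; the blur cannot be divided out")
    spectrum = [a / b for a, b in zip(dft(y), response)]
    return [v.real for v in dft(spectrum, inverse=True)]

if __name__ == "__main__":
    signal = [((i * 7) % 13) / 3.0 for i in range(48)]
    gauss = circular_kernel(gaussian_weights(1.0, 4), 48)
    restored = deblur(blur(signal, gauss), gauss)
    print(max(abs(a - b) for a, b in zip(signal, restored)))
```

With real, noisy data the division amplifies noise wherever the kernel's response is small, so practical deconvolution usually needs some form of regularisation.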

Greg Goodman
March 17, 2014 3:43 pm

One way to get a better look at how regular the supposed 60 year cycle is, would be to ‘detrend’ by fitting, say a 2nd order polynomial to the unfiltered data, subtract it and then do the filter. This should produce a levelled out wiggle which can be compared to a pure cosine or peaks and troughs picked out to see how stable the cycle length and amplitude is.
Just in case anyone is inclined, I have something else I’d rather work on now.
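The detrending step suggested here could look something like the following sketch, which fits a 2nd order polynomial by ordinary least squares and subtracts it. The function name is hypothetical, and time is centred and rescaled internally only to keep the small linear solve well conditioned; the low-pass filter would then be run on the returned residual.

```python
def detrend_quadratic(t, y):
    """Subtract a least-squares 2nd order polynomial from y(t)."""
    centre = sum(t) / len(t)
    scale = max(abs(ti - centre) for ti in t) or 1.0
    u = [(ti - centre) / scale for ti in t]  # rescaled time in [-1, 1]

    # Normal equations for the basis 1, u, u**2
    power_sums = [sum(ui ** p for ui in u) for p in range(5)]
    a = [[power_sums[i + j] for j in range(3)] for i in range(3)]
    b = [sum(yi * ui ** i for ui, yi in zip(u, y)) for i in range(3)]

    # Gaussian elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, 3):
            m = a[row][col] / a[col][col]
            for k in range(col, 3):
                a[row][k] -= m * a[col][k]
            b[row] -= m * b[col]

    # Back substitution
    coef = [0.0, 0.0, 0.0]
    for row in (2, 1, 0):
        tail = sum(a[row][k] * coef[k] for k in range(row + 1, 3))
        coef[row] = (b[row] - tail) / a[row][row]

    return [yi - (coef[0] + coef[1] * ui + coef[2] * ui * ui) for ui, yi in zip(u, y)]

if __name__ == "__main__":
    years = list(range(1850, 2014))
    # Synthetic example: a quadratic trend plus a crude 60-year square wiggle.
    values = [0.3 + 0.005 * (yr - 1930) + 0.00003 * (yr - 1930) ** 2
              + (0.1 if ((yr - 1850) // 30) % 2 == 0 else -0.1) for yr in years]
    residual = detrend_quadratic(years, values)
    print(min(residual), max(residual))
```

One caveat worth keeping in mind: with only two or three wiggles in the record, the fitted polynomial can absorb part of the wiggle itself, so the residual depends on the choice of trend model.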

Bernie Hutchins
March 17, 2014 4:08 pm

Greg Goodman says:
March 17, 2014 at 2:47 pm
Greg – Regarding Lanczos: thanks for that – I learned something. It is not a truncated sinc but the truncated product of two sincs.
The two sincs have different widths, so their product in the time domain corresponds to the convolution of two unequal width rectangles – which is a trapezoid. This is then in turn truncated (by a rectangular product of the sinc product in time) so the trapezoid in frequency is smeared by a frequency domain sinc. [ We use this, (it’s an old friend), but don’t give it a special name – it’s a “linear transition region”.]
Because of the gradual transition, the truncation is not the full 23% Gibbs but rather a much smaller bump. The result would look very much like a trapezoid with small bumps both sides of the linear slope. In fact it would look like something like – EXACTLY what you plotted! [Good homework problem. In practice we would probably truncate with a Hamming window rather than with a rectangle – with virtually all the Gibbs gone. Or best of all, you minimize the frequency domain squared error and make that transition region a “don’t care” band – like with Parks-McClellan.]
I like that filter pretty well. No significant passband ripple, and a Hamming window would fix even that. Nice control over roll-off if that is important. Good candidate along with SG. Thanks for the reply.
I am not sure, however why you suggest “gregarious ripple that is seen in SG Butterworth, Chebychev, etc”. Certainly SG and BW have no passband ripple. Chebychev certainly does, but note that SG looks a lot like inverse Chebyshev.
Bernie

RichardLH
March 17, 2014 4:29 pm

Jeff Patterson says:
March 17, 2014 at 1:46 pm
“A LPF by definition, passes frequencies below it’s cut-off frequency. Since D.C. (i.e. frequency = 0) is always below the cut-off, the above statement by RLH is incorrect. A ramp-like input into a LPF will cause a lagged ramp-like output, with the lag determined by the filters group delay characteristic.”
That is true but not what was asked which was.
Lance Wallace says:“Can the Triple whatsis somehow compare the two curves and derive either a lag time or an estimate of how much of the CO2 emissions makes it into the observed atmospheric concentrations?”
I gave an answer that I intended to indicate that a low pass filter is incapable of such a connection.
I apologise if the words I used failed to convey that meaning.
P.S. The lag in this case would be one month, i.e. the sample frequency. AFAIK.

RichardLH
March 17, 2014 4:33 pm

Bernie Hutchins says:
March 17, 2014 at 2:00 pm
“And we all already agree we can see the apparent 60 year periodicity.”
If I thought that the statement was true and universally accepted then I would probably not have constructed the post.
“What does your method offer that is better, let alone new?”
Only a very simple way to construct a Gaussian filter. Two extra columns in a spreadsheet and a few lines of R.
The reasoning about using high quality filters to examine the temperature traces rather than a statistical analysis holds regardless of the particular filter employed.
I believe that Gaussian is the best. I believe that a 15 year low pass uncovers details that others are missing/ignoring.

Editor
March 17, 2014 4:34 pm

Greg Goodman says:
March 17, 2014 at 3:06 pm
@Willis:

Comparison of freq resp of gaussian and triple RM:
http://climategrog.wordpress.com/?attachment_id=424
It can be seen gauss leaks about 4% of the nominal cut-off that the 3RM blocks completely. This can be a useful difference in the presence of a strong annual cycle. If you use a thick enough pencil, and scale your graph to see the cycle that was so big you wanted to get rid of it, it may not be that noticeable.
When the 4% does not matter, there’s nothing wrong with gaussian, it’s a good filter.

Thanks for that, Greg. However, as you might imagine (and as I mentioned above) I also looked at the residuals from the process. There is no annual signal in the residual (CTRM minus gaussian).
As a result, once again I reject the claim of both yourself and Richard that the Gaussian doesn’t remove the annual signal. Both of them remove it.
Finally, although the difference at a given point might be 4%, the difference between the filters is the difference in the integrals … and that looks by eye to be much less than 4%.
w.
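
One way to put numbers on this disagreement is to evaluate each kernel's gain at a 12-month period directly. The sketch below assumes 12, 10 and 8 month stages for the triple running mean and a hypothetical Gaussian with sigma = 4.85 months truncated at four sigma; neither choice comes from the thread, and a different sigma will give a different leak.

```python
import numpy as np

def gain(kernel, period):
    """Magnitude of a kernel's frequency response at the given period (samples)."""
    k = np.arange(len(kernel))
    return abs(np.sum(kernel * np.exp(-2j * np.pi * k / period)))

def box(n):
    """Equal-weight running mean kernel of length n."""
    return np.ones(n) / n

# Triple running mean as one combined kernel (stage lengths assumed).
ctrm_kernel = np.convolve(np.convolve(box(12), box(10)), box(8))

# Hypothetical Gaussian kernel, normalised to unit sum.
sigma = 4.85                      # months, illustrative value only
half = int(4 * sigma)
k = np.arange(-half, half + 1)
gauss_kernel = np.exp(-0.5 * (k / sigma) ** 2)
gauss_kernel /= gauss_kernel.sum()

ctrm_leak = gain(ctrm_kernel, 12.0)    # zero apart from rounding error
gauss_leak = gain(gauss_kernel, 12.0)  # small but non-zero
```

With these assumptions the running mean cascade has an exact zero at 12 months, while the Gaussian passes a few percent of the amplitude. Whether that residual matters depends on how large the annual cycle is relative to the signal of interest.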

ThinkingScientist
March 17, 2014 4:34 pm

RichardLH
Nice article, I like the simplicity of getting a high quality filter/splitter from just three spreadsheet columns, easy to understand and easy to apply. And simple to get the low and high pass parts without much loss. Neat.
Concerning the quasi-60 year periodicity, I was reminded to go and dig out a spreadsheet I made where I was looking at the periodicity in the temperature data too. My method was based on sine wave cross-correlation, looking for the peak cross-correlation as I swept the frequency up. Bit laborious in a spreadsheet, but I get a period of 66 years using that method. Takes more than 3 columns though…:-)
The other interesting data set to look at which is a long reconstruction (but no tree rings!) and only using other published proxies is Loehle (2007). On that data set the peak cross-correlation period (with the swept frequency cross-correlation I used) is 1560 years.
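
The swept-frequency idea can be sketched in a few lines. This is not the spreadsheet described above: it assumes each trial period is tested against a sine and cosine pair so that phase does not need a separate sweep, and it runs on a synthetic series with a known 60-sample period rather than on any temperature data.

```python
import numpy as np

def swept_period_amplitude(x, periods):
    """Amplitude of the best-fitting sinusoid at each trial period.

    Projects the mean-removed series onto a sine and a cosine of the trial
    period and combines the two, so the result does not depend on phase.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    t = np.arange(len(x))
    amps = []
    for p in periods:
        a = 2.0 * np.dot(x, np.sin(2 * np.pi * t / p)) / len(x)
        b = 2.0 * np.dot(x, np.cos(2 * np.pi * t / p)) / len(x)
        amps.append(np.hypot(a, b))
    return np.array(amps)

# Synthetic check: ten full cycles of a 60-sample sinusoid with arbitrary phase.
years = np.arange(600)
series = np.sin(2 * np.pi * years / 60.0 + 0.7)

periods = np.arange(20, 101)          # trial periods to sweep
amps = swept_period_amplitude(series, periods)
best_period = periods[np.argmax(amps)]
```

On a record that holds only two or three cycles the peak is broad, so the recovered period should be read as approximate.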

RichardLH
March 17, 2014 4:36 pm

Willis Eschenbach says:
March 17, 2014 at 2:02 pm
“So, as an honest scientist, I had to say that if the cycles are there, nobody’s found them yet, and turn to more productive arenas.”
So as an honest scientist I observe that there is a wriggle in the current data which needs explaining and cannot just be swept under the carpet.
Something caused what we see. Any offers as to what? Coincidence?

Bernie Hutchins
March 17, 2014 4:36 pm

Greg Goodman replied to Willis March 17, 2014 at 3:28 pm in part:
“Willis: “So even ordinary averaging is not uniquely invertible, and thus it loses information. The same is true for the gaussian average, or a boxcar average, or most filters.” Greg replied: “That is true for straight average and RM since all the weights are equal. It is not the case for a gaussian or other kernel where the weights are unique.””
I believe the real issue is whether or not the filter has notches. Students are often surprised that so many digital filters have unit circle zeros, and ask why: Because so many useful filters (nearly all?) have stopbands and we go through zero to be reasonably close to zero. But inverting unit circle zeros does not work. If you notch out something, it’s gone and you can’t get it back. That is, you have poles on the unit circle (infinite response at the corresponding, former, notch frequencies).
Further, it has nothing to do with unequal weights. RM has unit circle zeros, as does SG, CTRM, Lanczos, and most other filters except Gaussian. Occasionally filters don’t have stopbands (such as audio equalizers – like tone controls) and avoid unit-circle zeros. It is always possible to specify such a filter – it just may not have useful applications.
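
The unit circle zeros of a running mean are easy to check numerically. The sketch below assumes 12, 10 and 8 sample stages for the cascade and finds the zeros stage by stage, since the zeros of a cascade are simply the zeros of its stages taken together.

```python
import numpy as np

def running_mean_zeros(n):
    """Zeros of an n-point running mean, i.e. roots of 1 + z + ... + z^(n-1)."""
    # The 1/n scale factor does not move the roots, so it is omitted.
    return np.roots(np.ones(n))

stages = (12, 10, 8)   # assumed stage lengths
zeros = np.concatenate([running_mean_zeros(n) for n in stages])
radii = np.abs(zeros)  # distance of each zero from the origin

# Gain of the 12-point stage at a 12-sample period: one of its notches.
k = np.arange(12)
gain_at_notch = abs(np.sum(np.exp(-2j * np.pi * k / 12.0)) / 12.0)
```

Every radius comes out as 1 to within rounding, so all 27 zeros sit on the unit circle. An inverse filter would need infinite gain at each of those frequencies, which is why a notched component cannot be recovered from the filtered output alone.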

RichardLH
March 17, 2014 4:40 pm

Willis Eschenbach says:
March 17, 2014 at 2:29 pm
“That doesn’t seem right to me. Any filter which is not uniquely invertible loses information. ”
Your intuition is wrong, however. There is no data loss, only removal from this plot. The other, high pass, plot can be constructed to show the removed information by subtracting this trace from the input trace. Together they always add to the original input.
It is possible that some frequencies around the corner value will be distributed into both high and low in various mixtures, but when added back together the answer will still be correct.
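
The low pass and high pass split can be sketched as follows. To keep the output aligned with the input, this example uses a 13-term centred annual mean (half weights on the two end months) as a stand-in for the CTRM, and the input series is synthetic; both are assumptions made for illustration.

```python
import numpy as np

# 13-term centred annual mean: a stand-in low pass with an odd length,
# so every output value lines up exactly with one input month.
kernel = np.ones(13)
kernel[0] = kernel[-1] = 0.5
kernel /= 12.0

# Synthetic monthly series: slow trend + annual cycle + noise.
rng = np.random.default_rng(0)
months = np.arange(360)
signal = (0.002 * months
          + np.sin(2 * np.pi * months / 12.0)
          + 0.1 * rng.standard_normal(len(months)))

low = np.convolve(signal, kernel, mode="valid")  # full-kernel output only
aligned = signal[6:-6]                           # input months matching `low`
high = aligned - low                             # everything the low pass removed
rebuilt = low + high                             # the pair restores the input
```

Because the high pass trace is defined as input minus low pass, the pair is complementary by construction. That shows nothing is lost across the two traces; it does not mean the input can be recovered from the low pass trace on its own.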

cd
March 17, 2014 4:40 pm

Greg
“It is not the case for a gaussian or other kernel where the weights are unique.”
Sorry, but as someone who has to program professional applications, that is just wrong. Sometimes I wonder whether you read something online only to spout some more “smoothed” and rather “convoluted” version here…
How on earth, with overlapping kernels, does one untangle the interference? How would one do it EVEN IN A DISCRETE VERSION WITH A GAUSSIAN FILTER KERNEL!!!!!!!!!!!!!!!!!!!!!
Image processing is a very easy and an accessible way to test this:
You cannot apply a Gaussian filter to an image, save the image, pass it to someone without the original image and then, given just the kernel design, have them reproduce the original image. There are many “deconvolution” methods but they never recreate the original image; that would need to assume continuity (and determinism) in a discretised series where the best one can hope for is semi-continuity in a stochastic series.
You sir are full of BS! And I am starting to wonder whether I’ve given Richard too much kudos: his answers to my question, which was hardly technically demanding, have been neither here nor there.

garymount
March 17, 2014 4:42 pm

Personally I don’t see cycles, I see damping. In my days of training for my airplane pilot license I was taught about the stability of the controlling surfaces of the airplane, ailerons, rudder, the pitch, yaw and roll axis. What happens when you apply power, pitch the nose up, or down, and fatefully when you drop like a rock when you put yourself into a stall (I pur