Crowdsourcing A Full Kernel Cascaded Triple Running Mean Low Pass Filter, No Seriously…

Fig 4-HadCrut4 Monthly Anomalies with CTRM Annual, 15 and 75 years low pass filters

Image Credit: Climate Data Blog

By Richard Linsley Hood  – Edited by Just The Facts

The goal of this crowdsourcing thread is to present a 12 month/365 day Cascaded Triple Running Mean (CTRM) filter, inform readers of its basis and value, and gather your input on how I can improve and develop it. A 12 month/365 day CTRM filter completely removes the annual ‘cycle’, as the CTRM is a near Gaussian low pass filter. In fact it is slightly better than Gaussian in that it completely removes the 12 month ‘cycle’ whereas true Gaussian leaves a small residual of that still in the data. This new tool is an attempt to produce a more accurate treatment of climate data and see what new perspectives, if any, it uncovers. This tool builds on the good work by Greg Goodman, with Vaughan Pratt’s valuable input, on this thread on Climate Etc.

Before we get too far into this, let me explain some of the terminology that will be used in this article:

—————-

Filter:

“In signal processing, a filter is a device or process that removes from a signal some unwanted component or feature. Filtering is a class of signal processing, the defining feature of filters being the complete or partial suppression of some aspect of the signal. Most often, this means removing some frequencies and not others in order to suppress interfering signals and reduce background noise.” Wikipedia.

Gaussian Filter:

A Gaussian filter is probably the ideal filter in time domain terms. That is, if you think of the graphs you are looking at as being like traces on an oscilloscope, then a Gaussian filter is the one that adds the least distortion to the signal.

Full Kernel Filter:

Indicates that the output of the filter will not change when new data is added (except to extend the existing plot). It does not extend up to the ends of the data available, because the output is in the centre of the input range. This is its biggest limitation.

Low Pass Filter:

A low pass filter is one which removes the high frequency components in a signal. One of its most common usages is in anti-aliasing filters for conditioning signals prior to analog-to-digital conversion. Daily, Monthly and Annual averages are low pass filters also.

Cascaded:

A cascade is where you feed the output of the first stage into the input of the next stage and so on. In a spreadsheet implementation of a CTRM you can produce a single average column in the normal way and then use that column as an input to create the next output column and so on. The value of the inter-stage multiplier/divider is very important. It should be set to 1.2067. This is the precise value that makes the CTRM into a near Gaussian filter. It gives values of 12, 10 and 8 months for the three stages in an Annual filter for example.
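
To make the cascade concrete, here is a minimal R sketch (R is used later in the post for the RSS graph). The helper names running_mean() and ctrm(), and the use of stats::filter() with centred windows, are my own illustration rather than the author's spreadsheet or code; the stage lengths of 12, 10 and 8 points follow the 1.2067 ratio described above.

# Minimal R sketch of a Cascaded Triple Running Mean (illustration only).
# stats::filter() gives a centred running mean; the three stage lengths
# follow the 1.2067 inter-stage ratio (12, 10, 8 points for an Annual
# filter on monthly data).
running_mean <- function(x, n) {
  stats::filter(x, rep(1 / n, n), sides = 2)   # NA near the ends: full kernel behaviour
}

ctrm <- function(x, period = 12, ratio = 1.2067) {
  n1 <- round(period)            # 12 for monthly data
  n2 <- round(period / ratio)    # ~10
  n3 <- round(period / ratio^2)  # ~8
  running_mean(running_mean(running_mean(x, n1), n2), n3)
}

Each stage simply averages the previous stage's output, exactly as the spreadsheet columns described above do.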

Triple Running Mean:

The simplest method to remove high frequencies or smooth data is to use moving averages, also referred to as running means. A running mean filter is the standard ‘average’ that is most commonly used in Climate work. On its own it is a very bad form of filter and produces a lot of arithmetic artefacts. Adding three of those ‘back to back’ in a cascade, however, allows for a much higher quality filter that is also very easy to implement. It just needs two more stages than are normally used.

—————

With all of this in mind, a CTRM filter, used either at 365 days (if we have data at that resolution) or at 12 months with the most common data sets, will completely remove the Annual cycle while retaining the underlying monthly sampling frequency in the output. In fact it is even better than that, as it does not matter whether the data used has already been normalised or not. A CTRM filter will produce the same output on either raw or normalised data, with only a small offset to account for whichever ‘Normal’ period the data provider chose. There are no added distortions of any sort from the filter.
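
As a quick, hedged check of that raw-versus-normalised claim, the snippet below builds a synthetic monthly series, forms crude anomalies by subtracting each calendar month's mean, and filters both with the hypothetical ctrm() sketch from the previous block; the two outputs should differ only by a near-constant offset.

# Synthetic check (not from the post): CTRM of raw data vs CTRM of anomalies
set.seed(1)
months <- 1:(40 * 12)
raw    <- 14 + 3 * sin(2 * pi * months / 12) +            # seasonal cycle
          0.005 * months + rnorm(length(months), 0, 0.2)  # slow trend + noise

clim <- ave(raw, (months - 1) %% 12 + 1)   # mean of each calendar month ("normals")
anom <- raw - clim

offset <- ctrm(raw, 12) - ctrm(anom, 12)
summary(as.numeric(offset))                # essentially a constant, NA at the ends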

Let’s take a look at what this generates in practice. The following are UAH Anomalies from 1979 to Present with an Annual CTRM applied:

Fig 1-Feb UAH Monthly Global Anomalies with CTRM Annual low pass filter

Fig 1: UAH data with an Annual CTRM filter

Note that I have just plotted the data points. The CTRM filter has removed the ‘visual noise’ that month-to-month variability causes. This is very similar to the 12 or 13 month single running mean that is often used, however it is more accurate, as the mathematical errors produced by those simple running means are removed. Additionally, the higher frequencies are completely removed while all the lower frequencies are left completely intact.

The following are HadCRUT4 Anomalies from 1850 to Present with an Annual CTRM applied:

Fig 2-Jan HadCrut4 Monthly Anomalies with CTRM Annual low pass filter

Fig 2: HadCRUT4 data with an Annual CTRM filter

Note again that all the higher frequencies have been removed and the lower frequencies are all displayed without distortions or noise.

There is a small issue with these CTRM filters in that CTRMs are ‘full kernel’ filters as mentioned above, meaning their outputs will not change when new data is added (except to extend the existing plot). However, because the output is in the middle of the input data, they do not extend up to the ends of the data available as can be seen above. In order to overcome this issue, some additional work will be required.

The basic principles of filters work over all timescales, thus we do not need to constrain ourselves to an Annual filter. We are, after all, trying to determine how this complex load that is the Earth reacts to the constantly varying surface input and surface reflection/absorption with very long timescale storage and release systems including phase change, mass transport and the like. If this were some giant mechanical structure slowly vibrating away we would run low pass filters with much longer time constants to see what was down in the sub-harmonics. So let’s do just that for Climate.

When I applied a standard time/energy low pass filter sweep against the data I noticed that there is a sweet spot around 12-20 years where the output changes very little. This looks like it may well be a good stop/pass band binary chop point, so I chose 15 years as the roll-off point to see what happens. Remember, this is a standard low pass filter, similar to the one that splits telephone from broadband on the line that connects you to the Internet. Using this approach, all components with periods longer than 15 years are fully preserved in the output and everything with shorter periods is completely removed.
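
In terms of the hypothetical ctrm() sketch above, the 15 year version is just the same cascade with longer windows; with the 1.2067 ratio that works out at roughly 180, 149 and 124 months. The input name below is a placeholder for whatever monthly series is being filtered.

# 15 year low pass on monthly data, reusing the illustrative ctrm() sketch
# (hadcrut4_monthly is a placeholder vector of monthly anomalies)
gt15yr <- ctrm(hadcrut4_monthly, period = 15 * 12)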

The following are HadCRUT4 Anomalies from 1850 to Present with a 15 year CTRM and a 75 year single running mean applied:

Fig 3-Jan HadCrut4 Monthly Anomalies with CTRM Annual, 15 and 75 years low pass filters

Fig 3: HadCRUT4 with additional greater than 15 year low pass. Greater than 75 year low pass filter included to remove the red trace discovered by the first pass.

Now, when reviewing the plot above, some have claimed that this is a curve-fitting or ‘cycle mania’ exercise. However, the data hasn’t been fitted to anything; I just applied a filter. Then out pops a wriggle, at around ~60 years, which the data draws all on its own. It’s the data what done it – not me! If you see any ‘cycle’ in the graph, then that’s your perception. What you can’t do is say the wriggle is not there. That’s what the DATA says is there.

Note that the extra ‘greater than 75 years’ single running mean is included to remove the discovered ~60 year line, as one would normally do to get whatever residual is left. Only a single stage running mean can be used as the data available is too short for a full triple cascaded set. The UAH and RSS data series are too short to run a full greater than 15 year triple cascade pass on them, but it is possible to do a greater than 7.5 year which I’ll leave for a future exercise.
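
A sketch of that extra single-stage step, again using the illustrative helpers above. Whether the 75 year mean is run on the raw monthlies or on the 15 year output is not stated here, so treat the input below as an assumption.

# Single "greater than 75 years" running mean (900 months); one stage only,
# because the record is too short for a full triple cascade at this length
gt75yr <- running_mean(hadcrut4_monthly, 75 * 12)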

And that Full Kernel problem? We can add a Savitzky-Golay filter to the set, which is the Engineering equivalent of LOWESS in Statistics, so it should not meet too much resistance from statisticians (want to bet?).

Fig 4-Jan HadCrut4 Monthly Anomalies with CTRM Annual, 15 and 75 years low pass filters and S-G

Fig 4: HadCRUT4 with additional S-G projections to observe near term future trends

We can verify that the parameters chosen are correct because the line closely follows the full kernel filter if that is used as a training/verification guide. The latest part of the line should not be considered an absolute guide to the future. Like LOWESS, S-G will ‘whip’ around on new data like a caterpillar searching for a new leaf. However, it tends to follow a similar trajectory, at least until it runs into a tree. While this is only a basic predictive tool, which estimates that the future will be like the recent past, the tool estimates that we are over a local peak and headed downwards…
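
For readers who want to try the Savitzky-Golay step themselves, the R 'signal' package provides sgolayfilt(). The polynomial order and window length below are placeholders only; as described above, the real parameters were tuned so that the S-G line tracks the full kernel 15 year CTRM over the period where both exist.

# Hedged S-G sketch: smooths right up to the ends of the record, unlike the
# full kernel CTRM. p (polynomial order) and n (odd window length, in months)
# are illustrative values, not the author's.
library(signal)
sg15 <- sgolayfilt(hadcrut4_monthly, p = 2, n = 181)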

And there we have it. A simple data treatment for the various temperature data sets: a high quality filter that removes the noise and helps us to see the bigger picture. Something to test the various claims made as to how the climate system works. Want to compare it against CO2? Go for it. Want to check SO2? Again, fine. Volcanoes? Be my guest. Here is a spreadsheet containing UAH and an Annual CTRM, and R code for a simple RSS graph. Please just don’t complain if the results from the data don’t meet your expectations. This is just data and summaries of the data. Occam’s Razor for a temperature series. Very simple, but it should be very revealing.

Now the question is how I can improve it. Do you see any flaws in the methodology or tool I’ve developed? Do you know how I can make it more accurate, more effective or more accessible? What other data sets do you think might be good candidates for a CTRM filter? Are there any particular combinations of data sets that you would like to see? You may have noted the 15 year CTRM combining UAH, RSS, HadCRUT and GISS at the head of this article. I have been developing various options at my new Climate Data Blog and based upon your input on this thread, I am planning a follow up article that will delve into some combinations of data sets, some of their similarities and some of their differences.

About the Author: Richard Linsley Hood holds an MSc in System Design and has been working as a ‘Practicing Logician’ (aka Computer Geek) to look at signals, images and the modelling of things in general inside computers for over 40 years now. This is his first venture into Climate Science and temperature analysis.

Kirk c

That is so cool!

Lance Wallace

The CO2 curve (seasonally detrended monthly) can be fit remarkably closely by a quadratic or exponential curve, each with <1% error for 650 or so consecutive months:
http://wattsupwiththat.com/2012/06/02/what-can-we-learn-from-the-mauna-loa-co2-curve-2/
The exponential has a time constant (e-folding time) on the order of 60 years (i.e., doubling time for the anthropogenic additions from the preindustrial level of 260 ppm of about 60*0.69 = 42 years).
Can the Full Kernel Triple thingamajig provide any further insight into the characteristics of the curve? For example, due to various efforts by governments to reduce CO2 emissions, can we see any effect on the curve to date? I tried fitting the exponential curve (first only up to the year 2005, then 2010, finally to September 2013) and there was a very small movement toward lengthened e-folding time (60.26, 60.94, 61.59 years). But also, theoretically, there is some relation between atmospheric CO2 and CO2 emissions, but one has to assume something about the lag time between CO2 emissions and CO2 concentrations. Can the Triple whatsis somehow compare the two curves and derive either a lag time or an estimate of how much of the CO2 emissions makes it into the observed atmospheric concentrations? Simply assuming that all the CO2 emissions make it into the atmosphere in the following year gives an R^2 on the order of 40% (IIRC).

geran

Since this is your first foray into “climate science”, Richard, let me help with the basics you will need to know.
Climate modeling is about as close to science as are video games. (Except in the better video games, there are sound effects.) Climate modeling works like this: Enter input, run program, no catastrophic results, enter new input and re-start. Continue until you get catastrophic results. Then, publish results and request more funding.
If your program never achieves a catastrophic result, adjust input data until proper results are achieved. (Ever heard of a hockey stick?)
If someone questions your knowledge of science, produce numerous graphics, in color, of your results.
It’s a veritable gold mine!

Arno Arrak

Why bother with HadCrut and GISS? They are worthless. Use UAH and RSS, but get rid of that annual version and use monthly versions. And draw the trend with a semi-transparent magic marker. See Figure 15 in my book “What Warming?”

RichardLH

Lance Wallace says:
March 16, 2014 at 1:42 pm
“Can the Triple whatsis somehow compare the two curves and derive either a lag time or an estimate of how much of the CO2 emissions makes it into the observed atmospheric concentrations?”
Not really. Low pass filters are only going to show periodicity in the data and the CO2 figure is a continuously(?) rising curve.
It is useful to compare how the CO2 curve matches to the residual after you have removed the ~60 ‘cycle’ and there it does match quite well but with one big problem, you need to find something else before 1850 to make it all work out right.
Volcanos are the current favourite but I think that getting just the right number and size of volcanos needed is stretching co-incidence a little too far. Still possible though.

Eliza

Arno 100% correct GISS, HADCRUT are trash data. I don’t understand why any posting using that data can be taken seriously (“adjustments”, “UHI”, etc.). This is just feeding the warmist trolls.

RichardLH

Arno Arrak says:
March 16, 2014 at 1:45 pm
“Why bother with HadCrut and GISS? They are worthless. Use UAH and RSS, but get rid of that annual version and use monthly versions.”
They are two of the only series that stretch back to 1850 (which unfortunately the satellite data does not) and with Global coverage.
They do match together quite well though with some odd differences.
What good would a monthly view do when trying to assess climate? Assuming you accept that climate is a long term thing, i.e. longer than 15 years. Monthly is down in the Weather range.

RichardLH

Eliza says:
March 16, 2014 at 2:00 pm
“Arno 100% correct GISS, HADCRUT are trash data”
You might like to consider the fact that any long term adjustments will show up in the residual, greater than 75 years curve, and, if they have indeed occurred, would only serve to flatten that part of the output.
The ~60 year wriggle will still be there and needs explaining.

Steven Mosher

Giss and hadcrut are not the only series.
Ncdc. Berkeley. Cowtan and way.
Im using ctrm in some work on gcrs and cloud cover. Thanks for the code richard.
Raise your objections now to the method…folks.

Dr Norman Page

Very nice elucidation of the 60 year cycle in the temperature data. Now do the same for the past 2000 years using a suitable filter. A review of candidate proxy data reconstructions and the historical record of climate during the last 2000 years suggests that at this time the most useful reconstruction for identifying temperature trends in the latest important millennial cycle is that of Christiansen and Ljungqvist 2012 (Fig 5)
http://www.clim-past.net/8/765/2012/cp-8-765-2012.pdf
For a forecast of the coming cooling based on the 60 and 1000 year quasi periodicities in the temperatures and the neutron count as a proxy for solar “activity ”
see http://climatesense-norpag.blogspot.com
this also has the Christiansen plot see Fig3. and the 1000 year cycle from ice cores Fig4
The biggest uncertainty in these forecasts is the uncertainty in the timing of the 1000 year cycle peak.
In the figure in Richards post it looks like it is about 2009. From the SST data it looks like about 2003. See Fig 7 in the link and NOAA data at
ftp://ftp.ncdc.noaa.gov/pub/data/anomalies/annual.ocean.90S.90N.df_1901-2000mean.dat
It is time to abandon forecasting from models and for discussion and forecasting purposes use the pattern recognition method seen at the link
http://climatesense-norpag.blogspot.com .

J Martin

What is the point of all this, what is the end result, projection / prediction ?

Arno Arrak

Lance Wallace says on March 16, 2014 at 1:42 pm:
“The CO2 curve (seasonally detrended monthly) can be fit remarkably closely by a quadratic or exponential curve, each with <1% error for 650 or so consecutive months…"
So what. CO2 is not the cause of any global warming and is not worth that expenditure of useless arithmetic. The only thing important about it is that it is completely smooth (except for its seasonal wiggle) during the last two centuries. It follows from this that it is physically impossible to start any greenhouse warming during these two centuries. We already know that there has been no warming for the last 17 years despite the fact that there is more carbon dioxide in the air now than ever before. That makes the twenty-first century greenhouse free. Since this constant addition of CO2 is not causing any warming it follows that the theory of enhanced greenhouse warming is defective. It does not work and should be discarded. The only theory that correctly describes this behavior is the Miskolczi theory that ignorant global warming activists hate. But the twentieth century did have warming. It came in two spurts. The first one started in 1910, raised global temperature by half a degree Celsius, and stopped in 1940. The second one started in 1999, raised global temperature by a third of a degree in only three years, and then stopped. Here is where the smoothness of the CO2 curve comes in. Radiation laws of physics require that in order to start an enhanced greenhouse warming you must simultaneously add carbon dioxide to the atmosphere. That is because the absorbency of CO2 for infrared radiation is a fixed property of its molecules that cannot be changed. To get more warming, get more molecules. This did not happen in 1910 or in 1999 as shown by the Keeling curve and its extension. Hence, all warming within the twentieth century was natural warming, not enhanced greenhouse warming. Consequently we now have the twentieth and the twenty-first centuries both entirely greenhouse free. Hence that anthropogenic global warming that is the life blood of IPCC simply does not exist. To put it in other words: AGW has proven to be nothing more than a pseudo-scientific fantasy, result of a false belief that Hansen discovered greenhouse warming in 1988.

Hlaford

The N-path filters make the best notch filters in sampled domains. They fell under the radar for the other kinds of filtering, but here you’d have 12 paths and a high pass at each of them for the monthly data.
I’ve seen a paper where a guy explains how he removed the noise made by vuvuzela trumpets using N-path filtering.

RichardLH

Steven Mosher says:
March 16, 2014 at 2:21 pm
“Giss and hadcrut are not the only series. Ncdc. Berkeley. Cowtan and way.
Im using ctrm in some work on gcrs and cloud cover. Thanks for the code richard.”
No problem – it’s what Greg and Vaughan thrashed out translated into R (for the CTRM).
I know there are other series but I wanted to start with the most commonly used ones first. I have treatments for some of the others as well. At present I am working on a set of Global, Land, Ocean triples as that has thrown up some interesting observations.
It is a great pity that the temperature series are not available in an easy-to-read-into-R form but sometimes require a lot of code just to turn them into data.frames. Makes it a lot more difficult to post the code to the ‘net in an easy to understand form.
The ~60 year ‘cycle’ shows up strongly in all of them so far. 🙂

jai mitchell

Since this constant addition of CO2 is not causing any warming it follows that the theory of enhanced greenhouse warming is defective. It does not work and should be discarded.
. . .hilarious. . .
http://www.skepticalscience.com//pics/oceanheat-NODC-endof2013.jpg

RichardLH

J Martin says:
March 16, 2014 at 2:31 pm
“What is the point of all this, what is the end result, projection / prediction ?”
It allows you to look at how the system responded to the inputs over the last 150+ years with fewer mathematical errors hiding the details. You can use the S-G trend as a loose guide to the immediate future trend if you like.

Gary

How much useful information is being lost by filtering out the high frequency “noise” ? In other words, how do you judge what the most effective bandwidth of the filter is?

RichardLH

Hlaford says:
March 16, 2014 at 2:37 pm
“The N-path filters make the best notch filters in sampled domains.”
There are many good notch filters out there, but that is not what this is. In fact it is the exact opposite. A broadband stop/pass filter that will allow ALL of the frequencies above 15 years in length to be present in the output. No need to choose what to look for, anything that is there, will be there.

RichardLH

Gary says:
March 16, 2014 at 2:42 pm
“How much useful information is being lost by filtering out the high frequency “noise” ? In other words, how do you judge what the most effective bandwidth of the filter is?”
Well, what you lose is Daily, Weather, Monthly, Yearly and Decadal. What you keep is multi-decadal and longer. Which do you consider to be relevant to Climate?
If you are interested in the other stuff you can run a high pass version instead and look at that only if you wish. Just subtract this output signal from the input signal and away you go.
No data is truly lost as such, it is just in the other pass band. The high and low pass added together will, by definition, always be the input signal.
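
For anyone who wants to see the complement described here, it is literally one subtraction, using the illustrative ctrm() sketch from the article and a placeholder input series:

# High pass as the complement of the low pass (placeholder series)
low_pass  <- ctrm(hadcrut4_monthly, period = 15 * 12)
high_pass <- hadcrut4_monthly - low_pass   # low pass + high pass == input, where defined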

David L. Hagen

Thanks Richard for an insightful way of exposing the 60 year cycle.
1) Has an algebraic factor been found behind the 1.2067 factor or is this still empirical per Vaughan Pratt?
2) Does using the more accurate year length of 365.26 days make any difference?
3) Suggest showing the derivative of the CTRM curve.
That would show more clearly that the rate of warming is declining. If that derivative goes though zero, that would give evidence that we are now entering the next “cooling” period vs just flattening in warming.
PS suggest amending “with a 15 CTRM and” to “with a 15 CTRM year and” to read more clearly.
Interesting interaction you had with TallBloke.

Mike McMillan

Well, it certainly gets rid of the 1998 peak problem. GISS would be appreciative.

Bernie Hutchins

Is it possible to humor some of us old-timer digital filter designers by specifically saying what the (Finite) Impulse Response of the CTRM is; in basic terms such as what simpler elements are being cascaded (convolved for the IR response, multiplied for the frequency response, etc.)? This is the key to understanding filtering – old or new. Thanks.

RichardLH

Dr Norman Page says:
March 16, 2014 at 2:30 pm
“Very nice elucidation of the 60 year cycle in the temperature data, Now do the same for the past 2000 years using a suitable filter.”
Bit of a challenge finding a thermometer series going back that far. 😉
Proxy series all come with their own problems. They are rarely in a resolution that will allow the ~60 year signal to be seen.
I do have one which is the Shen PDO reconstruction from rainfall which does it quite well back to the 1400s.
http://climatedatablog.files.wordpress.com/2014/02/pdo-reconstruction-1470-1998-shen-2006-with-gaussian-low-pass-30-and-75-year-filters-and-hadcrut-overlay.png
Shen, C., W.-C. Wang, W. Gong, and Z. Hao. 2006.
A Pacific Decadal Oscillation record since 1470 AD reconstructed
from proxy data of summer rainfall over eastern China.
Geophysical Research Letters, vol. 33, L03702, February 2006.
ftp://ftp.ncdc.noaa.gov/pub/data/paleo/historical/pacific/pdo-shen2006.txt
Looks like the ~60 year is present a long way back then. As to any longer cycles, there the problem is resolution and data length. In most cases the ‘noise’ is so great you can conclude almost anything and not be proved wrong by the data unfortunately.

Graeme W

I have a couple of questions regarding the initial filter.
1. A 365 day filter has an ongoing problem due to leap-years. There are, roughly, 25 leap days in a century, which pushes a 365 day filter almost a month out of alignment. How is this catered for? It’s not a problem with a 12 month filter, but then you hit the problem that each month is not equal in length.
2. You’re doing filtering on anomalies to remove the seasonal component. Aren’t the anomalies supposed to do that themselves, since they’re anomalies from the average for each part of the season? That is, the anomalies themselves are trying to remove the seasonal component, and the filter is also trying to remove the seasonal component. How do we ensure that any result we get isn’t an artifact of these two processes interacting?
Please forgive me if these questions show my ignorance, but they’ve been a concern of mine for a while.

Steve Taylor

Richard, are you related to the late, great John Linsley Hood, by any chance ?

Clay Marley

I am not too fond of the triple running mean filter, mainly because it is not easy to implement generically in Excel. By that I mean if I wanted to change the number of points in one stage, it is a fair amount of work. Also there are three smoothing parameters to tinker with rather than a single number of regression points, and I have not seen any good theory on how to pick the three ranges to average over.
I prefer either a LOESS or Gaussian filter for this kind of data. Both can be implemented easily in Excel with macros. Both have the advantage of being able to handle points near either end of the data set. LOESS also works when the points are not equally spaced. By selecting the proper number of regression points, either one will be able to produce results virtually indistinguishable from the triple running mean filter. Also with LOESS it is possible to estimate the derivative at any point.
I have not experimented much with the Savitzky-Golay filter yet because it requires recalculating the coefficients with each change in the order or number of regression points. Can be done of course, just makes the macros more complicated. And I am not sure it is worth the effort. One of these days I’ll probably get around to it.
Any smoothing like this is mainly just a data visualization tool. How it looks is largely subjective. Rarely do I see a smoothed signal being used in later analysis, such as correlating it with something; and when I do it raises a red flag.

Steven Mosher

RichardLH
“They are two of the only series that stretch back to 1850 (which unfortunately the satellite data does not) and with Global coverage.”
like I said. wrong

RichardLH

David L. Hagen says:
March 16, 2014 at 3:01 pm
“Thanks Richard for an insightful way of exposing the 60 year cycle.”
Lots of people claim it is not there. This just shows that you cannot really ignore it that easily.
“1) Has an algebraic factor been found behind the 1.2067 factor or is this still empirical per von Pratt?”
Vaughan has a precise set of reasoning which he gave on Greg’s thread over at JCs. It is not empirical but pure mathematical.
” 2) Does using the more accurate year length of 365.26 days make any difference?”
I have found that there may be a 4 year signal in some data which would suggest that it may well be. I haven’t looked in detail at the high pass portion of the signal yet.
“3) Suggest showing the derivative of the CTRM curve.”
I hate Linear Trends and all their cousins. They only hold true over the range they are drawn from and hide rather than reveal as far as I am concerned. Use the residual, greater than 75 years curve as a better alternative for any long term values. You can run an S-G at that length but please be aware of S-G’s tendency to be strongly influenced by the ‘end’ data.
“That would show more clearly that the rate of warming is declining. If that derivative goes though zero, that would give evidence that we are now entering the next “cooling” period vs just flattening in warming.”
I think that the S-G 15 year shows that quite well in any case.

RichardLH

Steve Taylor says:
March 16, 2014 at 3:17 pm
“Richard, are you related to the late, great John Linsley Hood, by any chance ?”
My Uncle. He got me interested in Audio (and Science) a long time back and was the originator of the 1.3371 value I was originally using in my first CTRM digital work.

Dr Norman Page

RLH The Christiansen data I linked to are I think archived. Read the paper and see why I think they do provide a temperature record which is good enough to be useful for analysis. It should replace the Mann hockey stick in the public perception of the NH climate over the last 1000 years.
Again check Figs 3 and 4 at http://climatesense-norpag.blogspot.com.
Forecasting is fairly straightforward if you know where you are on the 1000 year quasi-periodicity.
If you don’t take it into consideration you are lost in a maze of short term noise.

RichardLH

Bernie Hutchins says:
March 16, 2014 at 3:14 pm
“Is it possible to humor some of us old-timer digital filter designers by specifically saying what the (Finite) Impulse Response of the CTRM is”
Basically the triple cascade allows for a near to Gaussian response curve. This can be seen in greater detail on Greg’s thread where the various curves can be compared. CTRM is just slightly better than Gaussian in that it completely removes the corner frequency whereas a true Gaussian does leave a small portion still in the output.
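
For the h(n) Bernie is asking about: the impulse response of a cascade of running means is just the convolution of the individual rectangular kernels. The few lines below are an illustration, not code from the post; they build it for the 12/10/8 annual case and look at its magnitude response.

# Impulse response of the annual CTRM: three unit-sum boxcars convolved together
box <- function(n) rep(1 / n, n)
h <- convolve(convolve(box(12), rev(box(10)), type = "open"),
              rev(box(8)), type = "open")        # 12 + 10 + 8 - 2 = 28 taps
sum(h)                                           # ~1: unity gain at DC
H <- abs(fft(c(h, numeric(512 - length(h)))))    # zero-padded magnitude response
# plot(H[1:64], type = "l")                      # near-Gaussian low pass shape, deep null near the 12-month period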

RichardLH

Mike McMillan says:
March 16, 2014 at 3:10 pm
“Well, it certainly gets rid of the 1998 peak problem. GISS would be appreciative.”
Does throw up another question though. Why is it that GISS and HadCRUT are so far apart in the middle? They are close together at the start and the finish, why so different in the middle? I am not sure GISS (or HadCRUT) will thank me that much.

garymount

You use the term signal but you haven’t provided a definition for “signal”. Is this data? Is “signal” an entity of some kind? Is there a signal in the probabilities of rolling a dice? I just haven’t come across the word “signal” very often in any of the studies over all the decades of my personal studies and work, perhaps because other terminology is used. My studies have included electronics, but it didn’t involve signal processing. I also spent 10 years full time work in drawing electrical schematic diagrams, about 3 decades of computer science, and I somewhat recently spent several years studying calculus every single day, so now I call myself a mathematician.
If we are using computers to work with a signal, that means it is probably converted to discretized data, consisting of finite numbers, though you can use semantic computation, though I don’t think that’s what I see happening.
Sorry for being somewhat pedantic, but I’m just not comfortable with calling the temperature of the global surface a signal, so I’m asking for a better description of “signal”.

david moon

It’s hard to tell what this is from the description in words. Can you show a block diagram or provide a spreadsheet? Also just showing the time response to a step input, or the frequency response would be helpful.
It sounds like it’s an FIR filter with no feedback of the output. If so, you should be able to multiply out the nested means and the coefficient and get 12 coefficients, one for each tap of an FIR.
One quibble: 75 years is period, not a frequency. It would be more correct to say “…frequencies with periods greater than 75 years…”

RichardLH

Graeme W says:
March 16, 2014 at 3:16 pm
“I have a couple of questions regarding the initial filter.
1. A 365 day filter has an ongoing problem due to leap-years. There are, roughly, 25 leap days in a century, which pushes a 365 day filter almost a month out of alignment. How is this catered for? It’s not a problem with a 12 month filter, but then you hit the problem that each month is not equal in length.”
True. The advantage of a continuous CTRM filter is that the full sampling frequency is preserved in the output so any error is very tiny. If you were to sub-sample to Annual then the problem you see would be present. This does not have that problem.
“2. You’re doing filtering on anomalies to remove the seasonal component. Aren’t the anomalies supposed to do that themselves, since they’re anomalies from the average for each part of the season. That is, the anomalies themselves are trying to remove the seasonal component, and the filter is also trying to remove the seasonal component. How do ensure that any result we get isn’t an artifact of these two processes interacting?”
Fortunately it doesn’t matter if you do this on Temperature or Anomalies, you end up with the same curve, just a different label to the left. The problem with ‘normals’ is that there are only 30 additions to make the normal. That will leave some error term in that which will then be present in all of the Anomaly outputs. Using a CTRM removes that residual as well rather nicely.
“Please forgive me if these questions show my ignorance, but they’ve been a concern of mine for awhile.”
Mine too which is why I use CTRM now as it is much mathematically purer.

RichardLH

Steven Mosher says:
March 16, 2014 at 3:21 pm
“RichardLH
“They are two of the only series that stretch back to 1850 (which unfortunately the satellite data does not) and with Global coverage.”
like I said. wrong”
OK. To be precise, there is only one set of thermometers from which the various treatments are drawn.
Some included more thermometers than the others, some less. I have comparisons between most of them and none come out particularly favourably. I have BEST, C&W, NCDC and others. All show surprising disparities with each other with no easy explanations.
I’ll get round to them in time but the next set is Land, Ocean and General triples for the sets I do have to hand right now.

RichardLH

Dr Norman Page says:
March 16, 2014 at 3:27 pm
“RLH The Christiansen data I linked to are I think archived.”
Thanks, I will. Do you know if the data is on the ‘net, and if so the URL? Or is it just the paper?

Bernie Hutchins

RichardLH says: March 16, 2014 at 3:30 pm
“Bernie Hutchins says:
March 16, 2014 at 3:14 pm
“Is it possible to humor some of us old-timer digital filter designers by specifically saying what the (Finite) Impulse Response of the CTRM is”
Basically the triple cascade allows for a near to Gaussian response curve. This can be seen in greater detail on Greg’s thread where the various curves can be compared. CTRM is just slightly better than Gaussian in that it completely removes the corner frequency whereas a true Gaussian does leave a small portion still in the output.”
Fine – thanks. But we need a “formula” or design procedure that leads to h(n) = …. so that we can calculate the frequency response in a known way (sum of frequency domain cosines in this case). It seems to me that SG with its optimal flatness around DC couldn’t be beaten. Because the data is noisy, spurious resonances (if any) matter. Hence the advantage of SG.

RichardLH

garymount says:
March 16, 2014 at 3:34 pm
“You use the term signal but you haven’t provided a definition for “signal”. Is this data?”
You are quite right. I do tend to use the term loosely. As far as I am concerned a ‘signal’ is any variation in the temperature data, long term. I tend to class Weather, Monthly, Annual as ‘noise’.
Just my background. Sorry if it confuses.

david moon

Assuming monthly data points: 1, 2, etc. Is this how it works?
M1 = mean(1,2,3,4)
M2 = mean(5,6,7,8)
M3 = mean(9,10,11,12)
M4 = 1.2067 x mean(M1,M2)
M5 = 1.2067 x mean(M2,M3)
OUTPUT = 1.2067 x mean(M4,M5)
Shift data one place and continue…

garymount

RichardLH says: March 16, 2014 at 3:51 pm
– – –
Thanks Richard, now I feel better 🙂

RichardLH

david moon says:
March 16, 2014 at 3:36 pm
“It’s hard to tell what this is from the description in words. Can you show a block diagram or provide a spreadsheet?”
…Here is a spreadsheet (http://wattsupwiththat.files.wordpress.com/2010/07/uah-with-annual-ctrm.xlsx) containing UAH and an Annual CTRM… from above
“Also just showing the time response to a step input, or the frequency response would be helpful.”
If you want frequency response graphs and the like then Greg’s thread on JCs is a better place. I left out a lot of the technical detail as the post got way too long otherwise.
…all this is at Judith Curry’s site.
http://judithcurry.com/2013/11/22/data-corruption-by-running-mean-smoothers/
Data corruption by running mean ‘smoothers’
Posted on November 22, 2013 by Greg Goodman
or visit Greg’s own site for the same article.
http://climategrog.wordpress.com/2013/05/19/triple-running-mean-filters/

“One quibble: 75 years is period, not a frequency. It would be more correct to say “…frequencies with periods greater than 75 years…””
Again, sorry for the sloppy terminology. I tend to use the two without considering the exact wording required. I think in one and have to write in the other!

RichardLH

david moon says:
March 16, 2014 at 3:54 pm
“Assuming monthly data points: 1, 2, etc. Is this how it works?”
The spreadsheet with its colour coding may help you understand what is going on better.
http://wattsupwiththat.files.wordpress.com/2010/07/uah-with-annual-ctrm.xlsx
Basically you run a single mean over the data. then you use that column as the input to the next stage and so on. Hence the cascaded bit.

A Crooks of Adelaide

Hi,
Just a note on my own observations with respect to the top graph
What I see is a sixty year sine curve
What I also see is a pattern of two peaks followed by a deep trough – two peaks followed by a deep trough, etc. The troughs repeat every 7.5 years (next one due in 2016); the peaks are every 3.75 years.
I use http://www.climate4you.com/images/AllCompared%20GlobalMonthlyTempSince1979.gif
as my satelite source.
If you plot El Chichon and Pinatubo on the graph you will see how much volcanic events disrupt this pattern (http://www.esrl.noaa.gov/gmd/grad/mloapt.html), knocking the shoulders off the plateaus of the running average.
Sure, there is an oddity around 2005 where the peaks have their own little troughs, but too bad.
The graphs I use are:
A = 0.18*SIN(((YEAR-1993)/60)*2*3.14159)+0.2
B = 0.1*COS(((YEAR-1982)/7.5)*2*3.14159)
C = 0.25*COS(((YEAR -1980)/3.75)*2*3.14159)
See you in 2016 to see how close I get!

Arno Arrak

RichardLH on March 16, 2014 at 2:03 pm
Richard, you are hung up on useless arithmetic and are defending falsified temperature curves. Both HadCRUT and GISS showed presence of a non-existent warming in the eighties and nineties. My nose tells me that Hansen is behind this. I pointed it out in my book and two years later both sources changed their data retroactively to make it conform to satellites which do not show this warming. They did this secretly and no explanation was given. To coordinate it there had to be a trans-Atlantic communication link between them. This group also includes NCDC which I did not mention before.
Another thing you talk about is volcanos: ” Volcanos are the current favourite but I think that getting just the right number and size of volcanos needed is stretching co-incidence a little too far.” Forget about volcanoes, there is no volcanic cooling. The hot gases from an eruption ascend to the stratosphere where they warm it at first. In about two years time it turns to cooling but this stays up there too and never descends into the troposphere. The alleged “volcanic cooling” is caused by an accidental coincidence of expected cooling with the location of a La Nina cooling that is part of ENSO. The entire global temperature curve is a concatenation of El Nino peaks with La Nina valleys in between. It has been that way ever since the Panamanian Seaway closed. Many so-called “experts” don’t know this and would like nothing better than to detrend it out of existence. But by so doing they are destroying information and information about the exact location of El Nino peaks is required to evaluate the effect of volcanoes. In general, if an eruption is coincident with an El Nino peak it will be followed by a La Nina valley which promptly gets appropriated for a “volcanic cooling” incident. This happened with Pinatubo. But ENSO and volcanism are not in phase and it can happen that an eruption coincides with a La Nina valley. It will then be followed by an El Nino peak exactly where the volcanic cooling is supposed to happen and volcanologists are still scratching their heads about it. This is what happened with El Chichon in Mexico that was equally as strong as Pinatubo. This lack of understanding leads to more stupidity with models. Looking at a CMIP5 feather duster I noticed that all of the threads that completely disagree with one another about the future dip down together at the locations of El Chichon and Pinatubo volcanoes, no doubt because they have volcanic cooling coded into their software. This story is actually no secret because it has been in my book all along but these “climate” scientists just don’t do their homework and are still ignorant about it.

A Crooks of Adelaide

I guess I needed to add the important bit on how I combine the formulae. Sorry about that:
The overarching trend is a sixty year cycle: = A
The moving 20-month average adds a 7.5 year cycle attenuated by the truncation of
the positive peaks of the 7.5 year cycle : = A + (IF B>0, 0, ELSE = B)
The monthly average combines a 7.5 year cycle with a 3.75 year cycle (i.e. twice the
7.5 year cycle) to capture the pattern where every second trough in the 3.75 year COS
function is significantly deeper than the others : = A + (3/4) * B + C
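
Purely to make the commenter's recipe concrete (this is not part of the article's method), the formulae and the two combinations described above can be written out directly in R:

# A Crooks of Adelaide's illustrative curves, as stated in the comment
year <- seq(1979, 2016, by = 1 / 12)
A <- 0.18 * sin(((year - 1993) / 60)   * 2 * pi) + 0.2
B <- 0.10 * cos(((year - 1982) / 7.5)  * 2 * pi)
C <- 0.25 * cos(((year - 1980) / 3.75) * 2 * pi)

twenty_month <- A + ifelse(B > 0, 0, B)   # "A + (IF B>0, 0, ELSE = B)"
monthly      <- A + (3 / 4) * B + C       # "A + (3/4) * B + C"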

RichardLH

Bernie Hutchins says:
March 16, 2014 at 3:51 pm
“Fine – thanks. But we need a “formula” or design procedure that leads to h(n) = …. so that we can calculate the frequency response in a known way (sum of frequency domain cosines in this case).”
Well for Gaussian then Greg refers to this for the full mathematical derivation.
http://www.cwu.edu/~andonie/MyPapers/Gaussian%20Smoothing_96.pdf
For frequency responses then
http://climatedatablog.files.wordpress.com/2014/02/fig-1-gaussian-simple-mean-frequency-plots.png
and
http://climatedatablog.files.wordpress.com/2014/02/fig-2-low-pass-gaussian-ctrm-compare.png
given the various filter responses graphically.

RichardLH

A Crooks of Adelaide says:
March 16, 2014 at 4:02 pm
The point of this is that it is NOT a curve fit of any form. It is a simple low pass treatment of the data. Any ‘cycle’ you see, you need to explain. This just shows what is there.

RichardLH

Arno Arrak says:
March 16, 2014 at 4:07 pm
“Richard, you are hung up on useless arithmetic and are defending falsified temperature curves. Both HadCRUT and GISS showed presence of a non-existent warming in the eighties and nineties.”
Low pass filters are not useless arithmetic. They are the way that you can deal with the data you have all the time. You use them all the time in Daily, Monthly and Yearly form. This is just an extension of that.
If you believe that the data has been fiddled, then this should be an ideal tool for you to demonstrate that.

Bernie Hutchins

RichardLH says:
March 16, 2014 at 4:09 pm
Thanks Richard – that’s what I thought.
But when you cascade Moving Averages (running means, rectangles, “boxcars”) you get all the nulls of the original rectangles in the cascade. When you cascade Gaussians, you get essentially the smallest bandwidth of the set (longest average). With SG, you get an attractive wide low-pass bandwidth (optimally flat) at the expense of less high-pass rejection (never a free lunch).
SG is easy to do. By coincidence I posted an application note on it just a month ago. Note my Fig. 6b for example compared to the two frequency responses you posted for me.
http://electronotes.netfirms.com/AN404.pdf
Without due regard for the frequency response, you COULD BE seeing “colored noise” due to a spurious resonance. You are certainly NOT seeing something totally spurious, because for one thing, we can “see” the 60 year periodicity already in the original data. Less noise – but no new information?