Extreme Times

Guest Post by Willis Eschenbach

I read a curious statement on the web yesterday, and I don’t remember where. If the author wishes to claim priority, here’s your chance. The author said (paraphrasing):

If you’re looking at any given time window on an autocorrelated time series, the extreme values are more likely to be at the beginning and the end of the time window.

“Autocorrelation” is a way of measuring how likely it is that tomorrow will be like today. For example, daily mean temperatures are highly autocorrelated. If it’s below freezing today, it’s much more likely to be below freezing tomorrow than it is to be sweltering hot tomorrow, and vice versa.

Anyhow, being a suspicious fellow, I thought “I wonder if that’s true …”. But I filed it away, thinking, I know that’s an important insight if it’s true … I just don’t know why …

Last night, I burst out laughing when I realized why it would be important if it were true … but I still didn’t know if that was the case. So today, I did the math.

The easiest way to test such a statement is to do what’s called a “Monte Carlo” analysis. You make up a large number of pseudo-random datasets which have an autocorrelation structure similar to some natural autocorrelated dataset. This highly autocorrelated pseudo-random data is often called “red noise”. Because it was handy, I used the HadCRUT global surface air temperature dataset as my autocorrelation template. Figure 1 shows a few “red noise” autocorrelated datasets in color, along with the HadCRUT data in black for comparison.
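For readers who want to experiment, a minimal sketch of one way to generate this kind of red noise is a simple lag-one autoregressive (AR(1)) process. The coefficient of 0.9 and the step size below are illustrative assumptions, not the autocorrelation actually fitted from the HadCRUT data, and this is not the code used for the figures:

import numpy as np

def red_noise(n, phi=0.9, sigma=0.1, seed=None):
    """AR(1) red noise: x[t] = phi * x[t-1] + noise. phi and sigma are illustrative."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x

# A long continuous "pseudo-temperature" series (the loop takes a little while).
pseudo_temps = red_noise(2_000_000, seed=42)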

Figure 1. HadCRUT3 monthly global mean surface air temperature anomalies (black), after removal of seasonal (annual) swings. Cyan and red show two “red noise” (autocorrelated) random datasets.

The HadCRUT3 dataset is about 2,000 months long. So I generated a very long series (two million data points) as a single continuous red noise “pseudo-temperature” dataset. Of course, this two-million-point dataset is stationary, meaning that it has no trend and that its standard deviation is stable over time.

Then I chopped that dataset into sequential 2,000 data-point chunks and looked at each chunk to see where its maximum and minimum values occurred. If the minimum value was the third data point, I recorded the number as “3”; correspondingly, if the maximum was the next-to-last data point, it was recorded as “1999”.

Then I made a histogram showing, across all of those chunks, how many of the extreme values fell in the first hundred data points, the second hundred points, and so on. Figure 2 shows that result. Individual runs of a thousand chunks vary, but the general form is always the same.
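Continuing the sketch above (it reuses the pseudo_temps series from the earlier snippet, and again is only an illustration, not the actual analysis code), the chunking and histogram step looks roughly like this:

import numpy as np

# Chop the long series into 2,000-point chunks and record where the minimum
# and maximum of each chunk fall (positions 1 to 2000 within the chunk).
chunk_len = 2000
chunks = pseudo_temps.reshape(-1, chunk_len)            # 1,000 chunks of 2,000 points
extreme_positions = np.concatenate([chunks.argmin(axis=1) + 1,
                                    chunks.argmax(axis=1) + 1])

# Histogram of extreme locations in bins of 100 points, as in Figure 2.
counts, edges = np.histogram(extreme_positions, bins=np.arange(0, chunk_len + 1, 100))
for lo, c in zip(edges[:-1], counts):
    print(f"points {int(lo) + 1:4d}-{int(lo) + 100:4d}: {c}")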

Figure 2. Histogram of the location (from 1 to 2000) of the extreme values in the 2,000 datapoint chunks of “red noise” pseudodata.

So dang, the unknown author was perfectly correct. If you take a random window on a highly autocorrelated “red noise” dataset, the extreme values (minimums and maximums) are indeed more likely, in fact twice as likely, to be at the start and the end of your window rather than anywhere in the middle.

I’m sure you can see where this is going … you know all of those claims about how eight out of the last ten years have been extremely warm? And about how we’re having extreme numbers of storms and extreme weather of all kinds?

That’s why I busted out laughing. If you say “we are living today in extreme, unprecedented times”, mathematically you are likely to be right, even if there is no trend at all, purely because the data is autocorrelated and “today” is at one end of our time window!

How hilarious is that? We are indeed living in extreme times, and we have the data to prove it!

Of course, this feeds right into the AGW alarmism, particularly because any extreme event counts as evidence of how we are living in parlous, out-of-the-ordinary times, whether hot or cold, wet or dry, flood or drought …

On a more serious level, it seems to me that this is a very important observation. Typically, we assume that extremes are equally likely to occur anywhere in the time window. But as Fig. 2 shows, that’s not true. As a result, we incorrectly take the occurrence of recent extremes as evidence that the bounds of natural variation have recently been overstepped (e.g. “eight of the ten hottest years”, etc.).

This finding shows that we need to raise the threshold for what we consider to be “recent extreme weather” … because even if there are no trends at all, we are living in extreme times, so we should expect extreme weather.

Of course, this applies to all kinds of datasets. For example, currently we are at a low extreme in hurricanes … but is that low number actually anomalous when the math says that we live in extreme times, so extremes shouldn’t be a surprise?

In any case, I propose that we call this the “End Times Effect”, the tendency of extremes to cluster in recent times simply because the data is autocorrelated and “today” is at one end of our time window … and the corresponding tendency for people to look at those recent extremes and incorrectly assume that we are living in the end times when we are all doomed.

All the best,

w.

Usual Request. If you disagree with what someone says, please have the courtesy to quote the exact words you disagree with. This avoids misunderstandings.

 

Steve C
April 25, 2014 6:33 am

@Willis – You really didn’t need to ask. Of course you have full permission, and thanks for the flowers!

ferdberple
April 25, 2014 6:38 am

Willis, the implications are indeed important. Extreme weather can be manufactured statistically simply by segmenting the time series. Intuitively it may not require auto-correlated data if the segments are small enough. However, auto-correlation should allow the effect to occur with larger segments as compared to true random data.
This is a surprising result because it goes against our common sense ideas of randomness. It does seem worthy of a larger, more formal paper as it does have wide implications for those wishing to draw conclusions from statistics.

ferdberple
April 25, 2014 6:48 am

Itocalc says:
April 25, 2014 at 6:20 am
http://www.minneapolisfed.org/publications_papers/pub_display.cfm?id=4348&
==============
a very interesting paper:
Slutsky had shown in dramatic fashion that stochastic processes could create patterns virtually identical to the putative effects of weather patterns, self-perpetuating boom-bust phases and other factors on the economy.

David A
April 25, 2014 7:07 am

From a comment above…”Even for perfectly cyclic, trendless, noiseless data if your length of your sample is shorter than the cycle period you will find extremes at the ends of the sample is the most probable outcome.”
Thanks all for helping a layman begin to follow. Of course the natural earth cycle variance and period is quite the mystery, seeing as our climate is a function of many different cycle periods combining in ever-changing variances, caused by many different inputs combining in unique ways.
I can see why CO2 gets lost in the noise.

GreggB
April 25, 2014 7:13 am

Wow. Just when I think my troglodytic brain is beginning to get a handle on things, Willis comes along and says the inside of a table tennis ball is exciting. And then he explains it, and he’s right. Reading this blog continually humbles me, usually when I’m preening over what I had previously supposed was a clever thought of my own.
Once again Willis, thank you for opening up a new vista.

Editor
April 25, 2014 7:32 am

Makes sense. Autocorrelation with no trend is a random walk and random walks have an increasing standard deviation the longer the walk, so the last data points have a statistical tendency to be the most extreme vis a vis the beginning and vis a vis the rest of the sample. Equivalently, the beginning has a tendency to be the most extreme vis a vis the rest of the sample, and this would also hold for windows within the sample. SOUNDS right anyway.
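As a quick numerical illustration of that growing spread (a sketch assuming independent unit-normal steps, which is not necessarily how the pseudodata was generated):

import numpy as np

# The spread of a random walk with i.i.d. unit-normal steps grows roughly as sqrt(n).
rng = np.random.default_rng(0)
walks = rng.normal(0, 1, size=(10_000, 1_000)).cumsum(axis=1)   # 10,000 walks of 1,000 steps
for n in (10, 100, 1000):
    print(n, walks[:, n - 1].std(), np.sqrt(n))   # empirical spread vs sqrt(n)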

bwanajohn
April 25, 2014 7:54 am

To me this makes perfect sense considering the definition of autocorrelated data.
Consider: an event, n, where the measure T = f(n) is autocorrelated. By the definition we know that ΔT between f(n) and f(n+1) is small, therefore the probability that T0 ≈ T1 is high. The same probability holds for each delta step, n+1 to n+2, n+2 to n+3, etc. However, the probabilities drop with each successive event from the original event, n, such that the probability that T0 ≈ T10 is much lower, i.e. ΔT between f(n) and f(n+10) is much greater. The function is symmetric about n, so the probabilities are the same for T1 ≈ T-1, T2 ≈ T-2, etc.
So what we end up with is an inverse probability distribution centered about event n (the middle of the graph). All that is required is that the data be autocorrelated.

Mike M
April 25, 2014 8:19 am

So can we apply this statistical result in the form of a correction to various weather related records being touted by alarmists as proof of climate caused weather “extremes”?
“In any case, I propose that we call this the ‘End Times Effect’”
Okay, but it comes under the sub-category of “End of CAGW Effects”.

Joseph Murphy
April 25, 2014 10:02 am

Excellent stuff Willis. This pins down a sort of a priori feeling about many claims surrounding CAGW, and why I am comfortable ignoring them without having a decent argument as to why they are irrelevant.

April 25, 2014 12:07 pm

Willis Eschenbach says:
April 24, 2014 at 8:42 pm
(checked out pink noise)
Thanks for looking, not that then.
The result you have is nonsensical, the question being why. I think several comments give clues, and it is about the validity of the test. I agree with those who point out that the known “non-stationarity” will produce a convolution kind of result (à la Fourier transform), hence the large items at both ends.
Adding noise to something leaves something plus noise.
To me this suggests the result is dominated by the large slow excursions.
Recently I had something similar giving a strange statistics answer. Eventually I figured out what it was, removed it from the “signal”, and that left normal stats stuff. The point perhaps is that “non-stationary” is more literal than it might seem.
It might be interesting to band-split the hadcrut at say 4 years and then do the same analysis on each portion. (low pass will do and subtract to produce the complement)
I recall another ploy involving chopping into sections and reordering.

April 25, 2014 12:26 pm

Brilliant. Of course, it makes perfect sense, but you just don’t think of it that way every day.
I have a few I’ve done like this:
http://naturalclimate.wordpress.com/2014/03/31/ipcc-ar5-claims-in-review-last-decades-unusual/
http://naturalclimate.wordpress.com/2012/01/27/268/
http://naturalclimate.wordpress.com/2012/01/28/usa-run-and-rank-analysis/
It is not the least bit unusual for the last years to be ranked #1, top 10, or whatever, as you can see.

WayneM
April 25, 2014 2:54 pm

The effect may be due to the generated time series being red (as has been suggested in several of the comments). Have you tried it with a white time series to see if the result holds for that?

Bob Shapiro
April 25, 2014 2:56 pm

My first impression, upon reading this, was “Nonsense! That guy’s been looking at too many cherry-picked alarmist time series.”
On further consideration, it actually makes sense to me. In autocorrelation, two adjacent points are likely to have values close to each other. If the points are separated by a third point, then the two outside points are correlated only indirectly, through that middle point. The further apart the points, the weaker the correlation between those two points. The end points, on average, are further from every other point in the series than any intermediate point would be. Further away implies less correlation, so the likelihood is that this is where extremes might happen more often. (Just to make sure, Willis, what would the results be if you performed the test again, but moving the start and end positions of each of your 2000-point intervals over by 1000 points?)
A question though. Since GISS is notorious for adjusting adjustments to the instrumental record, how should these adjustments affect autocorrelation? Assuming the adjustments are “correct” should we expect it to increase or decrease the autocorrelation? Or should it have zero effect? If we can determine what result to expect, it might be interesting to look at what effect the adjustments actually have had on the autocorrelation!

katlab58
April 25, 2014 3:06 pm

It’s been 34 years since I was involved in statistics, so I will avoid commenting on most of this discussion. I just want to note that people only tend to look at things when they are at an extreme. If there is an even distribution of cancer in a certain area, people don’t question it unless it is clustered, i.e. extreme. That perception of extreme is based first on their personal mental database (i.e. “wow, I don’t remember having that much snow before”), so the other extreme is likely to be pretty distant from the extreme that caught your attention. The fact that most things run in waves or cycles means that most observations in the cycle are going to be close to the norm and the extremes are going to be distant from each other. Why this works on random numbers I don’t know, but why it works in real life I get.

Alan McIntire
April 25, 2014 3:22 pm

Seattle says (April 24, 2014 at 10:26 pm): “I think I may have found a mathematical explanation for this.” (the arcsine law)
You beat me to it. Here’s another link describing the arcsine law:
http://www.math.harvard.edu/library/sternberg/slides/1180908.pdf

gary bucher
April 25, 2014 3:42 pm

Willis, I think you made a small error on your simple example:
” Even this very simplified example follows the math. From his math we’d expect 3.2 out of eight at each end (40%), and 1.6 out of eight in the middle (20%). Instead of 3.2 and 1.6, the results are 3 and 2, which are the closest whole numbers for the actual situation.”
Unless I am missing something – the 0’s of the middle two possibilities each land at BOTH ends meaning that we have a total of 10 extremes (instead of 8) since the 0’s which are extremes are repeated. 4 extremes at each end (40%) and 2 extremes in the middle (20%). So we don’t even need to round to get exactly the same answer as the formula.
I really have appreciated both the mathematical and common sense explanations that have made this seem much more intuitive after all.

cd
April 25, 2014 5:23 pm

Willis
This is really interesting and has had me scratching my head. Thanks!
I have tried to reason why this might be the case. The only thing I can think of is outlined below – and forgive me here for conjecture. So you say you’re using a simple red noise generator and excluding parameters that might create drift or a locally varying mean (such as crazy Hausdorff exponents etc).
When creating red noise, the easiest way (IMO) is to:
1) create white noise
2) perform forward FFT on white noise
3) apply an exponential decay as function of wave number
4) back transform FFT
Now you have the red noise (correct?).
But what I’m wondering is, that if you assume the series is autocorrelated (which it will be), then one might assume via the Wiener–Khinchin theorem, that there is an equivalence between the FFT of the series (step 3) and the FFT of the autocorrelation function (from series after step 4). Therefore, if your red noise generator is following a similar process (steps 1 to 4), then it may only use the real terms in steps 2 and 4; the cosine transform rather than the full FFT (as the autocorrelation function is symmetric). In this case, and given the nature of the red noise spectrum (step 3) the largest values (powers->amplitude) over many many runs will typically be present at lower wavenumbers. So that after transformation the start and end of your series (approximating the longest cosine wave) will typically have the highest values. I admit that your mid-series will have the lowest values and you state that you plot extremes, so this may only work if by extremes you mean highest values.
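For what it’s worth, a minimal sketch of steps 1 to 4 above, using the full FFT rather than a cosine transform (the decay rate alpha is an arbitrary choice, and whether Willis’s generator works anything like this is an open question):

import numpy as np

def red_noise_fft(n, alpha=0.02, seed=None):
    rng = np.random.default_rng(seed)
    white = rng.normal(0.0, 1.0, n)          # step 1: white noise
    spectrum = np.fft.rfft(white)            # step 2: forward FFT
    k = np.arange(spectrum.size)             # wave numbers
    spectrum *= np.exp(-alpha * k)           # step 3: exponential decay vs wave number
    return np.fft.irfft(spectrum, n)         # step 4: back-transform to the time domain

series = red_noise_fft(2000, seed=1)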

April 25, 2014 8:24 pm

WayneM asked at April 25, 2014 at 2:54 pm
“The effect may be due to the generated time series being red (as has been suggested in several of the comments). Have you tried it with a white time series to see if the result holds for that?”
Good suggestion. If Willis is right (bite my tongue!!!) then the histogram for white noise should be flat. It is:
http://electronotes.netfirms.com/ac.jpg
The figure shows red, white, and white with a (pinkish but not strictly pink) feedback of 0.8 (feedback for red is a=1.0, for white a=0). Also on the figure is the Matlab code that produced the figures, along with other options, just for documentation and to show how simple this is.

April 25, 2014 8:37 pm

cd says in part at April 25, 2014 at 5:23 pm
“……When creating red noise, the easiest way (IMO) is to: ……”
This is an easy (easier?) problem in the time domain – it is just a random walk. It is a discrete integrator or accumulator. The first two pages here give my Matlab code and a brief description:
http://electronotes.netfirms.com/AN384.pdf
I use the FFT to verify the spectrum.

Nick Stokes
April 25, 2014 9:58 pm

I think it’s related to the logic of the TOBS adjustment. Suppose you had an autocorrelated random signal and you divide into equally spaced intervals. Where would you find interval maxima?
There will be local high and low points, and if they appear in the interior, they will be counted once. But if they occur near the cuts, it’s likely that a high point will provide the maximum for two intervals (same with minima).
So it’s somewhere up to twice as likely that you’ll get extrema near the ends. Which seems to fit with Willis’ experiment.

Larry Fields
April 25, 2014 10:11 pm

Gary Pearse says:
April 24, 2014 at 4:21 pm
“This seems to be an example of Benford’s distribution, or Benford’s Law as it is sometime called.”
Gary,
I wrote a hub about Benford’s Law a couple of years ago. It includes an original theorem, which is based upon the BOGOF Principle. (Buy one, get one free.) It’s very difficult to resist the temptation of shameless self-promotion. Here’s a link.
http://larryfields.hubpages.com/hub/Frank-Benfords-Law

cd
April 26, 2014 12:35 pm

Bernie Hutchins
A random walk? What I know of random walks is that they can have a drift (as multiple Brownian motion simulations reveal). In this sense they are not necessarily stationary. This is not normally true of the algorithm I presented. But then I’m no expert on random walk algorithms/methods.
Also, I can’t see how multiple random walks would give you typically more extreme values at the start and end. Nor can I see how this would be the case using the simple algorithm I presented above – unless all my assumptions are correct. In short, there is a bias in the algorithm – why?

Frederick Michael
Reply to  cd
April 26, 2014 12:47 pm

No bias. It’s the very drift you mentioned. For example, consider only 3 points. Without loss of generality, you can start by generating the middle point, then generate a random walk “delta” to get the first point and another one for the last point.
Now, compute the probabilities of each point being the maximum. If these deltas are real-valued (so that P(delta = 0) = 0) then there’s a 50% chance that the last point is higher than the middle one and a 50% chance the first is higher (and they’re independent). Thus, the middle point has a 1/4 chance of being the max, while the endpoints each have 3/8.
That’s just an intuitive explanation but it should get you started.
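A quick simulation of this three-point argument (a sketch assuming standard-normal deltas) gives frequencies close to 3/8, 1/4, 3/8:

import numpy as np

# Middle point fixed at zero; first and last points are the middle plus an
# independent standard-normal "delta", as in the argument above.
rng = np.random.default_rng(0)
trials = 1_000_000
middle = np.zeros(trials)
first = middle + rng.normal(size=trials)
last = middle + rng.normal(size=trials)

which_max = np.stack([first, middle, last]).argmax(axis=0)
print(np.bincount(which_max) / trials)   # roughly [0.375, 0.25, 0.375]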