Extreme Times

Guest Post by Willis Eschenbach

I read a curious statement on the web yesterday, and I don’t remember where. If the author wishes to claim priority, here’s your chance. The author said (paraphrasing):

If you’re looking at any given time window on an autocorrelated time series, the extreme values are more likely to be at the beginning and the end of the time window.

“Autocorrelation” is a way of measuring how likely it is that tomorrow will be like today. For example, daily mean temperatures are highly autocorrelated. If it’s below freezing today, it’s much more likely to be below freezing tomorrow than it is to be sweltering hot tomorrow, and vice versa.

Anyhow, being a suspicious fellow, I thought “I wonder if that’s true …”. But I filed it away, thinking, I know that’s an important insight if it’s true … I just don’t know why …

Last night, I burst out laughing when I realized why it would be important if it were true … but I still didn’t know if that was the case. So today, I did the math.

The easiest way to test such a statement is to do what’s called a “Monte Carlo” analysis. You make up a large number of pseudo-random datasets which have an autocorrelation structure similar to some natural autocorrelated dataset. This highly autocorrelated pseudo-random data is often called “red noise”. Because it was handy, I used the HadCRUT global surface air temperature dataset as my autocorrelation template. Figure 1 shows a few “red noise” autocorrelated datasets in color, along with the HadCRUT data in black for comparison.

Figure 1. HadCRUT3 monthly global mean surface air temperature anomalies (black), after removal of seasonal (annual) swings. Cyan and red show two “red noise” (autocorrelated) random datasets.

The HadCRUT3 dataset is about 2,000 months long. So I generated a very long string (two million data points) as a single continuous long red noise “pseudo-temperature” dataset. Of course, this two million point dataset is stationary, meaning that it has no trend over time, and that the standard deviation is stable over time.

Then I chopped that dataset into sequential 2,000 data-point chunks, and I looked at each 2,000-point chunk to see where the maximum and the minimum data points occurred in that 2,000 data-point chunk itself. If the minimum value was the third data point, I put down the number as “3”, and correspondingly if the maximum was in the next-to-last datapoint it would be recorded as “1999”.

Then I made a histogram showing, across all of those chunks, how many of the extreme values fell in the first hundred data points, the second hundred, and so on. Figure 2 shows that result. Individual runs of a thousand chunks vary, but the general form is always the same.
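The procedure described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the code behind the figures: it assumes a simple AR(1) process as the “red noise” (the coefficient `phi = 0.99` is an illustrative guess, not fitted to HadCRUT), and it uses a shorter series and shorter chunks so that it runs quickly. The function names are made up for illustration.

```python
import random

def red_noise(n, phi=0.99, seed=0):
    """AR(1) series x[t] = phi * x[t-1] + white noise (phi is an assumed value)."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def extreme_positions(series, chunk_len):
    """1-based positions of the min and the max inside each sequential chunk."""
    positions = []
    for start in range(0, len(series) - chunk_len + 1, chunk_len):
        chunk = series[start:start + chunk_len]
        positions.append(chunk.index(min(chunk)) + 1)
        positions.append(chunk.index(max(chunk)) + 1)
    return positions

def histogram(positions, chunk_len, n_bins):
    """Count how many positions fall in each of n_bins equal-width bins."""
    counts = [0] * n_bins
    for p in positions:
        counts[(p - 1) * n_bins // chunk_len] += 1
    return counts

# 1,000 chunks of 200 points each (scaled down from 1,000 chunks of 2,000).
series = red_noise(200_000, phi=0.99, seed=1)
positions = extreme_positions(series, chunk_len=200)
counts = histogram(positions, chunk_len=200, n_bins=10)
print(counts)  # a flat histogram would show about 200 per bin
```

If the effect is real, the first and last bins should stand well above the middle ones. Changing `phi` or `chunk_len` is a quick way to see how sensitive the shape is to the assumed autocorrelation.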

Figure 2. Histogram of the location (from 1 to 2000) of the extreme values in the 2,000 datapoint chunks of “red noise” pseudodata.

So dang, the unknown author was perfectly correct. If you take a random window on a highly autocorrelated “red noise” dataset, the extreme values (minimums and maximums) are indeed more likely, in fact twice as likely, to be at the start and the end of your window rather than anywhere in the middle.

I’m sure you can see where this is going … you know all of those claims about how eight out of the last ten years have been extremely warm? And about how we’re having extreme numbers of storms and extreme weather of all kinds?

That’s why I busted out laughing. If you say “we are living today in extreme, unprecedented times”, mathematically you are likely to be right, even if there is no trend at all, purely because the data is autocorrelated and “today” is at one end of our time window!

How hilarious is that? We are indeed living in extreme times, and we have the data to prove it!

Of course, this feeds right into the AGW alarmism, particularly because any extreme event counts as evidence of how we are living in parlous, out-of-the-ordinary times, whether hot or cold, wet or dry, flood or drought …

On a more serious level, it seems to me that this is a very important observation. Typically, we consider the odds of being in extreme times to be equal across the time window. But as Fig. 2 shows, that’s not true. As a result, we incorrectly consider the occurrence of recent extremes as evidence that the bounds of natural variation have recently been overstepped (e.g. “eight of the ten hottest years”, etc.).

This finding shows that we need to raise the threshold for what we are considering to be “recent extreme weather” … because even if there are no trends at all we are living in extreme times, so we should expect extreme weather.

Of course, this applies to all kinds of datasets. For example, currently we are at a low extreme in hurricanes … but is that low number actually anomalous when the math says that we live in extreme times, so extremes shouldn’t be a surprise?

In any case, I propose that we call this the “End Times Effect”, the tendency of extremes to cluster in recent times simply because the data is autocorrelated and “today” is at one end of our time window … and the corresponding tendency for people to look at those recent extremes and incorrectly assume that we are living in the end times when we are all doomed.

All the best,

w.

Usual Request. If you disagree with what someone says, please have the courtesy to quote the exact words you disagree with. This avoids misunderstandings.

 

218 Comments
Asher
April 26, 2014 12:41 pm

It doesn’t have to be a stationary series. It works just as well with a random walk. Try this in R:
rw1<-cumsum(abs(rnorm(100000))*2*(runif(100000)-.5)) # Random walk with random step size
extremelist<-NULL # Vector of positions of extreme values – max and min of window
for(window in 1:1000)
{
vals<-rw1[(10*window-9):(10*window)] # Window of ten values
extremelist<-c(extremelist,which.min(vals),which.max(vals)) # Add positions of new extrema
}
hist(extremelist)
Again you find the extrema at the ends.

cd
April 26, 2014 12:55 pm

Nick Stokes
“Suppose you had an autocorrelated random signal and you divide into equally spaced intervals”
This may be Willis’ experiment, but that is not what is stated in the opening quote, which suggested that even if one binned the extreme values for the entire series (the window length is equal to the series length) for a suite of runs, one would find the extreme values at the end and beginning. If the actual experiment is that the time window has to be less than the total time series, then by how much, and is the effect sensitive to window size?
“So it’s somewhere up to twice as likely that you’ll get extrema near the ends.”
And if you’re right then the same should be true for unremarkable results. Also, the extreme would be binned, for consecutive windows, at either end of the histogram range, so would be “counted” only once for end and beginning. In short, would they not need to be consistently at the beginning and ends to give the “symmetric” Fig. 2?

cd
April 26, 2014 1:17 pm

Frederick Michael
Thanks for the description, I may be being unfair here but describing how you might get such a result by using a specific Markov Process seems very contrived. More importantly, I don’t think that this provides a reason.
If we repeat Willis’ experiment for a window of say one tenth the length of the time series BUT move it continuously through the series (remember ANY window) and bin the extreme values as Willis has done, then why on Earth would one find the extremes at the ends?
Also, ANY window means full series too, which as suggested would give typically extreme values at the end.
BTW I haven’t tried to repeat the experiment.

Frederick Michael
Reply to  cd
April 26, 2014 2:42 pm

CD, think about an extension of the three-point example I gave. Suppose you have N points and you add point N+1. If point N was the max, the new point has a 50% chance of taking its title as the max. In the limit, as N gets large, the probability that the new N+1th point is the max should approach the probability that point N was the max before. So, in the limit, the probability that the endpoint is the max is twice the probability that the point next to it is the max.
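The two-to-one argument above can be checked numerically. The sketch below is an illustration only and assumes a plain random walk with independent Gaussian steps, which is one reading of the “50% chance” step in the argument; the function name is hypothetical. It counts how often the maximum of the walk sits at the last point versus the next-to-last point.

```python
import random

def argmax_position_counts(n_steps, n_trials, seed=0):
    """Count how often each index 0..n_steps holds the maximum of a Gaussian random walk."""
    rng = random.Random(seed)
    counts = [0] * (n_steps + 1)
    for _ in range(n_trials):
        level, best, best_idx = 0.0, 0.0, 0  # the walk starts at 0, which is index 0
        for i in range(1, n_steps + 1):
            level += rng.gauss(0.0, 1.0)
            if level > best:
                best, best_idx = level, i
        counts[best_idx] += 1
    return counts

counts = argmax_position_counts(n_steps=50, n_trials=20_000, seed=2)
ratio = counts[-1] / counts[-2]
print(f"max at last point: {counts[-1]}, at next-to-last: {counts[-2]}, ratio {ratio:.2f}")
```

A ratio near 2 would be consistent with the argument. A mean-reverting series would need its own check, since the 50% step no longer holds exactly there.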

Bernie Hutchins
April 26, 2014 2:05 pm

Willis spoke precisely of (in his opening paraphrase) “an autocorrelated time series”. I think we do not need the “auto” part of that – it is just correlated (a property – as opposed to uncorrelated). Red noise (brown, integrated white, random walk – all the same) is the first such example that comes to mind. Keep in mind that what Willis plots is NOT an autocorrelation, but a histogram of occurrences of the maximum of many sequences with a PRE-EXISTING correlated property. With a red noise sequence of any length, there is always a frequency of period longer than the chosen length that is not only present but stronger than any “wiggles” we suppose we are seeing in the particular window (instance). This, in many cases, trends the segment to tip up or down. Hence the extremes at the ends. Lovely! Obvious – when we have someone like Willis to point at it! Here was my repeat of the experiment as I posted above.
http://electronotes.netfirms.com/ac.jpg

itocalc
Reply to  Bernie Hutchins
April 26, 2014 2:26 pm

I will attempt an intuitive explanation: Take the end points of any time series of any length and draw a line connecting the two end points. In the absence of other information (such as 2.3 cycles occurring between the endpoints), every point on the line connecting the endpoints is both higher than one endpoint and lower than the other. The line serves as the expected value of the series at a point in time. Since every expected value is both higher than one endpoint and lower than the other, it is not surprising to find the extreme points at the endpoints. It is the most likely outcome by the definition of expectation, though by no means certain. The length of the interval and standard deviation of the process will influence the likelihood of getting endpoint extrema.

Nick Stokes
April 26, 2014 2:22 pm

cd says: April 26, 2014 at 12:55 pm
“even if one binned the extreme values for the entire series (the window length is equal to the series length)”

The concept is that of a stationary random process. There isn’t a series length. You just observe windows. The series should be unaffected by the window you choose. And one way of choosing a window is to consider first a periodic dissection, then choose a single period. The frequency should not depend on how the window was chosen.
In a periodic dissection, you can see how maxima that occur near the cuts have a better chance (almost double – both sides) of appearing as an interval max. So when you then select one of the periodic intervals, the chance is thus biased.
“And if you’re right then the same should be true for unremarkable results.”
No. The point of maxima is that there is only one per interval. That restriction creates the probability difference. There is no restriction on the number of unremarkable results.

cd
April 26, 2014 2:24 pm

Bernie
Firstly, as should have been implicit in the algorithm as I stipulated, one can choose a range of exponents to create different red noise signals. All this changes is the range of the autocorrelation, and red noise is autocorrelated (if created using the method being used). Secondly, as for your use of “just correlated”, I’m not sure what you mean. Autocorrelation suggests that the degree of correlation between two sets (both derived from the same series) is a function of the lag (the distance between the sampled pairs used to compute the correlation/covariance). Beyond a certain lag distance, this correlation breaks down and the degree of correlation stops being a function of lag (where the autocovariance = 0.5*variance of the entire series: is this what you mean by beyond the “wiggles”?).
Now what your point has to do with this I’m not sure. Can you answer, why would a continuous moving window (or all the data for series of runs) have predominance of extremes at the start and end?

cd
April 26, 2014 2:43 pm

itocalc
“end points of any time series”
Do you mean the entire set as well? Then…
“every point on the line connecting the endpoints is both higher than one endpoint and lower than the other”
Obviously not if the end points share the same value. But I take your point.
“The line serves as the expected value of the series at a point in time.”
Not if the series is stationary. If your two end points have different values then the line will have a gradient. The expected value of any point in a stationary set is the mean; it does not change across the series.
“since every expected value is both higher than one endpoint and lower than the other, it is not surprising to find the extreme points at the endpoints.”
This seems confused – I’m not saying it is, I’ve probably misunderstood.
For a stationary set, the expected value at all points is the same – the mean.
“The length of the interval and standard deviation of the process will influence the likelihood of getting endpoint extrema.”
Again this doesn’t follow, unless you’re suggesting that the autocorrelation function of the set shows that the correlation is dependent on the lag for all possible lags, in which case the series is not stationary! The standard deviation is immaterial in this respect.

itocalc
Reply to  cd
April 26, 2014 3:42 pm

cd, Thank you for the critique.
I was not clear about the level versus the changes in the level. If we know the difference in level from the first measurement to the end measurement, and have no other information, then the expectation would fall on a line connecting the points (all changes are presumed a result of noise). If the changes alone are plotted, and increments are independent (Markov process) then obviously there should be no relationship between beginning and end points. My thoughts were along the line of a Brownian bridge, which at one time I could discuss in all confidence, but even at my young age (not quite 50) my memory fades. The effects of autocorrelation just bring out the dumb in me and it is in my best interest to stay silent.
Assuming independent increments, the difference between the endpoint values as measured in standard errors, and the number of points between endpoints will influence the probability of crossing the endpoint values during the process. This is all looking backward, with expectations of interim levels being conditioned on the endpoints values (again thinking through the lens of a Brownian bridge).

Bernie Hutchins
April 26, 2014 2:47 pm

cd –
You need to run some code of your own. It’s simple and you will know exactly what was done because you yourself did it.
You said as well: “Autocorrelation suggests that the degree of correlation between two sets (both derived from the same series) is a function of the lag (the distance between the sampled pairs used to compute the correlation/covariance).”
I think you have described a cross-correlation since you derive both as sub-sequences from the same (presumably much longer) time sequence. If I have a sequence of length 1000 and I correlate samples 200-299 with samples 550 to 649, this is a cross-correlation. The process does not “know” whether it is being correlated with a later part of its own self, or perhaps sunspot numbers. If you insist on autocorrelation, you must do the whole sequence.
And, I emphasize, correlation (or not) is a pre-existing property of the series we are looking at (like red, white, pink) and we are examining it in the time domain – just inspecting the samples. No one is computing any sort of correlation anyway.
This is much simpler than you are making it – I think.

Gary Pearse
April 26, 2014 2:54 pm

Larry Fields says:
April 25, 2014 at 10:11 pm
“Gary Pearse says:
April 24, 2014 at 4:21 pm
“This seems to be an example of Benford’s distribution, or Benford’s Law as it is sometimes called.”
Gary,
I wrote a hub about Benford’s Law a couple of years ago. It includes an original theorem, which is based upon the BOGOF Principle. (Buy one, get one free.) It’s very difficult to resist the temptation of shameless self-promotion. Here’s a link.
http://larryfields.hubpages.com/hub/Frank-Benfords-Law
Thanks Larry, enjoyed it.

Nick Stokes
April 26, 2014 3:04 pm

Here’s another version of the problem. Say weeks start on Sunday. What’s the chance of the warmest day of the week falling on a Sunday?
It’s up to twice the chance of a Wednesday, even though Nature cares nothing for calendars. It’s likely that the max was part of a warm spell. Over a year, some warm spells will occur midweek, and be counted as one max. But some will occur at weekends, and will provide maxima for two weeks. Sundays aren’t warmer per se, but will show up more often in the statistics.

cd
April 26, 2014 3:05 pm

Nick Stokes says:
April 26, 2014 at 2:22 pm
“isn’t a series length. You just observe windows”
There is in the above experiment: the series is dissected into sub-windows. Your explanation needs to address why one would get the same results if you use both discrete windows and continuous windows on the same series. I can’t see how it can.
“The series should be unaffected by the window you choose. And one way of choosing a window is to consider first a periodic dissection, then choose a single period. The frequency should not depend on how the window was chosen.”
Don’t follow, but then I don’t know what you mean by periodic dissection.
“No. The point of maxima is that there is only one per interval. That restriction creates the probability difference. There is no restriction on the number of unremarkable results.”
No, this is wrong. For a continuous variable, there will be one value that is closest to the mean of the set. This provides an indicator statistic (0/1) as with min and max.

cd
April 26, 2014 3:14 pm

Bernie Hutchins says:
April 26, 2014 at 2:47 pm
“You need to run some code of your own. It’s simple and you will know exactly what was done because you yourself did it.”
Maybe. But I have very little time and was hoping for something more akin to a technical link (maths), as being able to reproduce the same kind of results does not explain why.
“I think you have described a cross-correlation since you derive both as sub-sequences from the same (presumably much longer) time sequence.”
No, I’ve described an autocorrelation. Your bivariate statistic comes from the same series. Cross-correlation samples two different series.
“And, I emphasize, correlation (or not) is a pre-existing property of the series we are looking at (like red, white, pink) and we are examining it in the time domain – just inspecting the samples. No one is computing any sort of correlation anyway.”
I never said they did. Autocorrelation is a product of any Markov Process such as a random walk, as each new value is conditioned on a previous result (did you not mention random walk?).
“This is much simpler than you are making it – I think.”
It may be, and if Willis is right, which I have no reason to doubt, no one has explained why yet. But thanks for your efforts.

Nick Stokes
April 26, 2014 3:31 pm

cd,
“Don’t follow but then I don’t know what you mean by periodic dissection.”
Think of my Sunday example. You have at some location an essentially endless set of daily maxima. You ask – what is the chance of a 7-day period starting with a max for those seven days?
So you think of the records divided into weeks (periodic dissection). Might as well start Sunday. Pick a random week. It’s part of a population of weeks. And by my argument above, there will be more Sunday max’s than Wednesday max’s in that population. Also more Sunday min’s.

cd
April 26, 2014 3:47 pm

Nick Stokes
“But some will occur at weekends, and will provide maxima for two weeks. Sundays aren’t warmer per se, but will show up more often in the statistics.”
Sorry, if the week runs from Monday to Sunday, why would a warmest day on Sunday count as two, while if it occurred on Wednesday it would count as one? If you’re suggesting the heat carries over (for an autocorrelated series this is a fair assumption), then the warmest day would be at the start, not the end, of the week. So unless your warmest days are more likely on a weekend or at the end/start of the window, your explanation doesn’t work.
In short, I don’t think that works at all. For example, your argument would mean that for the same year (and the exact same record), if we decided to start our dissections on a Wednesday, the result would change.
Personally, I think the result depends on the window size. If your first-order statistic (the mean) is only stationary for a given sub-sampled window length (i.e. the sample mean is invariant under translation for windows above a certain size), then windows smaller than this size will likely sit in part of the series with a local trend. This drift will run across the window, so that the high and low values fall at either side of the window. Therefore, this result only holds for certain window sizes.
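The window-size question raised here is easy to probe numerically. The following is a rough sketch, not anyone’s actual method from this thread: it assumes an AR(1) series with `phi = 0.95` (an arbitrary choice), cuts it into sequential windows of several lengths, and reports what fraction of window minima and maxima land in the first or last 10% of the window. The helper names are hypothetical.

```python
import random

def red_noise(n, phi, seed=0):
    """AR(1) series x[t] = phi * x[t-1] + white noise (phi is an assumed value)."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def edge_share(series, window_len, edge_frac=0.1):
    """Fraction of window minima and maxima found in the first or last edge_frac of the window."""
    edge = max(1, int(window_len * edge_frac))
    hits = total = 0
    for start in range(0, len(series) - window_len + 1, window_len):
        w = series[start:start + window_len]
        for idx in (w.index(min(w)), w.index(max(w))):
            total += 1
            if idx < edge or idx >= window_len - edge:
                hits += 1
    return hits / total

series = red_noise(600_000, phi=0.95, seed=3)
shares = {length: edge_share(series, length) for length in (20, 200, 2000)}
for length, share in shares.items():
    print(f"window {length:5d}: edge share {share:.3f} (0.200 if uniform)")
```

If the edge share drifts toward 0.2 as the windows get long compared with the correlation length, that would be consistent with the window-size dependence suggested above. A process with longer memory than AR(1) may well behave differently.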

Bernie Hutchins
April 26, 2014 3:48 pm

cd –
(1) My concern about auto- vs cross- is that these two terms apply to the way a correlation is done. There is only correlation. The process of breaking a length 1000 random sequence into two length 100 sub-segments and correlating these, as I described, is a cross-correlation. But that is just terminology.
(2) The reason you need to take the 20 minutes to write some code is that if you don’t, and the results look fishy (as they apparently do to you), you won’t be clear on (A) exactly WHAT was done by someone else or (B) what the results MEAN. So where is the fish? Writing your own experiment eliminates (A) and allows you to immediately explore the inevitable “What if we were to….” questions. The word “obviously” may then also come up in your mind. Doing this was very useful for me.
(3) If you look at red signals, I think you won’t have the slightest doubt about the essential “Why”. The exact mathematics is an issue beyond that.
Best wishes.

Nick Stokes
April 26, 2014 4:07 pm

I happened to have on hand Melbourne daily max from May 1855 to Nov 2013. I counted the days on which the weekly max occurred (omitting 17 weeks with missing readings). The results were:
Sunday 1657
Monday 1185
Tuesday 896
Wednesday 814
Thursday 917
Friday 1224
Saturday 1581

cd
April 26, 2014 4:15 pm

Bernie
“My concern about auto- vs cross- is that these two terms apply to the way a correlation is done. There is only correlation. The process of breaking a length 1000 random sequence into two length 100 sub-segments and correlating these, as I described, is a cross-correlation. But that is just terminology.”
Sorry Bernie this is just wrong (and confused)…you don’t have to take my word for it:
http://coral.lili.uni-bielefeld.de/Classes/Summer96/Acoustic/acoustic2/node18.html
“The reason you need to take the 20 minutes to write some code is that if you don’t, and the results look fishy (as they apparently do to you), you won’t be clear on (A) exactly WHAT was done by someone else or (B) what the results MEAN.”
I never said the results look fishy. I don’t need to write any code to understand what has been done. Willis has spelt out exactly what he did.
But since you’re getting quite “sanctimonious” I have written code to do this sort of work (as part of my job) all it would need is a simple executable with a single wrapper function to put it all together and repeat his experiment. But then I’m not at work – I’m at home now and don’t want to do it, particularly as I am assuming Willis is correct. And when I’m at work and I have some down time from time-to-time I don’t want to start writing code for every technical issue raised on a blog. And again, writing and building that executable will not explain why he’s getting his results without spending even more time on it.
“Doing this was very useful for me.”
In what respect? You haven’t been able to explain why this is the case.
“If you look at red signals, I think you won’t have the slightest doubt about the essential “Why”. The exact mathematics is an issue beyond that.”
No that doesn’t help.

cd
April 26, 2014 4:21 pm

Nick

I happened to have on hand Melbourne daily max from May 1855 to Nov 2013. I counted the days on which the weekly max occurred (omitting 17 weeks with missing readings). The results were:
Sunday 1657
Monday 1185
Tuesday 896
Wednesday 814
Thursday 917
Friday 1224
Saturday 1581

That is a very interesting meteorological result, but it doesn’t prove your point, quite the opposite. If I were to dissect my time series from mid-week to mid-week then the result would look like this:
Wednesday 814
Thursday 917
Friday 1224
Saturday 1581
Sunday 1657
Monday 1185
Tuesday 896
The days with the most weekly maxima are then in the middle of the window, not at the ends.

Nick Stokes
April 26, 2014 4:40 pm

cd,
“That is a very interesting meteorological result, but it doesn’t prove your point, quite the opposite. If I were to dissect my time series from mid-week to mid-week then the result would look like this:”
No, it has nothing to do with meteorology. It describes the Sun-Sat max. If you shifted to mid-week, it changes. In fact, counting with Wed as the first day:
Wed 1570
Thu 1084
Fri 885
Sat 867
Sun 1020
Mon 1287
Tue 1561
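The same comparison can be tried on synthetic data, where no weekday can be special. The sketch below is an illustration under assumptions, not the Melbourne calculation: the “daily max” series is AR(1) noise with an arbitrary `phi = 0.8`, day 0 is arbitrarily labelled a Sunday, and the function names are invented. It counts which position in the week holds the weekly max, first for Sunday-start weeks and then for Wednesday-start weeks cut from the same series.

```python
import random

DAYS = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]

def synthetic_daily_max(n_days, phi=0.8, seed=0):
    """AR(1) stand-in for a daily series (phi is an assumed value, not fitted to any record)."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n_days):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def weekly_max_counts(series, first_day):
    """Per position in the week, count how often that day holds the weekly max.

    Day 0 of the series is labelled Sunday; weeks start on DAYS[first_day].
    """
    counts = [0] * 7
    for start in range(first_day, len(series) - 6, 7):
        week = series[start:start + 7]
        counts[week.index(max(week))] += 1
    return counts

series = synthetic_daily_max(7 * 8000, phi=0.8, seed=4)
sunday_start = weekly_max_counts(series, 0)
wednesday_start = weekly_max_counts(series, 3)
for first_day, counts in ((0, sunday_start), (3, wednesday_start)):
    labels = DAYS[first_day:] + DAYS[:first_day]
    print(", ".join(f"{d} {c}" for d, c in zip(labels, counts)))
```

If the peaks follow the cut points rather than the day names, the pattern comes from where the week is cut and not from the weather.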

Bernie Hutchins
April 26, 2014 5:06 pm

cd –
After I said that the issue of cross- vs auto- is a matter of terminology, you provide a link to definitions! Not only do I know the definitions, I know what they MEAN.
Please apply your definitions to the following two sequences:
w1 = -0.3708 0.8942 0.0703 0.4039 0.7501
w2 = 0.6957 0.0537 0.6148 -0.2130 0.9235
These may, or may not, be extracted from a longer sequence:
w3 = -0.3254 -0.3708 0.8942 0.0703 0.4039 0.7501 0.9725 0.7706 -0.1903 -0.2291 0.6957 0.0537 0.6148 -0.2130 0.9235 -0.9398 0.9075
Is the correlation between w1 and w2 auto- or cross-? Would the results be different, or tell you anything?
As for a computer package, you hardly need anything like that. Have you looked at my Matlab code? Not the details, you don’t even have to know Matlab to see that it is simple, short, and that just about anyone could read it (like BASIC). No fancy functions, only a dozen lines, since most of it is commented out or for display.
Sorry if I am sounding sanctimonious to you. Apologies if I have crossed the line between persistently trying to be helpful (a habit as an educator) and showing impatience with a lack of progress.

RACookPE1978
Editor
April 26, 2014 5:19 pm

Nick Stokes says:
April 26, 2014 at 4:40 pm (replying to)

cd,
“That is a very interesting meteorological result, but it doesn’t prove your point, quite the opposite. If I were to dissect my time series from mid-week to mid-week then the result would look like this:”

No, it has nothing to do with meteorology. It describes the Sun-Sat max. If you shifted to mid-week, it changes. In fact, counting with Wed as the first day …

Ah, but your effort does prove Nick’s point, and – somewhat to my surprise, the infamous time-of-observation-bias (TOBS) that has so corrupted the surface station old records: The two ends of the data stream (Sunday’s high and Saturday’s high) ARE higher because they “pick up” hot days left over from a hot Friday preceding a hot Saturday, and a hot Sunday followed by a hot Monday.
So, what would your data look like if you plotted – not a “weekly high” but a monthly high as a function of “day”? With a constantly changing length of month, and a constantly varying number and length of the hot and cold fronts across a long month, you will not see any difference in day-of-week.

Nick Stokes
April 26, 2014 5:34 pm

“So, what would your data look like if you plotted – not a “weekly high” but a monthly high as a function of “day”?”
Well, day of week is an unrelated cycle, so nothing expected there. There would be an end-of-month effect, but confounded with the annual cycle, with which months and weather are aligned.
TOBS hasn’t corrupted records. You can still get data without TOBS adjustment. But it’s clearly biased.