Extreme Times

Guest Post by Willis Eschenbach

I read a curious statement on the web yesterday, and I don’t remember where. If the author wishes to claim priority, here’s your chance. The author said (paraphrasing):

If you’re looking at any given time window on an autocorrelated time series, the extreme values are more likely to be at the beginning and the end of the time window.

“Autocorrelation” is a way of measuring how likely it is that tomorrow will be like today. For example, daily mean temperatures are highly auto-correlated. If it’s below freezing today, it’s much more likely to be below freezing tomorrow than it is to be sweltering hot tomorrow, and vice-versa.
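A tiny R sketch of what that persistence looks like numerically (illustrative values only, not real temperature data):

# Pseudo-temperatures that persist from one "day" to the next
set.seed(1)
temps <- arima.sim(model = list(ar = 0.9), n = 365)
acf(temps, lag.max = 1, plot = FALSE)   # lag-1 autocorrelation close to 0.9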

Anyhow, being a suspicious fellow, I thought “I wonder if that’s true …”. But I filed it away, thinking, I know that’s an important insight if it’s true … I just don’t know why …

Last night, I burst out laughing when I realized why it would be important if it were true … but I still didn’t know if that was the case. So today, I did the math.

The easiest way to test such a statement is to do what’s called a “Monte Carlo” analysis. You make up a large number of pseudo-random datasets which have an autocorrelation structure similar to some natural autocorrelated dataset. This highly autocorrelated pseudo-random data is often called “red noise”. Because it was handy, I used the HadCRUT global surface air temperature dataset as my autocorrelation template. Figure 1 shows a few “red noise” autocorrelated datasets in color, along with the HadCRUT data in black for comparison.

Figure 1. HadCRUT3 monthly global mean surface air temperature anomalies (black), after removal of seasonal (annual) swings. Cyan and red show two “red noise” (autocorrelated) random datasets.

The HadCRUT3 dataset is about 2,000 months long. So I generated a very long series (two million data points) as a single continuous red noise “pseudo-temperature” dataset. Of course, this two million point dataset is stationary, meaning that it has no trend over time, and that the standard deviation is stable over time.
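A minimal R sketch of that step, for readers who want to play along (the AR and MA coefficients below are illustrative placeholders, not the values actually fitted to HadCRUT3; in practice they would come from fitting an ARIMA model to the anomalies, e.g. with R's arima() function):

# Generate two million points of autocorrelated "red noise".
# Coefficients are placeholders, NOT the values fitted to HadCRUT3.
set.seed(42)
red_noise <- arima.sim(model = list(ar = c(0.5, 0.3), ma = c(0.2, 0.1)), n = 2e6)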

Then I chopped that dataset into sequential 2,000-data-point chunks, and looked at each chunk to see where the maximum and the minimum data points occurred within it. If the minimum value was the third data point, I recorded the number “3”; correspondingly, if the maximum was the next-to-last data point, it would be recorded as “1999”.
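In code, that chopping-and-locating step looks roughly like this (continuing the sketch above; the variable names are illustrative, not from the original script):

# Chop the long series into 1,000 sequential chunks of 2,000 points each,
# and record where the minimum and maximum fall within each chunk.
chunk_len <- 2000
n_chunks  <- length(red_noise) %/% chunk_len
chunks    <- matrix(red_noise[1:(n_chunks * chunk_len)], nrow = chunk_len)
min_pos   <- apply(chunks, 2, which.min)   # position of the minimum, 1 to 2000
max_pos   <- apply(chunks, 2, which.max)   # position of the maximum, 1 to 2000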

Then I made a histogram showing, across all of those chunks, how many of the extreme values fell in the first hundred data points, how many in the second hundred, and so on. Figure 2 shows that result. Individual runs of a thousand chunks vary, but the general form is always the same.

Figure 2. Histogram of the location (from 1 to 2000) of the extreme values in the 2,000-datapoint chunks of “red noise” pseudodata.
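The histogram itself can be reproduced (up to run-to-run variation) by binning those positions into hundred-point bins, roughly as follows (continuing the sketch above):

# Bin the extreme-value locations into twenty 100-point bins and plot the counts.
positions <- c(min_pos, max_pos)
bins <- cut(positions, breaks = seq(0, 2000, by = 100))
barplot(table(bins), las = 2)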

So dang, the unknown author was perfectly correct. If you take a random window on a highly autocorrelated “red noise” dataset, the extreme values (minimums and maximums) are indeed more likely, in fact twice as likely, to be at the start and the end of your window rather than anywhere in the middle.

I’m sure you can see where this is going … you know all of those claims about how eight out of the last ten years have been extremely warm? And about how we’re having extreme numbers of storms and extreme weather of all kinds?

That’s why I busted out laughing. If you say “we are living today in extreme, unprecedented times”, mathematically you are likely to be right, even if there is no trend at all, purely because the data is autocorrelated and “today” is at one end of our time window!

How hilarious is that? We are indeed living in extreme times, and we have the data to prove it!

Of course, this feeds right into the AGW alarmism, particularly because any extreme event counts as evidence of how we are living in parlous, out-of-the-ordinary times, whether hot or cold, wet or dry, flood or drought …

On a more serious level, it seems to me that this is a very important observation. Typically, we assume that the odds of an extreme occurring are equal across the time window. But as Fig. 2 shows, that’s not true. As a result, we incorrectly treat the occurrence of recent extremes as evidence that the bounds of natural variation have recently been overstepped (e.g. “eight of the ten hottest years”, etc.).

This finding shows that we need to raise the threshold for what we are considering to be “recent extreme weather” … because even if there are no trends at all we are living in extreme times, so we should expect extreme weather.

Of course, this applies to all kinds of datasets. For example, currently we are at a low extreme in hurricanes … but is that low number actually anomalous when the math says that we live in extreme times, so extremes shouldn’t be a surprise?

In any case, I propose that we call this the “End Times Effect”, the tendency of extremes to cluster in recent times simply because the data is autocorrelated and “today” is at one end of our time window … and the corresponding tendency for people to look at those recent extremes and incorrectly assume that we are living in the end times when we are all doomed.

All the best,

w.

Usual Request. If you disagree with what someone says, please have the courtesy to quote the exact words you disagree with. This avoids misunderstandings.

 

218 Comments
Scottish Sceptic
April 25, 2014 12:48 am

Michael D says: I suspect that the explanation might be as simple as follows: b) When you cut a chunk from a long-time-period Fourier component, there is a good chance that you will cut a chunk that is either increasing or decreasing throughout the chunk. … Sorry – not as simple to explain as I had hoped. A drawing would be easier.
Not a bad explanation Michael.
But perhaps the simplest explanation is that because it is a random walk, it is extremely likely that the last point of the walk will not be back where it started, and the longer the random walk, the more likely it is to be further away from the start. So the points most likely to be furthest apart are those at the beginning and those at the end of the random walk (and the average will be in the middle).
Or … to use a simple analogy … if two people are lost in a desert, and they just set off to look for each other with no idea where they are going (i.e. at random), there is far more chance of them ending up further away from each other than closer together.
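A quick R sketch of that divergence (illustrative only): the typical distance of a random walk from its starting point grows as the walk gets longer.

# 1,000 random walks of 500 steps each, one walk per column
set.seed(7)
walks <- replicate(1000, cumsum(rnorm(500)))
mean(abs(walks[100, ]))   # typical distance from the start after 100 steps
mean(abs(walks[500, ]))   # noticeably larger after 500 steps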
So, in most real world situations (not science-labs where students only seem to be taught about a very unusual type of “white noise”), random fluctuations tend to make things get further apart.
It is true of Pooh sticks (sticks thrown in a river), which tend to diverge. Gases tend to diffuse, rivers tend to change their course over geologic time, and evolution tends to make plant and animal species change over time. So, e.g., the chance of evolution spontaneously bringing about a diplodocus is vanishingly small.
So it is the norm for the beginning and end of a plot of a natural system to tend to diverge, and it is abnormal for them to stay the same.
The bigger question is not why the climate varies – because all(?) natural systems vary – but why the earth’s climate has been so remarkably stable that we are here.
And perhaps just as important: why are science students not taught about real-world noise systems, and is this why climate scientists seem incapable of understanding real-world noise?
PS. I learnt about real world noise, not within the physics degree but from my electronics degree.

DonV
April 25, 2014 1:36 am

Willis, excellent thinking and very clear explanation. However, I am curious. Does this phenomenon hold true regardless of the size of the sample window? You chose 2000. What if you chose 1000? 1717? 3000? Does it hold true if the sample chunks aren’t sequential, but rather overlap, or are chosen randomly? This is going to bug me now until someone comes up with the proof, and explains it to me.
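One way to probe that question empirically (a sketch only, not a proof, reusing the red_noise series from the sketch in the post above) is to repeat the chunking for several window lengths and compare the histograms:

# Repeat the chunking experiment for a few window lengths.
for (win in c(1000, 1717, 3000)) {
  n_win <- length(red_noise) %/% win
  m     <- matrix(red_noise[1:(n_win * win)], nrow = win)
  pos   <- c(apply(m, 2, which.min), apply(m, 2, which.max))
  hist(pos, breaks = 20, main = paste("Window length", win))
}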

Greg
April 25, 2014 2:13 am

W: Good question, David. The answer is no. Remember that this is only true for autocorrelated series, not independent random series.
So all he has to do is add the dice throws, as I replied above.
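(A quick R sketch of what summing dice throws does: it turns independent rolls into a strongly autocorrelated random-walk series.)

# Independent dice rolls have essentially no autocorrelation; their running sum does.
set.seed(5)
rolls <- sample(1:6, 2000, replace = TRUE)
walk  <- cumsum(rolls - 3.5)              # subtract the mean so the walk wanders around zero
acf(rolls, lag.max = 1, plot = FALSE)     # lag-1 autocorrelation near 0
acf(walk,  lag.max = 1, plot = FALSE)     # lag-1 autocorrelation near 1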

Greg
April 25, 2014 2:27 am

“Thanks, Greg. Actually, if you look above you’ll see that I used an ARIMA model.”
Yes, I saw that, though you did not specify what sort of ARIMA. My point was that Seattle has hit on the right formula for straight red noise, but this will be a bit different for a more complex model like the one you used.
It will certainly be a huge step in the right direction compared to the erroneous assumption of a flat distribution, and I think that viewed in that context the late 20th century will be within expected bounds. This is what Keenan was banging on about, and he eventually got an official admission into the parliamentary record via a House of Lords statement.
In fact I think his main point was that the usual red noise model was not the best choice.
Of course, none of these random statistical models allows for the constraining effect of the Planck response, but it’s a damn good step in the right direction.

Greg
April 25, 2014 2:38 am

“I used two levels of AR and MA coefficients (lag-1 and lag-2). ”
Just out of interest, could you post details of the actual model that you used?

James from Arding now Armidale
April 25, 2014 3:29 am

Prompted by Seattle’s example, and slightly off topic but maybe interesting:
https://www.wolframalpha.com/input/?i=wattsupwiththat.com+vs+www.realclimate.org+vs+www.skepticalscience.com
I keep forgetting how much fun wolframAlpha is 🙂

James from Arding now Armidale
April 25, 2014 3:30 am

And how much I learn from Anthony and Willis!

Martin A
April 25, 2014 4:33 am

Willis, what you have found empirically seems counter-intuitive, therefore something interesting to try to understand.
If a random signal is stationary, my intuition leads me to expect (rightly or wrongly) that the middle of a segment chopped from the signal should have the same characteristics as the ends of the same segment. If I have understood what you said, you have found otherwise.
You said “this two million point dataset is stationary, meaning that it has no trend over time, and that the standard deviation is stable over time.”
(1) My recollection from playing with time series years ago is that the definition of ‘stationary’ is that *all* statistics of the time series are independent of time. This is a bit different from your definition (no trend, and a standard deviation that is stable over time). For example, I could put white noise through a filter that removed a range of frequencies and then sweep the center frequency of the filter. This would result in a nonstationary time series by the definition I quoted (but a stationary one by the definition you give).
(2) I understand that you generate red noise by putting white noise through a linear filter with transfer function 1/s – i.e. a pole at the origin of the complex plane. Can the output of the filter be stationary (in the definition I give above)? [I don’t know – I’m asking.]
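One quick empirical check of that last point (an R sketch, not a full answer): the variance of integrated white noise grows with the number of steps, so by the strict definition it does not behave as a stationary process.

# Integrated white noise (a pole at the origin): its variance grows with time.
set.seed(11)
w <- replicate(2000, cumsum(rnorm(1000)))   # 2,000 realisations, 1,000 steps each
var(w[100, ])     # variance across realisations at step 100
var(w[1000, ])    # roughly ten times larger at step 1000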

Kasuha
April 25, 2014 4:48 am

“… in fact twice as likely, to be at the start and the end of your window rather than anywhere in the middle.”
____________________________
I have to disagree with this statement. I did a simple graphical analysis of your graph:
http://i.imgur.com/X0ht7Ch.png
and my conclusion is that the chance that the extremes will fall within 15% of the interval length from either edge (covering 30% of the interval length; 6 out of 20 columns) is approximately 44%. That is definitely more than the 30% which would be the case if the distribution were flat. But since 0.44 / 0.30 ≈ 1.5, it is only about one and a half times as probable as under a uniform distribution, definitely not twice.

Robany
April 25, 2014 5:45 am

I was interested to see what would happen if you used perfectly trendless, noiseless data. I think some folk above have expressed it in words but I wanted to test it empirically. The IDL code for this is:
pro extrema
  compile_opt idl2
  ; Perfectly cyclic, trendless, noiseless data: a pure sine wave
  nPoints = 2000000
  x = findgen(nPoints)
  period = 3000.0
  data = sin(2*!PI*x/period)
  ; Slide a 2000-point window along the series and tally where the
  ; maximum and minimum fall within each window
  sampleWidth = 2000
  extremes = lonarr(sampleWidth)
  for i=0, nPoints-sampleWidth-1 do begin
    sample = data[i:i+sampleWidth-1]
    sampleMax = max(sample, maxIdx, subscript_min=minIdx)
    extremes[maxIdx] += 1
    extremes[minIdx] += 1
  endfor
  ; Plot the counts of extreme locations within the window
  plot, extremes, psym=1
end
IDL because I happen to have a license for it at work and I don’t know R but I think it should be easy to translate.
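For R users, a rough translation of the same idea (untested against the IDL output, scaled down and stepped so it runs quickly) might look like this:

# Rough R sketch of the sine-wave experiment above; smaller than the IDL run,
# stepping the window by 10 points rather than 1 so it finishes quickly.
n_points     <- 200000
period       <- 3000
wave         <- sin(2 * pi * (0:(n_points - 1)) / period)
sample_width <- 2000
extremes     <- integer(sample_width)
for (i in seq(1, n_points - sample_width + 1, by = 10)) {
  win <- wave[i:(i + sample_width - 1)]
  extremes[which.max(win)] <- extremes[which.max(win)] + 1
  extremes[which.min(win)] <- extremes[which.min(win)] + 1
}
plot(extremes, pch = 3)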
For any sampleWidth less than the period, the most likely place for extremes is the beginning and end, with a flat distribution between those points. The height of the flat part relative to the ends grows as the sampleWidth grows. Any sampleWidth over the period produces odd, stepped distributions that always start high. This is an artefact of the max() function returning the index of the first maximum/minimum value where more than one identical maximum/minimum value is found. I’m sure this could be proved mathematically by someone smarter than me.
However, it underlines Willis’ point about supposed climate extrema and living in extreme times. Even for perfectly cyclic, trendless, noiseless data, if the length of your sample is shorter than the cycle period, extremes at the ends of the sample are the most probable outcome.
I haven’t looked at what happens if you add shorter cycles but it would be trivial to add higher harmonics to the base sine wave.

commieBob
April 25, 2014 5:52 am

If you’re looking at any given time window on an autocorrelated time series, the extreme values are more likely to be at the beginning and the end of the time window.

That depends on the width of the window. Wiki gives the following definition for autocorrelation:

… It is a mathematical tool for finding repeating patterns, such as the presence of a periodic signal obscured by noise, …
wiki

If the window is narrow compared to the period of the underlying signal then the window will usually not contain the signal’s peak values. In other words, the underlying signal will either have a positive or negative slope for the whole window.
On the other hand, if the window is wide enough we can easily find examples where the extreme values do not come at the beginning and end of the window.
Example – The window contains exactly one cycle of the underlying periodic signal. In that case the signal’s waveform at each end of the window will have the same value and, depending on the phase, the extreme values will be somewhere within the window.
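A one-line illustration of that last example (an R sketch): with exactly one cycle in the window, the extremes land in the interior, wherever the phase puts them.

# Exactly one cycle of a sine wave in the window: extremes sit in the interior.
x <- seq(0, 2 * pi, length.out = 1000)
w <- sin(x + 0.7)                 # arbitrary phase
c(which.max(w), which.min(w))     # neither index is 1 nor 1000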

Roger Burtenshaw
April 25, 2014 5:58 am

Is Willis’s observation an example of the following?
As soon as scientists can measure something new, then:
a) It is always bad!
b) It is always getting worse!
c) It is always caused by humanity!!
and
d) It could be fixed by throwing more money in the scientist’s direction!!!

michael hart
April 25, 2014 5:59 am

Scottish Sceptic says:
April 25, 2014 at 12:48 am
The bigger question is not why the climate varies – because all(?) natural systems vary – but why the earth’s climate has been so remarkably stable that we are here.
And perhaps just as important: why are science students not taught about real-world noise systems, and is this why climate scientists seem incapable of understanding real-world noise?
PS. I learnt about real world noise, not within the physics degree but from my electronics degree.

I think it is partly inherent in the subject. A subject whose low-frequency/long-duration variations last longer than a human’s attention span, career, or life span probably will not invest the same effort in understanding noise as electronic engineering does.
Similarly, I suspect a chemist is probably more accepting of/alert to the possibility of being wrong than a climate modeler, by virtue of experiments being much quicker. They get more experience of being wrong.
Physicians are also accustomed to having patients die on them.
The real world teaches at different speeds.

tadchem
April 25, 2014 6:05 am

If one takes random time-series (X, Y) data and breaks it into two parts – first half and second half – the mean Y values for the two halves will generally NOT be identical. This will result in a non-zero slope (statistically and physically insignificant, but nonetheless present and a real property of the data set).
The extrema of high and low Y data will most probably be distributed between the high-mean and low-mean halves accordingly. We may repeat this argument for the data series broken into fourths (quartiles), eighths (octiles), and so on – ad infinitum.
The logic of this recursion accounts for the observation that the extreme high Y values will be found nearer the opposite end of the data from the end nearer the extreme low Y values.
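A quick self-contained check of that halves argument on red noise (an illustrative R sketch):

# Split a red-noise series in half: the halves' means differ, and the maximum
# usually falls in the half with the higher mean.
set.seed(3)
y <- as.numeric(arima.sim(model = list(ar = 0.9), n = 2000))
c(mean(y[1:1000]), mean(y[1001:2000]))                             # generally not identical
(which.max(y) > 1000) == (mean(y[1001:2000]) > mean(y[1:1000]))    # usually TRUE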

Itocalc
April 25, 2014 6:20 am

Economist Eugen Slutsky showed that random processes can result in cyclic processes, such that Fourier analysis will find a cycle even though it is just random noise. The article’s title is The Summation of Random Causes as the Source of Cyclic Processes, translated into English around 1936. Here is a brief article discussing its impact on economics: http://www.minneapolisfed.org/publications_papers/pub_display.cfm?id=4348&
Perhaps this has something to say about all the oscillations connected to weather and all the speculation about future cycles in the weather.

ferdberple
April 25, 2014 6:28 am

davidmhoffer says:
April 24, 2014 at 5:56 pm
=========
This appears to be the correct solution. Cut a waveform into small enough segments and you are likely to have a min or max (extremes) at the ends.

Chris Wright
April 25, 2014 6:30 am

charles nelson says:
April 24, 2014 at 10:34 pm
“…..Has anyone subjected Warmist Climate Data to the Benford Law test?…..”
That thought has occurred to me.
However, I doubt if any climate scientists simply sat down and invented the data. It’s not necessary. They can use cherry-picking and ignore any inconvenient data, or they can ‘adjust’ the data. I imagine that, while Benford can detect purely made-up numbers, it may not detect data that has been systematically adjusted (by systematic, I mean the same adjustment was applied to all the data).
I’ve thought about it, and I think that Benford probably doesn’t apply to Willis’ fascinating findings.
One (rather boring) explanation of Willis’s findings did occur to me, other posters may have arrived at a similar explanation:
If you cut out part of a long data series, the selected section will almost certainly have an overall positive or negative trend, even if it’s random (e.g. the drunkard’s walk). So, if the trend is positive, the early numbers will tend to be lower and the later numbers will tend to be higher, and vice versa.
I’m not sure if this real effect is needed to explain some of the recent claims about ‘records’. Although there has been no global warming in this century we’re still very near the top. Therefore it’s very easy for short-term temperature excursions (which are often large) to set new records. Of course, it’s a complete scam: they know that most people, when they hear about new records being set, will assume the climate is still warming, when of course it isn’t.
Records should have no place in science. The only thing that matters is the trend.
By the way, for anyone not familiar with Benford’s Law, I suggest you Google it – now.
Chris

ferdberple
April 25, 2014 6:31 am

tadchem says:
April 25, 2014 at 6:05 am
===========
Also the correct solution. As you slice the segments smaller you increase the odds of the extremes being at the ends.
