Uh oh, a significant error spotted in the just released IPCC AR5 SPM

From the “(pick one: 90%, 95%, 97%) certainty” department comes this oopsie:

Via Bishop Hill:

=============================================================

Doug Keenan has just written to Julia Slingo about a problem with the Fifth Assessment Report (see here for context).

Dear Julia,

The IPCC’s AR5 WGI Summary for Policymakers includes the following statement.

The globally averaged combined land and ocean surface temperature data as calculated by a linear trend, show a warming of 0.85 [0.65 to 1.06] °C, over the period 1880–2012….

(The numbers in brackets indicate 90%-confidence intervals.)  The statement is near the beginning of the first section after the Introduction; as such, it is especially prominent.

The confidence intervals are derived from a statistical model that comprises a straight line with AR(1) noise.  As per your paper “Statistical models and the global temperature record” (May 2013), that statistical model is insupportable, and the confidence intervals should be much wider—perhaps even wide enough to include 0°C.

It would seem to be an important part of the duty of the Chief Scientist of the Met Office to publicly inform UK policymakers that the statement is untenable and the truth is less alarming.  I ask if you will be fulfilling that duty, and if not, why not.

Sincerely, Doug

============================================================
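For readers wondering what “a straight line with AR(1) noise” looks like in practice, here is a minimal sketch of the usual effective-sample-size adjustment such a model implies for a trend confidence interval. This is an illustration only, not the IPCC’s or Keenan’s actual calculation: the data are synthetic, and the trend, noise level, and lag-1 autocorrelation of 0.6 are placeholder values.

```python
# Illustrative sketch only: a linear trend fitted by OLS, with the slope's
# confidence interval widened for AR(1) residuals via the usual
# effective-sample-size adjustment. Synthetic data, placeholder parameters.
import numpy as np

rng = np.random.default_rng(0)

years = np.arange(1880, 2013)          # 133 annual values, like 1880-2012
n = years.size
true_trend = 0.0064                    # degC per year (placeholder)
noise = np.zeros(n)
for k in range(1, n):                  # AR(1) noise, lag-1 autocorrelation 0.6
    noise[k] = 0.6 * noise[k - 1] + rng.normal(scale=0.1)
temps = true_trend * (years - years[0]) + noise

# OLS trend.
X = np.column_stack([np.ones(n), years - years.mean()])
beta, *_ = np.linalg.lstsq(X, temps, rcond=None)
resid = temps - X @ beta
slope = beta[1]

# Naive (white-noise) standard error of the slope.
s2 = resid @ resid / (n - 2)
se_white = np.sqrt(s2 / np.sum((years - years.mean()) ** 2))

# AR(1) adjustment: replace n by the effective sample size n*(1-r1)/(1+r1).
r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
n_eff = n * (1 - r1) / (1 + r1)
se_ar1 = se_white * np.sqrt((n - 2) / max(n_eff - 2, 1.0))

total = slope * (n - 1)                          # change over the whole record
half = 1.645 * se_ar1 * (n - 1)                  # ~90% interval half-width
print(f"total change: {total:.2f} [{total - half:.2f} to {total + half:.2f}] degC (90% CI)")
```

The width of the resulting interval depends heavily on the assumed error model, which is exactly Keenan’s point: a model with more persistence than AR(1) gives a wider interval still.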

To me, this is just more indication that the 95% number claimed by the IPCC wasn’t derived mathematically, but was a consensus of opinion, as was done last time.

Your article asks “Were those numbers calculated, or just pulled out of some orifice?” They were not calculated, at least if the same procedure from the Fourth Assessment Report was used. In that prior climate assessment, buried in a footnote in the Summary for Policymakers, the IPCC admitted that the reported 90% confidence interval was simply based on “expert judgment”, i.e., conjecture. This, of course, raises the question of how any human being can have “expertise” in attributing temperature trends to human causes when there is no scientific instrument or procedure capable of verifying the expert attributions.

The IPCC's new certainty is 95%. What? Not 97%?

So it was either that, or it was a product of sleep deprivation, as the IPCC vice chair illustrated today:

[Image: tweet from the IPCC vice chair about being sleep-deprived]

There’s nothing like sleep-deprived groupthink under deadline pressure to instill confidence, right?

144 Comments
Bart
October 2, 2013 10:47 pm

Lund@hotmail.com says:
October 2, 2013 at 8:44 pm
“Da**it, dude: we are not [retarded].”
I think you may be…
“Bart: you make no sense to me.”
I expect not. You are delving into a new discipline you obviously know nothing about, and your first reaction is to deny there is anything more for you to learn. That’s a really… shall we be nice and say imprudent?… thing to do.
“Rather, I find your frequency domain gibberish (j in the states) non-cool.”
That pretty much says it all. You don’t do frequency domain. Got it.
“But what did you say: random walks are subject to high -frequency noise cancellation? Please tell us more.”
The roll-off of gain with frequency due to integration is one of the most elementary control actions there is. In Laplace notation, the transfer function of an integrator is 1/s, where s is the Laplace variable. Evaluating that function at s = j*w, where w is the radial frequency and j is the square root of -1, the gain falls off as the reciprocal of frequency, with a phase shift of -90 degrees, due to the j in the denominator.
It is trivial to see how this affects, e.g., sinusoids. The integral of cos(w*t) is (1/w)*sin(w*t). The higher the frequency, the smaller its integrated amplitude.
This is so absurdly simple, it is utterly amazing that you would stick your neck out without first asking politely for clarification. It is beyond basic.
But then, there’s a good reason Clemson U was never on my short list for schools I would attend. Stick to football, guys. Mathematics, apparently, isn’t your strong suit.
I tell you what, here is a tutorial on PID control design from a real school. PID stands for “proportional-integral-derivative”, and it is just about the most basic and widespread control technique available. There, you will see some discussion of the frequency response of an open loop with an integral control element.
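For concreteness, here is a quick numerical check of the 1/w attenuation described above (an illustrative Python sketch added by the editor, not part of the original comment): integrating cos(w*t) gives an output whose amplitude falls off as 1/w.

```python
# Illustrative sketch: numerically integrate cos(w*t) and confirm that the
# output amplitude falls off as 1/w -- an integrator attenuates high frequencies.
import numpy as np

t = np.linspace(0.0, 200.0, 200_001)
dt = t[1] - t[0]

for w in (0.5, 1.0, 2.0, 4.0, 8.0):
    y = np.cumsum(np.cos(w * t)) * dt                     # crude running integral
    amp = 0.5 * (y[100_000:].max() - y[100_000:].min())   # skip the start-up
    print(f"w = {w:4.1f}   integrated amplitude = {amp:.3f}   1/w = {1.0 / w:.3f}")
```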

Bart
October 3, 2013 12:34 am

I doubt this doofus calling himself “Lund” is actually Dr. Robert Lund of Clemson. Nobody granted such a position would make such a fool of himself, unless things at Clemson are even worse than most people assume.
So, for whatever person who has appropriated his title, before you say something even stupider like “what does the frequency response of an integral have to do with the frequency content of a random walk,” I will again point out that a Gaussian random walk is equivalent to (and often the result of) sampling the output of an integrating process fed by wideband noise, in the limit as that input noise bandwidth approaches infinity, i.e., the standard Wiener Process.
Random walks look like the top plot here. Or, the plot here, for a non-Gaussian case. As is plainly evident, higher frequency motion is attenuated in each of these cases, in the latter because discrete accumulation, like continuous integration, attenuates higher frequencies.
This property is also immediately evident from the autocorrelation function, which I provided previously for the sampled data case: E{x(t_k)*x(t_n)} = sigma^2 * min(t_k,t_n). The cross correlation coefficient of nearby points is sqrt(min(t_k,t_n)/max(t_k,t_n)), which will be near unity when t_k is near t_n. That means the points tend to stay in the same neighborhood for an extended time, and fail to jump around significantly in narrow time intervals, i.e., their frequency content is weighted toward the low frequencies.
But then, this is obvious in the power law of -2 which such a process produces in a PSD estimate. I mean, this is really, really basic stuff.
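Both of these claims are easy to check numerically. The sketch below is an illustrative Python example added by the editor (ensemble size, record length, and time indices are arbitrary choices), not part of the original exchange.

```python
# Illustrative sketch: two properties of a Gaussian random walk quoted above.
#   (1) corr(x_k, x_m) is close to sqrt(min(k, m) / max(k, m)) for nearby points.
#   (2) a PSD estimate falls off with a power law close to -2.
import numpy as np
from scipy import signal

rng = np.random.default_rng(1)
runs, n = 2000, 4096
walks = np.cumsum(rng.normal(size=(runs, n)), axis=1)   # ensemble of random walks

# (1) correlation between samples at times 2001 and 2201 (1-indexed)
k, m = 2000, 2200
r = np.corrcoef(walks[:, k], walks[:, m])[0, 1]
print(f"empirical corr = {r:.3f}, predicted sqrt(min/max) = {np.sqrt((k + 1) / (m + 1)):.3f}")

# (2) PSD slope from a Welch estimate of one realization, fitted in log-log
f, pxx = signal.welch(walks[0], nperseg=1024)
mask = (f > 0.01) & (f < 0.2)
slope = np.polyfit(np.log(f[mask]), np.log(pxx[mask]), 1)[0]
print(f"fitted PSD power law ~ {slope:.2f} (roughly -2 expected)")
```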
So, guys… you’ve embarrassed your institution enough. How about you run along and play somewhere else now.

Robert Lund
October 3, 2013 8:55 am

0) OK Class, you’ve had your fun. Cool it. No more comments.
1) Bart: You at least have the covariance function of a random walk right:
Cov(X_t, X_s) = sigma^2 * min(t, s).
Taking t = s shows that the variance of X_t is sigma^2 * t. Ergo, the ratio of the variance of X_n to the variance of X_1 is n. That is the whole premise of why a random walk, and hence an ARIMA(3,1,0) model, is inappropriate. The issue has nothing to do with frequency domain properties of time series. That is a non sequitur.
2) Please stop with the insults. It torques me that you’ve insulted my math skills and my university. Really.

Bart
October 3, 2013 9:25 am

Robert Lund says:
October 3, 2013 at 8:55 am
“You at least have the covariance function of a random walk right:”
I had it right when you were still in diapers. I stated it well before this comment when you had first entered in. Apparently, you were in a blood frenzy, looking forward to tearing into someone who didn’t share your viewpoint, so you glossed over it.
“That is the whole premise of why a random walk, and hence an ARIMA(3,1,0) model, is inappropriate.”
It’s a flawed premise. Over a finite interval, AR processes with long time constants can behave essentially like a random walk. If the aim is prediction in the near term, there is nothing wrong with it.
“It torques me that you’ve insulted my math skills and my university. Really.”
It was intended to. It torques me that you paid so little attention to the things I stated, and made to ridicule me based on your incomplete knowledge and lack of experience outside your narrow field. Has the lesson been learned?

Robert Lund
October 3, 2013 10:27 am

” Over a finite interval, AR processes with long time constants can behave essentially like a random walk”
This is more gibberish. An AR process with long time constants? I can’t even begin to try to make sense of this. Do you even know what an autoregression is? Do tell us how a long time can be constant. And you wonder why you’re being ignored.
I look at it this way, Bart: You’re better at insults than math.

Bart
October 3, 2013 10:39 am

Robert Lund says:
October 3, 2013 at 10:27 am
[trimmed. Mod]
Do you even know how to derive the frequency response of a discrete AR system? Have you ever even heard of the Z-Transform? Do you have any idea how engineers design digital control systems? Do you know what a time constant is?
[trimmed. Mod]

Robert Lund
October 3, 2013 11:22 am

“Do you even know how to derive the frequency response of a discrete AR system? Have you ever even heard of the Z-Transform? Do you have any idea how engineers design digital control systems? Do you know what a time constant is?”
What you mean to ask me is “Do I know how to derive the spectral density of an autoregression (your word discrete is inappropriate)?” The answer is yes (use a transfer function argument with a causal linear process driven by white noise, the latter having a constant spectral density). Do I know what a Z-transform is? Yes (better called a power series transform). I also know about Laplace transforms, characteristic functions, moment generating functions, etc. (I do teach differential equations)…
Do I know what you are trying to say? Not in the slightest.
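For readers following along, the transfer-function argument can be written out in a few lines. The sketch below is an illustrative Python example added by the editor (phi = 0.8 and the record length are arbitrary choices), not Dr. Lund’s code: the spectral density of an AR(1) is the white-noise density divided by |1 - phi*e^(-i*w)|^2.

```python
# Illustrative sketch: spectral density of an AR(1) via the transfer-function
# argument, checked against a Welch estimate from simulated data (phi = 0.8).
import numpy as np
from scipy import signal

phi, sigma, n = 0.8, 1.0, 200_000
rng = np.random.default_rng(2)

# Simulate x_t = phi * x_{t-1} + w_t
x = signal.lfilter([1.0], [1.0, -phi], rng.normal(scale=sigma, size=n))

f, pxx = signal.welch(x, nperseg=4096)                    # empirical one-sided PSD
w = 2.0 * np.pi * f
theory = 2.0 * sigma**2 / np.abs(1.0 - phi * np.exp(-1j * w))**2   # one-sided, fs = 1

for i in (10, 100, 1000):
    print(f"f = {f[i]:.4f}   welch = {pxx[i]:8.3f}   theory = {theory[i]:8.3f}")
```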

Bart
October 3, 2013 12:03 pm

Robert Lund says:
October 3, 2013 at 11:22 am
“Do I know what you are trying to say? Not in the slightest.”
Then, ask questions, instead of accusing me of not knowing what I am talking about. If you had asked nicely to begin with, we wouldn’t have had all this nastiness.
A simple example may help. Consider the difference equation
x(k+1) = 0.999*x(k) + w(k)
where w(k) is white noise. The time constant of this system is -T/log(0.999) = 995T, where T is the sample period. If you look at the output of this over a time less than a time constant, it will be very close to a random walk.
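A minimal simulation of this example (an illustrative sketch added by the editor, taking T = 1 for simplicity) shows how close the two paths stay over a window much shorter than the time constant:

```python
# Illustrative sketch: an AR(1) with coefficient 0.999 (time constant about
# 1000 samples) versus a pure random walk driven by the same innovations,
# over a window much shorter than the time constant.
import numpy as np

rng = np.random.default_rng(3)
n = 300                                  # well under the ~1000-sample time constant
w = rng.normal(size=n)

ar = np.zeros(n)
rw = np.zeros(n)
for k in range(1, n):
    ar[k] = 0.999 * ar[k - 1] + w[k]     # the difference equation above
    rw[k] = rw[k - 1] + w[k]             # a random walk with the same noise

print("max |AR(1) - random walk| over the window:", round(float(np.max(np.abs(ar - rw))), 3))
print("typical excursion of the walk itself     :", round(float(np.std(rw)), 3))
```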

Bart
October 3, 2013 12:11 pm

-1/log(0.999) = 999.5

Robert Lund
October 3, 2013 12:41 pm

Bart,
Respectfully, this whole thread is about the nuances between a random walk, a fractionally differenced random walk, and a causal AR(1) model. Some of us here have decades of proficiency with the topic. I truly don’t know what you mean to say most of the time. In the example above, for instance, there is no period T. An AR(1) with an autoregressive coefficient of 0.999 is stationary.
Here’s the deal: an autoregression with an autoregressive polynomial that has roots close to the unit circle will exhibit more persistence than one with roots far from it. Often, a sample path of such a series could be mistaken for a random walk. What we are trying to tell you is that there is something in between an AR(1) and a random walk, dubbed an ARFIMA model, that is the most appropriate error model here. And it’s stationary (a random walk is not).
Look, I’m sorry my time series class picked on you. It was a bad idea to tell them about this thread. Can we move on?
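For readers unfamiliar with the ARFIMA model mentioned above, a fractionally integrated series can be generated by expanding (1 - B)^(-d) into a moving-average filter and applying it to white noise. The sketch below is an illustrative Python example added by the editor (d = 0.3 is an arbitrary choice), not Dr. Lund’s code:

```python
# Illustrative sketch: generate a fractionally integrated (ARFIMA(0, d, 0))
# series by expanding (1 - B)^(-d) into MA coefficients and filtering white noise.
import numpy as np

def frac_noise(n, d, rng):
    """Fractionally integrated noise of length n with memory parameter d."""
    psi = np.empty(n)
    psi[0] = 1.0
    for k in range(1, n):                   # psi_k = psi_{k-1} * (k - 1 + d) / k
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    w = rng.normal(size=n)
    # x_t = sum_k psi_k * w_{t-k}, truncated at the start of the record
    return np.array([psi[:t + 1][::-1] @ w[:t + 1] for t in range(n)])

rng = np.random.default_rng(4)
x = frac_noise(2000, d=0.3, rng=rng)        # 0 < d < 0.5: stationary but persistent
print("lag-1 sample autocorrelation:", round(float(np.corrcoef(x[:-1], x[1:])[0, 1]), 3))
```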

Bart
October 3, 2013 1:12 pm

Robert Lund says:
October 3, 2013 at 12:41 pm
“Like in the example above, there is no period T. An AR(1) with autoregressive coefficient of .999 is stationary.”
Then, how are the samples at step k and n separated in time? This is how digital control systems are constructed – we sample sensor data at uniform intervals, and apply a control signal back in a feedback loop at that uniform sample interval.
Yes, it is ultimately stationary. But, you wouldn’t know that from a finite sample of data with record length on the order of less than a time constant.
“Can we move on?”
Yes, we can move on.
Don’t take my earlier taunts too seriously. They were intended as a wake-up call. Clemson is a fine university, and I know people I respect who studied there. I have known idiots who attended MIT. The best engineer I ever worked with was from Purdue – that is not a plug, I did not go there. I judge a person by what he or she can do, not by what school he or she managed to get into and out of.
I, also, have many decades of experience with noise modeling and handling in a wide variety of electro-mechanical systems. We have to design systems which work to exacting specifications. I am very good at making them do that.
I am fluent in these topics within the argot of my milieu, which evidently makes for a communications problem. But, once upon a time, I delved into them extensively in an academic setting. My earlier texts included Larson & Shubert, Karlin and Taylor, and Doob.
Since graduating and entering the practical world, I have had very little use for Ito and Stratonovich. Most of the time, I have seen fBm used as a crutch to explain processes which, upon closer examination, bear the marks of poor data collection technique. IMHO, it is a GUT theory of noise. Sort of like string theory – a mathematically elegant artifice, but of little practical value.
But, we could argue about that all day long. I take it as read that you disagree. The bottom line is the same, no matter which of us is closer to the truth. If the apparent ~65 year process is, as I maintain, a result of random excitation of an oceanic-atmospheric mode, then temperatures are poised to go down. If it is more closely a process of fractionally integrated noise, then temperatures are poised to go down, because this fBm obviously would have Hurst coefficient > 0.5, and so would tend to keep going in the same direction it was currently going for an extended time.

Spence_UK
October 4, 2013 1:33 pm

because this fBm obviously would have Hurst coefficient > 0.5, and so would tend to keep going in the same direction it was currently going for an extended time.
No, that is not how fractionally integrated noise behaves.
I’m all for making predictions from models, but you need to ensure that (1) you understand how the models actually behave and (2) your predictions need confidence intervals. Without these, predictions are worthless.

Bart
October 5, 2013 11:22 am

The main difference between fractional Brownian motion and regular Brownian motion is that while the increments in Brownian Motion are independent, the opposite is true for fractional Brownian motion. This dependence means that if there is an increasing pattern in the previous steps, then it is likely that the current step will be increasing as well. (If H > 1/2.)

Spence_UK
October 5, 2013 1:24 pm

Bart, just look at the example plots (H=0.75, H=0.95) in the wiki article you just linked to. You will notice those plots have many local minima and maxima.
You know what local minima and maxima means? It means that if the current step goes up, the following step might just go down.
Long term persistence (H>0.5, H<=1.0) has a defined population mean, and although it can spend arbitrary periods of time to one side of that mean, it does *not* follow that the current step direction is dependent on the previous one.
Also note that the change in step constitutes a change in the first derivative. Note that differentiation is the equivalent of dividing the power spectral density by a value linearly proportional to f. Since the power spectral density of LTP is 1/f, the first derivative of a long term persistent series can be white, i.e. independent.
Demetris Koutsoyiannis has a nice turn of phrase for this. He notes that the expression "long memory" often associated with long term persistence is a misnomer. In fact, the behaviour of long term persistent series is not one of memory, but much more one of amnesia. But the error you make is a common one.
If you want to see an example of a prediction based on fractionally integrated noise, I recommend you look at UC's blog here:
http://uc00.wordpress.com/2011/08/30/first-ever-successful-prediction-of-gmt-3-years-done/
UC’s prediction is based on half integrated noise, back in 2008. Please note the central prediction dips slightly downward, contrary to your incorrect understanding. FWIW my instinct is that his confidence intervals are too narrow, but the presence of CIs allows his prediction to be tested. Please note UC’s recommendations on making predictions.
When your prediction rises to this standard, we can see if you can do as well as UC did.

Spence_UK
October 5, 2013 1:49 pm

An example to help you better understand Bart: a random walk exhibits greater persistence than either a fractionally integrated time series or an autoregressive time series. In fact a random walk can wander far further than fractionally integrated noise; the random walk does not have a defined population mean, whereas fractionally integrated noise does.
I would hope, since you understand how a random walk is generated (the algorithm is rather simple), you would recognise that such a claim that the direction of the current step is dependent on the previous step is quite incorrect. The next step direction in a random walk is random – by definition! So the direction of steps is independent from sample to sample.
Fractionally integrated time series are no different in this regard. One complication of fractionally integrated time series is what constitutes the next step – scaling properties and self similarity and all. It’s a complex topic.

Bart
October 5, 2013 3:25 pm

Spence_UK says:
October 5, 2013 at 1:24 pm
“Note that differentiation is the equivalent of dividing the power spectral density by a value linearly proportional to f.”
I assume you were in a hurry, but I believe you meant to say “multiplying”, and differentiation multiplies the PSD by f squared.
“…the random walk does not have a defined population mean, whereas fractionally integrated noise does..”
I think we are possibly speaking of two different things, and need to more carefully delineate them.
First of all, it is not true that “the random walk does not have a defined population mean”. The expected value of a random walk, specifically the accumulation from zero of zero mean independent increments, is in fact zero. It is the excursion from zero which is expected to increase with time.
I suspect the property you were referring to was stationarity. But, fBm is not stationary. I am not sure about the process you and Dr. Lund refer to as “fractionally integrated noise”, but both you and he seem to maintain that it is stationary. Frankly, I do not see how this can be when the spectrum appears to still have a singularity at zero, but I have not looked closely at this yet.
In any case, if you have a difference of opinion with the Wikipedia page to which I linked, perhaps you should sign up as an editor and make your disagreement known.

Spence_UK
October 5, 2013 5:26 pm

Bart, quite right, I originally wrote something different, then edited it and garbled it. Yes, multiplying rather than dividing. But the key point here is that persistence does not result in the property that you claim exists (that the direction of the current step is tied to the previous step). I hope we are clear on that point now, having given both empirical examples and the underlying principles.
As for the rest, I am comfortable that what I say is correct, but beware I am using technical terms with specific meaning.
A random walk does not have a defined *population* mean. It is not a stationary process. Note population mean is quite different to the sample mean.
Secondly, fractionally integrated time series (“noise” is not really appropriate and I try not to use it, although occasionally I slip into bad habits, especially when commenting on blogs…) can be stationary processes. They have a defined, fixed population mean. However, the sample mean is a poor estimator of the population mean (and, on top of that, does not improve much with averaging).
For a more thorough treatment of the concept of stationarity in the context of long term persistence I would recommend the following presentation:
Hurst-Kolmogorov dynamics and uncertainty

Bart
October 5, 2013 8:19 pm

Spence_UK says:
October 5, 2013 at 5:26 pm
“You will notice those plots have many local minima and maxima.”
Yes, but the point is what is likely. Sooner or later, they are likely to switch direction. The question is, how long do they tend, on average, to go largely in one direction or the other before switching to go the other way? I specifically do not mean every bump or bobble which switches directions, but the longer term, quasi-trends. This is largely determined by how strongly succeeding points are positively correlated on a particular timescale.
“But the key point here is that persistence does not result in the property that you claim exists (that the direction of the current step is tied to the previous step).”
I don’t think I see that, at least not yet from the point of view of this particular argument. As I mentioned, the differentiated PSD is weighted by the frequency squared, so you cannot get a flat spectrum except in the particular case of random walk. It is indeed true that a random walk is expected neither to go up nor down based on where it was previously heading – it is expected to stay the same because it is a Martingale – but that is the special case of H = 1/2. If you differentiate a process with H > 1/2, you still end up with a downward slope in the PSD, which suggests continued long range positive correlation.
IIRC, one of the links you presented previously estimated H = 0.92, so it appears to me there is still, from this point of view, long range correlation which is significantly positive for an extended interval within the temperature series. This interpretation appears to jibe with what the Wikipedia excerpt to which I linked was stating.
Getting back to the PSD question, one of the reasons I have always been bothered by descriptions of fBm is that it is generally rooted in these power law PSD descriptions. But, the PSD is not really even defined for non-stationary processes, so all we are seeing in PSD estimates is basically a 1-d projection of a 2-d entity. It’s a little like creatures of Flatland trying to work out what 3-d creatures look like based on the various cross-sections they observe. Now we, as 3-d creatures, could do that with enough cross sections, but the Flatlanders have no conception of the 3rd dimension, and so can never visualize it within their sphere (or, circle) of comprehension. Similarly, we have no widely utilized 2-d analysis tool of which I am aware which would allow us to fully comprehend what is going on in every case.
I see, e.g., no reason that the processes which produce 1/f signatures in PSD estimates should have a unique, all-encompassing description. Many non-stationary processes can produce approximately 1/f behavior when processed through a PSD estimation routine. I think better tools which observe the full dimensionality are needed. I could say a bit more on this topic, based on some of my memories of trying to hash out such an approach back when I was studying these topics, but the memories are a bit faded, and it would probably be difficult to get across the concepts in this venue. It had something to do with 2-d Laplace transforms, but that is as far as I can go into it at this moment, for whatever it is worth.
“Secondly, fractionally integrated time series … can be stationary processes.”
Hmm… that’s a little less general than your previous statement. I will have to read your link, and reacquaint myself with these things to either find out precisely what you mean, or ask questions which would help elucidate it. I doubt that will happen before this thread gets closed, so maybe we will take it up again at a later time.
But, again, I still believe that the two coincidences in the temperature data set to which I have referred previously indicate that there is a resonance involved, rather than just a random fractal drift, and this indicates that it is likely that temperatures in the next few decades will behave similarly to the era between roughly 1940-1970. I was hoping to make the question moot from your point of view, but it appears that sort of weak agreement will not happen on this thread.
But, I appreciate our civil discussion, as much as I regret the ugliness which passed between myself and Dr. Lund. If the clock runs out on us before there is time to say so, thanks for your time and insights.

Spence_UK
October 6, 2013 11:10 am

Bart, thanks for the discussion as well, and I think all three of us (you, me, Dr Lund) would probably get on pretty well (even if we disagree on some points) if we were to meet over a beer rather than over the internet. Such is the nature of this debate.
I was thinking more about your observation that the first derivative multiplies the PSD through by f squared. This is a point we are in agreement on (once you corrected my silly errors above). When you put white noise through this, you get greater amplitudes at high frequencies than low. This means that we should expect differences to reverse.
I did a quick experiment to confirm this, and indeed randomly generated white noise exhibits anti-persistence in its first derivative. This can be thought through from a probabilistic perspective as well; consider three drawn samples from an iid Gaussian random number generator. We then keep only those cases where the first difference is positive (i.e. sample 2 is greater than sample 1). For this pattern to continue, sample 3 must be greater than sample 2. So sample 3 must be the maximum of the three samples. This happens just one in three times. So for white noise, we are twice as likely to reverse the step as we are to continue it. This is quite consistent with what we have discussed on derivatives.
We also know and agree that a random walk’s step direction is independently random (which is obvious from the algorithmic definition of a random walk, but confirmed by the behaviour of the derivative).
So I ran a short procedure in MATLAB to artificially generate flicker noise, and tested the probability that a positive step would be followed by a positive step. In fact I tested all 3 cases with a sample of 1000 points and my results were:
White noise 36% (expected 33%)
Flicker noise 39% (expected ??)
Random Walk 51% (expected 50%)
As you can see, flicker noise (a form of fractionally integrated time series) sits between white noise and a random walk in terms of the dependency on step direction. Flicker noise is more likely to reverse direction than continue its current direction, although the probability is close to 50% so it takes a reasonable number of samples to confirm this.
The other thing to be careful of is that fractionally integrated time series show different properties at different scales, and are continuous systems, so the concept of a “step” can be defined but is potentially misleading.
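For anyone who wants to reproduce numbers like these, here is a rough equivalent of the experiment (an illustrative sketch added by the editor, in Python rather than MATLAB; the fractionally integrated series uses the (1 - B)^(-d) filter sketched earlier with d = 0.45, an arbitrary choice, so the middle percentage will only roughly match Spence_UK’s 39%):

```python
# Illustrative sketch (Python rather than MATLAB): probability that a positive
# step is followed by another positive step, for white noise, a fractionally
# integrated ("flicker"-like) series, and a random walk, each with 1000 points.
import numpy as np

def continue_prob(x):
    """Fraction of positive steps followed by another positive step."""
    d = np.diff(x)
    up = d[:-1] > 0
    return np.mean(d[1:][up] > 0)

def frac_noise(n, d, rng):                  # ARFIMA(0, d, 0), as sketched earlier
    psi = np.empty(n)
    psi[0] = 1.0
    for k in range(1, n):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    w = rng.normal(size=n)
    return np.array([psi[:t + 1][::-1] @ w[:t + 1] for t in range(n)])

rng = np.random.default_rng(5)
n = 1000
series = {
    "white noise":      rng.normal(size=n),
    "fractional noise": frac_noise(n, d=0.45, rng=rng),   # d = 0.45 is arbitrary
    "random walk":      np.cumsum(rng.normal(size=n)),
}
for name, x in series.items():
    print(f"{name:17s} P(up follows up) ~ {continue_prob(x):.2f}")
```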

Bart
October 6, 2013 3:17 pm

Spence_UK says:
October 6, 2013 at 11:10 am
“Such is the nature of this debate.”
And, the nature of the internet, which suspends the normal bounds of propriety we observe when dealing with one another directly.
“When you put white noise through this, you get greater amplitudes at high frequencies than low. This means that we should expect differences to reverse… Flicker noise is more likely to reverse direction than continue its current direction, although the probability is close to 50% so it takes a reasonable number of samples to confirm this.”
But, doesn’t flicker noise have H < 0.5, which is the boundary noted in the Wikipedia article? So, isn't your finding actually consistent with this?

Spence_UK
October 6, 2013 11:29 pm

No, flicker noise is H>0.5 and H<1 (aka 1/f noise, or excess noise). All of these have higher spectral power density at the lower frequencies in comparison to the higher frequencies, so they exhibit long term persistence.
The wikipedia article is incorrect in its statement. Fractionally integrated noise is more likely to reverse direction than continue in the current direction. The probability of this lies between white noise (67% likely to reverse) and a random walk (50% likely to reverse).
Note this also explains UC's prediction quite nicely. A reversal is slightly more probable than not, so the central prediction is slightly down in comparison to the late 20th century warming.

Bart
October 7, 2013 9:35 am

OK, apparently my interpretation of H is not quite right. This particular branch of stochastic processes has not been my bag for… decades. But…
“So I ran a short procedure in MATLAB to artificially generate flicker noise, and tested the probability that a positive step would be followed by a positive step.”
What we really want is not the conditional expectation of one sample at a time, but of many. What is the likelihood of an overall trend slope being, say, positive in the future, given that it has been positive in the past? And, what timelines are associated with the past trend, and the projected one?
I do not really have the time to formulate a precise statement of the question I am trying to ask which can be tested. But, given that the correlations are all positive, I suspect that some measure capturing this inchoate thought might well prove the Wikipedia statement correct in some sense. I am, at least, predisposed to believe that the author of the statement had something upon which to base it. Even if he was wrong, I don’t think we can make the determination until we know precisely what he meant.

Spence_UK
October 7, 2013 12:31 pm

Bart, I cannot think of any reasonable definition that would result in the statement in Wikipedia being true. Remember that fractionally integrated time series are self-similar – a “trend” at one scale is simply a step at another scale. If it holds at one scale, it will hold at all scales.
Of course it is always difficult to know exactly what was intended without a formal definition. But I cannot think of any situation where the wikipedia statement is correct.

Spence_UK
October 7, 2013 1:55 pm

Another note – the correlations are all positive, but that is with respect to the absolute values, not the relative change in value (step). That is, if one value is above the population mean, then the next is likely to be also above the population mean. That does not mean the rate of change will be.

Bart
October 7, 2013 6:44 pm

Spence_UK says:
October 7, 2013 at 12:31 pm
The question is fairly moot from my perspective. But, how about we just try a simple calculation. Let me see…
Given the normalized autocorrelation function
E{x(t2)x(t1)} = 0.5 * ( abs(t2)^(2H) + abs(t1)^(2H) – abs(t2-t1)^(2H) )
then, noting that x(0) = 0, so the product expands to E{x(2)x(1)} – E{x(1)^2},
E{(x(2) – x(1)) * (x(1) – x(0))} = 2^(2H-1) – 1
So, the succeeding increment is positively correlated with the previous one if 2^(2H-1) > 1, i.e., if H > 0.5. Mmmm… Seems to say Wikipedia is on track, I think.
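A quick numerical check of that algebra (an illustrative sketch added by the editor, not part of the original comment): evaluate the quoted autocorrelation at the three time points involved and confirm the increment covariance equals 2^(2H-1) - 1, which is positive exactly when H > 0.5.

```python
# Illustrative sketch: under the autocorrelation quoted above, the covariance of
# successive unit increments is 2^(2H-1) - 1, positive exactly when H > 0.5.

def R(t1, t2, H):
    """Normalized fBm autocorrelation E{x(t1) x(t2)}."""
    return 0.5 * (abs(t1)**(2 * H) + abs(t2)**(2 * H) - abs(t2 - t1)**(2 * H))

for H in (0.3, 0.5, 0.75, 0.92):
    # Expand E{(x(2)-x(1)) * (x(1)-x(0))}; the terms involving t = 0 vanish.
    cov = R(2, 1, H) - R(2, 0, H) - R(1, 1, H) + R(1, 0, H)
    print(f"H = {H:4.2f}   increment covariance = {cov:+.4f}   2^(2H-1) - 1 = {2**(2*H - 1) - 1:+.4f}")
```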