Guest post by Kevin Kilty

Introduction

I had contemplated a contribution involving feedback diagrams, systems of equations, differential equations, and propagation of error ever since Nick Stokes's original contribution about a system of differential equations involving feedback way back in June of this year. A couple of days ago the commenter Bartemis posted a pair of differential equations on a post of Roy Spencer's, which provided some inspiration on tying all these thoughts together in one posting. Finally, an added inspiration came from the controversy about Pat Frank's recent contribution. Without taking a stand on the numerical values Pat calculated, or his approach, I hope to demonstrate why he and his critics, Nick for example, are really discussing different things; and why Pat's take on this matter deserves very careful consideration. Here goes.

Let’s consider the following system of equations:

(1) *dT/dt* = *b* · *C* − *a* · *T*^{4}

(2) *dC/dt* = *c* · *T* − *d* · *C*

We can view this as a model of water vapor feedback, where *T* is a surface temperature, *C* is a concentration of water vapor, and *a, b, c, d* are constants. The set of equations is a system of first order differential equations, but non-linear. One can also view it as second order if we differentiate the first equation and substitute the second into it. For instance, in the example of temperature (*T*):

(3) *d*^{2}*T/dt*^{2} + (*d* + 4*a* · *T*^{3}) · *dT/dt* + *a* · *d* · *T*^{4} − *b* · *c* · *T* = 0

This, too, is non-linear but now a second order differential equation.

Thus, we can look at the temperature problem as part of a first order system, or as a second order differential equation. It doesn't matter, except that the first order system is easier to deal with mathematically.

1. Making a linear approximation

The system of equations has two steady solutions, which a person can verify by simple substitution. These are (*T*_{0} = 0*, C*_{0} = 0) or (*T*_{0} = (*bc/da*)^{1/3}*, C*_{0} = *c/d* · *T*_{0}). The first is a trivial solution of no interest. The second becomes a point around which we will make a linear approximation to the original equation set. The algebra is a little tedious and has little bearing on the issues at hand. But if we call *ζ* a small deviation from the steady solution in water vapor, and *θ* a small deviation in temperature, the final form of the set of equations is:

(4) *dθ/dt* = −*h* · *θ* + *b* · *ζ*, where *h* ≡ 4*a* · *T*_{0}^{3}

(5) *dζ/dt* = *c* · *θ* − *d* · *ζ*

These are valid near the stationary solution (*T*_{0 }= (*bc/da*)^{1/3 }*, C*_{0 }= *c/d*·*T*_{0}). Doing as we did to produce the second order Equation 3) we arrive at the following linear second order approximation.

(6) *d*^{2}*θ/dt*^{2} + (*h* + *d*) · *dθ/dt* + (*h* · *d* − *b* · *c*) · *θ* = 0
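To make the decay concrete, here is a minimal forward-Euler sketch of a linearized pair of this form, *θ*′ = −*h*·*θ* + *b*·*ζ* and *ζ*′ = *c*·*θ* − *d*·*ζ*. The coefficient values are made up purely for illustration; they are not derived from any climate quantities.

```python
# Forward-Euler integration of the linearized pair
#   theta' = -h*theta + b*zeta
#   zeta'  =  c*theta - d*zeta
# Coefficients are illustrative only (hypothetical values).
h, b, c, d = 1.0, 0.5, 0.4, 1.2
theta, zeta = 1.0, -0.5          # small initial deviations from steady state
dt = 0.001
for _ in range(20_000):          # integrate out to t = 20
    dtheta = -h*theta + b*zeta
    dzeta  =  c*theta - d*zeta
    theta += dt*dtheta
    zeta  += dt*dzeta
print(theta, zeta)               # both deviations decay toward zero
```

For these values both eigenvalues of the linearized matrix are negative, so any small initial deviation damps out.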

2. Feedback Block Diagrams

Before continuing toward my main purpose, I’d like to show the relationship of the differential equations, above, to feedback diagrams over which people have spilled electrons galore, although only green ones, on these pages. Figures 1a and 1b show two possible block models of the second order differential equation in *θ*. The block models, and the differential equations are just different descriptions of the same thing. One has no mojo that the other doesn’t have. They have different utility. For example, engineers must turn a system into electronic or mechanical hardware that realizes the system, and the block diagram is useful for making this transformation.

Figure 1. Figures a, and b show alternative block models of the second order differential equation in *θ*. In a) the model consists of two integrators, which turn *θ*^{¨ }into *θ*, and the feedback loops contain gain blocks. In b) there is a filter that functions as a leaky integrator. Block representations are not unique.

3. Stability of the Linear System

We are in a position now to apply Nick’s stability analysis. Finding the eigenvalues of a two by two matrix is relatively easy. It just makes use of the quadratic formula. The interested reader might consult Wolfram Math World online (mathworld.wolfram.com/Eigenvalue.html) which shows eigenvalues in this instance explicitly.

(7) *λ* = ½ · [−(*h* + *d*) ± ((*h* + *d*)^{2} − 4 · (*h* · *d* − *b* · *c*))^{1/2}]

There are two negative eigenvalues for some combinations of (*h, b, c, d*), meaning the linearized system is stable, and the original non-linear system is then stable against infinitesimal disturbances. As Nick Stokes indicated, initial errors in this system will damp out. This, however, is not the full story. In my opinion it is the correct answer to a wrong question. The question of stability is not just a matter of the behavior of matrix A, where:

A = [−*h*, *b*; *c*, −*d*]

but also the question of what occurs with particular input to the system. In a stable system like A we can state that a bounded input produces a bounded output. That is nice to know, but bounded does not answer the more complex question of whether the output is useful for a specific purpose. This is a domain that propagation of error addresses. It is, I sense, the position Pat Frank was taking in his essay, and while I don’t speak for him, one cannot dismiss the importance of what he was trying to do.
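A small sketch of the eigenvalue calculation via the quadratic formula, assuming the linearized matrix takes the form A = [−h, b; c, −d] and using hypothetical values for (h, b, c, d):

```python
# Eigenvalues of a 2x2 matrix A = [[-h, b], [c, -d]] from the
# characteristic polynomial lam^2 - tr(A)*lam + det(A) = 0.
import cmath

def eigenvalues_2x2(h, b, c, d):
    tr = -h - d                      # trace of A
    det = h*d - b*c                  # determinant of A
    disc = cmath.sqrt(tr*tr - 4*det)
    return (tr + disc)/2, (tr - disc)/2

lam1, lam2 = eigenvalues_2x2(1.0, 0.5, 0.4, 1.2)   # hypothetical constants
print(lam1.real, lam2.real)   # both negative: linearly stable
```

When h·d > b·c (and h, d > 0) both real parts are negative, which is the stable case discussed above.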

4. The Real Concern of Error Propagation

Let's return to the linearized system (Eqs. 4 and 5). The system really doesn't do much of anything interesting because there is nothing to drive it. We must have a vector of driving terms, one involving the driver of temperature, and possibly one to drive water vapor. Without this, all the solution ever does is decay back to the steady solution–i.e. its errors vanish. But this misses several important considerations. In the design of control systems, the engineer has to deal with inputs and disturbances to the system. Thus, the solar constant varies slightly and pushes the solution away from the equilibrium one.[1] We would put this in the vector **U** in Equation 8). There are undoubtedly disturbances driving the amount of water vapor as well. The next level of difficulty is having a model that is not fully specified. For instance, El Niño is not part of the state vector, but it does supply a disturbance to temperature and humidity. Thus it belongs in the vector **e** in Equation 8). Perhaps these missing parameters provide random influences which then appear to have come from the solar constant or from water vapor. Finally, being only a model, we cannot possibly know the true values of the matrix elements of A; we estimate them as best we can, but they are uncertain.

A more realistic model of what we are dealing with looks like this state space model involving the vector of two state variables, temperature and humidity (**X**), and the drivers and random errors of input (**U **+ **e**). Just to be complete I have noted that what we observe is not necessarily the state variables, but rather some function of them that may have passed through instruments first. What we observe is then some other vector, **Y**, which might have its own added errors, **w**, completely independent of errors added to the drivers.

(8) **X**˙ = A · **X **+ B · (**U **+ **e**)

(9) **Y **= C · **X **+ D · (**U **+ **w**)

Even though it is a more realistic model, it still doesn't describe propagation of error, but its solution is the machinery needed to get at propagated error. What we would like to estimate, using a solution to these equations, is an expectation of the difference between what we would observe with a best model, which we can't possibly know exactly, and what the model equations above produce with uncertainties considered.

Here is a steady solution to our state variables: **X** = G · (**U** + **e**). The matrix G comes from a combination of the matrices A and B, the details of which do not matter here.[2] As a matrix we can say simply that G looks like this:

G = [*g*_{11}, *g*_{12}; *g*_{21}, *g*_{22}]
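The central point of this section can be sketched numerically: a stable two-variable linear system, driven by random input, settles to a nonzero variance rather than letting its errors decay to nothing. Every constant below is hypothetical.

```python
# Euler-Maruyama sketch: stable linear system driven by random input.
import random
random.seed(0)

h, b, c, d = 1.0, 0.5, 0.4, 1.2   # illustrative constants (stable system)
dt = 0.01
theta, zeta = 0.0, 0.0
samples = []
for step in range(200_000):
    e1 = random.gauss(0.0, 1.0)               # random driver of temperature
    theta += dt*(-h*theta + b*zeta) + e1*dt**0.5
    zeta  += dt*( c*theta - d*zeta)
    if step > 100_000:                        # discard the transient
        samples.append(theta)
var = sum(s*s for s in samples)/len(samples)
print(var)   # settles at a nonzero level; it does not tend to zero
```

The eigenvalues of the undriven system are negative, yet the driven variance does not vanish, which is the distinction this section is after.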

5. An Estimate of Uncertainty

There is no purpose to becoming bogged down in excessive mathematics. So, let's focus attention on only the equation for temperature, *θ*. It is apparent that we do not have a complete or accurate model of temperature. Thus, let whatever equations 8) and 9) produce as a solution be called an estimate of temperature. We use the symbol *θ*^{ˆ} for this. The caret symbol is what statisticians generally use for an estimator. Then let the true temperature we would have found from a perfect model be *θ*. Even though we don't actually know the true *θ*, we figure our model is not too bad and so *θ*^{ˆ} is nearby despite uncertainties.

Generally people use a Taylor series to build a linear approximation to *θ*^{ˆ}, and calculate error propagation from it. This Taylor series is written in terms of partial derivatives with respect to all of the uncertain independent variables and parameters, the *p*_{i}s, as in the following.

(10) *θ*^{ˆ} − *θ* ≈ Σ_{i} (∂*θ*^{ˆ}/∂*p*_{i}) · Δ*p*_{i}

But in our specific case the terms of G, the *g*_{ij}, are coefficients of the linear approximation, with the drivers, the elements of the vector (**U** + **e**), as inputs. Thus our first order Taylor series looks like:

(11) (*θ*^{ˆ} − *θ*) = *g*_{11} · (*u*_{1} + *e*_{1}) + *g*_{12} · (*u*_{2} + *e*_{2})

As an estimate of propagated error what we require is the variance of (*θ*^{ˆ} − *θ*) because, now having random input, this difference has become a random variable itself. Variance is defined in general as *E*((*Z* − *E*(*Z*))^{2}), where *E*(…) means the *expectation value* of the random variable within the parentheses.[3] Therefore we should square equation 11) and take the expectation value, so that what resides on the left hand side is the variance we seek. Call this *S*_{θ}^{2}. The right hand side of equation 11) produces as many terms as there are uncertain parameters, plus cross products.

(12) *S*_{θ}^{2} = *g*_{11}^{2} · *E*[(*u*_{1} + *e*_{1})^{2}] + *g*_{12}^{2} · *E*[(*u*_{2} + *e*_{2})^{2}] + 2 · *g*_{11} · *g*_{12} · *E*[(*u*_{1} + *e*_{1}) · (*u*_{2} + *e*_{2})]

The terms within square brackets are variances or covariances. Even if the expectation values of the random inputs are zero, their variances are not. Thus, despite having a stable system of differential equations, the variance of the state variables probably will not tend to zero as time progresses.
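Equation 12) is straightforward to check numerically. The sketch below pushes hypothetical variances and a correlation through made-up coefficients *g*_{11} and *g*_{12}, then compares the analytic variance with a brute-force Monte Carlo estimate.

```python
# Analytic propagation (equation 12) versus Monte Carlo.
import random
random.seed(1)

g11, g12 = 0.8, -0.3           # hypothetical sensitivity coefficients
s1, s2, rho = 1.0, 2.0, 0.5    # input standard deviations and correlation

# Equation 12), with E[x1*x2] = rho*s1*s2:
S2 = g11**2*s1**2 + g12**2*s2**2 + 2*g11*g12*rho*s1*s2

# Brute force: draw correlated inputs, apply the linear map, take variance.
N = 200_000
total = 0.0
for _ in range(N):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    x1 = s1*z1
    x2 = s2*(rho*z1 + (1 - rho**2)**0.5*z2)   # correlated with x1
    total += (g11*x1 + g12*x2)**2
mc = total/N
print(S2, mc)   # the two estimates agree closely
```

Note the role of the cross term: with nonzero correlation it shifts the propagated variance away from simple addition in quadrature.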

There is a further point to discuss. The matrix A is not known exactly. Each element has some uncertainty which equations 8) and 9) do not indicate explicitly. One way to include this is to place an uncertainty matrix in series with A, which then becomes A + Z. Z is a matrix of random variables which we assume have expectation values of zero for all elements, but which, once again, do not have zero variance. This matrix will produce uncertainty in G, through its relationship to A. A complete propagation of error takes some thought and care.
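One way to get a feel for the effect of an uncertain A is a Monte Carlo sketch: perturb a made-up 2×2 matrix with a random Z of zero mean and watch the spread in a steady response. The matrix, input, and noise level below are all illustrative.

```python
# Spread in the steady response x = -(A + Z)^(-1) u when A is uncertain.
import random
random.seed(2)

A = [[-1.0, 0.5], [0.4, -1.2]]   # hypothetical stable matrix
u = [1.0, 0.0]                   # fixed input
sz = 0.05                        # std dev of each element of Z

def steady(M, u):
    """Solve M x = -u for 2x2 M by Cramer's rule; return x[0]."""
    det = M[0][0]*M[1][1] - M[0][1]*M[1][0]
    return (-u[0]*M[1][1] + u[1]*M[0][1]) / det

vals = []
for _ in range(20_000):
    M = [[A[i][j] + random.gauss(0, sz) for j in range(2)] for i in range(2)]
    vals.append(steady(M, u))
mean = sum(vals)/len(vals)
var = sum((v - mean)**2 for v in vals)/len(vals)
print(mean, var)   # the response acquires a nonzero spread from Z alone
```

Even with zero-mean Z, the output spread is nonzero, which is exactly the extra uncertainty the text describes.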

6. Conclusion

The value of *S*_{θ}^{2} which results from a complete analysis of the contributors to uncertainty, when compared to the precision needed, is what really determines whether or not model results are fit for purpose. As I wrote in a comment at one point in Pat's original posting, the topic of propagation of error is complex; and I was told that it is indeed not complex. I think this discussion shows that it is more complex than many people suppose, and I hope it helps reinforce Pat Frank's points.

7. Notes:

(1) Mototaka Nakamura, in his recent monograph, which is available for download at Amazon.com, alludes to small variations in the solar constant. This would go into the vector **U** as an example.

(2) G = (A − *λ* · I)^{−1} · B. See, for instance, Ogata, System Dynamics, 2004, Pearson-Prentice Hall.

(3) This is explained well in Bevington, Data Reduction and Error Analysis for the Physical Sciences, McGraw-Hill, 1969. Bevington illustrates a number of simple cases. Neither his, nor any other reference I know, tackles propagated error through a system of linear equations.

“I hope to demonstrate why he [Pat Frank] and his critics, Nick for example, are really discussing different things”

Well, you are discussing something different too, namely differential equations. This is an essential part of the story untouched by PF; GCMs are in fact solving DEs.

“The question of stability is not just a matter of the behavior of matrix A,… but also the question of what occurs with particular input to the system.”

They are totally related, as I set out in my earlier article. You are making the theory much more complicated than it needs to be, and missing what it can actually tell you.

If you have a system of linear DE’s

y’=A(t)y

where A is an n*n matrix and y a n*1 vector, it will have n linearly independent solutions. You can combine a basis of such vectors into a matrix Ω(t) (columns are solutions), so that:

Ω’=A(t)Ω

You can start at t=0 with Ω=I (matrix identity). Then for any solution g(t), g(t)=Ω(t)*g(0). Ω(t) is the matrix that transforms the space at t=0 into that at t.

Now if, as in your case, A is constant, then Ω(t)=exp(A*t)

using here a matrix exponential. That can either be evaluated as a power series in the usual way, or by the eigen decomposition:

A*V=V*Λ

where Λ is the diagonal matrix of eigenvalues and V the corresponding eigenvectors. Then

exp(A*t)=V*exp(Λ*t)*V⁻¹

where exp(Λ*t) is just the diagonal matrix whose entries are the exponentials of the corresponding eigenvalues times t.

So that is the answer to your question, as far as initial values are concerned.
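As a concrete sketch of this recipe: for an illustrative 2×2 A with negative eigenvalues, exp(A*t) can be evaluated by the power-series route, and its entries shrink toward zero as t grows, which is how initial differences damp.

```python
# exp(A*t) for a 2x2 A by the plain power series sum (A*t)^k / k!.
A = [[-1.0, 0.5], [0.4, -1.2]]   # illustrative matrix, both eigenvalues < 0

def expm_2x2(A, t, terms=60):
    M = [[A[i][j]*t for j in range(2)] for i in range(2)]
    out  = [[1.0, 0.0], [0.0, 1.0]]   # running sum, starts at identity
    term = [[1.0, 0.0], [0.0, 1.0]]   # current power-series term
    for k in range(1, terms):
        term = [[sum(term[i][m]*M[m][j] for m in range(2))/k
                 for j in range(2)] for i in range(2)]
        out = [[out[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return out

E = expm_2x2(A, 5.0)
print(E)   # all entries small: initial differences have been damped
```

The eigendecomposition route gives the same matrix; the series is just the shortest thing to write down for a 2×2 sketch.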

“A more realistic model of what we are dealing with looks like this state space model involving the vector of two state variables”

You want a driver term. That is the inhomogeneous system

y’=A(t)y+d(t)

As I said in my earlier post and in comments, that has an explicit solution in terms of Ω:

y=Ω(t)∫Ω⁻¹(u) d(u) du where the integral is from 0 to t. You could add any solution of the homogeneous equation Ω(t)*B for arbitrary square matrix B.

When A is constant, this just becomes a convolution

y=∫exp(A*(t-u)) d(u) du

and with the eigendecomposition, you can reduce this to scalar integrals, where it becomes exponential smoothing if the eigenvalues are negative.

So you have explicit expressions for both initial differences and drivers along the way. You can say how the initial domain is mapped, and so you can say how an uncertainty interval is stretched or shrunk. If either is stochastic, then the integral will be evaluated as a variance expression, as with your equation (12).
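A scalar sketch of that convolution, with an illustrative a < 0 and driver d(t) = sin(t): the convolution quadrature matches direct integration of y′ = a·y + d(t).

```python
# y(t) = integral_0^t exp(a*(t-u)) * d(u) du  versus direct Euler integration.
import math

a, dt, T = -0.8, 0.001, 10.0     # illustrative decay rate, step, horizon
n = int(T/dt)
drv = [math.sin(k*dt) for k in range(n)]   # driver d(t) = sin(t)

# direct Euler integration of y' = a*y + d(t)
y = 0.0
for k in range(n):
    y += dt*(a*y + drv[k])

# left-rectangle quadrature of the explicit convolution solution
conv = sum(math.exp(a*(T - k*dt))*drv[k]*dt for k in range(n))

print(y, conv)   # the two agree to good accuracy
```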

As I also said in my earlier article, this analysis is OK for studying instabilities in GCMs and CFD, but it all rapidly gets too complicated. In fact chaos supervenes, which is far more benign than the name implies. The upshot is that, while you absolutely have to take account of what the underlying DE is doing, the only effective way is empirically. Test what it does to possible error patterns by trial. That is the widely used process of ensemble solution.

The matrix exponential link should be this (Wiki)

A lot of the theory with fundamental systems (Ω) is here

As I pointed out in earlier discussions, it is important to be clear about what ensemble we are talking about. Looking at an ensemble of runs from the same model is a valid way of examining its sensitivity to initial conditions and to ranges of unknown “parameters”.

This does not apply to looking at an ensemble of single runs from various models of arbitrary origin and suggesting that the mean of all these runs is meaningful.

IPCC use of “ensemble” is predominantly the latter. Yet another deceptive word game, used to infer that the mean of a crock of shyte somehow smells better than its constituent parts.

The IPCC uses “an anecdotage” of projections under the belief that all the projections are relatively accurate to start (an assumption, and a false one), presumably to produce a composite of all the projections and provide a “trajectory” (like the NHC does to predict hurricane tracks, again shit-bad pseudo-science; I've called them out and predicted exact tracks based on simply watching water vapor/surface temps from GOES, with the occasional precise time of turn and final heading), when the mathematics of the individual projections (not models) the IPCC uses are not in the same reality like the NHC's are.

That's where the problem is: the IPCC and its idiotic supporters are *changing reality* in their equations to try to produce the output they want, and their outputs are NOT applicable to This Planet. Since each of the projections uses different alterations of reality, their output cannot be composited. It's like claiming that 5/7 and 6/9 equal 11/8. It would look like it does, since the numerators are added and the denominators are averaged, but that very act of bad DUHrithmatic ends up being wrong. In woodworking it's actually close enough and it'll keep your ship floating most of the time, but in projections it's just adding an error to multiply.

Greg

“This does not apply to looking at an ensemble of single runs from various models of arbitrary origin and suggesting that the mean of all these runs is meaningful.

IPCC use of “ensemble” is predominantly the latter. Yet another deceptive word game, used to infer that the mean of a crock of shyte somehow smells better than its constituent parts.”

Hmmmh.

Don’t think I wouldn’t understand your point, though your tone is somewhat disappointing.

But… what about looking at this, Greg?

http://www.globalwarming.org/wp-content/uploads/2016/02/Christy-modeled-versus-observed-temperatures-mid-troposphere-just-trends-1979-2015-Jan-2015.jpg

Where is the difference? Does John Christy not follow exactly the same path?

{ Apart from the fact that WUWT’s regular commenter Olof Reimer, who seems to be perfect in using the model corner in KNMI’s Climate Explorer, has shown many times a far better correlation between observations and model output! }

Rgds

J.-P. D.

I don’t think I am expressing any disagreement with what you say. But it has seemed to me that you have argued against Pat Frank by insisting that the eigenvalues of the matrix A are the entire story and I am trying to illustrate that they are not. If they were, then the uncertainty in the model would tend to zero with time, but in fact it won’t given the many sources of uncertainties.

By the way, I agree that generating an ensemble using Monte Carlo methods would be a good approach, but there are folks who state it does not produce a “distribution” of possible outcomes. They argue this in favor of explaining away disparity between model and observations.

Kevin,

“If they were, then the uncertainty in the model would tend to zero with time, but in fact it won't given the many sources of uncertainties.”

The “many sources” don't change that dependence. I think Pat's analysis is wrong, but for the present argument, it is based on A=0. That is, past errors, which count as disturbances in the solution, don't attenuate. And that is generally not true, if you think about the way the atmosphere works. Hurricanes fade, and a month or so later, no trace remains. Where are the snows of yesteryear? Etc. Almost all the eigenvalues of A correspond to dissipative effects.

The interesting possibilities for eigenvalue a of A are:

1. Re(a)>0. Exponential growth, which is unphysical. It indicates error in the program.

2. a=0. There are some, and they relate to conserved quantities like mass and energy. They actually do behave in the way suggested, and can build up (or diminish) indefinitely. This is a real but recognised problem. We know what the value of the conserved quantity should be, and it is monitored and corrected if necessary. Of course, the whole basis of the PDE is structuring it to maintain conservation, so correction should not, in theory, be needed. However, nothing is quite perfect, and some drift occurs and needs to be countered. Look up mass and energy fixers in the documentation.

3. Re(a)<0, but only just, and Im(a) not zero. This represents the wave solutions which are physically important. Re(a) represents the wave dissipation.

4. Other Re(a)<0 – the modes corresponding to relatively fast dissipation. Important is the rate of dissipation of turbulence – ie eddies subject to viscosity.
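The contrast between case 2 (a = 0, a conserved-quantity mode) and case 4 (Re(a) < 0, a dissipative mode) under random disturbances can be sketched directly: the a = 0 mode accumulates like a random walk, while the dissipative mode's variance saturates. The numbers are illustrative.

```python
# Variance of y(T) for y' = a*y + white noise, by Euler-Maruyama.
import random
random.seed(3)

def end_variance(a, steps=500, dt=0.01, trials=200):
    acc = 0.0
    for _ in range(trials):
        y = 0.0
        for _ in range(steps):
            y += dt*a*y + random.gauss(0.0, 1.0)*dt**0.5
        acc += y*y
    return acc/trials

v_zero  = end_variance(0.0)    # conserved mode: variance grows like t
v_decay = end_variance(-2.0)   # dissipative mode: variance saturates
print(v_zero, v_decay)         # accumulating mode is far larger
```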

“…That is, past errors, which count as disturbances in de solution, don’t attenuate. And that is generally not true, if you think about the way the atmosphere works. Hurricanes fade, and a month or so later, no trace remains. Where are the snows of yesteryear? Etc. Almost all the eigenvalues of A correspond to dissipative effects…”

Wow Nick. Those are events in REALITY. The subject here is model calculations. They do not include and solve (well, numerically approximate solutions to) all of the differential equations governing weather and climate. And climate models don't simulate hurricanes in the first place, so that makes your examples even worse.

But “a month or so later,” traces of hurricanes can certainly remain. Land-falling hurricanes and tropical storms are part of the water cycle for bodies of water in the mainland, soils and groundwater, etc. Or maybe you think evapotranspiration is a non-entity.

And then there is Lorenz’s butterfly.

“And then there is Lorenz’s butterfly.”

Another grossly mis-named effect. A butterfly, even an entire swarm of them, has no impact on even local conditions. Their impact is so small that it is quickly overwhelmed by prevailing wind currents. If a single butterfly had such an effect, could you imagine what kind of effect an elephant would have by flapping its ears?

Nick writes

You’re asking yourself the wrong questions. Your understanding is fundamentally based on the assumption the atmosphere is modeled correctly.

“Your understanding is fundamentally based on the assumption the atmosphere is modeled correctly.”

They do pretty good weather forecasts. But the argument here is just that they dissipate perturbations in an Earth-like way.

Weather forecast models are not the same as climate models. And let us know how good weather forecast models are looking a few months ahead. If we want to talk errors, that may be a good place to start.

“Weather forecast models are not the same as climate models. And let us know how good weather forecast models are looking a few months ahead. If we want to talk errors, that may be a good place to start.”

You don’t need to go more than a few hours for weather models to go haywire. Nick’s “pretty good” is an extremely low bar.

“Weather forecast models are not the same as climate models.”

They solve the same equations in similar ways. Climate models usually run with lower resolution. Some, like GFDL, are used in both capacities.

Nick writes

No they don't. The equations to solve climate change are weather models… and more.

It's the “and more” part that makes them climate models, and it's the “and more” parts that are broken.

Yeah, and that lower resolution eliminates hurricanes, which was an example you gave earlier.

GFDL can be run in hydrostatic and non-hydrostatic modes of course…big difference between the two.

But keep portraying it as the exact same model.

Here’s a particular gem about GFDL’s FV3 “dynamical core”…

“…FV3 was ‘reverse engineered’ to incorporate properties which have been used in engineering for decades, but only first adopted in atmospheric science by FV3…”

And another (my CAPS for emphasis)…

“…A mass, momentum, and total energy conserving algorithm is developed for remapping the state variables periodically to an Eulerian terrain-following coordinate to perform vertical transport, and to avoid layers from becoming infinitesimally thin. AS LONG AS THE LAYER THICKNESS IS POSITIVE, the model retains stability…”

So differential equations governing physical laws in the model could allow the thickness of air in a column of atmosphere to go negative and make the model unstable? How is that dissipating perturbations in an Earth-like way?

“So differential equations governing physical laws in the model could allow the thickness of air in a column of atmosphere to go negative and make the model unstable?”

The differential equations don't allow it. A badly discretised implementation could allow it, and it would lead to instability. The run would terminate. They are saying that their algorithm does not have such a defect.

The FV3 reference seems to be to using a semi-implicit solver to deal with vertical sound waves. This is indeed common in CFD, but challenging in GCMs (slow). Apparently they have achieved enough speed to make it practical. Good.

Tim,

Nick is *still* trying to conflate “error” with “uncertainty”. While the effect of errors may attenuate, that does not mean that uncertainty about the outputs does!

In essence Nick is trying to say that the models have *no* error because any errors disappear over time. Therefore there can be no uncertainty about their results.

Like you say, Nick wants us to assume along with him that the models are correct!

Tim writes

Early on in the debate the AGW crowd pushed the meme that whilst the weather couldn't be known far in advance, the climate could be known, because the associated weather averaged out and their energy accumulations and forcings were creating believable weather under the conditions projected, with the unstated assumption that the weather was being accurately modeled under those projected conditions.

The reality is that the GCMs need to much, much more accurately account for the energy accumulated and weather resulting than weather models need to do.

It is much, much harder to project climate over a century than it is to project weather for a few days.

Weather models would go off the rails in 1-2 days if they were not updated with new obs every 6-12 hours (see the manuscript by Sylvie Gravel et al.). Thus they are entirely different from climate models, which do not have this ability. Also, climate models are run at much lower resolutions, so they use much larger dissipation than weather models.

Here you've touched upon an inherent disconnect of the feedback/signal analysis approach with thermodynamics. In particular, since the atmospheric system (into which I include both the oceans and ice masses) is ever in only steady-state and not equilibrium, the thermodynamics are not reversible and thus are path functions. Consequently, the use of a state variable model is inherently flawed.

I do want to be clear that I do not out-of-hand discard the feedback/signal analysis approach, for a model is just a model. As you noted, the observable atmospheric system is substantially stable and perturbations do damp out. The transfer function approach is a well-travelled road and can be useful to feel out candidate contributions. I'm good with all of that.

However, one should be clear about assumptions (both implicit and inherent) and keep them in mind as the thinking progresses. Starting with very few terms (states) in the model, one must embrace a negative feedback construct. Otherwise, the model would be a priori unstable and the predictions would be on the express to absurd land. Being consistent with the observations is a definite bonus.

However, is that proof that a particular element of the model is indeed negative-feedback stable? It actually is not. The barrier to making this argument is the concept of observability. That is, one cannot predict (observe) the state variables from the output. The system overall is stable. Maybe all the contributing factors are also stable. Maybe not. One can infer a strong suspicion of stability from the observations. One, however, cannot argue it from the model. State-variable transfer function math does not support that the observed output is unique to only one state-variable solution.

An accumulating error would fall into the same murky pit. Conceptually, the thermodynamic path function(s) open the door to accumulating variance. Does that preclude that either (a) other mechanisms zero out that variance or (b) stochastically they average away over time? Not in the least. Unfortunately, one cannot illuminate that question by examining the observed output. Bummer.

The negative feedback assumption is likely a good assumption. But don’t lose sight that it is an assumption.

I’ve been watching this unfolding discussion for a while and truly rejoice that the scientific debate is moving forward (albeit slowly and with frustrations as normal). There are so very few venues in which these interesting topics are being subjected to the scientific method.

Please keep-up the good work!

The particulars of any one hurricane or thunderstorm may dissipate over months, just as the path of one person walking once through a field of grass is soon lost. But the sum of all hurricanes/thunderstorms over time is not zero. There are significant transports of heat energy from the ocean surface to high in the atmosphere, above much of the CO2, and of course much redistribution of water, significant if from ocean to land. Both these effects may be significant for climate over time.

The signal from a single storm becomes lost in the noise. Yet its effect persists. The maths for this process must be so chaotic as to be incalculable. Using empirical measurements will likely beat math (or GCMs) for this aspect of climate science.

The faith put in modelling in modern science is warranted where the maths are calculable and confirmable in the real world. I am skeptical that modelling with uncorroboratable math is very useful in climate science for accurate prediction of the future beyond either very short time periods or allowing for huge uncertainties.

The NHC model during a previous hurricane showed the hurricane going north, and the ECMWF model (supposedly the best model) incorrectly showed it going south along the coast.

That is error due to mistakes in the dynamics and physical parameterizations and is not random.

Hi Nick,

Are you saying that past errors in the solution of temperature attenuate? And wouldn’t that imply that past uncertainty in the physics incorporated in the solution of past temperature also attenuates? If errors attenuate, what’s the point of stepping the GCMs through time?

Frank from NoVA

If you run an imperfect model long enough, it becomes perfect!

Well Nick, I am still waiting for you to admit your mistakes in your error propagation post:

1) The use of excessive dissipation in climate models means they are modeling something closer to molasses than air (see continuum error analysis on your error propagation post).

2) The use of columnar forcing in climate models means that they are mimicking a discontinuous continuum solution, violating the numerical analysis requirement that the continuum solution be differentiable (see the tutorial on your error propagation post). This destroys the accuracy of the numerical approximation.

3) The use of arbitrary dissipation allows the climate modelers to adjust the linear growth rate of the perturbation due to increased to any rate they want.

You assume that the continuum partial differential equation is physically correct and accurately approximated.

Neither is true for climate models. They are closer to modeling molasses than air, i.e., there is a huge continuum error because of the excessive dissipation. And the numerical method does not meet the minimum requirement for accuracy, i.e., the continuum solution being approximated is not differentiable.

Very nice article. It begs the question as to whether we know enough about three key factors in climate to accurately predict/forecast/project its likely evolution.

Firstly, have we identified ALL the individual elements that go to make up the climate, from CO2 to oceanic currents, volcanoes to the Sun? Are there some missed completely, or not given enough prominence, or that come into prominence as another temporarily fades?

The second element, do we know enough about all the characteristics of each player in the climate equation as they interact with each other?

Lastly of course, is there some ‘black swan’ or ‘wild card’ that periodically changes everything?

Without knowing all these aspects of climate, or at the least the majority of the important ones and their variables under all circumstances, it seems difficult to make highly complex equations and model the output.

tonyb

I agree, as it seems to me the influence of the oceans on atmospheric conditions/climates is vastly underrepresented.

The simple maths is too much for me anyway to delve into, but I think

WordPress removed some symbols, and right now I am pondering how to make some repair from where I sit so early in the morning.

The only way I know to do that is to take a screen shot of the equation when all the correct symbols appear and are in place and put that into the WP format.

Perhaps using “pre” and “/pre” would work?

(Substitute the quotes with greater than/less than signs. Use the “Test” page to see if it works.)

Kevin

Use the latex2wp routine to convert LaTeX to WordPress. It has worked here (see my mathematical equation responses to Nick on his error propagation post).

Jerry

Kevin, thanks.

While I cannot and never could speak math this fluently, I think I understand the thrust of your article.

Please look at the article again to ensure it is complete, as it seems to me in the second last para at least four symbols or terms do not display. This may be a WordPress problem.

Kevin:

Greg

It’s OK, it’s just the standard switch between 2 first order equations and one second order. Just differentiate (1) and substitute with (2). But I don’t see any real reason for wanting to go to 2nd order; usually I would try to reduce to first if needed.

Thanks Nick, I realised on re-reading that by “first order system” he was referring to a system of equations, not a first order physical system.

Greg,

I guess your objection is that C is not entirely eliminated. You can do that with a further substitution using (1).

I was trying to clarify a discussion found back at the original hyperlink…plus show that diagrams versus system of equations are the same in principle is all.

C and T are simply coupled. One would have to go to the trouble of uncoupling them. There was some discussion at one of the hyperlinks about first versus second order is all, and I am simply showing it doesn’t matter.


Having some kind of robust model doesn’t prove it can give actual predictions; it just means it can give you a “robust” result, or a mere result, that may be far from reality.

Important to know that, but not enough.

It is very funny that people are discussing this 20 years after; it should have been preliminary to any climate models results discussion.

“it should have been preliminary to any climate models results discussion”

How do you know it wasn’t?

Do you know if it was? If so, a reference would be useful.

It was, of course, among scientists. That is not the point.

And the modelers know.

I meant that when you show people an “ensemble of model runs” you must explain what it actually means.

Propagation of errors, multiplicity of models, assumptions…

If anybody can provide details of a reference work on the theoretical foundations, analysis of and applications of ensembles of model results for verification and validation of the predictive utility of a model, I would appreciate reading it. If there are references from fields other than climate science those would be especially useful.

Good luck with that. I’ve been trying to find a plain-English explanation of gridded area weighting for months now, with no success. I’ve gone to NOAA, the GISTEMP pages, and even BEST’s and Nick’s site, but apparently the details are buried in the R code for me to reverse engineer.

“I’ve been trying to find a plain-English explanation of gridded area weighting for months now”

My contribution is here. Global averaging is integration. You can do it by figuring an estimated function by averaging within a grid cell, so the function is equal to the mean in that cell. Then you just add up the areas multiplied by those means.
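The grid-cell integration described above can be sketched in a few lines of Python (a minimal illustration; the cos-latitude area weighting and the example values are my assumptions, not necessarily the exact method used by any particular index):

```python
import math

def grid_area_weighted_mean(cell_means, cell_lats):
    """Combine grid-cell means into a global mean, weighting each cell
    by its area; a lat-lon cell's area scales with cos(latitude)."""
    wsum = vsum = 0.0
    for mean, lat in zip(cell_means, cell_lats):
        w = math.cos(math.radians(lat))  # relative cell area
        wsum += w
        vsum += w * mean
    return vsum / wsum

# A cell at 80N counts for only ~17% of an equatorial cell:
global_mean = grid_area_weighted_mean([10.0, 0.0], [0.0, 80.0])
```

The same weighted sum is what “add up the areas multiplied by those means” amounts to; the hard part in practice is deciding what to do with empty cells.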

“Good luck with that”

I hear what you are saying James.

This (ensembles of model runs as a QA/QC process) came up 10 years ago on TheAirvent and, after following a very lengthy thread with many contributors, my take home view was that a thorough, bottom up theoretical framework doesn’t exist. It all came across as an ad hoc methodology developed on the fly to try to make sense of the already built and running GCMs. Some years later, I recall a thread at Climate etc which did not change my mind.

Perhaps this would be acceptable in a research environment as a learning exercise and process but I fail to see how it is a sufficient basis for verification of the models on which we are building global economic and energy policy.

If Nick, or anyone else, can point to the established and accepted theory references that underlie ensemble modelling as a validation and verification method, I’d appreciate the links.

Nick,

I’m reading your pages now, and following the links within. Don’t you find sentences like this problematic?

If the data doesn’t exist, it can’t possibly be known that the poles are warming at a higher rate. If the NOAA GHCNM v4 data is correct, there are only nine stations above 75N. According to the ghcnm.tavg.v4.0.1.20191006.qcu.dat file for today, only three of them have data up to the present.

There are only 34 stations between 75N and 80N, and only 14 of them have data up to the present. That’s 17 stations for all of the Earth above 75N. Across the pole there is a huge area where no stations are within 1200 km of one another to even do interpolation.

How can any kind of real analysis be done with holes this gaping?

A final question: when it’s impossible to know the true answer — and it’s impossible to know the Earth’s average anomaly at any given time — how does one know that one method of gridding is better than another?

“Don’t you find sentences like this problematic?”

No. The point is that HADCRUT, by omitting those cells, arithmetically assigns to them the global average temperature. But, while data may be sparse, it clearly says that the Arctic is warming much faster. You can’t get a perfect treatment, but you can do a lot better than HADCRUT. There is just no reason at all to suppose that the Arctic is warming at the global average rate.

“How can any kind of real analysis be done with holes this gaping?”

Grid cells are arbitrary. They are a convenience when you have enough data in them to estimate an average. If not, you have to look further afield. Quality declines, but you don’t suddenly know nothing.

“when it’s impossible to know the true answer …”

It’s always impossible to know the true answer in science. You only get an estimate, and very often one based on sampling. There are ways to test integration methods by comparison – here is an example. Here is another, in which I explore the effect of culling data in all sorts of random ways. You can go from 5000 nodes (GHCN V3+SST) down to about 500 and still retain reasonable agreement. I’ll be writing a couple of posts in coming days with what I think is an even better method, with comparisons.

Nick,

Unfortunately, your link to the WUWT site is a 404. I read the other link from your site, and my takeaway from it is that one really can’t know if one grid method is producing a more accurate result. What seems to stand in for “better” is “How few samples can I use and still get a comparable result?”

I misspoke when I said “true value” previously. What I meant, and failed to explain more thoroughly, was that the average temperature of the Earth (or average anomaly, if you prefer) is not like the platinum-unobtainium meter stick that used to be the standard until they started using wavelengths of light.

One knew the length of something like that far better than one would ever measure with a Craftsman tape measure, and so one could experiment with various methodologies and see if one method was significantly more accurate than another. That can’t be done with global temperature, because it’s constantly changing, and no one can know its value to the level that a meter is known.

Finally, I’ve taken a closer look at the far northern stations in the GHCN Monthly file. There are only twelve stations from above 75 North that are still reporting. Two of them have negative July trends over the past thirty years, and ten others are negative or level over the past 10-15 years.

This hardly looks like “much faster” warming. These also aren’t anomaly comparisons; they are just looks at “What have these particular stations been doing over the past three decades?”

The answer, from twelve stations in the farthest north, appears to be: Not much warming.

James,

Sorry about the first link, which is the more important one – this should work.

As to arctic stations, you can get a more comprehensive view here, although unfortunately it is GHCN V3. It shows shading for annual trends, and will give numbers for individual stations if you click on them. There is a GHCN V4 version here where the shading clearly shows the Arctic warming, but doesn’t show the stations.

Nick,

Do the closest stations to the north pole showing flat or negative trends for July over the last 10 years count for nothing?

MOD,

There is always a problem with the conversion from MSWord to WordPress. This time it is that WordPress removed the symbols for the Matrices A, B, and D in the text and formulas, which makes some of the discussion here very difficult.

I give up.

Kevin

Kevin: could you put the MSWord document(s) in the cloud, and post a link to it(them)?

Kevin,

See above post. Write the equations in latex and then use latex2wp to convert to HTML that can be directly posted into the comment.

Kevin,

There also seem to be converters from doc files to LaTeX? Just search online.

Figure 1 shows the block diagrams of two feedback control systems. I have two objections to that kind of analysis for the climate.

1 – As Monckton points out, there is the problem of defining the reference level. Once you have defined the gain of the various elements, the reference level is actually a non-issue. That’s the rub though. The reference level matters big time when you are calculating gains.

2 – That kind of analysis takes no account of energy inputs. It assumes an infinite power supply. The trouble is that we have a very limited amount of energy. It can basically do two things. It can create sensible heat or it can evaporate water and become latent heat. As far as I can tell, the feedback analysis used by climate scientists assumes that all the energy will be used to create sensible heat. Correct me if I’m wrong but I can’t see any way feedback analysis takes a limited energy supply into account.

Also – If you’re trying to impute system gains from data, you’re stuck with overall system gain. There’s no practical way to guess the forward and reverse gains. In fact, if you have a misbehaving feedback control system the first thing you do is break the feedback loop so you can evaluate the two paths independently.

Feedback analysis was invoked by Hansen so he could create runaway greenhouse warming. The climate is a passive circuit. It has no active elements and no gains in excess of unity. I can’t think of an engineering problem where you would invoke feedback analysis in such a case.

The climate does seem to have tipping points as we bang into and out of glaciations. Feedback analysis is probably the wrong way of dealing with the situation.

You have hit the jackpot. The sun must either be the “energy supply” or the energy input. It can’t be both. If the sun is the power source then that is all you’ll ever get out of the system. If the sun is the input, then you cannot have an active device and feedback is meaningless.

I would rather think of the “system” as a simple transistor with the sun as the power supply. The atmosphere as a collection of components setting the bias, i.e., the operating point. The output would be read as a proxy for temperature.


I have sent a fix, the best I can do from where I am at this time of the morning, to make the missing symbols reappear. Perhaps the moderator will see my replacement file.

I am neither a mathematician nor scientist, neither a meteorologist nor statistician, but a complete layman. I have, nevertheless, been able to observe weather over seven decades living in different climatic zones. I have learnt to reason carefully and logically and apply this when examining discussions and articles from disciplines unrelated to my specializations including those on climate.

In one sense weather and climate is easy: Day and night, the four seasons that keep following each other, heat and cold, floods and droughts, storms and tranquil days with nothing new under the sun. However, while others commenting may want to correct me, I believe the problem lies in coming up with mathematical formulas and models that accurately capture what is happening in one of the most complex systems in our world. While our scientists make great strides in understanding more about weather and climate, I believe it is the height of intellectual arrogance to think they will one day completely understand these phenomena. That is also why I think it foolish to squander billions to attempt to engineer climate rather than spending this on adapting to and even using changes to our advantage.

I think that sums it up very nicely.

When well over 90% of climate models diverge so significantly from real-world results, this is the propagation of a meta-error of design. It is proof of institutional groupthink. The modelers as a group want something to be true but cannot yet find the magic factors. They seek not science but ideology. And they seek to impose it on the rest of us.

Quite correct Edwin. It’s astounding that there are still arguments about models. They do not represent reality and are way off the mark when it comes to even try and “model” reality. This has been proven and shown and compared to reality and observations. Models fail. Unfortunately some hate to accept the truth and continue to deny facts and truth. Monckton has also shown this mathematically. THEY FORGOT THE SUN- They have FAILED.

Frank was dealing with systematic errors, which he proved were operating, not random errors.

Thank you DocSiders

I was really taken aback when Pat’s analysis of systematic errors was dismissed by Nick, when Nick said that the sets of outputs from the model did not demonstrate that the inputs were uncertain.

Pat is talking about uncertainties and Nick is talking about input variability. Some or other output can be inherently stable but quite uncertain.

As for the model outputs, replicating an uncertain resulting value tells us nothing, literally, about the uncertainty of the inputs propagated through the calculations. Nick above tries to justify the statement that small perturbations are dampened by the model and therefore the outputs are certain. That is a form of parameterization, placing limits on variables, and has nothing to do with error propagation, which is a characteristic of the system. Limiting the parameter values has nothing at all to do with propagated uncertainty and does nothing to affect it. No system self-reduces propagated uncertainty.

It is hard to find a simple analogy of Nick’s take on this. Suppose I have a thermometer outside that was dropped and the glass tube shifted in its holding clips. I do not know how much, but probably 5 or 10mm. I read it every day at noon. It is the same temperature every day, almost. I average the readings. I can look 5 to 10mm down from the reading and estimate the true value but I don’t know how much it moved.

I decide to shift the values 7.5mm. Nick says the uncertainty about the average of all answers is low because I do the same thing each time – repeat the same process. Pat says that if I am uncertain about all the values recorded, I am more uncertain about the average.

Even if it was exactly the same temperature every day at noon, the uncertainty about my final average has to reflect the uncertainty about the 7.5mm adjustment I am making because I do not know what the real offset should be. Getting the same reading every day for a year does not make any of them, nor the average, correct.
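The broken-thermometer analogy above is easy to simulate (a minimal sketch; the offset and noise values here are illustrative assumptions, stated in degrees rather than mm):

```python
import random

def year_average(true_temp=20.0, offset=0.75, noise_sd=0.2, n=365, seed=0):
    """Every daily reading carries the same unknown systematic offset
    plus random noise. Averaging shrinks the noise, not the offset."""
    rng = random.Random(seed)
    readings = [true_temp + offset + rng.gauss(0.0, noise_sd) for _ in range(n)]
    return sum(readings) / n

avg = year_average()
# avg sits near 20.75, not the true 20.0: no amount of averaging
# removes the systematic offset, and my uncertainty about the offset
# carries straight through to my uncertainty about the average.
```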

A collection of model runs holding that the globe will warm 3 deg C in 100 years, ±50 deg C uncertainty, has two elements: the process of calculating the output, and the propagated uncertainty based on the uncertainties of the myriad inputs.

Nick also said the uncertainties self-cancel! Some will be high, some will be low, so they cancel out. Well, yeah, except when they don’t.

Let’s not confuse the values in the range of inputs with the propagated uncertainties about those inputs. The set of outputs will vary mechanistically across the explored ranges for each input, and the uncertainties will add in quadrature and will envelop all the outputs in a big grey zone of uncertainty.

In the case of climate models, that uncertainty rapidly (ten years?) reaches a value larger than the claimed effect. Game over.
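The quadrature accumulation described above can be sketched in a couple of lines (illustrative per-step values, not the actual model figures):

```python
import math

def propagated_uncertainty(per_step_u, n_steps):
    """Root-sum-square accumulation of one repeated per-step uncertainty:
    u, u, u, ... compounds to u * sqrt(n)."""
    return math.sqrt(n_steps * per_step_u**2)

# With a fixed per-step uncertainty the envelope grows as sqrt(n),
# regardless of how flat or repeatable the central projection is:
u10 = propagated_uncertainty(1.0, 10)    # ~3.16 after 10 steps
u100 = propagated_uncertainty(1.0, 100)  # 10.0 after 100 steps
```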

Crispin,

“As for the model outputs, replicating an uncertain resulting value tells us nothing, literally, about the uncertainty of the inputs propagated through the calculations”

“Propagated through the calculations”. For that, you need to look at what the calculations do. Pat looked at what a calculation that he devised did.

In fact, the general application of the “propagation of error” theory that was cited from Vasquez and others is appropriate, provided you look at the right function. They look, via Taylor expansion etc, at what happens when you map the space of a measurand with a function. For DE’s, the function is the one I indicated above, Ω(t)Ω⁻¹(u), which maps values at time u to time t. Ω is a matrix of solutions of the DE.

kribaez gave an appropriate analogy in the earlier thread. Suppose you watch, from afar, someone drop a tennis ball on the court, and not pick it up. You are uncertain about the height above ground initially, and during its descent. But you have much less uncertainty about its height above the ground after 10 mins.

“when Nick said that the sets of outputs from the model did not demonstrate that the inputs were uncertain”

A mystifying notion of uncertainty has been spread which claims that it can be unobservable. You can, supposedly, be uncertain of something, even though it seems to be behaving quite predictably. This puzzled Roy Spencer too. There really isn’t any way of quantifying uncertainty except by observed (or calculated) unpredictability. And so, if someone produces a good predictor, as Pat Frank claimed his Eq 1 is, then it is hard to say that the result is hugely uncertain.

Now of course it is always possible that there is a consistent systematic error. Maybe your version of the standard metre is wrong, or the freezing point of water, or whatever. But I see no way of quantifying that uncertainty, which can’t be seen as variation in the output. And you certainly can’t compound it by adding in quadrature, which would be appropriate for a random variable.

And Frank showed that the same systematic error exists in all the models. Systematic errors do not tend to cancel out.

Question:

Consider a systematic error of, say, positive water vapor feedback tripling the CO2 forcing, when in fact the feedback is zero (it is not actually zero; this is a question posed through an example).

The CO2 + the incorrect 3x’s positive feedback creates an expected rise in temperature far higher than the new actual equilibrium temperature…Then in the next iteration the new incorrectly high starting temperature creates even more false water feedback than the first iteration.

How does the fact that the models assume an Earth in thermal equilibrium (an extremely dubious claim given the ever-changing climate of the Earth) erase a growing temperature error that feeds back on itself, and would drive a calculated runaway error (aka “crash”) in just about any feedback analysis I’ve ever worked with (high-speed precision servo motor control)?

Nick,

“For DE’s, the function is the one I indicated above, Ω(t)Ω⁻¹(u), which maps values at time u to time t. Ω is a matrix of solutions of the DE.”

A function can be completely deterministic and stable. That does *not* mean that its deterministic outputs are certain when compared to reality.

“But you have much less uncertainty about its height above the ground after 10 mins.”

Bad example. The function describing the fall of the ball becomes discontinuous when the ball stops. The forces applied to the ball require a completely different function with a completely different output at some point (i.e. when it hits the ground).

“You can, supposedly, be uncertain of something, even though it seems to be behaving quite predictably. ”

Except the models don’t match reality. So the predictability requirement is not met. Again, being deterministic is not a guarantee of certainty.

“But I see no way of quantifying that uncertainty, which can’t be seen as variation in the output. And you certainly can’t compound it by adding in quadrature, which would be appropriate for a random variable.”

And again, being deterministic is not a guarantee that there is no uncertainty. Pat *did* quantify at least one uncertainty factor, +/- 4 W/m^2. And adding in quadrature is *not* a sufficient criterion to be a random variable. Root-sum-square is *NOT* the same as root-mean-square.
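The root-sum-square vs root-mean-square distinction above is easy to show numerically (a trivial sketch):

```python
import math

def root_sum_square(vals):
    # Quadrature sum: keeps growing as more terms are added.
    return math.sqrt(sum(v * v for v in vals))

def root_mean_square(vals):
    # Average magnitude: does not grow with the number of terms.
    return math.sqrt(sum(v * v for v in vals) / len(vals))

u = [3.0, 4.0]
rss = root_sum_square(u)   # 5.0
rms = root_mean_square(u)  # ~3.54
```

For a hundred identical +/-1 uncertainties, RSS is 10 while RMS stays at 1, which is exactly why the two must not be conflated when propagating error over many steps.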

Tim, my admiration. You’ve been a real stalwart for the integrity of scientific practice.

You’ve gotten the analysis exactly correct.

I’ve added a comment about the source of the mistake Nick and the climate modeling folks are insistently making. The comment is apparently in moderation, but should appear around here somewhere.

In essence, they are incorrectly applying the judgment criteria of inferential statistical models to deductive physical theory. This is the source of their incessant recourse to random error.

It’s an example of Terry Oldberg’s equivocation fallacy.

Nick, “A mystifying notion of uncertainty has been spread which claims that it can be unobservable.”

It’s only mystifying to people who know nothing of propagated error.

I discussed this supposed conundrum in my WUWT essay Do Climate Projections Have Any Physical Meaning?, here:

“The consensus sensibility will now ask: how is it possible for the uncertainty bars [on a simulation of 20th century air temperatures] to be so large, when the simulated temperatures are so obviously close to the observed temperatures?

Here’s how: the multi-model average simulated 20th century hindcast is physically meaningless. Uncertainty bars are an ignorance width. Systematic error ensures that the further out in time the climate is projected, the less is known about the correspondence between the simulation and the true physical state of the future climate.” …

[No climate model can] produce a unique solution to the problem of the climate energy state. … This means, for any given projection, the internal state of the model is not known to reveal anything about the underlying physical state of the true terrestrial climate

This is why there is a large physical uncertainty in a simulation that apparently reproduces known observables. The simulation tells us nothing about the underlying physics of the state. The ignorance is profound. The uncertainty is large, even though it is invisible to inspection.

“This puzzled Roy Spencer too.”

The core of Roy’s argument centered on the notion that a calibration error statistic perturbs a model. His entire critique of my work is misconceived.

“There really isn’t any way of quantifying uncertainty except by observed (or calculated) unpredictability.”

Model calibration error propagated through a futures prediction, Nick. That’s the way to quantify the uncertainty in a prediction.

I re-post here the comment of poster nick (another Nick), in full:

“Pat, for what it’s worth, but I as a physicist am shocked by the sheer incompetence that seems to be present in the climate community. The method you are using is absolutely standard; every physics undergraduate is supposed to understand it, and usually does without any effort. The mistaken beliefs about error propagation that the climate guys show in all their comments are downright ridiculous. Kudos to you for your patience explaining again and again the difference between error and uncertainty. Really hard concepts to grasp for certain people. (I’m usually much more modest when commenting, but when people trumpet BS with such a conviction, then I can’t hold myself back.)”

I’d say that knowledge of significant digits can be added to that as well.

“The ignorance is profound. The uncertainty is large, even though it is invisible to inspection.”

I rest my case.

Take a fair coin. Heads add 1C, tails subtract 1C. That is your random error.

Plot the result over time. Even for this extremely simple model the result will not converge on zero the way you might expect.

Rather, you will see a series of random walks about zero with the maximum excursion increasing with time.

The sum of these walks tends to zero, as Nick suggests.

However the maximum excursion increases with time, as Pat has correctly identified.

The problem for all forecasting is that you cannot tell if the excursion from zero is due to forcings or the error.

It is very simple to implement this in Excel to confirm.
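In Python rather than Excel, the coin-flip experiment above is only a few lines (a minimal sketch; the step count and seed are arbitrary):

```python
import random

def coin_flip_walk(n=10000, seed=1):
    """Heads +1C, tails -1C; track the running sum and its maximum excursion."""
    rng = random.Random(seed)
    pos, max_excursion = 0, 0
    for _ in range(n):
        pos += 1 if rng.random() < 0.5 else -1
        max_excursion = max(max_excursion, abs(pos))
    return pos, max_excursion

pos, max_excursion = coin_flip_walk()
# The final sum wanders around zero, but the maximum excursion
# keeps growing with n, roughly as sqrt(n).
```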

Frank showed the existence of systematic non-random error in the models. Small random errors do tend to damp out..just the opposite with systematic (dyed into the wool) errors.

+1


“Rather, you will see a series of random walks about zero”

That assumes the contribution of each score adds undiminished into the present state. And usually it doesn’t; that is the point of the DE theory. Present weather is not the simple sum of past weather. The effects of past weather fade.
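The fading-effects point can be sketched by contrasting undamped accumulation (a random walk) with damped accumulation (a minimal illustration; the 0.9 damping factor is an arbitrary assumption):

```python
import random

def max_excursion(n=20000, damping=1.0, seed=2):
    """x[t+1] = damping * x[t] + e[t], with e[t] = +/-1.
    damping = 1 is a random walk: every past shock persists undiminished.
    damping < 1 makes past shocks fade geometrically, bounding |x|
    below 1/(1 - damping)."""
    rng = random.Random(seed)
    x = 0.0
    worst = 0.0
    for _ in range(n):
        x = damping * x + rng.choice((-1.0, 1.0))
        worst = max(worst, abs(x))
    return worst

undamped = max_excursion(damping=1.0)  # excursions keep growing with n
damped = max_excursion(damping=0.9)    # bounded below 1/(1-0.9) = 10
```

Whether the +/-4 W/m^2 calibration error behaves like the damped or the undamped case is, of course, exactly what the two sides here are arguing about.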

Nick writes

This is based on the assumption that the earth’s energy levels will quickly revert to a mean (less than 17 years of either warming or cooling). And that prior to all this CO2, the earth’s energy levels were at equilibrium.

There is no consideration that a region (eg Greenland) can have a significantly different energy level for a significant amount of time (ie Viking settlement)

AFAIK GCMs don’t support a habitable Greenland without AGW.

“…And usually it doesn’t…”

So now it’s only “usually?”

“The effects of past weather fade?” The Grand Canyon is an effect of past weather not fading, but having a cumulative effect. Glaciers are an effect of past weather not fading. Pretty much every feature on planet Earth is a result of past weather not fading.

Present weather may not be the simple sum of past weather, but the effects of weather certainly do add up.

“‘The effects of past weather fade?’ The Grand Canyon is an effect of past weather not fading, but having a cumulative effect.”

Yes, but it isn’t part of “present weather”.

Nick, do you suppose the climate of that region would be the same without that gigantic ditch? Would Greenland’s be as it is without two miles of ice sitting on it?

The present weather systems of the Earth are a direct result of past weather — and climate — in the region.

Aren’t climate models looking at the long picture?

Nick –> I’m sorry but these geographical features DO AFFECT weather. I live near the Kansas River and its basin. I am old enough to have recognized how this feature affects thunderstorms and frontal boundaries. This river basin is also small compared to bigger rivers.

Ask balloon enthusiasts and glider pilots what happens when they fly over these types of geographic features. I’ve sat and watched balloons sink precipitously as they enter the basin.

These effects are just another facet that is missed in GCMs and consequently another cause of uncertainty in their outputs.

“Nick, do you suppose the climate of that region would be the same without that gigantic ditch?”

Quite possibly not. But changes to the ditch are not affecting weather or climate on any timescale on which we are concerned with them.

Nick –> You equivocated in your answer. James didn’t talk about “changes”; he said that these features do have effects on weather and over time affect the climate of those areas.

Simply dismissing these facts is not appropriate. Doing so leads many to believe that all you’re interested in is making your computer game come out the way you wish, not the way it should be.

“Yes, but it isn’t part of “present weather”.”

Right. Present weather does not have effects, only the past weather had. Gee, the mental gymnastics Nick has to do in order to preserve his cargo cultist beliefs…

Nick, you discard Re(a) greater than zero on the grounds that exponential growth is non-physical.

Yet climate is the result of multi-variate non-linear processes.

Re(a) could be close to zero and greater than zero if simultaneously held in check by other emergent or non-linear processes.

LOL… So the science *is* settled, except that we cannot even agree on how to determine the propagation of error of the “simple” systems we use to model a complex chaotic natural system.

There is one and only one true test…make predictions, wait enough time, measure the delta of predictions from observations. The models have been running now for…30 years? And so far they have consistently produced too much warming, even after several “tweaks”. The models have proven useless, time and again, on their predictive powers for only 10 years out. How can any competent scientist put faith into their predictions 100 years out?

Scrap the current models and start over without building in all the bias for CO2, and one can likely make more accurate predictions. The “CO2 controls everything” hypothesis is just wrong. People need to get over their bias and accept this, or they will never make any progress towards useful science.

I’ve been having an exchange with someone in another forum, a university-level machine learning expert who works on climate models and is very offended by my paper.

In the discussion, I realized after examining this person’s published papers that predictive uncertainty is seen from a perspective of statistical inference. This is where one has a statistical model trained on past behavior and evaluates it by generating a probability distribution of calculated possible future behavior. The inference is that the past defines the future.

The scientific approach to predictive uncertainty is from a perspective of deduction from physical theory. In this alternative one wants to know how closely the theory predicts the physically correct answer and how reliable the theory is in getting to that answer.

The scientific approach is a totally different way to think about prediction. The scientific approach is not inferential at all. Inferential extrapolation is theory-free.

The scientific approach is deductive from falsifiable physical theory. It is not centrally concerned with inferentially calculated variety. It is concerned with causally accurate physical theory.

Predictive uncertainty from the inferential statistical view and from the deductive scientific view are mutually exclusive.

Statistical inference rests entirely on the substratum idea that the future will be like the past. The model is trained on a data set of past observables. It is then extrapolated forward, varying some inputs, to get some probability distribution of possible future events.

The fully statistical model is a-causal. There is no base in physical theory. Inferences about uniformity do not constitute a physically causal theory.

An engineering model includes as much physics of the system as is useful (or known), but nevertheless uses prior data (experimental data most often) to parameterize an inferential model, so as to reproduce the experimental observables and interpolate the system behavior between the data points.

This describes the current structure of climate models. They include physics but are inferential models. They are trained on past observables and are used to infer a probability distribution of future climate observables. This is fully a ‘past-defines-the-future’ approach.

With this insight, I understand why climate modelers view predictive uncertainty to be the probability distribution of model expectation values that infer future observables. I also now understand why they describe model variability about a mean as ‘propagated error.’

Neither of those usages is appropriate to the physical sciences (or to any science).

In a physical theory, predictive uncertainty is an estimate of how narrowly the theory predicts what should be a single-valued observable. Predictive uncertainty then becomes an interval indicating the region wherein the correct single-value is likely to be located.

But accuracy of the uncertainty interval is revealed only in the event of an experiment. The correct value may be in a wing of the interval, or even may not be located in the uncertainty interval at all.

In science, predictive uncertainty is never a probability distribution of model expectation values inferred from past behavior. That distribution is precision.

In science, predictive uncertainty is an interval that estimates how well understood is the physics beneath the theoretically deduced prediction. That estimate is accuracy.

The behavior of an inferential model in predicting various outcomes is not how accuracy is estimated in the sciences. Accuracy is solely determined by how closely a prediction comes to a true value.

When the true value is unknown, for example the resonance energy for some speculative subatomic particle, one can calculate a probability distribution of possible energies based upon current physical theory (e.g., the Standard Model). That probability distribution is theoretically deductive; not statistically inferential.

Physicists can do an experiment at that energy. But there’s no way to know where in that predicted energy interval the resonance will appear, or even if it will appear there at all, or anywhere else.

So, that prior prediction interval is not an accuracy measure. It’s an uncertainty measure. The accuracy is determined by whether the resonance appears, and if so, where. That is, accuracy is determined by experiment, after the fact of estimating the uncertainty of the prediction.

Prior to the experiment, the width of the prediction interval can be estimated, again empirically, by reference to a calibration experiment that reveals how accurately the theory predicted the resonances of known particles of perhaps similar energy.

That empirical reliability interval is calculated from the (predicted minus known observed) difference, also taking into account whether critical values that condition the theory are known accurately.

Applied to climate models, an estimate of predictive uncertainty could be derived from a calibration experiment showing the difference between simulated and known observables, with added uncertainty to come from model predictive variability as determined by varying poorly constrained parameters through their uncertainty interval.

The total predictive uncertainty of a single-value prediction would be represented as (+/-)calibration uncertainty and (+/-)model variability, or, alternatively, (+/-)sqrt(sum of the two uncertainty variances).

Clearly, in modeling a sequential series of states using a predictively deficient physical theory, and in which the uncertainty of each subsequently predicted state starts from the predictive uncertainty of the prior state, the predictive uncertainty must increase with state predictions, and propagates through the serial predictions as the root-sum-square.
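The two-part bookkeeping described above — combining a calibration uncertainty with model variability, and letting equal per-step uncertainties accumulate through serial predictions — can be put in a few lines. A minimal sketch (the numerical values are arbitrary illustrations, not taken from any model):

```python
import math

def total_uncertainty(u_calibration, u_variability):
    # Single prediction: combine the calibration uncertainty and the
    # model variability as the root-sum-square of the two components.
    return math.sqrt(u_calibration**2 + u_variability**2)

def propagated_uncertainty(per_step_u, n_steps):
    # Serial predictions: each state inherits the uncertainty of the
    # prior state, so equal per-step uncertainties accumulate as the
    # root-sum-square, i.e. grow like sqrt(n).
    return math.sqrt(sum(u**2 for u in [per_step_u] * n_steps))

print(total_uncertainty(3.0, 4.0))       # 5.0
print(propagated_uncertainty(0.5, 100))  # 5.0 after 100 steps
```

Note that the propagated value says nothing about where the true state lies; it is only the width of the interval within which the prediction can no longer discriminate.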

That is, in science predictive uncertainty is not intrinsic to the behavior of the model. It is extrinsic and determined externally by accuracy with respect to experiment (well-known observables).

This is how a calibration experiment is derived. It is a direct empirical measure for how well a physical theory reproduces a known experimental result.

The error revealed by calibration does not depend on the inner behavior of the model for its meaning. It takes its value and meaning only by the external conformance of theory and result. The calibration error statistic that results defines the predictive uncertainty of the model.

This difference is what powers my critics from climatology. They are applying the judgments used in the context of their inferential models to an estimate of the predictive uncertainty of a deductive physical theory in the context of science.

They are applying the criteria of model inferential precision to judge an estimate of model deductive accuracy. Such criticisms are entirely inappropriate.

They are making the mistake Terry Oldberg so often pounded home: the equivocation fallacy. They are powering their objection by use of the wrong definition of predictive uncertainty. In essence, they are equivocating accuracy with precision.

The folks who infer the values of future observables using statistical inference seem locked into their view.

They are insisting that science is inferential.

It is not.

“…The inference is that the past defines the future…”

Ironic since Nick maintains that the models rapidly make past conditions irrelevant to the present and future.

Michael,

The models do because they are closer to a heat equation because of the excessive dissipation.

A heat equation cannot be run backwards in time.

Jerry

Ironic, too, because he thinks everything is dissipative except CO2.

“the models rapidly make past conditions irrelevant to the present and future”

The models thus behave just like the atmosphere, as they should. Weather today is not determined by the weather in 1970. Or even in 2018.

“Ironic, too, because he thinks everything is dissipative except CO2.”

As I said, events dissipate except where conservation is involved. CO2 is subject to mass conservation.

Dumb comment. The “mass balance” argument has been thoroughly debunked. I’m surprised to see you make it, Nick. I thought you were smarter than that.

Conservation of mass has never been debunked.

It’s not a closed system.

Pat, that is an interesting and expansive comment, but the Eliza Doolittle in me, as on the previous thread, makes me ask if you can “show me the mathematics”. That is, for your two contexts about uncertainty, can you characterize them in equations, as Kevin Kilty has attempted to do, except I would be happy with bare bones linearized examples? This would help me (and others I am sure) compare the two and understand why you believe your interpretation and usage is correct.

Rich.

Rich,

You seem to like math. Ok, take a derivative pricing model, one that numerically solves a PDE in n-space and parameterize it:

– It’s exact – for your market parameters, you get the same result every time you run the model

– It’s precise – you and I (and everyone else) get very similar results using similar market parameters

– It’s not accurate – the future states of the market parameters are not known

Would you “bet the ranch” on your model?

Frank, yes I like math, but you didn’t supply any. But I’ll try to answer your question anyway. No, I wouldn’t bet the ranch, because in analogy to global warming, its future “forcing” has huge uncertainty, so even if there was no propagation of error from year to year, there could be a huge future hit. In GW, the future forcing from CO2 is assumed predictable from emissions scenarios, and the interesting question is whether the uncertainties from clouds add up from year to year. Please see my comment below on the testing of that.

Thank you Rich. No, I didn’t supply any maths, but am pleased you picked up on the analogy between GCMs and DPMs anyway. I’ll grant that future CO2 predictions are accurate if you’ll concede that past, present and future GCM projections of cloud fraction are inaccurate. And btw, as political belief in CAGW, and all of its proposed policy solutions, are chained to the GCMs, you are betting the ranch.

Frank, you are making a bit of an assumption there. I don’t have any belief in CAGW at all. It’s just that the jury of which I am the foreman is not yet ready to deliver the verdict that Pat has proven his case. OK, it’s a jury of 1 🙂 And that is because I do not see a logical, mathematical flow from one theory to another to results. Below I am trying to simplify the problem to its bare essence, but this has not yet been accepted.

Rich, “That is, for your two contexts about uncertainty, can you characterize them in equations,”

What are you talking about, Rich? What two contexts about uncertainty?

Pat, OK, in your own words from above, the two contexts to which I refer are:

A. This is where one has a statistical model trained on past behavior and evaluates it by generating a probability distribution of calculated possible future behavior. The inference is that the past defines the future.

B. The scientific approach to predictive uncertainty is from a perspective of deduction from physical theory. In this alternative one wants to know how closely the theory predicts the physically correct answer and how reliable the theory is in getting to that answer.

I hope that is clearer for you. So, can you write equations for the simplest sort of model which demonstrate the difference between these two? If you can, then we have something from which to proceed. Preferably you would use my notation M_i(t), R(t), B(t;z,s) but of course I cannot constrain you. I think I understand A., but not B.

(The purpose of this comment is to explore with a simple example the relationship between error and uncertainty, and ask about testability.)

Tim, I see that you have been commenting on this thread. I had prepared a long reply to your comment on the older thread https://wattsupwiththat.com/2019/09/19/emulation-4-w-m-long-wave-cloud-forcing-error-and-meaning/#comment-2811991 , but unfortunately the thread was closed when I got there. I don’t know why it got closed, because I felt we were making some progress, and I found your 12 inch ruler example quite interesting. I know I made a snarky comment about your maths, but with more interaction I’ve been able to see that you raise some quite tricky questions about statistical modelling. For example I have had to think long and hard about how to do any testing on those 12 inch rulers under the assumption that their factory might have made assertions about their “uncertainty”.

Here is a summary of where I think we got to.

1. We have a discrete time model M(t) = a M(t-1) + B(t;z,s) (*), where ‘a’ is like Nick’s eigenvalue, which I’ll take to be 1 for now in line with Pat’s paper, and B(t;z,s) is a random variable with unknown mean z and variance s^2.

2. You objected that B is just a single value in an interval, say from –s_0 to +s_0, so putting your words into mathematics (*) should be written as M(t) = M(t-1) +/- s_0.

3. You wrote “uncertainty doesn’t change from iteration to iteration”, from which I understand that you would consider “uncertainty” to be the distribution of a random variable rather than the r.v. itself. This is quite consistent with the case of an interval, where the distribution would be U(-s_0,+s_0), a uniform distribution.
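The recurrence in point 1 can be simulated directly. A quick Monte Carlo sketch with a = 1 and a zero-mean step term (the step s.d. of 1, the run length, and the sample size are arbitrary choices for illustration) shows the spread of M(t) growing like s*sqrt(t), which is exactly summation in quadrature:

```python
import random
import statistics

random.seed(1)
s = 1.0         # standard deviation of the step term B(t;z,s), taking z = 0
T = 100         # number of iterations
trials = 20000  # Monte Carlo sample size

finals = []
for _ in range(trials):
    m = 0.0
    for _ in range(T):
        m = m + random.gauss(0.0, s)   # M(t) = a*M(t-1) + B(t) with a = 1
    finals.append(m)

sd = statistics.pstdev(finals)
print(sd)   # close to s * sqrt(T) = 10: per-step spreads add in quadrature
```

Whether this random-walk picture is the right reading of Pat’s Equation (6) is of course the very point under dispute below.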

Let’s return to the ruler example and see if we can make progress. I order a batch of 10 rulers and in actuality they lie in the range (11.3,13.3)”; the only way I “know” this is because the factory did calibration experiments and told me that these cheap rulers had an uncertainty, some would call it “error”, compared with 12” which is uniform in (-0.7,+1.3)”. The actual range for the 10 rulers would probably be something like 11.41 to 13.22 inches, but we don’t know. Given what we have been told we have 10 rulers of lengths which are random variables X_i in the uniform range (11.3,13.3).

Now, I want to build a train table 10 feet long, and to help me all I have is these 10 rulers. Consider 3 ways of using the rulers.

1. Choose just a single one at random and use it 10 times. The uncertainty relative to 10 feet is 10(-0.7,1.3) = (-7,+13)”, with mean 3” and standard deviation 5.77”.

2. Lay the rulers side by side, sort them, and choose the 4th shortest one. Its expected error is -0.7+(4/11)(2) = 0.02727…”; I don’t know its standard deviation but I could work it out, and I’ll guess it’s about 0.1”, so the uncertainty after using it 10 times is some calculable distribution with mean 0.27” and standard deviation about 1”. That’s an improvement over method 1.

3. Use all 10 rulers, once each. The expected length is 123” and the standard deviation is sqrt(10/3) = 1.83”, so the uncertainty relative to 120” is centred on 3” and has s.d. 1.83”. (This s.d. appears worse than method 2’s, but I may have underestimated that one’s.)

To get the exact uncertainty distribution we would have to convolve distributions to find the distribution of sum_1^10 (X_i-12). This distribution is piecewise polynomial of degree 9, with the pieces on the intervals (-7 + 2i, -5 + 2i) for i from 0 to 9 inclusive. The uncertainty is now definitely a distribution rather than an interval, because if one insists on merely calling it the uncertainty interval (-7, 13), then that implies the result is uniformly probable within that interval, which ignores the fact that some actual partial cancellation of errors in the rulers will have occurred: the middle “pieces” are more probable than the outer ones. However, the full convolved distribution is unwieldy, so instead it is usual to calculate and quote a mean and standard deviation for it, and sometimes assume an approximate normal distribution.
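The three methods can be checked by simulation. A rough Monte Carlo sketch (seed and trial count arbitrary; method 2 takes the 4th ruler counting from the shortest, the one with near-zero expected error, and the run also tests the guessed method-2 standard deviation, which may well be an underestimate). Method 1 should show mean ~3” and s.d. ~5.77”, method 3 mean ~3” and s.d. ~1.83”:

```python
import random
import statistics

random.seed(42)
LOW, HIGH, TRUE = 11.3, 13.3, 12.0   # ruler lengths uniform; nominal 12"
trials = 50000

err1, err2, err3 = [], [], []
for _ in range(trials):
    rulers = [random.uniform(LOW, HIGH) for _ in range(10)]
    # Method 1: one ruler picked at random, used 10 times.
    err1.append(10 * (rulers[0] - TRUE))
    # Method 2: the 4th ruler counting from the shortest (near-zero
    # expected error), used 10 times.
    err2.append(10 * (sorted(rulers)[3] - TRUE))
    # Method 3: each of the 10 rulers used once.
    err3.append(sum(rulers) - 10 * TRUE)

for name, e in (("method 1", err1), ("method 2", err2), ("method 3", err3)):
    print(name, round(statistics.mean(e), 2), round(statistics.pstdev(e), 2))
```

The simulation makes the convolution argument concrete: method 3’s errors partially cancel, so its spread is far narrower than the full (-7, 13)” interval suggests.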

Now we come to the question of testability. If we believe the factory, and we are rich enough, we could order 1000 rulers and find that the 350th shortest ruler is really rather good for our purposes. But the factory might have employed some dubious mathematics to arrive at its U(-0.7,+1.3)” assertion. Can we actually test their veracity?

In any particular scenario, it depends on what arithmetic is available. We don’t have any accurate rulers, because we would have used them instead of these cheap ones. Nevertheless we have our eyes, with which we can sort rulers by length, so if we take the 7 shortest of the 1000, which should measure about 11.3*7 = 79.1” long, and the 6 longest ones, which should measure about 13.3*6 = 79.8” long, then we expect the latter to add up to longer than the former. If that does not happen, and many other more carefully calculated tests can be devised, then we can doubt the factory’s claim. If at least the uniformity part of the claim is correct, we can almost certainly get a good estimate of the ratio of the longest to shortest ruler.

How can this relate to Pat’s paper? For any GCM i it gives an output temperature M_i(t) at time t. Then the “rulers” are M_i(t)-M_i(t-1) over the ranges of i and t, and for a fixed i it is clear that he is using method 3 to combine them. These are real numbers calculated to huge precision most of which is meaningless, but the question is do these numbers conform to what Pat would predict from his +/-4 W/m^2 uncertainty? There is actually more arithmetic available for testing than there is in the case with the rulers.

There may be complaints that I have again misunderstood “uncertainty”. But at time t there are 3 different types of thing to play with: M_i(t) which is model output, M(t) which is Pat’s emulator output, and R(t) which is reality. If uncertainty is a probability distribution on outcomes, then it presumably says something meaningful about M_i(t)-M_i(t-1), or possibly about M_i(t)-R(t)-M_i(t-1)+R(t-1). If uncertainty is not that, then what is it and how is it of interest?

Rich –> I can give you a less complicated example of uncertainty.

Tmax = 85 +/- 0.5 degrees and Tmin = 52 +/- 0.5 degrees. What is the real average? The calculated average is 68.5 degrees. Yet using common physical properties from chemistry and physics the average would be reported as 69 degrees (think significant digits and how they are ignored by climate scientists).

Now how about uncertainties. Could all the temps be at the high range? Those temps have the same probability as the reported temps. We would then get an average of 69 degrees. No reason to round this since it is already 2 sig figs. Could all the temps be at the low range? Those temps also have the same probability as the reported temps. We would then get an average of 68 degrees. Heck, you define the points you want within the +/- 0.5 range and they are all just as likely as any other combination. In statistical terms this is because each measurement is its own population. You must compute the uncertainty when averaging the same way you do when combining different populations.

You can report this as 68.5 +/- 0.5 degrees. However, it is important to include the uncertainty so that everyone knows that you have no idea where the real temperature lies. An average temperature of 68.9538 degrees is just as likely as 68 degrees. There is simply no way to know the ACCURATE value. Climate scientists like to take the average with an extra digit of precision without describing the uncertainty. How they justify this is beyond me. Some talk about the central limit theorem and/or random errors canceling. They don’t seem to realize that this requires measuring THE SAME THING with the SAME DEVICE multiple times.
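The worst-case bookkeeping above can be written as plain interval arithmetic. A minimal sketch (this endpoint propagation is just the standard interval rule for a mean, offered for illustration, not anyone’s published method):

```python
def mean_interval(readings):
    # readings: list of (low, high) bounds for each measurement.
    # The mean of all the lows and the mean of all the highs bound
    # every possible average of values inside the intervals.
    lows = [lo for lo, _ in readings]
    highs = [hi for _, hi in readings]
    n = len(readings)
    return (sum(lows) / n, sum(highs) / n)

# Tmax = 85 +/- 0.5 and Tmin = 52 +/- 0.5:
lo, hi = mean_interval([(84.5, 85.5), (51.5, 52.5)])
print(lo, hi)   # 68.0 69.0 -- the average is only bounded to +/- 0.5
```

This reproduces the 68-to-69 range in the comment; the dispute below is over whether every value in that interval is equally probable.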

Where your ruler analogy falls down is that temperatures are transient. They are measured at a single point of time and then the temperature changes to another value. You can’t compare them and choose the ones that fit a statistical calculation like you have done because there is only one value (data point) of each temperature. The next temperature measurement is a new one and has its own uncertainty. It can not affect the measurement of the previous temperature.

It’s more like having one elastic ruler that changes length each time you use it. And, the changes result in a uniform distribution, that is, any ruler length has an equal probability to any other ruler length within the boundaries you set. That is the definition of uncertainty. It doesn’t go away, nor does it cancel out. And, since you are making iterative measurements the uncertainty will grow each time you make a measurement. You won’t know if your final measurement is at the lower end or the higher end, or somewhere in between. In your words, with 10 consecutive measurements you could be 7 inches short, 13 inches long, or anywhere in between (120″ -7″/+13″).

Reply to Jim Gorman Oct7 8:25am

Firstly, I thought it was Tim Gorman, but now I see Jim Gorman. Are you the same person? Brothers?

In your average of Tmax = 85+/-0.5 and Tmin = 52+/-0.5, you didn’t notice from my earlier comment that the sum of n uniforms is not uniform but is piecewise polynomial over n intervals (piecewise linear when n = 2). In the case here, n = 2, and the average should read 68.5 + X (or 68.5+/-X since X is symmetric about 0) where X has the triangular probability density function (think P[X ‘=’ x])

f(x) = 2(1+2x) if -1/2<x<0, 2(1-2x) if 0<x<1/2

whereas uniform would be f(x) = 1 for -1/2<x<1/2. Naturally they have different variances: the latter is 1/12, the former is by direct integration 4(1/24-1/32) = 1/24, which makes sense because Var((A+B)/2) = (Var(A)+Var(B))/4 = Var(A)/2 if A and B are i.i.d. This would matter if you wanted to combine uncertainties.
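As a quick numerical check of the variance halving (a Monte Carlo sketch with arbitrary seed and sample size): for two independent uniform +/-0.5 errors, a single reading has variance 1/12 while their average has variance 1/24, half as much.

```python
import random
import statistics

random.seed(0)
N = 200_000
a = [random.uniform(-0.5, 0.5) for _ in range(N)]   # error in Tmin
b = [random.uniform(-0.5, 0.5) for _ in range(N)]   # error in Tmax
avg = [(x + y) / 2 for x, y in zip(a, b)]           # error in the mean

print(statistics.pvariance(a))     # close to 1/12 ~ 0.0833
print(statistics.pvariance(avg))   # close to 1/24 ~ 0.0417, i.e. halved
```

A histogram of `avg` would also show the triangular (non-uniform) shape: values near 0 occur roughly twice as often as values near the edges.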

You say “They don’t seem to realize that this requires measuring THE SAME THING with the SAME DEVICE multiple times”. I’m afraid that is not correct. Tmax and Tmin above were different things, possibly measured by the same device, possibly not, but the variance of the mean was less than the variance of the individual components.

As for rulers versus temperatures, and transience, my rulers (actually yours since I recall you introduced them in the previous blog thread) might be transient. After I lay down a ruler and mark its endpoint, it might shrivel up (I did say they were cheap rulers). The transience has nothing to do with the statistical properties of the ensuing uncertainty.

You say “And, since you are making iterative measurements the uncertainty will grow each time you make a measurement. You won’t know if your final measurement is at the lower end or the higher end, or somewhere in between. In your words, with 10 consecutive measurements you could be 7 inches short, 13 inches long, or anywhere in between (120″ -7″/+13″).”

If the 10 consecutive measurements were with different rulers, you have now confused the interval in which the error must lie and the probability distribution of the error. The interval is what you say, but the distribution is not uniform, since uniform+uniform != uniform. Consider 2 dice: their sum is uncertain between 2 and 12. But 7 has probability 1/6 whereas 2 and 12 each have only 1/36.
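The dice claim is easy to verify exactly by enumeration; a small sketch using exact fractions:

```python
from collections import Counter
from fractions import Fraction

# Enumerate all 36 equally likely outcomes of two fair dice and
# count each possible sum.
counts = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))
probs = {total: Fraction(n, 36) for total, n in counts.items()}

print(probs[7])    # 1/6
print(probs[2])    # 1/36
print(probs[12])   # 1/36
```

The interval for the sum is indeed 2 to 12, but the probability is far from uniform across it, which is the whole point of the analogy.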

1. “n” is not equal to 2. Each temp reading is a stand-alone population of 1. You need to review how to combine independent populations.

2. You keep wanting to use error calculations rather than deal with uncertainty. Uncertainties are not random errors. You can not reduce them by averaging independent measurements.

3. If you are sure that the variance of the mean is less than each independent measurement then take the plunge and tell us exactly what value would specify for the average and the uncertainty. You are hiding behind equations without knowing how they should be used.

4. Please show how each temp reading when averaged can assume smaller values than the actual uncertainty shown. You do realize that it is necessary to end up with less variance in each of the components for the average to have a smaller variance? Again you are mistaking random errors of the same thing measured multiple times for uncertainty.

5. Your example of two dice has no relation to what is being discussed. We are not discussing probabilities of some combination. You need to address if you can have a 1/1 or a 6/6 or a 1/6 or 3/2 or any other combination and which combination is more likely than another.

6. You just won’t admit that uncertainties in an iterative procedure accumulate will you? Please answer the questions about the temps being all highs or all lows or something in between and what the actual probability of any combination is! Then give us a numerical calculation with an average and uncertainty using your equations.

7. Lastly, give us a discussion of significant digits and how it is used in physical measurements. How is uncertainty addressed with sig figs?

“You keep wanting to use error calculations rather than deal with uncertainty. Uncertainties are not random errors.”

Said endlessly. But never an explanation of how to do calculations with this non-random uncertainty.

“We are not discussing probabilities of some combination. You need to address if … and which combination is more likely than another.”

???

Reply to Jim Gorman Oct7 8:39pm

1. “n” is not equal to 2. Each temp reading is a stand-alone population of 1. You need to review how to combine independent populations.

R: Yes, Tmin is a population of 1, Tmax is a population of 1, and when you add them together to get the mean you have used 2 independent populations of size 1. So the error in the mean comes from adding n uniforms where n = 2. QED

2. You keep wanting to use error calculations rather than deal with uncertainty. Uncertainties are not random errors. You can not reduce them by averaging independent measurements.

R: What Nick said. And if you cannot show me any mathematics to support what you say, then all you have is words. Though words can be useful, science relies on mathematics. Still, even with words, can you define “uncertainty” in respect of Tmin and tell us how it differs from “error”?

3. If you are sure that the variance of the mean is less than each independent measurement then take the plunge and tell us exactly what value would specify for the average and the uncertainty. You are hiding behind equations without knowing how they should be used.

R: In answer I am going to use +/-_ to denote a uniform interval, +/- to denote a single standard deviation, and +/-* to denote a probability distribution symmetric about 0. Therefore +/-_x equates to +/-*U(-x,x) equates to +/-(0.5774x). In your example of Tmin = 52+/-_0.5 and Tmax = 85+/-_0.5 I would specify Tmean = 68.5+/-*f where f is the triangular density of the mean of two U(-0.5,0.5) errors; I can then simplify that to 68.5+/-sqrt(1/24) = 68.5+/-0.20 (keep the 2nd decimal digit in case of use in further calculations, but drop it in publication). There is no meaningful +/-_y value because the error is definitely not uniform.

4. Please show how each temp reading when averaged can assume smaller values than the actual uncertainty shown. You do realize that it is necessary to end up with less variance in each of the components for the average to have a smaller variance? Again you are mistaking random errors of the same thing measured multiple times for uncertainty.

R: Your second sentence is contradicted by the maths I gave earlier: Var((A+B)/2) = (Var(A)+Var(B))/4 = Var(A)/2 if A and B are independent and identically distributed.

5. Your example of two dice has no relation to what is being discussed. We are not discussing probabilities of some combination. You need to address if you can have a 1/1 or a 6/6 or a 1/6 or 3/2 or any other combination and which combination is more likely than another.

R: My dice example was purely to exemplify how adding 2 uniform distributions, whether it be die 1 plus die 2 or Tmin plus Tmax, does not lead to a uniform distribution, and therefore error distributions or uncertainty thingummyjigs cannot usefully be described by uniform distributions.

6. You just won’t admit that uncertainties in an iterative procedure accumulate will you? Please answer the questions about the temps being all highs or all lows or something in between and what the actual probability of any combination is! Then give us a numerical calculation with an average and uncertainty using your equations.

R: I don’t know what gives you the idea that I deny accumulation of uncertainty thingummyjigs. (I’ll drop that appellation – I only put it in to emphasize that no-one has given me a satisfactory mathematical description of “uncertainty”.) Var(A+B) = 2 Var(A) is exactly accumulation (in quadrature as Nick puts it), and it is what Pat is doing in his Equation (6). I might, however, in some circumstances refute that A and B are i.i.d. and then 2 Cov(A,B) comes into the equation, which can either increase or decrease the result.

7. Lastly, give us a discussion of significant digits and how it is used in physical measurements. How is uncertainty addressed with sig figs?

R: The uncertainty would be either a +/-_x value if it is a simple value arising from one source of error, or a +/-x value if it is a standard deviation arising from many sources, wherein the error sum of squares rule applies. This distinction between two plus/minus notations is my own concoction of today, because your detailed questions require that clarification. When one has arrived at a +/-x value it is a personal choice how to collapse that into significant figures for publication. I have a current gripe with the UK Met Office because they have stopped reporting hourly temperatures to 0.1°, which makes it harder if not impossible to do interesting analysis in a world in which we are frantically told that a further warming of 0.7° or so will be disastrous. So I believe they are reporting one sigfig too few.

You still don’t get it. We’re not talking about the error of the mean. We are talking about the uncertainty in the recorded measurement. Here is a simple question for you to answer.

Take Tmin as 52 +/- 0.5 degrees.

Is the real temperature as likely to be 51.5 degrees as it is to be 52.5?

No equivocation, just a simple yes or no.

Reply to Jim Gorman Oct8 5:55am

“Take Tmin as 52 +/- 0.5 degrees. Is the real temperature as likely to be 51.5 degrees as it is to be 52.5? ”

No, I don’t need to equivocate on that one: yes, under assumptions that most reasonable people would make, the real temperature is equally likely to be between 51.50 and 51.51 degrees as between 52.49 degrees and 52.50 degrees, or indeed between 51.73 and 51.74 degrees. That is the nature of a uniform distribution.

However, Jim, it was you at Oct7 8:25am who started looking at the average temperature, and I have merely been following your lead whilst showing you that for that value:

* Its uncertainty is not uniform

* The standard deviation associated with its uncertainty is less than those of Tmin and Tmax

P.S. You will note that I used small, centidegree, intervals. This is because the likelihood, or probability as we say in the trade, of the temperature being what you ask for, 51.5, i.e. 51.500000000000… is zero, so it is trivially the same as the probability of it being 52.50000000000…

Rich, “M(t) = a M(t-1) + B(t;z,s) (*), where ‘a’ is like Nick’s eigenvalue, which I’ll take to be 1 for now in line with Pat’s paper, and B(t;z,s) is a random variable with unknown mean z and variance s^2.”

That equation is not in line with my paper. I do not calculate temperature T_i in terms of T_i-1.

I do not have a random variable term.

You wrote to Tim, “You wrote ‘uncertainty doesn’t change from iteration to iteration’, from which I understand that you would consider ‘uncertainty’ to be the distribution of a random variable rather than the r.v. itself.”

The calibration error statistic is not the distribution of a random variable.

You have done this same thing over and over and over, Rich. Imposing your model on my work. It’s really tiresome. Your analyses begin wrongly. They conclude wrongly.

You have yet to deal with what I have actually done or grant that I meant what I actually wrote.

I believe that, succinctly, is Tim’s point.

Pat, in response to your Oct7 8:47am I’ll use R1, P1, R2 to denote elements of our conversation so far.

R1: “M(t) = a M(t-1) + B(t;z,s) (*), where ‘a’ is like Nick’s eigenvalue, which I’ll take to be 1 for now in line with Pat’s paper, and B(t;z,s) is a random variable with unknown mean z and variance s^2.”

P1: That equation is not in line with my paper. I do not calculate temperature T_i in terms of T_i-1.

R2: Are you quite sure about that? In the previous thread we established that “Equation (5) is a change of temperature across 1 step (a Δ_1) plus one additive 13.86”, in other words

13.86+T_i-T_{i-1} = x

where x is a known quantity calculated from forcings plus an error component. After moving the 13.86 to the other side, that looks awfully similar to (*) to me.

P1: I do not have a random variable term.

R2: Whatever you call it, you have a u_i in Equation (5). The only way you can get to use “sum in quadrature” in Equation (6) is to treat u_i as a random variable.

R1 (to Tim): “You wrote “uncertainty doesn’t change from iteration to iteration”, from which I understand that you would consider “uncertainty” to be the distribution of a random variable rather than the r.v. itself.”

P1: The calibration error statistic is not the distribution of a random variable.

R2: If it is neither a random variable, nor the distribution of a random variable, then what is it such that n instances of it can be combined in Equation (6) exactly as if it was? It quacks like a duck. Is it actually a fox?

“Neither his, nor any other reference I know, tackles propagated error through a system of linear equations.”

https://lmgtfy.com/?q=propagated+error+through+a+system+of+linear+equations

The first 5 links are Ads.

After that is a link to propagation of uncertainty, link to a wiki page about uncertainty and then another about uncertainty.

Perhaps a link to exactly what you want someone to read might be more useful.

Don’t bother with facts, Anonymoose. Uninformed opinions are preferable! ;>)

I actually meant originally to post the set of equations

dT/dt = -a*T^4 + b*C

dC/dt = c*T

It was intended that C represent CO2 (hence the C). Because, that is the dynamic we see in the data: the rate of change of CO2 concentration is proportional to temperature (actually, more appropriately, temperature anomaly with the appropriate baseline).

http://woodfortrees.org/plot/esrl-co2/mean:12/from:1979/derivative/plot/uah6/scale:0.18/offset:0.144

Salby noted this, and he is absolutely right. There is no doubt about it. The somersaults and contortions otherwise intelligent people go into to deny this obvious fact have been a sight to see.

So, a more appropriate model is the linearized system

dTa/dt = -a*Ta + b*C

dC/dt = c*Ta

This system has no stable equilibrium for a, b, and c greater than zero. That means we CANNOT have this combination, because the Earth would have runaway to a saturation condition eons ago. We manifestly know that a and c are greater than zero, so one possibility is that b is less than or equal to zero.
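Bartemis’s instability claim can be checked directly: the Jacobian of the linearized system is [[-a, b], [c, 0]], with trace -a and determinant -b*c. Whenever b and c are both positive the determinant is negative, which forces one positive eigenvalue (a saddle) no matter how large the damping a is. A minimal sketch, with purely illustrative coefficient values (not fitted to any data):

```python
import cmath

def eigenvalues_2x2(a, b, c):
    """Eigenvalues of the Jacobian [[-a, b], [c, 0]] of the system
    dTa/dt = -a*Ta + b*C, dC/dt = c*Ta."""
    tr = -a            # trace
    det = -b * c       # determinant
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + disc) / 2.0, (tr - disc) / 2.0

# Hypothetical positive coefficients: one eigenvalue is positive,
# so the equilibrium is an unstable saddle.
lam1, lam2 = eigenvalues_2x2(a=1.0, b=0.5, c=0.5)
print(lam1.real, lam2.real)

# With b <= 0 the determinant is non-negative and no eigenvalue
# has positive real part, consistent with the argument above.
mu1, mu2 = eigenvalues_2x2(a=1.0, b=-0.5, c=0.5)
print(mu1.real, mu2.real)
```

The sign of the determinant alone settles the question, which is why no choice of a > 0 can stabilize the system while b and c are both positive.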

Is that possible? Yes, it is. Because CO2 has both a warming and a cooling potential, and both potentials increase with increasing concentration. CO2 heats the surface by blocking outgoing radiation. But, it also accepts heat energy from other atmospheric constituents via thermalization and radiates it away. Which potential dominates in the present climate state? I suspect they are roughly in balance, because natural systems tend to evolve to such balanced states (Le Chatelier and all that).

What other possibilities might there be? We could add a negative feedback onto C:

dTa/dt = -a*Ta + b*C

dC/dt = c*Ta – d*C

but such a term is not observable in the last 60 years of data from MLO, and if sensitivity of temperature anomaly to CO2 were significant, we would have been seeing apparent instability over this timeline. So, for all intents and purposes, we need not concern ourselves with such a term at this time.

We could try another feedback – water vapor seems likely enough:

dTa/dt = -a*Ta + b*C + c*W

dC/dt = d*Ta

dW/dt = e*Ta – f*W

But, stability here requires that the water vapor feedback be negative in such a way as to overwhelm any positive feedback from CO2, and that is completely contrary to the notion that water vapor feedback is positive, which the narrative needs to boost climate sensitivity to alarming levels. Significantly negative water vapor feedback, in fact, would reduce the aggregate sensitivity to negligible levels.

Any other addition that can stabilize this plant leads down the same road – a powerful negative feedback to counteract the instability induced by positive feedback of CO2 would render the climate sensitivity negligible.

The upshot is, aggregate climate sensitivity to rising CO2 concentration is net negligible – either CO2 itself actually produces net cooling, or some other negative feedback is overwhelming it. Temperature anomaly rose in the last century because of either external forcing or constructive reinforcement of internal heat storage and release mechanisms. If we wait long enough, this too shall pass.

footnote: I put out the original equation

dT/dt = -a*T^4 + b*C

dC/dt = c*T

just to make the point that T^4 radiation won’t rescue the system from instability. No matter how large this feedback gets, this 2nd order system is still unstable. So, we really need only focus on the linearized perturbative system to draw the conclusions of the preceding post.

Bartemis,

A very interesting argument.

dTa/dt = -a*Ta + b*C

dC/dt = c*Ta

I agree with you that the system you define with dC/dt = cTa is unstable, and therefore is unlikely to be a valid representation. You are really left with three alternatives:-

(i) b = 0: CO2 concentration has no effect on flux

(ii) There is an additional negative flux feedback which is dependent on C. (An additional feedback which is solely dependent on temperature will not resolve the stability problem.)

OR

(iii) dC/dt = cTa is not valid or is an incomplete representation of CO2 solution characteristics.

Of the three possibilities, (iii) seems to be the most likely explanation, given that solution equilibria are normally dependent on the partial pressure of the solute in the gas phase. Your equation has no physical limit to CO2 concentration even at fixed temperature.

In fact, the only steady state possible is Ta=0 and C=0. The first is possible, but the second is rather drastic.

These are perturbations with no drivers.

It is observably valid over the timeline of interest. I have addressed this point.

And, the equation does have a limit if b is less than zero.

Bartemis,

“the equation does have a limit if b is less than zero”

Agreed. I should have said of (i) the first alternative: b <= 0: CO2 has no effect on flux or has an overall cooling effect.

"It is observably valid over the timeline of interest." That's the $64 question. I looked briefly at the problem a few years ago. The amplitude of intra-annual variation is much larger than inter-annual, which presents some serious problems in interpreting the results. While there is a strong relationship between dC/dt and T, most evident over short timeframes or higher-frequency variation, I found that I could not rule out the presence of a "dissipation" term dependent ultimately on C. If you believe that you or someone else has done this, can you point me to a reference? Thanks.

I think it is ruled out by the simple fact that, when you scale the annual mean dCO2/dt to match the higher frequency variations in T, the low frequency – specifically the trend – matches, too. I think the odds of that match being coincidence are vanishingly small.

That trend would not be there if there were a significant dissipative term. Moreover, were there such a dissipative term, it would not allow our inputs to accumulate, either.

Such matching can be precarious. The higher-frequency cross-spectral coherence is weak, at best. It’s at the lowest frequencies that coherence becomes significant; however, the cross-spectral phase, when adjusted for first-differencing of yearly data, shows CO2 consistently lagging T. Indeed, it’s hard to make any empirical case for CO2 being the physical driver at climatic time-scales.

“The higher-frequency cross-spectral coherence is weak, at best.”

Hardly surprising with all the noise and other influences in these bulk quantities, and the fact that dCO2/dt is numerically derived. In fact, it’s amazing that it’s as good as it is. It indicates strong SNR across the entire frequency band.

Kevin,

There are IMO, at least two substantial problems with Pat’s analysis, and I do not believe that your analysis here is particularly relevant to either of those problems.

You may find this article by Oberkampf et al useful:- https://www.researchgate.net/publication/222832976_Error_and_uncertainty_in_modeling_and_simulation (There is a pdf available if you search.)

The main value of the article is that it sets up a general framework for identifying error and uncertainty. It distinguishes between different sources of uncertainty, aleatory and epistemic, across the different stages of a modeling and simulation project. In practice, in real life problems, the distinction between aleatory and epistemic uncertainty is often blurred.

All numerical models suffer from at least five sources of error:-

(a) Definition of the governing equations (i.e. are the physics appropriate and complete for the problem?)

(b) Data uncertainty in parameters and inputs

(c) Rounding in finite precision arithmetic

(d) Error in the conversion of the governing equations into numerical form (esp. truncation errors in space and time)

(e) Errors in the solution routine, the numerical algorithm used to solve the equations (esp explicit vs implicit solution algorithms, backward stability and solution ordering for coupled systems)

Most books and courses on numerical analysis are focused on (c), (d) and ( e). If you want a book that covers the concepts of stability, conditioning and error propagation in a very readable way, and you have $50 to spare, I would recommend “Numerical Linear Algebra” by Trefethen and Bau. (SIAM. 1997).

Eigenvalue analysis, combined with convergence tests, is very useful when making decisions with regard to (d) and (e). However, with respect to (b), eigenvalue analysis will only tell you something about the main (vector) direction of errors or perturbations in the formulation. It will not tell you much about uncertainty in the outputs arising from data uncertainty, nor anything about correlation between the inputs. Hence, to map the uncertainty envelope in outputs, an MC approach is normally required, applied to the joint distribution of inputs.

Returning to Pat’s analysis, Pat is making no attempt to assess (c), (d) or (e) in the AOGCMs. (If he were to do so, he would find like other authors that the AOGCMs have very poor qualities in predicting spatially distributed state variables, with unknowable levels of aleatory and epistemic uncertainty.) Instead, he is drawing his inferences from his ability to emulate the aggregate response of an AOGCM – specifically looking at its areally-weighted average surface temperature and areally-weighted average flux forcing. This is a model of a model. An implicit assumption in Pat’s analysis here is that the initial model, the AOGCM, is yielding meaningful results AT THE SAME AGGREGATE LEVEL. He is accepting ad argumentum that the AOGCMs are meaningful – that in (a) the AOGCMs meet adequacy requirements – in order to demonstrate that the uncertainty around those results is substantial.

In a model of reality, it is always essential to consider (a) above – is the model a reasonable and adequate representation of reality? Are there more appropriate models? In Pat’s analysis, this question becomes “Is Pat’s emulator a reasonable and adequate representation of the AOGCM aggregate response? Is there a more appropriate model for his purpose?” The answer to the first question IMHO is ‘no, it is not an adequate representation’; for adequacy, any emulator here requires at the very least the ability to distinguish between an error in a flux component, an error in net flux and an error in forcing. You cannot assess the uncertainty arising from a variable Z if your equation or mapping function contains no reference to variable Z. And to the second question ‘yes, there does exist a whole class of more appropriate emulation models all of which are founded on the instantaneous net flux balance’; any of these models present a demonstrably more credible representation of AOGCM aggregate response than does Pat’s model. Importantly, they all lead to a substantial calculation of uncertainty in temperature projection, but one which is different in form and substance from Pat’s.

The second major problem in Pat’s analysis really is about basic statistical quadrature when there is covariance between the variables, or in this instance, strong autocorrelation with time. Pat seems to believe, for reasons which still elude me, that when dealing with calibration statistics the normal rules of quadrature can be suspended.

I gave an example in the previous thread of a simple system given by

Y = bX

First problem: X carries an uncertainty of (+/-)2 (assumed here to be a 1-sigma from a Normal distribution) but b is known exactly. What is the uncertainty in Y when X is measured at 100?

Second problem: b carries a 1-sigma uncertainty of (+/-) 2 and X is measured very accurately. What is the uncertainty in Y when X reaches 100?

Third problem: Over a sequence of 100 unit time-steps, the incremental change in X value is measured. Each ΔXi carries an independent 1-sigma measurement error of (+/-)2. If b is known exactly, what is the uncertainty in Y introduced by these 100 measurements?

Fourth problem: This is a variant on the first problem. Over a sequence of 100 unit time-steps, X is measured. Each measurement of X carries a 1-sigma measurement uncertainty of (+/-)2. An operator then records the change in X, ΔXi = Xi – Xi-1, from its previous measurement value. If b is known exactly, then what is the uncertainty in Y introduced by these 100 measurements?

Comparison of the first and the second problems highlights the difference between an uncertainty in the variable and in the parameter. The first problem yields a stationary error variance, with 1-sigma variation of (+/-)2b. The second yields a variance which is propagated proportional to X^2. The 1-sigma variation when X = 100 is (+/-)200.

The first and the third problems highlight the critical importance of the measurand. The first problem has a stationary variance, while the third problem yields a variance which propagates proportional to n, the number of steps. The 1-sigma variation in Y is (+/-)2b·sqrt(n).

Comparison of the third and fourth problem highlights the critical importance of including covariance in the calculation of uncertainty. In both problems, we end up with a sequence of ΔXi values which are summed to yield the final X value, but the third problem has a variance which propagates in proportion to n, while the fourth problem yields a stationary variance.
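The contrast between the third and fourth problems is easy to confirm by Monte Carlo. Since the true increments cancel identically in both setups, the sketch below simulates the measurement errors only, with sigma = 2 and n = 100 as in the problems (trial count is an arbitrary choice for sampling accuracy):

```python
import random
import statistics

random.seed(1)
N_TRIALS, N_STEPS, SIGMA = 20000, 100, 2.0

err3, err4 = [], []
for _ in range(N_TRIALS):
    # Problem 3: each true increment is measured independently,
    # and the measured increments are summed. The errors accumulate.
    err3.append(sum(random.gauss(0.0, SIGMA) for _ in range(N_STEPS)))

    # Problem 4: X itself is re-measured at each step; the recorded
    # increments telescope, so only the final measurement error survives.
    meas = [random.gauss(0.0, SIGMA) for _ in range(N_STEPS + 1)]
    deltas = [meas[i] - meas[i - 1] for i in range(1, N_STEPS + 1)]
    err4.append(meas[0] + sum(deltas))   # algebraically equals meas[-1]

print(statistics.stdev(err3))   # grows as sigma*sqrt(n): about 20
print(statistics.stdev(err4))   # stationary: about 2
```

The spread in the third problem is roughly sqrt(100) = 10 times that of the fourth, which is exactly the lag-1 covariance cancellation worked out above.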

Pat’s attempt to deal with this problem was very telling. He wrote (about the fourth problem above):-

“The way you wrote it, ΔXi = X_i(+/-)2 – X_i-1(+/-)2, which means the uncertainty in ΔXi = sqrt[(2)^2+(2)^2] = (+/-)2.8.

Now we take Xi-1(+/-)2+ΔXi(+/-)2.8, and the new uncertainty in X = sqrt[(2)^2+(2.8)^2] = (+/-)3.5.

However, I take your point that the uncertainty in each re-measured X is constant (+/-)2 even though the magnitude of X may vary.”

Pat’s response here is tellingly incorrect; indeed he has set up a paradox whereby the 1-sigma uncertainty in X is (+/-)2 and also (+/-)3.5 simultaneously. Further forward calculation using the same logic sees a growing uncertainty in X, despite the fact that it is always measured to an accuracy of (+/-)2.

Let us untangle the paradox. Pat correctly computes the 1-sigma uncertainty in the difference equation Xi – Xi-1 as (+/-)2.8. These are two independent measurements. However, the ΔXi values are not independent of each other in this fourth problem. Mechanistically, each Xi value is calculated here as:- Xi = Xi-1 + ΔXi = Xi-1 + (Xi – Xi-1) so the actual realised error in Xi-1 is eliminated leaving only the uncertainty in the Xi measurement.

Statistically, the errors in the sequence of ΔXi values are lag-1 autocorrelated with a correlation coefficient of -0.5. So we can write

Var(Xn) = Var(X0) + Var(ΔX1) + 2Cov(ΔX1, X0) + Var(ΔX2) + 2Cov(ΔX1, ΔX2) + Var(ΔX3) + 2Cov(ΔX2, ΔX3) + … + Var(ΔXn) + 2Cov(ΔXn-1, ΔXn)

The covariance in errors between ΔXi and ΔXi-1 is -0.5Var(ΔXi), where Var(ΔXi) is the error variance in any ΔXi term. The covariance in errors between ΔX1 and X0 is equal to –Var(Xi). And since Var(ΔX1) = 2Var(Xi), then almost everything in the above expression cancels out leaving

Var (Xn) = Var(X0) = Var(Xi) for all values of i.

The simple lessons here are

(i) There is a need to distinguish between uncertainty in parameters and uncertainty in (state) variables. They have radically different effects.

(ii) There is a need to distinguish between a calibration measure on an accumulated series and a measure on an integrating series. The measurand is critically important.

(iii) There is a need to include covariance in the uncertainty calculation. Autocorrelation translates into covariance.

(iv) Overall, the fact that a measure is described as a “calibration” does not offer a recipe for uncertainty calculation.

kribaez –> “Third problem: Over a sequence of 100 unit time-steps, the incremental change in X value is measured. Each ΔXi carries an independent 1-sigma measurement error of (+/-)2.”

You folks that are dwelling on math are missing the forest for the trees. Uncertainty in a measurement is a statement that says you don’t know where in a range a reading actually falls. It is not a measurement of variation like standard deviation or variance and is not amenable to statistical manipulations.

Your 1-sigma measurement error implies that 68% of the values lie within one sigma of the mean, which also implies a distribution of values like a normal distribution. Uncertainty means the actual value lies somewhere within the range stated but there is no way to know where. There is no distribution of values other than a uniform one. Any value you pick between the low and the high uncertainty is just as likely as any other. There is no sigma in a statistical sense. I guess you could say 1-sigma covers the entire possible range.

Let’s try another analogy one more time. I give you 3 rods to measure with a ruler whose uncertainty is +/- 1 inch. You have 100 guys do measurements on each rod. You plot the points and lo and behold you get a normal distribution for each, so you calculate the variance, the standard deviation, the mean, and the error of the mean. You find out the following: the mean of the first rod is 11.75 inches, the second one is 11.94 inches and the third 12.00. Each of the measurements has a stddev of +/- 0.025 inches and the error of the mean is +/- 0.00125 inches. Pretty good precision, huh?

Now tell us just exactly what the true value is with relation to the uncertainty of the ruler. All you’ll be able to say is that the length is somewhere between 11 inches and 13 inches.

Uh oh, got carried away.

1st rod –> 10.75 inches to 12.75 inches

2nd rod –> 10.94 inches to 12.94 inches

3rd rod –> 11.00 inches to 13.00 inches

But the L&H estimates of modelling error upon which Frank bases his propagation argument are NOT about cumulative uncertainty of stochastically INDEPENDENT, static measurements with a consistently mis-scaled ruler, nor about the ultimate reliability of GCMs. They’re about one aspect of cloud-effect modeling, whose AUTOCORRELATED spatio-temporal variability is compared to satellite measurements over a 20-year interval, with the discrepancy reported as the RMS value of 4 W/m^2.

The chronic inability to comprehend this stark difference while preaching superior understanding is staggering.

Jim,

It is easy enough to repeat each of the four problems with a Uniform distribution. It does not change the conclusions to be drawn from the results. The only reason I chose a Gaussian was to avoid switching between range definition and standard deviation. The variance of a Uniform distribution on [-Z, +Z] is (2Z)^2/12 and the standard deviation is Z/sqrt(3). The algebra for quadrature of the error variances remains unchanged in the examples.

Whether you use a Uniform or a Normal distribution, variance is a measure of spread, not of location. Uncertainty can be characterised as a range, a standard deviation or a variance. None of these statistics specify a mean.

“Uncertainty can be characterised as a range, a standard deviation or a variance.”

Up-thread I have characterized it as +/-_ (uniform in a range), +/- (a standard deviation), and +/-*f (with f an actual probability density). A beautiful new notation, huh?

You haven’t answered the questions I have asked. Why?

Jim,

I should also add that you need to find a new supplier for your 12 inch rulers.

Reply to Jim Gorman Oct8 11:17am

Jim,

“You folks that are dwelling on math are missing the forest for the trees.”

We are doing it because it is important; good science relies on good mathematics. You see a forest, we want to estimate how many trees are in that forest so as to help our lumber industry.

“Uncertainty in a measurement is a statement that says you don’t know where in a range a reading actually falls. It is not a measurement of variation like standard deviation or variance and is not amenable to statistical manipulations.”

That is your opinion, but I don’t think you’ll find it shared by Pat Frank, whose work we have been discussing. His Equation (6) is exactly a statistical manipulation on uncertainty. Though if you want to insist that uncertainty is a range outside which the true value cannot lie, then yes, there’s not much stats you can do on that, but in the real world statistics provide valuable estimates which allow people to save money or whatever it is they are trying to do.

“Uncertainty means the actual value lies somewhere within the range stated but there is no way to know where. There is no distribution of values other than a uniform one. Any value you pick between the low and the high uncertainty is just as likely as any other.”

All that is belied by the earlier example of laying 10 rulers, each with uniform uncertainty +/-1”. According to you, the total length has uniform uncertainty +/-10”. But this is not true: the uncertainty is now the convolution of 10 uniforms, which is bell shaped in an approximation to the normal distribution. In the case of 2 rulers, I actually wrote down the distribution, which is a triangle, not flat.

“There is no sigma in a statistical sense. I guess you could say 1-sigma covers the entire possible range.”

No, in the above example the entire possible range is +/-10” but the sigma is sqrt(10/3).
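The sqrt(10/3) figure for ten stacked rulers can be verified with a quick simulation, assuming (as in the example) independent uniform errors on [-1, +1] inch for each ruler:

```python
import math
import random
import statistics

random.seed(0)
N = 50000
# Total error from laying 10 rulers end to end, each carrying an
# independent uniform error on [-1, +1] inch:
totals = [sum(random.uniform(-1.0, 1.0) for _ in range(10))
          for _ in range(N)]

print(statistics.stdev(totals))      # close to sqrt(10/3)
print(math.sqrt(10.0 / 3.0))         # ~1.826
# The worst case of +/-10 inches is astronomically unlikely;
# even exceeding +/-5 inches is rare:
print(sum(abs(t) > 5.0 for t in totals) / N)
```

The histogram of `totals` is bell-shaped, not flat, which is the convolution argument made above: the total uncertainty is not the sum of the individual ranges.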

“Let’s try another analogy one more time. I give you 3 rods to measure with a ruler whose uncertainty is +/- 1 inch. You have 100 guys do measurements on each rod. You plot the points and lo and behold you get a normal distribution for each so you calculate the variance, the standard deviation, the mean, and the error of the mean. You find out the following: the mean of the first rod is 11.75 inches, the second one is 11.94 inches and the third 12.00. Each of the measurements have a stddev of +/- 0.025 inches and the error of the mean is +/- 0.00125 inches. Pretty good precision huh?”

Yes, with only one 12+/-1” ruler, used to measure a rod of true length x”, the uncertainty in the result, which might be 1.043x for example, will be +/-x/12” plus an uncertainty (small, as you aver) from errors made by each guy relative to the given ruler.

You have totally missed the point. Uncertainty in the measuring device simply can not be reduced by multiple measurements, even of the same thing by the same device. The example I gave should show that. Heck, you could even make 1000 measurements and have them all come up with the same value each and every time. Your variance, std dev, and error of the mean would all be zero. However, what is the uncertainty? Can you give us a calculation and a reference that reduces the uncertainty (not the error) of +/- 1 inch by making 1000 measurements of the same thing with the same device?

I never meant to imply that you can not use statistical means to evaluate the effects of uncertainty. However, you can not use the normal “error” calculation that assumes random +/- errors in some kind of distribution where the errors mostly cancel out and give you a more accurate mean or “true value”. You may get more precision this way, but you won’t get better accuracy.

That’s one of my bugaboos about averaging temperatures. When temps are recorded as something like 35 +/- 0.5 degrees, the +/- range IS NOT an error range; it is an uncertainty range. You can not average it with another reading and then calculate an error of the mean and say you reduced the error. You must take the average of the possible high temps and the average of the possible low temps to find the range of uncertainty in the average.

Try answering the questions I have asked with real numbers. That will mean more to you than discussing theoretical math. It will make you appreciate random errors/precision versus uncertainty/accuracy.

Reply to Jim Gorman Oct11 11:08am

Jim, I am back now, and have been thinking about your questions, and will try to write something tomorrow. Before then, though, I need to know what the output resolution of your thermometer is. Assuming that it is a digital output, is the gap between possible outputs, 1, 0.5, 0.2, 0.1 degrees? I am assuming 1 for now, but I think it might make a difference. If it was 0.1, for example, then its output might be rapidly fluctuating in the range 24.5 to 25.4. Old analogue speedometers had that sort of characteristic.

First I’m going to note that Pat Frank hasn’t responded to my Oct7 1:41pm, after 3 days. I suppose he is either on vacation, or is thinking hard, or realizes that I have valid points and questions which he is unable to answer satisfactorily.

Returning to Jim Oct9 11:54am, I have been patient with you, you not so much with me. “You have totally missed the point.” Please can you explain with reference to my recent reply in what way did I miss the point? It may be best to keep some reference to the “rulers” example we have been discussing.

Thanks in advance,

Rich.

Hi Rich,

Just taking a belated look to see how the debate has progressed. I commend all involved for being patient, and may I add civil, in their comments. As I’m not the sharpest knife in the drawer, math-wise, I was hoping you (or Jim) could take a stab at the following questions, both of which are relevant to a GCM’s solution (?) for temperature at a specific grid point, namely:

1) does T(t) have any dependence on T(t-1)?

2) does T have any dependence on cloud fraction?

Thank you!

Frank,

The answer to both of your questions is yes, and I think that this is uncontroversial between Pat and any of his critics. The main debate is really about the nature of those relationships, and it is this which determines how uncertainty in cloud fraction translates into uncertainty in temperature projection.

Any statistical model fitted to the observed or the modeled temperature series shows autocorrelation. Alternatively, an energy balance model fitted to a GCM result will show very directly a strong lag-1 autocorrelation in temperature. Here is such a result for a single-body formulation subjected to an arbitrary forcing series F(t):-

Tn = F(tn)/λ * (1 – exp(-Δt/τ)) + Tn-1 * exp(-Δt/τ)

Taken from my post here: https://wattsupwiththat.com/2019/09/19/emulation-4-w-m-long-wave-cloud-forcing-error-and-meaning/#comment-2807383
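The recursion above can be sketched in a few lines. The values of lam (λ, the feedback parameter) and tau (τ, the relaxation time) below are illustrative placeholders, not fitted to any GCM; the point is only that the coefficient exp(-Δt/τ) on the previous temperature is the lag-1 autocorrelation being described:

```python
import math

def emulate(forcing, lam=1.0, tau=4.0, dt=1.0, t0=0.0):
    """Single-body energy-balance recursion:
    T_n = F(t_n)/lam * (1 - exp(-dt/tau)) + T_{n-1} * exp(-dt/tau)
    lam and tau are hypothetical values, not fitted to any model."""
    decay = math.exp(-dt / tau)   # weight on T_{n-1}: the lag-1 term
    temps, t = [], t0
    for f in forcing:
        t = f / lam * (1.0 - decay) + t * decay
        temps.append(t)
    return temps

# Step forcing: T relaxes toward the equilibrium F/lam with
# e-folding time tau.
temps = emulate([3.0] * 40)
print(temps[0], temps[-1])   # starts low, approaches 3.0 (= F/lam)
```

Each output temperature depends explicitly on the previous one, which answers Frank’s first question in the affirmative for this class of emulator.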

Cloud fraction has multiple effects on both shortwave and longwave fluxes. Overall, it is generally agreed that the presence of clouds (vs no clouds) brings a net cooling effect. However, the flux responses are not simple single-valued functions of cloud fraction; they are also heavily dependent on the distribution of clouds, which makes the problem wickedly challenging. In the CMIP models, the estimated feedback response lies in the range [-0.5, +0.7] W/m2/K. In other words, in some models temperature-dependent cloud changes tend to add to net incoming flux, while in other models they reduce net incoming flux. The models cannot agree even on the sign of cloud feedback.

Hi kribaez,

Thank you for your detailed response. I recall that a monte-carlo simulation of a simple process like T(t) = T(t-1) + e(t), where e(t) is a Gaussian error term, has a set of realizations whose uncertainty scales with the square-root of time. Apparently the GAST output from any given GCM doesn’t exhibit this type of behavior due to tuning, but shouldn’t the uncertainty of clouds, as you indicate, at least be inherent in how we view the model’s results?

Thank you.

kribaez, thanks also from me. You have saved me the bother of a substantive reply, and moreover, done a better job with it 🙂

My position is from mathematical rigour, trying to simplify the uncertainty principles (no, not quantum mechanics!) to a logical flow, and then ask what happens if some of the simplifying assumptions are violated. I don’t know as much about the GCMs as you do. I’m away for a couple of days, but on return will try to formulate some “rules” around Jim Gorman’s “rule(r)s”, and a mathematical flow for Pat’s argument.

Rich –> I am trying to reduce this to simple numbers so everyone can understand.

1 – I tell you the temperature is 25 +/- 0.5. Is the +/- 0.5 an uncertainty or a measurement error?

2 – Do measurement errors, when measuring the same thing with the same device, follow a random distribution?

3 – Can you statistically deal with random errors obtained when measuring the same thing with the same device?

4 – If the interval given is an uncertainty, when measuring the same thing with the same device, do you get a random distribution around a mean that is attributable to the uncertainty, in other words do you have a random variable because of the uncertainty?

5 – Do you calculate an error interval before or after taking actual measurements?

6 – Do you calculate uncertainty before or after actually taking a measurement?

Unless we can agree on what these two fundamental things are, we’ll never agree as to how they affect measurements nor how to treat them mathematically.

Here is a good document to read.

http://www.ni.com/en-us/innovations/white-papers/06/understanding-instrument-specifications—-how-to-make-sense-out.html

Please note the following:

“Accuracy — a measure of the capability of the instrument to faithfully indicate the value of the measured signal. This term is not related to resolution; however, it can never be better than the resolution of the instrument.”

“Accuracy of an instrument is absolute and must include all the errors resulting from the calibration process.”

“Precision — a measure of the stability of the instrument and its capability of resulting in the same measurement over and over again for the same input signal.”

Frank,

“I recall that a monte-carlo simulation of a simple process like T(t) = T(t-1) + e(t), where e(t) is a Gaussian error term, has a set of realizations whose uncertainty scales with the square-root of time. Apparently the GAST output from any given GCM doesn’t exhibit this type of behavior due to tuning…”

You are correct that the variance in your expression increases with time. The standard deviation or error range scales with the square root of the number of timesteps.
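That scaling is easy to see in a small Monte Carlo of the random walk T(t) = T(t-1) + e(t), here with unit-variance Gaussian steps assumed purely for illustration:

```python
import random
import statistics

random.seed(2)

def walk_sd(n_steps, n_trials=20000):
    """Spread across realizations of T(t) = T(t-1) + e(t) after n_steps,
    where e(t) is a standard Gaussian step."""
    finals = [sum(random.gauss(0.0, 1.0) for _ in range(n_steps))
              for _ in range(n_trials)]
    return statistics.stdev(finals)

for n in (4, 16, 64):
    print(n, walk_sd(n))   # spread grows like sqrt(n): near 2, 4, 8
```

Quadrupling the number of steps doubles the spread across realizations, while any single realization, like any single GCM run, carries no visible trace of that growing envelope.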

The GAST output from a GCM does not exhibit this type of behaviour, but that is not a result of tuning. The tuning primarily defines the median response of the GCM, and the GAST output from any single run contains no information on uncertainty growth, a point that Pat has made (correctly) a number of times.

Each GCM reports on a number of prescribed “experiments”, for example, 20th century historic, 1% per annum growth in CO2, instantaneous quadrupling of CO2, etc.

Each of these experiments involves multiple runs with minor variations in initial conditions, selected by varying the kick-off times from the minimum 500 year spin-up – the PIControl run for the GCM. These runs are then often averaged for reporting purposes. The GAST output from these multiple runs does show some evidence of uncertainty propagation but, with typically only 5 to 10 runs “per experiment”, the sampling is wholly insufficient to define the analytic form of the propagation. Moreover, there is no prescribed requirement to test for systematic bias in any of the inputs, and so this type of input error remains invisible to these mini-sampling tests.

They would not be invisible to full MC analysis on the GCMs. However, since it is impractical to run full MC experiments on the AOGCMs, there really is little choice other than to make use of high-level emulators to test uncertainty propagation arising from uncertainty in data and parameter inputs.

Such tests support the existence of large uncertainty arising from cloud parameterisation, but do not support the shape of uncertainty propagation suggested by Pat. An error in a flux component (like, for example, LWCF) translates into a negligible effect on the net flux after the 500-year spin-up and a bounded error on the absolute temperature. It leaves the model with an incorrect internal climate state, no question.

During subsequent forced temperature projections, the incorrect climate state translates into an error in the flux feedback from clouds. Multiple sampling of the initial error in cloud definition allows the uncertainty in temperature projection to then be mapped. The error propagation is then via a term R'(t) x ΔT(t), where R'(t) is the rate of change of flux due to cloud changes with respect to TEMPERATURE. Although temperature may be changing with time, this uncertainty propagation mechanism is not at all the same as your uncertainty mechanism above, or Pat’s, which are both propagated with TIME independent of temperature change.

“…but shouldn’t the uncertainty of clouds, as you indicate, at least be inherent in how we view the model’s results?”

Yes, it should. I am already on record as saying that I agree with Pat that the GCM results are unreliable, and unfit for informing decision-making, and I also agree with him that cloud uncertainty alone presents a sufficiently large uncertainty in temperature projections to discard estimates of climate sensitivities derived from the GCMs. Unfortunately, I profoundly disagree with the methodology he is proposing to arrive at that conclusion.

They aren’t ‘experiments’; that word implies an empirical result. They are more properly described as ‘scenarios’.

kribaez,

Thank you for your thoughtful reply. First of all, I’m pleased that we can agree on the unsuitability of relying on the GCMs to dictate policy. Second, I agree with you that tuning affects the level, but not the variance, of the GAST output. However, I still believe that tuning covers up for a lot of missing and/or mis-specified physical processes, all of which have corresponding variances that are missing from the GCMs. So if I understand many of the objections to Pat’s premise correctly, they argue along the lines that if the GCMs are somehow constrained from boiling off / freezing over the oceans, then Pat’s wide error bands based on cloud fraction error cannot be correct.

Much of this is too nuanced for me, so I’d like to return to where we agreed about the uncertainty surrounding a random walk. If instead of T(t) we were looking for point forecasts, P(t), of a large-cap stock, I think we could agree that most economists would say that the point forecast of the stock’s price tomorrow or next week (month, quarter, year, etc.) would be today’s closing price, and that the uncertainty around each forecast would be a suitably scaled estimate of the stock’s “bulk” variance as might be calculated by various methods.
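The naive random-walk forecast described above might be sketched as follows. The function name and the numbers are mine, purely for illustration:

```python
import math

def naive_forecast(last_close, sigma_daily, horizon_days):
    """Random-walk point forecast (hypothetical helper): the best point
    estimate is today's close, and the uncertainty scales with the square
    root of the horizon, assuming i.i.d. daily increments."""
    return last_close, sigma_daily * math.sqrt(horizon_days)

# e.g. a $100 stock with $2/day volatility, forecast 4 trading days out
price, spread = naive_forecast(100.0, 2.0, 4)
```

The point estimate never moves; only the spread widens with the horizon, which is exactly the "today's closing price plus scaled bulk variance" recipe.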

Alternatively, there are point forecasts of the same stock price provided by models maintained by any number of financial firms as a service to their customers. Such models vary in complexity, but nominally themselves require additional point forecasts for earnings, pay-out ratios, interest rates, risk premia, economic conditions, regulatory impacts, competitor analysis, etc. An honest calculation of the uncertainty around these forecasts would necessarily require estimates of the joint variances of all the various model inputs, assuming of course that the models are correctly specified in the first place.

Based on the above, I would conclude that for all their sophistication, the uncertainty around the estimates of the financial models has to be at least as great as that around the naive model, which is why I agree with Pat’s conclusion.

Frank,

If a stock-prediction model gets the right binary answer for all of the wrong reasons, that might still be considered a success – everyone makes money.

The generally accepted rule for science, however, is as stated by Wegman when commenting on Michael Mann’s work:-

“right answer, wrong method equals bad science”

kribaez,

What I was trying to convey (probably badly) was that an honest purveyor of the stock prediction model would agree that the uncertainty of its forecasts was on par with that of a random walk. I think I need to dig around your prior post to better understand where this uncertainty [“(s)uch tests support the existence of large uncertainty arising from cloud parameterisation, but do not support the shape of uncertainty propagation suggested by Pat”] is actually expressed.

Fortunately we do both agree with Wegman, but I would argue that there is an abundance of evidence that the answer was wrong as well.

Thank you.

Tim Gorman October 6, 2019 at 4:57 am :

“……… Root-sum-square is *NOT* the same as root-mean-square.”

Pat Frank October 6, 2019 at 10:35 am :

“Tim, my admiration. You’ve been a real stalwart for the integrity of scientific practice. You’ve gotten the analysis exactly correct. ………………”

Did he really read that sentence of Tim’s? In his study, he consistently mentions sum squares where mean squares (i.e. variance) would be in order. Not relevant to the further conclusions, but an embarrassing lack of acquaintance with basic statistical terms, also shedding a dim light on the review process.

ulises,

Uncertainty intervals are not probability functions. Therefore they have no variance. Since uncertainty intervals are not probability functions they have no mean from which a variance can be calculated.

It’s why uncertainty adds as root-sum-square instead of root-mean-square. In essence you are making the same mistake that so many others are – uncertainty and error are not the same thing. I at least know enough basic statistics to understand the very simple difference between an uncertainty interval and an error probability distribution.

But as I noted many days ago, it is very hard (as in I don’t know how) to justify adding uncertainty as root-sum-square without using standard probability theory regarding random variables and their distributions.

Think of the uncertainty interval as an indicator of the maximum possible plus and minus associated with the output of a model. The uncertainty interval tells you nothing about what is going on inside the interval but it describes where the object under discussion might be found. I.e. I can tell you my wife’s car is in Topeka, Kansas right now but I can’t tell you exactly where! But it is inside a specific geographical interval that I can describe.

If each of the uncertainty intervals are independent (they can still be equal just not dependent on each other) then you can think of them as orthogonal vectors, u1 and u2. When you add them you are looking for the length of the vector from the base of one to the tip of the other, call it u_t. That is saying that u_t is the hypotenuse of the right triangle whose other sides are u1 and u2. So u_t is related to the other vectors by the equation for a right triangle of u_t^2 = u1^2 + u2^2. So u_t = sqrt( u1^2 + u2^2). As you do other iterations you just keep forming right triangles and adding squares, e.g. (u1^2 + u2^2) + u3^2 for the next iteration. Then (u1^2 + u2^2 + u3^2) + u4^2. And so on.
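Tim's iterated right-triangle construction can be written out as a short sketch. This is my own illustration of the arithmetic, not code from any paper:

```python
import math

def rss(uncertainties):
    """Combine independent uncertainties in quadrature (root-sum-square):
    the length of the sum of mutually orthogonal vectors, built up exactly
    as in the iterated right-triangle picture above."""
    return math.sqrt(sum(u * u for u in uncertainties))

# Four equal, independent per-iteration uncertainties of 2 units each:
total = rss([2.0, 2.0, 2.0, 2.0])   # grows like sqrt(n), not like n
```

Note that four steps of 2 combine to 4, not 8: quadrature growth is sqrt(n) times the per-step value, which is the slower growth Tim is describing.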

Again, uncertainty intervals are not random variables. The uncertainty doesn’t change from one iteration to the next in a random fashion. Uncertainty doesn’t have a probability function describing it. Think about it, the maximum variance (plus and minus) of a random variable is not itself a random variable. It is a descriptive value for what you might see from a probability function associated with a random variable. But it is a *value* and not a variable. The only way for the variance to vary would be for the probability function to vary.

Tim, I don’t accept the premise that uncertainty is an interval – I consider it to be a distribution, which might be uniform over an interval but in most cases isn’t.

But anyway I’ll consider your Topeka example. Does Topeka have a fence or wall around it? If not, then Topeka is not even well defined, and if I wanted to find your wife’s car I would start making subjective probability judgments, like a good Bayesian statistician. That car isn’t randomly located in the area of Topeka (even if that well defined) because it’s unlikely to be inside a building other than a parking garage. It’s not next to a fire hydrant. It could well be near your wife’s favourite restaurant, etc. etc. So there is much more information to be had than pure uniform randomness. Information is often commercially or personally useful, and reflects an intelligent approach to life.

Now for the sum of squares. I must say, that’s a nice try, to use orthogonal directions. But you’re just making that up; what does it mean for two “uncertainty intervals” to be independent, and why does that then qualify them to be represented by orthogonal vectors? Whereas, if they are statistical distributions, then it all follows quite naturally.

You are right to say that uncertainty doesn’t change from one iteration to the next, and a probability distribution has exactly that property. As you say, the mean and variance do not change, only the observation of the random variable associated with that distribution. Uncertainty is not a value (as you assert), it is not a variable, it is a distribution. That’s the only way that it can be treated in a proper mathematical way.

Rich,

“Tim, I don’t accept the premise that uncertainty is an interval – I consider it to be a distribution, which might be uniform over an interval but in most cases isn’t.”

It is a distribution of what? Think about it a little deeper. Is variance of a probability function a distribution of its own or just a value? Does variance have a probability function all of its own? Or is it just a value that describes a probability function?

Is the uncertainty interval any different? Is a constant also a uniform distribution?

“Does Topeka have a fence or wall around it?”

It absolutely does have a boundary. Topeka actually has several signs stating “city maintenance ends here”. The boundary is defined distinctly by the city itself. Try to get them to maintain a street outside the city boundary!

“So there is much more information to be had than pure uniform randomness.”

None of which actually locates the car. The uncertainty interval still remains the city boundary. All you are doing is describing where in the interval it might be. The interval remains constant.

“Now for the sum of squares. I must say, that’s a nice try, to use orthogonal directions. But you’re just making that up; what does it mean for two “uncertainty intervals” to be independent, and why does that then qualify them to be represented by orthogonal vectors? Whereas, if they are statistical distributions, then it all follows quite naturally.”

Now you are just flailing away. I’ve given you the math just like you say you always want and all you can do is use the argumentative fallacy of Argument by Dismissal. Independence is *always* orthogonal. If they weren’t orthogonal then one vector would have a projection onto the other, meaning there is a dependence of one on the other. The squared vector sum would be u_1^2 + u_2^2 + (2)(u_1)(u_2)cos(theta). Cos(theta) is a measure of the dependence of one on the other. As in Pat’s writings, however, we know that u(i+1) is not dependent on u(i) in his analysis, meaning they are independent.
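The law-of-cosines form quoted above can be sketched the same way (again, my own illustration; setting cos(theta) = 0 recovers the independent, root-sum-square case):

```python
import math

def combine_two(u1, u2, cos_theta=0.0):
    """Length of the sum of two vectors with angle theta between them.
    cos_theta = 0 (independent/orthogonal) gives root-sum-square;
    cos_theta = 1 (fully dependent/parallel) gives plain addition."""
    return math.sqrt(u1 * u1 + u2 * u2 + 2.0 * u1 * u2 * cos_theta)
```

For example, components of 3 and 4 combine to 5 when independent, but to the full 7 when perfectly dependent.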

“a probability distribution has exactly that property.”

An uncertainty interval is no more a probability function than a constant is. Using your logic a constant is a uniform distribution probability function! You would never be able to factor a constant out of a distribution function!

“As you say, the mean and variance do not change, only the observation of the random variable associated with that distribution. ”

And where in the definition of an uncertainty interval does a random variable with a distribution function appear exactly? If the mean and variance do not change, i.e. they are constants, then why must the uncertainty interval change? That would mean the uncertainty interval must be a dependent variable based on an independent variable that changes over an interval. What independent variable and interval is associated with the uncertainty interval?

“it is not a variable, it is a distribution.”

ROFL!!! Listen to yourself! A distribution *is* based on a variable by definition. If you have no variable then what is the distribution describing?

“That’s the only way that it can be treated in a proper mathematical way.”

I gave you the proper way to treat the uncertainty interval. And you just dismissed it using the excuse that vectors can’t be used — with no reason other than you say so. Variance and uncertainty intervals are *exactly* the same concept – a value describing a probability function. And you simply can’t admit that to yourself for some reason.

Reply to Tim Gorman Oct16 6:04am

Well I’m glad Topeka has its limits. My point about non-uniformity is that it is useful, possessing information which, if thrown away, can cost time and money, so successful people tend not to do that. Now, it may be that in some cases the non-uniformity can be adequately described by a normal distribution, in which case its standard deviation is a valid one-dimensional descriptor for it. If not normal, there may be other applicable probability distributions with one parameter other than the mean. But it is the probability distribution, not its dispersion parameter(s), which gets used in combining information.

Moving on to the question of orthogonal summation of uncertainties, I’ll do some more flailing and dismissing, but try to be more careful about it. So I’ll take Topeka as an example, though for the purposes of demonstration I shall reconstruct it a little. Topeka is now a perfect square of 2 miles per side, and all its facilities are underground, so that from its centre you can see your wife’s bicycle wherever within Topeka that may be. But Topeka has only one road, East-West.

Just for fun your wife decided to take her bicycle to a random point in Topeka, but to save time she used the car as well, with the bike in the back. She drove to the centre, flipped a coin to decide whether to go East or West, and then drove a random distance between 0 and 1 mile. She got out, removed the bicycle, and randomly cycled due North or South between 0 and 1 mile. Taking the radius of 1 mile as the uncertainty value, the car’s uncertainty from the centre is 1 mile, the bike’s uncertainty from the car is 1 mile, and the bike’s uncertainty from the centre is sqrt(2) = 1.414 miles. (The bike’s distance from the centre is not uniform in (0,1.414) miles, but let’s not dwell on that.) If there were buildings in the way, and you had to use the so-called city block metric (or 1-norm), then your knowledge of the uncertainty of the crow’s flight distance would be useless, and you’d have to travel up to 2 miles to reach the bike. But since we moved the buildings to underground, you get to travel a maximum of 1.414 miles to retrieve the bike (and your wife if she hung around, except she might have shinned up a flagpole to use a 3rd dimension wherein you could assess the uncertainty in the length of zip wire needed to get her back to the centre).
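The reconstructed-Topeka experiment is easy to simulate. This is a hedged sketch with a sample size and random seed of my own choosing:

```python
import math
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Car: coin flip East/West, then a uniform 0-1 mile drive -> uniform on (-1, 1).
x = rng.uniform(-1.0, 1.0, n)
# Bike: coin flip North/South from the car, then a uniform 0-1 mile ride.
y = rng.uniform(-1.0, 1.0, n)

crow = np.sqrt(x**2 + y**2)    # straight-line distance from the centre
block = np.abs(x) + np.abs(y)  # city-block (1-norm) distance

print(crow.max(), block.max())  # approach sqrt(2) ~ 1.414 and 2 respectively
```

With enough draws the crow's-flight maximum crowds up against sqrt(2) miles while the city-block maximum crowds up against 2 miles, matching the two metrics discussed above.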

So in this example, I do see how you could legitimately use orthogonality to add uncertainties. But back in the real world, we like to add uncertainties of things which are commensurate, i.e. they all live in the same dimension, such as global temperature, or radiative flux, as considered by Pat’s paper. In this one-dimensional case, random variables can be added and their variances added and the standard deviation derived by square rooting that, without having to resort to multidimensional orthogonal vectors.

Regarding your “if there is a distribution then what is the associated variable” question, the variable is the error between the observation and either reality or some other definition of mean value.

Though I have visited many fine States in America, I am sorry to say that I have not yet managed to get to Kansas.

Further reply to Jim Gorman Oct11 11:08am

1 – I tell you the temperature is 25 +/- 0.5. Is the +/- 0.5 an uncertainty or a measurement error?

Neither. It is an attempt at a description of the distribution of uncertainty surrounding 25. That distribution might be uniform over (24.5,25.5), but there are problems with that which I shall explain, among other things, in my next submission.

2 – Do measurement errors, when measuring the same thing with the same device, follow a random distribution?

Yes. But, as a technicality, that distribution might have zero variance, so you would probably not choose to call it random. For example, the true value is 25.132 and the device, which has output resolution 0.2, always says 25.4, so zero variance. But alternatively the device might, because of environmental conditions including power spikes and noise, report 24.6, 24.8, 25.0, 25.2, 25.4 with some non-zero probability for each of those.

3 – Can you statistically deal with random errors obtained when measuring the same thing with the same device?

Yes, provided you choose a probability distribution for them. Sensitivity analysis might be needed to determine how much effect your distribution has on any final results.

4 – If the interval given is an uncertainty, when measuring the same thing with the same device, do you get a random distribution around a mean that is attributable to the uncertainty, in other words do you have a random variable because of the uncertainty?

Yes. Uncertainty is merely a name for the probability distribution surrounding error, which is defined as the observed (or inferred) value of something minus its true value. My next submission compares two cases.

5 – Do you calculate an error interval before or after taking actual measurements?

Before; you calculate an error distribution. It might be a point value, a uniform interval, or a more general distribution.

6 – Do you calculate uncertainty before or after actually taking a measurement?

The uncertainty distribution is the error distribution shifted by the measurement, so strictly it comes after the measurement, but its salient statistics are available beforehand. Note that “uncertainty” is a sort of antonym for “probability”.

Here is the follow-up. I argue that a uniform uncertainty distribution (u.d.) is impossible except in the presence of an infinitely precise instrument, and I provide mathematical definitions and means of calculation which may be useful.

We shall assume digital output readings so, by scaling, we can assume that the possible outputs are a complete range of integers, e.g. 0 to 1000. Now use Bayesian statistics to describe the problem.

Let X be a random variable for the true value of a quantity which we attempt to measure.

Let x be the value of X actually occurring at some particular time.

Let M be our measurement, a random variable but including the possibility of zero variance. Note that M is an integer.

Let D be the error, = M – x.

Let f(x) be a chosen (Bayesian) prior probability density function (p.d.f.) for X.

Let g(y;x) be a probability function (p.f.) for M over a range of integer y values, dependent on x (PRECISION DISTRIBUTION).

Let c be a constant of proportionality determined by making the relevant probabilities add up to 1. Then after measurement M = y, the posterior probability for X taking the value x is

P[X=x | M=y] = c P[M=y | X=x] P[X=x] = c g(y;x) f(x)

Usually we will take f() to be an “uninformative” prior, i.e. uniform over a large range bound to contain x, so it has essentially no influence. In this case,

P[X=x | M=y] = c g(y;x), where c = 1/int g(y;x)dx (UNCERTAINTY DISTRIBUTION).

Then P[D=z | M=y] = P[X=M-z | M=y] = c g(y;y-z). Now assume that g() is translation invariant and symmetric, so g(y;y-z) = g(0;-z) = h(-z) = h(z) defines function h(), and c = 1/int h(z)dz. Then

P[D=z | M=y] = c h(z), independent of y (ERROR DISTRIBUTION = shifted u.d.).

The assertion of a +/-0.5 uncertainty interval would correspond to h(z) = 1 for -0.5<z<0.5 and 0 elsewhere (let’s call that function h_U). However, this is unrealistic when you think about how uncertainty creeps into a measurement. If you have an accurate ruler marked with 1/10” intervals and you are trying to record X to the nearest 0.1”, then if X is actually near one of the marks then you will be confident in your result. But if X is close to the middle of an interval, you will be far less confident, especially when you think about whether you took sufficient care to align your material to the zero end of the ruler.

Taking an example in our integer case, it means that if M = y = 314, then if x is truly 314.159 you might be confident that 313.5 < x < 314.5. But if x is truly 314.459, then there may be a chance that instead of recording 314, you record 315. In this case there is actual uncertainty about what you will record. But if between x = 314.4999 and x = 314.5001 you were bound to flip from recording 314 to 315, then you had almost perfect precision which you wantonly chose to throw away. I would call that “damaged precision”.

So here is an example of an h(), with a trapezium shape (call it h_T), which allows for a perfect result near integers and an interpolating mixed result away from them.

h(z) =

{ 2(z+3/4) for -3/4<z<-1/4

{ 1 for -1/4<z<1/4

{ 2(3/4-z) for 1/4<z<3/4

For any function h() the mean m and variance s^2 of the error D = M-x, given X=x, may be calculated as follows. Let a be the fractional part of -x put into the range (-1/2,1/2). Then

m = c sum_y (y-x)g(y;x) = c sum_k (a+k)g(0;-(a+k)) = c sum_k (a+k)h(a+k)

s^2 = c sum_k (a+k-m)^2 h(a+k), with c = 1/sum_k h(a+k)

For h() = h_T(), we find

if a<1/4, m = a and s^2 = 0,

if 1/4<a<3/4, m = 1/2-a, s^2 = 4(a-1/4)(3/4-a)

with mirror results for a<0. Averaged over all a, m is 0 and s^2 is 1/12 (compared with 0 and 0 for h_U).
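For what it's worth, the closed forms above can be spot-checked numerically. The helper names here are my own, and the sum is taken over the handful of integer readings the instrument could plausibly report:

```python
import numpy as np

def h_T(z):
    """Trapezium kernel: 1 on |z| < 1/4, tapering linearly to 0 at |z| = 3/4."""
    z = abs(z)
    if z < 0.25:
        return 1.0
    if z < 0.75:
        return 2.0 * (0.75 - z)
    return 0.0

def error_moments(a):
    """Mean and variance of the error D = M - x for fractional offset a,
    weighting each candidate integer reading by the kernel h_T."""
    d = np.array([a + k for k in range(-2, 3)])   # candidate error values
    w = np.array([h_T(v) for v in d])
    w /= w.sum()                                  # the normalising constant c
    m = float(w @ d)
    s2 = float(w @ (d - m) ** 2)
    return m, s2

# Spot checks against the closed forms:
# a = 0.1 (a < 1/4):        m = 0.1,          s^2 = 0
# a = 0.4 (1/4 < a < 3/4):  m = 1/2 - 0.4,    s^2 = 4(0.4 - 1/4)(3/4 - 0.4)
```

Running `error_moments(0.4)` reproduces m = 0.1 and s^2 = 0.21, in agreement with the piecewise formulas.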

Advertising disclaimer: other h()’s are available and your mileage may vary depending on which you use.

Overnight I realized that astute readers might have noticed an omission in my analysis above, but none has mentioned it yet. It is that in my discussion about the nature of error I used an accurate ruler as the base example. Therefore the above covers precision, which is important, but not accuracy. Here are some words that Tim related above about those:

“Accuracy — a measure of the capability of the instrument to faithfully indicate the value of the measured signal.”

“Precision — a measure of the stability of the instrument and its capability of resulting in the same measurement over and over again for the same input signal.”

I can certainly look at improving my analysis above to take accuracy into account, but for the purposes of analyzing Pat Frank’s paper I do not think it is important. This is because, as far as I can tell, the uncertainty spread in his Figure 6 is all about model consistency and predictability, and not at all about how well the mean lines correspond to reality.

So there is a wide uncertainty distribution around when I might get round to adding in the accuracy component…

I have now thought further on the “accuracy” angle. Unusually, I’ll explain in words rather than mathematics, which may be a blessing to some.

We have an observable, which we wish to measure as best we can. There exists a population of instruments capable of such measurement, and we obtain N of them. The manufacturer asserts that any single instrument will have a bias which is a number randomly and independently drawn from a distribution of known mean (which might or might not be zero) and a known standard deviation. On top of that there is observational error when we perform the measurement. This might be due to a human reading from an analogue instrument as in the case of the rulers, or in the case of a digital instrument be due to its output resolution and any fluctuations in the readings taken.

The bias may be considered to be a systematic error, and relates to accuracy. The observational error relates to precision. The total error is the sum of the two (I believe kribaez mentioned 5 different types of error, but I am concentrating on these two; perhaps a third one is errors in the manufacturer’s assertions). As such, the distribution of the total error is the convolution of the systematic and observational errors, and may be calculated, and its standard deviation might be called by some an “uncertainty interval”. The standard deviation interval does not, by its definition, encompass the total range of error, and the total range of error is very unlikely to have uniform probability.

If N > 1 then the sum of the results has standard deviation sqrt(N) times that of any single result, and the mean of the results has s.d. 1/sqrt(N) times that of a single result.
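The convolution of the two error sources and the sqrt(N) scaling can be checked with a quick simulation (toy standard deviations of my own choosing, not any manufacturer's figures):

```python
import numpy as np

rng = np.random.default_rng(42)

# Assumed toy numbers: systematic bias per instrument, observational noise
# per reading, neither taken from any real spec sheet.
bias_sd, obs_sd = 0.3, 0.5
N, trials = 25, 20_000          # N instruments, repeated over many trials

# Total error per result = instrument bias + observational noise
# (i.e. the convolution of the two error distributions).
errors = (rng.normal(0.0, bias_sd, (trials, N))
          + rng.normal(0.0, obs_sd, (trials, N)))

single_sd = np.hypot(bias_sd, obs_sd)   # sd of one result
sum_sd = errors.sum(axis=1).std()       # empirically ~ sqrt(N) * single_sd
mean_sd = errors.mean(axis=1).std()     # empirically ~ single_sd / sqrt(N)
```

With N = 25 the simulated sum and mean scatter land within sampling noise of 5 times and one fifth of the single-result standard deviation, as claimed.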

I have derived this from first principles and helpful comments and questions along the way. Perhaps a true expert in the subject will tell me it is junk. I should like to thank Tim/Jim Gorman for some very challenging questions along the way.

Tim,

thanks for replying. You write :

“Uncertainty intervals are not probability functions.”

Of course they are not; by the respective definitions, intervals can’t be functions. But such intervals can be, and typically are, derived from probability distributions.

Alternatively, you can split an empirical distribution in quantiles of your choice.

Can you explain the term “probability function”? Give an example?

“Therefore they have no variance.”

Not true. When the distribution parameters are estimates from a sample, which is the common case, they bear their own uncertainty, so there is a variance of the variance. This, in principle, would carry over to the interval limits. But I’m afraid this is not what you meant.

“Since uncertainty intervals are not probability functions”…..see above

“they have no mean from which a variance can be calculated. ”

Leaves me puzzled. Error intervals are centered on a mean, spanning between upper and lower limits. “Uncertainty intervals”, according to you, have no mean and no variance, so what are they at all in numerical terms? Could you give an example?

“…. uncertainty adds as root-sum-square…. ”

Reads to me: the larger the sum, the more uncertainty? There is one undeniable certainty about it: the term grows with sample size. May I then correctly conclude that reducing the sample size results in less uncertainty?

And: the full term for “sum squares” reads as the “sum of the squared deviations from the mean”, so there *has to be* a mean, otherwise no “sum squares”. I have never seen the root of it applied to anything, unless it is first divided by root(degrees of freedom, df) to give a standard deviation. The (total) sum of squares, otoh, is the basic quantity in the GLM approach, where it is split into components of sources of deviation, but the partial sums of squares are finally divided by their dfs to give the errors of the model parameters. But you don’t want this error stuff.

“– uncertainty and error are not the same thing.”

Right, semantically not. Instead one might say that the presence of error, assumed or observed, gives rise to uncertainty.

But how is it in numerical terms? Let’s look into the

“Guide to the expression of uncertainty in measurement” (also mentioned by Pat) :

Introduction, §0.6:

[Begin citation]

Recommendation INC-1 (1980) Expression of experimental uncertainties

1) The uncertainty in the result of a measurement generally consists of several components which may be grouped into two categories according to the way in which their numerical value is estimated:

A. those which are evaluated by statistical methods,

B. those which are evaluated by other means.

There is not always a simple correspondence between the classification into categories A or B and the previously used classification into “random” and “systematic” uncertainties. The term “systematic uncertainty” can be misleading and should be avoided.

Any detailed report of the uncertainty should consist of a complete list of the components, specifying for each the method used to obtain its numerical value.

2) The components in category A are characterized by the estimated variances s(i)^2, or the estimated “standard deviations” s(i), and the number of degrees of freedom v(i). Where appropriate, the covariances should be given.

3) The components in category B should be characterized by quantities u(j)^2, which may be considered as approximations to the corresponding variances, the existence of which is assumed. The quantities u(j)^2 may be treated like variances and the quantities u(j) like standard deviations. Where appropriate, the covariances should be treated in a similar way.

4) The combined uncertainty should be characterized by the numerical value obtained by applying the usual method for the combination of variances. The combined uncertainty and its components should be expressed in the form of “standard deviations”.

5) If, for particular applications, it is necessary to multiply the combined uncertainty by a factor to obtain an overall uncertainty, the multiplying factor used must always be stated

[End citation – terms with subscripts and powers edited for readability – U.]

So, in this framework, there is no contrast between “error” and “uncertainty”; rather, the conventional error analysis and the usage of the respective terms is proposed as “category A” uncertainty analysis. (Category B is not relevant here because it does not deal with measurement-derived errors, but with a kind of educated guess as an approximation thereof. Nevertheless, Pat uses the “u” notation, though not consistently.)

‘So, in this framework, there is no contrast between “error” and “uncertainty”’.

Exactly.

Except I am happy to allow the use of the term “uncertainty” to describe the distribution of errors. At least, I am on a Tuesday.

Tim writes :

“Uncertainty intervals are not probability functions.”

Of course they are not; by the respective definitions, intervals can’t be functions. But such intervals can be, and typically are, derived from probability distributions.

Alternatively, you can split an empirical distribution in quantiles of your choice.

Can you explain the term “probability function”? Give an example?

After sleeping a night over it, I regretfully withdraw this comment of mine. Of course, an interval can be defined in terms of the desired area under the probability density distribution curve and a multiplier for the respective sd units, such that for +/- 1 sd and assumed Normal Distribution you get the well-known 68% of potential measurements.

Still, I’m eager to learn how such an interval is defined in the uncertainty domain.

U.

Sorry again, Tim; having been busy with my post, it escaped me that you had already given your explanation of uncertainty intervals:

Tim Gorman

October 14, 2019 at 3:10 pm

“Think of the uncertainty interval as an indicator of the maximum possible plus and minus associated with the output of a model. The uncertainty interval tells you nothing about what is going on inside the interval but it describes where the object under discussion might be found.”

Thanks, I can see more clearly now what you mean. You are dealing with ranges, which are imposed arbitrarily, hopefully according to best knowledge and possibly, but not necessarily, based on observations. But this is a Type B approach in the sense of the Guide I cited upthread, thus without relevance to the root of our dispute, Pat Frank’s study. The only uncertainty term he deals with is this 4W sd for cloud forcing, a conventional error term, i.e. a Type A approach. Nowhere in his text do I see a range, in your definition, being defined.

Assuming he had done so, how could his claim be corroborated that uncertainty intervals widen out in the GCM processing, when the range is “the maximum possible plus and minus associated with the output of a model”? The range is fixed; no widening into the impossible is allowed.