Emulation, ±4 W/m² Long Wave Cloud Forcing Error, and Meaning

Guest post by Pat Frank

My September 7 post describing the recent paper published in Frontiers in Earth Science on GCM physical error analysis attracted a lot of attention, consisting of both support and criticism.

Among other things, the paper showed that the air temperature projections of advanced GCMs are just linear extrapolations of fractional greenhouse gas (GHG) forcing.

Emulation

The paper presented a GCM emulation equation expressing this linear relationship, along with extensive demonstrations of its unvarying success.

In the paper, GCMs are treated as a black box. GHG forcing goes in, air temperature projections come out. These observables are the points at issue. What happens inside the black box is irrelevant.

In the emulation equation of the paper, GHG forcing goes in and successfully emulated GCM air temperature projections come out. Just as they do in GCMs. In every case, GCM and emulation, air temperature is a linear extrapolation of GHG forcing.
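
As a rough illustration of what "linear extrapolation of fractional forcing" means, here is a minimal sketch. It is not the emulation equation from the paper: the function name, `base_forcing`, and `scale` are hypothetical placeholders chosen only to show the linear behavior.

```python
def emulate_temperature(annual_forcing_increments, base_forcing, scale):
    """Linear emulator: output tracks the fractional cumulative forcing.

    All three arguments are hypothetical placeholders, not values from the paper.
    """
    temperatures = []
    cumulative = 0.0
    for increment in annual_forcing_increments:
        cumulative += increment
        fractional_forcing = (base_forcing + cumulative) / base_forcing
        temperatures.append(scale * fractional_forcing)
    return temperatures


# Equal forcing increments give equal temperature steps: the signature of linearity.
projection = emulate_temperature([0.035] * 5, base_forcing=30.0, scale=14.0)
steps = [later - earlier for earlier, later in zip(projection, projection[1:])]
print(steps)
```

Whatever happens between input and output, an emulator of this kind succeeds only if the printed steps stay proportional to the forcing increments.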

Nick Stokes’ recent post proposed that, “Given a solution f(t) of a GCM, you can actually emulate it perfectly with a huge variety of DEs [differential equations].” This, he supposed, is a criticism of the linear emulation equation in the paper.

However, in every single one of those DEs, GHG forcing would have to go in, and a linear extrapolation of fractional GHG forcing would have to come out. If the DE did not behave linearly the air temperature emulation would be unsuccessful.

It would not matter what differential loop-de-loops occurred in Nick’s DEs between the inputs and the outputs. The DE outputs must necessarily be a linear extrapolation of the inputs. Were they not, the emulations would fail.

That necessary linearity means that Nick Stokes’ entire huge variety of DEs would merely be a set of unnecessarily complex examples validating the linear emulation equation in my paper.

Nick’s DEs would just be linear emulators with extraneous differential gargoyles; inessential decorations stuck on for artistic, or in his case polemical, reasons.

Nick Stokes’ DEs are just more complicated ways of demonstrating the same insight as is in the paper: that GCM air temperature projections are merely linear extrapolations of fractional GHG forcing.

His DEs add nothing to our understanding. Nor would they disprove the power of the original linear emulation equation.

The emulator equation takes the same physical variables as GCMs, engages them in the same physically relevant way, and produces the same expectation values. Its behavior duplicates all the important observable qualities of any given GCM.

The emulation equation displays the same sensitivity to forcing inputs as the GCMs. It therefore displays the same sensitivity to the physical uncertainty associated with those very same forcings.

Because the emulator and the GCM have identical sensitivity to inputs, the emulator will necessarily reveal the reliability of GCM outputs when it is used to propagate input uncertainty.

In short, the successful emulator can be used to predict how the GCM behaves; something directly indicated by the identity of sensitivity to inputs. They are both, emulator and GCM, linear extrapolation machines.

Again, the emulation equation outputs display the same sensitivity to forcing inputs as the GCMs. It therefore has the same sensitivity as the GCMs to the uncertainty associated with those very same forcings.

Propagation of Non-normal Systematic Error

I posted a long extract from relevant literature on the meaning and method of error propagation, here. Most of the papers are from engineering journals.

This is not unexpected given the extremely critical attention engineers must pay to accuracy. Their work products have to perform effectively under the constraints of safety and economic survival.

However, special notice is given to the paper of Vasquez and Whiting, who examine error analysis for complex non-linear models.

An extended quote is worthwhile:

“… systematic errors are associated with calibration bias in [methods] and equipment… Experimentalists have paid significant attention to the effect of random errors on uncertainty propagation in chemical and physical property estimation. However, even though the concept of systematic error is clear, there is a surprising paucity of methodologies to deal with the propagation analysis of systematic errors. The effect of the latter can be more significant than usually expected.

“Usually, it is assumed that the scientist has reduced the systematic error to a minimum, but there are always irreducible residual systematic errors. On the other hand, there is a psychological perception that reporting estimates of systematic errors decreases the quality and credibility of the experimental measurements, which explains why bias error estimates are hardly ever found in literature data sources.”

“Of particular interest are the effects of possible calibration errors in experimental measurements. The results are analyzed through the use of cumulative probability distributions (cdf) for the output variables of the model.

“As noted by Vasquez and Whiting (1998) in the analysis of thermodynamic data, the systematic errors detected are not constant and tend to be a function of the magnitude of the variables measured.

“When several sources of systematic errors are identified, [uncertainty due to systematic error] beta is suggested to be calculated as a mean of bias limits or additive correction factors as follows:

“beta = sqrt[sum over(theta_S_i)^2],

“where “i” defines the sources of bias errors and theta_S is the bias range within the error source i. (my bold)”

That is, in non-linear models the uncertainty due to systematic error is propagated as the root-sum-square.

This is the correct calculation of total uncertainty in a final result, and is the approach taken in my paper.
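
The root-sum-square combination quoted above can be sketched in a few lines. The function name and the three bias limits are hypothetical, chosen only to make the arithmetic easy to check.

```python
import math


def combined_bias_uncertainty(bias_limits):
    """Root-sum-square of the bias limits theta_S_i, one per error source i."""
    return math.sqrt(sum(theta ** 2 for theta in bias_limits))


# Hypothetical bias limits for three error sources (arbitrary units).
beta = combined_bias_uncertainty([3.0, 4.0, 12.0])
print(beta)  # 13.0
```

The combined value is larger than any single source but smaller than their simple sum, which is the characteristic behavior of a root-sum-square.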

The meaning of ±4 W/m² Long Wave Cloud Forcing Error

This illustration might clarify the meaning of ±4 W/m^2 of uncertainty in annual average LWCF.

The question to be addressed is: what accuracy is necessary in simulated cloud fraction to resolve the annual impact of CO₂ forcing?

We know from Lauer and Hamilton, 2013 that the annual average ±12.1% error in CMIP5 simulated cloud fraction (CF) produces an annual average ±4 W/m^2 error in long wave cloud forcing (LWCF).

We also know that the annual average increase in CO₂ forcing is about 0.035 W/m^2.

Assuming a linear relationship between cloud fraction error and LWCF error, the GCM annual ±12.1% CF error is proportionately responsible for ±4 W/m^2 annual average LWCF error.

Then one can estimate the level of GCM resolution necessary to reveal the annual average cloud fraction response to CO₂ forcing as,

(0.035 W/m^2/±4 W/m^2)*±12.1% cloud fraction = 0.11%

That is, a GCM must be able to resolve a 0.11% change in cloud fraction to be able to detect the cloud response to the annual average 0.035 W/m^2 increase in CO₂ forcing.

A climate model must accurately simulate cloud response to 0.11% in CF to resolve the annual impact of CO₂ emissions on the climate.

To know how clouds respond to annual CO₂ forcing, the cloud feedback to a 0.035 W/m^2 annual forcing must be known, and must be simulated to a resolution of 0.11% in CF.

Here’s an alternative approach. We know the total tropospheric cloud feedback effect of the global 67% in cloud cover is about -25 W/m^2.

The annual tropospheric CO₂ forcing is, again, about 0.035 W/m^2. The CF equivalent that produces this feedback energy flux is again linearly estimated as,

(0.035 W/m^2/|25 W/m^2|)*67% = 0.094%.

That is, the second result is that cloud fraction must be simulated to a resolution of 0.094%, to reveal the feedback response of clouds to the CO₂ annual 0.035 W/m^2 forcing.

Assuming the linear estimates are reasonable, both methods indicate that about 0.1% in CF model resolution is needed to accurately simulate the annual cloud feedback response of the climate to an annual 0.035 W/m^2 of CO₂ forcing.
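
For readers who want to check the arithmetic, both linear estimates can be reproduced as follows. The function and variable names are mine; the numbers are the ones quoted above.

```python
def required_cf_resolution(signal_w_m2, reference_flux_w_m2, reference_cf_percent):
    """Linearly scale a reference cloud-fraction value by signal / reference flux."""
    return signal_w_m2 / abs(reference_flux_w_m2) * reference_cf_percent


annual_co2_forcing = 0.035  # W/m^2 per year

# Method 1: +/-12.1% CF error corresponds to +/-4 W/m^2 LWCF error.
from_calibration_error = required_cf_resolution(annual_co2_forcing, 4.0, 12.1)

# Method 2: 67% global cloud cover corresponds to about -25 W/m^2 of feedback.
from_total_feedback = required_cf_resolution(annual_co2_forcing, -25.0, 67.0)

print(f"{from_calibration_error:.2f}%")  # 0.11%
print(f"{from_total_feedback:.3f}%")     # 0.094%
```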

This is why the uncertainty in projected air temperature is so great. The needed resolution is 100 times better than the available resolution.

To achieve the needed level of resolution, the model must accurately simulate cloud type, cloud distribution and cloud height, as well as precipitation and tropical thunderstorms, all to 0.1% accuracy. This requirement is an impossibility.

The CMIP5 GCM annual average 12.1% error in simulated CF is the resolution lower limit. This lower limit is 121 times larger than the 0.1% resolution limit needed to model the cloud feedback due to the annual 0.035 W/m^2 of CO₂ forcing.

This analysis illustrates the meaning of the ±4 W/m^2 LWCF error in the tropospheric feedback effect of cloud cover.

The calibration uncertainty in LWCF reflects the inability of climate models to simulate CF, and in so doing indicates the overall level of ignorance concerning cloud response and feedback.

The CF ignorance means that tropospheric thermal energy flux is never known to better than ±4 W/m^2, whether forcing from CO₂ emissions is present or not.

When forcing from CO₂ emissions is present, its effects cannot be detected in a simulation that cannot model cloud feedback response to better than ±4 W/m^2.

GCMs cannot simulate cloud response to 0.1% accuracy. They cannot simulate cloud response to 1% accuracy. Or to 10% accuracy.

Does cloud cover increase with CO₂ forcing? Does it decrease? Do cloud types change? Do they remain the same?

What happens to tropical thunderstorms? Do they become more intense, less intense, or what? Does precipitation increase, or decrease?

None of this can be simulated. None of it can presently be known. The effect of CO₂ emissions on the climate is invisible to current GCMs.

The answer to any and all these questions is very far below the resolution limits of every single advanced GCM in the world today.

The answers are not even empirically available because satellite observations are not better than about ±10% in CF.

Meaning

Present advanced GCMs cannot simulate how clouds will respond to CO₂ forcing. Given the tiny perturbation annual CO₂ forcing represents, it seems unlikely that GCMs will be able to simulate a cloud response in the lifetime of most people alive today.

The GCM CF error stems from deficient physical theory. It is therefore not possible for any GCM to resolve or simulate the effect of CO₂ emissions, if any, on air temperature.

Theory-error enters into every step of a simulation. Theory-error means that an equilibrated base-state climate is an erroneous representation of the correct climate energy-state.

Subsequent climate states in a step-wise simulation are further distorted by application of a deficient theory.

Simulations start out wrong, and get worse.

As a GCM steps through a climate simulation in an air temperature projection, knowledge of the global CF consequent to the increase in CO₂ diminishes to zero pretty much in the first simulation step.

GCMs cannot simulate the global cloud response to CO₂ forcing, and thus cloud feedback, at all for any step.

This remains true in every step of a simulation. And the step-wise uncertainty means that the air temperature projection uncertainty compounds, as Vasquez and Whiting note.

In a futures projection, neither the sign nor the magnitude of the true error can be known, because there are no observables. For this reason, an uncertainty is calculated instead, using model calibration error.

Total ignorance concerning the simulated air temperature is a necessary consequence of a cloud response ±120-fold below the GCM resolution limit needed to simulate the cloud response to annual CO₂ forcing.

On an annual average basis, the uncertainty in CF feedback into LWCF is ±114 times larger than the perturbation to be resolved.
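
The factors of 121, 114, and roughly 120 used in this post are the same ratio computed with and without rounding the needed resolution to 0.1%. A short check, with variable names of my own choosing:

```python
cf_error_percent = 12.1     # CMIP5 annual average CF error, in percent
lwcf_uncertainty = 4.0      # W/m^2
annual_co2_forcing = 0.035  # W/m^2

# Unrounded CF resolution needed, from the first linear estimate (about 0.106%).
needed_cf_resolution = annual_co2_forcing / lwcf_uncertainty * cf_error_percent

ratio_rounded_target = cf_error_percent / 0.1        # against the rounded 0.1% target
ratio_flux = lwcf_uncertainty / annual_co2_forcing   # uncertainty vs. perturbation
ratio_cf = cf_error_percent / needed_cf_resolution   # the same ratio, expressed in CF

print(round(ratio_rounded_target), round(ratio_flux), round(ratio_cf))  # 121 114 114
```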

The CF response is so poorly known that even the first simulation step enters terra incognita.

The uncertainty in projected air temperature increases so dramatically because the model is step-by-step walking away from an initial knowledge of air temperature at projection time t = 0, further and further into deep ignorance.

The GCM step-by-step journey into deeper ignorance provides the physical rationale for the step-by-step root-sum-square propagation of LWCF error.
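
A minimal sketch of step-by-step root-sum-square propagation follows. It assumes the same ±4 calibration uncertainty enters every annual step and works in the units of that uncertainty; the conversion to an air temperature uncertainty is left out, and the function name is hypothetical.

```python
import math


def propagate_rss(per_step_uncertainties):
    """Return the running root-sum-square uncertainty after each step."""
    running = []
    sum_of_squares = 0.0
    for u in per_step_uncertainties:
        sum_of_squares += u ** 2
        running.append(math.sqrt(sum_of_squares))
    return running


# Assumption: an identical +/-4 uncertainty at each of 20 annual steps.
envelope = propagate_rss([4.0] * 20)
print(envelope[0], envelope[3], round(envelope[-1], 2))
```

With a constant per-step uncertainty u, the envelope after n steps is u*sqrt(n), so it widens at every step and never narrows.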

The propagation of the GCM LWCF calibration error statistic and the large resultant uncertainty in projected air temperature is a direct manifestation of this total ignorance.

Current GCM air temperature projections have no physical meaning.

September 21, 2019 4:03 am

Pat, you will probably recall that back on your first thread https://wattsupwiththat.com/2019/09/07/propagation-of-error-and-the-reliability-of-global-air-temperature-projections-mark-ii/ I disputed the solidity of your error propagation and asked you if your theory was falsifiable, but then I went away on holiday so missed some fun and games since then. I should now like to return to this point.

You wrote in reply “the only way to falsify a physical error analysis that indicates wide model uncertainty bounds, is to show highly accuracy for the models”. I am afraid that that remark addresses one input to your equations, namely the model accuracy (or at least variance if it is a biassed model, since low variance can still be associated with high error in that case). But it does not address the equations themselves and the effect of assumptions about covariance matrices.

Recall that I wrote an equation

T_i(t) = T_i(t-1) + d_i(t) + e_i(t)

but since you have explained that it is forcing (which I’ll call F) which is iterated we should instead write

F_i(t) = F_i(t-1) + d_i(t) + e_i(t)

and then T_i is derived from that. I gave examples of how d_i(t), a non-stochastic component, and e_i(t), a stochastic component, could combine to give the final F_i(T) a low uncertainty. In your reply to me on the meaning of “uncertainty”, you wrote:

“The model then is known to predict an observable as a mean plus or minus an interval of uncertainty about that mean revealed by the now-known calibration accuracy width.”

The observable here is F_i(t), so the uncertainty is a statistical confidence interval for its value (with a normal distribution that would be something like mean +/- 2 standard deviations). So the question is how can we falsify your prediction of the uncertainty interval, in a scientific manner? I suggest to you the following.

You have a known calibration accuracy width, which is a standard deviation s (if there are multiple parameters then s will be the square root of a covariance matrix). Then after n time steps the model is known to predict a value m +/- p(s,n) where p is the function which projects the uncertainty. I believe you are using p(s,n) = s sqrt(n). The fact that for each model multiple calibration experiments were possible in order to estimate s means that multiple predictive experiments can be done to test whether p(s,n) is correct. Specifically:

Choose n, say 4 (years).
Choose a replication number m, say 100.
Run the model m times (you may need to ask the model owner to do this on your behalf), recording x_1,…,x_m. Let

a = sum_i x_i/m, v = sum_i (x_i – a)^2/(m p(s,n)^2).

Test the value of v using statistics.

v should be distributed as chi-squared with m-1 degrees of freedom, and its expected value is 1.
If v is outside the 95% confidence interval for that chi-squared, then your model will have been falsified.
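
The statistic in the steps above can be sketched as follows. The real experiment would need m actual model runs; here synthetic Gaussian draws stand in for them, which is purely an assumption so the sketch can execute. The function names are hypothetical.

```python
import math
import random


def projected_uncertainty(s, n):
    """p(s, n) = s * sqrt(n), the propagation form assumed in this comment."""
    return s * math.sqrt(n)


def spread_statistic(results, p):
    """Return (a, v): the sample mean and the scaled variance defined above."""
    m = len(results)
    a = sum(results) / m
    v = sum((x - a) ** 2 for x in results) / (m * p ** 2)
    return a, v


# Stand-in for m = 100 model runs: Gaussian draws whose spread equals p(s, n).
rng = random.Random(0)
p = projected_uncertainty(s=4.0, n=4)
runs = [rng.gauss(0.0, p) for _ in range(100)]
a, v = spread_statistic(runs, p)
print(f"a = {a:.3f}, v = {v:.3f}")
```

Under the Gaussian assumption it is m*v, rather than v itself, that follows a chi-squared distribution with m-1 degrees of freedom, so the expected value of v is (m-1)/m, which is close to 1 for m = 100.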

I suggested a fairly small n there, 4, and it is possible that your theory will just survive that test. But try n=25 for a keener test.

Do you see anything wrong with that proposal for an experiment?

Rich.

Reply to  See - owe to Rich
September 21, 2019 8:47 am

You are missing the point. “Calibration accuracy” is not a statistical calculation with a standard deviation. There is no population driven data set determining a mean and standard deviation. It is merely an interval where any value within it has equal probability of occurring. That is why it is “uncertain”. That is why it propagates rather than converging to a mean.

Reply to  Jim Gorman
September 21, 2019 10:38 am

Jim,

I am afraid that your comment contradicts Pat’s words which I quoted:

“The model then is known to predict an observable as a mean plus or minus an interval of uncertainty about that mean revealed by the now-known calibration accuracy width.”

So I’ll await Pat’s reply.

Reply to  See - owe to Rich
September 21, 2019 11:51 am

I’m afraid your quote leaves out a bit of context, Rich.

Here’s what I wrote, with the missing sentence added back:

The model then is known to predict an observable as a mean plus or minus an interval of uncertainty about that mean revealed by the now-known calibration accuracy width.

That width provides the uncertainty when the model is used to predict a future state.

Changes things a bit from what you implied, doesn’t it?

Jim Gorman has it exactly right. His explanation didn’t contradict my words at all.

Reply to  See - owe to Rich
September 21, 2019 11:45 am

Rich, if you want to assess my work, you should refer to the equation I used. Not make something up, and then compose a straw man argument.

It should be obvious that the (delta)F_i in eqn. 1 of the paper are the IPCC’s SRES and RCP forcings. They are givens, and have no error or uncertainty.

Your first equation was irrelevant, and so is your second. T is derived as my equation 1. Not from anything you might invent.

You wrote, “The observable here is F_i(t)” No, it is not. There are in fact no observables. There are predictions and their uncertainties.

You wrote, “So the question is how can we falsify your prediction of the uncertainty interval,…”

It’s not my uncertainty interval. It comes from Lauer and Hamilton, 2013. If you want to falsify it, then show that their work is wrong.

You wrote, “If v is outside the 95% confidence interval for that chi-squared, then your model will have been falsified.”

Not correct. The (+/-)4 W/m^2 calibration error statistic represents the average of 27 models run across 20 years of hindcast. Each model will have its own calibration error, probably unique to itself. It is unlikely that any one model will display the same calibration error as the average of 27 of them.

So, your experiment tells us nothing.

In any case, the calibration error statistic was derived doing exactly the experiment you described: model runs, followed by comparisons against observations. It’s an empirical statistic.

The fact that the experiment you wanted produced the calibration statistic I used, effectively falsifies your argument on your own grounds.

You have yet to grasp the nettle of my analysis, Rich. Everything you’ve tried has been totally misconceived.

Reply to  Pat Frank
September 22, 2019 2:33 am

Pat, I shall reply below to your points, which I annotate with P:, and then annotate my own as R:.

P: Rich, if you want to assess my work, you should refer to the equation I used. Not make something up, and then compose a straw man argument.

R: The reason I am “making things up” is that I am trying to see if there is an error regime which can explain your observations about the models and yet have a smaller propagation error than you claim. This is a legitimate activity, but is definitely a work in progress.

P: It should be obvious that the (delta)F_i in eqn. 1 of the paper are the IPCC’s SRES and RCP forcings. They are givens, and have no error or uncertainty.

R: OK, in a paper which I am writing I am using F to cover all forcings, so mistakenly used that letter here, since you are using it only for CO2 forcing. The variable at issue here is the Total Cloud Forcing, which you rightly say gives rise to errors in the models. So let’s replace ‘F’ in that equation by ‘C’ to denote this cloud forcing variable. It is up to me to see if I can make anything meaningful out of that.

P: Your first equation was irrelevant, and so is your second. T is derived as my equation 1. Not from anything you might invent.

R: Then your T is a deterministic mean and has no information about error or uncertainty. I shall ponder on how to deal with that – to do statistics one needs error distributions, and they must be lurking somewhere.

P: You wrote, “The observable here is F_i(t)” No, it is not. There are in fact no observables. There are predictions and their uncertainties.

R: No comment at this time.

P: You wrote, “So the question is how can we falsify your prediction of the uncertainty interval,…”
It’s not my uncertainty interval. It comes from Lauer and Hamilton, 2013. If you want to falsify it, then show that their work is wrong.

R: I currently have no qualms about the 4 W/m^2 TCF error which seems to come from their paper. I had hoped that I had made it clear that it is the propagation of the errors over time which concerns me, where you cite Bevington & Robinson (2003). My fear is that your analysis makes unfounded assumptions about error structure. But I cannot prove that yet.

P: You wrote, “If v is outside the 95% confidence interval for that chi-squared, then your model will have been falsified.”
Not correct. The (+/-)4 W/m^2 calibration error statistic represents the average of 27 models run across 20 years of hindcast. Each model will have its own calibration error, probably unique to itself. It is unlikely that any one model will display the same calibration error as the average of 27 of them.

R: In that case I have to replace “run the model” with “run the 27 models”. That would be a lot of work. But do you accept that your theory predicts that multiple runs of those models will give ensemble means which vary widely? And therefore a test of that prediction can be devised? I believe that is the crux of the matter.

P: So, your experiment tells us nothing.
In any case, the calibration error statistic was derived doing exactly the experiment you described: model runs, followed by comparisons against observations. It’s an empirical statistic.
The fact that the experiment you wanted produced the calibration statistic I used, effectively falsifies your argument on your own grounds.

R: I don’t believe those model runs ran 20 years into the future and found uncertainty intervals of 4*sqrt(20) W/m^2 with concomitant effect on model temperatures. But you have studied them more than I, so perhaps you will correct me.

P: You have yet to grasp the nettle of my analysis, Rich. Everything you’ve tried has been totally misconceived.

R: Perhaps. Or maybe you have yet to grasp the nettle of my objections 🙂 The next thing I am going to think about is whether statistics from a single model, rather than the 27, can provide useful information about the error propagation, and also follow some of your links about such in your helpful comment below (Sep21 12:05pm).

Rich.

Reply to  See - owe to Rich
September 22, 2019 12:52 pm

Rich, you wrote, “The variable at issue here is the Total Cloud Forcing, which you rightly say gives rise to errors in the models.”

Actually, it’s approximately the other way around. (Theory)-error in the models gives rise to incorrectly simulated total cloud fraction (not forcing).

You wrote, “R: Then your T is a deterministic mean and has no information about error or uncertainty. I shall ponder on how to deal with that – to do statistics one needs error distributions, and they must be lurking somewhere.”

Try reading paper sections, CMIP5 Model Calibration Error in Global Average Annual Total Cloud Fraction (TCF) and A Lower Limit of Uncertainty in the Modeled Global Average Annual Thermal Energy Flux

They’ll tell you where the uncertainty comes from.

You wrote, “I currently have no qualms about the 4 W/m^2 TCF error which seems to come from their (Lauer and Hamilton, 2013) paper.”

It’s not error, Rich. It’s uncertainty. And it’s not 4 W/m^2, it’s (+/-)4 W/m^2.

These two distinctions are central to my analysis. Virtually all of my critics have ignored, misunderstood, or avoided them.

You wrote, “But do you accept that your theory predicts that multiple runs of those models will give ensemble means which vary widely?”

No, because it does no such thing. Uncertainty is about reliability, not specific outcomes or errors. Uncertainty says that even if all the models produced identical air temperature projections, the uncertainty in them would remain unchanged.

You wrote, “I don’t believe those model runs ran 20 years into the future and found uncertainty intervals of 4*sqrt(20) W/m^2 with concomitant effect on model temperatures.”

The test runs were 20-year hindcasts, not projections. Models are tuned to reproduce known air temperatures. Their inter-model correspondence is no surprise. It is put in by hand.

See J. T. Kiehl (2007), Twentieth century climate model response and climate sensitivity, GRL 34(22), L22710.

Uncertainty doesn’t specify a spread of model simulated air temperatures. It specifies whether they are reliable predictions.

Please look at the extract from Kline, 1985 in the selections on uncertainty analysis I posted in this thread, here.

You wrote, “Or maybe you have yet to grasp the nettle of my objections.”

So far, Rich, your objections have centered on inventions.

Reply to  Pat Frank
September 22, 2019 2:27 pm

Pat, I have started to study your helpful screed below, and will tend to make short piecemeal comments rather than save everything up.

First off, you have ticked me off a few times for not using +/-, but for me whenever I say that an error or uncertainty bound is X I always mean +/-X to be understood. Will have to beg forgiveness for that.

Now the enlightening thing about that screed is that it highlights a semantic problem between scientists/engineers and statisticians (and I am the latter). A statistician talks in terms of random variables and random variates. So the result of an experiment, a priori, is X which is a random variable, often assumed to be a normal random variable with mean m and some variance s^2. Then a posteriori, after the experiment has been performed, a value x has been observed, and this is a random variate.

Now assume that m is actually known, which might be the case for some calibration experiments. Then a statistician would call X-m the unknown random error and call x-m the observed error. But uncertainty wallahs would apparently call them “uncertainty” and “error” respectively.

I think this can at least explain the occasional talking at cross purposes. Will continue thinking, especially about propagation over time.

Reply to  Pat Frank
September 22, 2019 3:11 pm

Rich, by “screed” do you mean the extracts from the literature? If so, that’s an unusual usage.

Second, I regret ticking you off, but I couldn’t have surmised what you meant when it was never specified.

If you’ve followed the debate, here and elsewhere, Nick Stokes, ATTP, and others have argued that rmse is always a positive magnitude.

This is always in the larger context of their claim that all GCM error is a mere offset bias that, known or unknown, always subtracts away.

This nonsense gives them call to assert that all GCM simulation anomalies are perfectly accurate.

The same false assertion of constant bias error is a kind of folk-belief among climate modelers. I’ve encountered it a number of times among my reviewers.

So, I’ve learned to be very careful when someone leaves off the (+/-), because that has typically been the opening gambit of asserting that all GCM simulation errors are positive offset errors that subtract away into predictive perfection.

You apparently did not intend that meaning. I regret not knowing that, and upsetting you.

But around here, if you write a rmse using the positive value convention, many are likely to abuse it as a sign that you too believe that (+/-) = +, and go on to argue perfectly accurate anomalies. Please be careful. Openings get exploited.

One element of experimental science to keep in mind is that systematic errors, especially if stemming from uncontrolled variables, or from within deficient theory in non-linear models, are unlikely to be normally distributed. They are often of unpredictable distribution, which is again different from randomly distributed.

The messiness of the real world always intrudes.

September 21, 2019 12:05 pm

For the benefit of all, I’ve put together an extensive post that provides quotes, citations, and URLs for a variety of papers — mostly from engineering journals, but I do encourage everyone to closely examine Vasquez and Whiting — that discuss error analysis, the meaning of uncertainty, uncertainty analysis, and the mathematics of uncertainty propagation.

These papers utterly support the error analysis in “Propagation of Error and the Reliability of Global Air Temperature Projections.”

Summarizing: Uncertainty is a measure of ignorance. It is derived from calibration experiments.

Multiple uncertainties propagate as root sum square. Root-sum-square has positive and negative roots (+/-). Never anything else, unless one wants to consider the uncertainty absolute value.

Uncertainty is an ignorance width. It is not an energy. It does not affect energy balance. It has no influence on TOA energy or any other magnitude in a simulation, or any part of a simulation, period.

Uncertainty does not imply that models should vary from run to run. Nor does it imply inter-model variation. Nor does it necessitate lack of TOA balance in a climate model.

For those who are scientists and who insist that uncertainty is an energy and influences model behavior (none of you will be engineers), or that a (+/-)uncertainty is a constant offset, I wish you a lot of good luck because you’ll not get anywhere.

For the deep-thinking numerical modelers who think rmse = constant offset or is a correlation: you’re wrong.

The literature follows:

Moffat RJ. Contributions to the Theory of Single-Sample Uncertainty Analysis. Journal of Fluids Engineering. 1982;104(2):250-8.

Uncertainty Analysis is the prediction of the uncertainty interval which should be associated with an experimental result, based on observations of the scatter in the raw data used in calculating the result.

Real processes are affected by more variables than the experimenters wish to acknowledge. A general representation is given in equation (1), which shows a result, R, as a function of a long list of real variables. Some of these are under the direct control of the experimenter, some are under indirect control, some are observed but not controlled, and some are not even observed.

R=R(x_1,x_2,x_3,x_4,x_5,x_6, . . . ,x_N)

It should be apparent by now that the uncertainty in a measurement has no single value which is appropriate for all uses. The uncertainty in a measured result can take on many different values, depending on what terms are included. Each different value corresponds to a different replication level, and each would be appropriate for describing the uncertainty associated with some particular measurement sequence.

The Basic Mathematical Forms

The uncertainty estimates, dx_i or dx_i/x_i in this presentation, are based, not upon the present single-sample data set, but upon a previous series of observations (perhaps as many as 30 independent readings) … In a wide-ranging experiment, these uncertainties must be examined over the whole range, to guard against singular behavior at some points.

Absolute Uncertainty

x_i = (x_i)_avg (+/-)dx_i

Relative Uncertainty

x_i = (x_i)_avg (+/-)dx_i/x_i

Uncertainty intervals throughout are calculated as (+/-)sqrt[sum over (error)^2].

The uncertainty analysis allows the researcher to anticipate the scatter in the experiment, at different replication levels, based on present understanding of the system.

The calculated value dR_0 represents the minimum uncertainty in R which could be obtained. If the process were entirely steady, the results of repeated trials would lie within (+/-)dR_0 of their mean …

Nth Order Uncertainty

The calculated value of dR_N, the Nth order uncertainty, estimates the scatter in R which could be expected with the apparatus at hand if, for each observation, every instrument were exchanged for another unit of the same type. This estimates the effect upon R of the (unknown) calibration of each instrument, in addition to the first-order component. The Nth order calculations allow studies from one experiment to be compared with those from another ostensibly similar one, or with “true” values.

Here, replace “instrument” with “climate model.” The relevance is immediately obvious. An Nth order GCM calibration experiment averages the expected uncertainty from N models and allows comparison of the results of one model run with another in the sense that the reliability of their predictions can be evaluated against the general dR_N.

Continuing: “The Nth order uncertainty calculation must be used wherever the absolute accuracy of the experiment is to be discussed. First order will suffice to describe scatter on repeated trials, and will help in developing an experiment, but Nth order must be invoked whenever one experiment is to be compared with another, with computation, analysis, or with the “truth.”

Nth order uncertainty:

* Includes instrument calibration uncertainty, as well as unsteadiness and interpolation.
* Useful for reporting results and assessing the significance of differences between results from different experiments and between computation and experiment.

The basic combinatorial equation is the Root-Sum-Square:

dR = sqrt[sum over((dR/dx_i)*dx_i)^2]

https://doi.org/10.1115/1.3241818

Moffat RJ. Describing the uncertainties in experimental results. Experimental Thermal and Fluid Science. 1988;1(1):3-17.

The error in a measurement is usually defined as the difference between its true value and the measured value. … The term “uncertainty” is used to refer to “a possible value that an error may have.” … The term “uncertainty analysis” refers to the process of estimating how great an effect the uncertainties in the individual measurements have on the calculated result.

THE BASIC MATHEMATICS

This section introduces the root-sum-square (RSS) combination (my bold), the basic form used for combining uncertainty contributions in both single-sample and multiple-sample analyses. In this section, the term dX_i refers to the uncertainty in X_i in a general and nonspecific way: whatever is being dealt with at the moment (for example, fixed errors, random errors, or uncertainties).

Describing One Variable

Consider a variable X_i, which has a known uncertainty dX_i. The form for representing this variable and its uncertainty is

X_i = X_i(measured) (+/-)dX_i (20:1)

This statement should be interpreted to mean the following:
* The best estimate of X_i is X_i (measured)
* There is an uncertainty in X_i that may be as large as (+/-)dX_i
* The odds are 20 to 1 against the uncertainty of X_i being larger than (+/-)dX_i.

The value of dX_i represents 2-sigma for a single-sample analysis, where sigma is the standard deviation of the population of possible measurements from which the single sample X_i was taken.
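As a rough check on the 20:1 figure, the coverage of a (+/-)2-sigma interval can be computed directly. This is only a sketch: it assumes the population of possible measurements is normally distributed, which the excerpt does not state, and the function name is illustrative.

```python
from statistics import NormalDist


def odds_against_exceeding(k_sigma: float) -> float:
    """Odds against a normal deviate landing outside +/- k_sigma.

    Assumes a normal population; that is an assumption of this sketch,
    not something stated in the quoted text.
    """
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    inside = std_normal.cdf(k_sigma) - std_normal.cdf(-k_sigma)
    return inside / (1.0 - inside)


if __name__ == "__main__":
    odds = odds_against_exceeding(2.0)
    print(f"odds against exceeding 2 sigma: about {odds:.1f} to 1")
```

Under that normality assumption the 2-sigma odds come out close to, though not exactly, 20 to 1.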

The uncertainty (+/-)dX_i Moffat described, exactly represents the (+/-)4W/m^2 LWCF calibration error statistic derived from the combined individual model errors in the test simulations of 27 CMIP5 climate models.

For multiple-sample experiments, dX_i can have three meanings. It may represent t*S_(N)/sqrt(N) for random error components, where S_(N) is the standard deviation of the set of N observations used to calculate the mean value (X_i)_bar and t is the Student’s t-statistic appropriate for the number of samples N and the confidence level desired. It may represent the bias limit for fixed errors (this interpretation implicitly requires that the bias limit be estimated at 20:1 odds). Finally, dX_i may represent U_95, the overall uncertainty in X_i.

From the “basic mathematics” section above, the over-all uncertainty U = root-sum-square = sqrt[sum over((+/-)dX_i)^2] = the root-sum-square of errors (rmse). That is U = sqrt[sum over((+/-)dX_i)^2] = (+/-)rmse.

The result R of the experiment is assumed to be calculated from a set of measurements using a data interpretation program (by hand or by computer) represented by

R = R(X_1,X_2,X_3,…, X_N)

The objective is to express the uncertainty in the calculated result at the same odds as were used in estimating the uncertainties in the measurements.

The effect of the uncertainty in a single measurement on the calculated result, if only that one measurement were in error would be

dR_x_i = (dR/dX_i)*dX_i

When several independent variables are used in the function R, the individual terms are combined by a root-sum-square method.

dR = sqrt[sum over((dR/dX_i)*dX_i)^2]

This is the basic equation of uncertainty analysis. Each term represents the contribution made by the uncertainty in one variable, dX_i, to the overall uncertainty in the result, dR.

http://www.sciencedirect.com/science/article/pii/089417778890043X
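The root-sum-square combination quoted from Moffat can be sketched numerically. The code below is a minimal illustration, not Moffat's own procedure: it estimates each sensitivity dR/dX_i by central finite differences, and the example result function (electrical power from a voltage and a current) and all numbers are hypothetical.

```python
import math
from typing import Callable, Sequence


def rss_uncertainty(
    func: Callable[[Sequence[float]], float],
    x: Sequence[float],
    dx: Sequence[float],
    rel_step: float = 1e-6,
) -> float:
    """Return dR = sqrt(sum(((dR/dX_i) * dX_i)^2)) for R = func(x).

    Sensitivities are approximated by central differences, which is a
    choice made for this sketch only.
    """
    total = 0.0
    for i in range(len(x)):
        h = rel_step * max(abs(x[i]), 1.0)
        hi = list(x)
        lo = list(x)
        hi[i] += h
        lo[i] -= h
        sensitivity = (func(hi) - func(lo)) / (2.0 * h)
        total += (sensitivity * dx[i]) ** 2
    return math.sqrt(total)


def power(v: Sequence[float]) -> float:
    """Hypothetical result function: P = V * I."""
    return v[0] * v[1]


# Hypothetical measurements: V = 10.0 +/- 0.1, I = 2.0 +/- 0.05
dP = rss_uncertainty(power, [10.0, 2.0], [0.1, 0.05])
print(f"dP = {dP:.4f}")
```

For this example the two contributions are (2.0 * 0.1) and (10.0 * 0.05), so the combined value is sqrt(0.04 + 0.25).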

Vasquez VR, Whiting WB. Accounting for Both Random Errors and Systematic Errors in Uncertainty Propagation Analysis of Computer Models Involving Experimental Measurements with Monte Carlo Methods. Risk Analysis. 2006;25(6):1669-81.

[S]ystematic errors are associated with calibration bias in the methods and equipment used to obtain the properties. Experimentalists have paid significant attention to the effect of random errors on uncertainty propagation in chemical and physical property estimation. However, even though the concept of systematic error is clear, there is a surprising paucity of methodologies to deal with the propagation analysis of systematic errors. The effect of the latter can be more significant than usually expected.

Usually, it is assumed that the scientist has reduced the systematic error to a minimum, but there are always irreducible residual systematic errors. On the other hand, there is a psychological perception that reporting estimates of systematic errors decreases the quality and credibility of the experimental measurements, which explains why bias error estimates are hardly ever found in literature data sources.

Of particular interest are the effects of possible calibration errors in experimental measurements. The results are analyzed through the use of cumulative probability distributions (cdf) for the output variables of the model.

A good general definition of systematic uncertainty is the difference between the observed mean and the true value.

Also, when dealing with systematic errors we found from experimental evidence that in most of the cases it is not practical to define constant bias backgrounds. As noted by Vasquez and Whiting (1998) in the analysis of thermodynamic data, the systematic errors detected are not constant and tend to be a function of the magnitude of the variables measured.

Additionally, random errors can cause other types of bias effects on output variables of computer models. For example, Faber et al. (1995a, 1995b) pointed out that random errors produce skewed distributions of estimated quantities in nonlinear models. Only for linear transformation of the data will the random errors cancel out.

Although the mean of the cdf for the random errors is a good estimate for the unknown true value of the output variable from the probabilistic standpoint, this is not the case for the cdf obtained for the systematic effects, where any value on that distribution can be the unknown true. The knowledge of the cdf width in the case of systematic errors becomes very important for decision making (even more so than for the case of random error effects) because of the difficulty in estimating which is the unknown true output value. (emphasis in original)

It is important to note that when dealing with nonlinear models, equations such as Equation (2) will not estimate appropriately the effect of combined errors because of the nonlinear transformations performed by the model.

Equation (2) is the standard uncertainty propagation sqrt[sum over(±sys error statistic)^2].

In principle, under well-designed experiments, with appropriate measurement techniques, one can expect that the mean reported for a given experimental condition corresponds truly to the physical mean of such condition, but unfortunately this is not the case under the presence of unaccounted systematic errors.

When several sources of systematic errors are identified, beta is suggested to be calculated as a mean of bias limits or additive correction factors as follows:

beta ~ sqrt[sum over(theta_S_i)^2], where i defines the sources of bias errors and theta_S is the bias range within the error source i. Similarly, the same approach is used to define a total random error based on individual standard deviation estimates,

e_k = sqrt[sum over(sigma_R_i)^2]

A similar approach for including both random and bias errors in one term is presented by Dietrich (1991) with minor variations, from a conceptual standpoint, from the one presented by ANSI/ASME (1998).

http://dx.doi.org/10.1111/j.1539-6924.2005.00704.x

Kline SJ. The Purposes of Uncertainty Analysis. Journal of Fluids Engineering. 1985;107(2):153-60.

The Concept of Uncertainty

Since no measurement is perfectly accurate, means for describing inaccuracies are needed. It is now generally agreed that the appropriate concept for expressing inaccuracies is an “uncertainty” and that the value should be provided by an “uncertainty analysis.”

An uncertainty is not the same as an error. An error in measurement is the difference between the true value and the recorded value; an error is a fixed number and cannot be a statistical variable. An uncertainty is a possible value that the error might take on in a given measurement. Since the uncertainty can take on various values over a range, it is inherently a statistical variable.

The term “calibration experiment” is used in this paper to denote an experiment which: (i) calibrates an instrument or a thermophysical property against established standards; (ii) measures the desired output directly as a measurand so that propagation of uncertainty is unnecessary.

The information transmitted from calibration experiments into a complete engineering experiment on engineering systems or a record experiment on engineering research needs to be in a form that can be used in appropriate propagation processes (my bold). … Uncertainty analysis is the sine qua non for record experiments and for systematic reduction of errors in experimental work.

Uncertainty analysis is … an additional powerful cross-check and procedure for ensuring that requisite accuracy is actually obtained with minimum cost and time.

Propagation of Uncertainties Into Results

In calibration experiments, one measures the desired result directly. No problem of propagation of uncertainty then arises; we have the desired results in hand once we complete measurements. In nearly all other experiments, it is necessary to compute the uncertainty in the results from the estimates of uncertainty in the measurands. This computation process is called “propagation of uncertainty.”

Let R be a result computed from n measurands x_1, … x_n, and W denotes an uncertainty with the subscript indicating the variable. Then, in dimensional form, we obtain: (W_R = sqrt[sum over(error_i)^2]).

https://doi.org/10.1115/1.3242449

Henrion M, Fischhoff B. Assessing uncertainty in physical constants. American Journal of Physics. 1986;54(9):791-8.

“Error” is the actual difference between a measurement and the value of the quantity it is intended to measure, and is generally unknown at the time of measurement. “Uncertainty” is a scientist’s assessment of the probable magnitude of that error.

https://aapt.scitation.org/doi/abs/10.1119/1.14447

Matthew R Marler
Reply to  Pat Frank
September 21, 2019 11:47 pm

Pat Frank, I skimmed the Vasquez and Whiting paper and read through its reference section. It looks like a solid addition to my education. Thank you.

That post is a good contribution to this discussion.

For more about me, go here: https://www.researchgate.net/profile/Matthew_Marler

September 21, 2019 12:50 pm

I think Stokes is right. The toy model loses it. CMIP5 does not. And if it does, it’s fixed. But you aren’t talking of the math of a CMIP5. You are talking about your toy model. And I think you are pulling out 1 variable of maybe 20 that they use and attacking that one in your toy model.

A CMIP5 has maybe 20 things. A toy model has 3 or whatever it is. When we take 1 of those 20 things and put into the toy model, that model doesn’t work. I don’t care.

The CMIP5 has a deal. It’s something to do with the TOA. It keeps the CMIP5 bounded. They all have it and it works. And when it doesn’t work, it’s fixed. The toy model doesn’t have this same deal.

That the CMIP5 is bounded means its problem has been fixed. I can say a CMIP5 has problems. But then I need to demonstrate it. With results. And the CMIP5s did have problems, but the one at issue has been fixed. And now it gives results.

Yet the problem identified should at least show up in some kind of distribution in all of the CMIP5 results. Because whatever math problem you found, should always be there, in all of the results, though maybe in a bell curve distribution. But it could just cancel out like has been suggested.

But if a simple math case is solid as suggested, then it lives in everything at issue. If it’s true, it’s there. Now where is it?

Matthew R Marler
Reply to  Ragnaar
September 21, 2019 2:01 pm

Ragnaar: Now where is it?

We can not know that until we know the exact truth.

Reply to  Matthew R Marler
September 22, 2019 11:14 am

I am saying if this error exists, we should see it in the CMIP5 outputs. And as a warm bias only half the time as I understand it. And a cold bias the other half.

Reply to  Ragnaar
September 22, 2019 1:05 pm

You’d be wrong, Ragnaar. Like so many others, you think uncertainty is error. It is not.

Uncertainty does not specify a range of model simulated air temperatures.

Uncertainty in projected temperature, i.e., (+/-)C, does not imply that the projection should sometimes have positive temperature error and sometimes negative.

Your entire approach to the problem of uncertainty is incorrect.

Please look at the extract from Kline, 1985 in the set of extracts I posted above, here.

Second paragraph, first line is, “An uncertainty is not the same as an error.”

As I mentioned to Rich above, CMIP5 models (like all others) are tuned to produce the historical temperature trends. Their inter-model consistency is put in by hand.

See J. T. Kiehl (2007), “Twentieth century climate model response and climate sensitivity,” GRL 34(22), L22710.

Matthew R Marler
Reply to  Ragnaar
September 22, 2019 6:13 pm

Ragnaar: I am saying if this error exists, we should see it in the CMIP5 outputs. And as a warm bias only half the time as I understand it. And a cold bias the other half.

And I have written that you are wrong on both counts.

There is an ambiguity, perhaps, about which “time interval” you are halving. There is no reason to think that the error is + for half of the GCM runs and – for the other half.

Reply to  Ragnaar
September 21, 2019 2:52 pm

Ragnaar,

“The toy model loses it. CMIP5 does not. And if it does, it’s fixed. But you aren’t talking of the math of a CMIP5. You are talking about your toy model. And I think you are pulling out 1 variable of maybe 20 that they use and attacking that one in your toy model.”

If that one variable provides the same output from the “toy” model as does the CMIP5 for temperature prediction then which is the better model for temperature prediction? Frank wasn’t attacking the variables used per se but the uncertainty in the output of the models. It doesn’t matter if the models use 20 variables or one variable when both produce the same linear output for temperature!

You sound a little bit jealous over the climate models to me. Their complexity is not a virtue if they are nothing more than generators of a linear relationship.

Reply to  Ragnaar
September 21, 2019 6:04 pm

Uncertainty is not bounded, Ragnaar.

Propagated calculational uncertainty can grow well beyond the limits of a bounded system. When it does so, it means the model expectation values have no information; no physical meaning.

A lower limit of resolution defines the limit of model reliability. One needs only to evaluate that limit, to know whether to trust a model. Which is what I have done.

Throughout your comment, you’ve confused error with uncertainty. A fatal mistake. Nothing of what you wrote has any relevance.

Reply to  Pat Frank
September 22, 2019 11:07 am

Because a thing is a problem in one context, your model, doesn’t mean it’s a problem in another context, a CMIP5. A CMIP5 is a system. With all things interacting and dependent on each other. With your thing that is the problem, there are other things keeping it in check in a CMIP5. I am making a few assumptions here. When you don’t keep it in check, it is a problem as you are saying.

Here’s what I think would help you. Put it into a picture story. Your story exists beyond almost everyone’s understanding. I am trying to tell a story about it. You could even put my attempt into a picture story, to show me why I am wrong.

Here’s another attempt. My pick-up works. Someone is telling me it shouldn’t and should waver all over the road going half into the other lane every 30 seconds. But it works fine. My pick-up has errors as it’s 20 years old. The steering ain’t what it used to be. My best proof is not responding to whomever is saying my pick-up doesn’t work and just driving around as I normally do. So you need some crashes.

Reply to  Ragnaar
September 22, 2019 1:09 pm

Your approach to the problem remains wrong, Ragnaar. It’s not about error, it’s about uncertainty.

It’s not a question of your pick-up wandering around on the road.

It’s a question of whether it will get you to the town 80 km away, given that the mechanic says your transmission is failing and it is presently making awful grinding noises.

Uncertainty, Ragnaar, not error.

Reply to  Pat Frank
September 22, 2019 3:14 pm

I am uncertain to what inch my left front tire is off the center line all the time. I am uncertain every second. But I am fine. The other guy who posted on this showed a story of the temperature over time. A story in pictures.

Life exists with uncertainty. Now model it.

Reply to  Ragnaar
September 22, 2019 4:42 pm

“With your thing that is the problem, there are other things keeping it in check in a CMIP5. ”

The issue isn’t “keeping it in check”. The issue is how certain the output is. Boundary limits don’t guarantee accuracy, the uncertainty still remains.

Reply to  Tim Gorman
September 22, 2019 8:00 pm

So we have uncertainty that is bounded. Now this uncertainty must be assigned a value that is useful. Say it’s growing exponentially. But 90% of that growth is being thrown away by what bounds it. So now we need the effective uncertainty. If we are throwing away uncertainty, where does that leave us? Above I say my left front tire is I don’t know how many inches from the center line. But as long as things look right, I don’t care. I am throwing away uncertainty all the time and that doesn’t bother me. That there is this uncertainty, doesn’t seem to impact the system. You might say CMIP5s can’t do what I am doing. They track. That’s what prevents them from blowing up, same with the climate. I will say it again. CMIP5s track. The spin up is an example of that. Everything that is wrong, go away. That’s tracking. So as has been said, they track CO2 too. You set the CO2, and they track it relentlessly for I suppose at least 50 years. But this propagation of uncertainty seems to have been solved so far by tracking a CO2 equilibrium. That is the CO2 sets the equilibrium. So they’ve been teaching the models to track the CO2 equilibrium and inventing a bunch of stuff, some of it questionable. I think it was said, the cloud problem has a made up assumption in it. But the whole deal still tracks. We can say there’s a total model uncertainty. That includes all the uncertainties if each one could be distilled.

Reply to  Ragnaar
September 23, 2019 8:02 am

“So we have uncertainty that is bounded. Now this uncertainty must be assigned a value that is useful. Say it’s growing exponentially. But 90% of that growth is being thrown away by what bounds it.”

Who says the uncertainty is bounded? In an iterative process the uncertainty adds at each step. The only boundary on how much it grows is how many iterative steps are taken. It’s why a weather forecast for 24 hours ahead is more certain than one for 48 hours. The uncertainty grows over that second 24 hours. Nor does uncertainty usually grow exponentially. Did you not read the past posts on a ruler that is “about” 12 inches long, say somewhere between 11″ and 13″? In measuring a room using that ruler the first iterative step has an uncertainty of +/- 1″. The second measurement will double that: two ruler lengths span anywhere from 22″ to 26″, i.e. 24″ with an uncertainty of +/- 2″ instead of one. The third iteration will be +/- 3″.
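The ruler arithmetic above can be sketched as a worst-case interval calculation. This is an illustration only: it assumes the same ruler, with the same unknown but fixed length between 11″ and 13″, is laid end to end each time, and the function name is made up for the example.

```python
from typing import Tuple


def ruler_span(
    n_lengths: int, shortest: float = 11.0, longest: float = 13.0
) -> Tuple[float, float]:
    """Return (nominal distance, +/- half-width) after n ruler lengths.

    The ruler's true length is unknown but fixed, so the possible total
    runs from n*shortest to n*longest and the half-width grows with n.
    """
    low = n_lengths * shortest
    high = n_lengths * longest
    nominal = (low + high) / 2.0
    half_width = (high - low) / 2.0
    return nominal, half_width


for n in (1, 2, 3):
    nominal, half_width = ruler_span(n)
    print(f"{n} length(s): {nominal:.0f} in +/- {half_width:.0f} in")
```

Because the ruler's error is the same at every step, the interval widens in direct proportion to the number of steps rather than averaging away.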

“Above I say my left front tire is I don’t know how many inches from the center line. But as long as things look right, I don’t care.”

How do you know things look “right”? If you don’t know where your tire is in relation to the center line then you don’t know where it is from the drop-off of the pavement on the ditch side. And if it is like every truck I have driven you can’t see the tire on the ditch side so you *should* care what the uncertainty is.

“That there is this uncertainty, doesn’t seem to impact the system. ”

Again, how do you know? The climate modelers never tell us what the uncertainty of the model output is. So when they claim they can *know* the average temperature growth to the nearest hundredth of a degree how are we to judge if that is inside or outside the uncertainty of the projection? Since most temperature measuring devices, especially in the past, can only resolve to the nearest +/- 0.5C how can the climate models have a resolution better than this? Especially since the central limit theorem doesn’t apply when you are using independent measurement devices measuring independent conditions. I hate to keep harping on the 1000 steel girder example but if you measure 1000 steel girders with 1000 different tape measures how does the central limit theorem help you? The differences in the measurements are not random but, instead, are systematic. You don’t know if the tape measures were hot or cold and therefore shrunken or expanded and the same applies to the girders. The central limit theorem won’t help you get an accurate average in such a case.

“Everything that is wrong, go away. That’s tracking. ”

Tracking what? We already know the model outputs don’t match the real world, they run hot. So *something* wrong did *not* go away.

“But this propagation of uncertainty seems to have been solved so far by tracking a CO2 equilibrium.”

Tracking CO2 doesn’t mean the temperature output of the models is accurate! It will still have a measure of uncertainty!

“But the whole deal still tracks. We can say there’s a total model uncertainty. That includes all the uncertainties if each one could be distilled.”

Again, how do you know the models track anything? If the uncertainty interval is larger than the resolution of the temperature changes they claim then you don’t know if the models are tracking or not.

Reply to  Tim Gorman
September 23, 2019 6:33 pm

Ragnaar, you merely find a new way to be wrong.

Some days ago, I added a long post from the literature about the meaning of uncertainty, here. It establishes that uncertainty is not error; establishes, Ragnaar.

You have ignored it. Fine. Be wrong as you prefer.

Uncertainty is unbounded. It is not constrained by physical boundary conditions.

Your arguments are wrong. Several people here have tried to show you the way, most especially Tim Gorman and Matthew Marler. You prefer to ignore their correct explanations.

Good luck Ragnaar with the rest of your life, because you’ll not get anywhere in science.

Anthony Banton
September 22, 2019 5:10 am

https://pubpeer.com/publications/391B1C150212A84C6051D7A2A7F119#5

#5 Carl Wunsch
I am listed as a reviewer, but that should not be interpreted as an endorsement of the paper. In the version that I finally agreed to, there were some interesting and useful descriptions of the behavior of climate models run in predictive mode. That is not a justification for concluding the climate signals cannot be detected! In particular, I do not recall the sentence “The unavoidable conclusion is that a temperature signal from anthropogenic CO2 emissions (if any) cannot have been, nor presently can be, evidenced in climate observables.” which I regard as a complete non sequitur and with which I disagree totally.

The published version had numerous additions that did not appear in the last version I saw.

I thought the version I did see raised important questions, rarely discussed, of the presence of both systematic and random walk errors in models run in predictive mode and that some discussion of these issues might be worthwhile.

CW

Reply to  Anthony Banton
September 22, 2019 1:27 pm

That sentence, “The unavoidable conclusion is that a temperature signal from anthropogenic CO2 emissions (if any) cannot have been, nor presently can be, evidenced in climate observables.”

… was in every single version of the paper CW saw.

“… which I regard as a complete non sequitur and with which I disagree totally.”

I would like to see CW, or anyone else, show how a model can simulate the effect of a perturbation that is two orders of magnitude below the model’s lower limit of resolution.

Given the immediately very large uncertainty bounds around projected air temperatures, it is certainly not a non-sequitur to say that a temperature signal from anthropogenic CO2 emissions (if any) cannot have been, nor presently can be, evidenced in climate observables.

I read CW’s comment yesterday at PubPeer, and admit to a bit of shock that he would write such a disavowal of his own review.

He may disagree, but it is nevertheless clear that the huge uncertainty following from low model resolution means inability to detect any temperature effect of CO2 emissions.

Let’s notice, too, Anthony Banton, that you neglected to mention any of my readily available replies there. Prejudiced a bit, is it? Or just careless.

Anthony Banton
Reply to  Pat Frank
September 23, 2019 12:54 am

“Prejudiced a bit, is it? Or just careless.”

Neither.
Just extremely relevant.
Obviously.

Gator
Reply to  Anthony Banton
September 23, 2019 4:31 am

Relevance would dictate including Pat’s replies.

Prejudiced. Biased. Denier.

Reply to  Anthony Banton
September 23, 2019 7:39 pm

Relevant to the way Carl Wunsch thinks, Anthony Banton.

And relevant to those who put politics ahead of science; to those who employ fake criticism.

Not relevant to anything in the paper.

September 23, 2019 12:18 am

Pat Sep22 3:11pm:

Note 2: “Screed”

[I shall consider my Sep22 2:27pm to be Note 1: Uncertainty = random variable, error = random variate.]

I was in danger of being a Mrs. Malaprop there (my wife sometimes complains of this), but I was helpful in prepending the adjective “helpful”. Therefore Pat realized, I think, that my “helpful screed” was not pejorative.

The New Shorter Oxford English Dictionary Volume 2, a mere 3700-odd pages long, gives “screed” as “a long, esp. tedious, piece of writing or speech; a (dull) tract”, but includes the apparently non-pejorative example “Any news will be welcome and I will give you a screed in reply”. I wonder whether the negative connotations have become more prevalent in modern usage (there is no date for the given example).

September 23, 2019 12:41 am

Note 3: Error and estimated error

In Note 1 “Uncertainty = random variable, error = random variate” I wrote about a random variable X having a mean m and variance s^2, and how if m is known then an observation x of X records an error x-m. But often m is not known. In this case replication is used; n observations x_i are taken and m* = sum_1^n x_i/n, the sample mean, is used to determine “errors” x_i-m*. However, because m* is not m, these are really estimated errors, and sometimes the distinction will be important. Often the fact of estimation is understood and the “estimated” adjective is sensibly dropped for brevity. So for example the Root Mean Squared Error (RMSE) is often of the estimated type, and is then

RMSE = sqrt(sum_1^n (x_i-m*)^2/(n-1))

Sometimes n is used in place of n-1 there, but the latter makes MSE = RMSE^2 an unbiassed estimate of the true error variance under certain conditions.
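The estimated RMSE described in this note can be written out directly. The sketch below is illustrative only: the readings are made-up numbers, and the ddof parameter name is borrowed from common usage rather than from the note.

```python
import math
import statistics
from typing import Sequence


def estimated_rmse(xs: Sequence[float], ddof: int = 1) -> float:
    """RMSE about the sample mean m*, dividing by (n - ddof).

    ddof=1 gives the n-1 form in the note; ddof=0 gives the n form.
    """
    n = len(xs)
    m_star = sum(xs) / n
    sum_sq = sum((x - m_star) ** 2 for x in xs)
    return math.sqrt(sum_sq / (n - ddof))


# Made-up observations, for illustration only.
readings = [9.8, 10.1, 10.0, 10.3, 9.9]

print(f"n-1 form: {estimated_rmse(readings):.4f}")
print(f"n form:   {estimated_rmse(readings, ddof=0):.4f}")
print(f"statistics.stdev: {statistics.stdev(readings):.4f}")
```

The n-1 form is the same quantity that Python's statistics.stdev returns, and it is always slightly larger than the n form for the same data.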

Reply to  See - owe to Rich
September 23, 2019 12:18 pm

“In Note 1 “Uncertainty = random variable”

What makes you think uncertainty is a random variable? Take my voltmeter with an uncertainty of +/- 0.1v. How is that +/- 0.1v random? It doesn’t change from reading to reading. That uncertainty will remain no matter how many individual measurements I make. You can take as many sample groups of arbitrary size as you want out of those individual measurements but you won’t cancel out the uncertainty using the central limit theorem. You might develop a mean which you *think* is more accurate than any of the individual measurements but you still won’t be sure because the uncertainty remains.
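One way to picture the voltmeter point is a small simulation. It is only a sketch under a stated assumption: the calibration problem is modeled as a fixed, unknown offset plus small random read noise, and all values are hypothetical. Averaging shrinks the noise but leaves the offset untouched.

```python
import random
import statistics


def mean_reading(
    true_value: float, fixed_offset: float, noise_sd: float, n: int, seed: int = 0
) -> float:
    """Average of n simulated readings from a meter with a fixed offset.

    The offset is the same for every reading (an assumption of this
    sketch); only the Gaussian read noise varies from reading to reading.
    """
    rng = random.Random(seed)
    readings = [
        true_value + fixed_offset + rng.gauss(0.0, noise_sd) for _ in range(n)
    ]
    return statistics.fmean(readings)


TRUE_V = 10.0   # hypothetical true voltage
OFFSET = 0.08   # hypothetical calibration offset, unknown to the user

for n in (10, 1000, 100000):
    avg = mean_reading(TRUE_V, OFFSET, noise_sd=0.05, n=n)
    print(f"n={n:>6}: mean minus true value = {avg - TRUE_V:+.4f}")
```

As n grows the mean settles near the true value plus the offset, not near the true value itself, so the user who does not know the offset gains no accuracy from the extra readings.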

Reply to  Tim Gorman
September 23, 2019 2:21 pm

Tim: well, suppose you are measuring a voltage which some super duper voltmeter has established is 10.0000 volts. What does it mean for you to say that your voltmeter has an uncertainty of +/- 0.1v? Does it mean that for example it is measuring 9.70 volts and the manufacturer has sworn blind and false that the true voltage is therefore between 9.6 and 9.8 volts (or a wider range if some normal distribution is involved)? Or does it mean that it read 9.93 volts and the manufacturer swore blind and true that the true voltage is between 9.83 and 10.03 volts?

When you have clarified what that uncertainty “interval” means, it may be possible to discuss any statistical tests that might pertain.

Reply to  See - owe to Rich
September 23, 2019 3:00 pm

See,

Every voltmeter I have uses resistive dividers to provide for measuring different voltage levels. Those resistive dividers have uncertainty built into them because no resistor can be considered perfect. Just as no thermometer can be considered perfect. Even the new thermistors used in modern thermometers have uncertainty. They may have high resolution but that doesn’t eliminate the uncertainty. Manufacturers try to reduce the uncertainty by generating calibration curves but the uncertainty can only be reduced, not eliminated. And once the thermometer leaves the calibration shop the uncertainty grows thereafter, it never gets better.

Analog meters have all kinds of uncertainties, e.g. static electricity on the dial face affecting needle positioning, non-linearity in the meter motor, even humidity. The fact that no meter dial can have infinitely scaled meter markings requires interpolation to obtain a measurement but this can be somewhat reduced by taking the average of multiple readings (the central limit theorem). The other uncertainties can’t be accounted for since the impact of static electricity, non-linear meter motors, etc. is unknown.

Digital meters have what is called “last digit” uncertainty. What does it take for that last digit to change from one digit to the next? A major unknown affecting the uncertainty of any reading.

For your example the meter would not read 9.70v. It would read 9.7v. And yes, an uncertainty of +/-0.1 v would mean the reading could be between 9.6v and 9.8v inclusive. Stop and think about your significant digits. I know that concept isn’t taught much anymore, just as uncertainty isn’t. Why would a meter with a resolution of +/- 0.1v have a readout in the hundredths? The uncertainty would swamp anything shown on the meter in the last digit – which is the entire point of Dr. Frank’s paper on uncertainty in the climate models. How can they resolve to a hundredth of a degree when their uncertainty interval is larger than that? Precision is not accuracy. Uncertainty is not error.

Reply to  Tim Gorman
September 23, 2019 6:40 pm

Calibration uncertainty is an interval within which an instrument or a model can provide no data. It is not a random error.

Calibration uncertainty is typically empirically determined. It is an empirical measure of the resolution of the method. Serial calculations employing values with appended uncertainties put a root-sum-square of those uncertainties in the final result.
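The root-sum-square combination mentioned above can be sketched in a few lines of Python. The four step uncertainties of 0.1 are hypothetical values chosen only for illustration, and `root_sum_square` is an illustrative helper name, not something taken from the paper.

```python
import math

def root_sum_square(uncertainties):
    """Combine per-step uncertainties as the square root of the sum of their squares."""
    return math.sqrt(sum(u * u for u in uncertainties))

# Hypothetical example: four serial steps, each carrying +/-0.1 of uncertainty.
step_uncertainties = [0.1, 0.1, 0.1, 0.1]
total_uncertainty = root_sum_square(step_uncertainties)

print(f"combined uncertainty after {len(step_uncertainties)} steps: +/-{total_uncertainty:.3f}")
```

For n equal steps the result is the single-step value times sqrt(n), so four steps of 0.1 combine to about 0.2 rather than 0.4.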

All that is amply explained in the post above, presenting extracts from the literature.

I have no idea why it is invisible to so many.

Reply to  Pat Frank
September 23, 2019 8:52 pm

Dr. Frank,

“I have no idea why it is invisible to so many.”

Because so many people have no actual experience with using analog methods in the field to accomplish a goal. None of them have ever had to actually worry about tolerances on a fish plate used to connect bridge girders together, or about using an analog voltmeter to determine circuit conditions in something affecting human lives. They just hook up a digital voltmeter and By God it gives you a precise and accurate value, no questions need to be asked, uncertainty is zero. Same for their digital thermometer.

And it is endemic throughout much of science today. My PhD son doing HIV research has pointed out how so many experiments today in the biological sciences simply can’t be replicated. It’s because so many of the researchers use methods with high degrees of uncertainty but they simply don’t understand that. So they trust their results to be both precise and accurate and never a question asked! My son ignored the advice of his advisors in undergraduate studies that he didn’t need any statistics classes and took 9 hours anyway. When he combines that with his knowledge of immunology he understands how to get repeatable results with an uncertainty interval. Just because two different researchers get different results doesn’t mean either is wrong, not if they are within the range of uncertainty.

Reply to  Tim Gorman
September 24, 2019 2:30 am

Tim, thanks for the clarification. I threw in the centivolt digit just to see what you would say. Now, in order for scientists to do meaningful mathematics on uncertainty, they have to formalize it into a random variable. Otherwise, uncertainties cannot be combined together in a proper manner, in the way elucidated in Pat’s paper.

In our example, we know that the voltmeter has a bias, since it is reading low. We don’t know exactly what that bias is, but a best estimate from the single experiment is -0.3v. Then on top of that there is the distribution of its error, for which we have that single reading of -0.3v. It might be that its error is uniformly distributed in the interval (-0.4,-0.2), and such an assumption might lead to reasonable conclusions. However, given the physical nature you describe it is more likely to be a normal distribution, say N(-0.3,0.1^2). But it is still more likely that it is something like N(-0.27,0.08^2). Multiple experiments, including changing the true (or more accurately known) voltage, would help to determine that. Whether the more accurate information on its error distribution would be useful would depend on the application, and one would suspect that most times it would not.

Rich.
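As a small aside on the two candidate distributions named in the comment above, this Python sketch compares their standard deviations. It relies only on the standard result that a uniform distribution of width w has standard deviation w/sqrt(12). The function name is illustrative.

```python
import math

def uniform_standard_deviation(lower, upper):
    """Standard deviation of a uniform distribution on (lower, upper)."""
    return (upper - lower) / math.sqrt(12.0)

# Uniform error on (-0.4, -0.2) versus the normal N(-0.3, 0.1^2) from the comment.
sd_uniform = uniform_standard_deviation(-0.4, -0.2)
sd_normal = 0.1

print(f"uniform on (-0.4, -0.2): sd = {sd_uniform:.4f} V")
print(f"normal N(-0.3, 0.1^2):   sd = {sd_normal:.4f} V")
```

The same +/-0.1 v figure therefore stands for different spreads depending on which distribution is assumed, which is why the choice has to be stated before any combination rule is applied.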

Reply to  See - owe to Rich
September 24, 2019 6:18 am

Rich,

“Now, in order for scientists to do meaningful mathematics on uncertainty, they have to formalize it into a random variable. Otherwise, uncertainties cannot be combined together in a proper manner, in the way elucidated in Pat’s paper.”

Why is this? When using a ruler that is assumed to be 12″ long with an uncertainty of +/- 1″ the uncertainty is not random. It is a specific interval that doesn’t change. And yet the uncertainty associated with its use over iterative measurements can be calculated quite easily – and it is not random.

I still think you are confusing error and uncertainty. You want to try and wish uncertainty away by making an unwarranted assumption that uncertainty is a random variable when it isn’t.

“In our example, we know that the voltmeter has a bias, since it is reading low. We don’t know exactly what that bias is, but a best estimate from the single experiment is -0.3v.”

You just jumped from uncertainty to error.

“Then on top of that there is the distribution of its error, for which we have that single reading of -0.3v.”

Like I said, you just jumped from uncertainty to error. An uncertainty interval doesn’t tell you what any error might be or its distribution. It just gives you an interval within which the error can exist, it doesn’t tell you the actual error.

“Multiple experiments, including changing the true (or more accurately known) voltage, would help to determine that.”

It will help determine the error distribution. It won’t change the uncertainty interval.

“Whether the more accurate information on its error distribution would be useful would depend on the application, and one would suspect that most times it would not.”

Error is not uncertainty. Uncertainty is not error. Stop trying to conflate the two and you’ll understand.

September 23, 2019 5:48 am

Note 4: The +/-4 W/m^2 is an estimated RMSE for TCF

[Mainly for my own clarification.]

Figure 6 of Pat’s paper says “identical SRES scenarios showing the ±1σ uncertainty bars due to the annual average ±4 Wm–2 CMIP5 TCF long-wave tropospheric thermal flux calibration error…”, where TCF is Total Cloud Forcing.

This estimated RMSE arises from statistical analysis of CMIP5 models. It appears from Lauer & Hamilton (2013) that it comes from 24 models times 20 annual observations.

September 23, 2019 6:08 am

Note 5: An error in Equation (5)

In Pat’s paper, it appears that Equation (5) is incorrectly derived from Equation (1). In showing this I am first going to simplify the equations in three ways. First, remove the Delta from Delta-T, because it is comparing time t to time 0 and is therefore confusing in comparison with Delta-F which compares time i to time i-1. Second, remove the K’s, as we know what units we are working in. Third, combine f_CO_2 = 0.42 and the 33 into their product 13.86. So now we have

T_t = 13.86(1+sum_{i=1}^t Delta F_i/F_0) + a (1)

Now, the only way to get to a single Delta F_i from this is to subtract a (t-1)-fold sum from a t-fold sum, thus:

T_t – T_{t-1} = 13.86 Delta F_t/F_0 (5.R)

Imagine writing i instead of t there, and then compare with Equation (5.2) as per my simplification:

T_i +/- u_i = 13.86 (1+Delta F_i/F_0) +/- 1.665 (5.2)

where 1.665 = 0.42*33*4/33.30.

I am not sure whether or how this affects the validity of the propagation analysis, which effectively uses 1.665*sqrt(t) as its output, but this mismatch between (5.2) and (5.R) is disturbing.
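The arithmetic in this note can be checked with a short Python sketch. The constants are the ones quoted in the comment (0.42, 33, 4 and 33.30). The constant names and the two helper functions are illustrative, and the sqrt(t) growth is simply the root-sum-square of t equal steps.

```python
import math

F_CO2 = 0.42         # fractional CO2 contribution, as quoted in the comment
GREENHOUSE_K = 33.0  # the 33 K term, as quoted in the comment
LWCF_ERROR = 4.0     # +/-4 W/m^2 calibration error
F0 = 33.30           # W/m^2, the divisor used in the comment

def per_step_uncertainty():
    """Single-step uncertainty u_i = 0.42 * 33 * 4 / 33.30."""
    return F_CO2 * GREENHOUSE_K * LWCF_ERROR / F0

def propagated_uncertainty(steps):
    """Root-sum-square of `steps` equal per-step uncertainties: u_i * sqrt(steps)."""
    return per_step_uncertainty() * math.sqrt(steps)

print(f"u_i = +/-{per_step_uncertainty():.3f}")
for steps in (1, 25, 100):
    print(f"after {steps:3d} steps: +/-{propagated_uncertainty(steps):.2f}")
```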

Matthew R Marler
Reply to  See - owe to Rich
September 23, 2019 9:15 am

See-owe to Rich: this mismatch between (5.2) and (5.R)

It looks to me like you have found an actual error. I can’t see where it affects the uncertainty propagation analysis. Good job. Several dozen minds must have missed that error by now.

Incidentally, I have reread not just the paper (I wonder how many remaining errors I have missed!) but also the supplementary information, which quotes reviewers’ criticisms and responds to them. It is well worth rereading.

Reply to  Matthew R Marler
September 23, 2019 7:01 pm

Matthew, there is an actual error. It is Rich’s error.

His mistake is always the same one. He invariably imposes his mistaken understanding on what I wrote.

It’s very tedious.

There is no mistake in eqn. 5, nor in eqn. 1. Indeed eqn. 5 follows directly and obviously from eqn. 1.

Reply to  Matthew R Marler
September 23, 2019 7:34 pm

Further, Rich is not calculating a (delta)T.

His subtraction is (delta)T_i – (delta)T_(i-1) = (delta-delta)T_i-(i-1). “i” can be “t;” it doesn’t matter. The equations are always (delta)T.

The final difference written correctly is, (delta-delta)T_(i – (i-1)) = 13.86*[(delta-delta)F_i-(i-1))]/F_0.

Eqn 5.2 (R) is incorrect.

Is that disturbing?

Matthew R Marler
Reply to  Pat Frank
September 23, 2019 11:50 pm

Pat Frank: Is that disturbing?

It looks like in eqn 5, deltaT,(K) is supposed to be the uncertainty in the temperature increase from time i-1 to time i, which is proportional to the change in forcing from time i-1 to time i. That is why the uncertainty has only the single term [0.42 x etc]. If that is the case, then you have forgotten to subtract F0 from the sum of the forcings in eqn1.

About eqn 1 you write: In Equation 1, delta Tt is the total change of air temperature in Kelvins across projection time t,

and F0 + sumoni deltaFi is appropriate.

In eqn 5 sumoni deltaFi has been replaced by deltaFi. Furthermore, about eqn 5 you write below figure 8 that The impact of a 0.035 Wm^-2 annual forcing change on cloud cover due to increased CO2 cannot be resolved, or simulated by, climate models that have a +/- 4 Wm^-2 resolution lower limit. That seems to imply that eqn 5 describes only the annual temperature change caused by an annual forcing change. That interpretation is consistent with the uncertainty in eqn 5, which is the uncertainty of a 1 year temp change, not the uncertainty accumulated over K years.

You also wrote before eqn 5 that: The CMIP5 average annual LWCF +/- 4.0 Wm^-2 [per year] calibration thermal flux error is now combined with the thermal flux due to GHG emissions in emulation equation 1, to produce equation 5. This will provide an estimate of the uncertainty in any tropospheric global air temperature projection made using a CMIP5 GCM. In equation 5 the step-wise GHG forcing term, deltaFi, is conditioned by the uncertainty in thermal flux in every step due to the continual imposition of LWCF thermal flux calibration error.

that makes it seem that deltaTi(K) is the change on the last step to the final K.

So, the simplest explanation is that you changed the meaning of delta Ti(K) and forgot to drop F0 from the difference in forcing.

It took a good mind to discern that the symbol delta Ti(K) is perfectly clear in each case, but has changed its meaning (if that is what occurred) from eqn 1 to eqn 5. If that is the notational detail that Nick Stokes complained of, he certainly wrote it most obscurely.

Also notice that the uncertainty in eqn 1 can now be written as:

a^2 = sum on i ui^2.

Notice also that, conditional on the error in the cloud feedback parameter, eqn 1 does not describe a random walk.

You wrote this as well: Equation 5 gives the change in temperature over a single annual step. Hence the index “i,” which you have removed. Notice: no Greek capital. No summation.

That was my interpretation at the start. If that is true, then you have forgotten to subtract off F0 in eqn 5, which I did not notice! Not a big deal. But the change in meaning of the common symbol deltaTi(K) should be explicitly noted, and (if I am correct) the eqn corrected.

Reply to  Matthew R Marler
September 24, 2019 1:55 pm

Matthew, “It looks like in eqn 5, deltaT,(K) is supposed to be the uncertainty in the temperature increase from time i-1 to time i, which is proportional to the change in forcing from time i-1 to time i.”

No, ΔT_i(K) is the change in projected temperature due to the ΔF_i change in forcing. All the uncertainty is calculated from ±4 W/m^2.

You wrote, “That seems to imply that eqn 5 describes only the annual temperature change caused by an annual forcing change.”

And so it does.

“That interpretation is consistent with the uncertainty in eqn 5, which is the uncertainty of a 1 year temp change, not the uncertainty accumulated over K years.”

Yes. The uncertainty over n years (I reserve K for Kelvins), is the rss of the individual years. Eqn. 5 shows how individual years are calculated.

“that makes it seem that deltaTi(K) is the change on the last step to the final K.”

No, it doesn’t. It makes it seem like each ΔT_i is calculated from ΔF_i.

Eqn. 1 is the expression for ΔT_t. Eqn. 5 is the expression for ΔT_i. They are not the same.

Eqn. 1 is about total change. Eqn. 5 is about the individual step.

Your problem, Matthew (if you are indeed Matthew, which I seriously doubt) is that you’re not reading carefully.

“So, the simplest explanation is that you changed the meaning of delta Ti(K) and forgot to drop F0 from the difference in forcing.”

No. There is no ΔT_i in eqn. 1. The simplest explanation is that you’re analytically careless, whoever you are.

“Also notice that the uncertainty in eqn 1 can now be written as: a^2 = sum on i ui^2.”

There is no uncertainty in eqn. 1.

“Notice also that, conditional on the error in the cloud feedback parameter, eqn 1 does not describe a random walk.”

Eqn. 1 is not meant to describe a random walk. Eqn. 1 describes the linear extrapolation of GHG forcing.

Eqn. 5 doesn’t describe a random walk, either. Eqn. 5 is about uncertainty, not about error.

“you have forgotten to subtract off F0 in eqn 5, which I did not notice!”

I forgot no such thing. F_0 should not be subtracted from eqn. 5. Eqn. 5 is just a single step version of eqn. 1, with the LWCF calibration uncertainty added.

Who are you, really. I seriously doubt you’re Matthew Marler. Matthew has given every indication of careful and knowledgeable thinking. You have given neither.

Matthew R Marler
Reply to  Matthew R Marler
September 24, 2019 5:48 pm

Pat Frank: Who are you, really. I seriously doubt you’re Matthew Marler. Matthew has given every indication of careful and knowledgeable thinking. You have given neither.

That was uncalled for.

eqn. 1: ΔT_t = 13.86(F_0+ΔF_t)/F_0
eqn. 5: ΔT_i = 13.86(F_0+ΔF_i)/F_0

ΔT_t – ΔT_i = ΔΔT = [13.86(F_0+ΔF_t)/F_0] – [13.86(F_0+ΔF_i)/F_0]

= 13.86[(F_0+ΔF_t)/F_0 – (F_0+ΔF_i)/F_0]

=13.86[((F_0+ΔF_t)-(F_0+ΔF_i))/F_0]

=13.86[(F_0+ΔF_t-F_0-ΔF_i)/F_0]

=13.86[(ΔF_t-ΔF_i)/F_0]

ΔΔT = 13.86[ΔΔF/F_0]

You’re not calculating a ΔT, Rich. You’re calculating a ΔΔT from a ΔΔF.

That equation is not a simplification of eqn. 5.

You’ve dropped the Δ on T and ignored that the Δ operators on the F’s combine in a subtraction.

If that was what you intended to write, then you need to rewrite eqn 5. If you are really calculating ” ΔΔT from a ΔΔF”, then you need to drop F0 from the computation of ΔΔF.

“Equation 1 gives the change in temperature after a total of i steps in forcing change.” If that is what you meant, it is definitely non-standard notation. Index “i” seldom denotes the number of steps over which a summation is taken, which is usually represented by a capital letter, viz {i: i = 1, …, N} at the foot of the summation sign; or “i=1” at the foot of the summation and “N” at the head of the summation sign.

Just as you have used delta Ti to represent two different things, so you have used i to represent two different things.

Reply to  Matthew R Marler
September 25, 2019 9:18 pm

Me, “Who are you, really. I seriously doubt you’re Matthew Marler. Matthew has given every indication of careful and knowledgeable thinking. You have given neither.”

Matthew, “That was uncalled for.”

Sorry, Matthew. But that post seemed very sloppy and unedited to me. It didn’t look or read like anything else of yours I’d ever read. Apologies for any offense.

You wrote, “If you are really calculating ‘ΔΔT from a ΔΔF’, then you need to drop F0 from the computation of ΔΔF.”

I’m not calculating a ΔΔT from anything. I’m not using a ΔΔF anywhere. I’m calculating exactly what eqns. 1 and 5 show.

If that is what you meant, it is definitely non-standard notation. Index “i” seldom denotes the number of steps over which a summation …

It’s standard where I come from. When N is unspecified, it just means one sums over whatever number of steps one likes.

Just as you have used delta Ti to represent two different things, so you have used i to represent two different things.

No, I have not. Eqn. 1 has ΔT_t, while eqn. 5 has ΔT_i.

Page 2: “In Equation 1, ΔT_t is the total change of air temperature in Kelvins across projection time t,…

On page 10, eqn. 5 ΔT_i is conditioned upon the step-wise forcing term ΔF_i. The “i” index always represents single steps.

It all seems very obvious. The source of your misapprehension, not.

Reply to  See - owe to Rich
September 23, 2019 6:58 pm

Equation 1 gives the change in temperature after a total of i steps in forcing change.

It is the total change in temperature after those steps; hence the Greek capital sigma on the (delta)F_i indicating a summation. The index on T is “t,” for total time.

Equation 5 gives the change in temperature over a single annual step. Hence the index “i,” which you have removed. Notice: no Greek capital. No summation.

The (delta)T does not compare to time t = 0. It is the change in temperature over the step after time t = i-1.

The delta in (delta)T_i in eqn. 5 indicates the change in temperature due to the delta forcing in step “i.” You remove it and the meaning changes.

Eqn. 5, in other words, is a single step within the totality of steps represented by eqn. 1.

The (delta)F_i come from the IPCC standard forcings. This is made abundantly clear in the paper. One does not have to derive the (delta)F_i from anything.

Rich, you have recognized that (delta)F_i is the forcing change after step i-1.

How, then, is it possible for you to have missed the meaning that (delta)T_i is the temperature change after step i-1? How did you miss that? Incredible.

There is no mistake. You’ve just imposed your mistaken understanding of what I wrote.

Your comments have been consistent that way, Rich.

Matthew R Marler
Reply to  Pat Frank
September 24, 2019 12:01 am

Pat Frank: Equation 1 gives the change in temperature after a total of i steps in forcing change.

Ah. Now I see that your intention in writing delta Ti(K) was merely to identify the units as Kelvin. I mistakenly took it to be the number of steps, ie, the ending value of the index of summation i. You have described i as both the index of summation and its upper limit.

Readers who did not make my inferential mistake of using K in two senses in the same equation will naturally be confused over your summation index.

Again, I doubt that is what Nick Stokes was aiming at. Most commonly N would be used as the upper limit of summation, but you could use I.

In reading the supplemental information, no one caught this error (as it seems to me) before See-oh-two. The critiques largely missed the main points, or were extraneous (unless someone reveals a previously hidden meaning some time.)

This is still a good and important paper.

Reply to  Matthew R Marler
September 24, 2019 2:26 pm

Rich, “I wrote:

T_t – T_{t-1} = 13.86 Delta F_t/F_0 (5.R)

Yes, and it’s wrong.

eqn. 1: ΔT_t = 13.86(F_0+ΔF_t)/F_0
eqn. 5: ΔT_i = 13.86(F_0+ΔF_i)/F_0

ΔT_t – ΔT_i = ΔΔT = [13.86(F_0+ΔF_t)/F_0] – [13.86(F_0+ΔF_i)/F_0]

= 13.86[(F_0+ΔF_t)/F_0 – (F_0+ΔF_i)/F_0]

=13.86[((F_0+ΔF_t)-(F_0+ΔF_i))/F_0]

=13.86[(F_0+ΔF_t-F_0-ΔF_i)/F_0]

=13.86[(ΔF_t-ΔF_i)/F_0]

ΔΔT = 13.86[ΔΔF/F_0]

You’re not calculating a ΔT, Rich. You’re calculating a ΔΔT from a ΔΔF.

That equation is not a simplification of eqn. 5.

You’ve dropped the Δ on T and ignored that the Δ operators on the F’s combine in a subtraction.

Your entire approach is wrong.

Reply to  Pat Frank
September 25, 2019 6:56 am

Oh, WordPress has done it again and we are now commenting here on a comment of mine below. Anyway Pat, I agree that this Delta stuff gets confusing. In fact, your argument below looked so compelling that I thought I was going to have to apologize to you, but see below.

I still think it would have been much clearer if instead of writing ΔT_t for T_t-T_0 you had simply used an anomaly from the initial temperature and just called it T_t, and that is why I dropped that Δ in my analysis. However, to get us back onto the same base I am going to put those Δ’s back in. But if you are going to use Δ’s with different spans, it is necessary to subscript them in order properly to follow the algebra. Thus Δ_i X_j = X_j – X_{j-i}.

Let us now examine your claim that

“T_t – T_{t-1} = 13.86 Delta F_t/F_0 (5.R)”
“Yes, and it’s wrong.”

In the new Δ terminology that equation of mine now reads:

Δ_1 T_t = 13.86 Δ_1 F_t/F_0

Let us follow the consequences of Equation (1), which in simplified form is

Δ_t T_t = 13.86(F_0+sum_1^t Δ_1 F_i)/F_0
= 13.86(F_0+ Δ_t F_t)/F_0

So that verifies your statement:

eqn. 1: ΔT_t = 13.86(F_0+ΔF_t)/F_0

provided that Δ is Δ_t here. Next you write

eqn. 5: ΔT_i = 13.86(F_0+ΔF_i)/F_0

Just as Δ had to be Δ_t in Equation (1), now it isomorphically has to be Δ_i. But now there is a problem, because you wrote earlier that “Equation 5 gives the change in temperature over a single annual step”, which would imply that the LHS is Δ_1 T_i = T_i – T_{i-1}. So the equation above is not equivalent to Equation (5) in the paper because it has Δ_i instead of Δ_1 on the LHS. Now to get back to my equation we can let i = t-1 and get:

Δ_1 T_t = T_t – T_i = Δ_t T_t – Δ_i T_i

and with this I can now follow your own further analysis, adding subscripts to the Δ’s:

Δ_t T_t – Δ_i T_i = ΔΔT = [13.86(F_0+Δ_t F_t)/F_0] – [13.86(F_0+Δ_i F_i)/F_0]
= 13.86[(F_0+Δ_t F_t)/F_0 – (F_0+Δ_i F_i)/F_0]
=13.86[((F_0+Δ_t F_t)-(F_0+Δ_i F_i))/F_0]
=13.86[(F_0+Δ_t F_t-F_0-Δ_i F_i)/F_0]
=13.86[(Δ_t F_t – Δ_i F_i)/F_0]

This is 13.86(F_t-F_0-F_i+F_0)/F_0 = 13.86 Δ_1F_t/F_0 since i = t-1. QED!
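The telescoping step in this derivation can be checked numerically with a short Python sketch. It uses the simplified form of Equation (1) from this thread, 13.86(F_0 + sum of ΔF_i)/F_0 + a. The forcing increments below are hypothetical values chosen only for illustration, and F_0 = 33.30 W/m^2 is the value implied elsewhere in the thread.

```python
import math

COEFF = 13.86  # 0.42 x 33, as combined earlier in the thread
F0 = 33.30     # W/m^2, value implied by the 4/33.30 term in the thread

def eqn1_total(delta_f, t, a=0.0):
    """Simplified Equation (1): 13.86 * (F_0 + sum of the first t increments) / F_0 + a."""
    return COEFF * (F0 + sum(delta_f[:t])) / F0 + a

# Hypothetical, deliberately unequal annual forcing increments in W/m^2.
delta_f = [0.035, 0.040, 0.030, 0.050]

for t in range(1, len(delta_f) + 1):
    # Difference of two successive totals: the F_0 term and all earlier increments cancel.
    difference = eqn1_total(delta_f, t) - eqn1_total(delta_f, t - 1)
    single_step = COEFF * delta_f[t - 1] / F0
    assert math.isclose(difference, single_step, abs_tol=1e-12)
    print(f"t = {t}: difference of totals = {difference:.5f}, 13.86*dF_t/F_0 = {single_step:.5f}")
```

The check confirms only the algebra: subtracting successive totals of the summed form leaves the single-step term with no F_0 offset. What the symbol ΔT_i in the paper is meant to denote is the separate question argued in the comments.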

I wouldn’t be so severe as to say that “your entire approach is wrong”, but only to say that this element of it is.

Rich.

Reply to  Pat Frank
September 25, 2019 8:00 pm

Rich, ΔT_t does not equal T_t-T_0.

ΔT_t = 13.86*[(ΔF_total +F_0)/F_0] > 13.86 C.

T_0 = 13.86 C. Eqn. 1 ΔT is the step-wise changed temperature and includes a summation over ΔF_i.

Eqn. 5 is one single ΔF_i incremental step of the summation in eqn. 1.

You wrote, “Δ had to be Δ_t in Equation (1)”

I presume you mean ΔF had to be ΔF_t in Equation (1). If so, then only at the t_th step.

Eqn. 1 is generalized and includes a summation, so that ΔF must be ΔF_i.

Let me reproduce the central and telling element of your derivation, Rich.

You wrote, “Δ_t T_t – Δ_i T_i = ΔΔT = [13.86(F_0+Δ_t F_t)/F_0] – [13.86(F_0+Δ_i F_i)/F_0]
ΔΔT = …

This is [ΔΔT] = 13.86(F_t-F_0-F_i+F_0)/F_0 = 13.86 Δ_1F_t/F_0 since i = t-1. QED!

You’ve derived a ΔΔT, just as I reported. ΔΔT ≠ ΔT.

I’ll post an illustration demonstrating that.

Reply to  Pat Frank
September 25, 2019 8:18 pm

I illustrate here what you’re doing wrong, Rich.

For the sake of discussion, let F_0 = 30 W/m^2 and let ΔF_i = constant = 1 W/m^2.

We can all agree that at time t=0, ΔF_i = 0.

We start with your derived expression, Rich, which I present as given:

ΔT = 13.86 C*(ΔF_i/F_0), yielding 0 C at t = 0, ΔF_i = 0.

At time t = 1, ΔT_1 = 13.86 C*(1W/m^2/30 W/m^2) = 0.462 C

At t = 2, ΔT_2 = 13.86 C*(1W/m^2/30 W/m^2) = 0.462 C

At t = n, ΔT_n = 13.86 C*(1W/m^2/30 W/m^2) = 0.462 C
+++++++++++

Now we take eqn 1 (or eqn. 5) as they appear in the paper.

There, slightly rearranging, eqn 1 is ΔT = 13.86 C*[(Σ_iΔF_i+F_0)/F_0].

At t = 0, ΔF_i = 0 and ΔT = 13.86[(0+30 Wm^-2)/30 Wm^-2] = 13.86 C.

At t = 1, ΔF_i = 1 W/m^2 and ΔT_1 = 13.86 C*[((1+30)W/m^2)/30 W/m^2] = 14.322 C; increment = 0.462 C

At t = 2, the summation yields ΔF_i = 2 W/m^2 and ΔT_2 = 13.86 C*[((2+30)W/m^2)/30 W/m^2] = 14.784 C; increment = 0.462 C

At t = n, the summation yields ΔF_n = n W/m^2, and ΔT_n = 13.86 C*[((n+30) W/m^2)/30 W/m^2] = n*0.462 C + 13.86 C; increment over t_n-1 = 0.462 C.
++++++++++++

ΔT(paper) is the step-wise changed temperature.

ΔT(Rich) is the stepwise increment of temperature change.

In the paper ΔT_n – ΔT_n-1 = ΔΔT = 0.462 C = ΔT(Rich).

ΔΔT(paper) = 0.462 C. ΔT(Rich) = 0.462 C.

Therefore, ΔT(Rich) = ΔΔT(paper), as derived above.
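The numbers in this illustration can be reproduced with a short Python sketch. It uses the same illustrative values as the comment (F_0 = 30 W/m^2 and a constant ΔF_i = 1 W/m^2). The function names are hypothetical labels for the two expressions being compared.

```python
COEFF_C = 13.86     # 0.42 x 33, as used in the comment
F0 = 30.0           # W/m^2, illustrative value from the comment
STEP_FORCING = 1.0  # W/m^2 per step, illustrative value from the comment

def delta_t_paper(n):
    """Eqn. 1 as written in the comment: 13.86 C * [(sum of n steps + F_0) / F_0]."""
    return COEFF_C * (n * STEP_FORCING + F0) / F0

def delta_t_rich():
    """The per-step expression attributed to Rich: 13.86 C * (dF_i / F_0)."""
    return COEFF_C * STEP_FORCING / F0

print(f"t = 0: paper = {delta_t_paper(0):.3f} C")
for n in range(1, 4):
    increment = delta_t_paper(n) - delta_t_paper(n - 1)
    print(f"t = {n}: paper = {delta_t_paper(n):.3f} C, "
          f"increment = {increment:.3f} C, Rich = {delta_t_rich():.3f} C")
```

With these inputs the paper expression runs 13.860, 14.322, 14.784, and each successive difference equals the 0.462 C given by the per-step expression.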

So, now the mistake is manifest. ΔT(Rich) is not ΔT(paper).

Your ΔT is in fact paper ΔΔT.

This makes sense because the presentation of both eqn. 1 and eqn 5 is in terms of ΔT.

Although you wrote your difference as T_n – T_n-1 = ΔT_n, Rich, you should have written it as ΔT_n – ΔT_n-1 = ΔΔT_n.

Big mistake.

Your entire criticism is based on a misconception.

I feel badly to point out that yet once again you have imposed your idea of what you think I should have meant onto what I in fact did mean.

This sort of inadvertent straw man argument has been a constant in your criticisms.

Every one is misconstrued from the start, misrepresenting or misconstruing what I actually did.

Is it really so hard to look at the content of the paper and figure out what it actually says and what it means? It seems to me discerning intended meaning is the first obligation a reader owes to an author.

Especially in science, where we intend to be monosemous.

Reply to  Pat Frank
September 26, 2019 4:35 am

Pat [Sep25 8:18pm in case WordPress puts this somewhere strange],

First of all I’d like to stress that everything I am doing here is in good faith. I genuinely want to understand how you obtain your results, though I do confess that as I have a gut instinct that the mathematical sequence is not well proven, I am indeed looking out for errors. You have to be prepared for that in peer review, whether formal or on a blog. If I find no problems I shall be relatively happy because I have little faith in GCM projections anyway.

Your illustration has helped me to see that I had been ignoring the offset F_0 so you are right that ΔT_t in your parlance cannot equal a T_t-T_0. In fact I think we can say that your Δ is a delta, Jim, but not as we know it. To clarify things for me and hopefully others, I am going to use the following terminology:

Δ_i X_n = X_n-X_{n-i}
‘Δ’ = the one in your Equation (1)
“Δ” = the one in your Equation (5)
R_t = 13.86 sum_{j=1}^t Δ_1 F_j/F_0

Note that R_t represents a change in temperature across the t steps from 1 to t. Now we can rewrite your equations, starting with (1).

‘Δ’T_t = R_t + 13.86 + a (1.R)

Moving to Equation (5), note that Δ_1 R_i = R_i – R_{i-1} = 13.86 Δ_1 F_i/F_0 and is a temperature change across one time step, so

“Δ”T_i = 13.86((F_0 + Δ_1 F_i)/F_0) = 13.86 + Δ_1 R_i (5.R)

This shows that Equation (1) is a change of temperature across t steps (a Δ_t) plus two additives 13.86 and a, and Equation (5) is a change of temperature across 1 step (a Δ_1) plus one additive 13.86. Why the crazy 13.86 additives? Well it’s your paper so you can write those equations how you like, with Δ’s that mean different things, or what we wish them to mean as if we were with Alice in Wonderland, and I can no longer say one is “wrong” or inconsistent with the other, but I doubt that I am the only reader who finds them pretty confusing.

Reply to  Pat Frank
September 24, 2019 3:03 am

Pat,

I wrote:

T_t – T_{t-1} = 13.86 Delta F_t/F_0 (5.R)

Imagine writing i instead of t there, and then compare with Equation (5.2) as per my simplification:

T_i +/- u_i = 13.86 (1+Delta F_i/F_0) +/- 1.665 (5.2)

Now, it may well be that you changed the meaning of Delta T_i from T_i-T_0 in Equation (1) to T_i-T_{i-1} in Equation (5), but that is really bad practice and makes papers very hard to read and verify, and a good referee would have picked up on that. But even if we allow you that, you cannot then retain the 13.86*1 in my (5.2) here, or equivalently the 0.42x33Kx[(F_0 in the (5.2) in your paper, as I believe Matthew noted above, because to get to (5.2) from (1) that term gets cancelled out. That >is< an error, although it is not necessarily an important one, because after this point you concentrate on the u_i's and not the Delta T_i's whichever version you happen to be using on a Tuesday.

Reply to  See - owe to Rich
September 25, 2019 7:41 pm

Rich, “you changed the meaning of Delta T_i from T_i-T_0 in Equation (1) to T_i-T_{i-1} in Equation (5),…

T_0 in eqn. 1 is 13.86 C, Rich. Eqn. 1 (delta)T does not equal T_i – T_0.

Eqn. 5 gives the incremental change of any one (delta)F_i.

How does it come that you have so much trouble reading what is actually there?

Reply to  Pat Frank
September 25, 2019 10:17 pm

Because you keep changing it. Just today we have here
“T_0 in eqn. 1 is 13.86 C, Rich. Eqn. 1 (delta)T does not equal T_i – T_0.
Eqn. 5 gives the incremental change of any one (delta)F_i.”

But here they are the same:
“This makes sense because the presentation of both eqn. 1 and eqn 5 is in terms of ΔT.”

Reply to  Pat Frank
September 28, 2019 7:01 pm

Nick, “But here they are the same:
“This makes sense because the presentation of both eqn. 1 and eqn 5 is in terms of ΔT.”

Yes and eqn. 1 has a summation that eqn. 5 does not.

And eqn. 1 has a ΔT_t, while eqn. 5 has a ΔT_i.

They are the same ΔT; one in summation, one the single-step contribution.

Very deep math, but I know you’ll get it if you try, Nick.

Reply to  See - owe to Rich
September 23, 2019 7:38 pm

“In Pat’s paper, it appears that Equation (5) is incorrectly derived from Equation (1).”
Well, yes, that is one of its problems. Basically it drops a Σ. It goes from F₀+ΣΔFᵢ to F₀+ΔFᵢ. And muddles its units in 5.2. This is really just a version of saying that you shouldn’t compound the supposed error in F.

Of course, the real problem is that error propagation in (1) has nothing to do with error propagation in a GCM.

Reply to  Nick Stokes
September 23, 2019 7:47 pm

I’m glad to see you opportunistically agreeing with Rich’s mistaken derivation, Nick. It shows that your prejudice overcomes not only your sense but your math skills.

The muddle is yours, Nick. Why would not the equation of a single step, i, drop the Σ indicating a sum over i?

If I had kept the Σ in eqn. 5.1, 5.2, that would have been a mistake.

But you knew that, didn’t you. But you proceeded with the falsehood anyway.

GCMs are demonstrated to be no more than linear extrapolation machines. Linear error propagation is exactly appropriate.

Reply to  Pat Frank
September 23, 2019 9:19 pm

” Why would not the equation of a single step, i, drop the Σ indicating a sum over i?”
Because it means something totally different. F₀+ΣΔFᵢ means you take all the ΔFᵢ and add them to F₀. It also means the i suffix does not apply to the result. There is no way in that process you deal with a single F₀+ΔFᵢ, and no way that can sensibly acquire an index i.

If there were any logic in what you say, it would involve the partial sum of the first i terms, not just one plucked out of the sum. And then there would be a whole other story on its uncertainty.

Reply to  Nick Stokes
September 24, 2019 6:27 am

“It also means the i suffix does not apply to the result. There is no way in that process you deal with a single F₀+ΔFᵢ, and no way that can sensibly acquire an index i.”

Are you actually thinking about what you are saying?

The process is made up of iterative steps. The term ΣΔF *is* a sum of the iterative steps that came before when considering any specific step i, i.e. the sum from i=1 to i=i-1. So why wouldn’t it have an index i?

I still think you are just trolling this thread to see what you can cause to be generated.

Reply to  Nick Stokes
September 24, 2019 9:26 am

“a sum of the iterative steps that came before when considering any specific step i”
Yes. So you might add ΔFᵢ to the sum of F₀ and previous ΔF. But why would you add the i-th ΔF to the original F₀ without anything else?

Reply to  Nick Stokes
September 24, 2019 10:44 am

Nick,

Look closely at these two equations and tell me what the difference is.

ΔTₜ(K) = f_CO₂ × 33 K × [(F₀ + ΣⁱΔFᵢ)/F₀] + a    (1)

ΔTᵢ(K) ± uᵢ = 0.42 × 33 K × [(F₀ + ΔFᵢ ± 4 Wm⁻²)/F₀]    (5.1)

Reply to  Nick Stokes
September 24, 2019 11:49 am

Σⁱ

Reply to  Nick Stokes
September 25, 2019 8:59 pm

Nick, “…no way that can sensibly acquire an index i.”

Index i indicates step number. When i = 1, you in fact, “deal with a single F₀+ΔFᵢ,” namely F₀+ΔF_1.

Really weak, Nick.

Reply to  Nick Stokes
September 25, 2019 9:36 pm

“When i = 1, you in fact, “deal with a single F₀+ΔFᵢ,” namely F₀+ΔF_1.”
And when i=5, why would you add ΔF_5 to the original F₀? But you said
“No, ΔT_i(K) is the change in projected temperature due to the ΔF_i change in forcing.”
So what is F₀ doing there? If the change in forcing were zero, the change in temperature would have been 0.42 * 33K = 13.9 K (in 1 year!). If there were no change in forcing at all, the temperature would have gone on increasing by those uniform steps forever.

This isn’t just uncertainty. Eq 5 now describes a quite different, and totally implausible model, with rapid linear warming not in response to added GHGs, but to the preindustrial (1900, you say) F₀.
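
To make the arithmetic in this objection concrete, here is a minimal sketch that evaluates the right-hand side of Eq 5.1 as written, leaving out the ± uncertainty term, for a few values of ΔFᵢ. It assumes F₀ = 33.3 Wm⁻², the value used in the error-bar arithmetic elsewhere in this thread. The function name is illustrative only and does not come from the paper.

```python
F_CO2 = 0.42         # dimensionless fraction, as in Eq 5.1
GREENHOUSE_K = 33.0  # K, as in Eq 5.1
F0 = 33.3            # W/m^2; assumed here, taken from arithmetic quoted in the thread


def eq5_central_value(delta_f_i, f0=F0):
    """Right-hand side of Eq 5.1 as written, with the +/- 4 W/m^2 term omitted."""
    return F_CO2 * GREENHOUSE_K * (f0 + delta_f_i) / f0


# Illustrative step forcings in W/m^2 (hypothetical values, not from the paper).
for delta_f in (0.0, 0.04, 1.0):
    print(f"dF_i = {delta_f:4.2f} W/m^2 -> {eq5_central_value(delta_f):.2f} K")
```

With ΔFᵢ = 0 the expression returns about 13.9 K, the figure quoted in the comment above. Whether that value should be read as a per-step temperature change is the point the two sides dispute, and the code does not settle it.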

Reply to  Nick Stokes
September 28, 2019 7:27 pm

All explained here, Nick. You can be at peace.

Matthew R Marler
Reply to  Nick Stokes
September 24, 2019 12:04 am

Nick Stokes: Of course, the real problem is that error propagation in (1) has nothing to do with error propagation in a GCM.

Once again, this is about uncertainty propagation, and a in eqn 1 can be written in terms of the ui in eqn 5, as I did above.

Reply to  Matthew R Marler
September 24, 2019 1:09 am

MRM,
“Once again, this is about uncertainty propagation”
Once again, the paper is titled “propagation of error”.

“a in eqn 1 can be written in terms of the ui in eqn 5”
The coefficient a in Eq 1 has nothing to do with uncertainty. It is specified:
“Finally, coefficient a = 0 when ΔTₜ is calculated from a temperature anomaly, but is otherwise the unperturbed air temperature.”

” I mistakenly took it to be the number of steps”
The number of steps is specified, and is neither K nor N:
“ΔTₜ is the total change of air temperature in Kelvins across projection time t”

“So, the simplest explanation is that you changed the meaning of delta Ti(K) and forgot to drop F0 from the difference in forcing”
Well, Eq 5.1 could represent a difference, but there are a lot of words that say it doesn’t:
“Where ±uᵢ is the uncertainty in air temperature, and ±4 Wm⁻² is the uncertainty in tropospheric thermal energy flux due to CMIP5 LWCF calibration error. The remaining terms of equations 5 are defined as for equation 1. In equations 5, F₀+ΔFᵢ represents the tropospheric GHG thermal forcing at simulation step “i.” The thermal impact of F₀+ΔFᵢ is conditioned by the uncertainty in atmospheric thermal energy flux.”

Reply to  Nick Stokes
September 25, 2019 9:24 pm

Nick, “Once again, the paper is titled ‘propagation of error’.”

Propagation of error calculates the uncertainty in a result. See here.

Matthew R Marler
Reply to  Matthew R Marler
September 24, 2019 10:03 am

Nick Stokes: Once again, the paper is titled “propagation of error”.

Yep, that’s the title. The paper is about the propagation of uncertainty.

The number of steps is specified, and is neither K nor N:
“ΔTₜ is the total change of air temperature in Kelvins across projection time t”

The number of steps is not represented in the equation.

I don’t know where Matthew Marler got the idea that he was doing something with correlations.

The correlations appear where appropriate in eqns 3 and 4. In Eqns 5, the sequence of annual increments is deterministic, conditional on the value of the cloud feedback parameter and its random error, so correlations do not enter the calculation. If the process were a random walk, as you seem to persist in believing it to be, with a random increment each year, then the correlations of the successive increments would need to be included in calculating the error variances of the annual increments.

You seem to be stumbling over the distinction between error and the uncertainty of the size of the error. You also seem to be stumbling over the distinction between distributions and conditional distributions (c.f. my dice example). If the truth of the value of the cloud feedback parameter, and hence the error in the current estimate, ever became known, then the errors induced in the temp series by the error in the parameter estimate could be propagated through the calculations, as in your response to the meter stick metaphor. Instead, all we have is an uncertainty (a range of probable values) in the parm value, so it is the uncertainty that propagates, as described by Pat Frank.

Reply to  Matthew R Marler
September 24, 2019 10:31 am

” correlations do not enter the calculation”
So what is the basis for adding the uᵢ in quadrature, in Eq 6, if it isn’t an εKε calculation?

“The number of steps is not represented in the equation”
He says the ΣΔFᵢ is being summed over “projection timesteps”. He says ΔTₜ is over projection period t. It’s true that he doesn’t define the timestep right there, but pretty soon he is talking about annual.

Matthew R Marler
Reply to  Matthew R Marler
September 24, 2019 11:55 am

Nick Stokes: It’s true that he doesn’t define the timestep right there, but pretty soon he is talking about annual.

Sometimes I think that you do not bother to read. The number of time steps is not specified in the formula.

Reply to  Matthew R Marler
September 24, 2019 12:11 pm

Well, then, what are the limits of the summation Σ?

Reply to  Nick Stokes
September 24, 2019 2:43 pm

“Well, then, what are the limits of the summation Σ?”

How about “N” where N < infinity

Reply to  Matthew R Marler
September 24, 2019 3:53 pm

An equation is supposed to tell you how to do a calculation. If you specify a sum, it has to be over a specified (and enumerated) set of numbers. I cannot see how it would be other than the number of time steps here.

Reply to  Nick Stokes
September 24, 2019 4:20 pm

“An equation is supposed to tell you how to do a calculation. If you specify a sum, it has to be over a specified (and enumerated) set of numbers. I cannot see how it would be other than the number of time steps here.”

Since this is an emulation and analysis of the GCMs’ forecasts for the climate, don’t you suppose the number of steps and their size would be the same as the GCMs’?

September 24, 2019 3:21 am

Note 6: What does the uncertainty interval mean?

Specifically, what does the interval of roughly -15K to +17K at 2100 in Figure 6(B) mean? As it is a +/-1 sigma, is it a 68% confidence interval for something? If so, is it for:

a) each CMIP5 model run’s projection to 2100
b) the ensemble mean’s projection to 2100
c) something else

If not, then what is it?

Reply to  See - owe to Rich
September 24, 2019 4:01 am

“As it is a +/-1 sigma”
It seems to be. It seems the arithmetic is primitive, straight from Eq 6, based on variances. The error bars are
±.42*33*4/33.3 sqrt(n) = ±1.665*sqrt(n)K where n is the number of years elapsed. So in Fig 6, n=100 and it is ±16.65 about each curve at the end. And in Fig 8 n=62 at the end and it is ±13.1 about the marked curves.

I don’t know where Matthew Marler got the idea that he was doing something with correlations.
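
The error-bar arithmetic quoted in this comment can be checked with a short sketch. It only reproduces the product stated above, 0.42 × 33 × 4 / 33.3 × sqrt(n), and takes no position on whether that is the right calculation. The function names are illustrative.

```python
import math

F_CO2 = 0.42         # dimensionless
GREENHOUSE_K = 33.0  # K
LWCF_ERROR = 4.0     # W/m^2, the calibration error statistic
F0 = 33.3            # W/m^2, as used in the comment above


def per_step_uncertainty():
    """Uncertainty contributed by one annual step, in K."""
    return F_CO2 * GREENHOUSE_K * LWCF_ERROR / F0


def uncertainty_after(n_years):
    """Root-sum-square of n equal per-step uncertainties, in K."""
    return per_step_uncertainty() * math.sqrt(n_years)


print(f"per step: {per_step_uncertainty():.3f} K")
for n in (62, 100):
    print(f"n = {n:3d}: +/- {uncertainty_after(n):.2f} K")
```

The per-step value comes out near 1.665 K, giving roughly ±16.65 K at n = 100 and ±13.1 K at n = 62, as stated.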

Reply to  Nick Stokes
September 24, 2019 4:17 am

BTW, just to point out the failure of dimensions, that is
0.42 (dimensionless)*33K *4 Wm⁻² year⁻¹ * sqrt(n years) = K/sqrt(year).
Or if you prefer 4 Wm⁻², as it seems to be more often lately, then it is K*sqrt(year)
Odd units for a temperature uncertainty.

Reply to  Nick Stokes
September 24, 2019 4:19 am

Left out a term
0.42 (dimensionless)*33K *4 Wm⁻² year⁻¹/(33.3 Wm⁻²) * sqrt(n years) = K/sqrt(year).
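
The unit bookkeeping in this comment can be sketched by tracking the exponent of each base unit through the product. This is a hand-rolled illustration, not a units library, and the helper names are hypothetical. It evaluates both readings of the flux error mentioned above: Wm⁻² year⁻¹ and plain Wm⁻².

```python
from fractions import Fraction


def times(a, b):
    """Multiply two quantities' units by adding the exponents of each base unit."""
    out = dict(a)
    for unit, exp in b.items():
        out[unit] = out.get(unit, Fraction(0)) + exp
    # Drop base units whose exponents have cancelled to zero.
    return {u: e for u, e in out.items() if e != 0}


def power(a, p):
    """Raise a quantity's units to the power p."""
    return {u: e * Fraction(p) for u, e in a.items()}


DIMENSIONLESS = {}
KELVIN = {"K": Fraction(1)}
FLUX = {"W": Fraction(1), "m": Fraction(-2)}
YEAR = {"year": Fraction(1)}


def error_bar_units(flux_error_units):
    units = times(DIMENSIONLESS, KELVIN)                # 0.42 * 33 K
    units = times(units, flux_error_units)              # * 4 (flux error)
    units = times(units, power(FLUX, -1))               # / (33.3 W m^-2)
    units = times(units, power(YEAR, Fraction(1, 2)))   # * sqrt(n years)
    return units


flux_per_year = times(FLUX, power(YEAR, -1))
print("with W m^-2 year^-1:", error_bar_units(flux_per_year))
print("with W m^-2:        ", error_bar_units(FLUX))
```

The first case leaves K with year to the power −1/2 and the second leaves K with year to the power +1/2, matching the two results stated in the comment. The sketch treats sqrt(n years) as carrying units of year^(1/2), which is the reading used in the comment.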

Reply to  Nick Stokes
September 24, 2019 2:58 pm

For i = year 1: ±[(0.42 * 33K * 4 Wm⁻² year⁻¹)/F_0]_year_1

= ±[(0.42 * 33K * 4 Wm⁻²)/F_0]_1

±[(0.42 *33K * 4 Wm⁻²)/F_0]_i

=>±[(0.42 * 33K * 4 Wm⁻²)/F_0]_n

The left side of eqn. 5 has an “i” index. That index is not on the right side ±uncertainty term. The “year⁻¹” in “4 Wm⁻² year⁻¹” is cancelled by the 1-year index.

I described that above already, here

You already agreed with the formalism, Nick, here

Which agreement I acknowledged, here

And now you falsely raise the whole thing again, as though it were unsettled.

Reply to  Pat Frank
September 24, 2019 3:09 pm

Nick appears to be a troll, trolling the thread hoping for someone to make a misstatement he can use in the faint hope of falsifying your entire paper.

Reply to  Pat Frank
September 24, 2019 4:01 pm

“You already agreed with the formalism”
I said it could probably be made to work (although it was nuts). But you clearly haven’t made it work.

“And now you falsely raise the whole thing again”
No, I showed what the actual calculation was. The product of five numbers, one (time in years) being subject to a square root. And whether you prefer your Tuesday unit of 4 Wm⁻² year⁻¹ or Wednesday unit of 4 Wm⁻², the units just don’t make sense.

Pat Frank
Reply to  Pat Frank
September 25, 2019 9:28 pm

They do make sense. And they make sense to every serious thinker here, but not to you, Nick.

They’ll never make sense to you here, for reasons of policy.

Reply to  See - owe to Rich
September 24, 2019 2:32 pm

See what Vasquez and Whiting write about propagating empirical uncertainty through a set of calculations, Rich. That’s what it is.

Reply to  Pat Frank
September 25, 2019 4:00 am

Pat, assuming you are referring to my question in Note 6 somewhat further above (dratted WordPress), is not Vasquez and Whiting about uncertainty propagation in general? It is the specific instance of your paper which I want to understand. If I do not, then I have no way of judging if your overall conclusions are well founded. Specifically, your conclusion seems to be that GCM uncertainties at 2100 are so large as to be useless. In that context, what exactly does Figure 6(b) represent so that it affirms your conclusion?

TIA,
Rich.

Reply to  See - owe to Rich
September 25, 2019 9:31 pm

Legend to Figure 6: “Panel (B) the identical SRES scenarios showing the ±1σ uncertainty bars due to the annual average ±4 Wm⁻² CMIP5 TCF long-wave tropospheric thermal flux calibration error propagated in annual steps through the projections as equation 5 and equation 6.”

That’s what it means, and that’s why it affirms my conclusions.

Reply to  Pat Frank
September 26, 2019 2:03 am

Pat, thanks, so the 1-sigma uncertainty bars in Panel B do relate one-to-one to the bars in Panel A, but they are calculated differently. Now I think the bars in Panel A can be used to predict the spread which would occur if new model runs were made. But your point in Panel B is that those runs could have strayed ever so far from reality because of the physical uncertainty in the parameters of those models.

Is that a fair summary? And sorry for being a bit slow on this 🙂

September 25, 2019 3:51 am

Note 7: How does a cloud in 1990 affect me now?

In 1990 my third daughter was born. 9 weeks later I took her to the surgery for a routine check-up. It was scorching hot, and it turned out that around the time of that visit the local town recorded the highest known temperature for my country (since broken a couple of times). But we made it there and back again fine. There was not a cloud in the sky, as you might expect.

How does the lack of clouds that day affect the temperature today? If my country had been Tuvalu, then the lack of clouds would have heated the water, which could then have heated deeper waters, and some of that heat could be returning via deep ocean currents to the atmosphere today. But my country is not Tuvalu, it is England. It is well known that ground does not retain significant heat for any length of time, because of highly effective radiation from warm surfaces, so it is hard to see how the lack of cloud in 1990 could affect temperatures today. It follows that any error, or uncertainty, in modelling clouds over land cannot persist in the error between model and reality today.

Over the seas is another matter however. But now it is not sufficient to consider TCF (Total Cloud Forcing), but rather the distribution of CF over land and sea (and different portions of sea may differ in terms of the long term effects of the forcing). And in this case it is now necessary to take account of correlations across time and space. For example, if OCF (Ocean Cloud Forcing) had low model uncertainty and LCF (Land Cloud Forcing) had a compensating high model uncertainty, very little CF uncertainty should be inferred to propagate from one year to the next. I am not saying this situation obtains, but we can’t simply assume it doesn’t. Across time, if a GCM in one year of calibration has a positive OCF error, but an anticorrelated tendency to a negative error the following year or two (as might be predicated by El Nino/La Nina), then again those values will tend to cancel out.

In conclusion I would say that you can meaningfully linearize the means of GCMs, but you can’t linearize the noise, in the sense of year after year sampling supposed uncorrelated uncertainties in order simply to add up the variances thereof to arrive at the paper’s high uncertainty values.

Or as Nick Stokes, with whom I tend to disagree on climate alarm but am with him on this one, said: “the real problem is that error propagation in (1) has nothing to do with error propagation in a GCM”.

Reply to  See - owe to Rich
September 25, 2019 7:00 am

Forgot to mention “took her to the surgery” meant “walked half a mile with a pram to get there”, and the local town’s own record has not been beaten but other places have done so, one of them this year.

Matthew R Marler
Reply to  See - owe to Rich
September 25, 2019 10:54 am

See-owe to Rich: I am not saying this situation obtains, but we can’t simply assume it doesn’t. Across time, if a GCM in one year of calibration has a positive OCF error, but an anticorrelated tendency to a negative error the following year or two (as might be predicated by El Nino/La Nina), then again those values will tend to cancel out.

In conclusion I would say that you can meaningfully linearize the means of GCMs, but you can’t linearize the noise, in the sense of year after year sampling supposed uncorrelated uncertainties in order simply to add up the variances thereof to arrive at the paper’s high uncertainty values.

What you know is that, across a variety of GCMs, all of them that have been tested to date, the relationship between CO2-induced forcing and temperature is linear, to a high degree of approximation. Whatever effects there might be of some forcings, or parameterizations, cancelling others (we can be pretty sure that they exist because of the extensive “tuning” that has occurred), the result for all the GCMs as they are is that the relationship between their forcing input and temperature output is nearly linear, to a high degree of approximation. I hope you don’t mind the repetition; at the end of thinking what else might be in the models and what the physics really is, that relationship is pretty well established.

Equations 1 and 5, if I understand Pat Frank’s delta-delta clarification of his notation, describe the uncertainty in forecasts from his linear model due to the uncertainty in the cloud feedback parameter used in the GCMs. Why should it not therefore be an accurate approximation to the propagation of uncertainty in the GCMs? Is there anything available now for which a case can be made it would provide a better approximation?

Nick Stokes proposed complete reruns of the GCMs with a range of parameter values (I think he proposed that. He described complete reruns with a range of initial values.) I proposed bootstrapping. Right now these are not really feasible, but perhaps a few values from within the CI for the parameter — say the lower and upper 10% points and the median.

Is there right now a procedure better than what Pat Frank has carried out?

Matthew R Marler
Reply to  Matthew R Marler
September 25, 2019 11:05 am

See-owe to Rich: in the sense of year after year sampling supposed uncorrelated uncertainties

And at the risk of more exasperating repetition, let me repeat that the sampling does not occur year after year. The sampling occurs at the start, with the choice of the parameter value to put into the program (a value that is undoubtedly in error but an unknown error); the uncertainty that accumulates year after year is a result of that single random draw.

Reply to  Matthew R Marler
September 25, 2019 1:25 pm

Matthew, I was sure that the paper said that there was a new random draw each year – will have to check that later.

You wrote “describe the uncertainty in forecasts from his linear model due to the uncertainty in the cloud feedback parameter used in the GCMs”. I’d like Pat to confirm that, as it is the nub of my question in Note 6. It would at least give me some more concrete mathematics to understand and analyze, whereas I am still groping for the linkages between the various components. (That’s it for today UK time.)

Reply to  See - owe to Rich
September 25, 2019 9:38 pm

Rich, “Across time, if a GCM in one year of calibration has a positive OCF error, but an anticorrelated tendency to a negative error the following year or two (as might be predicated by El Nino/La Nina), then again those values will tend to cancel out.”

Cancelling errors in a calibration experiment do not remove uncertainty in a prediction. They combine as a rss uncertainty that conditions the reliability of the prediction.

Cancelling errors do not improve physical theory. They just hide the mistakes in physical theory.

September 25, 2019 7:37 am

Tim (Sep24 6:18am), on uncertainty.

Unfortunately WordPress is not letting me reply directly to your comment. You ask why uncertainties have to be formalized into random variables. The reason is that there is a mathematics of random variables which shows how to combine them. As far as I can tell, even though uncertainty may be a unique animal, the combination of them as in, for example, Pat’s Equation (3), assumes that there is a hidden random variable underneath them. And without that mathematics you ain’t got nothing, or you might have to be over pessimistic and assume the worst case that for a bounded uncertainty interval either endpoint is a feasible value. In that case you have to sum the interval sizes instead of the sophistication of Equation (3).

Note that that equation uses covariances, and those can be negative, so ignoring them and using the root sum square of the variance terms does not produce a provable lower bound.

Now, I understand that you think that in some cases the uncertainty is not a random variable because it is always the same even if we don’t know exactly what it is. Well, if we knew what it was we would just treat it as a bias and subtract it off. And for a single uncertainty interval if you wanted to claim that any value in that range was equally likely, no-one would stop you unless they had extra knowledge about the instrument. But for combining multiple uncertainty intervals into a single one, the statistics of random variables is the only good tool we have, and at that point we need at a minimum an estimate of the standard deviation for the uncertainty.

Rich.

Matthew R Marler
Reply to  See - owe to Rich
September 25, 2019 10:32 am

See-owe to Rich: And without that mathematics you ain’t got nothing, or you might have to be over pessimistic and assume the worst case that for a bounded uncertainty interval either endpoint is a feasible value. In that case you have to sum the interval sizes instead of the sophistication of Equation (3).

I was planning at some time to ask Nick Stokes if he thought that summing the interval sizes instead of summing the variances and taking their square roots would be a preferable procedure.

Nick Stokes: So what is the basis for adding the uᵢ in quadrature, in Eq 6, if it isn’t an εKε calculation?

Reply to  Matthew R Marler
September 25, 2019 12:29 pm

“Why should it not therefore be an accurate approximation to the propagation of uncertainty in the GCMs?”
“Is there right now a procedure better than what Pat Frank has carried out?”

This is profoundly unscientific reasoning. You have to show that it is an accurate approximation. And “at least it’s something” is not a validation of Pat Frank’s procedure (which is nuts).

I gave a demonstration in my article on DEs of how error propagation and having a common solution are totally uncoupled.

“the uncertainty that accumulates year after year is a result of that single random draw.”
“to ask Nick Stokes if he thought that summing the interval sizes instead of summing the variances and taking their square roots would be a preferable procedure.”
Why should uncertainty accumulate at all?
Suppose you made an error in, say, overestimating the solar constant. The GCM would simply solve for the temperature of a slightly warmer planet. It isn’t an error that accumulates year after year until the seas boil. Same with cloud cover.

Matthew R Marler
Reply to  Nick Stokes
September 25, 2019 5:30 pm

Nick Stokes: It isn’t an error that accumulates year after year until the seas boil.

Once again you are confusing error with uncertainty — the effects of the error accumulate, but you do not know what they are, so the uncertainty accumulates. But the error in this case is in the forecast, not in the sun. Overestimating the solar constant will not make the seas boil.

Some of the posts seem to claim that Pat Frank’s uncertainty analysis of the GCMs can’t be accurate because it opens the possibility that some of the GCMs might forecast physically impossible states. You have claimed above that the error in the model treatment of solar power can’t propagate because then the seas would boil. If the error in the model might produce a physically impossible forecast, and the analysis shows that such a forecast is compatible with the uncertainty in the parameter estimate, that is not a flaw in the analysis. It’s a limitation of knowledge and modeling that ought to be recognized.

This is profoundly unscientific reasoning. You have to show that it is an accurate approximation.

He has shown that the approximations that can be tested so far are accurate, most importantly the linearity in the CO2-Temp input-output relationship of the GCMs. Given that, his uncertainty propagation is accurate. We are stuck with forecasts that have not been shown to be accurate (but are defensible), and an uncertainty estimate that is defensible and can’t be improved upon — and is higher than most people imagined possible. The “scientific” approach is to continue to improve on the approximations.

Reply to  Nick Stokes
September 25, 2019 8:13 pm

“the effects of the error accumulate, but you do not know what they are, so the uncertainty accumulates”
They don’t accumulate, as I said. An error in solar constant would simply be reflected by the GCM producing a consistently warmer or cooler world. The uncertainty/error mumbo jumbo makes uncertainty able to do more than actual, realised error.

“If the error in the model might produce a physically impossible forecast, and the analysis shows that such a forecast is compatible with the uncertainty in the parameter estimate, that is not a flaw in the analysis. “
It is not a flaw in the analysis of the simple model. But it refutes the application to the GCM, because the GCM, unlike the simple, has mechanisms that ensure that they will not reach that result. So you are not uncertain about whether a GCM might produce that result – it can’t.

“an uncertainty estimate that is defensible”
It is not defensible. As you have shown, Eq 5 makes no sense for Σ reasons, and PF is not open to fixing it. You haven’t commented on the absurd units of the results. I noted elsewhere the clear mistake in S6.2, where he forms an average by summing over n – the number of GCM-sat interactions, but divides by 20, the number of years. etc etc.

But the big one, of course, is simply not actually analysing the effect of the differential equations that make up the GCM.

Reply to  Matthew R Marler
September 25, 2019 1:12 pm

Well if you do, you’ll get a result which is about sqrt(n) times as big as before, so an uncertainty of 270*O(1)K instead of 30K after 81 years! Not a good idea.

Reply to  See - owe to Rich
September 25, 2019 3:40 pm

“As far as I can tell, even though uncertainty may be a unique animal, the combination of them as in, for example, Pat’s Equation (3), assumes that there is a hidden random variable underneath them.”

How do you get that? Have you read the examples using a ruler assumed to be 12″ long with an uncertainty of +/- 1″? That uncertainty interval grows with each iteration of using the ruler to measure the width of a room. The first iteration has an uncertainty range from 11″ to 13″. The second iteration will then have an uncertainty interval from 10″ to 14″ or +/- 2″. The third iteration will have an uncertainty interval of 9″ to 15″ or +/- 3″. The uncertainty grows linearly with each iteration. There is no randomness to this at all.

“you might have to be over pessimistic and assume the worst case that for a bounded uncertainty interval either endpoint is a feasible value.”

That is *exactly* what an uncertainty interval means!

“Note that that equation uses covariances, and those can be negative, so ignoring them and using the root sum square of the variance terms does not produce a provable lower bound.”

A covariance is a relationship between two *random* variables. The uncertainty is not a random variable. The uncertainty interval doesn’t assume random values at each step.

“Well, if we knew what it was we would just treat it as a bias and subtract it off.”

You are still confusing error and uncertainty. You can’t subtract off uncertainty like you can error or bias. It doesn’t work that way. To reduce uncertainty you have to work on the areas of uncertainty to make them more certain.

“But for combining multiple uncertainty intervals into a single one, the statistics of random variables is the only good tool we have, and at that point we need at a minimum an estimate of the standard deviation for the uncertainty.”

Again and again and again – uncertainty is not a random variable, it has no standard deviation. It is not the same thing as error. Go back to the ruler example. No amount of statistical analysis will change the fact that the ruler has an uncertainty of +/- 1″. The only way to lessen the uncertainty is to work on the ruler so that its results become less uncertain!

Reply to  Tim Gorman
September 25, 2019 4:10 pm

“The uncertainty grows linearly with each iteration. There is no randomness to this at all.”
But Pat’s uncertainty grows as sqrt(n). He adds in quadrature. No-one has any coherent explanation for that.

Reply to  Nick Stokes
September 25, 2019 5:20 pm

“But Pat’s uncertainty grows as sqrt(n). He adds in quadrature. No-one has any coherent explanation for that.”

Have you actually read Pat’s document for meaning? What does Eq 6 say?

Reply to  Tim Gorman
September 25, 2019 6:30 pm

Eq 6 says he does it. But no-one can say why.

“uncertainty is not a random variable, it has no standard deviation”
From Eq 3:
“the uncertainty variance propagated into x is:”
From Eq 4:
“a serial propagation of physical error through n steps yields the uncertainty variance in the realization of the final state,”

Standard deviation is the square root of variance.

What do you think is being added in Eq 6?

Reply to  Nick Stokes
September 25, 2019 8:07 pm

“uncertainty is not a random variable, it has no standard deviation”

This is a misstatement and is misleading. My apologies. Uncertainty defines an interval not a probability function. It does not describe the probabilities for the various amounts of error that can occur within that uncertainty interval so it can’t give the variance of the error function. It can only tell you the interval in which the true answer lies. The uncertainty interval does have a relationship to the systemic error associated with a process. If it were possible to eliminate *all* error then of course the uncertainty interval would be zero. That is certainly not the case with the climate models.

What do I think is being added? Exactly what he said:

“For example, in a single calculation of x = f(u,v,…), where u, v, etc., are measured magnitudes with uncertainties in accuracy of ±(σu,σv,…), then the uncertainty variance propagated into x is,”

“When states x₀, …, xₙ represent a time-evolving system, then the model expectation value X_N is a prediction of a future state and σ²_XN is a measure of the confidence to be invested in that prediction, i.e., its reliability.”

I simply don’t get what is so hard about this.

Reply to  Tim Gorman
September 25, 2019 8:57 pm

“Exactly what he said”
Well, he said what he did. But how is it to be justified? It relates to the question of is u random. If uᵢ are iid random variables, then adding variances is right. But iid is a big qualification. If there is covariance, that has to come in. But “uncertainty” is now a mush of maybe random, maybe not. Yet, while there are big conditions needed to be satisfied for random variables, people here are happy to add in quadrature with no such investigation at all.

Tim Gorman
Reply to  Nick Stokes
September 26, 2019 5:54 am

“If uᵢ are iid random variables, then adding variances is right. But iid is a big qualification. If there is covariance, that has to come in. But “uncertainty” is now a mush of maybe random, maybe not.”

Only in *your* mind, Nick. “u” is uncertainty, it is an interval that is related to a random variable but is not itself a random variable. The uncertainty associated with a ruler that is 12″ long +/- 1″ doesn’t change as you make iterative measurements! That +/- 1″ has a cumulative effect on iterative measurements. The impact on the climate models is no different. That might be an inconvenient truth for you to comprehend but it is the truth nonetheless.

Reply to  Tim Gorman
September 25, 2019 11:55 pm

“at least the reference is here to demonstrate it”

Your reference is headed “A Summary of Error Propagation” and gives this caveat at the outset:

“Here are some rules which you will occasionally need; all of them assume that the quantities a, b, etc. have errors which are uncorrelated and random”

Just what I said here; they need to be iid random variables for this claim of adding in quadrature. Now folks here keep saying that they aren’t random. In which case, it is even less likely that they are uncorrelated. Not a word has been spoken to justify that proviso. Matthew keeps saying that it is just one initial error which compounds. If so, it seems 100% correlated.
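
The role of correlation in this argument can be illustrated with the textbook formula for the variance of a sum. The sketch below assumes, purely for illustration, that all n steps share the same standard deviation and the same pairwise correlation ρ; that simplification is not taken from the paper or from the cited references. The per-step value of 1.665 K is the figure quoted earlier in the thread.

```python
import math


def combined_sigma(sigma, n, rho):
    """Std dev of a sum of n terms, each with std dev sigma and a common
    pairwise correlation rho (assumed non-negative here)."""
    variance = sigma ** 2 * (n + n * (n - 1) * rho)
    return math.sqrt(variance)


sigma_step = 1.665  # K per annual step, as quoted earlier in the thread
n_steps = 100

for rho in (0.0, 0.5, 1.0):
    total = combined_sigma(sigma_step, n_steps, rho)
    print(f"rho = {rho:.1f}: +/- {total:.2f} K after {n_steps} steps")
```

With ρ = 0 the result is sigma × sqrt(n), which is addition in quadrature. With ρ = 1 it is sigma × n, the linear growth of the ruler example. Intermediate correlations fall between the two.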

Reply to  Tim Gorman
September 26, 2019 12:18 am

Nick, “they need to be iid random variables for this claim of adding in quadrature.”

No, they don’t.

The point of that reference was to establish that errors propagate into uncertainties, contradicting your denial of that fact.

Which contradiction you have conveniently by-passed.

Reply to  Tim Gorman
September 26, 2019 1:08 am

“No, they don’t.”
The reference you just linked says they do. I just quoted it.

Now you link back to your long rigmarole, with no quotes to support your contention. But just looking at Vasquez, which you most commonly quote, it’s true that they carelessly give a quadrature expression without explicit proviso in Eq 5. But in the very next paragraph, they say:
“For the case of having correlation among the input variables,” [non-iid]
and list an expression with correlations – not quadrature. You actually quote that in your Eqs 3 and 4. But you never explain why you can drop the correlations.

Reply to  Tim Gorman
September 26, 2019 12:46 pm

I’m just getting back to this today. As a Ph.D. statistician it is blindingly obvious to me that adding uncertainties in quadrature, as Nick Stokes well put it, arises from an assumption of underlying uncorrelated random variables – it’s hard to see how it could be anything else. Nevertheless, I am grateful to Nick for reading the Vasquez reference to confirm that.

Reply to  See - owe to Rich
September 26, 2019 2:56 pm

“an assumption of underlying uncorrelated random variables”

Uncertainty is not a random variable that changes from iteration to iteration. Once the uncertainty is baked in then it stays baked in. A ruler that is 12″ +/- 1″ doesn’t all of a sudden change to 12″ +/- 0.5″ on the third iteration of its use.

Reply to  Tim Gorman
September 27, 2019 3:03 am

Tim [Sep26 2:56pm],

Your ruler example can be viewed as a repeated use of a single random variate, which is an actual value taken by a random variable, in your case uniformly distributed in (-1,1) inches. So you are right that if we use the ruler, say, 20 times we will get an error of 20x, and because we don’t know x, even though we believe it exists as a number, we have to allow an uncertainty interval of (-20,20).

But in Pat’s paper the uncertainties are added “in quadrature” as Nick writes it. This comes about not by repeated use of a single random variate, but multiple samples, assumed uncorrelated, from the uncertainty distribution. In your case this would mean using 20 different rulers, each with uncertainty, or error as many people would actually call it, in the range of (-1,1). Assuming uniformity again, the variance is 1/3 and the final uncertainty would be taken to be +/-sqrt(20/3) = +/-2.58 for a 1-sigma interval or 5.16 for a 2-sigma interval. With the latter, the chance that the measurement was incorrect by more than 5.16 inches would be about 5%.
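The arithmetic in the two ruler cases can be checked with a short sketch. It assumes, as in the example above, errors uniformly distributed in (-1, 1) inches and 20 uses. The function names and the Monte Carlo check are illustrative additions, not anything taken from the paper.

```python
import math
import random

HALF_WIDTH = 1.0   # assumed: ruler error uniform in (-1, 1) inches
N_USES = 20        # assumed: 20 end-to-end measurements


def single_ruler_bound(n=N_USES, half_width=HALF_WIDTH):
    """One ruler reused n times: the same unknown error repeats, so intervals add."""
    return n * half_width


def many_rulers_sigma(n=N_USES, half_width=HALF_WIDTH):
    """n independent rulers: variances add; a uniform(-h, h) variance is h**2 / 3."""
    return math.sqrt(n * half_width ** 2 / 3.0)


def tail_fraction(n=N_USES, half_width=HALF_WIDTH, trials=20000, seed=0):
    """Monte Carlo estimate of P(|total error| > 2 sigma) for n independent rulers."""
    rng = random.Random(seed)
    limit = 2.0 * many_rulers_sigma(n, half_width)
    hits = 0
    for _ in range(trials):
        total = sum(rng.uniform(-half_width, half_width) for _ in range(n))
        if abs(total) > limit:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(single_ruler_bound())            # 20.0
    print(round(many_rulers_sigma(), 2))   # 2.58
    print(tail_fraction())                 # estimated 2-sigma tail fraction
```

The first number reproduces the (-20, 20) interval for one ruler, the second the +/-2.58 one-sigma value for 20 independent rulers. The simulated tail fraction only applies under the independence and uniformity assumptions stated above.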

I hope this helps.

Reply to  See - owe to Rich
September 25, 2019 9:46 pm

Rich, “the combination of them as in, for example, Pat’s Equation (3), assumes that there is a hidden random variable underneath them.”

It doesn’t, actually. It just shows that scientists will use a method that yields a useful measure of reliability, even if the closed form axioms are not known to be sustained.

See Vasquez and Whiting, “When several sources of systematic errors are identified, beta is suggested to be calculated as a mean of bias limits or additive correction factors as follows:

beta ~ sqrt[sum over(theta_S_i)^2], where i defines the sources of bias errors and theta_S is the bias range within the error source i.”

Rich, “Note that that equation uses covariances, and those can be negative, so ignoring them and using the root sum square of the variance terms does not produce a provable lower bound.”

The calibration error statistic is a fixed value that does not covary.

Well, if we knew what it was we would just treat it as a bias and subtract it off.

And if the error is unpredictably variable? And if there is no error information about predicted future states?

Reply to  Pat Frank
September 27, 2019 4:06 am

1. Rich, “the combination of them as in, for example, Pat’s Equation (3), assumes that there is a hidden random variable underneath them.”

It doesn’t, actually. It just shows that scientists will use a method that yields a useful measure of reliability, even if the closed form axioms are not known to be sustained.
See Vasquez and Whiting, “When several sources of systematic errors are identified, beta is suggested to be calculated as a mean of bias limits or additive correction factors as follows:
beta ~ sqrt[sum over(theta_S_i)^2], where i defines the sources of bias errors and theta_S is the bias range within the error source i.”

Answer: that is how you get junk science, when scientists use a method when its axioms and conditions are not known to be sustained. I’m not saying that your science is junk, but as of now I’m not saying that it’s not either.

2. Rich, “Note that that equation uses covariances, and those can be negative, so ignoring them and using the root sum square of the variance terms does not produce a provable lower bound.”
The calibration error statistic is a fixed value that does not covary.

Answer: if it is a fixed value then not only does it not have covariance amongst samplings of itself, it also does not have variance. Therefore it is an unknown fixed value, and its uncertainty intervals must be added like Tim Gorman’s single ruler example, not added “in quadrature”.

3. “Well, if we knew what it was we would just treat it as a bias and subtract it off. ”
And if the error is unpredictably variable? And if there is no error information about predicted future states?

Answer: but you just said that it is a fixed value, so how can it also be unpredictably variable? Which day of the week is it now, and in which Wonderland are we residing?

1sky1
September 25, 2019 4:27 pm

Pat Frank’s (2019) abstract points attention to substantial errors within CMIP5 climate models in simulating the global cloud fraction. It then claims:

The resulting long-wave cloud forcing (LWCF) error introduces an annual average ±4 Wm–2 uncertainty into the simulated tropospheric thermal energy flux.

What the cited source for this uncertainty, Lauer and Hamilton (2013) asserts, however, is stunningly different, both physically and statistically:

The CF is defined as the difference between ToA all-sky and clear-sky outgoing radiation in the solar spectral range (SCF) and in the thermal spectral range (LCF)…

The rmse of the multimodel mean for SCF is 8 W m−2 in both CMIP3 and CMIP5….For CMIP5, the correlation of the multimodel mean LCF is 0.93 (rmse = 4 W m−2) and ranges between 0.70 and 0.92 (rmse = 4–11 W m−2) for the individual models.

Clearly, L&H’s empirical error estimates (based upon a 20-yr comparison of model simulations and satellite measurements) pertains strictly to ToA power flux densities. It says little about the troposphere. Moreover, the empirical error is always specified as a fixed rms value. This standard error specification is necessarily in the same units as the signal itself and applies indefinitely; it cannot be projected per annum as Frank would have it. The fact that annual average data were used in the comparison implies only that there’s no involvement of seasonal and diurnal cycles in the given error estimates. There’s simply no pertinent annual rate of change here, such as with the “CO2-forcing” signal.

Nor does the specification of standard error for annually averaged data imply anything about the one-step autocorrelation of that error and its predictability at any step. Contrary to assumption here, the “prediction error” of the models, in any rigorous sense, is not involved at all. Later this week I’ll demonstrate how Frank’s own analysis of one-step autocorrelation of error totally undermines his projection of how it propagates.

Reply to  1sky1
September 25, 2019 5:18 pm

All true
“What the cited source for this uncertainty, Lauer and Hamilton (2013) asserts, however, is stunningly different, both physically and statistically”
It isn’t even the uncertainty of a global (spatial) average. It seems clear from the description that he computes the spatial correlation, ie between GCM/obs values at grid points. That is also the focus of Taylor’s paper, which Lauer seems to be following. That correlation is then converted to the 4 W/m2. It is the uncertainty over space, not time.

Reply to  Nick Stokes
September 25, 2019 6:16 pm

What is –

∓cloud-cover-unit year⁻¹ × Wm⁻²/(cloud-cover-unit)

Please show where the math in Supplemental Section 6.3 is incorrect.

The distance a car travels is a spatial sum. Yet you can calculate the miles/year value for the car thus turning a spatial calculation into one with a time relationship. Why can’t you do the same for cloud cover?

Reply to  Tim Gorman
September 25, 2019 6:44 pm

“Please show where the math in Supplemental Section 6.3 is incorrect.”
S6.2 is clearly incorrect. He sums a set of discrepancies e_{i,g} in g over n, “where “n” is the number of simulation-observation pairs evaluated at grid-point “g” across the 20-year calibration period.” But then, to get the average, he divides by 20 years. There is no basis for that. And it’s where the year⁻¹ unit nonsense comes from.

Reply to  Nick Stokes
September 25, 2019 8:23 pm

“the annual mean simulation error at grid-point g, calculated over 20 years of observation and simulation, is”

“where “n” is the number of simulation-observation pairs evaluated at grid-point “g” across the 20-year calibration period. Individual grid-point error ei,g is of dimension”

You’ve got a 20 year value for cloud-cover so you divide by 20 years to get an annual value. “20 year mean” means something.

I simply don’t see what is so difficult about that.

Reply to  Nick Stokes
September 25, 2019 8:51 pm

“I simply don’t see what is so difficult about that.”
It’s just nonsense. To get an average, you sum and then divide by the number of things summed. By n, not by a fixed 20 years. If you had twice as many observations, that wouldn’t necessarily change the average. But with that formula, since you divide by a fixed 20, it would double it.
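The arithmetic point in this reply can be illustrated with a small sketch. The error values are hypothetical, and the two functions are only stand-ins for "divide by the number of things summed" and "divide by a fixed 20". The sketch does not reproduce the SI equation, and it says nothing about whether n equals 20 there.

```python
def mean_by_count(errors):
    """Ordinary average: sum divided by the number of values summed."""
    return sum(errors) / len(errors)


def sum_over_fixed_period(errors, years=20):
    """Sum divided by a fixed 20, regardless of how many values were summed."""
    return sum(errors) / years


# Hypothetical error values: 20 of them, then the same set duplicated to 40.
errors = [1.0, -2.0, 3.0, 0.5] * 5
doubled = errors * 2

if __name__ == "__main__":
    print(mean_by_count(errors), mean_by_count(doubled))                  # 0.625 0.625
    print(sum_over_fixed_period(errors), sum_over_fixed_period(doubled))  # 0.625 1.25
```

When the number of values equals the fixed divisor, the two calculations agree. They diverge only when the count of values differs from 20.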

Reply to  Nick Stokes
September 25, 2019 11:38 pm

Nick, “If you had twice as many observations,…”

Observations of what magnitude error, Nick? Increasing the number of observations may reduce the average.

Once again you make a simplistic mistake as soon as the arena is science.

Reply to  Nick Stokes
September 25, 2019 11:52 pm

Nick, “But then, to get the average, he divides by 20 years. There is no basis for that.”

From Lauer and Hamilton, “An analysis of the spread in the 20-yr-mean LWP among the ensemble members of individual models. …

Figure 1 shows the 20-yr annual mean liquid water path averaged over the years 1986–2005 from 24 CMIP5 models …

FIG. 1. The 20-yr average LWP (1986–2005) from the CMIP5 historical model runs and the multimodel mean

A measure of the performance of the CMIP model ensemble in reproducing observed mean cloud properties is obtained by calculating the differences in modeled (xmod) and observed (xobs) 20-yr means (my bolding)”

Twenty-year model and observational means are described throughout Lauer and Hamilton.

But you already knew that, didn’t you Nick.

Your “no basis” must have been an innocent mistake, mustn’t it.

Reply to  Nick Stokes
September 26, 2019 12:35 am

“Twenty-year model and observational means”
This is just idiotic. To get a mean or average, you divide a sum of things by the number of things. Not by 20 years because someone mentioned that they compiled the things (differences) over a twenty year period.

Here is the section of the SI with Eq 6.2. The point is of some importance because it is where the year⁻¹ nonsense enters.

“Increasing the number of observations may reduce the average.”
The mean of observations is used as a population mean estimate. So there is no basis for that. In any case, this is yet another case of you and your defenders trying to excuse something that is mathematically wrong by saying that, well, you can’t tell, it just might come out all right.

Reply to  Nick Stokes
September 26, 2019 5:49 am

“The mean of observations is used as a population mean estimate. So there is no basis for that. In any case, this is yet another case of you and your defenders trying to excuse something that is mathematically wrong by saying that, well, you can’t tell, it just might come out all right.”

Any additions of observations that shift the population makeup can affect the population mean, either positively or negatively. Only if the additional observations equal the mean is there no change in the mean.

Reply to  Nick Stokes
September 26, 2019 12:06 am

Nick, “It isn’t even the uncertainty of a global (spatial) average.”

From Lauer and Hamilton, “In both CMIP3 and CMIP5, the large intermodel spread and biases in CA and LWP contrast strikingly with a much smaller spread and better agreement of global average SCF and LCF with observations. The SCF and LCF directly affect the global mean radiative balance of the earth, so it is reasonable to suppose that modelers have focused on ‘‘tuning’’ their results to reproduce aspects of SCF and LCF as the global energy balance is of crucial importance for long climate integrations.”

Further, “FIG. 7. Biases in simulated 20-yr-mean LWP from (left) the (top to bottom) four individual coupled CMIP5 models and (middle) their AMIP counterparts, with the smallest global average rmse in LWP. (right) The biases in annual mean SST in the coupled runs(my bold)”

Wrong again, Nick. But not deliberately so. Not at all.

Reply to  Pat Frank
September 26, 2019 2:02 am

The RMSE is the RMS of discrepancies over space, not time. Here (from here) is what the author, Lauer, had to say:

“I have contacted Axel Lauer of the cited paper (Lauer and Hamilton, 2013) to make sure I am correct on this point and he told me via email that “The RMSE we calculated for the multi-model mean longwave cloud forcing in our 2013 paper is the RMSE of the average *geographical* pattern. This has nothing to do with an error estimate for the global mean value on a particular time scale.”.”

Reply to  Nick Stokes
September 26, 2019 8:06 am

“The RMSE is the RMS of discrepancies over space, not time. Here (from here) is what the author, Lauer, had to say:”

I believe I already asked this. A car’s odometer has an uncertainty associated with it and the odometer measures a quantity in space and not time. Yet we can certainly take the increase of the odometer over the period of a year and call it miles/year driven, a measure in time. The uncertainty in space then becomes an uncertainty in time. The uncertainty doesn’t just disappear or go away because you are now using a measure in time. The uncertainty over the measure of a mile by the odometer adds for every mile measured, just like the uncertainty of a measurement made by the ruler that is 12″ +/- 1″ adds with every iteration of it being used to measure something longer than 12″. When that uncertainty in the total of the odometer over a year is evaluated at the end of the year the uncertainty doesn’t just disappear.

Why is this concept so damnably difficult for you to grasp?

Do you disagree? Do you *really* think that a measure of miles/year has no uncertainty.

Reply to  Nick Stokes
September 26, 2019 10:19 am

“Why is this concept so damnably difficult for you to grasp?”

It’s nothing like that. Lauer sets it out clearly and it’s even reflected in S6. There is a grid of points on Earth. At each, at various times, they have coincident values from GCM and observation. They get a correlation coefficient, and from that deduce the 4 W/m2. They haven’t formed a spatial average. The correlation doesn’t relate to any particular period of time, as Lauer said.

Why is it so hard to grasp that if you are going to base a whole theory of failure of GCMs on that one number, 4 W.m2, you need to know what it actually is?

Reply to  Nick Stokes
September 26, 2019 2:59 pm

“Why is it so hard to grasp that if you are going to base a whole theory of failure of GCMs on that one number, 4 W.m2, you need to know what it actually is?”

In other words you still don’t believe that the number of miles driven in a car (a spatial scalar) over the period of a year can become miles/year. Got it.

Reply to  Nick Stokes
September 28, 2019 1:25 pm

Nick, “They get a correlation coefficient, and from that deduce the 4 W/m2.”

From Lauer and Hamilton, page 3831: “A measure of the performance of the CMIP model ensemble in reproducing observed mean cloud properties is obtained by calculating the differences in modeled (x_mod) and observed (x_obs) 20-yr means.”

From page 3833: “The overall comparisons of the annual mean cloud properties with observations are summarized for individual models and for the ensemble means by the Taylor diagrams for CA, LWP, SCF, and LCF shown in Fig. 3. These give the standard deviation and linear correlation with satellite observations of the total spatial variability calculated from 20-yr annual means. (my bold)”

Page 3842, “Our analysis of the root-mean-square error of simulated LWP, CA, SCF, and LCF supports our findings on little to no changes in the skill of reproducing the observed LWP and CA.”

It’s very clear that the rmse in LWCF is derived from the geographical distribution of (simulation minus observation) errors over a 20 year calibration time.

The correlations describe the linear coherence between simulated cloud properties and observed cloud properties.

Correlations are *not* used to deduce the 4 W/m2.

Nick either did not read Lauer and Hamilton, or has seriously deficient reading skills. Or?

Reply to  Nick Stokes
September 29, 2019 12:42 am

“Correlations are *not* used to deduce the 4 W/m2.”
They are. The very brief reference to that figure you have based your analysis on says:
“For CMIP5, the correlation of the multimodel mean LCF is 0.93 (rmse = 4 Wm⁻²)”
and they explain how they get that (p 3833). They form a polar plot – a Taylor diagram, which plots sd against correlation (theta). Then they deduce rmse as proportional to a linear distance on this plot.
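For readers unfamiliar with Taylor diagrams, the geometry rests on a law-of-cosines identity (Taylor, 2001): E'² = σ_m² + σ_o² − 2·σ_m·σ_o·R, where E' is the centered (mean-removed) RMS difference, σ_m and σ_o are the model and observed standard deviations, and R is their correlation. The sketch below checks that identity on made-up numbers. The data are hypothetical, and whether the rmse quoted from Lauer and Hamilton is the centered quantity or includes mean bias is not established in this thread.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def pstdev(xs):
    """Population standard deviation."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)
    return cov / (pstdev(a) * pstdev(b))


def centered_rms_direct(model, obs):
    """Centered RMS difference computed point by point after removing each mean."""
    mm, mo = mean(model), mean(obs)
    sq = sum(((x - mm) - (y - mo)) ** 2 for x, y in zip(model, obs))
    return math.sqrt(sq / len(model))


def centered_rms_taylor(model, obs):
    """Same quantity from the two standard deviations and the correlation."""
    sm, so, r = pstdev(model), pstdev(obs), correlation(model, obs)
    return math.sqrt(max(sm ** 2 + so ** 2 - 2.0 * sm * so * r, 0.0))


# Hypothetical grid-point values (W/m^2), for illustration only.
model = [30.0, 25.0, 40.0, 35.0, 20.0]
obs = [28.0, 27.0, 36.0, 38.0, 24.0]

if __name__ == "__main__":
    print(centered_rms_direct(model, obs))
    print(centered_rms_taylor(model, obs))   # agrees with the direct value
```

The two routes give the same number, which is why an RMS difference can be read off a diagram that plots only standard deviation and correlation.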

“It’s very clear that the rmse in LWCF is derived from the geographical distribution of (simulation minus observation) errors over a 20 year calibration time.”
Yes, it is. But that is the rmse associated with variation at each grid-point, as measured at points with coincident obs/GCM values during a 20 year period. It is not the variation of a global spatial average, as you are treating it. It does not represent a variability you can associate with a forcing. In fact, it will be much attenuated once you do form a global average. And it has no per year status, as Lauer said.

Reply to  1sky1
September 25, 2019 5:32 pm

“Later this week I’ll demonstrate how Frank’s own analysis of one-step autocorrelation of error totally undermines his projection of how it propagates.”

And again we see the confusion of error with uncertainty.

Reply to  1sky1
September 25, 2019 11:11 pm

1sky1, “it cannot be projected per annum as Frank would have it.”

L&H page 3833, “The overall comparisons of the annual mean cloud properties with observations are summarized for individual models and for the ensemble means by the Taylor diagrams for CA, LWP, SCF, and LCF shown in Fig. 3. (my bold)”

Are you going to argue, as Nick Stokes does 1sky1, that “annual mean” does not mean annual mean? Is object incoherence a practice of art for you, as it is for Nick?

1sky1, “Nor does the specification of standard error for annually averaged data imply anything about the one-step autocorrelation of that error and its predictability at any step.” (my bold).

So, it’s all annually averaged after all, including annual mean uncertainty. Uncertainty, not error, 1sky1. Arguing error is a fatal mistake. Uncertainty due to calibration error stemming from model theory-error necessarily appears in every simulation step of a futures projection.

1sky1, “Clearly, L&H’s empirical error estimates … pertains strictly to ToA power flux densities. It says little about the troposphere.”

You don’t stand a chance of making that case, 1sky1. And you’d not even try unless you’re an adherent of Stokesism.

From Hartmann, et al, below: “The largest contribution to net cloud forcing are provided by low clouds, especially in the tropical stratus cloud regions and the summer hemisphere(Fig. 21). Low clouds are abundant, and act to reduce the radiation balance by reflecting solar radiation. High thick clouds also provide significant reductions to the radiation balance, since they reflect more solar radiation than they trap longwave emission. High- and middle-level clouds with relatively low optical depth do not enter very strongly into the regression equations for net radiation. High thin clouds make a small positive contribution to the net radiation in tropical latitudes — amounting to about 5 W m^-2 in zonal average. (Fig 22).

Quote from p. 1299 of Hartmann, et al., (1992) The Effect of Cloud Type on Earth’s Energy Balance: Global Analysis J Climate. 5(11), 1281-1304.

That quote is just one of what could be many from Hartmann, where they discuss the impact of clouds on the thermal energy flux of the troposphere.

See also Figure 7.1 of the IPCC 5AR for a graphical rendering of the impact of longwave cloud forcing on global tropospheric air temperature.

From Figure 7.1 legend: “Overview of forcing and feedback pathways involving greenhouse gases, aerosols and clouds. … Feedback loops, which are ultimately rooted in changes ensuing from changes in the surface temperature, are represented by curving arrows (blue denotes cloud feedbacks;…)(my bold)

Surrounding text includes,”Figure 7.1 illustrates key aspects of how clouds and aerosols contribute to climate change, and provides an overview of important terminological distinctions. … Rapid adjustments (sometimes called rapid responses) arise when forcing agents, by altering flows of energy internal to the system, affect cloud cover or other components of the climate system and thereby alter the global budget indirectly. … As shown in Figure 7.1, adjustments can occur through geographic temperature variations, lapse rate changes, cloud changes and vegetation effects. (my bold)”

See also Figure 7.3, wherein, “the brown arrow symbolizes the importance of couplings between the surface and the cloud layer for rapid adjustments. (my bold)”

AR5, under 7.2.1.2, “The net global mean CRE (cloud radiative effect) of approximately –20 W m^–2 implies a net cooling effect of the clouds on the current climate. Owing to the large magnitudes of the SWCRE (short wave cloud radiative effect) and LWCRE (long wave cloud radiative effect), clouds have the potential to cause significant climate feedback (Section 7.2.5).

5AR under 7.2.5: “Until very recently cloud feedbacks have been diagnosed in models by differencing cloud radiative effects in doubled CO2 and control climates, normalized by the change in global mean surface temperature. … Moreover, it is now recognized that some of the cloud changes are induced directly by the atmospheric radiative effects of CO2 independently of surface warming, and are therefore rapid adjustments rather than feedbacks (Section 7.2.5.6).

Not a chance, 1sky1.

1sky1
Reply to  Pat Frank
September 26, 2019 4:38 pm

So, it’s all annually averaged after all, including annual mean uncertainty. Uncertainty, not error, 1sky1. Arguing error is a fatal mistake.

When someone clings obsessively to the bizarre notion that straightforward algebraic averaging of any given time-series over yearly intervals introduces an “annual mean uncertainty,” thereby changing the dimensions of the data, then there’s no chance of any rational discussion.

Reply to  1sky1
September 27, 2019 4:14 am

1sky1, I agree, but the only way to demonstrate error, that is scientific and logical error, in order to further the rational discussion, is to use mathematics to dispute what was asserted. If you could do that it would be great. I might get around to it, but I am frying other fish right now.

Rich.

1sky1
Reply to  See - owe to Rich
September 27, 2019 2:05 pm

With only minutes to spare in my daily schedule for blog activity, I must leave the demonstration of the mistaken treatment of modeling uncertainty to others. Fortunately, there is another blog that has taken up this murky issue competently, offering Monte Carlo simulations of both the systematic forcing error and the putative error arising from Frank’s mistaken assumption of a random-walk type of yearly error propagation.

Reply to  See - owe to Rich
September 28, 2019 1:47 pm

There is no such “Frank’s mistaken assumption of a random-walk type of yearly error propagation.”

Uncertainty is not about error specification.

It’s about reliability in a predicted result where no knowledge of error is available.

Uncertainty analysis includes no assumption about the structure of error.

You folks are consistently wrong.

Reply to  Pat Frank
September 29, 2019 4:51 am

Pat,

They keep saying your analysis doesn’t work properly for determining the error in the output of the GCMs. I simply cannot believe that it is so hard to understand the difference between uncertainty and error. If the uncertainty is greater than the change you are trying to detect then you simply don’t know if the change is really true. It’s the same as calculating out to the hundredths when your measuring device only resolves to the tenths! Mathematicians and computer programmers seem to have no concept of significant digits or uncertainty.

1sky1
Reply to  See - owe to Rich
September 30, 2019 4:52 pm

Uncertainty is not about error specification. It’s about reliability in a predicted result where no knowledge of error is available.

If “no knowledge of error is available,” from where does the 4 W/m^2, whose square you accumulate annually, come? From divine imagination, just like your notion that a process whose variance increases linearly with time is not a Wiener process–a 1-D version of a random walk?

Reply to  1sky1
September 28, 2019 1:42 pm

I changed no dimensions, 1sky1.

It’s not “any given time-series” It’s an error series.

The LWCF rmse is the mean annual standard deviation of error from a 20-year calibration-of-simulation experiment. It generally characterizes the resolution of CMIP5 simulations of TCF.

If you’re going to obsessively assert what is not present, then your conclusion is true.

1sky1
Reply to  Pat Frank
September 30, 2019 1:37 pm

No matter what the nature of the series, the mathematical operation of averaging data in barely non-overlapping intervals cannot change the dimensions of the series. The rmse of ToA LW emissions reported by Lauer and Hamilton is the standard deviation of the all-model aggregate average of the individual 20-yr model-error series. Because there is autocorrelation in those series, that deviation is definitely NOT the “mean annual standard deviation” of the individual years, as you have it.

Reply to  1sky1
September 29, 2019 5:17 am

Here is something from that other discussion:

“As Gavin Schmidt pointed out when this idea first surfaced in 2008, it’s like assuming that if a clock is off by about a minute today, that tomorrow it will be off by two minutes, and in a year off by 365 minutes. In reality, the errors over a long time are completely unconnected with the offset today.”

If the clock was correct yesterday and off by a minute today then it is *NOT* an offset bias that caused the difference. It is an in-built error in the time-keeping mechanism. If that error mechanism is unknown then there is an uncertainty in the output of the clock at any point in time. That uncertainty will grow as the actual error displayed by the clock grows since most clocks don’t randomly gain and lose time. If the actual growth in the error over time can be determined then the uncertainty growth can be determined as well by growing it over time.

September 25, 2019 7:01 pm

“again we see the confusion of error with uncertainty”
Just dumb parroting. From the title:
“Propagation of Error and…”
From the abstract:
“Linear projections are subject to linear propagation of error”
From the key-words
“propagated error,”
From the intro
“Propagating physical errors through a model is standard”
etc etc

Reply to  Nick Stokes
September 25, 2019 8:43 pm

“Just dumb parroting. From the title:”

I’ll admit some of the wording is perhaps confusing. But the math isn’t.

You still can’t get over the fact that you can’t eliminate uncertainty by assuming it is a random variable. Get over it! The example of a 12″ +/- 1″ ruler being used to measure the width of a room should have been enough of an example for you to figure this out. The uncertainty grows with every iteration of its use. You can’t “average” it away! It’s the same for the climate models, you can’t “average” the uncertainty away. It grows with each iteration. The only way to lower the overall uncertainty is to make the models more accurate, i.e. lower that initial uncertainty interval.

Reply to  Nick Stokes
September 25, 2019 11:23 pm

As usual, you display no knowledge of science, Nick.

Propagation of error through a calculation yields uncertainty in a result.

It doesn’t matter that I posted a large selection of published literature showing that to be the case.

You’ll evidently continue in denial anyway. Science denial. What a concept, hey?

Reply to  Pat Frank
September 27, 2019 4:24 am

Pat, Nick is not denying science and mathematics, merely your unquestioning use of aspects of it. Here are 3 ways to accumulate uncertainty.

1. Add up the intervals. That is correct in the case of Tim Gorman’s single ruler used many times.

2. Add up the inferred variances and take the square root (“adding in quadrature”). That is correct in the case of using many rulers, under the assumption at least that no pair of them were cut from a 24+/-0.01″ piece of wood. If that assumption is violated then there is severe anti-correlation.

3. Add up the inferred variances and covariances weighted by the appropriate mathematics which you have cited. This is the correct way, but reduces to No. 2 if it turns out all the covariances are statistically indistinguishable from 0. As far as I can tell you haven’t looked at a relevant covariance matrix so you can’t tell.
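The three rules listed above can be written out as a short sketch. It assumes a simple sum of n quantities, each with the same kind of uncertainty measure u_i, and a hypothetical constant correlation between every pair. The function names and the correlation values are illustrative only.

```python
import math


def add_intervals(u):
    """Rule 1: one repeated unknown error, so the uncertainties add linearly."""
    return sum(abs(x) for x in u)


def add_in_quadrature(u):
    """Rule 2: uncorrelated contributions, so the variances add."""
    return math.sqrt(sum(x * x for x in u))


def add_with_covariance(u, corr):
    """Rule 3: variance of a sum is sum_i sum_j corr[i][j] * u[i] * u[j]."""
    n = len(u)
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += corr[i][j] * u[i] * u[j]
    return math.sqrt(max(total, 0.0))


def constant_correlation(n, rho):
    """Hypothetical correlation matrix: 1 on the diagonal, rho everywhere else."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


u = [1.0] * 20   # twenty contributions of +/-1 each (hypothetical)

if __name__ == "__main__":
    print(add_intervals(u))                                      # 20.0
    print(add_in_quadrature(u))                                  # sqrt(20)
    print(add_with_covariance(u, constant_correlation(20, 0.0)))   # same as quadrature
    print(add_with_covariance(u, constant_correlation(20, 1.0)))   # same as linear sum
    print(add_with_covariance(u, constant_correlation(20, -0.05))) # anti-correlated case
```

With zero correlation the general rule collapses to quadrature, with full correlation it collapses to the linear sum, and with negative correlation it falls below the quadrature value. That is why the choice between the rules depends on the covariance structure.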

Reply to  See - owe to Rich
September 28, 2019 1:06 pm

An average calibration error statistic has no covariance, Rich.

1sky1
Reply to  See - owe to Rich
September 30, 2019 2:08 pm

A single, mis-scaled ruler used consistently over time introduces only a static scaling error, which may be unknown, but is easy to spot when accurate rulers are available. Even in the absence of such rulers, if the error-statistics of the production run of faulty rulers are known, then measurements made by a large-enough set of such rulers provide useful results via their aggregate average. That’s the principle that GCMs, which provide no bona fide predictions of actual climate, rely upon in their simulations of “climate scenarios.”

Contrary to Frank’s claim of “no covariance,” straightforward scaling error is always perfectly coherent at all frequencies with the underlying signal. That signal is known to be very significantly autocorrelated, which totally invalidates his presumption of additivity of stochastically independent variances.

Reply to  Nick Stokes
September 25, 2019 11:31 pm

There’s no confusion of terms or in wording, Tim.

“Suppose you measure some quantities a; b; c; … with uncertainties ∂a; ∂b; ∂c; … . Now you want to calculate some other quantity Q which depends on a and b and so forth. What is the uncertainty in Q?” (Harvard physics, 86 kb pdf.)

That reference won’t stop Nick from endlessly repeating his distruth, but at least the reference is here to demonstrate it.

Reply to  Pat Frank
September 26, 2019 7:46 am

Pat,

“There’s no confusion of terms or in wording, Tim.”

Not to me anyway. To someone dead set on finding *anything* to stick to the wall it’s any shelter in the storm.

Matthew R Marler
September 25, 2019 7:14 pm

Nick Stokes: So what is the basis for adding the uᵢ in quadrature, in Eq 6, if it isn’t an εKε calculation?

That would be an interesting question if anything else could be agreed. Outstanding disagreements seem to revolve around:

1. In the GCMs, the relationship between CO2 forcing input and temperature output is linear. Nick Stokes asked how that was possible if the GCMs are not accurate models of climate [hope I did not lose anything important in the paraphrase.] That the relationship is linear to a high degree of approximation is supported by analysis; and in turn supports Pat Frank’s uncertainty propagation.

2. Pat Frank’s analytic procedure does not create a random walk with independent sampling of error in each year. The only random variation is in the estimate of the parameter. There is additional uncertainty in the estimate of the standard deviation of that parameter estimate (hence the endpoint of the CI), but no evidence that the standard error of the parameter is small.

3. Representation of the uncertainty in the parameter estimate by a probability model instead of a fixed-width interval (for which the propagation of uncertainty produces a larger spread in the uncertainty of the final projected value).

4. In professional as well as common usage, the word “error” can refer to each of: (1) a particular error value; (2) the uncertainty in the value of an error that is likely present.

As far as I can tell, the critics of Pat Frank are unwilling to admit both (a) that the CO2-Temp relation of the GCMs is highly linear and (b) that the relationship can be useful in subsequent calculations.

Reply to  Matthew R Marler
September 25, 2019 7:57 pm

“That would be an interesting question if anything else could be agreed”
How does it depend on that? The statement is there, in the paper. There must be some justification for it. Well, maybe.

“if the GCMs are not accurate models of climate”
No, I said how is it possible if the GCMs are so mired in uncertainty, that any coherent modelling could emulate them.

” uncertainty in the parameter estimate by a probability model”
So what is the difference between that and saying it is random?

“are unwilling to admit both that the CO2-Temp relation of the GCMs is highly linear”
Not at all. As I said here, a generally linear dependence of ΔT on ΔF is to be expected, and was the basis of much 1-D modelling, up to Manabe and Wetherald and beyond, who did it far better than here. And it is expected to be strongly present in the small part of GCM output related to global surface temperature.

” that relationship can be useful in subsequent calculations.”
It can be useful in analysing Eq 1. It has nothing to do with error propagation in a GCM.

An example of why. Suppose you have a GCM, and also a simple model of the Earth at which TOA outward is fixed at 1361 W/m2. Since that currently is about right, the simple model may well give right answers. But how will the two models respond to uncertainty about insolation? What if it increases? The GCM will come into balance; the simple model will just keep accumulating heat.

Reply to  Nick Stokes
September 26, 2019 12:31 am

Nick wrote, “No, I said how is it possible if the GCMs are so mired in uncertainty, that any coherent modelling could emulate them.”

Showing ignorance of the difference between error and uncertainty.

Nick, “a generally linear dependence of ΔT on ΔF is to be expected,”

It appears the trauma of Jerry Browning has had its effect on you.

Nick, “The GCM will come into balance; the simple model will just keep accumulating heat.”

Eqn. 1 isn’t “a simple model of the Earth.” It’s a model demonstrating that GCM air temperature projections are simple extrapolations.

Also, for the gazillionth time, uncertainty is not error.

A point that also combines to invalidate, “It has nothing to do with error propagation in a GCM.”

Reply to  Nick Stokes
September 26, 2019 11:58 am

Hi Nick,

A while back on this thread, I inquired if you had worked with DEs in the area of finance, to which you kindly provided references to CSIRO’s development of plug-in models for FENICS (ref. Nick Stokes September 20, 2019 at 12:06 am). I meant to respond at the time, if only to say that you are clearly a very smart person, whose math and modeling skills are vastly beyond mine. Subsequently, I have thought further about the precision vs. accuracy conundrum, and would like to run the following analogy between GCMs and derivative pricing models (DPMs) by you for comment:

Consider a universe of major money-center banks, each of which maintains proprietary DPMs for pricing long-dated interest rate options out to 10 years, or so. While independent, these DPMs are all based on similar references to the academic literature and are calibrated in real-time to reflect current market interest rates (e.g., cash Libors, ED futures, swap rates, Treasury bond yields, etc.), as well as the volatilities of observable instruments. In other words, they are internally consistent with forward rates and volatilities. Presuming that one could obtain concurrent pricing on a strip of forward interest rate options (out to 10 years) from each of the banks, what observations could be made about the DPMs’ 1) precision and 2) accuracy?

Based on my own experience (admittedly limited and dated), I’d posit that the precision of the GCMs would be high, and that price variations among the various banks would be minimal, maybe on the order of a few “bips”. In contrast, the accuracy of the DPMs would be low – I can’t think of any rational person or financial institution that would hold a sizable unhedged short position in these options for any amount of time, neither overnight nor for a week, let alone anything on the order of years. This is not to say that we lack expertise in how to model interest rate derivatives in a “risk neutral” world, we just have no way of knowing how economic / financial conditions will evolve over time in the real world based on any information we can obtain from current financial market data.

To extend the analogy a bit further, the issue I have with GCMs is not that bright people like yourself have contributed to their current state of development, it’s that so-called “policy makers” are using the GCMs’ results to insist that we cede our personal liberties to them concurrent with massively shorting our current long position in reliable energy sources. While this might work out well for a fortunate few, I think it would be disastrous for most of us.

Thank you.

Reply to  Frank from NoVA
September 26, 2019 12:06 pm

All,

In the first sentence of the third paragraph, GCMs should read DPMs. – F.

Reply to  Frank from NoVA
September 26, 2019 10:00 pm

Frank,
“what observations could be made about the DPMs’ 1) precision and 2) accuracy?”
Personally, I think much nonsense is talked about precision and accuracy. Accuracy is supposed to be discrepancy between measurement and truth, but folks who push this also tend to insist that every measure has uncertainty. IOW we can never know “truth” and so can never determine accuracy. All we can do is compare measures with what we believe to be better measures.

This is particularly true of DPMs. They assign a value to a probability proposition, but you can never test that by any perfect measure, or even a good one. The only test would be market value (when?) but that only tests relative to others’ estimate (probably using similar software). In fact, a lot of the early use of our software was in providing a second opinion (because it was somewhat different).

I don’t think there is much of a useful analogy between DPMs and GCMs for error propagation. DPMs are basically the heat equation, and fairly stable, although people try to push them into unstable situations. GCMs have wave-like solutions which are very important for error propagation. They run for long basically steady periods, whereas DPM propositions typically have defined periods.

As to GCMs, personal liberties etc, I think of course that is nonsense. We are in a situation where we are making big changes to the atmosphere. That will have effects; that has been known since Arrhenius in 1896. GCMs are our best way of quantifying the future effects, but they didn’t originate the concern about them. Deciding not to try to quantify them at all is not an adequate answer.

Reply to  Nick Stokes
September 28, 2019 1:52 pm

Nick, “That will have effects; that has been known since Arrhenius in 1896.”

Arrhenius had no adequate physical theory of climate to make such a determination and neither do you, Nick.

Nor anyone else.

Meanwhile, the climate shows no unusual behavior.

Matthew R Marler
Reply to  Nick Stokes
September 28, 2019 10:59 am

Nick Stokes: “if the GCMs are not accurate models of climate”
No, I said how is it possible if the GCMs are so mired in uncertainty, that any coherent modelling could emulate them.

” uncertainty in the parameter estimate by a probability model”
So what is the difference between that and saying it is random?

First, picking up on your comment that the A matrix is known, the question is “about what is there uncertainty?” We are certain what parameter value is used in the modeling, but we are uncertain about the relationship between the used number (based on empirical evidence) and the “true” number. It’s the same question as with the early estimates of the gravitational constant in Newton’s inverse-square law: the first estimate was known as soon as it was calculated; the uncertainty revolved around its closeness to the true value.

Second, I touched earlier on the use of the same mathematical theory of probability to represent relative frequencies of occurrence (“aleatory”) and confidence/uncertainty in the truth of propositions (“epistemic”) and the relationship between them. It’s “obvious” that the random elements in the data induce uncertainty in the parameter estimates computed from them. Quantifying the relationship between the random variation and the uncertainty is done in two principal ways: through Bayes’ Theorem to calculate “credible intervals” for the parameter from prior belief/uncertainty (before data collection) and the random variation in the data; and through standard errors of the parameter estimate yielding confidence intervals and confidence distributions. They have been much studied. A good recent book, somewhat technical, is “Computer Age Statistical Inference” by Bradley Efron and Trevor Hastie. I also recommend “Confidence, Likelihood, Probability: Statistical Inference with Confidence Distributions” by Tore Schweder and Nils Lid Hjort. A formal limit theorem states that with a sufficiently large sample the credibility distribution and confidence distribution are practically indistinguishable. Not everyone agrees that the “epistemic” probabilities have been shown to be as useful as the “aleatory” probabilities.

Reply to  Matthew R Marler
September 28, 2019 2:11 pm

“Not everyone agrees that the “epistemic” probabilities have been shown to be as useful as the “aleatory” probabilities.”
Some fancy philosophy here. But you are not dealing with the very basic questions of why the error should be compounded in quadrature. Or why it should be compounded at all. Or the basic mathematical howlers, like eq 5.1, which makes no sense either as a change of whole temperature, as you noted, or as difference, which was your alternative (repudiated) scenario. Or of the junior high school level error in S6.2. Or of the units, which are treated with no consistency, but can’t work however you do it.

And, of course, the very basic question of how analysing Eq 1 helps with propagation of error in a GCM.

Matthew R Marler
Reply to  Nick Stokes
September 28, 2019 4:10 pm

Nick Stokes: But you are not dealing with the very basic questions of why the error should be compounded in quadrature. Or why it should be compounded at all.

That is not a “basic” question.

1sky1
Reply to  Nick Stokes
September 30, 2019 2:15 pm

Not only is that question mathematically basic, it is THE fundamental premise of Frank’s entire claim.

1sky1
September 26, 2019 2:35 pm

The muddling of the meaning of the empirical estimate of the rms error in modelling ToA planetary LW emissions (see my comment yesterday) is compounded by “propagating” this error as if it accumulated in annual steps. The assumption is made that it’s a systematic specification error, analogous to a mis-scaled ruler used for measurement, with a different ruler used every year in a whole chain of mis-measurements.

But rms error specifies only the variability—not the bias–of model output. And no GCM operates by randomly changing its computational algorithms—no matter how unrealistic–in annual, or any other, steps. Nor do GCMs, run in climate simulation mode, make any calculations based upon assimilating data about the actual state of climate. They run the same algorithms with the same set parameters as in the final “calibration” phase. Their only tie to real time lies in the historic specification of “forcing.” The error of such model simulations is entirely unrelated to those of bona fide time-series predictors, which indeed grows with time to an asymptotic value set by total signal auto-covariance. Contrary to assumption here, it tends to remain stable, albeit in a complicated way that requires deeper insight into model specifics.

Frank examines the structure of CMIP5 multi-model-mean total cloud fraction error over a 25-year period, concluding that the “highly autocorrelated lag-1 error (R = 0.97) implies that systematic cloud effects remain in the error residual. This in turn indicates that the CSIRO GCM systematically misrepresented the terrestrial cloud cover.”

While modelling errors reaching beyond 10% are certainly disturbing, what strong lag-1 autocorrelation in a short error-record doesn’t resolve, however, is whether this is due to a truly deterministic system-specification bias or simply the result of failing to model random climate variations with longer periods, such as the ~60yr-oscillation evident in many indices–or perhaps even energetic, low-frequency “red noise.” In any event, it points strongly to the conclusion that intimately-related errors in modeled LW emissions and surface temperatures are NOT stochastically independent year to year. Thus whatever modeling error there may be in the long run, it CANNOT be legitimately propagated as the square root of the cumulative sum of the variance of average yearly errors.

The truly dismal feature of the presentation here is how many fail to realize that stochastic independence at each step is mathematically necessary for Frank’s error formulation to hold.
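The statistical identity 1sky1 appeals to here can be sketched in a few lines: the variance of a sum is the sum of all pairwise covariances, so the familiar root-sum-square result is the special case of zero correlation between terms. The snippet below is only an illustration under stated assumptions. The function name `sd_of_sum`, the AR(1)-style correlation `rho**|i-j|`, and the values of `sigma`, `n` and `rho` are all hypothetical and are not taken from the paper or from any GCM. It does not settle whether the ±4 W/m² statistic should be treated this way, which is the point in dispute.

```python
import math

def sd_of_sum(sigma, n, rho):
    """SD of the sum of n terms, each with SD sigma, assuming the
    correlation between terms i and j is rho**|i-j| (hypothetical)."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += sigma * sigma * rho ** abs(i - j)
    return math.sqrt(total)

sigma, n = 1.0, 100                       # arbitrary illustrative values
independent = sd_of_sum(sigma, n, 0.0)    # reduces to sigma * sqrt(n)
partial = sd_of_sum(sigma, n, 0.97)       # strongly correlated terms
fully = sd_of_sum(sigma, n, 1.0)          # reduces to sigma * n

print(f"rho = 0.00: {independent:.1f}")
print(f"rho = 0.97: {partial:.1f}")
print(f"rho = 1.00: {fully:.1f}")
```

With zero correlation the result is the root-sum-square; with positive correlation the sum spreads faster, up to linear growth when the terms are perfectly correlated.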

kribaez
Reply to  1sky1
September 27, 2019 3:10 am

An excellent comment that gets close to the heart of the problem.

Dr Franks is making the mistake of believing that there is a one-suit-fits-all recipe for uncertainty propagation from a calibration error, irrespective of the physical system under study. One simple test to run is to assume that the calibration error in LW cloud forcing is made up of a systemic bias error and an annually imposed random error independently drawn each year from a pdf. It is easily shown that neither type of error translates into the temperature series as an integrating error.

The lag-1 autocorrelation in “LW cloud forcing” is actually expected and predictable from the physics/mathematics of the problem and arises from the relationship between cloud feedback and temperature. It does not imply that LW cloud forcing itself varies as an integrated series; even less does it imply that the total forcing propagates uncertainty as an integrated series. I will try to explain this further in another post.

1sky1
Reply to  kribaez
September 27, 2019 5:14 pm

Unfortunately, the “one-suit-fits-all recipe” for solving very difficult real-world problems is a common academic boondoggle. Instead of admitting any scientific inability, attention is often directed to some aspect of the problem for which the solution is well known.
That is egregiously the case in “climate science,” which places excessive emphasis on idealized treatment of purely radiative processes in the atmosphere, while soft-pedaling the dominant mechanism of moist convection in transferring heat from the aqueous planetary surface. The ensuing cloud formation not only influences LW fluxes, but strongly modulates the surface insolation–the real exogenous forcing of the system.

To keep matters in realistic perspective, please note that the lag-1 autocorrelation referred to here pertains not to any “LW cloud forcing,” per se, but to the cloud fraction, usually reported in oktas.

Reply to  1sky1
September 28, 2019 3:36 pm

1sky1, “Unfortunately, the “one-suit-fits-all recipe” for solving very difficult real-world problems is a common academic boondoggle.”

Another inadvertent descent into ironic humor.

Supposing heavily parameterized engineering models are predictive well beyond their calibration bounds.

Reply to  kribaez
September 28, 2019 3:32 pm

kribaez, “An excellent comment that gets close to the heart of the problem.”

Yeah. The heart of the problem is that none of you know anything about uncertainty analysis.

Dr Franks ….

Frank, not Franks. But in any case, please call me Pat

… is making the mistake of believing that there is a one-suit-fits-all recipe for uncertainty propagation from a calibration error, irrespective of the physical system under study.

That’s truly funny, kribaez. Linear sums always follow linear propagation of error, no matter the physical system under study.

GCM air temperature projections are linear sums of linearly extrapolated fractional GHG forcing. That’s a qed.

One simple test to run is to assume that the calibration error in LW cloud forcing is made up of a systemic bias error and an annually imposed random error independently drawn each year from a pdf. It is easily shown that neither type of error translates into the temperature series as an integrating error.

The usual, refractory, and very tedious mistake of thinking uncertainty is error.

You wrote, “The lag-1 autocorrelation … arises from the relationship between cloud feedback and temperature.”

The lag-1 autocorrelation is in the error due to the difference between observed and simulated CF. It doesn’t matter how it arises. It reveals the structure of the simulation error.

You wrote, “It does not imply that LW cloud forcing itself varies as an integrated series; …”

No, it implies that GCMs do not simulate cloud fraction correctly and that the error includes a linearly deterministic residual.

… even less does it imply that the total forcing propagates uncertainty as an integrated series.

Irrelevant. The lag-1 auto-correlation is never taken to imply that.

I will try to explain this further in another post.

Let’s see: that will be a further explanation of an argument starting from a mistake and extending into an irrelevance.

Reply to  1sky1
September 27, 2019 4:35 am

Please also see my Sep27 4:24am https://wattsupwiththat.com/2019/09/19/emulation-4-w-m-long-wave-cloud-forcing-error-and-meaning/#comment-2807229 , which I wrote before seeing your comment here. You augment my general argument with reasons why autocorrelations/covariances almost certainly exist in the climate system, which affect both the estimation of parameters and the future evolution of model runs.

Reply to  1sky1
September 28, 2019 3:15 pm

1sky1, “The muddling of the meaning of the empirical estimate of the rms error in modelling ToA planetary LW emissions…”

There was no such muddling. Evidence posted here.

1sky1, “The assumption is made that it’s a systematic specification error, analogous to a mis-scaled ruler used for measurement, with a different ruler used every year in a whole chain of mis-measurements.”

Your description is not correct.

The analogy might be the *same* mis-scaled ruler used to make a whole series of measurements. That ruler is from a production run of rulers that are known to be mis-scaled. However, the error in any given ruler is unknown.

All you have is a calibration average of measurement uncertainty for that production run of rulers. You then use one of them. You have no idea of its specific measurement error. You only have the average calibration error statistic for the production run.

Then you take a series of measurements using that ruler of unknown specific measurement error, and sum them all up into a final total.

The uncertainty is present in every single measurement. If the uncertainty is (+/-)1 mm in measurement 1, then it is (+/-)1.41 mm in the sum of measurement 1+ measurement 2. And so it goes.

The uncertainty in the summed total is the rss of the uncertainty in every individual measurement made. That is error propagation. It is done using the calibration error statistic of the production run.
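The root-sum-square arithmetic described above can be checked with a minimal sketch. The helper name `rss` and the ten-measurement case are illustrative assumptions; only the (+/-)1 mm per-measurement figure and the two-measurement sum come from the comment.

```python
import math

def rss(uncertainties):
    """Root-sum-square of a list of per-measurement uncertainties."""
    return math.sqrt(sum(u * u for u in uncertainties))

# Each measurement carries (+/-)1 mm, as in the ruler example.
two = rss([1.0, 1.0])
ten = rss([1.0] * 10)     # hypothetical longer series

print(f"sum of 2 measurements:  (+/-){two:.2f} mm")
print(f"sum of 10 measurements: (+/-){ten:.2f} mm")
```

For equal per-measurement uncertainties u, the combined value grows as u times the square root of the number of measurements.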

A much more involved problem, and a much worse problem than you allow, 1sky1.

You wrote, “But rms error specifies only the variability—not the bias–of model output.

No, it does not.

The rms calibration error specifies the difference between simulation and observation: accuracy. The variability of model output is simulation 1 minus simulation 2: precision.
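The distinction drawn here between calibration error (simulation minus observation) and inter-run variability (simulation minus simulation) can be illustrated numerically. The snippet below is a sketch with made-up numbers: `obs`, `sim_a` and `sim_b` are hypothetical and come from no model or dataset. It also checks the standard identity that the squared rms error equals the squared mean error plus the population variance of the errors, so an rms error computed against observations includes any mean offset.

```python
import math

def errors(sim, obs):
    return [s - o for s, o in zip(sim, obs)]

def rmse(sim, obs):
    errs = errors(sim, obs)
    return math.sqrt(sum(e * e for e in errs) / len(errs))

def mean_error(sim, obs):
    errs = errors(sim, obs)
    return sum(errs) / len(errs)

def error_sd(sim, obs):
    """Population standard deviation of the errors."""
    errs = errors(sim, obs)
    m = sum(errs) / len(errs)
    return math.sqrt(sum((e - m) ** 2 for e in errs) / len(errs))

obs = [10.0, 12.0, 11.0, 13.0]      # hypothetical observations
sim_a = [13.0, 16.0, 14.0, 18.0]    # hypothetical simulation run 1
sim_b = [13.5, 15.5, 14.5, 17.5]    # hypothetical simulation run 2

accuracy = rmse(sim_a, obs)         # simulation vs observation
precision = rmse(sim_a, sim_b)      # simulation vs simulation

print(f"rms error vs observations: {accuracy:.2f}")
print(f"rms spread between runs:   {precision:.2f}")
print(f"mean error: {mean_error(sim_a, obs):.2f}, "
      f"error SD: {error_sd(sim_a, obs):.2f}")
```

In this made-up case the two runs agree closely with each other while both sit well away from the observations, which is the accuracy versus precision contrast being described.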

You have made the fatal mistake that is standard among climate modelers. You folks exhibit no understanding of physical error analysis.

You wrote, “And no GCM operates by randomly changing its computational algorithms…”

Not an assumption in my work, nor an assumption in determining calibration error.

You’re going further and further afield into mistaken territory, 1sky1.

You wrote, “Nor do GCMs, run in climate simulation mode, make any calculations based upon assimilating data about the actual state of climate.”

They’re merely based upon parameters derived from assimilated data about the actual state of the climate over the calibration period.

And those parameters, which produce wrong simulations in the calibration period, are supposed to produce accurate simulations in a futures projection. Great thinking.

You wrote, “They run the same algorithms with the same set parameters as in the final “calibration” phase.”

Perfect. And those parameters are found to produce simulation errors when compared with observations over the calibration period. The rms of those errors across models and across the calibration period establish a lower limit of model resolution.

You wrote, “The error of such model simulations is entirely unrelated to those of bona fide time-series predictors, which indeed grows with time to an asymptotic value set by total signal auto-covariance.”

But growth of uncertainty is not about growth of error, 1sky1. It’s about the reliability of the expectation value.

It’s about not knowing where the simulated climate state is with respect to the physically correct climate state in the climate phase-space.

You wrote, “what strong lag-1 autocorrelation in a short error-record doesn’t resolve, however, is whether this is due to a truly deterministic system-specification bias or simply the result of failing to model random climate variations with longer periods, such as the ~60yr-oscillation evident in many indices–or perhaps even energetic, low-frequency “red noise.””

Except that the error is a 25 year average per model. Random effects will have diminished by 5-fold.

Further, we know that the average TCF error over all 27 models and 20 years = 540 model years, is (+/-)12.1 %, indicating the persistence of error through a large aggregate.

One gets the same fractional CF simulation error whether averaging the per-model CF simulation error, or taking the difference between observed cloud fraction and the average simulated CF of the entire 12 model ensemble discussed in Jiang, et al., (2012) Evaluation of cloud and water vapor simulations in CMIP5 climate models using NASA “A-Train” satellite observations JGR 117, D14105; doi: 10.1029/2011jd017237.

That is, the error fraction does not diminish when all the model simulations are averaged. Did the simulation errors have a random component, the fractional CF error would be reduced in the simulation average.

Nor is it expected of random error to find large inter-model correlations of error in 25-year error means.
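The “5-fold” figure follows from the usual rule that the mean of N independent, zero-mean errors has a standard deviation smaller by a factor of the square root of N, and the square root of 25 is 5. The sketch below checks that rule by simulation and contrasts it with a constant offset, which averaging leaves unchanged. It is only an illustration: the function name `mean_error_sd`, the Gaussian error model, the seed and the `offset` value are assumptions, not anything taken from the paper or from Jiang et al.

```python
import math
import random

def mean_error_sd(n_years, trials=5000, sigma=1.0, seed=0):
    """SD of the n-year mean of independent, zero-mean yearly errors."""
    rng = random.Random(seed)
    means = [
        sum(rng.gauss(0.0, sigma) for _ in range(n_years)) / n_years
        for _ in range(trials)
    ]
    centre = sum(means) / trials
    return math.sqrt(sum((m - centre) ** 2 for m in means) / trials)

expected = 1.0 / math.sqrt(25)      # fraction of the yearly SD that survives
simulated = mean_error_sd(25)
print(f"expected {expected:.3f}, simulated {simulated:.3f}")

# A constant (systematic) offset is not reduced by averaging.
offset = 3.0                        # hypothetical fixed error, arbitrary units
mean_with_offset = sum(offset for _ in range(25)) / 25
print(f"25-year mean of a constant offset: {mean_with_offset}")
```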

You wrote, “the conclusion that intimately-related errors in modeled LW emissions and surface temperatures are NOT stochastically independent year to year.”

This conclusion follows your hand-waving dismissal of inter-model correlation of TCF error.

Apart from that bit of negligence, the uncertainty analysis is not about error. You can’t know how error behaves in a futures projection. Your argument here is utterly irrelevant to an uncertainty analysis.

You wrote, “Thus whatever modeling error there may be in the long run, it CANNOT be legitimately propagated as the square root of the cumulative sum of the variance of average yearly errors.”

Whatever modeling error there may be in the long run [in a futures projection] will always be unknown. There’s no point in talking about how unknown quantities behave.

The calibration rmse is not an “average yearly error” as you have it. It is the average uncertainty to be expected across every single year of a simulation. Add up the years, the uncertainties combine as the rss.

It’s standard analysis that invariably escapes climate modelers.

You wrote, “The truly dismal feature of the presentation here is how many fail to realize that stochastic independence at each step is mathematically necessary for Frank’s error formulation to hold.”

Rather, the truly dismal failure here is how many who suppose they are scientists know nothing about physical calibration error analysis.

My analysis is about uncertainty, not error. It presumes nothing about the structure of simulation error, including nothing about its stochastic independence.

Uncertainty is about the reliability of a predicted result for which no specific error magnitude is available.

Over and yet over again you miss that critically central point.

1sky1
Reply to  1sky1
September 29, 2019 4:35 pm

All you have is a calibration average of measurement uncertainty for that production run of rulers. You then use one of them. You have no idea of its specific measurement error. You only have the average calibration error statistic for the production run.

Then you take a series of measurements using that ruler of unknown specific measurement error, and sum them all up into a final total.

The uncertainty is present in every single measurement. If the uncertainty is (+/-)1 mm in measurement 1, then it is (+/-)1.41 mm in the sum of measurement 1+ measurement 2. And so it goes.

The uncertainty in the summed total is the rss of the uncertainty in every individual measurement made. That is error propagation. It is done using the calibration error statistic of the production run.

A much more involved problem, and a much worse problem than you allow, 1sky1.

No tired repetition of basic misconceptions can provide any fitting explanation of actual uncertainty in GCM modelling of climate. The key misconception here is that the 4 W/m^2 rms error in the all-sky LW ToA emissions estimated for an aggregate average of model runs over 20 years of annually averaged data constitutes a static “calibration error,” one that compounds every year, simply because the data has been averaged yearly. That’s as ludicrous as claiming that a thermometer with a simple scaling factor of 0.98 that reports an average temperature of 93F for 3 minutes on a hot summer day compounds its uncertainty every 3 minutes.

The melange of similar misreadings and misdirections promoted here does not merit even a minute’s distraction from football viewing and Oktoberfest.

Reply to  1sky1
September 29, 2019 5:01 pm

sky:

“That’s as ludicrous as claiming that a thermometer with a simple scaling factor of 0.98 that reports an average temperature of 93F for 3 minutes on a hot summer day compounds its uncertainty every 3 minutes.”

We are actually being asked to believe that the thermometer can show the temperature as 93.00F. For it to happen that the temperature in the summer over a 3 minute interval with changing wind, humidity, and sun incidence is constant to a hundredth of a degree is proof itself that error is compounding somewhere in the measurement unit. It might be in the thermal inertia in the measurement housing and other infrastructure or in the thermal inertia of the measuring device (i.e. thermistor, etc) itself.

Now, if you want to argue that our measurement infrastructure is only accurate to the nearest degree F then heck, I’ll be right in there with you. But that assumption alone invalidates the outputs of the climate models in trying to forecast temperature differences in tenths or hundredths of a degree!

1sky1
Reply to  1sky1
September 30, 2019 12:53 pm

Totally missing the point about specious specification of the compounding interval of presumed scaling factor uncertainty in time-series modeling is no way to carry on a logical discussion. The accuracy of in situ measurements is not the issue here.

kribaez
September 27, 2019 4:55 am

Dr Franks,
You wrote:-
“The paper presented a GCM emulation equation expressing this linear relationship, along with extensive demonstrations of its unvarying success.

In the paper, GCMs are treated as a black box. GHG forcing goes in, air temperature projections come out. These observables are the points at issue. What happens inside the black box is irrelevant.

In the emulation equation of the paper, GHG forcing goes in and successfully emulated GCM air temperature projections come out. Just as they do in GCMs. In every case, GCM and emulation, air temperature is a linear extrapolation of GHG forcing.”

No, what happens in the black box is not irrelevant.

I was asked in a previous thread if I accepted your emulation model, and responded that I would accept it if you really had found a source of integrated error in forcing, but that I do not believe that you have. The most disturbing thing about your paper is that you seem to believe that there is a recipe for dealing with a calibration error which applies irrespective of the physical system being examined. My previous challenges to you on this subject have evidently not moved you from your position, so I believe that I am going to have to dissect your emulation model in a lot more detail, and I will do so in a follow-up post, but I would first like to offer you a simple analogy to demonstrate the importance of the physical model to uncertainty propagation.

A building company wants to build 42 identical houses. One of the tasks is to measure and mark the locations of 60 rafters along a wall-plate which abuts a vertical wall. The specification is that the centre points of each rafter should be 41cms apart. The architect (who is not very bright) sets out instructions that the builders should carefully cut a batten measuring 41 cms long. They should then place an offcut from one of the rafters against the vertical wall to mark the edge of the first rafter. After that, they should lay the measuring batten against the first mark and then mark off the edge of the second rafter, and then repeat for the third, and so on.
After the wall plates are marked off, and the first five rafters have been installed, a surveyor tests the location of the 3rd, 4th and 5th rafters for all 42 houses, and calls an emergency meeting with the site foreman. According to the surveyor’s calculations, the rafters from all 42 houses show uniformly distributed errors of (+/-)5mm. He assumes a Uniform distribution, with mean zero and variance 8.33. He also assumes that the architect’s instructions have been followed exactly, which means that he is dealing with an integrating series. Consequently, he calculates that the (additional) variance arising from the addition of a further 55 rafters will be 55 * 8.33, yielding a horrifyingly large sd of over 22mm – an expression of uncertainty in location of the final rafter. The foreman (who is a lot brighter than the architect) tells him not to worry. The error will not be more than (+/-) 5mm in the 60th rafter.

The site foreman has been putting in rafters for many years, so he understands why it is truly stupid to mark off intervals in a way that integrates error, so he ignored the architect’s instructions. What his teams actually did was to lay a 30m rule along each wall plate, and mark off the points at 41cms, 82cms, 123cms, etc. The errors found by the surveyor comprised a bias error of up to (+/-) 4mm (because the end wall – the starting point for the measurements – is never a perfect plane) plus a marking error of (+/-)1mm. By marking off the cumulative distances, the foreman had eliminated the integrating error.

The point of this parable is that you cannot dissociate uncertainty propagation from the physical problem involved. The surveyor’s calculations were reasonable for his assumptions, but he was trying to solve the wrong physical problem.
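The surveyor’s arithmetic, and the contrast with the foreman’s method, can be reproduced with a short sketch. A uniform error of (+/-)5mm has variance 10²/12, about 8.33 mm². Chaining independent steps adds variances, so 55 further steps give an sd of about 21.4mm and all 60 steps about 22.4mm (the “over 22mm” figure corresponds to the 60-step case). The code is illustrative only: the function names are hypothetical, and the tape-measure case is simplified to a single (+/-)5mm draw per mark rather than the separate 4mm bias and 1mm marking errors of the parable.

```python
import math
import random

HALF_WIDTH_MM = 5.0                            # surveyor's (+/-)5 mm
VAR_UNIFORM = (2 * HALF_WIDTH_MM) ** 2 / 12    # about 8.33 mm^2

def chained_sd(n_steps):
    """SD of the last mark if every step adds an independent error."""
    return math.sqrt(n_steps * VAR_UNIFORM)

def simulated_sd(n_steps, chained, trials=5000, seed=1):
    """Monte Carlo SD of the position error of the final mark."""
    rng = random.Random(seed)
    finals = []
    for _ in range(trials):
        if chained:
            # Batten method: every step inherits all earlier errors.
            err = sum(rng.uniform(-HALF_WIDTH_MM, HALF_WIDTH_MM)
                      for _ in range(n_steps))
        else:
            # Tape method (simplified): one error per mark, no carry-over.
            err = rng.uniform(-HALF_WIDTH_MM, HALF_WIDTH_MM)
        finals.append(err)
    mean = sum(finals) / trials
    return math.sqrt(sum((e - mean) ** 2 for e in finals) / trials)

print(f"variance of one step: {VAR_UNIFORM:.2f} mm^2")
print(f"chained, 55 steps: {chained_sd(55):.1f} mm")
print(f"chained, 60 steps: {chained_sd(60):.1f} mm")
print(f"simulated chained, 60 steps: {simulated_sd(60, True):.1f} mm")
print(f"simulated tape, 60th mark:   {simulated_sd(60, False):.1f} mm")
```

The same per-mark error statistic gives very different uncertainty in the final position depending on whether the marking procedure carries errors forward, which is the distinction the parable turns on.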

This is what you are doing with your calculation procedure. I will try in my next post to explain with a more relevant physical model why what you are doing is conceptually invalid.

Reply to  kribaez
September 28, 2019 3:55 pm

I’m not going to go through your numbers, kribaez, but your rafter story makes the wrong analogy.

In your story, the error is known and correctable. In a futures projection, the first is not known and the second is not possible.

A better analogy would be if your contractor had a set of different runs of rafters cut in different lumber yards. Each run is cut to a (+/-)5 mm specification, but each run of rafters has some specific error of unknown magnitude.

The mean of errors is not known to be zero, either within runs or between them.

Your contractor gets rafters of the different runs all mixed together, not knowing which rafter is from which run and not knowing the specific errors.

He calculates an uncertainty in combined final rafter length if he proceeds under those conditions. How does he do it?

kribaez
Reply to  Pat Frank
September 29, 2019 12:45 am

Pat,
“How does he do it?”
Well, yes, he will have to calculate the uncertainty in total length as an integrating series. A classic unit root problem in the underlying statistical model.
But rafters are normally laid spaced out side-by-side at fixed intervals. I was specifically trying to distinguish between an integrating series problem in future (spatial) prediction and a problem where all marked points were based on already accumulated spatial error. Two different problems on the same physical system, but with quite different outcomes in terms of uncertainty calculation for the location of the final rafter.
(It is also a useful lesson for all DIY enthusiasts if they are trying to put screwholes in walls at equal intervals!)

kribaez
September 27, 2019 8:14 am

Dr Frank,
“In the emulation equation of the paper, GHG forcing goes in and successfully emulated GCM air temperature projections come out. Just as they do in GCMs. In every case, GCM and emulation, air temperature is a linear extrapolation of GHG forcing.”
I would like here to discuss some of the inadequacies of your model for what you are trying to do, and to try to demonstrate why there is no basis for treating a calibration error in LW cloud forcing as an integrated time series error in Forcing.

There is no doubt that the majority of GCMs can be faithfully emulated at the aggregate level as Linear Time Invariant systems.
They adhere well to the conservation equation:-
Net flux = Forcing – Restorative Flux
The simplest LTI that can be fitted to GCM results is the single-body constant linear feedback equation. For a fixed step forcing F, in Watts/m2, this can be written as: –
Net flux = CdT/dt = F – λT (1)
Where:
Net flux is the difference between incoming and outgoing flux (positive downward by convention)
λ is the total feedback (always > 0) in Watts/m2/K
T is the change in temperature from initial
C = heat capacity of the system in Watt-years/m2/K
and t = time in years
Since this model is linear in T, then it can be solved for an arbitrary forcing series by means of convolution or superposition. I will use the latter.
The solution of the above equation for a fixed step forcing is given by
T = (F/ λ) *(1 – exp(-t/ τ)) where τ = C/ λ (2)
From the discretized superposition equation, for time increments of Δt, we can obtain the solution in recursive form for the nth timestep as:
Tn = F(tn)/ λ *(1 – exp(- Δ t/ τ)) + Tn-1 * exp(- Δ t/ τ) (3a)
Net flux at time tn = F(tn) – λTn (3b)
Energy gain of the system = CT (3c)
Where Tn is the temperature gain from t = 0 to the nth timestep, and F(tn) = the cumulative forcing to the nth timestep.
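As a rough illustration, equations (3a) and (3b) can be written as a short recursion. The function names and the parameter values used in the example call (λ = 1.25 W/m2/K, C = 10 W-yr/m2/K, a 4 W/m2 step) are hypothetical choices, not values taken from any GCM.

```python
import math

def lti_response(forcing, lam, heat_capacity, dt=1.0):
    """Temperature gain at each timestep from Eq. 3a.

    forcing: cumulative forcing F(tn) at each step, W/m2
    lam: total feedback, W/m2/K; heat_capacity: C, W-yr/m2/K
    """
    tau = heat_capacity / lam
    decay = math.exp(-dt / tau)
    temps, t_prev = [], 0.0
    for f in forcing:
        # Eq. 3a: relax towards F/lam, carrying over the previous temperature
        t_prev = f / lam * (1.0 - decay) + t_prev * decay
        temps.append(t_prev)
    return temps

def net_flux(forcing, temps, lam):
    """Eq. 3b: net flux = forcing minus restorative flux."""
    return [f - lam * t for f, t in zip(forcing, temps)]

# Hypothetical step forcing of 4 W/m2 held for 200 years
step = [4.0] * 200
temps = lti_response(step, lam=1.25, heat_capacity=10.0)
print(temps[-1], net_flux(step, temps, 1.25)[-1])
```

For a fixed step the recursion reproduces equation (2) at every timestep, and the net flux decays towards zero as the temperature approaches F/λ.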
I do not commend this LTI model as the BEST emulator of GCMs. I do commend it as a much better emulator than your model, and I will try to highlight the three main reasons.
(i) Your emulation model is not actually a single emulation model for each GCM. Your best fitted parameter values are dependent on the frequency content of the input forcing series. This is why your parameter values change (for the same GCM) when you go from scenario to scenario. This problem becomes more evident when you move away from the near monotonic increases in forcing which form the subject of your comparisons. The LTI model does not have this problem.
(ii) Throughout your writings, you have failed to distinguish between an error in flux, an error in net flux (balance) and an error in forcing. Indeed, your emulation model encourages you to treat all flux errors as errors in forcing, since you do not have the degrees of freedom to deal with any other type of error. You have commented, correctly, that an error in flux still means that the climate energy state is incorrect. However, that translates into a propagating error in the response to a forcing, particularly feedback, not into an integrating series error in the forcing itself. The LTI model allows one to calculate net flux as an entity distinct from forcing, and hence allows some intelligent discrimination between these different types of error.
(iii) The LTI model has the property that (a) solutions are additive and (b) the solution for a linear increase in forcing asymptotes to a straight line increase in temperature, properties that are seen in the majority of GCMs. Starting with the LTI model, it is therefore simple to show why it appears that T varies linearly with F(t) if F(t) is a near linear function of time. You cannot however get from your model to the LTI model. In other words, it offers a more general solution than does your emulator, and better reflects the aggregate calculation of the GCMs.
So we can now run some simple tests on the LTI model.
Firstly, the 500 years of spin-up.
How does a (+/-) 4 W/m2 error in cloud flux propagate during the spin-up? Answer is that, in the worst case, it reflects ab initio an uncompensated error in the flux balance, which acts like an initial forcing on the system. The temperature of the system rises/falls if the flux error is positive/negative by exactly (error/lambda) deg K until the net flux is reduced to zero. The systemic error or bias in cloud flux will remain in the system, but introduces no uncertainty into the long-term net flux, which will always go to zero. The temperature change is bounded. In practice, because all forcing runs will subtract the starting temperature to calculate the incremental temperature change, the effect of this temperature change is limited. It does however change the feedback response of the system, which clearly cannot be perfect if the internal components of the energy balance are not correct. We can examine the propagating characteristics of such error in a moment.
What happens if a stochastic uncertainty is added annually to the bias error in cloud flux? Answer is not a lot if they are all independently sampled and added to the bias. They do not represent an integrating error in propagation. I set the cloud flux to a bias error of 4 W/m2 and each year added an uncompensated error drawn from a normal distribution with mean 0 and sd of 0.3. It can be seen from the recursive formula (Eq3a) that the new temperature is dependent on the new forcing increment (which includes this error) as well as the temperature from the previous time-step (which includes all previous errors). The long-term effect is (only) a fluctuation of net flux about zero with a sd less than the sd of the cloud flux error and a compensatory fluctuation of temperature about the system’s stable steady-state value. If, on the other hand, I add each randomly drawn annual sample error to the previous value then the problem explodes very rapidly, as one would expect with an integrating time series. There is however no justification for this.
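The experiment described above can be sketched as follows. The bias of 4 W/m2 and the annual sd of 0.3 are the values quoted in the text; the feedback and heat capacity values, the run lengths and the function names are hypothetical, so the numbers printed are only indicative.

```python
import math
import random
import statistics

LAM = 1.25   # feedback, W/m2/K (hypothetical value)
CAP = 10.0   # heat capacity, W-yr/m2/K (hypothetical value)

def spin_up(years, seed, integrate, bias=4.0, sd=0.3):
    """Run Eq. 3a with an annual cloud-flux error; return (final T, net fluxes)."""
    rng = random.Random(seed)
    decay = math.exp(-LAM / CAP)          # exp(-dt/tau) with dt = 1 year
    temp, walk, fluxes = 0.0, 0.0, []
    for _ in range(years):
        eps = rng.gauss(0.0, sd)
        walk += eps
        # Independent case: this year's draw only. Integrated case: running sum.
        err = bias + (walk if integrate else eps)
        temp = err / LAM * (1.0 - decay) + temp * decay   # Eq. 3a
        fluxes.append(err - LAM * temp)                   # Eq. 3b
    return temp, fluxes

def final_temp_spread(integrate, runs=200, years=200):
    """Standard deviation of the final temperature across independent runs."""
    finals = [spin_up(years, seed, integrate)[0] for seed in range(runs)]
    return statistics.pstdev(finals)

_, flux = spin_up(5000, seed=0, integrate=False)
print("net flux sd, independent errors:", statistics.pstdev(flux[500:]))
print("final T spread, independent:", final_temp_spread(False))
print("final T spread, integrated: ", final_temp_spread(True))
```

With independent annual draws the net flux fluctuates about zero with an sd below 0.3 and the final temperature stays close to bias/λ. When each draw is added to the previous value, the spread of the final temperature across runs keeps growing with run length.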
What is worth noting is that this combination of systemic bias error and random annual error would be large visible contributors to the calibration error which you identify, but as indicated above, they have very little effect on the uncertainty of temperature projection post-spin-up, apart from feedback error.
I will deal with the question of why the LW cloud flux forcing shows a high degree of autocorrelation and the related question of feedback error in a separate post, since this is already too long for comfort.

Reply to  kribaez
September 28, 2019 6:22 pm

kribaez, emulation eqn. 1 only shows that GCM air temperature projections are linear extrapolations of GHG forcing. It emulates output.

I’m not sure what you mean by GCM “aggregate level” and so will ignore that.

You wrote, “Your emulation model is not actually a single emulation model for each GCM.”

It is an emulator for each air temperature projection of each GCM.

You wrote, “This problem becomes more evident when you move away from the near monotonic increases in forcing which form the subject of your comparisons.”

Figures 3, 7, and 8, as well as SI Figure S4-7, show that eqn. 1 does well reproducing projected GCM air temperatures that reflect the non-linear forcing from volcanic aerosols.

You wrote, “Throughout your writings, you have failed to distinguish between an error in flux, an error in net flux (balance) and an error in forcing.”

I am not concerned with any of those. I am concerned with the uncertainty, not error.

My paper is concerned with uncertainty in simulated tropospheric thermal energy flux, following from the average annual GCM calibration error statistic in simulated cloud fraction (CF). This calibration error puts an uncertainty in the total simulated tropospheric thermal energy flux. It marks a lower limit of GCM resolution.

It’s not about error. It’s about predictive uncertainty.

You wrote, “However, that [incorrect climate state] translates into a propagating error in the response to a forcing, particularly feedback, not into an integrating series error in the forcing itself.”

The cloud feedback modifies the forcing. However, the cloud feedback is uncertain because of the simulation error in cloud fraction (CF). This simulation error puts a continuous average annual (+/-)4W/m^2 uncertainty in the net tropospheric thermal energy flux.

That is, the cloud feedback response is not known to sufficient accuracy to reveal the effect of GHG forcing.

In your 3a-3c, each ΔF(i) of the ΔF(n) is never known to better than (+/-)4 W/m^2 because the simulation error in CF leads to a large uncertainty in cloud response in every single step of a futures simulation. That in turn leads to a growing uncertainty in LWCF. That uncertainty is not the same as a growing error. We do not know the behavior of error in a climate futures projection.

After ’n’ simulation steps of a futures projection, the uncertainty in ΔFn is the rss of the uncertainty in each step.
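The root-sum-square rule stated here is simple to write down. The sketch below assumes the same (+/-)4 W/m^2 applies at every step, in which case the rss reduces to 4*sqrt(n); the helper name is hypothetical.

```python
import math

def rss(uncertainties):
    """Root-sum-square of a list of per-step uncertainties."""
    return math.sqrt(sum(u * u for u in uncertainties))

# Same (+/-)4 W/m^2 in every step: the result is 4 * sqrt(n)
for n in (1, 4, 100):
    print(n, rss([4.0] * n))
```

For n = 4 this gives (+/-)8 W/m^2, and for n = 100 it gives (+/-)40 W/m^2. Whether the rule is the right one to apply to a calibration statistic is exactly the point in dispute in this thread; the code only shows what the rule computes.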

You wrote, “Starting with the LTI model, it is therefore simple to show why it appears that T varies linearly with F(t) if F(t) is a near linear function of time.”

Eqn. 1 reproduces GCM projected ΔT when ΔF(t) is non-linear in time.

You wrote, “How does a (+/-) 4 W/m2 error in cloud flux propagate during the spin-up? Answer is that, … The systemic error or bias in cloud flux will remain in the system, but introduces no uncertainty into the long-term net flux, which will always go to zero.”

You’re treating the uncertainty as an error again. The uncertainty does not go to zero. The error in the observable goes to zero. But it does not go to zero because the simulation is physically correct.

The uncertainty arises because the model deploys a deficient physical theory. That means the physics is not described correctly. The climate state is not described correctly. The cloud response is not described correctly.

The calculated flux has a large implicit uncertainty because of the cryptic error in the description of the state. You don’t know where, exactly, it is — don’t know the flux magnitude to better than (+/-)4 W/m^2.

If the climate state is incorrectly represented, how, then, is it possible to know that the change in CF is correctly simulated, that the forcing change is correctly represented, and that the calculated temperature change is a proper representation of the forcing change?

The physical theory is not good enough to answer any of those questions.

The physical meaning of the calculation is obscure. One can have no confidence in the accuracy of any of the simulated changes in the climate.

The difference in two inaccuracies, each of unknown magnitude, is not an accuracy. It’s not even an error. It’s an uncertainty given by some independently determined calibration statistic.

You wrote, “The temperature change is bounded.”

But its uncertainty is large and hidden.

You wrote, “In practice, because all forcing runs will subtract the starting temperature to calculate the incremental temperature change, the effect of this temperature change is limited.”

And you are subtracting a simulated T_1(+/-)u_1 from a T_2(+/-)u_2, yielding a ΔT_1,2 (+/-)sqrt(u1^2+u2^2) = (+/-)u_1,2 > u1, u2.

See the above about subtracting occult inaccuracies.

It’s not a question of bounded physical error, kribaez. It’s a question of how much you know about the accuracy of the result. In this case, that knowledge is (+/-)u_1,2 > u1, u2.

You wrote, “What happens if a stochastic uncertainty is added annually to the bias error in cloud flux?”

A tendentious question, in that your condition of a _stochastic_ uncertainty determines your answer.

You wrote, “I set the cloud flux to a bias error of 4 W/m2 …”

But the uncertainty is not a bias error. It is a (+/-) interval; one that does not subtract away or imply a mean of 0.

You wrote, “If, on the other hand, I add each randomly drawn annual sample error to the previous value then the problem explodes very rapidly, as one would expect with an integrating time series. There is however no justification for this.”

A GCM calibration error statistic is not a sample error. It is a characteristic of the GCMs. It is a limit on their resolution.

It says a simulation provides no information about the impact of thermal energy flux change that is less than a lower limit of (+/-)4 W/m^2.

No information, kribaez.

One does not know how clouds respond to the forcing because GCMs cannot resolve the change in cloud fraction. Therefore one does not know the change in air temperature.

One can use a physical model to calculate all sorts of detailed things. But if the details are below the model resolution, they have no physical meaning; no significance.

In a homely example, one can use a calculator to multiply two numbers, each with one sig-fig below the decimal, and report it to nine places. The calculator will allow that. But eight of those nine will have no meaning. The resolution of the calculation is one significant figure.

Likewise your models. Their lower limit of resolution of the effects of thermal energy flux is (+/-)4 W/m^2. The effect of any forcing or feedback that is smaller than that, is invisible to the model.

You wrote, in conclusion, “this combination of systemic bias error and random annual error would be large visible contributors to the calibration error which you identify, but as indicated above, they have very little effect on the uncertainty of temperature projection post-spin-up, apart from feedback error.”

Rather, they have a large impact on the uncertainty of the temperature projection.

Even in the spin-up base climate state, you do not know that the initial spin-up air temperature is a correct representation of the equilibrium climate energy-state. The cloud fraction is wrong. The tropospheric thermal energy flux is wrong. Even if the air temperature is correct due to tuning, it is correct for the wrong reasons.

The underlying physics is not correct. The simulated equilibrium climate state is wrong.

There is an initial uncertainty interval that cannot subtract away. Instead, the incorrect initial state is projected incorrectly further, by way of a deficient physical theory.

It does not matter that GCM parameters have been chosen to reproduce the air temperatures of the calibration period. The spin-up climate state is wrong, the physics is wrong, and all the simulated observables have a large but hidden uncertainty.

[LWCF is not clear: LW (Long Wave) Cloud Fraction? .mod]

kribaez
Reply to  Pat Frank
September 30, 2019 5:24 am

Pat,
Firstly, I would like to apologise for calling you Dr Frank. It was not intentional at all.

Secondly, I would like to thank you for the long responses above. I do not underestimate the massive effort you have put into writing and defending your paper.

Thirdly, I will re-state that I agree with your qualitative conclusion that the GCMs are unreliable, and useless for informing decision-making. My concern here is with your methodology.

On many occasions, I have (professionally) built models of physical systems ranging from simple analytic functions to complex dynamical simulators. Uncertainty analysis for such systems is founded on the fundamental principle that if one can define the joint distribution of inputs, then equi-probable sampling of the input space will yield via the model the joint distribution of the outputs. This output space – defined via the model – is always strictly a conditional joint probability, since it does not and cannot include “model error” which arises from incorrect or incomplete specification of the model.

The above principle forms the foundation for uncertainty analysis.

The inverse problem, where we have a number of uncertain observations in the output space and we wish to narrow down the range of uncertainty on the input parameters which form the joint distribution of inputs, is still governed by the same principle. Typically this is done by frequentist inverse transform, a Bayesian method or brute-force filtering.

It is important to note that a “resolution problem” in the prediction of a key output or a “calibration problem in an input variable” is not hidden from view anywhere in the above foundational principle. On the contrary, sampling from the input distribution should reveal the output uncertainty – sufficient sometimes to justify additional data collection or scrapping the model. You seem to be denying this.

Even with the simplest model, there should be no esoteric existence of an uncertainty which is invisible to sampling via the model. Consider a linear model of the form Y = bX. Sampling of an input distribution of b will yield the correct uncertainty for Y at some value of X. Equally, if X happens to be a time variable, you can calculate the uncertainty in Y as sqrt(X^2 x var(b)). Alternatively, I can propagate an error of sqrt(var(b)) summed over unit timesteps. I will obtain the same answer for all three approaches. There is no hidden, esoteric uncertainty anywhere in this. If, instead, I have a measurement uncertainty in X of (+/-) 2, I can sample from a U(-2,+2) distribution and obtain the uncertainty of Y as being in the range [-2b, +2b]. If I have uncertainty in b AND a measurement uncertainty in X, I can still sample the distribution of b and the error in X to correctly obtain the uncertainty in Y. Once again, there is no hidden esoteric uncertainty in any of this which is not revealed by appropriate sampling or indeed by correctly applied quadrature. There is however a fundamental dependence on the mathematical or the physical model which is under evaluation.
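The Y = bX example can be checked by direct sampling. This sketch assumes a normal distribution for b, which the text does not specify, and uses hypothetical values (X = 10, b = 1.5, sd(b) = 0.2) and hypothetical function names.

```python
import random
import statistics

def sample_y_from_b(x, b_mean, b_sd, n=50000, seed=1):
    """Sample Y = b*X with b drawn from an assumed normal distribution."""
    rng = random.Random(seed)
    return [rng.gauss(b_mean, b_sd) * x for _ in range(n)]

def sample_y_from_x_error(x, b, half_width=2.0, n=50000, seed=2):
    """Sample Y = b*(X + e) with measurement error e drawn from U(-2, +2)."""
    rng = random.Random(seed)
    return [b * (x + rng.uniform(-half_width, half_width)) for _ in range(n)]

ys = sample_y_from_b(x=10.0, b_mean=1.5, b_sd=0.2)
print("sampled sd of Y:", statistics.pstdev(ys))
print("sqrt(X^2 * var(b)):", 10.0 * 0.2)

ys2 = sample_y_from_x_error(x=10.0, b=1.5)
print("range of Y - bX:", min(ys2) - 15.0, max(ys2) - 15.0)
```

The sampled sd of Y should agree with sqrt(X^2 x var(b)) to within sampling noise, and the second set of samples stays inside [-2b, +2b] around bX.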

With large models, more often than not, many methods of dealing with this problem are impractical, and it is not unusual at all to find that engineers will test uncertainty by sampling methods applied to reduced models or emulators, supported by sensitivity tests applied to the main model. For such results to have any meaning, it is of critical importance that the reduced model is adequate to represent the key functional relationships honoured in the main model.

I have tried to point out above that your emulation model is inadequate to the task you are setting it here, because of its inability to distinguish between a component of flux, the net flux (balance) and a forcing. In the main model here (a GCM), these represent three different variables, with different magnitudes, and distinct properties. An uncertainty in a component of flux does not translate into an error in net flux at the end of the spin-up period. The mathematics of the problem will force the net flux into balance (actually with small fluctuations around zero), and the uncertainty in net flux is close to zero. THIS DOES NOT LEAVE AN INVISIBLE UNCERTAINTY IN NET FLUX, and nor does it represent any confusion between error and uncertainty. It is something which is forced by the mathematics of the problem.

The forcings which are subsequently applied are then by definition imposed exogenous changes to the near-zero net flux.

You correctly state that the spin-up leaves a condition where “the spin-up climate state is wrong” or “the simulated equilibrium state is wrong”. Yes, it does, without a doubt. Can we estimate what this does to the uncertainty in temperature projection arising uniquely from the uncertainty in cloud fraction? Only with great difficulty. We can obtain some approximation to it, but it requires a different type of analysis from the one you have carried out when comparing observed to modeled cloud fraction.

The uncertainty in temperature projection arising from the uncertainty in cloud characterisation is almost entirely associated with the uncertainty in the FEEDBACK to net flux, and you can quite legitimately argue that there does exist in this feedback system what you would describe as “a linear propagation of uncertainty”, but (a) it is NOT a linear propagation in time, it is closer to a linear propagation of flux uncertainty with temperature – which makes an enormous difference, and (b) the resulting integrated series (in flux) does not translate the entire calibration error in cloud flux into the feedback error in net flux, since systemic bias in and of itself has little effect on the net flux. What is left, in essence, is a gradient error i.e. the rate of change of feedback flux with respect to temperature.

You will find that this flux feedback error from clouds on its own, when translated reasonably into temperature uncertainty, is more than sufficient to challenge the ability of climate models to project temperatures in any meaningful way. If you had done this, I would applaud your paper. As it is, I believe it is very poorly founded, and I take no pleasure in stating this.

Reply to  kribaez
September 30, 2019 7:35 am

Note 8: mathematics is needed to explore meanings, understanding, and assumptions

I had quit this discussion for two or three days because I was very impressed with what kribaez was writing and was happy to watch. Clearly though, Pat is not. One of the problems is that after using a bit of mathematics, kribaez has moved to a substantial amount of words, which are very meaningful to him, but only somewhat meaningful to me. And then Pat can start arguing with or misunderstanding those words, and then resort to saying that only he understands the difference between error and uncertainty.

I feel it would be valuable to use some simple mathematical models to investigate the crux of Pat’s method and whether it has any predictive value that could feasibly be tested. There are currently 39 uses of the term “random walk” upthread, and the argument was made that Pat’s bounds are not produced by a random walk. I am happy to accept that, but I am more interested in the question “do Pat’s bounds constrain random walks in GCMs and if so can the bounds be tested by running the GCMs many times?” Someone is bound to say “no, there you are, confusing error with uncertainty again”.

Here at least are some questions for Pat which, if he graciously answers them, could get us started.

First, are the widths of the bounds at 2100 (say) pretty independent of the new forcing represented by the Delta-F_i’s? So, if all those Delta’s were zero, would we still get wide bounds? I believe the answer is yes, in which case we can explore the concepts without new CO2 forcing.

Second, if the emulation model was fitted to one particular GCM, rather than the given ensemble, could uncertainty bounds still be derived? I believe the answer is yes, and if so then it means we can explore the concepts without considering inter-model variability.

If those two answers are yes, then consider the following.

M(t) = a M(t-1) + B(t;m,s) (Equation *)

Would Pat’s emulator be of this form with a = 1 and B(t;m,s) a random variable of mean m (probably zero) and variance s^2? If so, would s^2 depend on something like the +/-4 W/m^2 TCF error during the calibration period?

If so, after T years, is the 1-sigma uncertainty estimated as sqrt(Ts^2)?
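Equation (*) is easy to simulate, which may help pin down what is being asked. The sketch assumes B(t;m,s) is normally distributed (the equation only fixes its mean and variance) and uses hypothetical values s = 4 and T = 25; the function name is also hypothetical.

```python
import math
import random
import statistics

def simulate(a, m, s, steps, runs=4000, seed=3):
    """Final values of M(t) = a*M(t-1) + B(t;m,s) over many independent runs."""
    rng = random.Random(seed)
    finals = []
    for _ in range(runs):
        value = 0.0
        for _ in range(steps):
            value = a * value + rng.gauss(m, s)   # B assumed normal
        finals.append(value)
    return finals

walk = simulate(a=1.0, m=0.0, s=4.0, steps=25)
print("a = 1.0 spread:", statistics.pstdev(walk), "vs sqrt(T*s^2):", math.sqrt(25 * 16.0))

damped = simulate(a=0.5, m=0.0, s=4.0, steps=25)
print("a = 0.5 spread:", statistics.pstdev(damped), "vs s/sqrt(1-a^2):", 4.0 / math.sqrt(0.75))
```

With a = 1 the spread across runs follows sqrt(Ts^2). With |a| < 1 it levels off near s/sqrt(1 - a^2) instead, so the value of a matters a great deal for the width of the bounds.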

With simplified mathematics like this we might be able to come to a common understanding on what the paper says, and on what assumptions it has to make to get there. I hope that would help others as well as me.

kribaez
Reply to  See - owe to Rich
September 30, 2019 6:24 pm

” One of the problems is that after using a bit of mathematics, kribaez has moved to a substantial amount of words, which are very meaningful to him, but only somewhat meaningful to me.”

Yes, communication is very difficult, especially between people!

I can perhaps address some of your questions specifically.
“do Pat’s bounds constrain random walks in GCMs and if so can the bounds be tested by running the GCMs many times?” Pat’s time series in Forcing is not strictly a random walk but it is close. Statisticians would describe his timeseries in forcing as “a (nonstationary) integrating series of order one”. Variance is accumulated annually like a random walk, so the latter part of your post is essentially correct i.e. s^2 is derived from the +/- 4W/m2 TCF error and the 1 – sigma uncertainty in FLUX is estimated as sqrt(T x s^2). The second part of your question is what a lot of my “only somewhat meaningful” post was seeking to address. I was trying to emphasise that the type of resolution problem treated by Pat here is not invisible to Monte Carlo sampling. So, yes, IN THEORY, it is possible to test Pat’s uncertainty propagation for a single GCM by running the GCM many times, each time sampling a different input value of total cloud fraction (or LWCF) at the start of the spin-up period, and then capturing the distribution of the projected temperatures at some point in time. The envelope could then be compared with Pat’s projected uncertainty envelope and should be comparable if he is correct. In practice, however, this is impossible to do because each run would take many months. Ultimately, therefore a sampling approach can only be carried out on an emulator of some sort.

“First, are the widths of the bounds at 2100 (say) pretty independent of the new forcing represented by the Delta-F_i’s?” In Pat’s model, yes, the flux bounds are the same, though the temperature bounds will vary with the gradient parameter selected in Pat’s emulator.
“So, if all those Delta’s were zero, would we still get wide bounds?” In Pat’s model, yes. That is why I looked at sampling a U(-4,+4) distribution of cloud flux error on my primitive LTI model to examine the impact on uncertainty in net flux and temperature at the end of the spin-up period.

I hope this helps.

Reply to  kribaez
October 1, 2019 5:50 am

Kribaez,

“So, yes, IN THEORY, it is possible to test Pat’s uncertainty propagation for a single GCM by running the GCM many times, each time sampling a different input value of total cloud fraction (or LWCF) at the start of the spin-up period, and then capturing the distribution of the projected temperatures at some point in time.”

I don’t think I can agree with this. Uncertainty is not a random variable in and of itself and so it doesn’t cancel over many runs or over time. Merely inputting different values at the start of a run doesn’t capture the fact that uncertainty grows over time. If the assumption is that GCMs react in the same manner each time a run is made, then a Monte Carlo analysis doesn’t tell you much about the uncertainty growth over time. It will give you an indication of the relationship between an error in the input and the resultant output of the GCM but that is not a measure of uncertainty. Uncertainty has to do with not knowing the output for sure even with a single input. If the GCM output is deterministic, i.e. reacts the same way each time for the same input, then where does the uncertainty of the output come into play? Uncertainty has to do with the output not being deterministic even with the same input each time.

Reply to  See - owe to Rich
October 1, 2019 6:57 am

kribaez, thank you, that is helpful, especially if Pat concurs. Although it might be a lot of modelling work, perhaps runs covering as few as 4 years would suffice. Those would give 1-sigma uncertainty bounds in the forcing domain of 4*sqrt(4) = +/- 8 W/m^2, which represents quite a large temperature uncertainty, and I hope that would be detectable.

Here is a follow up question. Have model runs already been done which might answer this question? In particular, what is the exact statistical description for the error bounds recorded in Panel A of Figure 6? I previously summarized Figure 6 as follows:

“ September 26, 2019 at 2:03 am
Pat, thanks, so the 1-sigma uncertainty bars in Panel B do relate one-to-one to the bars in Panel A, but they are calculated differently. Now I think the bars in Panel A can be used to predict the spread which would occur if new model runs were made. But your point in Panel B is that those runs could have strayed ever so far from reality because of the physical uncertainty in the parameters of those models. Is that a fair summary?”

This was never answered. It is important because it relates to whether Pat’s bounds do relate to the possible universe of model runs, which we have assumed above.

Reply to  See - owe to Rich
October 1, 2019 8:57 am

Reply to Tim Oct1 5:50am

Tim, your spiel is an example of why I think it is important to argue with mathematics. I understand, though, that some people find it hard to do that. I shall try here to put your words into mathematics and examine the consequences. In my Equation (*) from earlier I am going to take a=1 as read (for now), and use ‘z’ in place of ‘m’ because there are too many m’s; ‘z’ stands for zero, which is the value it might be expected to be. So we have:

M(t) = M(t-1) + B(t;z,s) (*)

Here are 5 points you made, with my riposte.

1. “Uncertainty is not a random variable in and of itself and so it doesn’t cancel over many runs or over time.”

As pointed out before, Pat treats uncertainties as random variables when he takes root mean square. So if u_1 and u_2 are “uncertainties”, real numbers, in combination he uses U(u1,u2) = sqrt(u1^2+u2^2). This only makes sense if u1 is considered to be a standard deviation to a random variable V1, and u2 likewise for an independent r.v V2. And this U expression does encapsulate a modest amount of cancellation, wherein V1 might be +0.303u1 and V2 might be -1.007u2. This is the case of 2 independent rulers being used, once each.

2. “Merely inputting different values at the start of a run doesn’t capture the fact that uncertainty grows over time.”

Why not? Let there be n input values m_1(0),…,m_n(0). Then M_i(1) = m_i(0) + B_i(1;z,s) has mean m_i(0)+z and variance s^2. If we observe the values of M_i(1), call them m_i(1), then a “good” estimate of z is

z* = sum_{i=1}^n (m_i(1)-m_i(0))/n

and a “good” estimate of s^2 is

(s*)^2 = sum_{i=1}^n (m_i(1)-m_i(0)-z*)^2/(n-1)

So, we had no clue how the model was going to evolve over one time step, but after our n trials we do have a clue. Alternatively, we did have a clue, by some theory like Pat’s, that s would be a known function, say s’, of +/- 4 W/m^2. Well, now we can start asking whether s* corroborates that inferred value s’. This is exactly what I am after. As for growing over time, after t steps the 1-sigma uncertainty grows to (s*)sqrt(t), under independence assumptions.
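The estimation step can be sketched in a few lines. The sketch assumes the spread estimate is the usual sample variance of the one-step increments m_i(1) - m_i(0) about their mean z*, and that steps are independent when projecting forward. The function names and the four input pairs are made up for illustration.

```python
import math

def estimate_drift_and_spread(starts, ends):
    """Return (z*, s*) from paired inputs m_i(0) and one-step outputs m_i(1)."""
    n = len(starts)
    increments = [e - s0 for s0, e in zip(starts, ends)]
    z_star = sum(increments) / n
    # Sample variance of the increments about z*, with n-1 in the denominator
    var_star = sum((d - z_star) ** 2 for d in increments) / (n - 1)
    return z_star, math.sqrt(var_star)

def spread_after(s_star, t):
    """1-sigma spread after t independent steps."""
    return s_star * math.sqrt(t)

# Made-up trial data: four starting values and their one-step outputs
z_star, s_star = estimate_drift_and_spread([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 2.0, 6.0])
print(z_star, s_star, spread_after(s_star, 4))
```

The value of s* obtained this way is what would be compared with the s’ inferred from the (+/-)4 W/m^2 figure.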

3. “It will give you an indication of the relationship between an error in the input and the resultant output of the GCM but that is not a measure of uncertainty.”

But m_1(0) was a given, fixed, input, and it can have no error other than in relation to reality which I’ll call R(0), when the error is m_1(0)-R(0). But does Pat’s theory of uncertainty relate to the difference between models and reality at time t, or to the plausible spread of model outputs at time t? I thought it was the latter, in which case input error is meaningless.

4. “Uncertainty has to do with not knowing the output for sure even with a single input.”

True, even though we know m_i(0), m_i(1) is an observation of the r.v. M_i(1) = m_i(0) + B_i(1;z,s).

5. “If the GCM output is deterministic, i.e. reacts the same way each time for the same input, then where does the uncertainty of the output come into play?”

That’s a reasonable question; I wonder if kribaez can answer: do GCM’s use pseudo-random numbers to randomize their runs?

Reply to  See - owe to Rich
October 1, 2019 11:46 am

Rich,

“As pointed out before, Pat treats uncertainties as random variables when he takes root mean square. So if u_1 and u_2 are “uncertainties”, real numbers, in combination he uses U(u1,u2) = sqrt(u1^2+u2^2). ”

When you square the uncertainty values they will always add, never subtract. Then when you take the square root you always get an interval, plus and minus, that was larger than it was. Thus the uncertainty grows. It never cancels like a random variable would.

“This only makes sense if u1 is considered to be a standard deviation to a random variable V1”

Why? The uncertainty is not a random variable. It is an uncertainty in the output. Doing the square simply eliminates the +/-. You then take the square root to get back to a +/- interval. Suppose only u1 has a value that is non-zero. The process simply gives you back what u1 is. It doesn’t convert it to a standard deviation of a random variable. Suppose u1 and u2 are both the same. You wind up with 1.414 * u1 as the new uncertainty interval. If you have 3 successive uncertainties you get 1.73 * u1. The uncertainty grows with each interval. Only if you do something that lessens the uncertainty mid-run will you see the uncertainty go down, e.g. u4 = .9 * u1. I don’t believe any of the GCMs can do anything mid-run to lessen the uncertainty.
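The arithmetic in the paragraph above is easy to check. The helper name `root_sum_square` is made up for this sketch, and the code takes no side on whether the inputs should be read as standard deviations or as plain intervals; it only evaluates the combination rule.

```python
import math

def root_sum_square(uncertainties):
    """Combine +/- values as sqrt(u1^2 + u2^2 + ...)."""
    return math.sqrt(sum(u * u for u in uncertainties))

u1 = 1.0
print(root_sum_square([u1]))           # a single term gives back u1
print(root_sum_square([u1, u1]))       # about 1.414 * u1
print(root_sum_square([u1, u1, u1]))   # about 1.732 * u1
```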

“Why not? Let there be n input values m_1(0),…,m_n(0). Then M_i(1) = m_i(0) + B_i(1;z,s) has mean m_i(0)+z and variance s^2. If we observe the values of M_i(1), call them m_i(1), then a “good” estimate of z is”

You are, once again, assuming uncertainty is a random variable with a mean and a variance. It isn’t. The uncertainty starts at the beginning of the run and unless something happens mid-run to change that value of uncertainty, it never changes, e.g. the +/- 4Wm^2. All that changes is what the total uncertainty becomes, e.g. the root mean square. It always grows, it never decreases.

“So, we had no clue how the model was going to evolve over one time step, but after our n trials we do have a clue.”

No, you don’t. Again, run after run, the GCM is going to give a deterministic output with an uncertainty that grows with each iteration. If with an input of v_1 you get an output of o_1 with an uncertainty of u_1 (an accumulation of error over however many iterations you make) you haven’t changed the amount of uncertainty in the output at all. Change the input to v_2 with an output of o_2 (a second deterministic output of the model) then you *still* have an output uncertainty of u_1.

You can do however many Monte Carlo runs you want to make and you will never lessen the uncertainty associated with the output.

“But m_1(0) was a given, fixed, input, and it can have no error other than in relation to reality which I’ll call R(0), when the error is m_1(0)-R(0). But does Pat’s theory of uncertainty relate to the difference between models and reality at time t, or to the plausible spread of model outputs at time t? I thought it was the latter, in which case input error is meaningless.”

You are making this more complicated than it needs to be. If you have inputs v_1 and v_2 with outputs o_1 and o_2 they are both still subject to the uncertainty associated with the model output. You may get a feel for what an error equal to v1-v2 causes in the output but both are *still* uncertain outputs. In fact, unless v_1 – v2 is greater than the uncertainty how do you even know what the sign of the difference is?

“True, even though we know m_i(0), m_i(1) is an observation of the r.v. M_i(1) = m_i(0) + B_i(1;z,s).”

How do you know this? Unless B_i(1;z,s) is greater than the uncertainty, how do you even know for sure what M_i(1) is?

kribaez
Reply to  See - owe to Rich
October 1, 2019 10:33 am

Tim,
What I am saying is genuinely fundamental. Vasquez and Whiting, whom Pat is citing, make use of Monte Carlo methodology in their paper(s) to estimate uncertainty arising from (both) systematic bias in calibration and random error. Here is a quote from Vasquez, Whiting and Meerschaert 2010:-
“To analyze random and systematic error effects using Monte Carlo simulation, the approach proposed by Vasquez and Whiting (1999) is used, which consists of, first, defining appropriate probability distributions for the random and systematic errors based on evidence found from different data sources. Then bias limits are defined for the systematic errors of the input variables of the model. Samples are drawn using an appropriate probability distribution for the systematic errors if a priori information is available. Otherwise, a uniform distribution is used. For the random errors, samples are taken from each of the probability distributions characterizing the random error component of the input variables and then the samples are passed through the computer model.”

There is no output uncertainty that is invisible to this approach.
And uncertainty in an output variable does not always increase with time. This depends on the problem being solved.

Reply to  kribaez
October 1, 2019 2:36 pm

“There is no output uncertainty that is invisible to this approach.”

Of course there is.

“And uncertainty in an output variable does not always increase with time. This depends on the problem being solved.”

Give me an example.

“Samples are drawn using an appropriate probability distribution for the systematic errors”

The uncertainty Pat used is a constant obtained this way: “We know from Lauer and Hamilton, 2013 that the annual average ±12.1% error in CMIP5 simulated cloud fraction (CF) produces an annual average ±4 W/m^2 error in long wave cloud forcing (LWCF).”

So what the uncertainty becomes over “n” iterations is the sqrt of {[(n) *(n+1)/2 ] * 16}. Do this over 80 years and you get an uncertainty of about +/- 220Wm^2 total uncertainty (I’m doing this in my head so forgive my inaccuracy). That is basically Pat’s Equation 6 in his paper. That’s enough to swamp the ability to tell what is going to happen in 80 years with the resolution required to distinguish a few degrees of warming.
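Since that figure was worked out mentally, here is the same expression evaluated directly. The function name is invented for this sketch; it simply computes sqrt([n(n+1)/2] * 16) as written in the comment and makes no claim about whether that matches the paper's Equation 6.

```python
import math

def accumulated_uncertainty(n, per_step=4.0):
    """Evaluate sqrt([n(n+1)/2] * per_step^2), the expression quoted in the comment."""
    return math.sqrt(n * (n + 1) / 2 * per_step ** 2)

print(round(accumulated_uncertainty(80), 1))  # 227.7
```

For n = 80 the expression gives roughly 228, in line with the "about 220" estimate.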

1sky1
Reply to  See - owe to Rich
October 1, 2019 3:12 pm

[U]ncertainty in an output variable does not always increase with time. This depends on the problem being solved.

Indeed! By mistaking the 20-yr rms error estimate of 4 W/m^2 in modeling the difference between all-sky and clear sky LW emissions (so-called LCF) reported by Lauer and Hamilton as a characterization of a YEARLY increment in a CUMULATIVE sum of variances, Frank unwittingly adopts a totally wrong conception of error propagation in modeling time-series.

In the recursion relationship that represents a developing Markov chain

y(n) = a y(n-1) + b x(n)

a cumulative sum of stochastically independent, gaussian inputs x is obtained with a = b = 1. But then the output y is inherently unstable, wandering off into an ever-widening 1-D random walk. There simply is no evidence that calibrated GCMs behave that way, as Spencer demonstrated convincingly for the pre-industrial control runs in Figure 1 of his column of Sept.12.

Instead of the well-known Wiener process, GCM output and its error is far more closely represented by the Ornstein-Uhlenbeck process, with a < 1. This introduces a dynamically characteristic exponential decay to the effect of input at any time and produces autocorrelated “red” noise when the input is gaussian white noise. Such a system always satisfies the bounded input/bounded output (BIBO) stability criterion. A semblance of such a spectral characteristic is often found in actual geophysical time series. See: https://journals.ametsoc.org/doi/pdf/10.1175/1520-
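The contrast between a = 1 and a < 1 in the recursion y(n) = a y(n-1) + b x(n) can be illustrated with a short simulation. This is a sketch under stated assumptions: x is unit-variance Gaussian white noise, y(0) = 0, and a = 0.9 is an arbitrary choice for the stable case. It says nothing about which case describes a GCM.

```python
import random
import statistics

def theoretical_variance(a, n, b=1.0, sigma=1.0):
    """Variance of y(n) for y(n) = a*y(n-1) + b*x(n), y(0) = 0, x iid with sd sigma."""
    if a == 1.0:
        return n * (b * sigma) ** 2                      # random walk: grows with n
    return (b * sigma) ** 2 * (1 - a ** (2 * n)) / (1 - a ** 2)   # bounded for |a| < 1

def simulate_final_values(a, n, runs, rng, b=1.0, sigma=1.0):
    """Run the recursion `runs` times and return y(n) from each run."""
    finals = []
    for _ in range(runs):
        y = 0.0
        for _ in range(n):
            y = a * y + b * rng.gauss(0.0, sigma)
        finals.append(y)
    return finals

rng = random.Random(1)
for a in (1.0, 0.9):
    finals = simulate_final_values(a, n=100, runs=2000, rng=rng)
    print(a, statistics.variance(finals), theoretical_variance(a, 100))
```

With a = 1 the ensemble variance keeps growing in proportion to n; with a = 0.9 it levels off near b^2 sigma^2 / (1 - a^2).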

Reply to  1sky1
October 1, 2019 4:05 pm

sky:
From the document in your link:

“Consider a discrete time series which depends only on its own immediate past value plus a random component.”

The annual mean of the overall discrepancy of +/- 4Wm^2 is not a “random component”. You can argue about what that mean value should be on an annual basis but it will be difficult to argue that there is no mean (average) value.

Once again, uncertainty is an interval, not a random variable. It only indicates the interval in which you might expect to find the variable but it doesn’t tell you where in that interval it is. Therefore it simply can’t cause a random walk. That uncertainty interval doesn’t take on a random value on each iteration. It isn’t plus one time and negative the next. It’s like describing the eccentric orbit of a satellite. You can certainly calculate the average distance of that satellite and say it is x miles +/- the eccentricity value. That eccentricity value doesn’t change randomly from year to year (theoretically). And it *is* an uncertainty value applied to the average distance of the satellite. If there were a small impetus being applied continuously over time to that satellite to cause it to expand its orbit (i.e. a “forcing”) then the uncertainty interval on the eccentricity would also grow over time. It wouldn’t cause a “random walk” in any way, shape, or form.

1sky1
Reply to  See - owe to Rich
October 1, 2019 5:00 pm

The annual mean of the overall discrepancy of +/- 4Wm^2 is not a “random component”.

A clear reading of everything I wrote here shows that 4Wm^2 is consistently treated as the rms value of the model-average LCF error, i.e. a 20-yr sample estimate of the variability of that random error that specifies a FIXED uncertainty. Pavlovian persistence in claiming otherwise is just throwing more peanut shells on a blast of hot air.

Reply to  1sky1
October 1, 2019 6:00 pm

sky:

“A clear reading of everything I wrote here shows that 4Wm^2 is consistently treated as the rms value of the model-average LCF error, i.e. a 20-yr sample estimate of the variability of that random error that specifies a FIXED uncertainty. Pavlovian persistence in claiming otherwise is just throwing more peanut shells on a blast of hot air.”

Maybe we are talking past each other. Pat is using +/- 4Wm^2 as an interval, not just a positive value of 4Wm^2. That is what taking a square root of an interval causes to happen. You still wind up with an interval.

That fixed uncertainty is an annual value the way Pat has done it. As an annual value it has to be combined over however many iterations are made.

Reply to  See - owe to Rich
October 1, 2019 5:40 pm

“4Wm^2 is consistently treated as the rms value of the model-average LCF error”
Plus it is rms of the error at individual grid points. Yet Eq 1 treats it as the rms error of a global average.

Reply to  See - owe to Rich
October 1, 2019 8:01 pm

From Nick Stokes: “Plus it is rms of the error at individual grid points.

From Lauer and Hamilton, page 3833: “The overall comparisons of the annual mean cloud properties with observations are summarized for individual models and for the ensemble means by the Taylor diagrams for CA, LWP, SCF, and LCF shown in Fig. 3. These give the standard deviation and linear correlation with satellite observations of the total spatial variability calculated from 20-yr annual means. …

In both CMIP3 and CMIP5, the large intermodal spread and biases in CA and LWP contrast strikingly with a much smaller spread and better agreement of global average SCF and LCF with observations …

FIG. 7. Biases in simulated 20-yr-mean LWP from (left) the (top to bottom) four individual coupled CMIP5 models and (middle) their AMIP counterparts, with the smallest global average rmse in LWP. (my bold)”

Wrong again, Nick

kribaez
Reply to  See - owe to Rich
October 1, 2019 8:28 pm

See (and Tim),

“I wonder if kribaez can answer: do GCM’s use pseudo-random numbers to randomize their runs?” No. The model runs are deterministic, which means that if you repeat a run with identical initial conditions and inputs, then you will obtain the same result. Typically, however, the results that you will normally see represent an average of multiple runs (at least 5) for a given forcing scenario on the same model. Each of these runs is “kicked-off” from a different time-point from the spin-up saved results, and therefore each has slightly different initial conditions; each run then yields a slightly different outcome. The fluctuations visible in GCM runs come from the chaotic character of the governing mathematics and not from any input stochastic variation.

None of the above affects what I am saying about the validity of using Monte Carlo (MC) methodology to test for Pat’s uncertainty. If you want a truly simple, but very relevant example, I would suggest that you return to my post above (https://wattsupwiththat.com/2019/09/19/emulation-4-w-m-long-wave-cloud-forcing-error-and-meaning/#comment-2809839) where I discuss the linear relationship Y = bX. It is relevant because Pat has convinced himself that the temperature change from initial can be adequately emulated as a simple linear function of cumulative forcing F(t).

Let me fill in some blanks on this example. We are interested in the uncertainty in Y arising from a calibration problem in b, when X reaches 100. Our best measurements tell us (no more than) that b sits in an interval between 1 and 3, say. We decide to treat b as a random variable with a Uniform distribution given by U(1,3). We can readily calculate that the maximum possible value of Y is then 300 and the minimum value is 100. The range (spread) of possible values is therefore 200, and it happens to be uniformly distributed in this case, so we can say that when X is 100, Y is a RV with a pdf of U(100, 300). We do not know the error in Y, since we do not know the true value of b, but we do know its uncertainty, a point which Pat has made repeatedly.

Alternatively, we can calculate the variance of b and use standard quadrature to calculate the variance in Y at any value of X. Since b is uniformly distributed, its variance is given by (range)^2/12 = 2^2/12, which equals 1/3 in this instance. Its sd is then equal to sqrt(1/3). When X is 100, Var(Y) = 100^2* Var(b) = 100^2*1/3. Since Y is uniform, we can convert this back to a range: range = sqrt (12 * 100^2 * 1/3) = 200. We have the same answer as before.

The third alternative is to run a Monte Carlo which samples b as a RV with distribution U(1,3), and for each realisation, calculates Y = 100b. The resulting values of Y will all sit between 100 and 300 and will conform to a U(100, 300) distribution.

Tim, note
(a) that the uncertainty calculation here in this first case yields an uncertainty which will increase with increasing values of X
(b) what we have calculated here is a “resolution uncertainty” in Y
and (c) the Monte Carlo approach is perfectly capable of revealing that uncertainty.

Two other important things to note are firstly that the above example deals with a calibration uncertainty in the parameter value, b, and not the variable value, X, and, secondly, that the error or standard deviation of Y varies with X, and not with sqrt(X) for this particular linear problem.
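The three routes described above (direct range, variance by quadrature, and Monte Carlo) can be compared in a few lines. This is a sketch of the Y = bX example only, with b sampled from U(1,3) and X fixed at 100 as in the text; the sample size and seed are arbitrary.

```python
import random
import statistics

rng = random.Random(2)
X = 100.0

# Route 3: Monte Carlo, sampling b from U(1, 3) and computing Y = b * X
samples_b = [rng.uniform(1.0, 3.0) for _ in range(100_000)]
samples_y = [b * X for b in samples_b]

# Route 2: variance of a uniform variable is (range)^2 / 12
var_b_theory = (3.0 - 1.0) ** 2 / 12          # 1/3
var_y_theory = X ** 2 * var_b_theory          # 100^2 * 1/3
range_from_variance = (12 * var_y_theory) ** 0.5

print(min(samples_y), max(samples_y))                  # inside [100, 300]
print(statistics.pvariance(samples_y), var_y_theory)   # close to each other
print(range_from_variance)                             # about 200
```

Changing X shows the spread in Y scaling with X itself rather than with sqrt(X), as noted above.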

Pat is concerned with resolution uncertainty in the variable, X, rather than the parameter, b, so now let’s consider the case where b is an accurately known value, and X carries uncertainty. If X can only be measured to an accuracy of (+/-)2, then the resulting uncertainty in Y will be (+/-)2b for all values of X. Again this can be confirmed by MC, although it is clearly not necessary in this case. There is no growth of uncertainty for this type of error, and no hidden resolution uncertainty in either of the two cases.
Now let’s turn to Pat’s specific claim. He argues that the total temperature change, T(t), from the start of a forcing run varies linearly with cumulative forcing, F(t).

T(t) = bF(t) where b is a constant, assumed to be known in Pat’s model.
If we index the time into annual times, t1, t2, t3…ti…, we can write :-
T(ti) = bF(ti)
T(ti+1) = bF(ti+1)

Hence T(ti+1) – T(ti) = ΔTi+1 = b(F(ti+1)-F(ti)) = bΔF(ti+1)
T(ti+1) = b(F(ti) + ΔF(ti+1)) EQ 1

In my Y = bX example above, the calibration uncertainty in my variable, X, was ever-present, but it was applied to the total value of X, so it did not grow as X increased.
Pat, however, argues that his LWCF calibration error of (+/-)4 watts/m2 should be added to each annual INCREMENT of forcing, so that
ΔF(ti+1) becomes [ΔF(ti+1)+ error], where the error is drawn from a U(-4,4) distribution. The result is variance growth as per a random walk.
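The difference between the two ways of applying the error can be put in numbers. In this sketch the slope `B = 1.0` is an arbitrary illustrative value, the errors are independent U(-4,4) draws as described above, and the function names are invented for the example. A U(-4,4) variable has variance 8^2/12 = 16/3.

```python
import random
import statistics

HALF_WIDTH = 4.0   # half-width of the assumed U(-4, 4) error
B = 1.0            # illustrative slope b; not a value from the paper

def sd_error_on_total(n_steps):
    """SD of b*(F + e) when one U(-4, 4) error is applied to the total forcing."""
    return B * HALF_WIDTH / 3 ** 0.5          # does not depend on n_steps

def sd_error_on_increments(n_steps):
    """SD of b*sum(dF_i + e_i) with an independent U(-4, 4) error on every increment."""
    return B * HALF_WIDTH * (n_steps / 3) ** 0.5

def simulate_increment_errors(n_steps, runs, rng):
    """Monte Carlo SD of the accumulated error for the per-increment case."""
    totals = [
        B * sum(rng.uniform(-HALF_WIDTH, HALF_WIDTH) for _ in range(n_steps))
        for _ in range(runs)
    ]
    return statistics.pstdev(totals)

rng = random.Random(3)
for n in (1, 20, 80):
    print(n, sd_error_on_total(n), sd_error_on_increments(n),
          simulate_increment_errors(n, 5000, rng))
```

Applied to the total, the spread stays fixed; applied to each increment, it grows as sqrt(n), which is the random-walk behaviour referred to above.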

The problem is that there is no physical or theoretical justification for this type of uncertainty propagation in the GCM – or in a sophisticated emulator of the GCM. If there were, then yes it should in theory be revealed by MC testing on single GCMs. There IS a physical and a theoretical justification for a different type of uncertainty propagation from cloud characterisation, and that is via the feedback flux, and as I have already stated in a previous post, that yields a considerable resolution uncertainty in temperature projection, but it is propagated through temperature change rather than time and does not have the same theoretical form as Pat’s uncertainty growth. Pat’s emulator however is far too primitive to assess the effect of uncertainty in the feedback flux, since it has no ability to discriminate between a flux component, the net flux and a forcing – all of which are quite different variables in their magnitude and effect. To the extent that a feedback error would be visible at all in Pat’s model, it would affect his gradient term (parameter b in the above) rather than total forcing. Pat (instead) is trying to add a calibration error in a flux component (LWCF) to a forcing series, which is an exogenous deterministic input, but which he is treating in his emulator as a proxy for a net flux change.

kribaez
Reply to  See - owe to Rich
October 1, 2019 8:57 pm

Response to Tim Gorman,

““And uncertainty in an output variable does not always increase with time. This depends on the problem being solved.”

Give me an example.”

Sure. Drop a tennis ball with a coefficient of restitution of 0.3 onto the ground from a height estimated to be 6m (+/-)2. What is the uncertainty in the elevation of the tennis ball above the tennis court after 10 minutes?

Alternatively, have a look at the primitive LTI model above. You initialise it with an uncertainty of (+/-)4 in the net flux (balance). What is the uncertainty in net flux after 500 years?
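The tennis-ball example can be made concrete with an idealised bounce model. The assumption here (not stated in the comment) is that each rebound peak is e^2 times the previous peak, where e is the coefficient of restitution, with no air resistance; the drop height is taken as the interval 4 m to 8 m.

```python
def peak_height_interval(h_low, h_high, restitution, bounces):
    """Interval of possible peak heights after a given number of bounces."""
    factor = restitution ** (2 * bounces)   # height scales with speed squared
    return h_low * factor, h_high * factor

for k in range(6):
    low, high = peak_height_interval(4.0, 8.0, 0.3, k)
    print(k, low, high, high - low)
```

The width of the interval shrinks by a factor of 0.09 per bounce, so the initial (+/-)2 m is negligible long before 10 minutes have passed.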

Reply to  See - owe to Rich
October 1, 2019 11:17 pm

1sky1, “ a 20-yr sample estimate of the variability of that random error that specifies a FIXED uncertainty.

The pair-wise correlation of TCF error shows that it is not random (Table 1, paper page 7).

The calibration uncertainty is a characteristic of CMIP5 GCMs. It is a product of deficient theory. It shows up in every step. The uncertainty it puts into a projection must increase with every step.

Reply to  See - owe to Rich
October 1, 2019 11:37 pm

Pat,
“These give the standard deviation and linear correlation with satellite observations of the total spatial variability calculated from 20-yr annual means”
An odd thing for you to highlight. He is emphasising that he is calculating correlation of spatial variability, not variability of a spatial average over time. And it is from that correlation that the rmse of 4 Wm⁻² is calculated.

“global average rmse in LWP. (my bold)”
That does not mean the rmse of the global average. It means, as it says, the global average of rmse.

1sky1
Reply to  See - owe to Rich
October 2, 2019 2:13 pm

The pair-wise correlation of TCF error shows that it is not random (/blockquote)

Spatial-lag autocorrelation of cloud-cover (sic!) error merely indicates that it doesn’t vary spatially like white noise. This doesn’t preclude red-noise random variation or various sporadic deficiencies in modeling. In any event, spatial variability is not the issue in the time-propagation of modeling error. As Lauer and Hamilton make clear, their LCF-error determinations “give the standard deviation and linear correlation with satellite observations of the total spatial variability calculated from 20-yr annual means.”

In other words, there’s a model time series of annual means of spatial variability that is statistically characterized by the FIXED cross-correlation and rms error over the prescribed 20-yr period of satellite observations. That 4 W/m^2 rms error of the aggregate model mean CANNOT legitimately be denominated PER ANNUM, let alone be compounded. Nor does it represent the average standard deviation of 20 individual years, as Frank has claimed in shocking violation of the algebraic law that square roots are NOT distributive. Moreover, the correlation of 0.93 with observations doesn’t point to any truly gross modeling deficiencies. While such do exist, they are not at all those ostensibly discovered here.

1sky1
Reply to  See - owe to Rich
October 2, 2019 2:16 pm

Moderator, please correct my botched brackets at the end of the first line.

Reply to  See - owe to Rich
October 2, 2019 9:48 pm

I wrote, “The pair-wise correlation of TCF error shows that it is not random ”

To which you (1sky1) replied, “Spatial-lag autocorrelation of cloud-cover (sic!) error merely indicates that it doesn’t vary spatially like white noise.

I directed you to the pair-wise TCF error correlations in Table 1, sky.

You quoted Lauer and Hamilton as, “annual means of spatial variability ” and followed that up with , “aggregate model mean CANNOT legitimately be denominated PER ANNUM

So, for you an annual mean is not per year. That is, (sum of magnitudes)/(number of years) is not magnitude/year.

Summarizing, you’re claiming that an annual mean is not an annual mean.

You wrote, “…Frank has claimed in shocking violation of the algebraic law that square roots are NOT distributive.

So you’re denying the validity of paper eqns. 3 & 4 and denying propagation of error as (+/-)sqrt(sum of variances).

You wrote, “Moreover, the correlation of 0.93 with observations doesn’t point to any truly gross modeling deficiencies..

That’s actually funny.

Two linear series, one running 0-1 in steps of 0.01 and the other 0-100 in steps of 1, have correlation 1.00. And yet the final values differ by 99.

That 99 could be error. The point is that correlation does not prescribe magnitude. Correlation of 0.93 does not tell us anything about the size of the error.
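The two-series example can be reproduced directly. The `pearson` helper is written out by hand for this sketch so it needs nothing beyond the standard library; the series are read as 101 points each, 0 to 1 in steps of 0.01 and 0 to 100 in steps of 1.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

small = [i / 100 for i in range(101)]     # 0, 0.01, ..., 1
large = [float(i) for i in range(101)]    # 0, 1, ..., 100

print(pearson(small, large))      # 1.0 to within rounding
print(large[-1] - small[-1])      # 99.0
```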

You wrote, “While such do exist, they are not at all those ostensibly discovered here.

I did not discover any modeling deficiencies. The deficiencies were discovered and reported by Lauer and Hamilton. I just derived one of the consequences of those deficiencies.

Reply to  See - owe to Rich
October 2, 2019 9:58 pm

Nick Stokes, “ … 20-yr annual means of total spatial variability .. is not variability of a spatial average over time.

Sure Nick. The variability of the error of each model is not the variability of the average error. Great point.

Meanwhile, “The overall comparisons of the annual mean cloud properties with observations are summarized for individual models and for the ensemble means by the Taylor diagrams for CA, LWP, SCF, and LCF shown in Fig. 3. These give the standard deviation and linear correlation with satellite observations of the total spatial variability calculated from 20-yr annual means. … (my bold)”

Reply to  See - owe to Rich
October 2, 2019 11:56 pm

Pat,
“That is, (sum of magnitudes)/(number of years) is not magnitude/year.
Summarizing, you’re claiming that an annual mean is not an annual mean.”

(sum of magnitudes)/(number of years) is not any kind of mean, since the sum is over n, which is not the number of years. High school kids know that. It is the elementary error in S6.2.

“Meanwhile,”
The paper is clear enough. But in any case Lauer himself said your interpretation is nonsense (via Brown):
“I have contacted Axel Lauer of the cited paper (Lauer and Hamilton, 2013) to make sure I am correct on this point and he told me via email that “The RMSE we calculated for the multi-model mean longwave cloud forcing in our 2013 paper is the RMSE of the average *geographical* pattern. This has nothing to do with an error estimate for the global mean value on a particular time scale.”.”

Reply to  kribaez
October 1, 2019 11:11 pm

kribaez, you wrote, “My concern here is with your methodology.

Have you seen the post from nick, above? He’s a physicist, apparently thoroughly familiar with uncertainty analysis.

You wrote, “equi-probable sampling of the input space will yield via the model the joint distribution of the outputs. This output space – defined via the model – … The above principle forms the foundation for uncertainty analysis.

An uncertainty equivalent to precision, kribaez, as you are assessing variability of model response, not accuracy versus known standards.

You wrote, “On the contrary, sampling from the input distribution should reveal the output uncertainty – sufficient sometimes to justify additional data collection or scrapping the model. You seem to be denying this.

As I understand your point, you’re proposing that sampling the uncertainty range in inputs establishes the uncertainty in outputs. I don’t deny that. I merely recognize it is an estimate of model precision. Not of model accuracy.

You wrote, “If I consider a linear model of the form Y = bX. Sampling of an input distribution of b, will yield the correct uncertainty for Y at some value of X

Yes, but it will not reveal the distance from the physically correct value of Y.

If you “have a measurement uncertainty in X of (+/-) 2,” and X enters into a sequential series of calculations to predict the behavior of Y in a sequential series of ‘n’ future states, then the (+/-)2 of X gets propagated through that series as the rss, sqrt[sum over n*(4)], so that the uncertainty in Y grows.

Your approach is fine for engineering models, kribaez, where parameters are calibrated to reproduce observables within some calibration bound. You must know well that calibrated engineering models do not reliably predict observables beyond those bounds.

However, prediction beyond their calibration bounds is exactly what is being asked for GCMs, which are engineering models. Prediction of a forward state requires a different approach to uncertainty than monitoring model variability within its calibration bounds.

You wrote, “I have tried to point out above that your emulation model is inadequate to the task you are setting it here, because of its inability to distinguish between a component of flux, the net flux (balance) and a forcing.

Emulation eqn. 1 does not need to distinguish anything. The error term is brought in from outside that equation. The (+/-)4 Wm^2 of LWCF error from Lauer and Hamilton is clearly not a forcing. Neither is it a component of flux, nor a part of the net flux. It is a simulation uncertainty stemming from models that are in simulated flux balance.

You wrote, “An uncertainty in a component of flux does not translate into an error in net flux at the end of the spin-up period.

The error is in the physical theory of clouds as deployed in the model specifically, and in the overall deficiency in the physical theory in general.

Spin-up of a physically wrong climate-state does not lead to a representation of the physically correct state. The equilibrium simulation may be stable. But the simulation state is not known to be a physically correct representation of the climate energy-state.

An uncertainty interval is an ignorance interval. A state of ignorance does not improve when it is projected forward into the unknown.

You wrote, “The mathematics of the problem will force the net flux into balance (actually with small fluctuations around zero), and the uncertainty in net flux is close to zero.

And the error with respect to the physically correct flux. What is that? You don’t know. Because the observational record is poor, and the physical theory is deficient.

You continued, “THIS DOES NOT LEAVE AN INVISIBLE UNCERTAINTY IN NET FLUX, …

It leaves a simulation error of unknown sign and magnitude in the total cloud fraction, which in turn produces a likewise error in long wave cloud forcing within the simulation. With respect to a simulation that is made to be in TOA balance, that uncertainty in thermal energy flux is indeed invisible.

continuing “ and nor does it represent any confusion between error and uncertainty.

It does, actually. None of what you wrote addresses accuracy at all.

continuing “ It is something which is forced by the mathematics of the problem.

When did mathematics determine the physics?

You wrote, “We can obtain some approximation to [the uncertainty in temperature projection], but it requires a different type of analysis from the one you have carried out when comparing observed to modeled cloud fraction.

Eqn. 1 shows that GCMs project temperature as a linear extrapolation of fractional GHG forcing, kribaez. Linear extrapolation is subject to linear propagation of error.

The displayed sensitivity of eqn. 1 to tropospheric W/m^2 forcing is the same as the sensitivity of GCMs to tropospheric W/m^2 forcing. A coherence of sensitivity to uncertainty in forcing strictly follows.

My analysis is clearly not how you’d have done it. But that does not make it incorrect.

You wrote, “The uncertainty in temperature projection arising from the uncertainty in cloud characterisation is almost entirely associated with the uncertainty in the FEEDBACK to net flux,

What you wrote, kribaez, means that cloud feedback to CO2 forcing is hugely uncertain relative to the size of CO2 forcing. Neither you nor anyone else knows how clouds will (or will not) respond to CO2 forcing.

The (+/-)4 W/m^2 LWCF uncertainty is a measure of that uncertainty in cloud response, in a form (tropospheric thermal energy flux) that is directly applicable to the forcing introduced by CO2 emissions.

GCMs cannot resolve the effect of increased CO2 forcing on clouds. They cannot resolve tropospheric thermal energy flux to better than (+/-)4 W/m^2. They cannot resolve the effect of a 0.035 W/m^2 increase in thermal energy flux on clouds. They cannot resolve the cloud feedback in response to that 0.035 W/m^2 increase in forcing.

The effect of CO2 forcing on the climate is invisible to GCMs. That is why the uncertainty in air temperature increases year-to-year in a futures projection.

The ignorance about the difference between the physically correct temperature and the simulated temperature grows with every step in a projection. Growth of ignorance = growth of uncertainty.

You continued, “… you can quite legitimately argue that there does exist in this feedback system what you would describe as “a linear propagation of uncertainty”, but (a) it is NOT a linear propagation in time, …

The cloud feedback uncertainty is not about the physical response, kribaez. It’s about the ability of the model to simulate the response. That is a linear propagation in step-number, not a linear propagation in time. It’s about the model, not the climate.

Continuing, “… it is closer to a linear propagation of flux uncertainty with temperature – which makes an enormous difference, and (b) the resulting integrated series (in flux) does not translate the entire calibration error in cloud flux into the feedback error in net flux, since systemic bias in and of itself has little effect on the net flux.”

The physically correct net thermal flux is not simulated to better than (+/-)4 W/m^2, kribaez. It does not matter how the model behaves, or how the simulated climate behaves. It does not matter that the TOA balance is maintained (which it always is in simulations).

What matters is that the tropospheric energy state is wrong. Cloud behavior is wrong. The impact of CO2 forcing is invisible. Each step in the sequence of steps has a (+/-)4 W/m^2 uncertainty in tropospheric thermal energy flux. The impact of CO2 forcing (0.035 W/m^2) is lost within it.

The physically correct behavior of air temperature in response to CO2 forcing can therefore not be known. The simulation wanders continually away from the physically correct state in the climate phase-space because clouds are incorrectly simulated. The distance between the simulation and physically correct trajectories is not known, and is known ever more poorly as the simulation proceeds.

The growth in uncertainty reflects this growth in ignorance. It is not saying anything about how the model is behaving. It is saying everything about what we actually know.

You wrote, “As it is, I believe it is very poorly founded, and I take no pleasure in stating this.”

You’re a good guy, kribaez, thanks. But your analysis is misconceived.

You’re thinking of calibration uncertainty as physical error and as though uncertainty were revealed in the behavior of the model. It isn’t, and it isn’t.

Reply to  Pat Frank
October 2, 2019 6:49 am

“Words, words, I’m so sick of words…show me now!”. Thus sang Eliza Doolittle in “My Fair Lady”, though she was speaking of physical love rather than physical physics. For me, although some words are needed for explanation, I am going to keep on harping “show me the mathematics!”. In my Note 8 above I gave Equation (*), which is really important because with ‘a’ set to 1 it is I believe Pat’s Equation 1 (or 5) stripped to its bare bones, yet Pat hasn’t commented on it. Here it is again (with z for mean instead of m):

M(t) = a M(t-1) + B(t;z,s) (*)

I believe this equation is a good representation of Pat’s and the B distributions can be called uncertainty distributions, and they combine in a RMS fashion just as Pat asserts provided independence applies. Moreover, as kribaez would put it, these uncertainties are not “invisible to Monte Carlo sampling”, which means that statistical inferences can be made on z and s.

Show me where that is wrong; show me now! If it is in fact correct then it is a great basis for making progress against this problem, instead of just “waving our hands” (as mathematicians tend to say) that +/-4 W/m^2 indisputably propagates RMS-wise through the model, invisibly and without means of testing.

I’ll now throw another spanner in the works.

So far I have considered a=1. But what if in the GCMs a<1? Then

Var(M(t)) = a^2 Var(M(t-1)) + s^2 = a^(2t) Var(M(0)) + s^2 (1-a^(2t))/(1-a^2) (**)

This is the case of structural damping in the model, which my “Note 7: How does a cloud in 1990 affect me now?” suggests may actually be the case.

In this Equation (**), as t tends to infinity, Var(M(t)) tends to s^2/(1-a^2), a finite limit.
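As a numerical check on Equation (**), here is a minimal Python sketch. It assumes, as the equation itself does, that the B(t;z,s) terms are independent with variance s^2. The function names and the sample values a = 0.9 and s = 4 are illustrative choices, not values taken from any GCM.

```python
def variance_by_recursion(a, s, var0, t):
    """Iterate Var(M(t)) = a^2 Var(M(t-1)) + s^2 for t steps."""
    var = var0
    for _ in range(t):
        var = a * a * var + s * s
    return var


def variance_closed_form(a, s, var0, t):
    """Closed form of Equation (**); only valid when a^2 != 1."""
    a2t = a ** (2 * t)
    return a2t * var0 + s * s * (1 - a2t) / (1 - a * a)


if __name__ == "__main__":
    a, s, var0 = 0.9, 4.0, 0.0  # illustrative values only
    for t in (1, 10, 100):
        print(t, variance_by_recursion(a, s, var0, t),
              variance_closed_form(a, s, var0, t))
    # Damped case (a < 1): the variance levels off at s^2 / (1 - a^2).
    print("limit:", s * s / (1 - a * a))
    # Undamped case (a = 1): the variance grows as t * s^2 without bound.
    print("a = 1:", variance_by_recursion(1.0, s, var0, 100))
```

With a < 1 the recursion and the closed form agree and flatten out; with a = 1 the variance keeps growing linearly in t, which is the root-sum-square growth in the standard deviation.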

As I said before, you can linearize the model mean, but you can’t validly linearize the model error/uncertainty without knowing much more about the structure of the model, because its evolution is highly germane to the way that its uncertainty propagates.

Reply to  See - owe to Rich
October 2, 2019 7:17 am

Rich,

“M(t) = a M(t-1) + B(t;z,s) (*)

I believe this equation is a good representation of Pat’s and the B distributions can be called uncertainty distributions, and they combine in a RMS fashion just as Pat asserts provided independence applies.”

Your equation is wrong. B(t;z,s) doesn’t add to M. The equation should be
M(t) = a M(t-1) +/- B(t;z,s) (*)

It’s no different than saying: This one foot ruler = 12″ +/- 1″. The uncertainty specification doesn’t add to the 12″. It only says the ruler’s length can actually be between 11″ and 13″. It doesn’t say what the length really is so it can’t be considered as adding or subtracting anything specific.

If you’ll look carefully, Pat’s equation says +/- 4 W/m^2. It doesn’t say +4 W/m^2.

“Moreover, as kribaez would put it, these uncertainties are not “invisible to Monte Carlo sampling”, which means that statistical inferences can be made on z and s.”

I think I covered this already. If the output of the GCM is deterministic, i.e. it gives the same answer every time for the same input values, then Monte Carlo analysis can’t tell you what the uncertainty is. Every output of every run will have an uncertainty associated with it. If the uncertainty factor remains +/- 4 W/m^2 for all runs then that factor will compound over each iteration of each and every run, no matter what the input values actually are. You simply cannot reduce uncertainty by making multiple runs with different input values.

“In this Equation (**), as t tends to infinity, Var(M(t)) tends to s^2/(1-a^2), a finite limit.”

Uncertainty is a +/- specification. It’s a constant in Pat’s analysis = +/- 4 W/m^2. How does a constant have a variance and standard deviation? How does a constant become damped and go to zero?

“As I said before, you can linearize the model mean, but you can’t validly linearize the model error/uncertainty”

As Pat keeps saying – “uncertainty is not error, error is not uncertainty”. Pat didn’t linearize the uncertainty factor, he came up with a constant identified as an average over a number of years. You can argue that his value is incorrect, that it should be something else. But then you need to actually show where Pat’s analysis used to develop the constant went wrong. It certainly looks legitimate to me.

kribaez
Reply to  Pat Frank
October 2, 2019 8:06 am

Pat,
Once again, thank you for the detailed response.

You are confirming, I think, that you accept that sampling from the full joint distribution of inputs when mapped via “the governing model” to outputs must yield the full output space. The only uncertainty which is invisible to this process arises from the choice or validity of the model itself. It is an important point and one which many of your supporters on this thread seem to have difficulty in accepting.

I was entirely comfortable with your comments until you got to:-
“If you “have a measurement uncertainty in X of (+/-) 2,” and X enters into a sequential series of calculations to predict the behavior of Y in a sequential series of ‘n’ future states, then the (+/-)2 of X gets propagated through that series as the rss, sqrt[sum over n*(4)], so that the uncertainty in Y grows.”

In the problem as set, this is a monstrously wrong answer, and I would invite you to spend a couple of minutes thinking seriously about this, because it speaks to one of the most controversial elements in your paper. In the hypothetical problem as set, we had a model Y = bX with an accurately known value of b, and we had a calibration error in X of (+/-)2. I repeat for emphasis that the measurand for the calibration is the variable X.

As I reported, the resulting uncertainty in Y is always (+/-)2b in this case, something which is simple to verify. Residuals are stationary. Confidence intervals are invariant with X. There is no growth of the uncertainty in Y as X varies.
If X is varying (as well) as some (unknown) function of time, then an operator might be asked to measure X at regular time intervals, and then record the change in X since the last measurement. We can now trivially rewrite our model as
Yi = b(Xi-1 + ΔXi ) where i is indexing the timesteps. Has the uncertainty in Y changed at all? The answer is no, it has not changed one whit, not at all. The reason is that the measurand is still X and NOT ΔX, so the new value of Xi is still (always) carrying an error bar of only (+/-2), and the uncertainty in Yi has a range of (+/-)2b. There is no integration of previous errors in the Xi values.
To get to your answer, you have to change the question, and in particular you have to change the measurand for the calibration error. The new question is:- Suppose that YOU HAVE NO ABILITY TO CALIBRATE X, but that you have the ability to calibrate the accuracy of the measured CHANGE IN X over each recorded timestep; from this, it is determined that each ΔXi carries an error bar of (+/-)2. What is the uncertainty in Y after n timesteps? This new problem now presents a classic integrating series of order 1 with the variance in X rising after n steps to nVar(ΔXi), or, as you put it, “the (+/-)2 of X gets propagated through that series as the rss, sqrt[sum over n*(4)]”.
The calibrated measurand in the first question is an already integrated series, and the uncertainty is stationary. The calibrated measurand in the second question is the first difference of that series in statistical jargon, and the uncertainty in its sum accumulates like a random walk. The difference is enormous. I would therefore invite you to consider very carefully whether in your calibration of LWCF, the RMSE you have found is a measure of error in the total LWCF (the first problem) or a measure of the error on the change in LWCF each year (the second problem).
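The contrast between the two problems can be sketched with a small simulation. This is only an illustration under a loud assumption: the errors are independent draws from a normal distribution with standard deviation s. The function name and the values b = 1.5, s = 2 and n = 25 are hypothetical.

```python
import random
import statistics


def final_y_error_sd(b, s, n_steps, integrate, trials=5000, seed=0):
    """Spread of the error in Y after n_steps, estimated by simulation.

    integrate=False: X itself is calibrated, so only the latest error counts.
    integrate=True:  only each change in X is calibrated, so errors add up.
    """
    rng = random.Random(seed)
    finals = []
    for _ in range(trials):
        # Assumption: independent, normally distributed errors of sd s.
        errors = [rng.gauss(0.0, s) for _ in range(n_steps)]
        x_error = sum(errors) if integrate else errors[-1]
        finals.append(b * x_error)
    return statistics.pstdev(finals)


if __name__ == "__main__":
    b, s, n = 1.5, 2.0, 25  # hypothetical values for illustration
    print("X calibrated:      ", final_y_error_sd(b, s, n, integrate=False))
    print("delta-X calibrated:", final_y_error_sd(b, s, n, integrate=True))
```

Under these assumptions the first spread should sit near b*s and stay there however many steps are taken, while the second should sit near b*s*sqrt(n) and keep growing with n.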

As for the rest of your response, I have read it carefully, but still believe that the limitations of your emulator are leading you to add different animals together inappropriately, but I won’t bore you by repeating the same arguments. I think that we will have to respectfully agree to disagree.

I wish you well in any case and hope that the future feedback is not too painful.

Reply to  kribaez
October 2, 2019 12:43 pm

“The reason is that the measurand is still X and NOT ΔX, so the new value of Xi is still (always) carrying an error bar of only (+/-2), and the uncertainty in Yi has a range of (+/-)2b.”

I’m sure Pat will respond but it would seem you forgot that X(i-1) has already incurred an uncertainty of +/- 2. Then *another* uncertainty of +/- 2 is added with the new ΔX. You wind up with a series of uncertainty which accumulate as the root-sum-square.

I hate to keep going back to the one foot ruler that is 12″ +/- 1″ but it is the same thing. If you measure the width of a room, i.e. Y, then you get a formula of Y=bX where b is the number of times the ruler is laid end-to-end and X is the length in inches of the ruler. If you lay the ruler end-to-end ten times in order to measure the room the uncertainty in the measurement is not +/- 1″. The uncertainty compounds with each iteration.

Reply to  Pat Frank
October 2, 2019 2:12 pm

Reply to Tim Gorman, Oct 2, 7:17 am

Tim,

The problem is that you do not understand random variables (things which have a probability distribution) and random variates (actual values that a random variable takes) and standard deviations (the “bound” on a random variable), though I did try to explain this earlier. But I think this is the crux of Pat/Tim versus kribaez/Rich, so it is important to explore it.

So let’s take your rewriting of my equation:

M(t) = a M(t-1) +/- B(t;z,s)

Now my B(t;z,s) is a random variable, so taking +/- on it makes no sense. But there is a version we can write with a +/-, though it is really just a shorthand version of mine, and it is:

M(t) = a M(t-1) + z +/- s

which is like Pat’s Equation 5. To take your ruler example suppose we ordered a batch of 12” rulers and they all happen to be between 11.3 and 13.3”. Then we can think of z=0.3” as the bias in the rulers and s=1”, or +/-1” if you prefer, as the uncertainty in them. If we choose and use just one ruler then after t uses the bias will have accumulated as 0.3t” and the uncertainty will have accumulated as +/-t”.
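The two accumulation rules in the ruler example can be put side by side in a short Python sketch. The function names are hypothetical, and the independent-rulers case assumes that each use draws a fresh ruler whose deviation is statistically independent of the others.

```python
import math


def single_ruler(z, s, t):
    """Same ruler reused t times: its unknown offset repeats on every use,
    so both the bias and the uncertainty scale linearly with t."""
    return z * t, s * t


def independent_rulers(z, s, t):
    """A fresh, independently drawn ruler for each use (assumption):
    the bias still scales with t, but the spread adds in quadrature."""
    return z * t, s * math.sqrt(t)


if __name__ == "__main__":
    z, s = 0.3, 1.0  # inches, as in the batch-of-rulers example
    for t in (1, 4, 16):
        print(t, single_ruler(z, s, t), independent_rulers(z, s, t))
```

After 16 uses the single ruler carries +/-16″ while the independent rulers carry +/-4″, so sqrt(t) growth corresponds to the independent case.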

However, we know that Pat does not use a single ruler, because he accumulates uncertainty proportional to sqrt(t). I challenge any mathematician to justify that without using the equivalent of my (*) equation which uses a random variable, and indeed Nick Stokes checked that Pat’s references do just that.

Re Monte Carlo sampling and determinism: yes, kribaez has said that with the exact same input you get the exact same output with a GCM, so in that case you can learn nothing about the uncertainty. But he has also said that if you perturb the input just a little then because weather is mathematically chaotic then the output changes a lot, and the difference between the runs tells you something about the uncertainty.

With respect Tim, I don’t think that you have sufficient mathematics to argue this point, but Pat has – I respect a lot of his maths even though I am worried about his assumptions. If Pat can write down some maths to argue this point against me, I’ll look very closely at it. The mathematical distinction between error and uncertainty is key to the whole debate, which cannot be resolved by words; show me the mathematics, starting from my Equation (*).

Reply to  See - owe to Rich
October 2, 2019 3:56 pm

Rich,

“Now my B(t;z,s) is a random variable, so taking +/- on it makes no sense”

If B is supposed to be the uncertainty then it is not a random variable. It is an interval, not a variable value. Like with the ruler example, the uncertainty doesn’t change from iteration to iteration. The error in the output from iteration to iteration might change but not the uncertainty interval. The error should always be inside the uncertainty interval.

“To take your ruler example suppose we ordered a batch of 12” rulers and they all happen to be between 11.3 and 13.3”. Then we can think of z=0.3” as the bias in the rulers and s=1”, or +/-1” if you prefer, as the uncertainty in them”

The key word you used is “between”. If they are all inside an interval then that interval is the uncertainty. If some were exactly 11.3″ then those would have a bias of 0.7″. If all the rest were exactly 13.3″ long then they would have a bias of 1.3″. But if you don’t know where in the interval from 11.3″ to 13.3″ each ruler can be then those are the uncertainty interval. Your uncertainty is +1.3″, -0.7″.

Bias (an alias for error) is *not* uncertainty. The word “between” indicates uncertainty.

“However, we know that Pat does not use a single ruler, because he accumulates uncertainty proportional to sqrt(t). I challenge any mathematician to justify that without using the equivalent of my (*) equation which uses a random variable, and indeed Nick Stokes checked that Pat’s references do just that.”

It doesn’t matter which ruler you use if the uncertainty interval is the same for all of them. Pat uses root-SUM-square, not root-mean-square. Root mean square is used in determining standard deviation. Just using the sum of the squares doesn’t make it into a probability distribution. Root-mean-square uses the sum of (x1^2 + x2^2 + x3^2 …..xn^2)/n (i.e. the variance). As Pat’s Eq. 6 shows, the uncertainty is just the sum, there is no division by n. Two different things.
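The arithmetic difference between the two quantities is easy to show in Python. This sketch only illustrates the two formulas; it takes no position on which one is the right rule for the LWCF uncertainty. The function names and the twenty-step example are illustrative.

```python
import math


def root_sum_square(values):
    """sqrt(x1^2 + x2^2 + ... + xn^2): no division by n."""
    return math.sqrt(sum(v * v for v in values))


def root_mean_square(values):
    """sqrt((x1^2 + x2^2 + ... + xn^2) / n): the square root of the mean square."""
    return math.sqrt(sum(v * v for v in values) / len(values))


if __name__ == "__main__":
    steps = [4.0] * 20  # twenty steps, each carrying +/- 4 (illustrative)
    print("rss:", root_sum_square(steps))   # grows with the number of steps
    print("rms:", root_mean_square(steps))  # stays at 4 for identical entries
```

For identical entries the rms stays at the per-step value, while the rss grows as the per-step value times sqrt(n).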

“But he has also said that if you perturb the input just a little then because weather is mathematically chaotic then the output changes a lot, and the difference between the runs tells you something about the uncertainty.”

But it doesn’t tell you about the uncertainty. All it tells you is that the output of the model is based on a higher order function, e.g. squared, cubed, etc.

“With respect Tim, I don’t think that you have sufficient mathematics to argue this point, ”

What you think about me is irrelevant. I’ve given you the math. I know enough math to know that a Monte Carlo analysis can’t define uncertainty for a deterministic model. You have confused root-mean-square with root-sum-square and in doing so have convinced yourself that uncertainty is a probability function and not an interval. Error is a probability function and the error value will lie within the uncertainty interval. It’s that simple. And uncertainty grows with each iteration. You can’t cancel it using the central limit theorem.

The whole issue here is that if what you are trying to measure is within the uncertainty interval then you simply don’t know if you have actually measured anything. If your uncertainty interval is +/- 0.1C and you are trying to define a difference of 0.01C between two outputs then you are only kidding yourself that you actually know there really is a difference of 0.01C!

Reply to  Pat Frank
October 2, 2019 9:25 pm

kribaez,

Getting to the central issue, you wrote, “we had a model Y = bX with an accurately known value of b, and we had a calibration error in X of (+/-)2. I repeat for emphasis that the measurand for the calibration is the variable X. As I reported, the resulting uncertainty in Y is always (+/-)2b in this case, something which is simple to verify. Residuals are stationary.”

You’re presuming the uncertainty is due to random error. However, this condition is not known to be true. I apologize for not being clear about this in my response.

You’re also presuming the calculation is always single-step, Y = bX, with X varying but always (+/-)2. This does not describe the effect of uncertainty on sequential calculations involving X.

Let’s suppose your model, Y = bX and the uncertainty in X is (+/-)2 and that uncertainty is from normally distributed error.

Then the calculation Y = bX(+/-)2 puts an uncertainty in Y of (+/-)n.

In that case, Y_1 = bX_1(+/-)2 = Y_1(+/-)n, and Y_2 = bX_2(+/-)2 = Y_2(+/-)n.

Now suppose the sequence of states proceeds to a final state, Y_Final = Y_F = Y_1(+/-)n + Y_2(+/-)n + … + Y_f(+/-)n. Y_F is a sum; not an average.

Each Y includes a (+/-)n. The uncertainty in the final Y_F = (+/-)sqrt(f*n^2).
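Written out, the step from the per-term (+/-)n to the uncertainty in the sum looks like this. It is the standard root-sum-square form for a sum of terms whose uncertainties are treated as uncorrelated; the u(·) notation is introduced here for convenience and is not from the paper.

```latex
\begin{align}
Y_F &= \sum_{i=1}^{f} Y_i, \qquad u(Y_i) = \pm n \\
u(Y_F) &= \pm\sqrt{\sum_{i=1}^{f} n^2} = \pm\sqrt{f\,n^2} = \pm\, n\sqrt{f}
\end{align}
```

For example, with n = 1 and f = 100 terms, the uncertainty in the sum is (+/-)10.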

If the (+/-)2 is normally distributed in X, then many measurements of X can reduce the magnitude of uncertainty in X and therefore reduce the magnitude of (+/-)n. But the uncertainty in the sum of Y_i will always be >(+/-)n.

In the case of climate models, total cloud fraction (TCF) error is systematic and non-normal (as it is in the historical air temperature measurements). The calibration error in LWCF is also not known to be normal.

You wrote, regarding your new model, “If X is varying (as well) as some (unknown) function of time, then an operator might be asked to measure X at regular time intervals, and then record the change in X since the last measurement.

“We can now trivially rewrite our model as Yi = b(Xi-1 + ΔXi ) where i is indexing the timesteps. Has the uncertainty in Y changed at all? The answer is no,…”

The way you wrote it, ΔXi = X_i(+/-)2 – X_i-1(+/-)2, which means the uncertainty in ΔXi = sqrt[(2)^2+(2)^2] = (+/-)2.8.

Now we take Xi-1(+/-)2+ΔXi(+/-)2.8, and the new uncertainty in X = sqrt[(2)^2+(2.8)^2] = (+/-)3.5.
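The two quadrature sums above can be reproduced in a few lines of Python. The sketch treats the combined terms as uncorrelated, as the quoted arithmetic does; the variable names are illustrative.

```python
import math

u_x = 2.0                           # stated uncertainty in each measured X
u_delta = math.hypot(u_x, u_x)      # uncertainty in the difference X_i - X_(i-1)
u_new_x = math.hypot(u_x, u_delta)  # uncertainty in X_(i-1) + delta-X_i

print(round(u_delta, 1), round(u_new_x, 1))  # prints: 2.8 3.5
```

The second figure comes from the unrounded 2.83, i.e. sqrt(4 + 8) = 3.46, which rounds to 3.5.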

However, I take your point that the uncertainty in each re-measured X is constant (+/-)2 even though the magnitude of X may vary.

But your time-step model involves changes in X.

The time steps in an air temperature projection involve progressive changes in Y, in each step of which the uncertainty in X enters anew.

You wrote, “in particular you have to change the measurand for the calibration error.”

No, actually. The “measurand” is unchanged. It remains as before. Your ΔXi always remains as you have it. The uncertainty in eqn. 1 is not in ΔFi (your ΔXi). The calibration error is independent of the ΔFi.

The (+/-)4 W/m^2 comes in as the simulation uncertainty characteristic of the model. It arises from the LWCF calibration error of CMIP5 GCMs.

You wrote, “The calibrated measurand in the first question is an already integrated series, and the uncertainty is stationary.”

The uncertainty is stationary in your model. Not in GCM TCF error and so also not in LWCF calibration error.

Also, your model presumed solitary calculations of Y, which is not analogous to the analysis in the paper.

You wrote, “The calibrated measurand in the second question is the first difference of that series in statistical jargon, and the uncertainty in its sum accumulates like a random walk.”

You’re treating the calibration uncertainty as a physical error. In a predictive uncertainty analysis, one never knows how the error accumulates. Sum of physical errors is not rss of predictive uncertainty.

You wrote, “I would therefore invite you to consider very carefully whether in your calibration of LWCF, the RMSE you have found is a measure of error in the total LWCF (the first problem) or a measure of the error on the change in LWCF each year (the second problem).”

We know what the LWCF calibration error is.

It is derived from the annual rms of (simulation minus observed) TCF error. The LWCF rmse is not a physical error. It is a calibration error statistic. It does not sum and does not accumulate as a random walk.

The LWCF error is a measure of the ignorance of the magnitude of simulated tropospheric thermal energy flux.

When ΔFi enters that simulation, the impact of ΔFi on clouds is not known to better than allowed by the (+/-)4 W/m^2 uncertainty in tropospheric thermal energy flux.

Cloud response to CO2 forcing is not known to better than allowed by (+/-)4 W/m^2.

It’s a straight-forward concept, kribaez. Simulation TCF error and LWCF rmse say that one cannot know how clouds respond to GHG forcing.

That means one cannot know how the air temperature changes with GHG emissions.

The resolution of the GCMs is too coarse to resolve the effect of GHGs on the climate.

I’ll repost an analysis I posted elsewhere already. It gives another approach to the problem of model resolution in terms of CO2 forcing.

Reply to  Pat Frank
October 2, 2019 9:27 pm

kribaez,

this illustration might clarify the meaning of (+/-)4 W/m^2 of uncertainty in annual average LWCF.

The question to be addressed is what accuracy is necessary in simulated cloud fraction to resolve the annual impact of CO2 forcing?

We know from Lauer and Hamilton that the average CMIP5 (+/-)12.1% annual cloud fraction (CF) error produces an annual average (+/-)4 W/m^2 error in long wave cloud forcing (LWCF).

We also know that the annual average increase in CO2 forcing is about 0.035 W/m^2.

Assuming a linear relationship between cloud fraction error and LWCF error, the (+/-)12.1% CF error is proportionately responsible for (+/-)4 W/m^2 annual average LWCF error.

Then one can estimate the level of resolution necessary to reveal the annual average cloud fraction response to CO2 forcing as, (0.035 W/m^2/(+/-)4 W/m^2)*(+/-)12.1% cloud fraction = 0.11% change in cloud fraction.

This indicates that a climate model needs to be able to accurately simulate a 0.11% feedback response in cloud fraction to resolve the annual impact of CO2 emissions on the climate.

That is, the cloud feedback to a 0.035 W/m^2 annual CO2 forcing needs to be known, and able to be simulated, to a resolution of 0.11% in CF in order to know how clouds respond to annual CO2 forcing.

Alternatively, we know the total tropospheric cloud feedback effect is about -25 W/m^2. This is the cumulative influence of 67% global cloud fraction.

The annual tropospheric CO2 forcing is, again, about 0.035 W/m^2. The CF equivalent that produces this feedback energy flux is again linearly estimated as (0.035 W/m^2/25 W/m^2)*67% = 0.094%.

Assuming the linear relations are reasonable, both methods indicate that the model resolution needed to accurately simulate the annual cloud feedback response of the climate, to an annual 0.035 W/m^2 of CO2 forcing, is about 0.1% CF.
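Both estimates are simple proportional scalings and can be checked directly. The sketch below uses only the figures quoted above and inherits the same linearity assumption; the function names are illustrative.

```python
def cf_resolution_from_lwcf(annual_forcing, lwcf_error, cf_error_pct):
    """Scale the CF error by the ratio of annual forcing to LWCF error."""
    return annual_forcing / lwcf_error * cf_error_pct


def cf_resolution_from_feedback(annual_forcing, total_feedback, global_cf_pct):
    """Scale global CF by the ratio of annual forcing to total cloud feedback."""
    return annual_forcing / abs(total_feedback) * global_cf_pct


if __name__ == "__main__":
    forcing = 0.035  # W/m^2, annual average increase in CO2 forcing
    first = cf_resolution_from_lwcf(forcing, 4.0, 12.1)        # percent CF
    second = cf_resolution_from_feedback(forcing, -25.0, 67.0)  # percent CF
    print(round(first, 2), round(second, 3))  # prints: 0.11 0.094
```

Both routes land at roughly 0.1% cloud fraction, as stated.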

To achieve that level of resolution, the model must accurately simulate cloud type, cloud distribution and cloud height, as well as precipitation and tropical thunderstorms.

This analysis illustrates the meaning of the (+/-)4 W/m^2 LWCF error. That error indicates the overall level of ignorance concerning cloud response and feedback.

The CF ignorance is such that tropospheric thermal energy flux is never known to better than (+/-)4 W/m^2. This is true whether forcing from CO2 emissions is present or not.

GCMs cannot simulate cloud response to 0.1% accuracy. It is not possible to simulate how clouds will respond to CO2 forcing.

It is therefore not possible to simulate the effect of CO2 emissions, if any, on air temperature.

As the model steps through the projection, our knowledge of the consequent global CF steadily diminishes because a GCM cannot simulate the global cloud response to CO2 forcing, and thus cloud feedback, at all for any step.

It is true in every step of a simulation. And it means that projection uncertainty compounds because every erroneous intermediate climate state is subjected to further simulation error.

This is why the uncertainty in projected air temperature increases so dramatically. The model is step-by-step walking away from initial value knowledge further and further into ignorance.

On an annual average basis, the uncertainty in CF feedback is (+/-)114 times larger than the perturbation to be resolved (4 W/m^2 divided by 0.035 W/m^2).

The CF response is so poorly known, that even the first simulation step enters terra incognita.

kribaez
September 27, 2019 10:06 am

Dr Frank,
A continuation of my previous post.

You attach some importance to the fact that the TCF shows evidence of high lag-1 autocorrelation in individual models.
It is of some note that the comparison of observed to modeled was over a period (1980 to 2004) when temperatures were rising sharply in the models.
With the LTI model, I ran a projection of a 1% p.a. increase in CO2 for 70 years and then a constant forcing thereafter. I left in the system a random annual net flux error drawn from a N(0,0.3^2). I then calculated the flux feedback from cloud forcing as a linear function of Temp using 0.5 Watts/m2/K. Unsurprisingly, this series showed very high lag-1 autocorrelation. The reason is simply that the temperature over this period was rising almost linearly in the model, which meant that the flux feedback from cloud forcing was rising almost linearly by assumption.
Any series which is approximately linear in time will show an autocorrelation close to unity.
It is important to note that I also separately checked the flux feedback from cloud forcing over the period AFTER the temperature stabilised into small oscillations (from the random annual net flux error around a constant value). Even under these circumstances, the flux feedback series showed autocorrelation of 0.91. This arises directly from the autocorrelation in the temperature series, which is determined by the second term in Eq 3a above.
So autocorrelation in the flux series is unsurprising. The autocorrelation which you found in the residuals is also unsurprising, especially since the temperature over the period was rising sharply. Your observation may (and probably does) indicate an error in the effective cloud feedback, but such error does not propagate like an integrating series in flux.
In summary, I don’t doubt that there is a combination of systemic bias in TCF and poor estimation of temperature-dependent cloud feedback in the GCMs. However, I can still find no justification for your belief that uncertainty in TCF can be translated into an uncertainty in forcing, and even less that it propagates as an integrating series.
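The remark that any series approximately linear in time shows a lag-1 autocorrelation close to unity can be checked with a short Python sketch. The LTI model itself is not reproduced here; a plain straight line of 70 points stands in for it as a hypothetical example, and the estimator is the usual sample lag-1 autocorrelation.

```python
def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation: sum of lagged products of deviations
    from the mean, divided by the sum of squared deviations."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    den = sum((x - mean) ** 2 for x in series)
    return num / den


if __name__ == "__main__":
    linear = [float(i) for i in range(70)]  # stand-in for a 70-year linear rise
    alternating = [1.0, -1.0] * 35          # contrast case with no trend
    print("linear:     ", lag1_autocorrelation(linear))
    print("alternating:", lag1_autocorrelation(alternating))
```

The straight line gives a value above 0.95, while the alternating series gives a value close to -1, so a high lag-1 value on a trending series says little beyond the presence of the trend.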

Reply to  kribaez
September 28, 2019 6:44 pm

The lag-1 autocorrelation is not a time series, kribaez. It’s a spatial lag-1, across latitude.

kribaez
Reply to  Pat Frank
September 29, 2019 3:43 am

Pat,
Thank you for this. My bad reading. And I agree with you that it implies deterministic or structural error in the models. I understood you to be drawing an inference that the autocorrelation supported your treatment of uncertainty propagation.

Matthew R Marler
September 28, 2019 11:09 am

kribaez: In summary, I don’t doubt that there is a combination of systemic bias in TCF and poor estimation of temperature-dependent cloud feedback in the GCMs. However, I can still find no justification for your belief that uncertainty in TCF can be translated into an uncertainty in forcing, and even less that it propagates as an integrating series.

I think you have made a case that uncertainty is greater than the estimate provided by Pat Frank.

As to the “one size fits all”, I would counter that this is a well-done first approximation to GCM model uncertainty related to uncertainty in one of the parameters, and it can be improved upon eventually, but not really soon. I look forward to reading more work on GCM prediction uncertainty.